A Final Statement – Network Routers

In technology I do so very much hate wastage: wastage of the resources that go into manufacturing a product, and wastage of the time spent dealing with low quality and unreliable products. Disruption to functioning networks and setups is also a real pain point, both in organizations and at home. Every device we upgrade (usually) has its pros, but the con is invariably the investment of time that goes into migrating your data and settings to it.

The endless upgrade-itis of technology.

Now don’t get me wrong, I like cool tech as much as the next sunlight-averse geek, but upgrading for incremental or sometimes useless features is scandalous in my opinion.

Take for example your average home router. Most folks take the one given to them by their broadband provider. It will tend to be a rebadged and restricted model, often a variant of those sold by the likes of Netgear, Huawei and Thomson. It will have fewer features than the manufacturer’s own version and will be locked down to run only on your broadband supplier’s network.

On top of that every 1-2 years consumers are shipped a new device by their provider, with the old device relegated to a dusty closet at best or landfill at worst. The cost of the device is never ‘free’. It’s of course included in your monthly broadband bill.

Stack of routers
This stack is not even exhaustive. There are at least four more routers from BT and various other broadband providers over the years which I couldn’t find for this photo shoot! Some of them were shipped to me despite telling my telco provider that I didn’t need one!

Some of us, however, try to improve this state of affairs: those few who purchase aftermarket routers for their home networks. It’s a great shame that so few of us do, because the gains are significant, from less disruption each time you’re forced to swap one router for the next to the improved feature set, security and warm fuzzy goodness of an aftermarket device.

Through the course of my life with tech I’ve been through many routers from many manufacturers. Seldom do I come across one where I think I won’t need to buy another: a piece of gear you know is so damn good you’ll never need to upgrade again. At the risk of making a 640K-level faux pas, I declare that I have found a final statement when it comes to routers.

pfSense.

pfSense is without doubt the best router software I have ever used. It’s easy to use and the sky’s the limit as far as what you’d like to do.

Want to block your teenager’s pesky peer-to-peer habits? No problem. Prioritize VoIP or web browsing? Consider it done. You can rate limit IPs on your local network, create whitelists and blacklists, run a local cache to save bandwidth so you don’t download the same file or image twice, and even load balance across multiple WAN connections (which works damn near flawlessly). The list of features just goes on and on.

pfsense webui config page
pfSense WebUI

There are all kinds of cool packages (86 at the time of writing) which you can deploy with a single click. Packages like Snort, which offers intrusion detection and prevention: it monitors packets and looks for suspicious activity such as malformed HTTP headers and SSH or PDF exploits. Spot any sneaky traffic and Snort blocks the offending host for a defined period (or permanently), stopping would-be attackers in their tracks.

Proxy caching with the Squid package lets you cache images and downloaded files such as updates for your iPhone, Android and Mac/Windows machines. No file needs to be downloaded more than once: Squid caches it and serves up the cached copy to subsequent clients, conserving your bandwidth if you’re on metered broadband as well as speeding up your internet browsing.
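
In pfSense the Squid package is configured through the WebUI, but under the hood it is all standard squid.conf. Purely as an illustrative sketch (the paths and sizes here are assumptions, not the package defaults), the sort of directives involved look like this:

# 20GB on-disk cache with room for large OS/app updates
cache_dir ufs /var/squid/cache 20000 16 256
maximum_object_size 512 MB
# keep downloaded installers and updates around longer than their headers suggest
refresh_pattern -i \.(deb|rpm|exe|zip|dmg|ipsw)$ 10080 90% 43200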

If you’ve ever wondered which machine is hogging all your bandwidth, wonder no more: there are at least three packages which let you monitor bandwidth, and specifically bandwidth per client.

If you need a super-secure VPN you’ve got it, with L2TP/IPsec and OpenVPN offering client-to-server and server-to-server tunnels, making it straightforward to set up connectivity between users and remote sites.

To cap it all off, the basic core features are very comprehensive. There’s superb DNS functionality with DNS caching and per-domain forwarding, with multiple DNS servers for fallback. There’s comprehensive VLAN support with per-VLAN firewall rules, and excellent DHCP support with per-interface/VLAN DHCP services, DHCP reservations and DHCP options such as NTP, TFTP and failover.

Everything is managed via a capable and stable Web GUI. No need for shell diving if you’re uncomfortable doing that and of course, lots and lots of monitoring options from SMART, to system temperatures, firewall traffic, DHCP status, VPN status and much, much more.

One great point about pfSense is that it’s a distribution. It runs on whatever hardware you have at hand. Or you can choose a compatible server to deploy it on. Keeping with the theme of final statements I chose the latter and picked up a Supermicro 5018A-FTN4.

Wow. This thing kills it. An eight-core Intel Atom CPU, up to 64GB of RAM (ECC even!), space for 4x 2.5″ SSDs, IPMI and all in a form factor which consumes 20W at load and is pretty quiet as far as server kit goes (33 dBA).

Supermicro 5018A-FTN4

It wasn’t cheap at $900, so if you don’t need this much power you can purchase an excellent two-port, compact and fanless variant direct from pfSense for $300. I’d recommend the $50 SSD add-on if you want to install packages like Squid or Snort. The main difference between the various boxes is the amount of bandwidth each can route, especially when using encrypted VPN connections. For home users the basic model should be more than enough to last many, many years of solid use.

Supermicro 5018A-FTN4 Tweaks

If you’re using the Supermicro 5018A-FTN4 you might have to make the modifications below to get it running. I originally deployed pfSense 2.1 and 2.2 is now out, so not all of the below may still be necessary:

Preventing MBUF from maxing out:
It seems the I354 chipset can cause kernel panics. Borrowing from JeGr’s tip, add the following line to /boot/loader.conf.local:

kern.ipc.nmbclusters="131072"
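
After a reboot you can confirm the new limit and keep an eye on mbuf cluster usage from a shell; both are standard FreeBSD commands, so they should work on any pfSense box:

sysctl kern.ipc.nmbclusters
netstat -m | grep "mbuf clusters"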

Enabling TRIM for SSD:

Firstly use an Intel SSD. Not Samsung, not Crucial. Intel.

  • Login with SSH or locally and open a shell
  • Run /usr/local/sbin/ufslabels.sh
  • Add ahci_load="YES" to /boot/loader.conf.local
  • Reboot

For some reason the TRIM setting didn’t take on my pfSense gateway, so I had to:

  • Start pfSense in single user mode.
  • Run /sbin/tunefs -t enable /
  • Reboot

Once the machine has rebooted check the status with: tunefs -p /

[2.1-RELEASE][admin@pfSense.hemma]/root(1): tunefs -p /
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  2048
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

Once done you should have a TRIM enabled pfSense install on some seriously kick ass hardware.

Unboxed: A brand spanking new IBM Model M Keyboard

The IBM Model M keyboard is widely considered a bit of a classic. Manufactured from 1984 until the late 90s it is today much sought after by keyboard aficionados owing to its robust build quality, retro looks and of course superlative typing haptics.

I was fortunate enough to purchase one in 2003 at a corporate clearout for the princely sum of £3. I decided not to use it at the time since my existing keyboard was serving me perfectly well. Since then I’ve gone through a myriad of keyboards, including a much beloved Microsoft Natural keyboard and the small but powerful Happy Hacking Lite.

Following my switch from Arch Linux to the Mac my keyboard tastes became less choosy and I switched to Apple’s range of keyboards. Whilst they are very good, I have found my typing accuracy has gone down over time.

I chalked it up to old age until I stumbled on the excellent deskthority keyboard forum where members reported that different keyboard switches respond at different speeds thus affecting response time and hence the number of typos. This seems to be especially true if a user learnt to type on one type of switch and subsequently switched to another one.

With a light bulb on in my head I recalled my boxed and unopened IBM Model M still sitting in the store room and decided to give it a whirl.

Without further ado, here it is. A brand new, unopened IBM Model M:
IBM Model M Keyboard photos (including a comparison shot with the Apple Aluminium (Aluminum) Keyboard)

The big question is of course how is it in use?

Well, it’s loud. Nearly as loud as a typewriter if I’m being honest. I have to keep the study door closed at night for fear of waking my kids, and as for using it in the office, that is very much out of the question since the clickety-clack borders on antisocial!

My typing speed and accuracy have certainly gone up according to typeracer which puts me at about 90-100 words per minute with the Apple Alumin(i)um keyboard and at 100-110 wpm with the Model M.

I find that surprising since, firstly, I’m very used to Apple’s keyboards and, secondly, the Model M keys require more effort to press, especially the spacebar, which thunks like a typewriter carriage return!

But the stats being what they are, I’m willing to give this keyboard a shot despite the amount of desk real estate it consumes and its less conventionally attractive appearance.

Despite its foibles the Model M just so happens to be extremely satisfying to type on. I mean really, it’s ridiculously good fun. Certainly about as much fun as typing at a computer gets anyways!

Since the keyboard lacks a Windows/Command key I had to remap Ctrl, Alt and Caps Lock to more closely resemble the Happy Hacking Keyboard layout which works very well and took very little getting used to:

Mac OS X Keyboard Modifier Keys

Getting Things Done (and getting your data out of Cultured Code’s Things.app)

I’m a big fan of David Allen’s book Getting Things Done. When it comes to effective organization you could probably read this book and be done; it really is that good.

One issue I have with the book though is that it’s very paper centric. As a Yard-O-Led fountain pen and Rhodia notebook loving scribbler I have to say I can dig that. I really can.

But the thing that irks me with a paper based workflow is that eventually, at some point, I know I’m going to have to type my paper notes up!

To that end I need some good software to organize my TODO lists.

For years I’ve turned to Cultured Code’s Things. It’s a beautifully designed application but it has not been without its problems. In the early days of productivity software Things was relatively alone in the marketplace. Today things are very different, with Asana, Todoist, Wunderlist, TaskPaper and a myriad of other entrants crowding the productivity market.

Things hasn’t kept up and it’s high time I switched. Unfortunately Cultured Code hasn’t seen fit to put any sort of decent export functionality into Things. Thankfully they do offer AppleScript support, so an evening of hacking led me to a script which pulls the data out of the Things database and sticks it into a nice CSV.

From there you can copy and paste it into the task app of your choice or alternatively import it using your own script-fu.

Benchmarking System Performance

A little knowledge is a dangerous thing, or so the saying goes. When specifying and buying computer hardware it saves time and money to know the level of performance you get from your existing equipment and the performance you can expect from your new purchase.

There are numerous metrics to measure but in order to obtain meaningful results (relatively) quickly I personally focus on CPU, memory and file and network I/O.

The key tools I use to measure performance are:

  • dd – file/network I/O
  • SysBench – CPU, Memory and file/network I/O
  • iperf – network I/O
  • IOzone – file/network I/O

dd

dd is a simple command which copies data from an input to an output. By directing that input and output to and from various sources and destinations we can measure their read and write performance.

To measure write performance:

dd if=/dev/zero of=tmp.bin bs=2048k count=5k && sync

To measure read performance:

dd if=tmp.bin of=/dev/null bs=2048k count=5k && sync

Since the block size is 2048k (2MB), your output file tmp.bin will be double the size of your count figure. So, for example, to test a file size of 10GB specify a count value of 5k.

Aim to test a file size of 2x your system memory. Otherwise you’ll end up caching a lot of the test data and measuring memory rather than disk.

Output:

10737418240 bytes transferred in 80.956609 secs (132631769 bytes/sec)

Here we’re observing bandwidth of 132631769 bytes/sec or 132MB/s.

Script It!

The script takes two arguments: the destination path and the size in GB of the test file.

#!/bin/sh

# Default size in GB
SIZE=100

if [ "$1" = "" ]; then
 echo "Destination path missing"
 exit 1
fi

if [ "$2" != "" ]; then
 SIZE=$2
fi

DEST=$1
COUNT=$(($SIZE / 2))k

echo "Starting Write Test"
dd if=/dev/zero of="$DEST/tmp.bin" bs=2048k count=$COUNT && sync
echo "Completed Write Test"
echo ""
echo "Starting Read Test"
dd if="$DEST/tmp.bin" of=/dev/null bs=2048k count=$COUNT && sync
rm "$DEST/tmp.bin"
echo "Removed test file"
echo "Completed Read Test"

SysBench

SysBench is a benchmarking application which covers a range of performance tests to measure CPU, memory, file IO and MySQL performance.

It can be used with very little setup and allows you to quickly get an idea of overall system performance.

CPU

Execute:

sysbench --test=cpu run

By default the process runs in 1 thread. Specify --num-threads=X for multiprocessor systems where X is the number of CPU cores.

Output:

sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 10000


Test execution summary:
total time: 10.4933s
total number of events: 10000
total time taken by event execution: 10.4909
per-request statistics:
min: 0.99ms
avg: 1.05ms
max: 2.17ms
approx. 95 percentile: 1.27ms

Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 10.4909/0.00

The key figure to look out for is total time: 10.4933s.
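
For example, to compare single-threaded and multi-threaded throughput on a quad-core box you could run the pair of commands below (sysbench 0.4 syntax; --cpu-max-prime raises the prime limit from the default 10000 shown in the output above so each run takes a measurable amount of time):

sysbench --test=cpu --cpu-max-prime=20000 run
sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run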

Memory

Execute (read):

sysbench --test=memory run

Execute (write):

sysbench --test=memory --memory-oper=write run

Output:

sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Doing memory operations speed test
Memory block size: 1K

Memory transfer size: 102400M

Memory operations type: write
Memory scope type: global
Threads started!
Done.

Operations performed: 104857600 (2187817.58 ops/sec)

102400.00 MB transferred (2136.54 MB/sec)


Test execution summary:
 total time: 47.9279s
 total number of events: 104857600
 total time taken by event execution: 40.6687
 per-request statistics:
 min: 0.00ms
 avg: 0.00ms
 max: 4.36ms
 approx. 95 percentile: 0.00ms

Threads fairness:
 events (avg/stddev): 104857600.0000/0.00
 execution time (avg/stddev): 40.6687/0.00

The key figures to look out for are the transfer rate (MB/sec) and ops/sec values.
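
The run above used the defaults of a 1K block size and a 100GB total transfer, as shown in the output. As a hedged example, a heavier multi-threaded write test with a larger block size (all standard sysbench 0.4 options) would be:

sysbench --test=memory --memory-block-size=1M --memory-total-size=10G --memory-oper=write --num-threads=4 run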

File I/O

Measuring storage performance is a very tricky beast. There are many variables at play, from the bandwidth of the interconnect (SATA 3Gb or 6Gb, Ethernet 10Gb or 1Gb, etc.) to the amount of memory the system has, which affects how much of the benchmark hits memory instead of disk. On top of that you need to be aware of the type of data you’ll be pushing: does it involve a lot of small, random I/O or larger files with a lot of sequential I/O?

For example a database or virtual machine disk store will have a small block size with a lot of random I/O. Large ISOs or media files will have larger block sizes with a lot of sequential I/O. How you specify your storage server will drastically affect its performance in these cases, particularly with random I/O which is the most demanding case.

If a storage system can handle random I/O well it can certainly handle sequential I/O too which is why a lot of storage reviews will tend to focus on random performance. It also requires significantly less exotic (and expensive) hardware to engineer a well performing storage system for lots of sequential I/O so bear this in mind when determining your storage needs. You probably won’t need SSD backed read/write caches or high RPM drives if you’ll be serving media.

Prepare

When using SysBench’s fileio benchmark you will need to create a set of test files to work on.

Execute:

sysbench --test=fileio --file-total-size=4G prepare

It is recommended that the size set using --file-total-size is at least 2x the available memory to ensure that file caching does not influence the workload too much.

Run

Execute:

sysbench --test=fileio --file-total-size=4G --file-test-mode=rndrw --max-time=240 --max-requests=0 --file-block-size=4K --num-threads=4 --file-fsync-all run

The I/O operations to use can be specified using --file-test-mode, which takes the values seqwr (sequential write), seqrewr (sequential rewrite), seqrd (sequential read), rndrd (random read), rndwr (random write) and rndrw (random read/write).

Generally, the higher you set --num-threads the higher your result. Beyond a certain point, however, performance will start to level off; this tends to happen at a thread count of around 2x the number of CPUs on the test system.

If testing random I/O a file block size of 4K is suggested using --file-block-size. For sequential I/O use 1M.

Setting the option --file-fsync-all only affects the rndwr and rndrw tests. It forces a flush to disk before moving on to the next write. You would want to do this to emulate very demanding cases such as VMware and NFS stores which force sync on write. Performance is drastically degraded with this option. By default sysbench flushes writes to disk after every 100 writes.

By default sysbench fileio executes 10000 requests. In order to produce comparable benchmarks within a fixed period of time we set --max-requests to 0, which means unlimited.

We then set --max-time to a sensible value based upon the file-total-size value to ensure the test doesn’t execute requests indefinitely. 240 seconds is a good value for sizes of 4G; for larger sizes such as 60G a time of 720 seconds works well.

Output:

sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Extra file open flags: 0
128 files, 32Mb each
4Gb total file size
Block size 16Kb
Number of random requests for random IO: 10000
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Done.

Operations performed: 6000 Read, 4000 Write, 12800 Other = 22800 Total
Read 93.75Mb Written 62.5Mb Total transferred 156.25Mb (40.973Mb/sec)
 2622.29 Requests/sec executed

Test execution summary:
 total time: 3.8135s
 total number of events: 10000
 total time taken by event execution: 0.3151
 per-request statistics:
 min: 0.00ms
 avg: 0.03ms
 max: 5.88ms
 approx. 95 percentile: 0.02ms

Threads fairness:
 events (avg/stddev): 10000.0000/0.00
 execution time (avg/stddev): 0.3151/0.00

The key figures to look at are the transfer rate (MB/sec) and Requests/sec, which basically equates to your IOPS figure.

A bug in the fileio output uses the bit abbreviation (Mb) even though the numerical values are actually in bytes (MB).
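
For a sequential workload the same rules of thumb apply. As a hedged example for a larger 60G file set, you would prepare the files and then read them back with a 1M block size and a longer run time:

sysbench --test=fileio --file-total-size=60G prepare
sysbench --test=fileio --file-total-size=60G --file-test-mode=seqrd --max-time=720 --max-requests=0 --file-block-size=1M --num-threads=4 run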

Cleanup

Execute:

sysbench --test=fileio --file-total-size=4G cleanup

To cleanup simply run the above command and the various temp files used to run the fileio test will be removed.

Script It!

Here’s a little script I use to quickly test File I/O performance using sysbench. Simply call it from the folder on the storage device or network share you want to benchmark:

#!/bin/bash

# Set to 2x RAM
FILE_TOTAL_SIZE="4G"

#Set to long enough to complete several runs
MAX_TIME="240"

#For random IO set to 4K otherwise set to 1M for sequential
FILE_BLOCK_SIZE="4K"

logdate=$(date +%F)

echo "Preparing test"
sysbench --test=fileio --file-total-size=$FILE_TOTAL_SIZE prepare

echo "Running tests"
for run in 1 2 3; do
 for each in 1 4 8 16 32 64; do
 echo "############## Running Test - Write - Thread Number:" $each "- Run:" $run "##############"
 sysbench --test=fileio --file-total-size=$FILE_TOTAL_SIZE --file-test-mode=rndwr --max-time=$MAX_TIME --max-requests=0 --file-block-size=$FILE_BLOCK_SIZE --num-threads=$each --file-fsync-all run > log-$logdate-write-${each}T-${run}R.log
 
 echo "############## Running Test - Read - Thread Number:" $each "- Run:" $run "##############"
 sysbench --test=fileio --file-total-size=$FILE_TOTAL_SIZE --file-test-mode=rndrd --max-time=$MAX_TIME --max-requests=0 --file-block-size=$FILE_BLOCK_SIZE --num-threads=$each run > log-$logdate-read-${each}T-${run}R.log
 done
done

echo "Cleaning up"
sysbench --test=fileio --file-total-size=$FILE_TOTAL_SIZE cleanup

IOzone

IOzone is an incredibly comprehensive file I/O measurement application. It provides in-depth analysis of filesystem performance, measuring it across three axes: file size, transfer size and performance.

It also lets you easily produce pretty graphs like the one below, which shows the performance effect of the CPU cache, memory cache and raw disk speed:

IOzone read performance report

With iozone there are two scenarios I typically measure:

  • Direct Attached Storage (DAS)
  • Network Attached Storage (NAS)

To explain the commands below, there are a few variables to set in both scenarios. Firstly I set -g (maximum file size) to 2x the RAM of the file server being measured. It takes a lot longer to test, especially with large amounts of memory, but the results are much more useful since they give a nice 3D surface chart showing the sustained speeds you can expect for a given file size as it hits CPU cache, memory cache, SSD cache and finally spinning disks.

The argument -b produces an Excel-compatible spreadsheet which can be used to produce 3D surface charts like the one below. You can see the measured performance decrease as the file size exhausts the CPU cache (top strata, at around 7 GB/s), then the buffer cache (next strata down), and finally hits spinning disks in the pale blue section at the bottom (450 MB/s). That last figure is our sustained speed under load.

Where the chart flatlines is where the result is unmeasured. Be sure to set option -z to avoid that!

IOzone Writer Report (RAID 10 FreeNAS system 64G record size)

Direct Attached Storage

Execute:

iozone -Raz -g 4G -f /mnt/ZFS_VOL/ZFS_DATASET/testfile -b iozone-MY_FILE_SERVER-local-size-4g.xls

Network Attached Storage

I use NFS for most of my server file stores. As a result these commands are NFS focused but should work on non-NFS storage as well.

Execute:

iozone -Razc -g 4G -U /mnt/MY_FILE_SERVER -f /mnt/MY_FILE_SERVER/testfile -b iozone-MY_FILE_SERVER-nfs-size-64g.xls

OR

iozone -RazcI -g 4G -f /mnt/MY_FILE_SERVER/testfile -b iozone-MY_FILE_SERVER-nfs-size-64g.xls

For NFS testing, ideally you want to use the first command, whose -U option unmounts and remounts the NFS share between tests and removes the effect of caching. This requires an fstab entry so the test can mount/unmount successfully. Unfortunately I often encounter issues with the remount failing after a few tests. If you encounter that (or can’t be bothered to create an fstab entry) use -I, which uses direct I/O for all file operations, telling the filesystem to bypass the buffer cache and go directly to disk.
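
For reference, the fstab entry that lets iozone remount the share might look something like this (the export path and mount options here are assumptions; adjust them for your server):

MY_FILE_SERVER:/export/data  /mnt/MY_FILE_SERVER  nfs  rw,hard,intr  0  0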

With your XLS file in hand, open it in Excel and check out your performance. All figures are in kilobytes per second.

To produce a graph it’s pretty simple. Select the table, go to Insert and choose a 3D Surface graph.

iozone graphing in excel

Sources

http://joshtronic.com/2014/06/22/ten-dollar-showdown-linode-versus-digitalocean/
http://wiki.mikejung.biz/Sysbench#Sysbench_Fileio_file-extra-flags 

Securing a multi-user Apache Web Server

As part of refining my Apache web server which runs multiple sites, I’ve created a user account, database account and home folder per site. So, for example, the site example.com has a user account example, a database account example and a web folder located at:

/home/example/public_html

The corresponding Apache VirtualHost for this site is:

<VirtualHost *:80>
        ServerAdmin admin@example.com
        ServerName www.example.com
        ServerAlias example.com
        ErrorLog /var/log/apache2/error.example.com.log
        LogLevel warn
        CustomLog /var/log/apache2/access.example.com.log combined
        DocumentRoot /home/example/public_html
        <IfModule mod_suexec.c>
                SuexecUserGroup example example
        </IfModule>
</VirtualHost>

Previously, to ensure PHP scripts worked, I had a Bash cron job which looped over all the users’ public_html folders and set their owner to the Apache user www-data.

Not ideal.

So after a few hours of digging I managed to deploy a solution that is both secure and flexible, allowing users to log on and edit their web pages without permission headaches.

Assuming a basic Apache setup, first install the Apache suPHP and suEXEC modules:

sudo apt-get install libapache2-mod-suphp apache2-suexec

Enable the modules:

sudo a2enmod suexec
sudo a2enmod suphp

The suPHP module replaces the Apache PHP4 and PHP5 modules. Having both active prevents suPHP from working properly so you’ll need to disable the PHP4 and PHP5 modules:

sudo a2dismod php4
sudo a2dismod php5
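
Depending on how your distribution builds mod_suphp (Debian’s package, for example, is compiled in paranoid mode and expects the target user and group to be declared per site), you may also need something like the following inside each VirtualHost. Treat this as a sketch based on the example site above rather than a drop-in config:

<IfModule mod_suphp.c>
        suPHP_Engine on
        suPHP_UserGroup example example
        suPHP_AddHandler application/x-httpd-suphp
</IfModule>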

Finally you’ll want to set the permissions on the user folder:

find ~/public_html/ -type f -exec chmod 644 {} \;
find ~/public_html/ -type d -exec chmod 755 {} \;
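
Since suEXEC and suPHP run each site’s scripts as the site’s own user, the files also need to be owned by that user rather than www-data. A quick sketch to sweep every site back to per-user ownership, assuming the /home/<user>/public_html convention above (run as root):

for d in /home/*/public_html; do
  u=$(basename "$(dirname "$d")")   # e.g. /home/example/public_html -> example
  chown -R "$u":"$u" "$d"
done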

To get this setup even better I’d ideally like to set those permissions to 600 and 700 respectively but that’s a job for tomorrow.

Addendum:

Awesome link which covers much of the above and then some.

 

Spreading your bets on RAID

In the early days of our startup, bubblegum and duct tape seemed to be the order of the day as we struggled to keep things running on cheap as chips computers bought off ebay and a ragtag bunch of borrowed Dell Optiplexes.

Developer files sat on their individual machines, source code was scattered across the place and the concept of centralised document storage was a share on one of the developer machines called Common in which everyone dumped their stuff.

A year into this rapidly escalating mess I took matters into my own hands and pestered the boss for a £1500 budget to build a file server. A Supermicro SC-743 Cool & Quiet Case coupled with a top notch Xeon board, 8GB of RAM, Intel Quad Core CPU and a top of the line 3ware 9690SA RAID card (with battery backup no less!) meant we were about to take our file server (the aforementioned developer’s machine) from a mewling kitten to a roaring tiger.

The whole thing was assembled beautifully and worked a treat, with a RAID 1 mirror for the Debian installation and 8x Seagate 7200.11 hard drives for the RAID 10 storage array.

In building this machine I made one and only one mistake. All of the drives were the same make and model and doubtless all manufactured at the same time.

Fast forward 12 months and on coming into work on Monday morning I saw a mail from the 3ware monitoring manager: ‘Drive 4 dropped out of array’. Not a problem I thought, we had a monthly offsite backup in place. I hopped online and ordered a spare disk.

Later that afternoon I received another alert: ‘Drive 6 dropped out of array’.

‘Sh****t’ I (probably) exclaimed, realizing that if the second drive had dropped out of the same mirrored pair as the first, our array would have been toast. I quickly ordered two more drives.

Making hasty backups and crossing fingers I awaited the arrival of the new drives the following day and on their arrival stuck one in to replace the failed disk. A few hours after successfully rebuilding the array I saw another disk fail.

It was at this point that I got down on my knees and began to pray. (I’m just kidding, I did that the day before.)

On a hunch I removed and reinserted the failed drive. It initialized and rebuilt fine. A few hours later one of the new drives dropped out. Over the next few days I was barely playing catch up in ensuring the RAID array didn’t fail entirely with drives dropping out 1-2 times a day and then initializing on reinsertion.

We were making daily backups by now but since this was our main file server and we were going through a pretty lean month it meant that we had zero budget to replace all the disks or get another box.

It was then that I exercised my Google-fu and hit the internet. Turned out Seagate had a bad batch of 7200.11 disks and had issued a firmware update.

The duty of taking the box offline after work and updating the firmware of all 11 drives fell on my shoulders. This ghastly process involved sticking all the disks, one at a time into a desktop and running the firmware update on each one.

Since then the array has run like a champ. We kept it with the original 8 disks and 3 hot spares for good measure…it’s been 7 years and nary a complaint from 3ware’s management tool.

Fast forward to 2013 and our latest storage purchase was a lovely Synology 10 disk NAS. Quick and (very) quiet it came populated by the manufacturer with 10 2TB Seagate disks (Enterprise models no less!). We loaded it up with our data and enjoyed the feel of the new shiny, flashing its pretty lights at us from the equipment rack.

Fast forward 12 months and you guessed it, a drive dropout. Then another, and another, followed by another. Over the course of 6 months we must have replaced more than half of those damned Seagate drives.

Moral of the story? Don’t buy Seagate.

Heheh, just kidding (maybe)…moral of the story is not to buy the same brand and batch of hard disks when speccing your storage array. Since those early days of scraping by we now build some pretty powerful RAID arrays for our customers and we always try and use a 50/50 mix of different brands and batches.

(We also make a lot of backups!)

Device Icons

I’m a great believer in having strong visual cues in user interfaces to help a user orient themselves. To this end I think manufacturers of devices like Kingston, LaCie, SanDisk, etc. should step up to the plate more and offer the user quality icons for their devices.

LaCie are actually fairly good at this, although some of their icons leave a little to be desired. Sandisk and Kingston AFAIK don’t provide any icons for their devices which is a great pity.

The benefit of these icons is that a user interface can go from this:

sans-device-icons

To this:

device-icons

Now isn’t that much better?

 

More for my benefit than yours, but I’ve attached/linked to the icons I use here:

Attributed wherever possible to the original author of the icon.

LaCie Little Big Disk Icon

LaCie 2big Icon

Kingston DTSE9 Icon

Sandisk Titanium Icon (Author: iiroku)

Openfire Single Sign On (SSO)

I’m a dabbler, I like to dabble.

While most people are happily using Google Talk, Facebook chat, Skype and the like I’m busy playing around with my own chat server, writing plugins for it and seeing if I can get things like Single Sign On (SSO), DNS Service Records and Federation working. It’s time consuming, frustrating at times but ultimately rewarding. One particularly frustrating problem I recently tackled was single sign on with Openfire (a Jabber/XMPP messaging server).

My basic setup likely mirrors most enterprise-y networks:

  • Windows Active Directory Domain Controller with Windows Support Tools installed
  • Openfire 3.8 bound to the Windows DC
  • Windows XP/Windows Terminal Server Clients running Pandion/Pidgin
  • Mac OS X Clients Running Adium

The first step is to ensure that you have a working Windows AD network alongside a working Openfire installation.

  • AD Domain: EXAMPLE.COM
  • AD Host: DC.EXAMPLE.COM
  • Openfire (XMPP) Domain: EXAMPLE.COM
  • Openfire Host: OPENFIRE.EXAMPLE.COM
  • Keytab account: xmpp-openfire

Ensure you have an A record and a reverse DNS record for your Openfire server, then set up your DNS Service Records for Openfire like so:

_xmpp-client._tcp.example.com. 86400 IN SRV 0 0 5222 openfire.example.com.
_xmpp-server._tcp.example.com. 86400 IN SRV 0 0 5269 openfire.example.com.
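
For completeness, the forward and reverse records might look like this in zone-file form (192.0.2.10 is just a placeholder address):

openfire.example.com.        86400 IN A   192.0.2.10
10.2.0.192.in-addr.arpa.     86400 IN PTR openfire.example.com.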

With DNS done create two new Active Directory accounts. Account one is for binding the Openfire server to the domain (skip this account if you’ve already bound Openfire to your domain).

Account two is to associate your Service Principal Name (SPN) so Kerberos clients can find and authenticate using SSO with your Openfire server.

On account two check under Account properties that User cannot change password, Password never expires and Do not require Kerberos preauthentication are checked.

On the Windows Domain Controller you’ll now need to create the SPN and keytab. The SPN (Service Principal Name) is used by clients to lookup the name of the Openfire server for SSO. The keytab contains pairs of Service Principals and encrypted keys which allows a service to automatically authenticate against the Domain Controller without being prompted for a password.

Creating the SPN:

I created two records since it seems some clients look up xmpp/openfire.example.com@EXAMPLE.COM and some look up xmpp/openfire.example.com.

setspn -A xmpp/openfire.example.com@EXAMPLE.COM xmpp-openfire
setspn -A xmpp/openfire.example.com xmpp-openfire

Map the SPN to the keytab account xmpp-openfire; when prompted, enter the xmpp-openfire password:

ktpass -princ xmpp/openfire.example.com@EXAMPLE.COM -mapuser xmpp-openfire@EXAMPLE.COM -pass * -ptype KRB5_NT_PRINCIPAL

Create the keytab:

I found that the Java-generated keytab didn’t work on my Openfire system, in which case I used the Windows ktpass utility to create it. Some users report the converse, so use whichever works for you:

Java keytab generation:

ktab -k xmpp.keytab -a xmpp/openfire.example.com@EXAMPLE.COM

Windows keytab generation:

ktpass -princ xmpp/openfire.example.com@EXAMPLE.COM -mapuser xmpp-openfire@EXAMPLE.COM -pass * -ptype KRB5_NT_PRINCIPAL -out xmpp.keytab

Copy the keytab to your Openfire directory, typically /usr/share/openfire or /opt/openfire. The full path will look like this:

/usr/share/openfire/xmpp.keytab

Configuring Linux for Active Directory

Configure Kerberos

First we need to install ntp, kerberos and samba:

apt-get install ntp krb5-config krb5-user krb5-doc winbind samba

Enter your workgroup name:

eg. EXAMPLE.COM

Configure /etc/krb5.conf

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 dns_lookup_realm = true
 dns_lookup_kdc = true
 ticket_lifetime = 24h
 forwardable = yes

[appdefaults]
 pam = {
   debug = false
   ticket_lifetime = 36000
   renew_lifetime = 36000
   forwardable = true
   krb4_convert = false
 }
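
The configuration above relies on DNS to locate the KDC (dns_lookup_kdc = true). If that doesn’t work in your environment you can map the realm explicitly instead; a hedged addition, assuming DC.EXAMPLE.COM is your domain controller:

[realms]
 EXAMPLE.COM = {
   kdc = dc.example.com
   admin_server = dc.example.com
 }

[domain_realm]
 .example.com = EXAMPLE.COM
 example.com = EXAMPLE.COM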

Test connection to Active Directory by entering the following commands:

:~# kinit xmpp-openfire@EXAMPLE.COM

Check that the request for an Active Directory ticket was successful using the klist command:

:~# klist

The result of this command should be something like this:

Ticket cache: FILE:/tmp/krb5cc_0
Default principal: xmpp-openfire@EXAMPLE.COM

Valid starting Expires Service principal
07/11/13 21:41:31 07/12/13 07:41:31 krbtgt/EXAMPLE.COM@EXAMPLE.COM
renew until 07/12/14 21:41:31

Join the domain

Configure your smb.conf like so:

#GLOBAL PARAMETERS
[global]
   workgroup = EXAMPLE
   realm = EXAMPLE.COM
   preferred master = no
   server string = Linux Test Machine
   security = ADS
   encrypt passwords = yes
   log level = 3
   log file = /var/log/samba/%m
   max log size = 50
   printcap name = cups
   printing = cups
   winbind enum users = Yes
   winbind enum groups = Yes
   winbind use default domain = Yes
   winbind nested groups = Yes
   winbind separator = +
   idmap uid = 600-20000
   idmap gid = 600-20000
   ;template primary group = "Domain Users"
   template shell = /bin/bash

[homes]
   comment = Home Directories
   valid users = %S
   read only = No
   browseable = No

[printers]
   comment = All Printers
   path = /var/spool/cups
   browseable = no
   printable = yes
   guest ok = yes

Join the domain:

:~# net ads join -U administrator

You will be asked to enter the AD Administrator password.

Verify you can list the users and groups on the domain:

:~# wbinfo -u
:~# wbinfo -g

Testing the keytab works:

From your Openfire system run the below command:

  kinit -k -t /usr/share/openfire/resources/xmpp.keytab xmpp/openfire.example.com@EXAMPLE.COM -V

You should see:

Authenticated to Kerberos v5

Then create a GSSAPI configuration file called gss.conf in your Openfire configuration folder, normally /etc/openfire or /opt/openfire/conf. Ensure you set the path to your xmpp.keytab file:

com.sun.security.jgss.accept {
    com.sun.security.auth.module.Krb5LoginModule
    required
    storeKey=true
    keyTab="/usr/share/openfire/xmpp.keytab"
    doNotPrompt=true
    useKeyTab=true
    realm="EXAMPLE.COM"
    principal="xmpp/openfire.example.com@EXAMPLE.COM"
    debug=true
    isInitiator=false;
};

Ensure the file is owned by the openfire user.

Stop Openfire and enable GSSAPI by editing your openfire.xml configuration file which is found in the openfire conf directory:

<!-- sasl configuration -->
<sasl>
    <mechs>GSSAPI</mechs>
    <!-- Set this to your Kerberos realm name which is usually your AD domain name in all caps. -->
    <realm>EXAMPLE.COM</realm>
    <gssapi>
        <!-- You can set this to false once you have everything working. -->
        <debug>true</debug>
        <!-- Set this to the location of your gss.conf file created earlier -->
        <!-- "/" is used in the path here not "\" even though this is on Windows. -->
        <config>/etc/openfire/gss.conf</config>
        <useSubjectCredsOnly>false</useSubjectCredsOnly>
    </gssapi>
</sasl>

Or add to System Properties:

sasl.gssapi.config /etc/openfire/gss.conf
sasl.gssapi.debug false
sasl.gssapi.useSubjectCredsOnly false
sasl.mechs GSSAPI
sasl.realm EXAMPLE.COM

Restart Openfire

Buying Hi-Def music today is a crapshoot

The loudness war has been going on for some time, with musicians, producers and record companies mastering and releasing their records with ever increasing volume and compression over the past few decades. In the days of vinyl there was a physical limit to how loud you could press a record before the needle would be unable to play it; the advent of the Compact Disc changed that. Whilst CDs boasted a greater dynamic range than vinyl, they also defined a maximum peak amplitude. Through some science and a bunch of signal processing, recording engineers could push the overall volume of a track so that it became louder throughout, often hitting that peak and compressing the dynamic range of the record. The long and short of this is that modern records nearly all tend to have dynamic range compression applied, and the result is a loss of sound quality in the form of distortion and clipping.

Why do record companies do this? A popular perception (misconception?) is that the louder a record sounds, the better it sounds, and hence the more likely someone hearing it in the record store or over the radio is to buy it.

Note the mediums over which most people traditionally hear new music: record stores, the radio, the coffee shop, their phones, tablets and notebooks. None of these are known for high fidelity listening, and their poor quality speakers tend to mask the compression in the music. As a result, loud sells.

So if that new track by The Killers sounds good to you playing on the cheap speakers at your local coffee shop, just wait until you hear the upcoming Muse single: it’s probably louder, and in a noisy coffee shop it will sound better.

The problem arises when you listen to that record on your nice, shiny headphones or your stereo at home. In a quiet environment, with good audio equipment, those distorted, normalised tracks are going to sound noisy, fatiguing and, to be perfectly blunt, a bit crap.

A backlash from consumers and high-end audio equipment manufacturers was bound to happen, with the demand for high quality, well mastered records ever increasing. Companies like HDtracks, naimlabel and LINN Records, to name a few, stepped in to fill the gap. They offer not only well mastered tracks but also a higher resolution than CD can provide, with quality up to 24 bits and 192kHz. One thing which needs to be stressed, however, is that mastering matters; in fact it matters more than how much fidelity a record has. A poorly mastered 24-bit 192kHz record is not going to sound any better than a well mastered 16-bit 44.1kHz CD. In fact if it’s very poorly mastered it will almost certainly sound worse than an MP3 rip of the CD.

Take Elton John’s self-titled album, for example. It began life in 1970 on vinyl. No sign of clipping or compression here:

Elton John – Elton John – The King Must Die – 1970 Vinyl

In 1985 it was released on CD, again with no discernible compression:

Elton John – Elton John – The King Must Die – 1985 CD

In 1995 it was re-released as a Remastered Edition on CD. You can see the track is louder but it’s just about acceptable:

Elton John – Elton John – The King Must Die – 1995 Remastered CD

In 2008 it was again re-released as a Deluxe Edition CD. As expected for a modern release, it’s been made loud and sounds compressed and fatiguing as a result:

Elton John – Elton John – The King Must Die – 2008 Deluxe Edition CD

Finally Elton John’s album appears on HDtracks in high definition 24-bit 96kHz. It should offer the best sound quality but to take advantage of the vast dynamic range of those 24-bits it will need to be mastered properly. Here we can see that this is definitely not the case. In fact it suffers from more dynamic range compression than the 2008 CD release:

Elton John – Elton John – The King Must Die – HDtracks 24-bit/96kHz

The HDtracks edition should offer the best sound quality; after all, it is 24-bit 96kHz and comes from a store which aims to provide high-end audio. Sadly it suffers from bad mastering. The result is a lot of clipping and excessive loudness, and consequently it sounds worse than the older, less compressed editions.

So what can we surmise from this? Put simply, that despite the much touted quality of 24-bit music there’s no certainty that the HD version of the album you’re buying is mastered properly and free from excessive normalisation and distortion. For those looking to upgrade their album collection it’s clear that there’s no guarantee your new 24-bit purchases will sound better. This is a great pity, since technically the new high definition audio formats offer higher quality than has ever been possible, if only the studios, record producers and artists would oblige. Until then, for those seeking quality HD audio tracks, it’s a crapshoot.


The background to our lives

John F. Kennedy, Robert F. Kennedy, John Lennon, Martin Luther King Jr., Abraham Lincoln and that legendary queen of Queen, Freddie Mercury: so many showmen have graced the stage of life. They have enlivened our worlds and broadened our horizons. They have pushed us from complacency and made us look at the world with new eyes. They have enriched us and inspired us, and they have left us before their time.

It was with a heavy heart that I heard of Steven P. Jobs’ passing. He was one of the greats. A man who pushed us and his peers, a man who showed us that there was a better way of doing things. Whilst his passage from this life was expected, his health visibly failing at every public appearance, his loss still came as a blow. Felt keenly around the world, it was a loss for which many of us felt wholly unprepared.

Jobs’ legacy, from the Apple II to the Mac, the iPod, the iPhone and the iPad, has been part of the background of our lives. Those few who haven’t used his products have certainly used those of his competitors, products which borrowed and benefitted from his great designs. He might not have been the sole creator but his influence was evident in the high standards of each.

My heroes in this world have been those lofty individuals who, almost canonised, have passed into legend: JFK, RFK, Lennon, MLK, Lincoln, Freddie Mercury. These men, however, are obvious choices. What surprised me about Steve Jobs’ death was not how acutely his passing was felt, but that I hadn’t realised he’d been a hero all along.

Below are a selection of family images from over the years. From a little girl’s first e-mail to cousins communicating across continents Apple have been a valuable part of our lives: