Matt Pson – just another dead sysadmin blog

18 Jul 2011

Proving a point – building a SAN (part 4 – the neverending story?)

Finally, after weeks and weeks of waiting, all the parts for my SAN project had arrived. I unpacked the last cables, installed them, and they worked just fine. All 16 disks came online and accepted my configuration. Victory!

(Well, one of the Seagate Constellation 500GB 2.5" disks failed quite early when I started to transfer some data onto the filesystem. No way I would wait another couple of weeks for Cisco to ship me a new one, so I went to the local PC dealer and picked up a Seagate Momentus 500GB, which has worked perfectly.)

I decided to spread the disks evenly across the two physical controllers and their channels, which gave me:

Card 1, channel 1: 1 SAS disk (system) and 3 SATA (storage pool)
Card 1, channel 2: 1 SSD disk (zil) and 3 SATA (storage pool)
Card 2, channel 1: 1 SAS disk (system) and 3 SATA (storage pool)
Card 2, channel 2: 1 SSD disk (zil) and 3 SATA (storage pool)

Brilliant, now to install NexentaStor CE and make some zpools. I was happy to see that a new version, 3.0.5, had been released while I was waiting for the hardware to arrive, which hopefully addressed some of the evil bugs I had read about.

I set up my zpool with a mirrored pair of log devices and two raidz2 vdevs of six disks each, so the pool stays safe even if more than one disk fails later on. The end result looks like this:

config:

        NAME         STATE     READ WRITE CKSUM
        stor         ONLINE       0     0     0
          raidz2-0   ONLINE       0     0     0
            c0t1d0   ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
          raidz2-1   ONLINE       0     0     0
            c0t4d0   ONLINE       0     0     0
            c0t8d0   ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
        logs
          mirror-2   ONLINE       0     0     0
            c0t0d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0

errors: No known data errors
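
For reference, a pool like this boils down to a single zpool command under the hood. A minimal sketch, assuming the same device names as in the status output above (NexentaStor builds the pool through its GUI, but the result is equivalent):

        zpool create stor \
            raidz2 c0t1d0 c0t2d0 c0t3d0 c1t10d0 c1t5d0 c1t6d0 \
            raidz2 c0t4d0 c0t8d0 c0t6d0 c1t7d0 c1t8d0 c1t9d0 \
            log mirror c0t0d0 c1t4d0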

 

Sure, there is a lot of disk space 'wasted' with this configuration, as I ended up with approximately 3.5TB of usable space from 6TB (12x500GB) of raw disk. But better safe than sorry when it comes to data; in this case the data will be virtual disks for a bunch of servers, and if there is one thing I hate it's sitting in the middle of the night restoring terabytes of data in an emergency (it's a fact that disks never break during daytime).
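
Roughly, the math behind that number (the overhead at the end is an estimate, but it lines up with what the pool reports):

        12 x 500GB raw                        = 6TB
        minus 2 parity disks per raidz2 vdev  = 2 x 4 x 500GB = 4TB of data capacity
        4TB expressed in binary units         = ~3.6TiB
        minus ZFS metadata and reservations   = ~3.5TB usable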

Enough lab testing; everything seemed to work just fine. Down to the datacenter to rack the server so it could be put into pre-production before my summer vacation a couple of days later. I took the four gigabit ports and made a nice LACP aggregate out of them so that no single NFS client can eat all the network bandwidth.
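
On the NexentaStor side the aggregate is just a link aggregation in the underlying Solaris network stack. A rough sketch from the command line, assuming the four ports show up as igb0-igb3 and using a placeholder address (the interface names are a guess, the same thing can be done from the web GUI, and the switch ports need to be configured for LACP as well):

        dladm create-aggr -L active -l igb0 -l igb1 -l igb2 -l igb3 aggr1
        ifconfig aggr1 plumb 10.0.0.10 netmask 255.255.255.0 up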

Made an NFS share, mounted it in our VMware ESXi cluster and transferred some non-critical VMs to the new SAN. And...
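
For the curious, the share and the mount amount to something like this. A sketch with made-up names, paths and addresses; NexentaStor normally handles the share through its GUI, and the datastore can just as well be added from vCenter:

        # on the SAN: create a filesystem and export it read-write to the storage network
        zfs create stor/vmware
        zfs set sharenfs=rw=@10.0.0.0/24,root=@10.0.0.0/24 stor/vmware

        # on an ESXi host: add it as an NFS datastore
        esxcfg-nas -a -o 10.0.0.10 -s /volumes/stor/vmware san01-vmware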

"Damn! That's fast..."

The difference with a proper hardware setup (read: an SSD as ZIL in the zpool) was like night and day compared to our old, SSD-less Sun 7210. The average write latency reported in VMware was in the single digits, compared to the three-digit numbers we were used to.

Edit: a few weeks later we have done even more tests and even transferred some critical systems to the new SAN, and it just works, without a single hiccup so far (knock on wood). Next up is to move all data off the Sun 7210 so we can upgrade its firmware and re-install it with the latest updates.

2 May 2011

Proving a point – building a SAN

A few weeks back I noticed that patching a bunch of virtual machines running Windows felt slower than it used to. Not that patching Windows servers ever felt quick, and there was a big list of patches going onto these systems, but something was not quite right that afternoon. What all these VMs had in common was that they ran off a Sun Storage 7210 SAN that we got a few years back and that has served us well. After some detective work using the awesome DTrace analytics of this box, combined with the ESXi statistics, it was quite obvious that write latency was suffering when patching several VMs in parallel.

As this SAN uses the ZFS filesystem, I knew there had been an option when we bought it to add a "Logzilla": an SSD log device that speeds up synchronous write requests. Since VMware ESXi always does sync writes over NFS, that should solve the write latency problem (well, there wasn't really a problem just yet unless we did heavy random writes in multiple VMs at the same time, but it was a first indication that we would sooner or later have one). VMware provides a troubleshooting guide recommending that average write latency stay below 20ms under load, and let's just say that we saw numbers way higher than that when we patched.
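
If you want to pull the same numbers from the VMware side, esxtop on an ESXi host is the quickest way I know of; exactly which views cover NFS datastores depends on the ESXi version, so treat this as a rough pointer rather than a recipe:

        # on an ESXi host (tech support mode / SSH)
        esxtop
        #   'd' = disk adapter view, 'u' = disk device view
        #   watch the DAVG/cmd and GAVG/cmd columns - device and guest-visible
        #   latency in milliseconds, which should stay below ~20ms under load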

Ok, so I dropped a mail to our Oracle dealer asking for a price on those SSD devices. They phoned back a bit later, sounding really hesitant to give me a quote, telling me that the price had obviously gone up a bit since Oracle bought Sun. Uh-huh, so what was the price then? Over €10000 for an 18GB SSD disk?! Wait? What? So, um, we were expected to pay big for something that, probably, could improve write latency on this SAN.

What to do? Our first step was to figure out whether that "Logzilla" would actually improve the situation if we were to add it (doubtful given the price, but anyway). But how? Why not build a SAN? To the batcave...

Using mostly equipment that we had spare, I managed to get hold of a Cisco UCS C200 M2 server with 4GB RAM and 3 Samsung 1TB consumer-grade desktop SATA disks. We picked up a €150 standard 60GB SSD from the local computer shop (we picked a cheap one that had decent write speed according to the datasheet, probably around 220MB/s). Adding the free version of NexentaStor (SAN appliance software based on Solaris, good stuff) on top of that turned our equipment into something best described as a ghetto-SAN.

I did a next-next-next install using one of the SATA disks as a single system disk and the other two as a mirror with the SSD as a log device, all on the onboard SATA controller in the server. I hooked the SAN up to the storage network using a single gigabit port and used Storage vMotion to move a couple of non-critical but live production VMs to our testing SAN. They have been running there for almost two weeks now in order to gather some real numbers, and the statistics tell us that the average write latency stays well below the 20ms level defined by VMware, mostly at about a tenth of that, 2ms. Throughput more or less maxed out the gigabit connection; we saw speeds around 80-100MB/s.
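
The pool itself was as simple as it gets; something along these lines, where the pool name and device names are placeholders for whatever the onboard controller enumerated:

        zpool create tank mirror c1t1d0 c1t2d0 log c1t3d0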

Hm, yeah, our ghetto-SAN that cost us less than €1000 to put together outperformed our €30000+ SAN from two years back, thanks to a log device for the ZFS filesystem. Sure, the workload maybe wasn't comparable, but we still regard this as proof that ZFS performance improves greatly if you give it a "Logzilla". Still, €10000 for an 18GB SSD disk feels a bit steep as it doesn't add anything else to the SAN. So the next step for us is to build a "real" SAN to use in production and see what performance we can get out of 'proper' hardware.

I'll write more when I have put together what I can get for my limited budget of €4-5000.