On 03/26/2013 06:44 PM, Franz Schober wrote:
> Hi thanks all for your input,

Hi Franz,

Looking at your lgrpinfo below, it seems your machine's topology was
correctly mapped out - not all that much of a surprise, considering the
legacy of systems like the E10k and friends.

Regarding your low throughput, I see that you're doing ZFS on the
initiator as well, i.e. the zvol on the target gets exported as a LUN to
the initiator, which itself creates a ZFS filesystem on it and writes to
it.

Can you try having the client write to the LUN directly? That is, not
writing through a "second" ZFS layer (even though it lives on another
host). It's possible we're looking at ZFS-over-FC suckiness here.

> I want to clarify my intentions and try to give answers to the
> questions/suggestions below.
> 
> We want to test ZFS throughput over Fibre Channel/COMSTAR
> on two 32-core quad-socket systems, one target and one initiator.
> 
> Therefore I created a zpool with 2 x 6 disks (1GB SAS) in RAIDZ2 plus a
> ZeusRAM ZIL on the FC target (2x Emulex LP12002 / 8 Gbit) running
> OmniOS stable. The initiator is another system with identical hardware
> and software; they are interconnected over a zoned Cisco fabric switch
> (9148).
> 
> 1) The local dd performance on the target is around 750 MB/s and is ok
> for me.
> 
>  time dd if=/dev/zero of=/jbod_a1/largefile bs=128k count=64k
>  65536+0 records in 65536+0 records out
>  8589934592 bytes (8,6 GB) copied, 11,0914 s, 774 MB/s
> 
> 2) Then exporting the largefile/zvol (tested both, same perf.)
> as a LUN over 2 x 8 Gbit FC links and creating a zpool on it gives
> around 189 MB/s in the following dd test at the initiator,
> which is not ok for me.
> 
>  zpool create fcvol1 c0t600144F00E7FCC0000005151C9A50003d0
> 
>  time dd if=/dev/zero of=/fcvol1/file1 bs=128k count=64k
>  65536+0 records in 65536+0 records out
>  8589934592 bytes (8,6 GB) copied, 45,3851 s, 189 MB/s

Here, just try doing:

# dd if=/dev/zero of=/dev/rdsk/c0t600144F00E7FCC0000005151C9A50003d0s0 \
  bs=128k count=64k

Also, monitor mpstat on both the initiator and the target to make sure
you're not hitting a CPU bottleneck. If you have some sort of
raw-throughput testing tool for FC, it would be great to verify that the
FC link between initiator and target is capable of doing more than
2 Gbit/s.
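
For example, purely illustrative invocations (pick whatever interval you
like):

# mpstat 5

on both boxes, keeping an eye on the usr/sys/idl and xcal columns, and

# fcinfo hba-port | grep -i speed

on each host to confirm the link speed the HBA ports actually
negotiated.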

Next, have a look at iostat on the target and check how many sync IO
operations are being issued, e.g. via Richard Elling's zilstat:
http://www.richardelling.com/Home/scripts-and-programs-1/zilstat
200 MB/s of sync IO is quite a lot. Possibly try disabling the ZIL as
well; that should tell you right away whether this is your pain point.
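
A rough sketch of that, assuming your pool is indeed called jbod_a1
(going by the dd path above) and that your copy of zilstat takes the
usual interval argument:

# iostat -xn 5
# ./zilstat 5

For the ZIL test itself, the sync property is the easiest knob - just
remember to flip it back afterwards, since it changes the semantics of
synchronous writes:

# zfs set sync=disabled jbod_a1
  (rerun the dd from the initiator)
# zfs set sync=standard jbod_a1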

Another test that should provide some more clues is having multiple
independent initiators issue IO at the same time. If throughput scales
nearly linearly (e.g. two initiators getting close to 400 MB/s and four
initiators nearly 800 MB/s), then we're looking at a client-side
pathology; if, on the other hand, all of them together still total
around 200 MB/s, then the server side is likely the bottleneck (or the
server's FC link, but you can check FC link speed and utilization pretty
accurately on the FC switch).
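
A quick, purely illustrative way to run that (the device name below is a
placeholder - each initiator would of course use its own LUN): on every
initiator, at roughly the same time, run

# dd if=/dev/zero of=/dev/rdsk/<that-initiator's-LUN>s0 bs=128k count=64k

and then add up the MB/s figures the individual dd runs report.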

These should provide important clues as to what's going on.

> 3) Eliminating the zpool disks on the target side by replacing them
> with a RAM disk or a file in /tmp is not a good idea, as I learned
> through the discussion. The observation that a dd test to tmpfs
> without FC is much slower than on all my other systems is still
> strange to me.
> 
> Another observation I made is that disabling Hyper-Threading gave a
> performance increase of about 20% with the dd in tmpfs.

We're probably looking at lock contention here, or some suboptimal part
of the tmpfs implementation (perhaps excessive cross-calls due to
inefficient kernel memory manipulation in tmpfs). At this point I
wouldn't worry about it too much; it's the local ZFS vs. COMSTAR FC
performance discrepancy that we need to chase.
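
If you do want to poke at it later, two quick checks (illustrative only)
while the tmpfs dd is running:

# mpstat 1

where large xcal and smtx counts point at cross-calls and mutex
contention, and

# lockstat sleep 10
# lockstat -kIW -D 10 sleep 10

which summarize lock contention and the top kernel CPU hotspots over a
ten-second window.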

> My next steps would be firmware and driver updates of the LPE-12002
> interface cards, then eliminating
> the fabric switch and directly connecting the ports, then trying to use
> other tools to investigate the problem.
> I would be very glad for any suggestions/help.

I highly doubt firmware upgrades or removing the fabric switch will help
here. Your link speed doesn't appear to be the limiting factor - 8 Gb FC
vs. ~2 Gbit/s of actual throughput is just too wide a discrepancy.

Hope we can get to the bottom of this.

Cheers,
--
Saso


