Re: [zfs-discuss] L2ARC and poor read performance

2011-06-08 Thread Daniel Carosone
On Wed, Jun 08, 2011 at 11:44:16AM -0700, Marty Scholes wrote:
> And I looked in the source.  My C is a little rusty, yet it appears
> that prefetch items are not stored in L2ARC by default.  Prefetches
> will satisfy a good portion of sequential reads but won't go to
> L2ARC.  

Won't go to L2ARC while they're still speculative reads, maybe.
Once they're actually used by the app to satisfy a good portion of the
actual reads, they'll have hit stats and will go to the L2ARC.

I suspect the problem is the rate limit on L2ARC writes.  Sequential
reads can run much faster than that fill rate, so it can take a long
time for the L2ARC to fill.

You could test by doing slow sequential reads and seeing whether the
L2ARC fills any further when the same reads are spread over a longer
time.
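
One way to check: sample the L2ARC feed counters while a run is in
progress (counter names as per the arc.c of this vintage, so treat this
as a sketch):

  # print l2_write_bytes every 10 seconds; the delta / 10 is the
  # effective L2ARC fill rate in bytes/sec
  kstat -p -m zfs -n arcstats -s l2_write_bytes 10

If that only grows at a few MB/sec while the read streams run at
hundreds of MB/sec, the feed thread simply can't keep up.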

--
Dan.



Re: [zfs-discuss] L2ARC and poor read performance

2011-06-08 Thread Marty Scholes
> This is not a true statement. If the primarycache policy is set to the
> default, all data will be cached in the ARC.

Richard, you know this stuff so well that I am hesitant to disagree with you.  
At the same time, I have seen this myself, trying to load video files into 
L2ARC without success.

> The ARC statistics are nicely documented in arc.c and
> available as kstats.

And I looked in the source.  My C is a little rusty, yet it appears that 
prefetch items are not stored in L2ARC by default.  Prefetches will satisfy a 
good portion of sequential reads but won't go to L2ARC.
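
If I'm reading arc.c right, the switch is l2arc_noprefetch, which defaults to
on (don't cache prefetched buffers). Something like the following should show
it and, if you're brave, flip it; verify the symbol name on your build first:

  # check the current value (a 4-byte boolean_t)
  echo "l2arc_noprefetch/D" | mdb -k

  # allow prefetched buffers into the L2ARC (live change, not persistent)
  echo "l2arc_noprefetch/W 0" | mdb -kw

  # or persistently, in /etc/system
  set zfs:l2arc_noprefetch = 0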


Re: [zfs-discuss] L2ARC and poor read performance

2011-06-08 Thread Richard Elling
On Jun 7, 2011, at 9:12 AM, Phil Harman wrote:

> Ok here's the thing ...
> 
> A customer has some big tier 1 storage, and has presented 24 LUNs (from four 
> RAID6 groups) to an OI148 box which is acting as a kind of iSCSI/FC bridge 
> (using some of the cool features of ZFS along the way). The OI box currently 
> has 32GB configured for the ARC, and 4x 223GB SSDs for L2ARC. It has a dual 
> port QLogic HBA, and is currently configured to do round-robin MPXIO over two 
> 4Gbps links. The iSCSI traffic is over a dual 10Gbps card (rather like the 
> one Sun used to sell).

The ARC is not big enough to hold the L2ARC headers for an L2ARC of that size.
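
Back of the envelope, assuming roughly 200 bytes of ARC header per L2ARC
buffer (the exact size varies by build) and the default 8K volblocksize on
the zvols:

  4 x 223 GB of L2ARC / 8 KB per buffer   ~= 117 million buffers
  117 million buffers x ~200 bytes each   ~= 22-23 GB of ARC

So a full L2ARC of that size would consume most of a 32 GB ARC just for
headers, leaving little room for cached data or metadata.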

> 
> I've just built a fresh pool, and have created 20x 100GB zvols which are 
> mapped to iSCSI clients. I have initialised the first 20GB of each zvol with 
> random data. I've had a lot of success with write performance (e.g. in 
> earlier tests I had 20 parallel streams writing 100GB each at over 600MB/sec 
> aggregate), but read performance is very poor.
> 
> Right now I'm just playing with 20 parallel streams of reads from the first 
> 2GB of each zvol (i.e. 40GB in all). During each run, I see lots of writes to 
> the L2ARC, but less than a quarter the volume of reads. Yet my FC LUNS are 
> hot with 1000s of reads per second. This doesn't change from run to run. Why?

Writes to the L2ARC devices are throttled to 8 or 16 MB/sec. If the L2ARC fill
cannot keep up, the data is unceremoniously evicted.
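
The relevant tunables in arc.c are l2arc_write_max and l2arc_write_boost,
both 8 MB by default (hence 8 or 16 MB/sec). A sketch of how you could raise
them, if you decide that is appropriate (check the names and defaults in the
arc.c of your build first):

  # persistent, in /etc/system (values in bytes; 32 MB shown as an example)
  set zfs:l2arc_write_max = 33554432
  set zfs:l2arc_write_boost = 33554432

  # live, via mdb -kw (0x2000000 = 32 MB); use with care on a production box
  echo "l2arc_write_max/Z 0x2000000" | mdb -kw
  echo "l2arc_write_boost/Z 0x2000000" | mdb -kw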

> Surely 20x 2GB of data (and its associated metadata) will sit nicely in 4x
> 223GB SSDs?

On Jun 7, 2011, at 12:34 PM, Marty Scholes wrote:

> I'll throw out some (possibly bad) ideas.
> 
> Is ARC satisfying the caching needs?  32 GB for ARC should almost cover the 
> 40GB of total reads, suggesting that the L2ARC doesn't add any value for this 
> test.
> 
> Are the SSD devices saturated from an I/O standpoint?  Put another way, can 
> ZFS put data to them fast enough?  If they aren't taking writes fast enough, 
> then maybe they can't effectively load for caching.  Certainly if they are 
> saturated for writes they can't do much for reads.
> 
> Are some of the reads sequential?  Sequential reads don't go to L2ARC.

This is not a true statement. If the primarycache policy is set to the default,
all data will be cached in the ARC.
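
For example (pool/dataset names are illustrative):

  # confirm the cache policy on the zvols; both properties default to 'all'
  zfs get primarycache,secondarycache tank/vol01

  # 'metadata' or 'none' would restrict what gets cached
  zfs set secondarycache=all tank/vol01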

> 
> What does iostat say for the SSD units?  What does arc_summary.pl (maybe 
> spelled differently) say about the ARC / L2ARC usage?  How much of the SSD 
> units are in use as reported in zpool iostat -v?

The ARC statistics are nicely documented in arc.c and available as kstats.
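
For example:

  # all ARC/L2ARC counters
  kstat -p zfs:0:arcstats

  # just the L2ARC ones (exact counter names may differ between builds)
  kstat -p zfs:0:arcstats | egrep 'l2_(hits|misses|size|hdr_size|read_bytes|write_bytes)'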
 -- richard




Re: [zfs-discuss] L2ARC and poor read performance

2011-06-08 Thread Phil Harman

On 08/06/2011 14:35, Marty Scholes wrote:

>>> Are some of the reads sequential?  Sequential reads don't go to L2ARC.
>>
>> That'll be it. I assume the L2ARC is just taking metadata. In situations
>> such as mine, I would quite like the option of routing sequential read
>> data to the L2ARC also.
>
> The good news is that it is almost a certainty that actual iSCSI usage will be
> of a (more) random nature than your tests, suggesting higher L2ARC usage in
> real-world applications.
>
> I'm not sure how ZFS makes the distinction between a random and a sequential
> read, but the more you think about it, not caching sequential requests makes
> sense.

Yes, in most cases, but I can think of some counter examples ;)



Re: [zfs-discuss] L2ARC and poor read performance

2011-06-08 Thread Marty Scholes
>> Are some of the reads sequential?  Sequential reads don't go to L2ARC.
>
> That'll be it. I assume the L2ARC is just taking metadata. In situations
> such as mine, I would quite like the option of routing sequential read
> data to the L2ARC also.

The good news is that it is almost a certainty that actual iSCSI usage will be
of a (more) random nature than your tests, suggesting higher L2ARC usage in
real-world applications.

I'm not sure how ZFS makes the distinction between a random and a sequential
read, but the more you think about it, not caching sequential requests makes
sense.
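
As far as I can tell the distinction is made by the DMU prefetcher
(dmu_zfetch.c), which recognises sequential and strided access streams;
its counters are exposed as kstats too, e.g. (names per the code of this
era):

  kstat -p zfs:0:zfetchstats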


Re: [zfs-discuss] L2ARC and poor read performance

2011-06-07 Thread Phil Harman

On 07/06/2011 22:57, LaoTsao wrote:
> You have an unbalanced setup:
> FC at 4Gbps vs a 10Gbps NIC.

It's actually 2x 4Gbps (using MPXIO) vs 1x 10Gbps.

> After 8b/10b encoding it is even worse, but this does not impact your
> benchmark yet.
>
> Sent from my iPad
> Hung-Sheng Tsao (LaoTsao) Ph.D
>
> On Jun 7, 2011, at 5:46 PM, Phil Harman  wrote:
>
>> On 07/06/2011 20:34, Marty Scholes wrote:
>>> I'll throw out some (possibly bad) ideas.
>>
>> Thanks for taking the time.
>>
>>> Is ARC satisfying the caching needs?  32 GB for ARC should almost cover the
>>> 40GB of total reads, suggesting that the L2ARC doesn't add any value for
>>> this test.
>>>
>>> Are the SSD devices saturated from an I/O standpoint?  Put another way, can
>>> ZFS put data to them fast enough?  If they aren't taking writes fast enough,
>>> then maybe they can't effectively load for caching.  Certainly if they are
>>> saturated for writes they can't do much for reads.
>>
>> The SSDs are barely ticking over, and can deliver almost as much throughput
>> as the current SAN storage.
>>
>>> Are some of the reads sequential?  Sequential reads don't go to L2ARC.
>>
>> That'll be it. I assume the L2ARC is just taking metadata. In situations such
>> as mine, I would quite like the option of routing sequential read data to the
>> L2ARC also.
>>
>> I do notice a benefit with a sequential update (i.e. COW for each block), and
>> I think this is because the L2ARC satisfies most of the metadata reads
>> instead of having to read them from the SAN.
>>
>>> What does iostat say for the SSD units?  What does arc_summary.pl (maybe
>>> spelled differently) say about the ARC / L2ARC usage?  How much of the SSD
>>> units are in use as reported in zpool iostat -v?




Re: [zfs-discuss] L2ARC and poor read performance

2011-06-07 Thread LaoTsao
You have an unbalanced setup:
FC at 4Gbps vs a 10Gbps NIC.
After 8b/10b encoding it is even worse, but this does not impact your benchmark
yet.

Sent from my iPad
Hung-Sheng Tsao ( LaoTsao) Ph.D

On Jun 7, 2011, at 5:46 PM, Phil Harman  wrote:

> On 07/06/2011 20:34, Marty Scholes wrote:
>> I'll throw out some (possibly bad) ideas.
> 
> Thanks for taking the time.
> 
>> Is ARC satisfying the caching needs?  32 GB for ARC should almost cover the 
>> 40GB of total reads, suggesting that the L2ARC doesn't add any value for 
>> this test.
>> 
>> Are the SSD devices saturated from an I/O standpoint?  Put another way, can 
>> ZFS put data to them fast enough?  If they aren't taking writes fast enough, 
>> then maybe they can't effectively load for caching.  Certainly if they are 
>> saturated for writes they can't do much for reads.
> 
> The SSDs are barely ticking over, and can deliver almost as much throughput 
> as the current SAN storage.
> 
>> Are some of the reads sequential?  Sequential reads don't go to L2ARC.
> 
> That'll be it. I assume the L2ARC is just taking metadata. In situations such 
> as mine, I would quite like the option of routing sequential read data to the 
> L2ARC also.
> 
> I do notice a benefit with a sequential update (i.e. COW for each block), and 
> I think this is because the L2ARC satisfies most of the metadata reads 
> instead of having to read them from the SAN.
> 
>> What does iostat say for the SSD units?  What does arc_summary.pl (maybe 
>> spelled differently) say about the ARC / L2ARC usage?  How much of the SSD 
>> units are in use as reported in zpool iostat -v?


Re: [zfs-discuss] L2ARC and poor read performance

2011-06-07 Thread Phil Harman

On 07/06/2011 20:34, Marty Scholes wrote:
> I'll throw out some (possibly bad) ideas.

Thanks for taking the time.

> Is ARC satisfying the caching needs?  32 GB for ARC should almost cover the
> 40GB of total reads, suggesting that the L2ARC doesn't add any value for this
> test.
>
> Are the SSD devices saturated from an I/O standpoint?  Put another way, can ZFS
> put data to them fast enough?  If they aren't taking writes fast enough, then
> maybe they can't effectively load for caching.  Certainly if they are saturated
> for writes they can't do much for reads.

The SSDs are barely ticking over, and can deliver almost as much
throughput as the current SAN storage.

> Are some of the reads sequential?  Sequential reads don't go to L2ARC.

That'll be it. I assume the L2ARC is just taking metadata. In situations
such as mine, I would quite like the option of routing sequential read
data to the L2ARC also.

I do notice a benefit with a sequential update (i.e. COW for each
block), and I think this is because the L2ARC satisfies most of the
metadata reads instead of having to read them from the SAN.

> What does iostat say for the SSD units?  What does arc_summary.pl (maybe
> spelled differently) say about the ARC / L2ARC usage?  How much of the SSD
> units are in use as reported in zpool iostat -v?




Re: [zfs-discuss] L2ARC and poor read performance

2011-06-07 Thread Marty Scholes
I'll throw out some (possibly bad) ideas.

Is ARC satisfying the caching needs?  32 GB for ARC should almost cover the 
40GB of total reads, suggesting that the L2ARC doesn't add any value for this 
test.

Are the SSD devices saturated from an I/O standpoint?  Put another way, can ZFS 
put data to them fast enough?  If they aren't taking writes fast enough, then 
maybe they can't effectively load for caching.  Certainly if they are saturated 
for writes they can't do much for reads.

Are some of the reads sequential?  Sequential reads don't go to L2ARC.

What does iostat say for the SSD units?  What does arc_summary.pl (maybe 
spelled differently) say about the ARC / L2ARC usage?  How much of the SSD 
units are in use as reported in zpool iostat -v?
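
For reference, the sort of output I have in mind (pool name is illustrative):

  # per-device service times and %busy, including the SSDs
  iostat -xn 5

  # per-vdev activity, with the cache devices listed at the bottom
  zpool iostat -v tank 5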


[zfs-discuss] L2ARC and poor read performance

2011-06-07 Thread Phil Harman

Ok here's the thing ...

A customer has some big tier 1 storage, and has presented 24 LUNs (from 
four RAID6 groups) to an OI148 box which is acting as a kind of iSCSI/FC 
bridge (using some of the cool features of ZFS along the way). The OI 
box currently has 32GB configured for the ARC, and 4x 223GB SSDs for 
L2ARC. It has a dual port QLogic HBA, and is currently configured to do 
round-robin MPXIO over two 4Gbps links. The iSCSI traffic is over a dual 
10Gbps card (rather like the one Sun used to sell).


I've just built a fresh pool, and have created 20x 100GB zvols which are 
mapped to iSCSI clients. I have initialised the first 20GB of each zvol 
with random data. I've had a lot of success with write performance (e.g. 
in earlier tests I had 20 parallel streams writing 100GB each at over 
600MB/sec aggregate), but read performance is very poor.


Right now I'm just playing with 20 parallel streams of reads from the 
first 2GB of each zvol (i.e. 40GB in all). During each run, I see lots 
of writes to the L2ARC, but less than a quarter the volume of reads. Yet 
my FC LUNS are hot with 1000s of reads per second. This doesn't change 
from run to run. Why?


Surely 20x 2GB of data (and its associated metadata) will sit nicely in
4x 223GB SSDs?


Phil