Hello,
 
 I'm on FreeBSD 9 with ZFS v28, and it's possible this combination is causing 
my issue, but I thought I'd start here first and will cross-post to the FreeBSD 
ZFS threads if the Solaris crowd thinks this is a FreeBSD problem.

 The issue: From carefully watching my ARC/L2ARC size and activity when I set 
primarycache=metadata and secondarycache=all, the secondary cache isn't acting 
like "all"; it's caching only metadata.

I've read a lot about how the ARC/L2ARC work, and I know that the L2ARC is 
filled by scanning the ARC for soon-to-expire buffers and copying them to the 
L2ARC.
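
(As an aside, these are the L2ARC feed-thread knobs I can see on the FreeBSD 
side. I've left them at their defaults, and I'm only assuming they're the 
relevant ones:)

    # per-interval write limit to the cache devices
    sysctl vfs.zfs.l2arc_write_max vfs.zfs.l2arc_write_boost
    # how far past the ARC tail the feed thread scans, and how often
    sysctl vfs.zfs.l2arc_headroom vfs.zfs.l2arc_feed_secs
    # 1 = prefetched (streaming) buffers are not fed to the L2ARC
    sysctl vfs.zfs.l2arc_noprefetch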

Which brings me to my first question:

Q1 - Doesn't this behavior mean that the L2ARC can never receive data objects 
if the ARC doesn't hold them? Is setting primarycache to metadata and 
secondarycache to all an impossible request?

I believe the user data still passes through the ARC; it's just not retained 
when primarycache=metadata. 


I should back up and explain what I have, and what I'm trying to do. 

I'm running a ZFS server with 24 GB of RAM and a 20 TB pool, roughly 12 TB 
full.  It's serving NFS to an ESX server for my various VMs, and it's running 
great, albeit a bit slow. I have four 120 GB SSDs as my L2ARC.
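
(The SSDs are plain cache vdevs; something like the following is how they get 
added and how I watch them fill. The device names are placeholders:)

    # da1..da4 stand in for my four 120 GB SSDs
    zpool add tank cache da1 da2 da3 da4

    # the per-device alloc column shows how full each L2ARC device is
    zpool iostat -v tank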

I'm estimating my deduplication table (DDT) is well over 24 GB, and with the 
default primary and secondary cache settings of "all" there is a lot of churn 
in both the ARC and L2ARC. I can see it on my L2ARC SSDs: they all fill up 
within a day and stay chock full as the server continues to run 50+ VMs.  
However, there is so much user data being accessed that it keeps constant 
pressure on the metadata, and eventually the user data flowing in erodes most 
of the metadata. 
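
(For anyone who wants to check my math: a quick way to size the DDT is zdb 
plus the oft-quoted ~320 bytes per in-core entry. A rough sketch, with 
illustrative numbers:)

    # DDT histogram; the summary reports total entries and the
    # in-core / on-disk size per entry
    zdb -DD tank

    # back-of-envelope: ~12 TB of unique data at a 128K average block size
    #   12 TB / 128 KB            ~= 100 million unique blocks
    #   100M entries x ~320 bytes ~= 30+ GB of DDT in core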

I'm trying to keep my DDT as available as possible without putting more RAM in 
this server. If I can dedicate my ARC to holding as much of the DDT as 
possible, then use my L2ARC for any DDT overflow, and only lastly for data, 
I'd be happy with the performance.  I understand there's a performance hit 
here compared to RAM. 

When I set primarycache and secondarycache both to metadata, I find that my 
L2ARC drives fill up to around 20 GB each, for a total of 80 GB of L2ARC, and 
it won't go any further than that, no matter how hard I beat on it. 

However, when I set primarycache=metadata and secondarycache=all, I see the 
same behavior as setting both to metadata: my L2ARC devices don't budge past 
20 GB each.
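
For what it's worth, here's how I'm watching the plateau, beyond just zpool 
iostat (sysctl names as they appear on my FreeBSD box):

    # L2ARC payload size and the in-RAM headers that track it
    sysctl kstat.zfs.misc.arcstats.l2_size
    sysctl kstat.zfs.misc.arcstats.l2_hdr_size

    # cumulative bytes written to the cache devices
    sysctl kstat.zfs.misc.arcstats.l2_write_bytes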

I'm familiar with arc_meta_used and arc_meta_limit. I've increased my 
arc_meta_limit to 75% of my ARC, since I'm not interested in caching data in 
the ARC. I can watch arc_meta_used drop when I set primarycache=all, due to 
the pressure of data objects pushing out the metadata, and setting 
primarycache=metadata lets arc_meta_used grow back to its limit. 
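
Concretely, this is the sort of tuning I have in place (the sizes below are 
placeholders, and I'm assuming arc_meta_limit is a boot-time tunable on 
FreeBSD 9):

    # /boot/loader.conf
    vfs.zfs.arc_max="22G"
    vfs.zfs.arc_meta_limit="16G"    # roughly 75% of the ARC

    # at runtime, compare metadata usage against the limit
    sysctl vfs.zfs.arc_meta_used vfs.zfs.arc_meta_limit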

Q2 is basically: what's the best way to keep as much of the DDT as possible 
live in the ARC and L2ARC without buying more RAM? 

Hopefully I won't be dismissed with "just buy more RAM", which I can see as a 
valid point in a lot of situations where people are running 4 or 8 GB systems 
and trying to access 12+ TB of data with dedup enabled. 

Keeping my RAM at 24 GB isn't just a budget constraint; it's also the maximum 
RAM you can fit in most workstation boards these days, and it's the better 
direction for energy usage and heat generation, as my 120 GB SSDs burn a 
fraction of the power that another 24 GB of DDR3 would.  

Even if I added more RAM, I'd still have this problem. Say I had 196 GB of RAM 
in this server and wanted to dedicate the ARC to metadata and the L2ARC to 
user data: from my experiments, this wouldn't work. As I keep scaling up this 
server, user data pouring through the cache system will keep eroding the 
metadata. 

I think the ultimate would be cache priority: set metadata as the most 
important, and user data as secondary, so that metadata is never evicted from 
the ARC/L2ARC in favor of user data. However, I don't think this is possible 
today, and I'm unsure whether it's on the drawing board. 

Your input is appreciated. 

Thanks. 
