Re: [Pvfs2-developers] server crash on startup with millions of files

Sam Lang Fri, 23 Feb 2007 09:44:25 -0800


Hi Phil,

Cool. It might help to increase that MAX_NUM_VERIFY_HANDLE_COUNT value in trove-handle-mgmt.h to something even bigger than 4096. I'm guessing your SAN will give you the best performance when you're doing reads in the megabytes range, so putting it as high as 16000 might be ok.

Given these results it might make sense to make similar changes to the keyval iterate code as well. Even with a max of 32 entries at once, we're still likely to get some improvement.


-sam

On Feb 23, 2007, at 11:23 AM, Phil Carns wrote:

Ok, I have tried several iterations both with and without these patches. The test system is again using a SAN, this time with a dataspace_attributes.db file of about 451 MB on a particular server. I'm not sure how many files are on the file system; I just cranked out files on it until the db file looked big enough to get good measurements on the startup time. I was able to turn on the "trove,server" logging mask along with the "usec" timestamp to see the scan time on both versions without any logging occurring during the actual scan itself.
For example:
[D 10:00:46.541646] dbpf collection 752900094 - Setting collection handle ranges to 4-536870914,4294967292-4831838202
[D 10:04:19.414723] dbpf collection 752900094 - Setting HIGH_WATERMARK to -1
If I unmount between each server start, the original version takes an average of 3 minutes, 17 seconds to complete the scan.
The patched version takes an average of 2 minutes, 22 seconds to complete the same scan.
This is definitely a big improvement, almost 30% in my test case.
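
The scan time and the percentage can be checked directly from the usec-stamped log lines. A minimal Python sketch, assuming the `[D HH:MM:SS.ffffff]` prefix shown in the example above and that both lines were written on the same day; the helper names are illustrative and not part of PVFS2:

```python
from datetime import datetime

def stamp(line):
    """Parse the '[D HH:MM:SS.ffffff]' prefix (format assumed from the example log)."""
    text = line[line.index("[D ") + 3 : line.index("]")]
    return datetime.strptime(text, "%H:%M:%S.%f")

def scan_seconds(first_line, last_line):
    """Elapsed seconds between two log lines (no midnight rollover handling)."""
    return (stamp(last_line) - stamp(first_line)).total_seconds()

def improvement_percent(before_s, after_s):
    """Reduction in scan time relative to the unpatched run."""
    return 100.0 * (before_s - after_s) / before_s

first = ("[D 10:00:46.541646] dbpf collection 752900094 - Setting collection "
         "handle ranges to 4-536870914,4294967292-4831838202")
last = ("[D 10:04:19.414723] dbpf collection 752900094 - Setting "
        "HIGH_WATERMARK to -1")

print("scan took %.3f s" % scan_seconds(first, last))

# Averages quoted above: 3 min 17 s unpatched, 2 min 22 s patched.
print("improvement: %.1f%%" % improvement_percent(3 * 60 + 17, 2 * 60 + 22))
```

With the quoted averages (197 s and 142 s) the reduction works out to roughly 28%.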

-Phil

Phil Carns wrote:
Thanks Sam!  We will give these patches a try and report back.
-Phil
Sam Lang wrote:
Hi Phil,
Attached mult.patch implements iterating over the dspace db using DB_MULTIPLE_KEY. This may allow for the db get call to do larger reads from your SAN. I was seeing slightly better performance with local disk after creating 20K files in a fresh storage space. Doing strace doesn't show fewer mmaps or larger reads though, so I'm not sure how berkeley db pulls in its pages. Anyway, if it helps improve performance for you guys, I can clean it up a bit and commit it. I don't think anything uses dspace_iterate_handles besides that ledger handle management code.
You can fiddle the MAX_NUM_VERIFY_HANDLE_COUNT value to set how many handles to get at a time. Right now it's set to 4096. Keep in mind that this requires a much larger buffer allocated in dbpf_dspace_iterate_handles_op_svc, since we have to get keys and values, so essentially we do a get with a buffer that's 4096 * (sizeof(handle) + sizeof(stored_attr)), which ends up being about 300K.
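
To see how the buffer grows with MAX_NUM_VERIFY_HANDLE_COUNT, here is a rough Python sketch of the arithmetic. The two sizes are assumptions picked only so that 4096 entries land near the "about 300K" figure; check sizeof() on the real types before relying on them. The rounding reflects the Berkeley DB documentation's requirement that bulk-get buffers be a multiple of 1024 bytes, and the sketch ignores any bookkeeping Berkeley DB keeps inside the buffer:

```python
HANDLE_SIZE = 8        # assumption: 64-bit handle
STORED_ATTR_SIZE = 64  # assumption: placeholder, not measured from the source

def bulk_buffer_bytes(count, key_size, value_size, align=1024):
    """Bytes needed for `count` key/value pairs, rounded up to a multiple of `align`."""
    raw = count * (key_size + value_size)
    return -(-raw // align) * align  # ceiling division, then scale back up

# 4096 is the value in the patch; 16000 is the value suggested elsewhere in this thread.
for count in (4096, 16000):
    size = bulk_buffer_bytes(count, HANDLE_SIZE, STORED_ATTR_SIZE)
    print("%6d handles -> %8d bytes (%.0f KiB)" % (count, size, size / 1024))
```

Under these assumed sizes, 16000 handles per get needs a buffer a little over 1 MB, which is the megabyte range discussed for the SAN.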
I also attached a patch (server-start.patch) that prints out the start message as well as a ready message after server initialization has completed. If you set the Logstamp to usec, you'll be able to see the time it takes to initialize the server. Also, this might help in knowing when you can mount the clients, although, hopefully at some point we'll be able to add the zero-conf stuff and then we can return EAGAIN or something.
I'm not sure it's time to replace the ledger code. It seems to work ok, and to fix the slowness you're seeing would mean switching to some kind of range tree that could be serialized to disk so that we wouldn't have to iterate through the entire dspace db on startup. That opens up the possibility of the dspace db and the ledger-on-disk getting out of sync, which I'd rather avoid.
We could hand out new handles by choosing one randomly, and then checking if it's in the DB, getting rid of the need for a ledger entirely, but I assume this idea was already scratched to avoid the potential costs at creation time, especially as the filesystem grows.
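
The random-handle idea can be sketched as below. This is only an illustration in Python, not PVFS2 code: a set stands in for the dspace db, and `allocate_handle` with its parameters is a hypothetical name. Each probe would be a db lookup at create time, and the expected number of probes is 1 / (1 - fraction of the range in use), which is the cost that grows as the file system fills:

```python
import random

def allocate_handle(used, low, high, rng=random, max_tries=1000):
    """Pick random handles in [low, high] until one is not in `used`.

    `used` stands in for the dspace db. Returns (handle, probes).
    """
    for probes in range(1, max_tries + 1):
        candidate = rng.randint(low, high)
        if candidate not in used:   # one db lookup per probe in a real server
            used.add(candidate)
            return candidate, probes
    raise RuntimeError("no free handle found; the range may be nearly full")

# Handle range taken from the example log line in this thread.
used_handles = set()
handle, probes = allocate_handle(used_handles, 4, 536870914)
print("allocated handle %d after %d probe(s)" % (handle, probes))
```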
-sam



On Feb 20, 2007, at 11:23 AM, Phil Carns wrote:
Robert Latham wrote:
On Tue, Feb 20, 2007 at 07:29:16AM -0500, Phil Carns wrote:
Oh, and one other detail; the memory usage of the servers looks fine during startup, so this doesn't appear to be a memory leak. There is quite a bit of CPU work, but I am guessing that is just berkeley db keeping busy in the iteration function.
How long does it take to scan 1.4 million files on startup?
==rob
That's an interesting issue :)

A few observations:
- we were looking at this on SAN; the results may be different on local disks
- the db files are on the order of 500 MB for this particular setup
- the time to scan varies depending on if the db files are hot in the Linux buffer cache
If we start the daemon right after killing another one that just did the same scan, then the process is CPU intensive, but fast (about 5 seconds). If we unmount/mount the SAN between the two runs so that the buffer cache is cleared, then it is very slow (about 5 minutes).
An interesting trick is to use dd with a healthy buffer size to read the .db files and throw the output into /dev/null before starting the servers. This only takes a few seconds, and makes it so that the scan consistently finishes in just a few seconds as well. I think the reason is just that it forces the db data into the Linux buffer cache using an efficient access pattern so that berkeley db doesn't have to wait on disk latency for whatever small accesses it is performing.
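
The same warm-up can be done without dd. A minimal Python sketch that reads a file sequentially in large blocks and discards the data; the 4 MiB block size is an assumption standing in for "a healthy buffer size", and the path in the usage comment is hypothetical:

```python
def warm_cache(path, block_size=4 * 1024 * 1024):
    """Read `path` front to back in large blocks and discard the data.

    Roughly what `dd if=FILE of=/dev/null bs=4M` does: the point is only
    to pull the file into the OS buffer cache. Returns bytes read.
    """
    total = 0
    with open(path, "rb", buffering=0) as f:   # unbuffered: one read() per block
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total

# Usage (hypothetical path; run before starting the server):
#   warm_cache("/path/to/storage-space/dataspace_attributes.db")
```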
This seems to indicate that berkeley db's access pattern generated by PVFS2 for this case isn't very friendly, at least to SANs that aren't specifically tuned for it.
The 5 minute scan time is a problem, because it makes it hard to tell when you will actually be able to mount the file system after the daemons appear to have started. We would be happy to try out any optimizations here :)
-Phil

_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
