In article <[EMAIL PROTECTED]> you wrote:
> In short, I'm distributing logs in realtime for about 600,000
> websites. The sources of the logs (http, ftp, realmedia, etc.) are
> flexible; however, the base framework was built around a large
> cluster of webservers. The output can be to several hundred thousand
> files across about two dozen filers for user consumption - some can
> be very active, some can be completely inactive.
Assuming you have multiple request log summary files, I would just run
multiple "splitters" (a rough sketch of what I mean is appended below).

> You can certainly open the file, but not block on the call to do it.
> What confuses me is why the kernel would "block" for 415ms on an open
> call. That's an eternity to suspend a process that has to distribute
> data such as this.

Because it has to, in order to return the result with the given API.
But if you had an async interface, the operation would still take that
long, and your throughput would still be limited by the opens/sec your
filers support, no? (The second sketch below shows the usual userspace
workaround and what it does and does not buy you.)

> Except I can't very well keep 600,000 files open over NFS. :) Pool
> and queue, and cycle through the pool. I've managed to achieve a
> balance in my production deployment with this method - my email was
> more of a rant after months of trying to work around a problem
> (caused by a limitation in system calls).

That matches how I picture it (third sketch below). I agree that a
unified async layer is nice from the programmer's POV, but I disagree
that it would help your performance problem, which is caused by NFS
and/or NetApp (and I won't blame them).

Regards,
Bernd
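First, the splitter idea made concrete - a rough sketch of mine, not
your code. Input is assumed to be one record per line, prefixed with
its target path; each record is routed to one of NSPLIT splitter
children by hashing the path, so a slow filer only ever stalls one
pipeline. The ./splitter child binary is a placeholder:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NSPLIT 8                        /* number of splitter children */

static unsigned hash(const char *s, size_t n)   /* djb2 string hash */
{
        unsigned h = 5381;
        while (n--)
                h = h * 33 + (unsigned char)*s++;
        return h;
}

int main(void)
{
        FILE *out[NSPLIT];
        char line[8192];
        int fds[2], i;

        for (i = 0; i < NSPLIT; i++) {
                pipe(fds);
                if (fork() == 0) {              /* child: one splitter */
                        dup2(fds[0], 0);        /* records arrive on stdin */
                        close(fds[0]);
                        close(fds[1]);
                        execl("./splitter", "splitter", (char *)0);
                        _exit(1);               /* placeholder binary */
                }
                close(fds[0]);
                out[i] = fdopen(fds[1], "w");
        }
        while (fgets(line, sizeof(line), stdin)) {
                size_t keylen = strcspn(line, " ");  /* hash the path only */
                i = hash(line, keylen) % NSPLIT;
                fputs(line, out[i]);
                fflush(out[i]);         /* realtime: don't sit in stdio */
        }
        return 0;
}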
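Second, "open without blocking": since the syscall API is synchronous,
the usual userspace workaround is to push open() into a worker thread.
A minimal sketch, one detached thread per request for brevity - a real
version would use a bounded thread pool and a queue. Note that the
distribution loop is no longer suspended for 415ms, but the filer still
serves the same opens/sec, which was my point:

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct open_req {
        char path[1024];
        char data[4096];        /* record to write once the open is done */
};

static void *open_worker(void *arg)
{
        struct open_req *req = arg;
        int fd = open(req->path, O_WRONLY | O_CREAT | O_APPEND, 0644);

        if (fd >= 0) {          /* this may block for hundreds of ms */
                write(fd, req->data, strlen(req->data));
                close(fd);
        }
        free(req);
        return NULL;
}

/* Returns at once; the blocking open() happens off the main loop. */
void deliver_async(const char *path, const char *data)
{
        struct open_req *req = malloc(sizeof(*req));
        pthread_t t;

        snprintf(req->path, sizeof(req->path), "%s", path);
        snprintf(req->data, sizeof(req->data), "%s", data);
        pthread_create(&t, NULL, open_worker, req);
        pthread_detach(&t);
}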
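Third, my picture of your pool-and-cycle scheme, again just a sketch:
keep a bounded pool of open descriptors and recycle the least-recently-
used slot, so hot files pay the NFS open() cost once and the pool stays
under the per-process fd limit. POOL_SIZE is an assumed tunable, and a
production version would replace the linear scan with a hash table:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define POOL_SIZE 1024          /* well under the per-process fd limit */

static struct slot {
        char path[1024];
        int fd;                 /* -1 while the slot is empty */
        unsigned long stamp;    /* logical clock for LRU */
} pool[POOL_SIZE];
static unsigned long now;

int pooled_fd(const char *path)
{
        static int ready;
        struct slot *victim;
        int i;

        if (!ready) {           /* mark all slots empty on first use */
                for (i = 0; i < POOL_SIZE; i++)
                        pool[i].fd = -1;
                ready = 1;
        }
        victim = &pool[0];
        for (i = 0; i < POOL_SIZE; i++) {
                if (pool[i].fd >= 0 && !strcmp(pool[i].path, path)) {
                        pool[i].stamp = ++now;          /* cache hit */
                        return pool[i].fd;
                }
                if (pool[i].stamp < victim->stamp)
                        victim = &pool[i];              /* track LRU */
        }
        if (victim->fd >= 0)
                close(victim->fd);      /* recycle the oldest slot */
        victim->fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
        snprintf(victim->path, sizeof(victim->path), "%s", path);
        victim->stamp = ++now;
        return victim->fd;
}

Every write then goes through pooled_fd() - write(pooled_fd(p), buf, n)
- and a completely inactive file's descriptor quietly ages out of the
pool instead of holding one of your 600,000 files open forever.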