In article <[EMAIL PROTECTED]> you wrote:
> In short, I'm distributing logs in realtime for about 600,000
> websites.  The sources of the logs (http, ftp, realmedia, etc) are
> flexible, however the base framework was build around a large cluster
> of webservers.  The output can be to several hundred thousand files
> across about two dozen filers for user consumption - some can be very
> active, some can be completely inactive.

Assuming you have multiple request-log summary files, I would just run
multiple "splitters".

> You can certainly open the file, but not block on the call to do it.
> What confuses me is why the kernel would "block" for 415ms on an open
> call.  That's an eternity to suspend a process that has to distribute
> data such as this.

Because it has to: with the given API, open(2) cannot return until the
result is known.

But even if you had an async interface, the operation would still take that
long, and your throughput would still be limited by the opens/sec your
filers support, wouldn't it?
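Since open(2) on a regular file has no non-blocking variant (O_NONBLOCK does
not cover the NFS round-trip), the usual workaround is to hand the open to a
worker thread so only that thread sleeps, not the distribution loop.  A
minimal sketch with POSIX threads; the struct and function names here are
illustrative, not from the original setup:

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/stat.h>

/* Hypothetical request structure: path/flags in, fd out. */
struct open_req {
    const char *path;
    int flags;
    mode_t mode;
    int fd;            /* result: a valid fd, or -1 on error */
};

static void *open_worker(void *arg)
{
    struct open_req *req = arg;
    /* This open() may sleep for the full NFS round-trip, but only
     * this worker thread is suspended, not the main loop. */
    req->fd = open(req->path, req->flags, req->mode);
    return NULL;
}

/* Start the open in the background; returns the worker's thread id
 * so the caller joins only once it actually needs the fd. */
pthread_t open_async(struct open_req *req)
{
    pthread_t tid;
    req->fd = -1;
    pthread_create(&tid, NULL, open_worker, req);
    return tid;
}
```

The caller can keep distributing log lines for already-open files and join
the worker only when the new descriptor is actually needed.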

> Except I can't very well keep 600,000 files open over NFS.  :)  Pool
> and queue, and cycle through the pool.  I've managed to achieve a
> balance in my production deployment with this method - my email was
> more of a rant after months of trying to work around a problem (caused
> by a limitation in system calls),

I agree that a unified async layer is nice from the programmer's POV, but I
disagree that it would help your performance problem, which is caused by NFS
and/or NetApp (and I won't blame them).
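The "pool and queue, and cycle through the pool" approach quoted above can
be sketched as a small round-robin descriptor cache: hits cost nothing,
misses pay one open(2) and evict the oldest slot.  All sizes and names below
are illustrative, not from the original deployment:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define POOL_SIZE    4        /* tiny for illustration */
#define PATH_MAX_LEN 256

struct fd_slot {
    char path[PATH_MAX_LEN];
    int fd;                   /* -1 when the slot is empty */
};

static struct fd_slot pool[POOL_SIZE];
static int next_victim;       /* round-robin eviction cursor */

void pool_init(void)
{
    for (int i = 0; i < POOL_SIZE; i++)
        pool[i].fd = -1;
    next_victim = 0;
}

/* Return an open fd for path, reusing a pooled one when possible. */
int pool_open(const char *path)
{
    for (int i = 0; i < POOL_SIZE; i++)
        if (pool[i].fd >= 0 && strcmp(pool[i].path, path) == 0)
            return pool[i].fd;            /* hit: no open(2) needed */

    struct fd_slot *s = &pool[next_victim];
    next_victim = (next_victim + 1) % POOL_SIZE;
    if (s->fd >= 0)
        close(s->fd);                     /* recycle the oldest slot */

    s->fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (s->fd >= 0) {
        strncpy(s->path, path, sizeof(s->path) - 1);
        s->path[sizeof(s->path) - 1] = '\0';
    }
    return s->fd;
}
```

A real deployment would size the pool near the per-process fd limit and
probably use a hash table plus LRU list instead of the linear scan, but the
cycling behaviour is the same.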

Gruss
Bernd
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/