MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1) wrote:
I have a couple of inputs here: I was talking to our specweb person, and he
had the following views:

1. Most modern OSes cache files and do not do disk I/O for every
single file request. (duh !!)

Yep. Yesterday I powered up wimp for the first time in ages and did a mini-SPECweb experimental run in preparation for fiddling with the stat() in mod_specweb99. I got really horrible results at first and couldn't figure out what was wrong. It turned out that I just needed to warm up the kernel's disk cache more. The results got 50% better after an hour or so.


2. When doing writes, do 64M block writes instead of writing to disk every
time (lazy write).

I would hope a smart file system/kernel would take care of that for us.

3. Caching the fds would be more than sufficient (compared with caching the
contents).

Yep, it would actually be better for big files, because then we can use sendfile. If your NIC does hardware checksumming, a smart kernel/device driver can just DMA straight from the file cache. But how many fds can you cache?
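
To make the fd-cache idea concrete, here is a rough single-threaded sketch. It is not what Apache or mod_specweb99 does; the names fd_soft_limit() and cached_open() and both table sizes are made up for illustration. The first function answers the "how many" question for one process via getrlimit(); the second keeps a tiny path-to-fd table.

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <sys/resource.h>

#define FD_CACHE_SLOTS    64     /* hypothetical cache size */
#define FD_CACHE_PATH_MAX 256    /* hypothetical cap on path length */

struct fd_slot {
    char path[FD_CACHE_PATH_MAX];
    int  fd;
};

/* Not thread-safe: a real server would need locking or per-thread tables. */
static struct fd_slot fd_cache[FD_CACHE_SLOTS];
static int fd_cache_used;

/* Soft per-process limit on open descriptors, or -1 on error. */
long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
        return LONG_MAX;
    return (long)rl.rlim_cur;
}

/*
 * Return a cached read-only descriptor for path, opening it on a miss.
 * Returns -1 if open() fails, the path is too long, or the table is full.
 * The descriptor is shared, so callers should use calls that take an
 * explicit offset (pread, sendfile with an offset) and must not close it.
 */
int cached_open(const char *path)
{
    int i, fd;

    assert(path != NULL);

    /* Linear scan is fine for a table this small. */
    for (i = 0; i < fd_cache_used; i++)
        if (strcmp(fd_cache[i].path, path) == 0)
            return fd_cache[i].fd;

    if (fd_cache_used == FD_CACHE_SLOTS ||
        strlen(path) >= FD_CACHE_PATH_MAX)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    strcpy(fd_cache[fd_cache_used].path, path);
    fd_cache[fd_cache_used].fd = fd;
    fd_cache_used++;
    return fd;
}
```

A real cache would also need eviction and would have to stay well below fd_soft_limit(), since the same limit covers sockets and log files.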


4. On HP-UX, eliminating the stat/fstat would not make a lot of difference.
I don't know about other OSes, but based on his logic, since the fd for
that file is already available, fstat should not take a lot of time.

Some of my buddies in IBM's Linux Technology Center who run SPECweb99 say that one of the things that inhibits our SMP scalability with out-of-the-box Linux kernels is contention on the dcache spinlock. That comes into play every time you do a syscall and pass a file path that the kernel has to walk. So open()s and stat()s are the prime suspects.


I'm wondering why we need the stat() at all in mod_specweb99. It looks like the only thing we use from it is s.size. But IIRC the sendfile() syscall is happy if you give it a size of zero, which means send the whole thing. This obviously needs to be tested to see how our code reacts to EAGAIN + size == 0. If we can't get rid of it altogether, I would prefer to use fstat(), a.k.a. apr_file_info_get().

The LTC guys use a "dcache RCU (read-copy-update)" patch that eliminates the spinlock contention, but I doubt if a high percentage of our users are willing to build custom kernels. The kernel still has to walk the file path, lock or no lock.
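
For what it's worth, the fstat() variant would look roughly like this at the plain POSIX level. This is only a sketch, not the mod_specweb99 code; open_with_size() is a made-up name. The point is that the kernel walks the path once, in open(), and the size then comes from the descriptor instead of a second path lookup in stat().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open path read-only and report its size through *size.
 * Returns the descriptor, or -1 on failure (nothing is left open).
 * Only open() takes a path, so there is a single path walk;
 * fstat() works from the already-open descriptor.
 */
int open_with_size(const char *path, off_t *size)
{
    struct stat st;
    int fd;

    assert(path != NULL && size != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    *size = st.st_size;
    return fd;
}
```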

5. Another interesting question: why do we need the poll every time?
Instead, do an accept, and if there's no request available, accept would
return EWOULDBLOCK.

If your box supports SINGLE_LISTEN_UNSERIALIZED_ACCEPT, you shouldn't have any polls before the accepts, barring bugs. SPECweb99 does a lot of keepalive requests, and we poll for new data between requests on the same connection. That's probably what you're seeing.
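
The keepalive wait is roughly the following pattern, sketched here with plain poll() and read() instead of the APR calls the server really uses. next_request() and KEEPALIVE_TIMEOUT are made-up names for the example.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

#define KEEPALIVE_TIMEOUT (-2)   /* hypothetical "no new request" code */

/*
 * Wait up to timeout_ms for more data on a kept-alive connection,
 * then read it. Returns the byte count, 0 if the peer closed,
 * -1 on error, or KEEPALIVE_TIMEOUT if nothing arrived in time.
 */
ssize_t next_request(int fd, char *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(buf != NULL && len > 0);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait (restarts the full timeout). */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return KEEPALIVE_TIMEOUT;

    return read(fd, buf, len);
}
```

So one poll per keepalive request is expected; it is the wait for the client's next request, not something in front of accept().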


Greg


