On Nov 29, 2006, at 3:44 PM, Rob Ross wrote:
That's what I was thinking -- that we could ask the I/O thread to
do the syncing rather than stalling out other progress.
Wanna try it and see if it helps :)?
Rob
Phil Carns wrote:
No. Both alt aio and the normal dbpf method sync as a separate
step after the aio list operation completes.
This is technically possible with alt aio, though -- you would just
need to pass a flag through to tell the I/O thread to sync after
the pwrite(). That would probably be pretty helpful, so the trove
worker thread doesn't get stuck waiting on the sync...
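In sketch form, the I/O thread side might look something like this
(field and function names here are made up for illustration -- the
real alt aio code would carry the flag in its own op structure):

    #include <errno.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Hypothetical per-op structure; the sync_after flag would be
     * set by the caller when data sync is enabled. */
    struct io_op
    {
        int fd;
        const void *buf;
        size_t count;
        off_t offset;
        int sync_after;
    };

    static int service_write(struct io_op *op)
    {
        if (pwrite(op->fd, op->buf, op->count, op->offset) < 0)
            return -errno;

        /* sync here, in the I/O thread, so the trove worker
         * thread is never blocked waiting on it */
        if (op->sync_after && fdatasync(op->fd) < 0)
            return -errno;

        return 0;
    }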
-Phil
Rob Ross wrote:
This is similar to using O_DIRECT, which has also shown benefits.
With alt aio, do we sync in the context of the I/O thread?
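(For comparison, the O_DIRECT route bypasses the page cache entirely
rather than writing through it and flushing afterward, at the cost of
alignment restrictions. A minimal sketch, with illustrative values --
the real alignment requirement depends on the filesystem and device:)

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Open a file for direct I/O and allocate a suitably aligned
     * buffer.  O_DIRECT requires the buffer address, file offset,
     * and transfer size to all be aligned (512 bytes here). */
    static int open_direct(const char *path, void **buf, size_t size)
    {
        int fd = open(path, O_WRONLY | O_DIRECT, 0600);
        if (fd < 0)
            return -1;
        if (posix_memalign(buf, 512, size) != 0)
        {
            close(fd);
            return -1;
        }
        return fd;
    }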
Thanks,
Rob
Phil Carns wrote:
One thing that we noticed while testing for storage challenge
was that (and everyone correct me if I'm wrong here) enabling
data sync causes a flush/sync to occur after every sizeof
(FlowBuffer) bytes written. I can imagine how this would help a
SAN, but I'm perplexed how it helps local disk -- what buffer
size are you playing with?
We found that unless we were using HUGE flow buffers (roughly
the size of the cache on the storage controller), this caused
way too many syncs/seeks on the disks and hurt performance quite
a bit -- sometimes as much as a 50% hit -- because nothing was
being optimized for our disk subsystems and we were issuing
many small ops instead of fewer large ones.
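To put a number on it: with, say, a 256KB flow buffer, a 1GB write
implies 4096 separate syncs (1GB / 256KB), each one a potential
seek, versus a single sync at the end of the request.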
Granted, I haven't been able to get 2.6.0 building properly yet
to test the latest out, but this was definitely the case for us
on the 2.5 releases.
You are definitely right about the data sync option causing a
flush/sync on every sizeof(FlowBuffer).
I had a note that we should change the default aio data-sync code to
only sync at the end of an I/O request, instead of for each trove
operation (in FlowBufferSize chunks). Doing this at the end of an
io.sm seemed a little messy, but if/when we have request ids (hints)
being passed to the trove interface, we could use that as a way to
know to flush at the end. In any case, it sounds like it's better to
flush early and often rather than once at the end of a request?
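Roughly, the two strategies compare like this (a sketch only -- no
real trove interface here, names are illustrative, and it assumes
the total is a multiple of the chunk size):

    #include <unistd.h>

    /* Current behavior with data sync on: one sync per
     * FlowBufferSize chunk. */
    static int write_sync_per_chunk(int fd, const char *buf,
                                    size_t total, size_t chunk)
    {
        size_t off;
        for (off = 0; off < total; off += chunk)
        {
            if (pwrite(fd, buf + off, chunk, off) < 0)
                return -1;
            if (fdatasync(fd) < 0)
                return -1;
        }
        return 0;
    }

    /* Proposed: one sync when the whole I/O request completes. */
    static int write_sync_at_end(int fd, const char *buf,
                                 size_t total, size_t chunk)
    {
        size_t off;
        for (off = 0; off < total; off += chunk)
        {
            if (pwrite(fd, buf + off, chunk, off) < 0)
                return -1;
        }
        return fdatasync(fd);
    }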
From a user perspective, we usually tell people to enable data sync
if they're concerned about losing data. Now we're talking about
getting better performance with data sync enabled (at least in some
cases). Does it make sense to sync even with data sync disabled if
we can figure out that better performance would result?
-sam
I don't really have a good explanation for why this doesn't
seem to burn us anymore on local disk. Our settings are
standard, except for:
- 512KB flow buffer size
- alt aio method
- 512KB tcp buffers (with larger /proc tcp settings; sketch below)
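(For the TCP buffer piece, the effect boils down to something like
the following -- a sketch of how a 512KB socket buffer would be
applied; the kernel clamps these values to net.core.rmem_max /
net.core.wmem_max, which is why the /proc settings have to be
raised as well:)

    #include <sys/socket.h>

    /* Ask for 512KB socket buffers on an already-open socket. */
    static int set_tcp_buffers(int sockfd)
    {
        int size = 512 * 1024;
        if (setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF,
                       &size, sizeof(size)) < 0)
            return -1;
        if (setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF,
                       &size, sizeof(size)) < 0)
            return -1;
        return 0;
    }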
This testing was done on some version prior to 2.6.0 also (I
think it was a merge of some in-between release, so it is hard
to pin down a version number).
It may also have something to do with the controller and local
disks being used? All of our local disk configurations are
actually hardware raid 5 with some variety of the megaraid
controller, and these are fairly new boxes.
-Phil
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers