Re: [Pvfs2-developers] I/O statemachines adaption for migration ?

Sam Lang Mon, 28 Aug 2006 08:51:24 -0700


On Aug 28, 2006, at 8:23 AM, Julian Martin Kunkel wrote:

Hi,
I want to adapt the I/O state machines to reread the dfile array in case an I/O server responds with PVFS_ENOENT during the flow or within the initial I/O ACK. This might happen if the file is migrated away and the client does not
have the updated dfile array before it initiates the I/O.
Thus, I want to reread the dfile array and restart the I/O only for this particular server. The progress of the other I/O requests should not be
influenced.
While looking at sys-io.sm I wonder if the transition for the case
IO_RETRY in the state io_analyze_results does this. Maybe some extra lines could be added, for example to restart the process if the initial acknowledge returns with PVFS_ENOENT, and also not to increase the retry count in this
case?
I'm thankful for any suggestions on how that could be implemented easily.

I think IO_RETRY is a little different. The first step (before the IO request/response) of sys-io.sm is a getattr to the metadata server to get the datafile handles. It's this step that you want to repeat if the IO request to the IO server fails, right? So instead of jumping back to io_datafile_post_msgpairs, I think you'll want to jump all the way back to io_init. It's probably easier to create another return code (IO_REINIT or something), and return that from io_datafile_complete_operations. I think there will be some cleanup that you have to do in complete_operations before you can jump back up to init as well.
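A minimal sketch of that control flow in plain C, not the actual sys-io.sm code. The state names follow the ones mentioned above; ST_GETATTR, the io_ctx fields, MAX_RETRIES and MAX_REINITS are hypothetical, and the server responses are simulated with counters. It assumes IO_REINIT is bounded separately so that it does not consume the normal retry count.

```c
/* Hypothetical return codes. Only IO_RETRY is named in the existing
 * state machine; IO_REINIT is the proposed addition. */
enum io_result { IO_DONE, IO_RETRY, IO_REINIT };

/* Toy states loosely named after the sys-io.sm states in this thread. */
enum io_state {
    ST_INIT, ST_GETATTR, ST_POST_MSGPAIRS,
    ST_COMPLETE_OPERATIONS, ST_ANALYZE_RESULTS, ST_FINISHED
};

#define MAX_RETRIES 3  /* assumed limit for ordinary retries */
#define MAX_REINITS 2  /* assumed separate limit for dfile rereads */

struct io_ctx {
    int retry_count;        /* ordinary retries taken so far */
    int reinit_count;       /* full restarts from init taken so far */
    int getattr_calls;      /* how often the dfile array was fetched */
    int posts;              /* how often the I/O requests were posted */
    int stale_responses;    /* simulation: ENOENT answers still to come */
    int transient_failures; /* simulation: retryable failures to come */
};

/* Stand-in for io_datafile_complete_operations: decides which code to
 * hand to the analyze step. Real code would also clean up here. */
static enum io_result complete_operations(struct io_ctx *c)
{
    if (c->stale_responses > 0) {
        c->stale_responses--;
        return IO_REINIT;   /* server answered ENOENT: handles are stale */
    }
    if (c->transient_failures > 0) {
        c->transient_failures--;
        return IO_RETRY;
    }
    return IO_DONE;
}

/* Drives the toy state machine. Returns 0 on success, -1 on giving up. */
int run_io(struct io_ctx *c)
{
    enum io_state st = ST_INIT;
    enum io_result res = IO_DONE;

    while (st != ST_FINISHED) {
        switch (st) {
        case ST_INIT:
            st = ST_GETATTR;
            break;
        case ST_GETATTR:
            c->getattr_calls++;          /* reread the dfile array */
            st = ST_POST_MSGPAIRS;
            break;
        case ST_POST_MSGPAIRS:
            c->posts++;
            st = ST_COMPLETE_OPERATIONS;
            break;
        case ST_COMPLETE_OPERATIONS:
            res = complete_operations(c);
            st = ST_ANALYZE_RESULTS;
            break;
        case ST_ANALYZE_RESULTS:
            if (res == IO_REINIT) {
                /* does not touch retry_count */
                if (++c->reinit_count > MAX_REINITS)
                    return -1;
                st = ST_INIT;            /* all the way back */
            } else if (res == IO_RETRY) {
                if (++c->retry_count > MAX_RETRIES)
                    return -1;
                st = ST_POST_MSGPAIRS;   /* existing retry path */
            } else {
                st = ST_FINISHED;
            }
            break;
        case ST_FINISHED:
            break;
        }
    }
    return 0;
}
```

With one simulated stale response and two transient failures, run_io() fetches the dfile array twice and posts the requests four times. A server that keeps answering ENOENT hits the MAX_REINITS bound and the call gives up instead of looping.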

Also, it seems unlikely that the dfile handle array would have changed between the initial getattr and the IO requests (wouldn't a migrate disable the metadata server temporarily?), so this retry is probably only necessary if the attribute cache holding the dfile handle array has become stale. You could just turn that attr cache off with a 0 timeout for now; otherwise you'll have to invalidate the cache (at least the dfile handle array bits of it) before doing the getattr again.
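The two options can be sketched with a toy cache entry. This is not the PVFS2 attribute cache API: attr_entry, the MASK_* bits, attr_cache_hit and attr_cache_invalidate are all hypothetical names, and the sketch assumes the cache tracks validity per attribute group with a bit mask.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical validity bits, one per group of cached attributes. */
#define MASK_COMMON 0x1u   /* owner, size, ... */
#define MASK_DFILES 0x2u   /* the dfile handle array */

/* Hypothetical cache entry for one file. */
struct attr_entry {
    unsigned      valid_mask;        /* which groups are currently valid */
    long          stored_at;         /* time the entry was filled, seconds */
    unsigned long dfile_handles[4];
    size_t        dfile_count;
};

/* True if every wanted group can be served from the cache. A timeout
 * of 0 disables the cache, so every lookup goes to the server. */
bool attr_cache_hit(const struct attr_entry *e, unsigned wanted,
                    long now, long timeout)
{
    if (timeout <= 0)
        return false;
    if (now - e->stored_at >= timeout)
        return false;
    return (e->valid_mask & wanted) == wanted;
}

/* Drops only the given groups, e.g. MASK_DFILES after an ENOENT, so
 * the next getattr has to fetch them again. */
void attr_cache_invalidate(struct attr_entry *e, unsigned mask)
{
    e->valid_mask &= ~mask;
}
```

Invalidating only MASK_DFILES keeps the rest of the cached attributes usable, which is the "at least the dfile handle array bits" variant; passing a timeout of 0 is the blunt variant that bypasses the cache entirely.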

In this context, a weird error message:
In case the fs is corrupted, e.g. there is a metafile pointing to a non-existing datafile, I think the I/O should abort quickly instead of doing retries (in the migration case: retry to get the dfiles, and abort if they did not change). Currently on the client the sm returns the error "Operation now in progress". You can try this by removing a datafile with pvfs2-remove-object
(first get the object number with pvfs2-viewdist).

Hmm... that is a little odd. I think EINPROGRESS only gets returned from aio_error though... I don't see us setting it anywhere in our code. My guess is that remove-object may not remove the actual fd from the open cache, so the IO doesn't fail sooner with ENOENT as it should. I haven't looked at the code to verify that though.


-sam


thanks,
julian

---
Ben (Obi-Wan) Kenobi:
        Use the Force, Luke!
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

