Hi,
> of jumping back to io_datafile_post_msgpairs, I think you'll want to  
> jump all the way back to io_init.  Its probably easier to create  
> another return code (IO_REINIT or something), and return that from  
> io_datafile_complete_operations.  I think there will be some cleanup  
> that you have to do in complete_operations before you can jump back  
> up to init as well.
So in theory it should work? Does the getattr state machine get confused if 
I/O operations complete in the meantime? 
For example, what happens if the getattr state machine is waiting for a message 
from the metadata server and the completion of another I/O operation activates 
the machine again? 
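
To make sure I understand the IO_REINIT suggestion, here is a minimal, 
standalone sketch of the control flow. It is not the real PVFS2 state machine 
code: the enums, struct io_sm, run_io_sm(), MAX_REINIT and the stale_handles 
knob are all made up for illustration, and complete_operations() is only a 
stand-in for io_datafile_complete_operations.

```c
/* Hypothetical return codes; IO_REINIT is the proposed new one. */
enum io_rc { IO_DONE = 0, IO_REINIT = 1, IO_ERROR = 2 };

/* Hypothetical states, loosely named after the ones discussed above. */
enum io_state {
    ST_INIT, ST_GETATTR, ST_POST_MSGPAIRS, ST_COMPLETE,
    ST_FINISHED, ST_FAILED
};

struct io_sm {
    enum io_state state;
    int reinit_count;   /* how often we jumped back to init */
    int stale_handles;  /* simulation knob: how many passes see ENOENT */
    int pending_ops;    /* outstanding I/O operations */
};

/* Upper bound on restarts so a permanently missing file cannot loop forever. */
#define MAX_REINIT 3

/* Stand-in for io_datafile_complete_operations (assumed behaviour). */
static enum io_rc complete_operations(struct io_sm *sm)
{
    /* Cleanup happens first, before any jump back up to init. */
    sm->pending_ops = 0;

    if (sm->stale_handles > 0) {
        /* Simulates a server answering ENOENT for a migrated datafile. */
        sm->stale_handles--;
        return IO_REINIT;
    }
    return IO_DONE;
}

/* Drives the machine until it reaches ST_FINISHED or ST_FAILED. */
static enum io_state run_io_sm(struct io_sm *sm)
{
    sm->state = ST_INIT;
    for (;;) {
        switch (sm->state) {
        case ST_INIT:
            sm->state = ST_GETATTR;
            break;
        case ST_GETATTR:
            /* The dfile handle array would be (re)fetched here. */
            sm->state = ST_POST_MSGPAIRS;
            break;
        case ST_POST_MSGPAIRS:
            sm->pending_ops = 4;  /* pretend four I/O messages were posted */
            sm->state = ST_COMPLETE;
            break;
        case ST_COMPLETE:
            switch (complete_operations(sm)) {
            case IO_DONE:
                sm->state = ST_FINISHED;
                break;
            case IO_REINIT:
                /* Jump all the way back to init, but only a bounded number of times. */
                sm->state = (++sm->reinit_count > MAX_REINIT) ? ST_FAILED : ST_INIT;
                break;
            default:
                sm->state = ST_FAILED;
                break;
            }
            break;
        case ST_FINISHED:
        case ST_FAILED:
            return sm->state;
        }
    }
}
```

With stale_handles set to 1 the machine restarts once and then finishes. The 
restart limit is my own addition, not part of the suggestion.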

> Also, it seems unlikely that the dfile handle array would have  
> changed from the initial getattr to the IO requests (wouldn't a  
> migrate disable the metadata server temporarily?), 
The scenario might be that a lot of clients do I/O and one client program or 
the user decides to migrate a datafile to another, potentially better location.
During the migration no writes to that particular datafile are possible; 
however, reads could still happen (I have to integrate that into the request 
scheduler later). New requests (at least write requests) get queued up in the 
request scheduler, and once the migration finishes the old datafile is deleted 
(to ensure that we don't do I/O with invalid files). The queued requests then 
return ENOENT to the client; at that point the new datafile is completely valid 
and should be used.
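
The scenario above can be sketched as a small simulation. This is only a toy 
model under my own assumptions, not the real request scheduler: struct 
datafile, struct write_req, struct sched, REQ_QUEUED and both functions are 
hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_QUEUED 8
#define REQ_QUEUED 1   /* hypothetical marker: request is parked in the scheduler */

struct datafile  { int handle; int exists; int migrating; };
struct write_req { int handle; int result; };  /* result: 0, -ENOENT or REQ_QUEUED */
struct sched     { struct write_req *queued[MAX_QUEUED]; size_t nqueued; };

/* Submit a write: park it while the datafile is being migrated,
 * otherwise complete it immediately. */
static int sched_submit_write(struct sched *s, const struct datafile *df,
                              struct write_req *req)
{
    req->handle = df->handle;
    if (df->migrating) {
        if (s->nqueued == MAX_QUEUED)
            return -EAGAIN;            /* toy queue is full */
        req->result = REQ_QUEUED;
        s->queued[s->nqueued++] = req;
        return 0;
    }
    req->result = df->exists ? 0 : -ENOENT;
    return 0;
}

/* Finish the migration: the new datafile becomes valid, the old one is
 * deleted, and every write that was queued against the old handle
 * completes with -ENOENT so the client knows to look up the new handle. */
static void sched_finish_migration(struct sched *s, struct datafile *old_df,
                                   struct datafile *new_df)
{
    size_t i;

    new_df->exists = 1;
    old_df->migrating = 0;
    old_df->exists = 0;

    for (i = 0; i < s->nqueued; i++)
        s->queued[i]->result = -ENOENT;
    s->nqueued = 0;
}
```

The point of returning ENOENT rather than silently redirecting is that the 
client is the one holding the stale handle, so it has to fetch the new one.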
> off with a 0 timeout for now, otherwise you'll have to invalidate the  
> cache (at least the dfile handle array bits of it) before doing the  
> getattr again.
Yeah, I also have to invalidate the acache data before I fetch the dfile 
array again.
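
A minimal sketch of why the order matters, assuming a cache entry that holds 
a copy of the dfile handle array plus a validity flag. The types and function 
names here are hypothetical and are not the real acache interface; the 
"server" is just a struct passed in by the caller.

```c
#define MAX_DFILES 4

struct dfile_array  { int handles[MAX_DFILES]; int count; };
struct acache_entry { struct dfile_array dfiles; int dfiles_valid; };

/* Drop only the dfile handle array bits of the cached attributes. */
static void acache_invalidate_dfiles(struct acache_entry *e)
{
    e->dfiles_valid = 0;
}

/* Return the dfile array, going to the (simulated) metadata server only
 * when the cached copy is not valid. */
static const struct dfile_array *getattr_dfiles(struct acache_entry *e,
                                                const struct dfile_array *server_copy)
{
    if (!e->dfiles_valid) {
        e->dfiles = *server_copy;   /* stands in for the getattr round trip */
        e->dfiles_valid = 1;
    }
    return &e->dfiles;
}
```

If the second getattr runs without the invalidate, it hands back the same 
stale handle that just produced ENOENT, so the retry cannot succeed.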
> in our code.  My guess is that the remove-object may not remove the  
> actual fd from the open cache, so the IO doesn't fail sooner with  
> ENOENT as it should.  I haven't looked at the code to verify that  
> though.
Actually the server does return ENOENT. I can see that because the I/O context 
reports the corresponding error string in the message "io_process_context_recv 
(op_status): No such file or directory", but the client state machine then 
tries again and aborts with another error...

Thanks,
Julian

_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
