anuengineer commented on issue #108: HDDS-1987. Fix listStatus API
URL: https://github.com/apache/hadoop-ozone/pull/108#issuecomment-548105219
 
 
   1. First of all, this is complicated code. But not your fault, what you need 
to do at this level is complicated. 
   2. I agree that is one way  of doing this change; save one question. How are 
we guaranteeing that the iterator that you are holding is the only valid one to 
the double buffer. Think about this edge case. Suppose you enter the listing 
with the iterator pointing to the buffer A. You build the deletedKeys list, or 
you build the added keys list.
   
   Then you are doing an/O to the underlying DB — and read a set of keys, then 
you are walking that set of keys to do a subtract from deletedKeySet and 
addition of addedKeys. makes sense. 
   
   But this can take a while, during this time two things might happen.
   
   1. The Buffer A might get tumbled over and Buffer B might becomes the active 
one.  I want you to consider if this is a possibility. -- The new Buffer B is 
active, a Key is deleted, but you don't know about it. Then you are operating 
against an inconsistent snap shot of data -- which is ok, since we can define 
that as the behavior of the system. In other words, the list operation might 
return key names which are already deleted. The DB I/O delay might make it 
acute.
   
   2. The buffer A might get committed -- now all the work you did is for 
nothing -- the data is already there in the DB.. You might want to define what 
should be the behavior at this level.
   
   3. In your patch, you don't check if you have seen all the deleted Key from 
the DB, which is again ok, since you have no way of predicting if the double 
buffer is actually syncing and when.
   
   Another way to write this code might be to ask the table that controls the 
double buffer to do the right  thing. That way, you are make these reasonings 
more local. Just my 2 cents.
   
   4. Also the keyManagerImpl is a file that we want to depricate in the long 
run.. we made a mistake and wrote the file system code inside that class, just 
FYI. Nothing to do with this patch.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Reply via email to