[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437416#comment-15437416
 ] 

Anatoli Shein commented on HDFS-10754:
--------------------------------------

Thank for the review, [~bobhansen].

I have addressed your comments as follows:

* The new recursive methods should not use future/promises internally. That 
blocks one of the asio threads waiting for more data; if a consumer tried to do 
one of these with a single thread in the threadpool, it would deadlock waiting 
for the subtasks to complete, but they'd all get queued up behind the initial 
handler.

Instead, whenever they're all done (request_count == 0), the last one out the 
door (the handler that dropped the request_count to 0) should call into the 
consumer's handler directly with the final status. If any of the other threads 
has received an error, all of the subsequent deliveries to the handler should 
return false, telling find "I'm going to report an error anyway, so don't 
bother recursing any more." It's good to wait until request_count==0, even in 
an error state, so the consumer doesn't have any lame duck requests queued up 
to take care of?

Also because all of this is asynchronous, you can't allocate the lock and state 
variables on the stack. When the consumer calls SetOwner('/', true, handler), 
the function is going to return as soon as the find operation is kicked off, 
destroying all of the elements on the stack. We'll need to create a little 
struct for SetOwner that is maintained with a shared_ptr and cleaned up when 
the last request is done.

(/) I removed futures and promises from recursive methods, and created a small 
struct to keep the state. Now it is purely async.

Minor points:

* In find, perhaps recursion_counter is a bit of a misnomer at this point. It's 
more outstanding_requests, since for big directories, we'll have more requests 
without recursing.
(/) Done.

* Perhaps FindOperationState is a better name than CurrentState, and 
SharedFindState is better than just SharedState (since we might have many 
shared states in the FileSystem class).
(/) Done.

* In CurrentState, perhaps "depth" is more accurate than position?
(/) Yes, I changed it to "depth" now.

* Do we support a globbing find without recursion? Can I find "/dir?/path*/" 
"*.db", and not have it recurse to the sub-directories of path*?
(/) POSIX find supports globbing without recursion by setting maxdepth to zero. 
I added maxdepth functionality for our tool also.

* Can we push the shims and state into the .cpp file and keep them out of the 
itnerface (even if private)?
(/) Done.

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10754
>                 URL: https://issues.apache.org/jira/browse/HDFS-10754
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch, HDFS-10754.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to