[ 
https://issues.apache.org/jira/browse/HDFS-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anatoli Shein updated HDFS-10679:
---------------------------------
    Attachment: HDFS-10679.HDFS-8707.001.patch

Attaching the initial working patch. It includes both sync and async versions.

I tested it as follows:

#Make many dirs to test find
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a1/b/c/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b1/c/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c1/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d1/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e1/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e/f1/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e/f/g1
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a2/b/c/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b2/c/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c2/d/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d2/e/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e2/f/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e/f2/g
hadoop fs -mkdir -p webhdfs://localhost.localdomain:10433/a/b/c/d/e/f/g2

#run it with wildcards
hadoop-hdfs-native-client/target/main/native/libhdfspp/examples/cpp/find/find \
  hdfs://localhost.localdomain:9433/a/*/c? f*
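
For reference, here is a minimal sketch of how the two arguments above could be evaluated with POSIX fnmatch(3). It assumes the tool matches the directory argument with FNM_PATHNAME semantics and the name argument with plain fnmatch; the patch itself may do this differently. The helper name matches_find_args is hypothetical and is not part of libhdfs++.

```cpp
#include <fnmatch.h>
#include <string>

// Hypothetical helper, not part of libhdfs++: returns true when `dir`
// matches the path pattern and `name` matches the name pattern.
// FNM_PATHNAME keeps '*' and '?' from matching the '/' separator.
bool matches_find_args(const std::string &dir_pattern,
                       const std::string &name_pattern,
                       const std::string &dir,
                       const std::string &name) {
  return fnmatch(dir_pattern.c_str(), dir.c_str(), FNM_PATHNAME) == 0 &&
         fnmatch(name_pattern.c_str(), name.c_str(), 0) == 0;
}
```

Under these assumptions, the pattern /a/*/c? matches /a/b/c1 and /a/b/c2 from the directories created above, but not /a/b/c or /a/b1/c, because the '?' requires exactly one character after the 'c'. The name pattern f* matches f, f1 and f2.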

Please review

> libhdfs++: Implement parallel find with wildcards tool
> ------------------------------------------------------
>
>                 Key: HDFS-10679
>                 URL: https://issues.apache.org/jira/browse/HDFS-10679
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10679.HDFS-8707.000.patch, 
> HDFS-10679.HDFS-8707.001.patch
>
>
> The find tool will issue the GetListing namenode operation on a given 
> directory and filter the results using the POSIX globbing library.
> If the recursive option is selected, then for each returned entry that is a 
> directory the tool will issue another asynchronous GetListing call and 
> process its results recursively.
> One implementation issue that needs to be addressed is how results are 
> returned to the user: we can either buffer the results and return them in 
> bulk, or return them continuously as they arrive. While buffering would be 
> the easier solution, returning results as they arrive would perform better 
> for the user, since result processing can start as soon as the first 
> results arrive. To do that, the user needs to process arriving results in 
> a loop, and we need to send a special message back to the user when the 
> search is over.
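
The "return results as they arrive" option described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in the patch: Entry, FakeNamespace, ResultHandler, JoinPath, FindIn and Find are all hypothetical names, the namenode is replaced by an in-memory map so the example runs without a cluster, and the traversal is synchronous for brevity. The handler receives each batch of matches as soon as one listing has been filtered, and a final empty batch with has_more == false plays the role of the "search is over" message.

```cpp
#include <fnmatch.h>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one directory entry returned by a listing call.
struct Entry {
  std::string name;
  bool is_dir;
};

// Hypothetical in-memory namespace: directory path -> its entries.
// It plays the role of the namenode so the sketch runs without a cluster.
using FakeNamespace = std::map<std::string, std::vector<Entry>>;

// Called once per batch of matches; has_more == false marks the end of
// the search.
using ResultHandler =
    std::function<void(const std::vector<std::string> &matches, bool has_more)>;

static std::string JoinPath(const std::string &dir, const std::string &name) {
  return dir == "/" ? "/" + name : dir + "/" + name;
}

// Lists `dir`, reports entries whose names match `name_pattern`, then
// descends into every subdirectory.
static void FindIn(const FakeNamespace &ns, const std::string &dir,
                   const std::string &name_pattern,
                   const ResultHandler &handler) {
  auto it = ns.find(dir);
  if (it == ns.end()) return;  // unknown directory: nothing to list

  std::vector<std::string> matches;
  for (const Entry &e : it->second) {
    if (fnmatch(name_pattern.c_str(), e.name.c_str(), 0) == 0) {
      matches.push_back(JoinPath(dir, e.name));
    }
  }
  // Hand results to the caller as soon as this listing has been filtered.
  if (!matches.empty()) handler(matches, true);

  for (const Entry &e : it->second) {
    if (e.is_dir) FindIn(ns, JoinPath(dir, e.name), name_pattern, handler);
  }
}

// Entry point: streams matches, then sends one final empty batch with
// has_more == false so the caller knows the search is over.
void Find(const FakeNamespace &ns, const std::string &root,
          const std::string &name_pattern, const ResultHandler &handler) {
  FindIn(ns, root, name_pattern, handler);
  handler({}, false);
}
```

In an asynchronous version the final message cannot simply be sent after the top-level call returns; it would have to be sent when the last outstanding listing completes, for example by tracking a count of in-flight requests. The sketch leaves that out.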



