[ https://issues.apache.org/jira/browse/DRILL-7004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752402#comment-16752402 ]
benj commented on DRILL-7004:
-----------------------------

Nice, it had escaped me (maybe I need to buy new eyes). Thanks.

I find that with the _storage.list_files_recursively_ option enabled, the query takes a long time, as the scan does not seem to take into account the information in the WHERE clause, such as ROOT_SCHEMA_NAME or WORKSPACE_NAME.
{code:java}
SELECT * FROM INFORMATION_SCHEMA.`FILES` WHERE root_schema_name = 'mydfs' AND workspace_name = 'dld' AND relative_path = 'NAMES' LIMIT 15;
{code}
This happens even though the plan shows the filter on the scan:
{code:java}
...
00-05 Scan(table=[[information_schema, FILES]], groupscan=[FILES, filter=booleanand(equal(Field=ROOT_SCHEMA_NAME,Literal=mydfs),equal(Field=WORKSPACE_NAME,Literal=dld))])
{code}
* ls -lahR of the dld path takes 0.3 seconds
* the INFORMATION_SCHEMA query takes 33 seconds (about 100 times slower), roughly the time of an ls -lahR /

Is it possible to specify an entry point (a path) to significantly reduce the scan time? (I tried deleting all workspaces but one; it made no difference.)

> improve show files functionnality
> ---------------------------------
>
>                 Key: DRILL-7004
>                 URL: https://issues.apache.org/jira/browse/DRILL-7004
>             Project: Apache Drill
>          Issue Type: Wish
>          Components: Storage - Other
>    Affects Versions: 1.15.0
>            Reporter: benj
>            Priority: Major
>
> Currently, it is possible to show the files and directories in a particular directory with the command
> {code:java}
> SHOW files FROM tmp.`mypath`;
> {code}
> It would certainly be very useful to improve this functionality with:
> * the possibility to list recursively
> * the possibility to use at least wildcards
> {code:java}
> SHOW files FROM tmp.`mypath/*/test/*/*a*`;
> {code}
> * the possibility to use the result like a table
> {code:java}
> SELECT p.* FROM (SHOW files FROM tmp.`mypath`) AS p WHERE ...
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)