Hello Bharath Vissapragada, Vihang Karajgaonkar, Anonymous Coward (486),

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/12991

to review the following change.


Change subject: Initial support for recursive file listing within a partition
......................................................................

Initial support for recursive file listing within a partition

This adds support to FileMetadataLoader to recursively list a directory
and create file descriptors. The changes are as follows:

* FileMetadataLoader can now take a 'recursive' argument to trigger the
  new behavior. All non-test code paths still use non-recursive listing
  (i.e. this new feature isn't exposed for real tables yet).

* FileSystemUtil has some new functionality for recursive directory
  listing. There are a few notes in the code there about unexpected
  optimizations for S3 vs HDFS.

* Renamed the 'file_name' field to 'relative_path' for FileDescriptor
  and HDFS splits, since a file descriptor's path may now consist of
  more than a single path component.
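
As a rough illustration of the recursive vs. non-recursive distinction, and
of why 'relative_path' is a better name than 'file_name', here is a minimal
sketch. It is not the code in this patch: it walks a local directory with
java.nio.file instead of the Hadoop FileSystem API, and the class and method
names (RecursiveListingSketch, listRelativePaths) are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch only; names are not taken from the actual change.
final class RecursiveListingSketch {
  /**
   * Lists the regular files under 'partitionDir' and returns their paths
   * relative to that directory, using '/' as the separator.
   *
   * With recursive=false only direct children are returned, so every
   * result is a single path component (a plain file name). With
   * recursive=true results may contain several components, such as
   * "bucket_0/b.txt".
   */
  static List<String> listRelativePaths(Path partitionDir, boolean recursive)
      throws IOException {
    // Depth 1 visits only the directory's immediate children.
    int maxDepth = recursive ? Integer.MAX_VALUE : 1;
    List<String> result = new ArrayList<>();
    try (Stream<Path> paths = Files.walk(partitionDir, maxDepth)) {
      paths.filter(Files::isRegularFile)  // skip directories themselves
          .forEach(p -> result.add(
              partitionDir.relativize(p).toString()
                  .replace(File.separatorChar, '/')));
    }
    // Sort so the output is deterministic regardless of walk order.
    Collections.sort(result);
    return result;
  }
}
```

In the non-recursive case the relative path and the file name coincide,
which is why existing callers would be unaffected by the rename.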

The new functionality is only unit tested at the moment. Later, it will
be used in a couple of cases, including:

- ability to access "bucketed" tables written by Hive or Spark in a
  read-only manner. Today we ignore the bucketing and they end up being
  read as empty tables.

- ability to list files inside the hierarchical layout for ACID tables.

We may want to expose recursive listing support for user tables as well
(as suggested in IMPALA-4596). However, the global configuration flag
suggested in that JIRA doesn't seem so great, so I'm leaving that out
for now as well.

Change-Id: I9b151d7abb8443c0d9de0a0d82a9f13e07ad5109
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M common/fbs/CatalogObjects.fbs
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/testutil/BlockIdGenerator.java
16 files changed, 253 insertions(+), 59 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/12991/1
--
To view, visit http://gerrit.cloudera.org:8080/12991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9b151d7abb8443c0d9de0a0d82a9f13e07ad5109
Gerrit-Change-Number: 12991
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
