[jira] [Updated] (HUDI-1406) Add new DFS path sector implementation for listing date based partitions
[ https://issues.apache.org/jira/browse/HUDI-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1406:
    Status: Open  (was: New)

> Add new DFS path sector implementation for listing date based partitions
>
>                 Key: HUDI-1406
>                 URL: https://issues.apache.org/jira/browse/HUDI-1406
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: Bhavani Sudha
>            Assignee: Bhavani Sudha
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.7.0
>
> The Deltastreamer DFS source lists files from the table path and determines
> which files changed recently based on modification time. For certain
> workloads where only the latest partitions are affected, we might benefit by
> listing source input only from recent partitions. This especially helps for
> data in S3 with multiple partition fields, where listing is time-consuming.
>
> To support this, I propose adding a DFS selector implementation based on date
> partitions.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-1406) Add new DFS path sector implementation for listing date based partitions
[ https://issues.apache.org/jira/browse/HUDI-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1406:
    Labels: pull-request-available  (was: )

> Add new DFS path sector implementation for listing date based partitions
>
>                 Key: HUDI-1406
>                 URL: https://issues.apache.org/jira/browse/HUDI-1406
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: Bhavani Sudha
>            Assignee: Bhavani Sudha
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.6.1
>
> The Deltastreamer DFS source lists files from the table path and determines
> which files changed recently based on modification time. For certain
> workloads where only the latest partitions are affected, we might benefit by
> listing source input only from recent partitions. This especially helps for
> data in S3 with multiple partition fields, where listing is time-consuming.
>
> To support this, I propose adding a DFS selector implementation based on date
> partitions.
[jira] [Updated] (HUDI-1406) Add new DFS path sector implementation for listing date based partitions
[ https://issues.apache.org/jira/browse/HUDI-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhavani Sudha updated HUDI-1406:
    Description:
The Deltastreamer DFS source lists files from the table path and determines which files changed recently based on modification time. For certain workloads where only the latest partitions are affected, we might benefit by listing source input only from recent partitions. This especially helps for data in S3 with multiple partition fields, where listing is time-consuming.

To support this, I propose adding a DFS selector implementation based on date partitions.

  was:
The Deltastreamer DFS source lists files from the table path and determines which files changed recently based on modification time. For certain workloads where only the latest partitions are affected, we might benefit by listing source input only from recent partitions. This especially helps for data in S3 with multiple partition fields, where listing is time-consuming.

To support this, I propose adding a DFS selector implementation based and date.

> Add new DFS path sector implementation for listing date based partitions
>
>                 Key: HUDI-1406
>                 URL: https://issues.apache.org/jira/browse/HUDI-1406
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: Bhavani Sudha
>            Assignee: Bhavani Sudha
>            Priority: Minor
>             Fix For: 0.6.1
>
> The Deltastreamer DFS source lists files from the table path and determines
> which files changed recently based on modification time. For certain
> workloads where only the latest partitions are affected, we might benefit by
> listing source input only from recent partitions. This especially helps for
> data in S3 with multiple partition fields, where listing is time-consuming.
>
> To support this, I propose adding a DFS selector implementation based on date
> partitions.
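
The idea in the issue description, listing only recent date partitions instead of the whole source path, can be illustrated with a minimal sketch. This is not Hudi's implementation: the function names `recent_date_partitions` and `prune_partition_paths`, the `lookback_days` parameter, the default `%Y-%m-%d` format, and the assumption that the date is the last path component are all hypothetical choices made for illustration.

```python
from datetime import date, timedelta
from typing import Iterable, List


def recent_date_partitions(
    current: date, lookback_days: int, fmt: str = "%Y-%m-%d"
) -> List[str]:
    """Return partition names for `current` and the `lookback_days` days
    before it, oldest first. (Hypothetical helper, not a Hudi API.)"""
    if lookback_days < 0:
        raise ValueError("lookback_days must be non-negative")
    return [
        (current - timedelta(days=offset)).strftime(fmt)
        for offset in range(lookback_days, -1, -1)
    ]


def prune_partition_paths(
    paths: Iterable[str], current: date, lookback_days: int, fmt: str = "%Y-%m-%d"
) -> List[str]:
    """Keep only the paths whose last component is a date inside the
    lookback window. Assumes the date partition is the final path segment."""
    wanted = set(recent_date_partitions(current, lookback_days, fmt))
    return [p for p in paths if p.rstrip("/").rsplit("/", 1)[-1] in wanted]


if __name__ == "__main__":
    candidates = [
        "s3://bucket/events/us/2020-11-17",
        "s3://bucket/events/us/2020-11-19/",
        "s3://bucket/events/eu/2020-11-20",
    ]
    # Only partitions from the last two days plus today are kept for listing.
    for path in prune_partition_paths(candidates, date(2020, 11, 20), 2):
        print(path)
```

Only the surviving partition directories would then be listed and filtered by modification time, which is where the savings on slow listings such as S3 would come from.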