Yanjia Gary Li created HUDI-597:
-----------------------------------

             Summary: Enable incremental pulling from defined partitions
                 Key: HUDI-597
                 URL: https://issues.apache.org/jira/browse/HUDI-597
             Project: Apache Hudi (incubating)
          Issue Type: New Feature
            Reporter: Yanjia Gary Li
            Assignee: Yanjia Gary Li

For the use case where I only need the incremental changes from certain partitions, I currently have to run the incremental pull against the entire dataset first and then filter the result in Spark. If we could pass the partition folders directly as part of the input path, the query could run faster by loading only the relevant parquet files.

Example:
{code:java}
spark.read.format("org.apache.hudi")
  .option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY, DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL)
  .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY, "000")
  .load(path, "year=2020/*/*/*")
{code}
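
To make the intended effect of a partition glob such as "year=2020/*/*/*" concrete, here is a minimal, self-contained Java sketch. It does not use Hudi or Spark and is not how either of them resolves paths; it uses java.nio glob matching as a stand-in. The class name, the method name, and the sample file paths are hypothetical, and the year/month/day layout is an assumption that the issue does not state.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: keeps only the files whose path, relative to the
// dataset base path, matches a partition glob.
class PartitionGlobFilter {

    static List<String> filterByPartitionGlob(List<String> relativePaths, String glob) {
        // In java.nio globs, "*" matches within one path component only,
        // so "year=2020/*/*/*" needs exactly three components after the year.
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> matched = new ArrayList<>();
        for (String p : relativePaths) {
            if (matcher.matches(Paths.get(p))) {
                matched.add(p);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Assumed layout: year=YYYY/month=MM/day=DD/<file>.parquet
        List<String> files = Arrays.asList(
            "year=2020/month=01/day=05/a.parquet",
            "year=2019/month=12/day=31/b.parquet",
            "year=2020/month=02/day=10/c.parquet");
        // Only files under year=2020 are kept; they are the ones a
        // partition-scoped incremental pull would need to read.
        System.out.println(filterByPartitionGlob(files, "year=2020/*/*/*"));
    }
}
```

With this kind of pruning applied before the read, the incremental pull would never open files from partitions outside the glob, which is the saving the issue asks for.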