[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Forward Xu updated HUDI-110:
----------------------------
    Summary: Better defaults for Partition extractor for Spark DataSource and DeltaStreamer  (was: Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer)

> Better defaults for Partition extractor for Spark DataSource and DeltaStreamer
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-110
>                 URL: https://issues.apache.org/jira/browse/HUDI-110
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer, Spark Integration, Usability
>            Reporter: Balaji Varadarajan
>            Priority: Critical
>              Labels: user-support-issues
>             Fix For: 0.11.0
>
>
> Currently, SlashEncodedDayPartitionValueExtractor is the default being used.
> This is not a common format outside Uber.
>
> Also, Spark DataSource provides a partitionBy clause, which has not been
> integrated into the Hudi Data Source. We need to investigate how we can
> leverage the partitionBy clause for partitioning.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
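
To illustrate why a slash-encoded day default can be restrictive, here is a minimal Python sketch of the idea. It is not Hudi's implementation: the function names are hypothetical, and the assumption is that a slash-encoded day extractor expects a partition path of exactly the form yyyy/MM/dd and yields a single date value, while a more general extractor yields one value per path segment.

```python
from datetime import date
from typing import List


def extract_slash_encoded_day(partition_path: str) -> List[str]:
    """Turn a 'yyyy/MM/dd' partition path into a single 'yyyy-MM-dd' value.

    Hypothetical stand-in for a slash-encoded day extractor; any other
    path layout is rejected.
    """
    parts = partition_path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"expected yyyy/MM/dd, got: {partition_path!r}")
    year, month, day = (int(p) for p in parts)
    # date() validates the calendar fields (for example, it rejects month 13).
    return [date(year, month, day).isoformat()]


def extract_multi_part(partition_path: str) -> List[str]:
    """Hypothetical general alternative: one value per path segment.

    Accepts plain segments as well as Hive-style 'key=value' segments.
    """
    values = []
    for segment in partition_path.strip("/").split("/"):
        # For 'key=value' keep only the value; plain segments pass through.
        values.append(segment.split("=", 1)[-1])
    return values
```

Under these assumptions, a table partitioned as region=us/dt=2019-05-20 fails with the day-based function but works with the general one, which is the kind of mismatch a better default would avoid.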