In Hive, a partition does two things:
1. It acts as an index to speed up data scans.
2. It acts as a way to manage the data: people can add/drop partitions.
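To make the two roles concrete, here is a minimal toy sketch in plain Python. It is not Hive's or Spark's API: the class `ToyPartitionedTable`, its methods, and the `dt` partition column are all hypothetical names chosen for illustration. It only shows that the same partition map can serve both as a coarse index for scans and as the unit of add/drop management.

```python
class ToyPartitionedTable:
    """Toy model of a table partitioned by one column (hypothetical, not a Hive/Spark API)."""

    def __init__(self, partition_column):
        self.partition_column = partition_column
        self.partitions = {}          # partition value -> list of row dicts
        self.partitions_scanned = 0   # counts partitions read, to make pruning visible

    # Role 2: a partition is a unit of data management.
    def add_partition(self, value, rows=()):
        if value in self.partitions:
            raise ValueError(f"partition {value!r} already exists")
        # Stamp every row with the partition column so scans return complete rows.
        self.partitions[value] = [dict(r, **{self.partition_column: value}) for r in rows]

    def drop_partition(self, value):
        # Dropping the partition removes all of its rows in one step.
        del self.partitions[value]

    # Role 1: a partition is a coarse index for scans.
    def scan(self, partition_value=None):
        if partition_value is None:
            targets = list(self.partitions)      # full scan: read every partition
        elif partition_value in self.partitions:
            targets = [partition_value]          # pruned scan: read one partition
        else:
            targets = []
        self.partitions_scanned += len(targets)
        return [row for v in targets for row in self.partitions[v]]


table = ToyPartitionedTable("dt")
table.add_partition("2020-07-16", [{"id": 1}, {"id": 2}])
table.add_partition("2020-07-17", [{"id": 3}])
pruned = table.scan(partition_value="2020-07-17")  # reads one partition only
table.drop_partition("2020-07-16")                 # manages data by partition
```

In this sketch the two roles share one structure, which is roughly how Hive behaves; a catalog where partitions are only metadata, or only a layout hint, would have to decide which of the two methods groups it actually supports.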
How do you unify these two things in your API design?

On Fri, Jul 17, 2020 at 12:03 AM JackyLee <[email protected]> wrote:

> Hi devs,
>
> In order to support partition commands for datasourcev2 and Lakehouse, I'm
> trying to add a Partition API for multiple catalogs.
>
> These APIs are widely used in MySQL, Hive, and other data sources, and we
> can use them to manage partition metadata in Lakehouse.
>
> JIRA: https://issues.apache.org/jira/browse/SPARK-31694
> PR: https://github.com/apache/spark/pull/28617
>
> We have already used these APIs to support Lakehouse on Delta Lake and Hive
> on datasourcev2, and they do solve partition support on datasourcev2.
> Could anyone review it?
>
> Thanks,
> Jacky Lee
