+1 Thanks Lvhu for bringing up the idea.  As Alexey suggested, it would be
good for you to write down the proposal with design details for discussion
in the community.

On Thu, Feb 16, 2023 at 11:28 AM Alexey Kudinkin <ale...@onehouse.ai> wrote:

> Thanks for your contribution, Lvhu!
>
> I think we should actually kick-start this effort with a small RFC
> outlining proposed changes first, as this is modifying the core read-flow
> for all Hudi tables and we want to make sure our approach there is
> rock-solid.
>
> On Thu, Feb 16, 2023 at 6:34 AM 吕虎 <lvh...@163.com> wrote:
>
> > Hi folks,
> >       PR 7984 (https://github.com/apache/hudi/pull/7984) implements hash
> > partitioning.
> >       As you know, it is often difficult to find an appropriate partition
> > key in existing big data. Hash partitioning can easily solve this
> > problem, and it can greatly improve the performance of Hudi's big data
> > processing.
> >       The idea is to use the hash partition field as one of the partition
> > fields of the ComplexKeyGenerator, so this PR's implementation does not
> > modify the logic of the core code.
> >       The code is easy to review, and I think hash partitioning is very
> > useful. We really need it.
> >       For how to use hash partitioning in the Spark data source, refer to
> > https://github.com/lvhu-goodluck/hudi/blob/hash_partition_spark_data_source/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestHoodieSparkSqlWriter.scala
> >  #testHashPartition
> >
> >       If the hash partition parameters are not specified, there is no
> > public API change, no user-facing feature change, and no performance
> > impact.
> >
> >       When hash.partition.fields is specified and partition.fields
> > contains _hoodie_hash_partition, a column named _hoodie_hash_partition
> > will be added to the table as one of the partition keys.
> >
> >       If predicates on the hash.partition.fields columns appear in the
> > query statement, the _hoodie_hash_partition = X predicate will be
> > automatically added to the query statement for partition pruning.
> >
> >       I hope folks can help review it!
> >       Thanks!
> > Lvhu
> >
>
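
The mechanism Lvhu describes (deriving a _hoodie_hash_partition column from the hash.partition.fields columns, then adding a _hoodie_hash_partition = X predicate for pruning) can be illustrated with a rough sketch. This is not the code from PR 7984: the function names, the num_buckets parameter, and the use of CRC32 as the hash function are all assumptions made for illustration; only the column name and the overall idea come from the thread.

```python
import zlib

# Column name taken from the thread; everything else below is hypothetical.
HASH_PARTITION_COLUMN = "_hoodie_hash_partition"


def hash_partition_value(values, hash_fields, num_buckets):
    """Map the values of hash_fields to a bucket id in [0, num_buckets)."""
    # Join the field values with a separator unlikely to occur in the data.
    key = "\x1f".join(str(values[f]) for f in hash_fields)
    # CRC32 is chosen only because it is stable across processes
    # (Python's built-in hash() of strings is not). The PR may hash differently.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def add_hash_partition(record, hash_fields, num_buckets):
    """Return a copy of record with the derived hash partition column added."""
    out = dict(record)
    out[HASH_PARTITION_COLUMN] = hash_partition_value(record, hash_fields, num_buckets)
    return out


def pruning_predicate(equality_predicates, hash_fields, num_buckets):
    """Derive the extra partition predicate for a query, if one can be derived.

    equality_predicates maps column name -> literal for predicates of the
    form `col = literal`. A bucket can only be computed when every hash
    field is pinned by such a predicate; otherwise return None (no pruning).
    """
    if not all(f in equality_predicates for f in hash_fields):
        return None
    bucket = hash_partition_value(equality_predicates, hash_fields, num_buckets)
    return HASH_PARTITION_COLUMN, bucket
```

In this sketch, a query with `id = 42` on a table hashed on `id` would gain one extra equality predicate on _hoodie_hash_partition, so only the matching partition needs to be scanned. A query that does not constrain every hash field gets no extra predicate and reads all hash partitions.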
