The RFC PR link :)
https://github.com/apache/hudi/pull/4326

I am personally inclined to add a new index option (DYNAMIC_BUCKET_INDEX),
to keep a clean & performant hash index (BUCKET_INDEX) option for
experienced users.
We could also save potential migration trouble by ensuring consistent
behavior of the hash index.
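
To make the two options concrete, here is a minimal configuration sketch;
the property names below are illustrative assumptions on my part, not
settled names from the RFC:

    # existing static hash index: bucket number fixed at table creation
    hoodie.index.type=BUCKET_INDEX
    hoodie.bucket.index.num.buckets=256

    # proposed dynamic variant: buckets split/merge as data volume changes
    hoodie.index.type=DYNAMIC_BUCKET_INDEX
    hoodie.bucket.index.split.threshold=2GB   # hypothetical resize trigger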

The impact of resizing (split/merge) is described in the RFC. In short,
the split/merge runs asynchronously as part of the clustering process and
doesn't block concurrent readers & writers. By controlling its processing
granularity, we can further alleviate the performance impact on concurrent
operations.
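
As a rough illustration of granularity control, here is a minimal Java
sketch under my own assumptions (none of these names are Hudi APIs): a
scheduler that caps how many oversized buckets one clustering run may
resize, so the remaining candidates simply wait for a later run:

    import java.util.Comparator;
    import java.util.List;

    // Hypothetical sketch: pick at most maxBucketsPerRun oversized buckets
    // for one clustering run, bounding the extra I/O any single run can
    // impose on concurrent writers and readers.
    class ResizeScheduler {
        record Bucket(String id, long sizeBytes) {}

        static List<Bucket> pickSplitCandidates(List<Bucket> buckets,
                                                long sizeThresholdBytes,
                                                int maxBucketsPerRun) {
            return buckets.stream()
                    .filter(b -> b.sizeBytes() > sizeThresholdBytes)
                    .sorted(Comparator.comparingLong(Bucket::sizeBytes)
                                      .reversed())
                    .limit(maxBucketsPerRun)
                    .toList();
        }
    }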

I agree that merge (shrinking a table) may be a very infrequent operation.
But I guess we still need to implement it for completeness :)

On Tue, Dec 14, 2021 at 1:52 AM Vinoth Chandar <[email protected]> wrote:

> +1 on the overall idea.
>
> I am wondering if we can layer this on top of the Hash Index as a way to
> just expand the number of buckets.
>
> While Split/Merge sounds great, IMO there is significant operational
> overhead to it. Most practical scenarios can be met with the ability to
> expand with zero impact, as you describe?
> In fact, back when I worked on Voldemort (LinkedIn's Dynamo impl), we
> never shrank the tables for this reason as well.
>
> In any case, I look forward to the RFC. Please grab an RFC number!
>
> On Mon, Dec 13, 2021 at 6:24 AM Gary Li <[email protected]> wrote:
>
> > +1, looking forward to the RFC.
> >
> > Best,
> > Gary
> >
> > On Sun, Dec 12, 2021 at 7:12 PM leesf <[email protected]> wrote:
> >
> > > +1 for the improvement to make the bucket index more comprehensive,
> > > and looking forward to the RFC for more details.
> > >
> > > On Fri, Dec 10, 2021 at 16:22 Yuwei Xiao <[email protected]> wrote:
> > >
> > > > Dear Hudi Community,
> > > >
> > > > I would like to propose Consistent Hashing Indexing to enable a
> > > > dynamic bucket number, saving hyper-parameter tuning for Hudi users.
> > > >
> > > > Currently, we have the Bucket Index landing [1]. It is an effective
> > > > index approach to address the performance issue during Upsert. I
> > > > observed ~3x throughput improvement for Upsert in my local setup
> > > > compared to the Bloom Filter approach. However, it requires
> > > > pre-configuring a bucket number when creating the table. As
> > > > described in [1], this imposes two limitations:
> > > >
> > > > - Due to the one-to-one mapping between buckets and file groups,
> > > > the size of a single file group may grow without bound. Services
> > > > like compaction will take longer because of the larger read/write
> > > > amplification.
> > > >
> > > > - There may exist data skew because of imbalanced data
> > > > distribution, resulting in long-tail reads/writes.
> > > >
> > > > Based on the above observations, supporting a dynamic bucket number
> > > > is necessary, especially for rapidly changing Hudi tables. Looking
> > > > across the industry, Consistent Hashing has been adopted in DB
> > > > systems [2][3]. Its main idea is to turn the "key->bucket" mapping
> > > > into "key->hash_value->(range mapping)->bucket", constraining the
> > > > re-hashing process to touch only several local buckets (e.g., only
> > > > large file groups) rather than shuffling the whole hash table.
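
A minimal Java sketch of that range-mapping lookup (my own illustrative
assumptions about names and hashing, not the proposed Hudi implementation):

    import java.util.Map;
    import java.util.TreeMap;

    // Each bucket owns a hash range ending at its key in the ring. A record
    // key is routed to the first bucket whose range end is >= hash(key),
    // wrapping around to the first bucket past the last range end.
    class ConsistentHashRing {
        private final TreeMap<Integer, String> ring = new TreeMap<>();

        void addBucket(int rangeEnd, String fileGroupId) {
            ring.put(rangeEnd, fileGroupId);
        }

        String bucketFor(String recordKey) {              // assumes non-empty ring
            int h = recordKey.hashCode() & 0x7fffffff;    // non-negative hash
            Map.Entry<Integer, String> e = ring.ceilingEntry(h); // first end >= h
            return (e != null ? e : ring.firstEntry()).getValue();
        }
    }

Splitting a bucket only inserts one new range boundary inside that bucket's
range, so only that bucket's keys move; a plain "hash % bucket_number"
scheme would instead remap almost every key whenever the bucket number
changes.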
> > > >
> > > > In order to introduce Consistent Hashing to Hudi, we need to
> > > > consider the following issues:
> > > >
> > > > - Storing hashing metadata, such as range mapping info. Metadata
> > > > size and concurrent updates to the metadata should also be
> > > > considered.
> > > >
> > > > - Splitting & merging criteria. We need to design one (or several)
> > > > policies to manage when and how to split & merge buckets. A simple
> > > > policy would be splitting in the middle when the file group reaches
> > > > a size threshold (a sketch follows this list).
> > > >
> > > > - Supporting concurrent writes & reads. The splitting or merging
> > > > must not block concurrent writers & readers, and the whole process
> > > > should be fast enough (e.g., one bucket at a time) to minimize the
> > > > impact on other operations.
> > > >
> > > > - Integrating the splitting & merging process into existing Hudi
> > > > table service pipelines.
> > > >
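For the size-threshold policy in the splitting & merging bullet above, a
hedged Java sketch of "split in the middle" (all names are hypothetical,
not Hudi APIs):

    // Hypothetical policy: a bucket whose file group exceeds the size
    // threshold has its hash range [start, end] split at the midpoint,
    // giving each child bucket roughly half of the keys.
    class MidpointSplitPolicy {
        static final long SIZE_THRESHOLD_BYTES =
                2L * 1024 * 1024 * 1024;                 // e.g. 2 GB

        static boolean shouldSplit(long fileGroupSizeBytes) {
            return fileGroupSizeBytes > SIZE_THRESHOLD_BYTES;
        }

        // New boundary dividing the bucket's hash range into two halves.
        static long splitPoint(long rangeStart, long rangeEnd) {
            return rangeStart + (rangeEnd - rangeStart) / 2; // overflow-safe
        }
    }
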
> > > > I have sketched a prototype design to address the above problems:
> > > >
> > > > - Maintain hashing metadata for each partition (persisted as
> > > > files), and use instants to manage multi-versioning and concurrent
> > > > updates of it (one possible shape is sketched after this list).
> > > >
> > > > - A flexible framework will be implemented to support different
> > > > pluggable policies. The splitting (or merging) plan, specifying
> > > > which buckets to split and how, will be generated during scheduling
> > > > (just like compaction does).
> > > >
> > > > - Dual-write will be activated once the writer observes the
> > > > splitting (or merging) process, upserting records as log files into
> > > > both the old and new buckets (file groups). Readers can see records
> > > > once the writer completes, regardless of the splitting process.
> > > >
> > > > - The splitting & merging could be integrated as a sub-task of the
> > > > Clustering service, because we could view them as a special case of
> > > > Clustering's goal (i.e., managing file groups based on file size).
> > > > Though we need to modify Clustering to handle log files, the bucket
> > > > index enhances Clustering by allowing concurrent updates.
> > > >
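One possible shape for the per-partition hashing metadata referenced in
the first design bullet (the field names and layout are my assumptions,
purely illustrative):

    import java.util.List;

    // Hypothetical per-partition hashing metadata, persisted as a file and
    // versioned by instant time so concurrent updates can be reconciled.
    record HashingMetadata(String partitionPath,
                           String instantTime,        // metadata version
                           List<BucketRange> ranges)  // sorted, disjoint
    {}

    record BucketRange(long rangeStart,
                       long rangeEnd,       // hash range owned by the bucket
                       String fileGroupId)  // one-to-one bucket -> file group
    {}
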
> > > >
> > > > Would love to hear your thoughts and any feedback about the
> > > > proposal. I can draft an RFC with a detailed design once we reach
> > > > an agreement.
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/HUDI/RFC+-+29%3A+Hash+Index
> > > >
> > > > [2] YugabyteDB
> > > > https://docs.yugabyte.com/latest/architecture/docdb-sharding/sharding/#example
> > > >
> > > > [3] PolarDB-X
> > > > https://help.aliyun.com/document_detail/316603.html#title-y5n-2i1-5ws
> > > >
> > > >
> > > >
> > > > Best,
> > > >
> > > > Yuwei Xiao
> > > >
> > >
> >
>
