Hi Roopa,
Bucketing is a more general concept. I think what you are referring to is how
to integrate with the Spark SQL bucketing syntax. I was proposing a Hudi-native
solution in which we implement bucket indexing, which gives the same end result
of writing compacted (parquet) files with keys
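
To make the idea concrete, here is a minimal sketch of hash-based bucketing on
record keys. It is not Hudi's actual index API: the function names, the CRC32
hash, and the bucket count are all assumptions chosen for illustration. It only
shows that a key always maps to the same bucket, so a lookup on a key needs to
touch one bucket's files.

```python
import zlib

NUM_BUCKETS = 8  # hypothetical bucket count, fixed per table/partition


def bucket_for_key(record_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a record key to a bucket id with a stable hash (CRC32 here)."""
    return zlib.crc32(record_key.encode("utf-8")) % num_buckets


def group_by_bucket(record_keys, num_buckets: int = NUM_BUCKETS) -> dict:
    """Group keys by bucket id, mimicking one output file group per bucket."""
    buckets = {}
    for key in record_keys:
        buckets.setdefault(bucket_for_key(key, num_buckets), []).append(key)
    return buckets


def buckets_to_scan(lookup_key: str, num_buckets: int = NUM_BUCKETS) -> list:
    """A point lookup on a key only needs the single bucket it hashes to."""
    return [bucket_for_key(lookup_key, num_buckets)]


if __name__ == "__main__":
    keys = [f"id-{i}" for i in range(20)]
    layout = group_by_bucket(keys)
    print("bucket layout:", layout)
    print("buckets to scan for id-7:", buckets_to_scan("id-7"))
```

Because the key-to-bucket mapping is a pure function of the key, no separate
lookup structure is needed to locate a record; that is the property a
bucket-style index would rely on.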
Hi Balaji,
Thanks for your response. I went through HoodieIndex in the source code, but I
am not sure how indexing alone could help with bucketing.
Spark bucketing would involve writing the compacted files in a bucketed/clustered
fashion such that when a Spark SQL query has a certain id, only the