Re: About bucket feature in carbon

Ravindra Pesala Fri, 09 Feb 2018 05:30:49 -0800

Yes Jacky, we will refactor it and use the partition flow.

On 9 February 2018 at 13:44, Jacky Li <13561...@qq.com> wrote:


> Hi Ravindra,
>
> You mean we can do one round of refactoring for the bucketed table feature
> in CarbonData 1.4.
> I am fine with it.
>
> Regards,
> Jacky
>
>
> > On 9 February 2018, at 3:49 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote:
> >
> > Hi Likun,
> >
> > I feel it is better to change the implementation to use Spark's bucketing
> > generation, just like how standard Hive partitions are generated. It will
> > be easy to change once the partition feature is implemented. It is also a
> > useful feature for joining big tables, and hash-based buckets and
> > CLUSTERED BY make queries faster. So it is better to change the
> > implementation instead of removing it.
> >
> > Regards,
> > Ravindra.
> >
> > On 9 February 2018 at 13:14, Jacky Li <jacky.li...@qq.com> wrote:
> >
> >> Hi,
> >>
> >> One year ago, CarbonData 1.0.0 introduced the bucket table feature. It
> >> was expected to improve join performance by avoiding shuffling when both
> >> tables are bucketed on the same column with the same number of buckets.
> >>
> >> However, since this feature was introduced, personally speaking, it has
> >> not been widely used in the community, and it creates maintenance
> >> overhead for the developers in the community (for every new pull
> >> request, all bucket-related test cases need to be fixed).
> >>
> >> And now that Carbon has integrated with Spark's standard partitioning,
> >> developers can add bucket support using Spark's bucketed table feature
> >> in the future if it is required.
> >>
> >> So, I propose removing the bucket feature after the CarbonData 1.3.0 release.
> >> What do you think?
> >>
> >> Regards,
> >> Jacky
> >>
> >>
> >
> >
> > --
> > Thanks & Regards,
> > Ravi
>
>
>
>


-- 
Thanks & Regards,
Ravi
