Yes Jacky, we will refactor it and use the partition flow.

On 9 February 2018 at 13:44, Jacky Li <13561...@qq.com> wrote:
> Hi Ravindra,
>
> You mean we can do one round of refactoring for the bucketed table feature
> in CarbonData 1.4? I am fine with it.
>
> Regards,
> Jacky
>
> On 9 February 2018, at 3:49 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote:
>
> > Hi Likun,
> >
> > I feel it is better to change the implementation to use Spark's bucketing
> > generation, just as standard Hive partitions are generated. It will be
> > easy to change once the partition feature is implemented. It is also a
> > useful feature for joining big tables: hash-based buckets and CLUSTERED BY
> > make such queries faster. So it is better to change the implementation
> > instead of removing it.
> >
> > Regards,
> > Ravindra.
> >
> > On 9 February 2018 at 13:14, Jacky Li <jacky.li...@qq.com> wrote:
> >
> >> Hi,
> >>
> >> One year ago, CarbonData 1.0.0 introduced the bucket table feature. It
> >> was expected to improve join performance by avoiding shuffling when both
> >> tables are bucketed on the same column with the same number of buckets.
> >>
> >> However, since this feature was introduced, in my personal view it has
> >> not been widely used in the community, and it creates maintenance
> >> overhead for the developers in the community (for every new pull
> >> request, all bucket-related test cases need to be fixed).
> >>
> >> Now that Carbon has integrated with Spark's standard partitioning,
> >> developers can add bucket support using Spark's bucketed table feature
> >> in the future if it is required.
> >>
> >> So, I propose to remove the bucket feature after CarbonData 1.3.0.
> >> What do you think?
> >>
> >> Regards,
> >> Jacky
> >
> > --
> > Thanks & Regards,
> > Ravi

--
Thanks & Regards,
Ravi