Hi All,

Do we have some statistics about current bottlenecks which part is taking
more time??

*@Ravindra* Please correct me If I am wrong I think our current unsafe sort
is also in-place  it is only swapping the offsets not data. Only from
comparison it is getting the data from off-heap to on-heap.

Our current *GC* bottleneck is in writing step, there we are keeping the
data for some time. There also we can store the data in off-heap and we can
go for unsafe sort while sorting the column level data.

It's better to get the current bottlenecks and proceed further.

-Regards
Kumar Vishal



On Mon, May 22, 2017 at 4:37 PM, Jacky Li <jacky.li...@qq.com> wrote:

> Yes, and if after dictionary encoding, SORT_COLUMNS can fit in 6 bytes,
> our approach can be even better, because the 8 bytes data can be put in
> cache totally, without the remaining portion in memory.
>
> Regards,
> Jacky
>
> > 在 2017年5月22日,下午5:23,Ravindra Pesala <ravi.pes...@gmail.com> 写道:
> >
> > Hi,
> >
> > I think you are referring to tungsten sort, there they tried keep pointer
> > and key together to simulate cache aware computation. It is only possible
> > if the sort keys are always starts with fixed keys like dictionary keys.
> So
> > basically first encountered few dictionary columns can be kept along with
> > pointer and starts sorting, if that is equal then we can go and retrieve
> > remaining key and compare it.
> > It is simple to implement in our current design as our current
> > implementation of unsafe sort is also inspired from tungsten sort.
> >
> > Regards,
> > Ravindra.
> >
> > On 22 May 2017 at 09:31, Jacky Li <jacky.li...@qq.com> wrote:
> >
> >> For sorting, I think more optimization we can do, I am currently
> thinking
> >> these:
> >> 1. Do not sort the whole TablePage, only KeyPage is required as the sort
> >> key
> >>
> >> 2. Should find a more memory efficient sorting algorithm than
> >> System.arraycopy which requires doubling space.
> >>
> >> 3. Should try to hold the KeyPage as well as the RowId in a compact data
> >> structure, it is best if it fits in CPU cache. Modern L3 CPU is larger
> than
> >> 8MB. For this, I am thinking to have a 8 bytes encoded format that
> includes
> >> RowID and SORT_COLUMNS (partial or full), for example, 2 bytes for
> RowId,
> >> remaining 6 bytes for 2 to 3 columns after dictionary encoding.
> >>     a) If we can hold the RowID + whole SORT_COLUMNS in 8 bytes, it will
> >> be most efficient to leverage CPU cache to do sorting, use in-place
> update
> >> approach while sorting. So no extra storage is needed, and the RowID +
> >> whole SORT_COLUMNS will be sorted.
> >>     b) If we can only hold the RowID + partial SORT_COLUMNS in 8 bytes,
> >> we can employ strategy like the sorting in Spark Tungsten project.
> (first
> >> compare the 8 bytes in cache, if  it equals then compare remaining
> bytes in
> >> memory)
> >>
> >> Regards,
> >> Jacky
> >>
> >>> 在 2017年5月22日,上午10:19,David CaiQiang <david.c...@gmail.com> 写道:
> >>>
> >>> As I known, System.arrayCopy of object array is a shallow copy, so I
> >> think
> >>> both KeyPage and TablePage maybe have the same performance on
> >> Arrays.sort.
> >>>
> >>>
> >>> -----
> >>> Best Regards
> >>> David Cai
> >>> --
> >>> View this message in context: http://apache-carbondata-dev-m
> >> ailing-list-archive.1130556.n5.nabble.com/DISCUSS-Data-loadi
> >> ng-improvement-tp11429p13056.html
> >>> Sent from the Apache CarbonData Dev Mailing List archive mailing list
> >> archive at Nabble.com.
> >>
> >>
> >
> >
> > --
> > Thanks & Regards,
> > Ravi
>
>
>
>

Reply via email to