Hello Sonu, you're welcome. Thank you for sharing your experience with
Kylin 5!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC,
Apache Incubator PMC,
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




On Mon, Aug 21, 2023 at 17:30, Singh Sonu <sonusingh.javat...@gmail.com> wrote:

> Hi ShaoFeng,
>
> Thanks for your valuable suggestions.
> I will surely apply your inputs to the model and let the Kylin team know
> about the improvement.
>
>
>
>
>  Email- sonusingh.javat...@gmail.com
>
>  with regards,
>  Sonu Kumar Singh
>
>
>
>
> On Mon, Aug 21, 2023 at 2:48 PM ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
> > Hello Sonu,
> >
> > In Kylin 5, you can add multiple raw indexes to a data model. For each raw
> > index, you can choose the columns you want, specify the column sequence
> > (put filter columns ahead of non-filter columns), and specify a "ShardBy"
> > column (this should be a high-cardinality column; it is used to distribute
> > data into different shards/buckets/files). Kylin will then sort,
> > distribute, and index the data in the way you specified, which benefits
> > query performance.
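To illustrate why the "ShardBy" column matters, here is a minimal Python sketch. It is a toy model, not Kylin's actual code or API: the function names, the CRC32 hash, and the shard count of 4 are all assumptions made for the example. It shows that when rows are distributed by a hash of one column, an equality filter on that column only needs to read one shard.

```python
# Toy model of shard pruning; not Kylin's real implementation or API.
from collections import defaultdict
import zlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_of(value, num_shards=NUM_SHARDS):
    """Map a ShardBy value to a shard number with a stable hash."""
    return zlib.crc32(str(value).encode("utf-8")) % num_shards


def build_shards(rows, shard_by, sort_by, num_shards=NUM_SHARDS):
    """Distribute rows by the ShardBy column, then sort rows inside each shard."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_of(row[shard_by], num_shards)].append(row)
    for shard_rows in shards.values():
        shard_rows.sort(key=lambda r: tuple(r[c] for c in sort_by))
    return dict(shards)


def query_equals(shards, shard_by, value, num_shards=NUM_SHARDS):
    """Answer an equality filter on the ShardBy column by reading one shard."""
    target = shard_of(value, num_shards)
    scanned = shards.get(target, [])
    hits = [r for r in scanned if r[shard_by] == value]
    return hits, len(scanned)


# Made-up sample data: 1000 rows, 50 distinct user_id values.
rows = [{"user_id": i % 50, "city": "c%d" % (i % 5), "amount": i}
        for i in range(1000)]
shards = build_shards(rows, shard_by="user_id", sort_by=["city", "user_id"])
hits, scanned = query_equals(shards, "user_id", 7)
print(len(hits), "matching rows; scanned", scanned, "of", len(rows))
```

A filter on a column that is not the ShardBy column gets no such benefit in this model and has to read every shard, which is why the choice of that column depends on the queries you expect.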
> >
> > For the case you mentioned, "a table with 118 columns, and a filter query
> > can run on any column", I think it is difficult to achieve sub-second
> > performance with a reasonable amount of resources. Some suggestions:
> >
> > 1) build the data incrementally into multiple segments instead of one
> > whole build; by doing so, Kylin will have a segment-level index, which may
> > help with high-level pruning;
> > 2) add and build multiple raw indexes, but there is no need to add too
> > many, as they will take up a lot of space;
> > 3) in each raw index, carefully select the column sequence and the
> > "ShardBy" column, so that Spark can do as much file pruning as it can;
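The first suggestion (incremental segments) can be pictured with another small Python sketch. Again this is only a toy model under stated assumptions: the `Segment` class, the monthly boundaries, and the `dt` column are hypothetical and are not Kylin APIs. The idea is that each segment covers a date range, so a query with a date filter can skip whole segments before looking at any rows.

```python
# Toy model of segment-level pruning; names are hypothetical, not Kylin APIs.
from dataclasses import dataclass
from datetime import date


@dataclass
class Segment:
    start: date  # inclusive
    end: date    # exclusive
    rows: list


def build_segments(rows, date_col, boundaries):
    """Split rows into one segment per [start, end) date range."""
    segments = [Segment(s, e, []) for s, e in zip(boundaries, boundaries[1:])]
    for row in rows:
        for seg in segments:
            if seg.start <= row[date_col] < seg.end:
                seg.rows.append(row)
                break
    return segments


def prune(segments, lo, hi):
    """Keep only segments whose range overlaps the query range [lo, hi)."""
    return [s for s in segments if s.start < hi and lo < s.end]


# Six monthly segments covering January through June 2023.
boundaries = [date(2023, m, 1) for m in range(1, 8)]
# Made-up sample data spread evenly over the six months.
rows = [{"dt": date(2023, 1 + i % 6, 1 + i % 28), "v": i} for i in range(600)]
segments = build_segments(rows, "dt", boundaries)

kept = prune(segments, date(2023, 3, 10), date(2023, 3, 20))
print("segments kept:", len(kept), "of", len(segments))
```

With one whole build there would be a single segment covering everything, so this kind of pruning could never skip anything.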
> >
> > Other developers may also have other inputs. Hope this helps.
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC,
> > Apache Incubator PMC,
> > Email: shaofeng...@apache.org
> >
> >
> >
> >
> >
> > On Mon, Aug 21, 2023 at 16:33, Singh Sonu <sonusingh.javat...@gmail.com> wrote:
> >
> > > Hi Kylin Team,
> > >
> > > How do I use the inverted and sorted indexes in Kylin 5?
> > > Queries against aggregated data run fast, but when I run a query
> > > against raw data, or a search query, Kylin 5 does not perform well.
> > > I created a model against a table with 118 columns, and a filter
> > > query can run on any column.
> > >
> > > Please suggest.
> > >
> > >
> > >
> > >
> > >  You can reach me at
> > >  Mb. No- 7092292112
> > >  Email- sonusingh.javat...@gmail.com
> > >
> > >  with regards,
> > >  Sonu Kumar Singh
> > >
> >
>
