Re: Kylin 5

Singh Sonu Mon, 21 Aug 2023 02:30:05 -0700

Hi ShaoFeng,

Thanks for your valuable suggestions.
I will surely apply your inputs to the model and let the Kylin team know
about the improvement.





 Email- [email protected]

 with regards,
 Sonu Kumar Singh




On Mon, Aug 21, 2023 at 2:48 PM ShaoFeng Shi <[email protected]> wrote:

> Hello Sonu,
>
> In Kylin 5, you can add multiple raw indexes to a data model. For each raw
> index, you can add the columns you want, specify the column sequence (put
> filter columns ahead of non-filter columns), and specify a "ShardBy" column
> (which should be a high-cardinality column; it is used to distribute data
> into different shards/buckets/files). Kylin will then sort, distribute,
> and index the data in your preferred way, which benefits query
> performance.
>
> For the case you mentioned, "a table with 118 columns, and a filter query
> can run on any column", I think it is difficult to achieve sub-second
> performance with reasonable resources. Some suggestions:
>
> 1) build the data incrementally into multiple segments instead of one
> whole build; by doing so, Kylin will have segment-level indexes, which may
> help with high-level pruning;
> 2) add and build multiple raw indexes, but there is no need to add too
> many, as they will take up a lot of space;
> 3) in each raw index, carefully select the column sequence and "ShardBy"
> column, so that Spark can do as much file pruning as it can.
>
> Other developers may also have other inputs. Hope this helps.
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC,
> Apache Incubator PMC,
> Email: [email protected]
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: [email protected]
> Join Kylin dev mail group: [email protected]
>
>
>
>
> Singh Sonu <[email protected]> wrote on Mon, Aug 21, 2023 at 16:33:
>
> > Hi Kylin Team,
> >
> > How do I use the inverted and sorted indexes in Kylin 5?
> > Queries against aggregated data are fast, but when I run a query
> > against raw data, or a search query, Kylin 5 does not perform
> > well. I created a model against a table with 118 columns, and a filter
> > query can run on any column.
> >
> > Please suggest.
> >
> >
> >
> >
> >  You can reach me at
> >  Mb. No- 7092292112
> >  Email- [email protected]
> >
> >  with regards,
> >  Sonu Kumar Singh
> >
>
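
To make the two pruning ideas in the quoted reply more concrete (segment-level
pruning from incremental builds, and file pruning from a "ShardBy" column),
here is a small conceptual sketch in Python. It is not Kylin's implementation
or API: the names Segment, shard_of, build_segment, and query, the shard count
of 4, and the columns "day" and "user_id" are all hypothetical and chosen only
for illustration. The sorting of rows inside each file is left out.

```python
from dataclasses import dataclass
from datetime import date
from zlib import crc32

NUM_SHARDS = 4  # hypothetical shard/file count per segment


@dataclass
class Segment:
    """One incrementally built slice of data covering [start, end)."""
    start: date
    end: date
    shards: dict  # shard id -> list of rows (stands in for one file)


def shard_of(value, num_shards=NUM_SHARDS):
    """Map a ShardBy value to a shard id with a stable hash."""
    return crc32(str(value).encode("utf-8")) % num_shards


def build_segment(start, end, rows, shard_by):
    """Distribute rows into shards by the hash of the ShardBy column."""
    shards = {i: [] for i in range(NUM_SHARDS)}
    for row in rows:
        shards[shard_of(row[shard_by])].append(row)
    return Segment(start, end, shards)


def query(segments, day, shard_by, value):
    """Equality filter on day and on the ShardBy column.

    Returns the matching rows and the number of shard files read.
    """
    hits, files_read = [], 0
    for seg in segments:
        if not (seg.start <= day < seg.end):
            continue  # segment-level pruning: skip the whole segment
        files_read += 1  # file pruning: only one shard can hold the value
        for row in seg.shards[shard_of(value)]:
            if row["day"] == day and row[shard_by] == value:
                hits.append(row)
    return hits, files_read
```

In this sketch, an equality filter on the ShardBy column reads one file in the
one matching segment, instead of every file in every segment. A filter on a
column that is neither the segment column nor the ShardBy column gets no such
benefit, which matches the point that a filter on any of 118 columns is hard
to make fast.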
