Hello Sonu, you're welcome. Thank you for sharing your experience with Kylin 5!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC,
Apache Incubator PMC,
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


On Mon, Aug 21, 2023 at 17:30, Singh Sonu <sonusingh.javat...@gmail.com> wrote:

> Hi ShaoFeng,
>
> Thanks for your valuable suggestions.
> I will surely apply your inputs to the model and let the Kylin team know
> about the improvement.
>
> Email- sonusingh.javat...@gmail.com
>
> with regards,
> Sonu Kumar Singh
>
> On Mon, Aug 21, 2023 at 2:48 PM ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
> > Hello Sonu,
> >
> > In Kylin 5, you can add multiple raw indexes to a data model. For each
> > raw index, you can add the columns you want, specify the column sequence
> > (put the filter columns ahead of the non-filter columns), and specify a
> > "ShardBy" column (this should be a high-cardinality column; it is used to
> > distribute data into different shards/buckets/files). Kylin will then
> > sort, distribute and index the data the way you specified, which will
> > benefit query performance.
> >
> > For the case you mentioned, "a table with 118 columns, and a filter query
> > can run on any column", I think it is difficult to achieve sub-second
> > performance with a reasonable amount of resources. Some suggestions:
> >
> > 1) build the data incrementally into multiple segments, instead of one
> > whole build; by doing so, Kylin will have a segment-level index, which
> > may help with high-level pruning;
> > 2) add and build multiple raw indexes, but there is no need to add too
> > many, as that will take a lot of space;
> > 3) in each raw index, carefully select the column sequence and the
> > "ShardBy" column, so that Spark can do as much file pruning as it can.
> >
> > Other developers may also have other inputs. Hope this helps.
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> >
> > On Mon, Aug 21, 2023 at 16:33, Singh Sonu <sonusingh.javat...@gmail.com>
> > wrote:
> >
> > > Hi Kylin Team,
> > >
> > > How do I use the inverted and sorted indexes in Kylin 5?
> > > Queries against aggregated data run fast, but when I run a query
> > > against raw data, or a search query, Kylin 5 does not perform fast.
> > > I created a model against a table with 118 columns, and a filter
> > > query can run on any column.
> > >
> > > Please suggest.
> > >
> > > You can reach me at
> > > Mb. No- 7092292112
> > > Email- sonusingh.javat...@gmail.com
> > >
> > > with regards,
> > > Sonu Kumar Singh
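
To illustrate the advice in the 2:48 PM reply about column sequence and the "ShardBy" column, here is a minimal Python sketch of why they affect file pruning. It is a toy model, not Kylin or Spark code: the names DataFile, build_files and files_to_scan, the modulo-based sharding, and the per-file min/max statistics are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Row = Dict[str, int]


@dataclass
class DataFile:
    shard: int                           # bucket this file belongs to
    rows: List[Row]                      # rows stored in this file
    min_max: Dict[str, Tuple[int, int]]  # per-column (min, max) statistics


def build_files(rows, sort_cols, shard_by, num_shards, rows_per_file):
    """Distribute rows by the shard-by column, sort each shard, cut it into files."""
    shards: Dict[int, List[Row]] = {s: [] for s in range(num_shards)}
    for row in rows:
        # Assumption: plain modulo stands in for a real hash partitioner.
        shards[row[shard_by] % num_shards].append(row)

    files = []
    for shard, shard_rows in shards.items():
        # The column sequence decides the sort order inside each shard.
        shard_rows.sort(key=lambda r: tuple(r[c] for c in sort_cols))
        for start in range(0, len(shard_rows), rows_per_file):
            chunk = shard_rows[start:start + rows_per_file]
            stats = {c: (min(r[c] for r in chunk), max(r[c] for r in chunk))
                     for c in chunk[0]}
            files.append(DataFile(shard, chunk, stats))
    return files


def files_to_scan(files, column, value, shard_by, num_shards):
    """Return the files an equality filter `column == value` cannot skip."""
    kept = []
    for f in files:
        # Shard pruning: only possible when filtering on the shard-by column.
        if column == shard_by and f.shard != value % num_shards:
            continue
        # Min/max pruning: skip files whose value range excludes the value.
        low, high = f.min_max[column]
        if low <= value <= high:
            kept.append(f)
    return kept


# Synthetic table: "city" is the leading sort column, "user_id" is the
# high-cardinality shard-by column, "noise" is in neither role.
rows = [{"user_id": i, "city": i % 10, "noise": (i * 37) % 101}
        for i in range(1000)]
files = build_files(rows, sort_cols=["city", "user_id"], shard_by="user_id",
                    num_shards=4, rows_per_file=50)

city_scan = files_to_scan(files, "city", 4, "user_id", 4)
user_scan = files_to_scan(files, "user_id", 123, "user_id", 4)
noise_scan = files_to_scan(files, "noise", 5, "user_id", 4)

print(len(files), len(city_scan), len(user_scan), len(noise_scan))
```

With this synthetic data, the filter on the leading sort column reads 2 of the 20 files, the filter on the shard-by column reads 5, and the filter on the column that is neither sorted nor sharded reads all 20. That is the reason a single raw index cannot serve filters on all 118 columns equally well, and why several raw indexes with different column sequences may help.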