Re: New document: "How to optimize cube build"

2017-01-25 Thread ShaoFeng Shi
Hi Alberto, Thanks for your comments! In many cases the data is imported to Hadoop in T+1 mode. Especially when each day's data is tens of GB, it is reasonable to partition the Hive table by date. The question is whether it is worth keeping a long history of data in Hive; usually users only keep a couple

Re: org.apache.kylin.dict.CachedTreeMap use a couple classes from org.apache.hadoop.fs

2017-01-25 Thread ShaoFeng Shi
Yerui, You're correct, serializing the dictionary isn't a good idea; I will try to initialize these big objects inside the executors instead of transferring them from the driver; I will get back to you if I have problems. Thanks! 2017-01-25 15:56 GMT+08:00 Yerui Sun : > Hi,shaofeng, >Sorry for my slow res

Re: query double column in fact table

2017-01-25 Thread Li Yang
Make the price column a dimension and you should be able to do the query. However, double is a very high-cardinality type and may not work well (high expansion rate, long build time, etc.). If you could lower the price's cardinality by rounding to integers or even tens/hundreds, the resulting cube will b

Qlik

2017-01-25 Thread Alberto Ramón
It's in Spanish, but the picture is very clear https://www.linkedin.com/pulse/qlik-cloudera-bigdatasmartdataanalytics-felipe-trigo?trk=hp-feed-article-title-publish

Re: New document: "How to optimize cube build"

2017-01-25 Thread Alberto Ramón
Be careful about partitioning by "FLIGHTDATE". From https://github.com/albertoRamon/Kylin/tree/master/KylinPerformance *"Option 1: Use id_date as partition column on Hive table. This has a big problem: the Hive metastore is meant for a few hundred partitions, not thousands (Hive 9452 there is an ide

Re: hbase Very high read or write request count in a single RegionServer

2017-01-25 Thread Alberto Ramón
The solution of Li Yang works from CDH 5.4, but if your production env is HBase 1.2, dreaming is free ;) , you can also try: https://issues.apache.org/jira/browse/HBASE-10070 2017-01-25 5:52 GMT+01:00 Li Yang : > Try google 'hbase read replica'. > > Cheers > > On Tue, Jan 17, 2017 at 1

New document: "How to optimize cube build"

2017-01-25 Thread ShaoFeng Shi
Hello, A new document has been added on practices for cube build. Any suggestions or comments are welcome. We can update the doc later based on feedback. Here is the link: https://kylin.apache.org/docs16/howto/howto_optimize_build.html -- Best regards, Shaofeng Shi 史少锋

Re: Proposal for updating master branch to use HBase 1.x

2017-01-25 Thread Billy Liu
+1, nice work. 2017-01-25 15:59 GMT+08:00 Yerui Sun : > +1 > > Use hbase 1.x as default version really make sense. > > However, we also should support 0.98 version as past, since upgrading is > not easy in some production environments. > > > On 2017-01-23, at 11:41, nichunen wrote: > > > > +1 > > > > Hba

[jira] [Created] (KYLIN-2417) Compare the performance between HDFSMetaStore and HBaseMetaStore

2017-01-25 Thread XIE FAN (JIRA)
XIE FAN created KYLIN-2417: -- Summary: Compare the performance between HDFSMetaStore and HBaseMetaStore Key: KYLIN-2417 URL: https://issues.apache.org/jira/browse/KYLIN-2417 Project: Kylin Issue Typ

Re: Proposal for updating master branch to use HBase 1.x

2017-01-25 Thread Yerui Sun
+1 Using HBase 1.x as the default version really makes sense. However, we should also support the 0.98 version as before, since upgrading is not easy in some production environments. > On 2017-01-23, at 11:41, nichunen wrote: > > +1 > > Hbase 1.x is more commonly used in our clients. > > > George/倪春恩 > > Mob