[GitHub] coveralls commented on issue #274: KYLIN-3597 Improve code smell
coveralls commented on issue #274: KYLIN-3597 Improve code smell URL: https://github.com/apache/kylin/pull/274#issuecomment-425620750 ## Pull Request Test Coverage Report for [Build 3713](https://coveralls.io/builds/19260169) * **21** of **38** **(55.26%)** changed or added relevant lines in **3** files are covered. * **2** unchanged lines in **2** files lost coverage. * Overall coverage increased (+**0.002%**) to **23.174%** --- | Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | | :-|--||---: | | [core-cube/src/main/java/org/apache/kylin/gridtable/GTUtil.java](https://coveralls.io/builds/19260169/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fgridtable%2FGTUtil.java#L197) | 0 | 2 | 0.0% | [server-base/src/main/java/org/apache/kylin/rest/controller/CubeController.java](https://coveralls.io/builds/19260169/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Fcontroller%2FCubeController.java#L370) | 0 | 2 | 0.0% | [core-cube/src/main/java/org/apache/kylin/gridtable/GTFilterScanner.java](https://coveralls.io/builds/19260169/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fgridtable%2FGTFilterScanner.java#L85) | 21 | 34 | 61.76% | Files with Coverage Reduction | New Missed Lines | % | | :-|--|--: | | [server-base/src/main/java/org/apache/kylin/rest/controller/CubeController.java](https://coveralls.io/builds/19260169/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Fcontroller%2FCubeController.java#L369) | 1 | 0.0% | | [core-common/src/main/java/org/apache/kylin/common/persistence/IdentifierFileResourceStore.java](https://coveralls.io/builds/19260169/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FIdentifierFileResourceStore.java#L40) | 1 | 0.0% | | Totals | [![Coverage Status](https://coveralls.io/builds/19260169/badge)](https://coveralls.io/builds/19260169) | | :-- | --: | | Change from base [Build 
3711](https://coveralls.io/builds/19259592): | 0.002% | | Covered Lines: | 16162 | | Relevant Lines: | 69742 | --- # 💛 - [Coveralls](https://coveralls.io) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io commented on issue #274: KYLIN-3597 Improve code smell
codecov-io commented on issue #274: KYLIN-3597 Improve code smell URL: https://github.com/apache/kylin/pull/274#issuecomment-425620454 # [Codecov](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=h1) Report > Merging [#274](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=desc) into [master](https://codecov.io/gh/apache/kylin/commit/10587a65fe0552179a5c8a6e1151686ce1c8a135?src=pr&el=desc) will **increase** coverage by `<.01%`. > The diff coverage is `42.1%`. [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/274/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master #274 +/- ## + Coverage 21.16% 21.16% +<.01% Complexity 4405 4405 Files 1086 1086 Lines 6974569742 -3 Branches 1008810087 -1 Hits 1476114761 + Misses5358653582 -4 - Partials 1398 1399 +1 ``` | [Impacted Files](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...torage/gtrecord/SortedIteratorMergerWithLimit.java](https://codecov.io/gh/apache/kylin/pull/274/diff?src=pr&el=tree#diff-Y29yZS1zdG9yYWdlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9zdG9yYWdlL2d0cmVjb3JkL1NvcnRlZEl0ZXJhdG9yTWVyZ2VyV2l0aExpbWl0LmphdmE=) | `78.72% <ø> (ø)` | `2 <0> (ø)` | :arrow_down: | | [...ommon/persistence/IdentifierFileResourceStore.java](https://codecov.io/gh/apache/kylin/pull/274/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9JZGVudGlmaWVyRmlsZVJlc291cmNlU3RvcmUuamF2YQ==) | `0% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: | | [...c/main/java/org/apache/kylin/gridtable/GTUtil.java](https://codecov.io/gh/apache/kylin/pull/274/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9ncmlkdGFibGUvR1RVdGlsLmphdmE=) | `1.37% <0%> (ø)` | `2 <0> (ø)` | :arrow_down: | | 
[...g/apache/kylin/rest/controller/CubeController.java](https://codecov.io/gh/apache/kylin/pull/274/diff?src=pr&el=tree#diff-c2VydmVyLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL3Jlc3QvY29udHJvbGxlci9DdWJlQ29udHJvbGxlci5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: | | [...va/org/apache/kylin/gridtable/GTFilterScanner.java](https://codecov.io/gh/apache/kylin/pull/274/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9ncmlkdGFibGUvR1RGaWx0ZXJTY2FubmVyLmphdmE=) | `37.34% <47.05%> (+0.76%)` | `2 <1> (ø)` | :arrow_down: | | [...lin/dict/lookup/cache/RocksDBLookupTableCache.java](https://codecov.io/gh/apache/kylin/pull/274/diff?src=pr&el=tree#diff-Y29yZS1kaWN0aW9uYXJ5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9kaWN0L2xvb2t1cC9jYWNoZS9Sb2Nrc0RCTG9va3VwVGFibGVDYWNoZS5qYXZh) | `76.16% <0%> (-0.52%)` | `27% <0%> (ø)` | | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=footer). Last update [10587a6...fbb2f66](https://codecov.io/gh/apache/kylin/pull/274?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] hit-lacus opened a new pull request #274: KYLIN-3597 Improve code smell
hit-lacus opened a new pull request #274: KYLIN-3597 Improve code smell URL: https://github.com/apache/kylin/pull/274 https://issues.apache.org/jira/projects/KYLIN/issues/KYLIN-3597
[GitHub] asfgit commented on issue #274: KYLIN-3597 Improve code smell
asfgit commented on issue #274: KYLIN-3597 Improve code smell URL: https://github.com/apache/kylin/pull/274#issuecomment-425619076 Can one of the admins verify this patch?
[GitHub] asfgit commented on issue #274: KYLIN-3597 Improve code smell
asfgit commented on issue #274: KYLIN-3597 Improve code smell URL: https://github.com/apache/kylin/pull/274#issuecomment-425619077 Can one of the admins verify this patch?
Re: [DISCUSS] Columnar storage engine for Apache Kylin
Hi Yanghong,

Thanks for your question. I don't think other engines are required to know how to read Kylin's storage, but it is a nice-to-have if possible. We can extend the file format if Parquet or ORC cannot meet Kylin's requirements; it is not necessary to invent a new format.

On Sat, Sep 29, 2018 at 10:59 AM, Zhong, Yanghong wrote:
> I have one question about the characteristics of Kylin columnar storage > files. That is whether it should be a standard or common one. Since the > data stored in the storage engine is Kylin specified, is it necessary for > other engines to know how to build data into and how to read data from the > storage engine? > > In my opinion, it's not necessary. And Kylin columnar storage files should > be Kylin specified. We can leverage the advantages of other columnar files, > like data skip indexes, bloom filters, dictionaries. Then create a new file > format with Kylin specified requirements, like cuboid info. > > -- > Best regards, > Yanghong Zhong > > > On 9/28/18, 2:15 PM, "ShaoFeng Shi" wrote: > > Hi Kylin developers. > > HBase has been Kylin’s storage engine since the first day; Kylin on > HBase > has been verified as a success which can support low latency & high > concurrency queries on a very large data scale. Thanks to HBase, most > Kylin > users can get on average less than 1-second query response. > > But we also see some limitations when putting Cubes into HBase; I > shared > some of them in the HBaseConf Asia 2018[1] this August. The typical > limitations include: > >- Rowkey is the primary index, no secondary index so far; > > Filtering by row key’s prefix and suffix can get very different > performance > result. So the user needs to do a good design about the row key; > otherwise, > the query would be slow. This is difficult sometimes because the user > might > not predict the filtering patterns ahead of cube design. 
> >- HBase is a key-value instead of a columnar storage > > Kylin combines multiple measures (columns) into fewer column families > for > smaller data size (row key size is remarkable). This causes HBase often > needing to read more data than requested. > >- HBase couldn't run on YARN > > This makes the deployment and auto-scaling a little complicated, > especially > in the cloud. > > In one word, HBase is complicated to be Kylin’s storage. The > maintenance, > debugging is also hard for normal developers. Now we’re planning to > seek a > simple, light-weighted, read-only storage engine for Kylin. The new > solution should have the following characteristics: > >- Columnar layout with compression for efficient I/O; >- Index by each column for quick filtering and seeking; >- MapReduce / Spark API for parallel processing; >- HDFS compliant for scalability and availability; >- Mature, stable and extensible; > > With the plugin architecture[2] introduced in Kylin 1.5, adding > multiple > storages to Kylin is possible. Some companies like Kyligence Inc and > Meituan.com, have developed their customized storage engine for Kylin > in > their product or platform. In their experience, columnar storage is a > good > supplement for the HBase engine. Kaisen Kang from Meituan.com has > shared > their KOD (Kylin on Druid) solution[3] in this August’s Kylin meetup in > Beijing. > > We plan to do a PoC with Apache Parquet + Apache Spark in the next > phase. > Parquet is a standard columnar file format and has been widely > supported by > many projects like Hive, Impala, Drill, etc. Parquet is adding the page > level column index to support fine-grained filtering. Apache Spark can > provide the parallel computing over Parquet and can be deployed on > YARN/Mesos and Kubernetes. With this combination, the data persistence > and > computation are separated, which makes the scaling in/out much easier > than > before. 
Benefiting from Spark's flexibility, we can not only push down > more > computation from Kylin to the Hadoop cluster. Except for Parquet, > Apache > ORC is also a candidate. > > Now I raise this discussion to get your ideas about Kylin’s > next-generation > storage engine. If you have good ideas or any related data, welcome > discuss in > the community. > > Thank you! > > [1] Apache Kylin on HBase > > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.slideshare.net%2FShiShaoFeng1%2Fapache-kylin-on-hbase-extreme-olap-engine-for-big-data&data=02%7C01%7Cyangzhong%40ebay.com%7C71e694ab5386420bb32908d62509c003%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C636737121143223312&sdata=TuIOe6FxdubqsoRVX8BQb%2FkvSFRrfI0ZvBRDB0euZWk%3D&reserved=0 > [2] Apache Kylin Plugin Architecture > > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fkylin.apache.org%2Fdevelopment%2Fplugin_arch.html&data=02%7C01%7Cy
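The row key limitation quoted in this thread (filtering by a row key prefix performs very differently from filtering by a suffix) can be illustrated with a sorted map standing in for an ordered key-value store. This is only a sketch under stated assumptions: the `TreeMap`, the `dimA|dimB` key layout, and the class and method names are hypothetical and are not Kylin's or HBase's actual row key encoding or API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Sketch only: a TreeMap stands in for a store that keeps row keys sorted.
// The "dimA|dimB" key layout is hypothetical, not Kylin's real encoding.
class RowKeyScanSketch {

    // Prefix filter: seek to the first candidate key, then stop at the first
    // key that no longer matches, because matches are contiguous in key order.
    static List<String> scanByPrefix(TreeMap<String, Long> store, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String key : store.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // left the contiguous range of matching keys
            }
            hits.add(key);
        }
        return hits;
    }

    // Suffix filter: key order does not help, so every key is examined.
    static List<String> scanBySuffix(TreeMap<String, Long> store, String suffix) {
        List<String> hits = new ArrayList<>();
        for (String key : store.keySet()) {
            if (key.endsWith(suffix)) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<String, Long> store = new TreeMap<>();
        store.put("CN|2018-09-28", 10L);
        store.put("CN|2018-09-29", 12L);
        store.put("US|2018-09-28", 7L);
        store.put("US|2018-09-29", 9L);
        System.out.println(scanByPrefix(store, "CN|"));
        System.out.println(scanBySuffix(store, "|2018-09-29"));
    }
}
```

The prefix scan touches only the matching range plus one extra key, while the suffix scan walks the whole map, which is why the choice of leading dimension in the row key matters so much when filter patterns are not known ahead of cube design.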
[GitHub] codecov-io commented on issue #273: clearKylin 3597
codecov-io commented on issue #273: clearKylin 3597 URL: https://github.com/apache/kylin/pull/273#issuecomment-425618669 # [Codecov](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=h1) Report > Merging [#273](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=desc) into [master](https://codecov.io/gh/apache/kylin/commit/10587a65fe0552179a5c8a6e1151686ce1c8a135?src=pr&el=desc) will **decrease** coverage by `<.01%`. > The diff coverage is `0%`. [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/273/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master #273 +/- ## - Coverage 21.16% 21.16% -0.01% - Complexity 4405 4406 +1 Files 1086 1086 Lines 6974569746 +1 Branches 1008810088 - Hits 1476114759 -2 - Misses5358653588 +2 - Partials 1398 1399 +1 ``` | [Impacted Files](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...che/kylin/engine/spark/SparkMergingDictionary.java](https://codecov.io/gh/apache/kylin/pull/273/diff?src=pr&el=tree#diff-ZW5naW5lLXNwYXJrL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9lbmdpbmUvc3BhcmsvU3BhcmtNZXJnaW5nRGljdGlvbmFyeS5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: | | [...g/apache/kylin/engine/spark/SparkFactDistinct.java](https://codecov.io/gh/apache/kylin/pull/273/diff?src=pr&el=tree#diff-ZW5naW5lLXNwYXJrL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9lbmdpbmUvc3BhcmsvU3BhcmtGYWN0RGlzdGluY3QuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: | | [...g/apache/kylin/source/datagen/ColumnGenerator.java](https://codecov.io/gh/apache/kylin/pull/273/diff?src=pr&el=tree#diff-Y29yZS1tZXRhZGF0YS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc291cmNlL2RhdGFnZW4vQ29sdW1uR2VuZXJhdG9yLmphdmE=) | `70.94% <0%> (-1.36%)` | `8% <0%> (ø)` | | | 
[...rg/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://codecov.io/gh/apache/kylin/pull/273/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2lubWVtY3ViaW5nL01lbURpc2tTdG9yZS5qYXZh) | `69.6% <0%> (-0.61%)` | `7% <0%> (ø)` | | | [...org/apache/kylin/rest/util/QueryRequestLimits.java](https://codecov.io/gh/apache/kylin/pull/273/diff?src=pr&el=tree#diff-c2VydmVyLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL3Jlc3QvdXRpbC9RdWVyeVJlcXVlc3RMaW1pdHMuamF2YQ==) | `40.47% <0%> (+4.76%)` | `6% <0%> (+1%)` | :arrow_up: | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=footer). Last update [10587a6...6ffc862](https://codecov.io/gh/apache/kylin/pull/273?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] coveralls commented on issue #273: clearKylin 3597
coveralls commented on issue #273: clearKylin 3597 URL: https://github.com/apache/kylin/pull/273#issuecomment-425618578 ## Pull Request Test Coverage Report for [Build 3712](https://coveralls.io/builds/19260056) * **0** of **57** **(0.0%)** changed or added relevant lines in **2** files are covered. * **4** unchanged lines in **4** files lost coverage. * Overall coverage decreased (**-0.002%**) to **23.17%** --- | Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | | :-|--||---: | | [engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkMergingDictionary.java](https://coveralls.io/builds/19260056/source?filename=engine-spark%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fengine%2Fspark%2FSparkMergingDictionary.java#L127) | 0 | 19 | 0.0% | [engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkFactDistinct.java](https://coveralls.io/builds/19260056/source?filename=engine-spark%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fengine%2Fspark%2FSparkFactDistinct.java#L167) | 0 | 38 | 0.0% | Files with Coverage Reduction | New Missed Lines | % | | :-|--|--: | | [core-metadata/src/main/java/org/apache/kylin/source/datagen/ColumnGenerator.java](https://coveralls.io/builds/19260056/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fsource%2Fdatagen%2FColumnGenerator.java#L319) | 1 | 81.08% | | [engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkFactDistinct.java](https://coveralls.io/builds/19260056/source?filename=engine-spark%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fengine%2Fspark%2FSparkFactDistinct.java#L236) | 1 | 0.0% | | [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/19260056/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L553) | 1 | 78.12% | | 
[engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkMergingDictionary.java](https://coveralls.io/builds/19260056/source?filename=engine-spark%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fengine%2Fspark%2FSparkMergingDictionary.java#L159) | 1 | 0.0% | | Totals | [![Coverage Status](https://coveralls.io/builds/19260056/badge)](https://coveralls.io/builds/19260056) | | :-- | --: | | Change from base [Build 3711](https://coveralls.io/builds/19259592): | -0.002% | | Covered Lines: | 16160 | | Relevant Lines: | 69746 | --- # 💛 - [Coveralls](https://coveralls.io) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] caolijun1166 closed pull request #273: clearKylin 3597
caolijun1166 closed pull request #273: clearKylin 3597 URL: https://github.com/apache/kylin/pull/273 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkFactDistinct.java b/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkFactDistinct.java index 043f479e7d..5cfd2d7ccb 100644 --- a/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkFactDistinct.java +++ b/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkFactDistinct.java @@ -164,73 +164,75 @@ protected void execute(OptionsHelper optionsHelper) throws Exception { conf.set("spark.kryo.registrationRequired", "true").registerKryoClasses(kryoClassArray); KylinSparkJobListener jobListener = new KylinSparkJobListener(); -JavaSparkContext sc = new JavaSparkContext(conf); -sc.sc().addSparkListener(jobListener); -HadoopUtil.deletePath(sc.hadoopConfiguration(), new Path(outputPath)); +try (JavaSparkContext sc = new JavaSparkContext(conf)) { +sc.sc().addSparkListener(jobListener); +HadoopUtil.deletePath(sc.hadoopConfiguration(), new Path(outputPath)); -final SerializableConfiguration sConf = new SerializableConfiguration(sc.hadoopConfiguration()); -KylinConfig envConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(sConf, metaUrl); +final SerializableConfiguration sConf = new SerializableConfiguration(sc.hadoopConfiguration()); +KylinConfig envConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(sConf, metaUrl); -final CubeInstance cubeInstance = CubeManager.getInstance(envConfig).getCube(cubeName); +final CubeInstance cubeInstance = CubeManager.getInstance(envConfig).getCube(cubeName); -final Job job = Job.getInstance(sConf.get()); +final Job job = Job.getInstance(sConf.get()); -final 
FactDistinctColumnsReducerMapping reducerMapping = new FactDistinctColumnsReducerMapping(cubeInstance); +final FactDistinctColumnsReducerMapping reducerMapping = new FactDistinctColumnsReducerMapping( +cubeInstance); -logger.info("RDD Output path: {}", outputPath); -logger.info("getTotalReducerNum: {}", reducerMapping.getTotalReducerNum()); -logger.info("getCuboidRowCounterReducerNum: {}", reducerMapping.getCuboidRowCounterReducerNum()); -logger.info("counter path {}", counterPath); +logger.info("RDD Output path: {}", outputPath); +logger.info("getTotalReducerNum: {}", reducerMapping.getTotalReducerNum()); +logger.info("getCuboidRowCounterReducerNum: {}", reducerMapping.getCuboidRowCounterReducerNum()); +logger.info("counter path {}", counterPath); -boolean isSequenceFile = JoinedFlatTable.SEQUENCEFILE.equalsIgnoreCase(envConfig.getFlatTableStorageFormat()); +boolean isSequenceFile = JoinedFlatTable.SEQUENCEFILE +.equalsIgnoreCase(envConfig.getFlatTableStorageFormat()); -// calculate source record bytes size -final LongAccumulator bytesWritten = sc.sc().longAccumulator(); +// calculate source record bytes size +final LongAccumulator bytesWritten = sc.sc().longAccumulator(); -final JavaRDD recordRDD = SparkUtil.hiveRecordInputRDD(isSequenceFile, sc, inputPath, hiveTable); +final JavaRDD recordRDD = SparkUtil.hiveRecordInputRDD(isSequenceFile, sc, inputPath, hiveTable); -JavaPairRDD flatOutputRDD = recordRDD.mapPartitionsToPair( -new FlatOutputFucntion(cubeName, segmentId, metaUrl, sConf, samplingPercent, bytesWritten)); +JavaPairRDD flatOutputRDD = recordRDD.mapPartitionsToPair( +new FlatOutputFucntion(cubeName, segmentId, metaUrl, sConf, samplingPercent, bytesWritten)); -JavaPairRDD> aggredRDD = flatOutputRDD -.groupByKey(new FactDistinctPartitioner(cubeName, metaUrl, sConf, reducerMapping.getTotalReducerNum())); +JavaPairRDD> aggredRDD = flatOutputRDD.groupByKey( +new FactDistinctPartitioner(cubeName, metaUrl, sConf, reducerMapping.getTotalReducerNum())); 
-JavaPairRDD> outputRDD = aggredRDD -.mapPartitionsToPair(new MultiOutputFunction(cubeName, metaUrl, sConf, samplingPercent)); +JavaPairRDD> outputRDD = aggredRDD +.mapPartitionsToPair(new MultiOutputFunction(cubeName, metaUrl, sConf, samplingPercent)); -// make each reducer output to respective dir -MultipleOutputs.addNamedOutput(job, BatchConstants.CFG_OUTPUT_COLUMN, SequenceFileOutputFormat.class,
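The diff above moves the `JavaSparkContext` into a try-with-resources statement so that the context is closed even when the job throws. A minimal, self-contained sketch of that pattern follows; `FakeContext` is a hypothetical stand-in for any closeable resource and is not part of Spark or Kylin.

```java
// Sketch only: FakeContext is a hypothetical stand-in for a closeable
// resource such as the Spark context in the diff above.
class TryWithResourcesSketch {

    static class FakeContext implements AutoCloseable {
        static int openCount = 0; // number of contexts currently open

        FakeContext() {
            openCount++;
        }

        void run(boolean fail) {
            if (fail) {
                throw new IllegalStateException("job failed");
            }
        }

        @Override
        public void close() {
            openCount--;
        }
    }

    // close() runs on both the normal path and the exceptional path.
    static void runJob(boolean fail) {
        try (FakeContext ctx = new FakeContext()) {
            ctx.run(fail);
        }
    }

    public static void main(String[] args) {
        runJob(false);
        try {
            runJob(true);
        } catch (IllegalStateException expected) {
            // the exception still propagates after close() has run
        }
        System.out.println("open contexts: " + FakeContext.openCount);
    }
}
```

Without the try-with-resources form, the failing call would leave the context open unless the caller added an explicit `finally` block.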
[GitHub] asfgit commented on issue #273: clearKylin 3597
asfgit commented on issue #273: clearKylin 3597 URL: https://github.com/apache/kylin/pull/273#issuecomment-425617489 Can one of the admins verify this patch?
[GitHub] asfgit commented on issue #273: clearKylin 3597
asfgit commented on issue #273: clearKylin 3597 URL: https://github.com/apache/kylin/pull/273#issuecomment-425617490 Can one of the admins verify this patch?
[GitHub] caolijun1166 opened a new pull request #273: clearKylin 3597
caolijun1166 opened a new pull request #273: clearKylin 3597 URL: https://github.com/apache/kylin/pull/273
Re:Re: [DISCUSS] Columnar storage engine for Apache Kylin
I like Parquet; it is a very efficient format and is supported by various projects. But there are some questions if we use Parquet as the cube storage format:

1. Is it possible to locate a cuboid quickly in a Parquet file? How do we save cuboid metadata in Parquet's FileMetaData: just in the metadata's key/value pairs?
2. I notice that there is a schema field in Parquet's FileMetaData, but in a cube, different cuboids have different schemas. Do we just save the basic cuboid schema in the schema field? Will this waste storage?
3. Can Parquet be extended easily to add indexes, like a bitmap index or a B-tree index for each column?
4. Do we need to build an RPC server? If we just use YARN to schedule Spark tasks to run queries, starting and stopping the JVM may take seconds, and then most queries will be slower than with HBase. Of course, it is more scalable, and some queries may be faster.

Besides using Parquet/ORC, I think there are two other options:

1. Use a customized columnar format. It is more flexible: we can add Kylin-specific concepts to the storage, like cuboids, and it will be easy to add different types of index as needed. The disadvantage is that it needs more effort to define and develop the format (we cannot leverage existing libraries to read/write, and we need to take care of compression); also, the cube data files cannot be used by other projects (do we have this need?).
2. Use local storage rather than HDFS, like Kudu/Druid/ClickHouse. The advantage of this solution is that query performance will be very good, and everything can be controlled by Kylin. The disadvantage is that it needs more development effort, especially for cluster management, failover, and scalability.

At 2018-09-29 10:53:35, "Zhong, Yanghong" wrote: >I have one question about the characteristics of Kylin columnar storage files. >That is whether it should be a standard or common one. 
Since the data stored >in the storage engine is Kylin specified, is it necessary for other engines to >know how to build data into and how to read data from the storage engine? > >In my opinion, it's not necessary. And Kylin columnar storage files should be >Kylin specified. We can leverage the advantages of other columnar files, like >data skip indexes, bloom filters, dictionaries. Then create a new file format >with Kylin specified requirements, like cuboid info. > >-- >Best regards, >Yanghong Zhong > > >On 9/28/18, 2:15 PM, "ShaoFeng Shi" wrote: > >Hi Kylin developers. > >HBase has been Kylin’s storage engine since the first day; Kylin on HBase >has been verified as a success which can support low latency & high >concurrency queries on a very large data scale. Thanks to HBase, most Kylin >users can get on average less than 1-second query response. > >But we also see some limitations when putting Cubes into HBase; I shared >some of them in the HBaseConf Asia 2018[1] this August. The typical >limitations include: > > - Rowkey is the primary index, no secondary index so far; > >Filtering by row key’s prefix and suffix can get very different performance >result. So the user needs to do a good design about the row key; otherwise, >the query would be slow. This is difficult sometimes because the user might >not predict the filtering patterns ahead of cube design. > > - HBase is a key-value instead of a columnar storage > >Kylin combines multiple measures (columns) into fewer column families for >smaller data size (row key size is remarkable). This causes HBase often >needing to read more data than requested. > > - HBase couldn't run on YARN > >This makes the deployment and auto-scaling a little complicated, especially >in the cloud. > >In one word, HBase is complicated to be Kylin’s storage. The maintenance, >debugging is also hard for normal developers. Now we’re planning to seek a >simple, light-weighted, read-only storage engine for Kylin. 
The new >solution should have the following characteristics: > > - Columnar layout with compression for efficient I/O; > - Index by each column for quick filtering and seeking; > - MapReduce / Spark API for parallel processing; > - HDFS compliant for scalability and availability; > - Mature, stable and extensible; > >With the plugin architecture[2] introduced in Kylin 1.5, adding multiple >storages to Kylin is possible. Some companies like Kyligence Inc and >Meituan.com, have developed their customized storage engine for Kylin in >their product or platform. In their experience, columnar storage is a good >supplement for the HBase engine. Kaisen Kang from Meituan.com has shared >their KOD (Kylin on Druid) solution[3] in this August’s Kylin meetup in >Beijing. > >We plan to do a PoC with Apache Parquet + Apache Spark in the next phase. >Parquet is a standard columnar file format and has been widely s
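Question 1 in the reply above asks whether cuboid locations could live in the file footer's key/value metadata. One way this might look is sketched below, using a plain `Map<String, String>` in place of a real Parquet footer. Everything here is an assumption for illustration: the `kylin.cuboid.` key namespace, the `firstRowGroup,rowGroupCount` value encoding, and the class and method names are hypothetical, and no Parquet library is used.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a string-to-string map stands in for footer key/value metadata.
// The key namespace and value encoding are hypothetical.
class CuboidFooterSketch {
    static final String KEY_PREFIX = "kylin.cuboid.";

    // Record which contiguous row groups hold the rows of one cuboid.
    static void putCuboid(Map<String, String> footer, long cuboidId,
                          int firstRowGroup, int rowGroupCount) {
        footer.put(KEY_PREFIX + cuboidId, firstRowGroup + "," + rowGroupCount);
    }

    // Returns {firstRowGroup, rowGroupCount}, or null if the cuboid is absent.
    static int[] locate(Map<String, String> footer, long cuboidId) {
        String value = footer.get(KEY_PREFIX + cuboidId);
        if (value == null) {
            return null;
        }
        String[] parts = value.split(",");
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    public static void main(String[] args) {
        Map<String, String> footer = new HashMap<>();
        putCuboid(footer, 255L, 0, 4); // e.g. a cuboid covering all dimensions
        putCuboid(footer, 15L, 4, 1);
        int[] location = locate(footer, 15L);
        System.out.println("cuboid 15 starts at row group " + location[0]
                + " and spans " + location[1] + " row group(s)");
    }
}
```

A lookup like this costs one footer read plus a map access, so a reader could skip directly to the relevant row groups; whether string-encoded entries scale to cubes with many cuboids is exactly the kind of question the thread leaves open.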
[GitHub] luguosheng1314 closed pull request #265: KYLIN-3599 Bulk Add Measures
luguosheng1314 closed pull request #265: KYLIN-3599 Bulk Add Measures URL: https://github.com/apache/kylin/pull/265 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/webapp/app/js/controllers/cubeMeasures.js b/webapp/app/js/controllers/cubeMeasures.js index 6fb82f2451..7beb528d6b 100644 --- a/webapp/app/js/controllers/cubeMeasures.js +++ b/webapp/app/js/controllers/cubeMeasures.js @@ -383,52 +383,116 @@ KylinApp.controller('CubeMeasuresCtrl', function ($scope, $modal,MetaModel,cubes } } if($scope.newMeasure.function.parameter.type=="column"&&$scope.newMeasure.function.expression!=="COUNT_DISTINCT"){ + $scope.newMeasure.function.returntype = $scope.getReturnType($scope.newMeasure.function.parameter.value, $scope.newMeasure.function.expression); +} + } + + // Open bulk add modal. 
+ $scope.openBulkAddModal = function () { + +$scope.initBulkAddMeasures(); + +var modalInstance = $modal.open({ +templateUrl: 'bulkAddMeasures.html', +controller: cubeBulkAddMeasureModalCtrl, +backdrop: 'static', +scope: $scope +}); + }; - var column = $scope.newMeasure.function.parameter.value; - if(column&&(typeof column=="string")){ -var colType = $scope.getColumnType(VdmUtil.removeNameSpace(column), VdmUtil.getNameSpaceAliasName(column)); // $scope.getColumnType defined in cubeEdit.js +$scope.initBulkAddMeasures = function() { +// init bulk add measure view model +$scope.bulkMeasuresView = { + SUM: [], + MAX: [], + MIN: [], + RAW: [], + PERCENTILE: [] +}; +angular.forEach($scope.getCommonMetricColumns(), function(paramValue) { + var measures = _.filter($scope.cubeMetaFrame.measures, function(measure){ return measure.function.parameter.value == paramValue}); + for (var expression in $scope.bulkMeasuresView) { +var bulkMeasure = { + name: expression + '_' + paramValue.split('.')[1], + parameter: paramValue, + returntype: $scope.getReturnType(paramValue, expression), + select: false, + force: false +}; + +if (measures.length) { + var measure = _.find(measures, function(measure){ return measure.function.expression == expression}); + if (!!measure) { +bulkMeasure.name = measure.name; +bulkMeasure.force = true; +bulkMeasure.select = true; + } +} +$scope.bulkMeasuresView[expression].push(bulkMeasure); } - if(colType==""||!colType){ -$scope.newMeasure.function.returntype = ""; -return; +}); + +// init expression selector +$scope.bulkMeasureOptions = { + expressionList: [] +}; + +for (var expression in $scope.bulkMeasuresView) { + var selectArr = _.filter($scope.bulkMeasuresView[expression], function(measure){ return measure.select && measure.force}); + var selectAll = $scope.getCommonMetricColumns().length == selectArr.length; + var expressionSelect = { +expression: expression, +selectAll: selectAll, +force: selectAll } + 
$scope.bulkMeasureOptions.expressionList.push(expressionSelect); +} +$scope.bulkMeasureOptions.currentExpression = $scope.bulkMeasureOptions.expressionList[0]; + }; - switch($scope.newMeasure.function.expression){ -case "SUM": - if(colType==="tinyint"||colType==="smallint"||colType==="int"||colType==="bigint"||colType==="integer"){ -$scope.newMeasure.function.returntype= 'bigint'; + $scope.getReturnType = function(parameter, expression) { +if(parameter && (typeof parameter=="string")){ + var colType = $scope.getColumnType(VdmUtil.removeNameSpace(parameter), VdmUtil.getNameSpaceAliasName(parameter)); // $scope.getColumnType defined in cubeEdit.js +} +if(colType == '' || !colType) { + return ''; +} + +switch(expression) { + case 'SUM': +if(colType === 'tinyint' || colType === 'smallint' || colType === 'int' || colType === 'bigint' || colType === 'integer') { + return 'bigint'; +} else { + if(colType.indexOf('decimal') != -1) { +var returnRegex = new RegExp('(\\w+)(?:\\((\\w+?)(?:\\,(\\w+?))?\\))?') +var returnValue = returnRegex.exec(colType) +var precision = 19 +var scale = returnValue[3] +return 'decimal(' + precision + ',' + scale + ')'; }else{ -if(colType.indexOf('decimal')!=-1){ - var returnRegex = new RegExp('(\\w+)(?:\\((\\w+?)(?:\\,(\\w+?))?\\))?') - var returnValue = returnRegex.exec(colType) - var precision = 19 - var
[GitHub] coveralls edited a comment on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
coveralls edited a comment on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
URL: https://github.com/apache/kylin/pull/271#issuecomment-425606821

## Pull Request Test Coverage Report for [Build 3709](https://coveralls.io/builds/19259293)

* **0** of **3** **(0.0%)** changed or added relevant lines in **1** file are covered.
* **7** unchanged lines in **4** files lost coverage.
* Overall coverage remained the same at **23.169%**

---

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| :-- | --: | --: | --: |
| [storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java](https://coveralls.io/builds/19259293/source?filename=storage-hbase%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fhbase%2Futil%2FUpdateHTableHostCLI.java#L62) | 0 | 3 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| :-- | --: | --: |
| [storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java](https://coveralls.io/builds/19259293/source?filename=storage-hbase%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fhbase%2Futil%2FUpdateHTableHostCLI.java#L66) | 1 | 0.0% |
| [server-base/src/main/java/org/apache/kylin/rest/util/QueryRequestLimits.java](https://coveralls.io/builds/19259293/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Futil%2FQueryRequestLimits.java#L72) | 1 | 47.62% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://coveralls.io/builds/19259293/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2FTreeCuboidScheduler.java#L124) | 2 | 68.46% |
| [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/19259293/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L449) | 3 | 78.42% |

| Totals | [![Coverage Status](https://coveralls.io/builds/19259293/badge)](https://coveralls.io/builds/19259293) |
| :-- | --: |
| Change from base [Build 3703](https://coveralls.io/builds/19245700): | 0.0% |
| Covered Lines: | 16159 |
| Relevant Lines: | 69745 |

---
# 💛 - [Coveralls](https://coveralls.io)

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
Re: [DISCUSS] Columnar storage engine for Apache Kylin
I have one question about the characteristics of Kylin's columnar storage files: should the format be a standard, commonly used one? Since the data stored in the storage engine is Kylin-specific, is it necessary for other engines to know how to write data into the storage engine and how to read data from it? In my opinion, it is not necessary, and Kylin's columnar storage files should be Kylin-specific. We can borrow the strengths of other columnar file formats, such as data-skipping indexes, bloom filters, and dictionaries, and then create a new file format that covers Kylin-specific requirements, such as cuboid info.

-- Best regards, Yanghong Zhong

On 9/28/18, 2:15 PM, "ShaoFeng Shi" wrote:

Hi Kylin developers.

HBase has been Kylin’s storage engine since the first day; Kylin on HBase has been verified as a success that can support low-latency, high-concurrency queries on a very large data scale. Thanks to HBase, most Kylin users get an average query response time of less than 1 second.

But we also see some limitations when putting Cubes into HBase; I shared some of them at HBaseConf Asia 2018[1] this August. The typical limitations include:

- Rowkey is the primary index, and there is no secondary index so far. Filtering by the row key’s prefix versus its suffix can give very different performance, so the user needs to design the row key well; otherwise, queries will be slow. This is sometimes difficult because the user might not be able to predict the filtering patterns ahead of cube design.
- HBase is a key-value store instead of a columnar store. Kylin combines multiple measures (columns) into fewer column families for a smaller data size (the row key size is significant). This often causes HBase to read more data than requested.
- HBase can't run on YARN. This makes deployment and auto-scaling a little complicated, especially in the cloud.

In short, HBase is complicated to run as Kylin’s storage. Maintenance and debugging are also hard for normal developers.
Now we’re planning to look for a simple, lightweight, read-only storage engine for Kylin. The new solution should have the following characteristics:

- Columnar layout with compression for efficient I/O;
- Index by each column for quick filtering and seeking;
- MapReduce / Spark API for parallel processing;
- HDFS compliant for scalability and availability;
- Mature, stable and extensible;

With the plugin architecture[2] introduced in Kylin 1.5, adding multiple storages to Kylin is possible. Some companies, like Kyligence Inc and Meituan.com, have developed their own customized storage engines for Kylin in their products or platforms. In their experience, columnar storage is a good supplement to the HBase engine. Kaisen Kang from Meituan.com shared their KOD (Kylin on Druid) solution[3] at this August’s Kylin meetup in Beijing.

We plan to do a PoC with Apache Parquet + Apache Spark in the next phase. Parquet is a standard columnar file format and is widely supported by many projects like Hive, Impala, Drill, etc. Parquet is adding a page-level column index to support fine-grained filtering. Apache Spark can provide parallel computing over Parquet and can be deployed on YARN/Mesos and Kubernetes. With this combination, data persistence and computation are separated, which makes scaling in/out much easier than before. Thanks to Spark's flexibility, we can also push down more computation from Kylin to the Hadoop cluster. Besides Parquet, Apache ORC is also a candidate.

I am raising this discussion to get your ideas about Kylin’s next-generation storage engine. If you have good ideas or any related data, you are welcome to discuss them in the community. Thank you!
[1] Apache Kylin on HBase https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.slideshare.net%2FShiShaoFeng1%2Fapache-kylin-on-hbase-extreme-olap-engine-for-big-data&data=02%7C01%7Cyangzhong%40ebay.com%7C71e694ab5386420bb32908d62509c003%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C636737121143223312&sdata=TuIOe6FxdubqsoRVX8BQb%2FkvSFRrfI0ZvBRDB0euZWk%3D&reserved=0 [2] Apache Kylin Plugin Architecture https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fkylin.apache.org%2Fdevelopment%2Fplugin_arch.html&data=02%7C01%7Cyangzhong%40ebay.com%7C71e694ab5386420bb32908d62509c003%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C636737121143223312&sdata=6WPLbX9Rat51rj3VCc1AuVDxTw5HO2ezPO0Cj8m231g%3D&reserved=0 [3] 基于Druid的Kylin存储引擎实践 https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fblog.bcmeng.com%2Fpost%2Fkylin-on-druid.html--&data=02%7C01%7Cyangzhong%40ebay.com%7C71e694ab5386420bb32908d62509c003%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C636737121143223
[GitHub] shaofengshi closed pull request #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
shaofengshi closed pull request #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
URL: https://github.com/apache/kylin/pull/271

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java b/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java
index bf5c4e84e0..ae8582cd5c 100644
--- a/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java
+++ b/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java
@@ -59,9 +59,10 @@
     public UpdateHTableHostCLI(List htables, String oldHostValue) throws IOException {
         this.htables = htables;
         this.oldHostValue = oldHostValue;
-        Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
-        hbaseAdmin = conn.getAdmin();
-        this.kylinConfig = KylinConfig.getInstanceFromEnv();
+        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());) {
+            hbaseAdmin = conn.getAdmin();
+            this.kylinConfig = KylinConfig.getInstanceFromEnv();
+        }
     }
 
     public static void main(String[] args) throws Exception {
[GitHub] codecov-io edited a comment on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
codecov-io edited a comment on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
URL: https://github.com/apache/kylin/pull/271#issuecomment-425606722

# [Codecov](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=h1) Report
> Merging [#271](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=desc) into [master](https://codecov.io/gh/apache/kylin/commit/65ab55e920a611c12a331f084599bbca6e3381bc?src=pr&el=desc) will **not change** coverage.
> The diff coverage is `0%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/271/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=tree)

```diff
@@            Coverage Diff            @@
##             master     #271   +/-   ##
=========================================
  Coverage     21.15%   21.15%
  Complexity     4406     4406
=========================================
  Files          1086     1086
  Lines         69745    69745
  Branches      10088    10088
=========================================
  Hits          14758    14758
  Misses        53588    53588
  Partials       1399     1399
```

| [Impacted Files](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [.../kylin/storage/hbase/util/UpdateHTableHostCLI.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-c3RvcmFnZS1oYmFzZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc3RvcmFnZS9oYmFzZS91dGlsL1VwZGF0ZUhUYWJsZUhvc3RDTEkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...org/apache/kylin/rest/util/QueryRequestLimits.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-c2VydmVyLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL3Jlc3QvdXRpbC9RdWVyeVJlcXVlc3RMaW1pdHMuamF2YQ==) | `35.71% <0%> (-4.77%)` | `5% <0%> (-1%)` | |
| [.../apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2N1Ym9pZC9UcmVlQ3Vib2lkU2NoZWR1bGVyLmphdmE=) | `63.84% <0%> (-2.31%)` | `0% <0%> (ø)` | |
| [...rg/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2lubWVtY3ViaW5nL01lbURpc2tTdG9yZS5qYXZh) | `70.21% <0%> (+0.91%)` | `7% <0%> (ø)` | :arrow_down: |
| [...he/kylin/dict/lookup/cache/RocksDBLookupTable.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1kaWN0aW9uYXJ5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9kaWN0L2xvb2t1cC9jYWNoZS9Sb2Nrc0RCTG9va3VwVGFibGUuamF2YQ==) | `78.37% <0%> (+5.4%)` | `6% <0%> (+1%)` | :arrow_up: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=footer). Last update [65ab55e...8cdd103](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] asfgit commented on issue #272: KYLIN-3232 Add document for ops tools
asfgit commented on issue #272: KYLIN-3232 Add document for ops tools URL: https://github.com/apache/kylin/pull/272#issuecomment-425608683 Can one of the admins verify this patch?
[GitHub] GinaZhai opened a new pull request #272: KYLIN-3232 Add document for ops tools
GinaZhai opened a new pull request #272: KYLIN-3232 Add document for ops tools URL: https://github.com/apache/kylin/pull/272
[GitHub] coveralls commented on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
coveralls commented on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
URL: https://github.com/apache/kylin/pull/271#issuecomment-425606821

## Pull Request Test Coverage Report for [Build 3706](https://coveralls.io/builds/19259026)

* **0** of **3** **(0.0%)** changed or added relevant lines in **1** file are covered.
* **6** unchanged lines in **3** files lost coverage.
* Overall coverage increased (+**0.001%**) to **23.17%**

---

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| :-- | --: | --: | --: |
| [storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java](https://coveralls.io/builds/19259026/source?filename=storage-hbase%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fhbase%2Futil%2FUpdateHTableHostCLI.java#L62) | 0 | 3 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| :-- | --: | --: |
| [storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java](https://coveralls.io/builds/19259026/source?filename=storage-hbase%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fhbase%2Futil%2FUpdateHTableHostCLI.java#L66) | 1 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://coveralls.io/builds/19259026/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2FTreeCuboidScheduler.java#L124) | 2 | 68.46% |
| [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/19259026/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L449) | 3 | 78.42% |

| Totals | [![Coverage Status](https://coveralls.io/builds/19259026/badge)](https://coveralls.io/builds/19259026) |
| :-- | --: |
| Change from base [Build 3703](https://coveralls.io/builds/19245700): | 0.001% |
| Covered Lines: | 16160 |
| Relevant Lines: | 69745 |

---
# 💛 - [Coveralls](https://coveralls.io)
[GitHub] codecov-io commented on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
codecov-io commented on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
URL: https://github.com/apache/kylin/pull/271#issuecomment-425606722

# [Codecov](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=h1) Report
> Merging [#271](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=desc) into [master](https://codecov.io/gh/apache/kylin/commit/65ab55e920a611c12a331f084599bbca6e3381bc?src=pr&el=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `0%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/271/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master     #271      +/-   ##
============================================
+ Coverage     21.15%   21.16%   +<.01%
+ Complexity     4406     4405       -1
============================================
  Files          1086     1086
  Lines         69745    69745
  Branches      10088    10088
============================================
+ Hits          14758    14759       +1
+ Misses        53588    53587       -1
  Partials       1399     1399
```

| [Impacted Files](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [.../kylin/storage/hbase/util/UpdateHTableHostCLI.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-c3RvcmFnZS1oYmFzZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc3RvcmFnZS9oYmFzZS91dGlsL1VwZGF0ZUhUYWJsZUhvc3RDTEkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [.../apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2N1Ym9pZC9UcmVlQ3Vib2lkU2NoZWR1bGVyLmphdmE=) | `63.84% <0%> (-2.31%)` | `0% <0%> (ø)` | |
| [...a/org/apache/kylin/dict/Number2BytesConverter.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1kaWN0aW9uYXJ5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9kaWN0L051bWJlcjJCeXRlc0NvbnZlcnRlci5qYXZh) | `81.74% <0%> (-0.8%)` | `17% <0%> (-1%)` | |
| [...rg/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2lubWVtY3ViaW5nL01lbURpc2tTdG9yZS5qYXZh) | `70.21% <0%> (+0.91%)` | `7% <0%> (ø)` | :arrow_down: |
| [...g/apache/kylin/source/datagen/ColumnGenerator.java](https://codecov.io/gh/apache/kylin/pull/271/diff?src=pr&el=tree#diff-Y29yZS1tZXRhZGF0YS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc291cmNlL2RhdGFnZW4vQ29sdW1uR2VuZXJhdG9yLmphdmE=) | `72.29% <0%> (+1.35%)` | `8% <0%> (ø)` | :arrow_down: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=footer). Last update [65ab55e...f7e994e](https://codecov.io/gh/apache/kylin/pull/271?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] yiming187 commented on issue #270: KYLIN-2924 enable google error-prone in compile phase
yiming187 commented on issue #270: KYLIN-2924 enable google error-prone in compile phase URL: https://github.com/apache/kylin/pull/270#issuecomment-425606016 @shaofengshi Yes, I saw it. I have fixed all "ERROR" level issues, but the "WARNING" issues generate too many log lines. I am trying to suppress the "WARNING" logs to reduce the output size.
[GitHub] shaofengshi commented on a change in pull request #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
shaofengshi commented on a change in pull request #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
URL: https://github.com/apache/kylin/pull/271#discussion_r221411226

## File path: storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/UpdateHTableHostCLI.java ##

@@ -59,9 +59,10 @@
     public UpdateHTableHostCLI(List htables, String oldHostValue) throws IOException {
         this.htables = htables;
         this.oldHostValue = oldHostValue;
-        Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
-        hbaseAdmin = conn.getAdmin();
-        this.kylinConfig = KylinConfig.getInstanceFromEnv();
+        try(Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());){

Review comment:
Please check the code format.
[GitHub] caolijun1166 opened a new pull request #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
caolijun1166 opened a new pull request #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI URL: https://github.com/apache/kylin/pull/271 Close the HBase connection after used in UpdateHTableHostCLI
[GitHub] asfgit commented on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI
asfgit commented on issue #271: KYLIN-3603 Close the HBase connection after used in UpdateHTableHostCLI URL: https://github.com/apache/kylin/pull/271#issuecomment-425605188 Can one of the admins verify this patch?
[jira] [Created] (KYLIN-3603) HBase connection isn't closed in UpdateHTableHostCLI
Lijun Cao created KYLIN-3603:

Summary: HBase connection isn't closed in UpdateHTableHostCLI
Key: KYLIN-3603
URL: https://issues.apache.org/jira/browse/KYLIN-3603
Project: Kylin
Issue Type: Bug
Reporter: Lijun Cao
Assignee: Lijun Cao
Fix For: v2.6.0

The HBase connection is only used in the constructor to get an *_Admin_*, so it should be closed after use.

{code:java}
public UpdateHTableHostCLI(List htables, String oldHostValue) throws IOException {
    this.htables = htables;
    this.oldHostValue = oldHostValue;
    Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
    hbaseAdmin = conn.getAdmin();
    this.kylinConfig = KylinConfig.getInstanceFromEnv();
}
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
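
For illustration, the pattern this issue asks for can be sketched with Java's try-with-resources. This is a minimal sketch and not the actual Kylin patch: `SketchConnection` and `ConnectionCloseSketch` are hypothetical stand-ins invented here so the example runs without the HBase client on the classpath, and the sketch makes no claim about how the real HBase `Connection` or `Admin` classes behave after `close()`.

```java
// Hypothetical stand-in for a closeable connection; NOT the HBase client API.
final class SketchConnection implements AutoCloseable {
    private boolean closed = false;

    // Pretends to hand out something derived from the connection.
    String describe() {
        if (closed) {
            throw new IllegalStateException("connection already closed");
        }
        return "open connection";
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class ConnectionCloseSketch {

    // Uses the connection inside try-with-resources and returns it so the
    // caller can inspect whether close() was invoked.
    static SketchConnection useAndClose(boolean failMidway) {
        SketchConnection observed = new SketchConnection();
        try (SketchConnection conn = observed) {
            conn.describe();
            if (failMidway) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException e) {
            // Swallowed only to keep the sketch short; close() has already run here.
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(useAndClose(false).isClosed()); // prints "true"
        System.out.println(useAndClose(true).isClosed());  // prints "true"
    }
}
```

The point of the construct is that `close()` runs whether the block exits normally or through an exception. When applying it to real code, it is worth checking whether any handle obtained inside the block and stored in a field still depends on the resource that has just been closed.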
[GitHub] shaofengshi commented on issue #270: KYLIN-2924 enable google error-prone in compile phase
shaofengshi commented on issue #270: KYLIN-2924 enable google error-prone in compile phase URL: https://github.com/apache/kylin/pull/270#issuecomment-425600739 The Travis CI build failed with "The job exceeded the maximum log length, and has been terminated."
[GitHub] coveralls commented on issue #270: KYLIN-2924 enable google error-prone in compile phase
coveralls commented on issue #270: KYLIN-2924 enable google error-prone in compile phase URL: https://github.com/apache/kylin/pull/270#issuecomment-425479277 ## Pull Request Test Coverage Report for [Build 3705](https://coveralls.io/builds/19249459) * **1** of **12** **(8.33%)** changed or added relevant lines in **8** files are covered. * **79** unchanged lines in **26** files lost coverage. * Overall coverage decreased (**-0.02%**) to **23.152%** --- | Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | | :-|--||---: | | [core-common/src/main/java/org/apache/kylin/common/util/SparkEntry.java](https://coveralls.io/builds/19249459/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Futil%2FSparkEntry.java#L35) | 0 | 1 | 0.0% | [core-cube/src/main/java/org/apache/kylin/cube/model/DimensionDesc.java](https://coveralls.io/builds/19249459/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fmodel%2FDimensionDesc.java#L173) | 0 | 1 | 0.0% | [core-job/src/main/java/org/apache/kylin/job/impl/threadpool/DefaultScheduler.java](https://coveralls.io/builds/19249459/source?filename=core-job%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fjob%2Fimpl%2Fthreadpool%2FDefaultScheduler.java#L54) | 0 | 1 | 0.0% | [server-base/src/main/java/org/apache/kylin/rest/controller/ProjectController.java](https://coveralls.io/builds/19249459/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Fcontroller%2FProjectController.java#L134) | 0 | 1 | 0.0% | [server-base/src/main/java/org/apache/kylin/rest/controller/ModelController.java](https://coveralls.io/builds/19249459/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Fcontroller%2FModelController.java#L120) | 0 | 1 | 0.0% | 
[core-metrics/src/main/java/org/apache/kylin/metrics/lib/impl/InstantReservoir.java](https://coveralls.io/builds/19249459/source?filename=core-metrics%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmetrics%2Flib%2Fimpl%2FInstantReservoir.java#L53) | 0 | 2 | 0.0% | [server-base/src/main/java/org/apache/kylin/rest/util/AclPermissionUtil.java](https://coveralls.io/builds/19249459/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Futil%2FAclPermissionUtil.java#L38) | 0 | 4 | 0.0% | Files with Coverage Reduction | New Missed Lines | % | | :-|--|--: | | [server-base/src/main/java/org/apache/kylin/rest/util/QueryRequestLimits.java](https://coveralls.io/builds/19249459/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Futil%2FQueryRequestLimits.java#L72) | 1 | 47.62% | | [server-base/src/main/java/org/apache/kylin/rest/util/AclPermissionUtil.java](https://coveralls.io/builds/19249459/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Futil%2FAclPermissionUtil.java#L43) | 1 | 0.0% | | [server-base/src/main/java/org/apache/kylin/rest/security/springacl/AclRecord.java](https://coveralls.io/builds/19249459/source?filename=server-base%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Frest%2Fsecurity%2Fspringacl%2FAclRecord.java#L180) | 1 | 0.0% | | [storage-hbase/src/main/java/org/apache/kylin/storage/hbase/cube/v2/coprocessor/endpoint/CubeVisitService.java](https://coveralls.io/builds/19249459/source?filename=storage-hbase%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fhbase%2Fcube%2Fv2%2Fcoprocessor%2Fendpoint%2FCubeVisitService.java#L367) | 1 | 0.0% | | [core-storage/src/main/java/org/apache/kylin/storage/gtrecord/GTCubeStorageQueryBase.java](https://coveralls.io/builds/19249459/source?filename=core-storage%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fgtrecord%2FGTCubeStorageQueryBase.java#L157) | 1 | 0.0% | | 
[core-metadata/src/main/java/org/apache/kylin/metadata/filter/CompareTupleFilter.java](https://coveralls.io/builds/19249459/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmetadata%2Ffilter%2FCompareTupleFilter.java#L308) | 1 | 34.0% | | [tool/src/main/java/org/apache/kylin/tool/MetadataCleanupJob.java](https://coveralls.io/builds/19249459/source?filename=tool%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Ftool%2FMetadataCleanupJob.java#L61) | 1 | 0.0% | | [core-job/src/main/java/org/apache/kylin/job/impl/threadpool/DefaultScheduler.java](https://coveralls.io/builds/19249459/source?filename=core-job%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fjob%2Fimpl%2Fthreadpool%2FDefaultScheduler.java#L184) | 1 | 73.33% | | [atopcalcite/src/main/java/org/apache/calcite/sql/type/SqlTypeUtil.java](https://coveralls.io/builds/19249459/source?filename=atopcalcite%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fcalcite%2Fsql%2Ftype%2FSqlTypeUtil.java#L1313) | 1 | 0.0% | | [atopcalcite/src/main/java/org/apache/calcite/runtime/SqlFunctions.java](https://coveralls.io/builds/192
[GitHub] asfgit commented on issue #270: KYLIN-2924 enable google error-prone in compile phase
asfgit commented on issue #270: KYLIN-2924 enable google error-prone in compile phase URL: https://github.com/apache/kylin/pull/270#issuecomment-425467594 Can one of the admins verify this patch?
[GitHub] asfgit commented on issue #270: KYLIN-2924 enable google error-prone in compile phase
asfgit commented on issue #270: KYLIN-2924 enable google error-prone in compile phase URL: https://github.com/apache/kylin/pull/270#issuecomment-425467595 Can one of the admins verify this patch?
[GitHub] yiming187 opened a new pull request #270: KYLIN-2924 enable google error-prone in compile phase
yiming187 opened a new pull request #270: KYLIN-2924 enable google error-prone in compile phase URL: https://github.com/apache/kylin/pull/270
Re: regarding creating and modifying cube
Hi, you can check the size of the Spark-generated cuboid files in Kylin's working dir (for example: /kylin/kylin_metadata/kylin-fd785bab-b875-4626-8bc3-7d46e8862d88/kylin_sales_cube/cuboid/level_base_cuboid/; please replace the uuid and cube name with yours). If the files are small (e.g. a few MB each), you should increase "kylin.engine.spark.rdd-partition-cut-mb" to make the partitions bigger (e.g. 64 MB). Usually this is needed when your cube has advanced measures such as count distinct, topn, or percentile, whose size estimation is rather imprecise. The situation improved in v2.5.0, as we enhanced the size estimation for those measures. With 2.5 I think you don't need to care much about it.

vishnuvardhanG wrote on Fri, Sep 28, 2018 at 6:41 PM:
> http://kylin.apache.org/docs20/tutorial/cube_spark.html
>
> The above link mentions the effect of
> "kylin.engine.spark.rdd-partition-cut-mb" on cube building performance.
>
> How do I decide the optimum value of
> "kylin.engine.spark.rdd-partition-cut-mb" for cube creation?
>
> --

Best regards, Shaofeng Shi 史少锋
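As a rough illustration of the sizing advice in this reply, the sketch below estimates how many partitions a cuboid level would be split into for a given cut size, and what the average file size becomes when the real data is smaller than the estimate. The formula (estimated size divided by the cut size, rounded up, at least 1) and all class and method names are assumptions made for illustration, not Kylin's actual code; the real engine may also apply its own minimum and maximum partition bounds.

```java
/** Hypothetical helper for illustration only; not part of Kylin. */
final class PartitionCutSketch {

    /**
     * Estimates the number of partitions for a cuboid level.
     * Assumption: partitions = ceil(estimatedSizeMb / cutMb), never less than 1.
     */
    static int estimatePartitions(double estimatedSizeMb, double cutMb) {
        if (cutMb <= 0) {
            throw new IllegalArgumentException("cutMb must be positive");
        }
        return Math.max(1, (int) Math.ceil(estimatedSizeMb / cutMb));
    }

    /** Average output file size when the data actually written differs from the estimate. */
    static double averageFileSizeMb(double actualSizeMb, int partitions) {
        return actualSizeMb / partitions;
    }

    public static void main(String[] args) {
        // Made-up numbers: the level is estimated at 6400 MB but only 640 MB is written,
        // e.g. because the size of count-distinct measures was over-estimated.
        int withSmallCut = estimatePartitions(6400, 10);
        int withLargeCut = estimatePartitions(6400, 64);
        System.out.println(withSmallCut + " partitions, average file "
                + averageFileSizeMb(640, withSmallCut) + " MB");
        System.out.println(withLargeCut + " partitions, average file "
                + averageFileSizeMb(640, withLargeCut) + " MB");
    }
}
```

Under these assumptions a larger cut value yields fewer, larger files, which is the direction the reply recommends when the observed files are only a few MB each.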
[jira] [Created] (KYLIN-3602) Enable more checkstyle rules
Yichen Zhou created KYLIN-3602: -- Summary: Enable more checkstyle rules Key: KYLIN-3602 URL: https://issues.apache.org/jira/browse/KYLIN-3602 Project: Kylin Issue Type: Improvement Components: Others Reporter: Yichen Zhou Fix For: v2.6.0 Attachments: checkstyle-aggregate.html Kylin's checkstyle rules are too weak. We need to reinforce them to achieve better code quality. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
regarding creating and modifying cube
http://kylin.apache.org/docs20/tutorial/cube_spark.html The above link mentions the effect of "kylin.engine.spark.rdd-partition-cut-mb" on cube building performance. How do I decide the optimum value of "kylin.engine.spark.rdd-partition-cut-mb" for cube creation?
[GitHub] shaofengshi closed pull request #262: KYLIN-3597 Improve code smell
shaofengshi closed pull request #262: KYLIN-3597 Improve code smell
URL: https://github.com/apache/kylin/pull/262

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/SegmentCubeTupleIterator.java b/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/SegmentCubeTupleIterator.java
index 6711664161..629c02563b 100644
--- a/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/SegmentCubeTupleIterator.java
+++ b/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/SegmentCubeTupleIterator.java
@@ -98,14 +98,23 @@ public GTInfo getInfo() {
 return scanRequest.getInfo();
 }
-public void close() throws IOException {}
+public void close() {
+// Underlying resource is hold by scanner and it will be closed at
+// SegmentCubeTupleIterator#close, caller is SequentialCubeTupleIterator
+}
 public Iterator iterator() {
 return records;
 }
 };
-GTStreamAggregateScanner aggregator = new GTStreamAggregateScanner(inputScanner, scanRequest);
-return aggregator.valuesIterator(gtDimsIdx, gtMetricsIdx);
+Iterator result;
+try (GTStreamAggregateScanner aggregator = new GTStreamAggregateScanner(inputScanner, scanRequest)) {
+result = aggregator.valuesIterator(gtDimsIdx, gtMetricsIdx);
+} catch (IOException ioe) {
+// implementation of close method of anonymous IGTScanner is empty, no way throw exception
+throw new IllegalStateException("IOException is not expected here.", ioe);
+}
+return result;
 }
 // simply decode records
@@ -149,10 +158,10 @@ public boolean hasNext() {
 if (!gtValues.hasNext()) {
 return false;
 }
-Object[] gtValues = this.gtValues.next();
+Object[] values = this.gtValues.next();
 // translate into tuple
-advMeasureFillers = cubeTupleConverter.translateResult(gtValues, tuple);
+advMeasureFillers = cubeTupleConverter.translateResult(values, tuple);
 // the simple case
 if (advMeasureFillers == null) {
diff --git a/source-kafka/src/main/java/org/apache/kylin/source/kafka/config/KafkaConsumerProperties.java b/source-kafka/src/main/java/org/apache/kylin/source/kafka/config/KafkaConsumerProperties.java
index cc32ed9592..a1b9ab253d 100644
--- a/source-kafka/src/main/java/org/apache/kylin/source/kafka/config/KafkaConsumerProperties.java
+++ b/source-kafka/src/main/java/org/apache/kylin/source/kafka/config/KafkaConsumerProperties.java
@@ -20,6 +20,7 @@
 import java.io.File;
 import java.io.FileInputStream;
+import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.util.Arrays;
 import java.util.HashSet;
@@ -28,7 +29,6 @@
 import java.util.Properties;
 import java.util.Set;
-import org.apache.commons.io.IOUtils;
 import org.apache.commons.lang.StringUtils;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.kafka.clients.consumer.ConsumerConfig;
@@ -56,8 +56,8 @@ public static KafkaConsumerProperties getInstanceFromEnv() {
 try {
 KafkaConsumerProperties config = new KafkaConsumerProperties();
 config.properties = config.loadKafkaConsumerProperties();
-
-logger.info("Initialized a new KafkaConsumerProperties from getInstanceFromEnv : " + System.identityHashCode(config));
+logger.info("Initialized a new KafkaConsumerProperties from getInstanceFromEnv : {}",
+System.identityHashCode(config));
 ENV_INSTANCE = config;
 } catch (IllegalArgumentException e) {
 throw new IllegalStateException("Failed to find KafkaConsumerProperties ", e);
@@ -79,7 +79,7 @@ public static Properties extractKafkaConfigToProperties(Configuration configurat
 Set configNames = new HashSet();
 try {
 configNames = ConsumerConfig.configNames();
-} catch (Error e) {
+} catch (Exception e) {
 // the Kafka configNames api is supported on 0.10.1.0+, in case NoSuchMethodException which is an Error, not Exception
 String[] configNamesArray = ("metric.reporters, metadata.max.age.ms, partition.assignment.strategy, reconnect.backoff.ms," + "sasl.kerberos.ticket.renew.window.factor, max.partition.fetch.bytes, bootstrap.servers, ssl.keystore.type," + " enable.auto.commit, sasl.mechanism, interceptor.classes, exclud
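The last hunk of the KYLIN-3597 diff above changes `catch (Error e)` to `catch (Exception e)`, while the retained comment says the fallback exists for a missing `configNames` method, which surfaces as an `Error`. The following standalone sketch (not Kylin code; the class and method names are hypothetical) shows which catch clause handles a `NoSuchMethodError`, a standard JDK subclass of `Error`, compared with an ordinary runtime exception. Whether the changed catch still reaches the fallback in Kylin depends on what the Kafka client actually throws, which this sketch does not verify.

```java
/** Hypothetical demo class for illustration only; not part of Kylin or Kafka. */
final class ErrorVsExceptionSketch {

    /** Runs the action and reports which catch clause, if any, handled what it threw. */
    static String handledBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (Exception e) {
            // Reached for checked and unchecked exceptions, but never for Errors.
            return "Exception";
        } catch (Error e) {
            // Reached for linkage problems such as NoSuchMethodError.
            return "Error";
        }
    }

    public static void main(String[] args) {
        // NoSuchMethodError is raised when a method that existed at compile time
        // is missing from the class found at run time.
        System.out.println(handledBy(() -> { throw new NoSuchMethodError("configNames"); }));
        System.out.println(handledBy(() -> { throw new IllegalStateException("boom"); }));
    }
}
```

A handler written only as `catch (Exception e)` would let the `NoSuchMethodError` in the first call propagate to the caller.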
[jira] [Created] (KYLIN-3601) Number of connections created by PreparedContextPool is inconsistent with the configuration
huaicui created KYLIN-3601: -- Summary: Number of connections created by PreparedContextPool is inconsistent with the configuration Key: KYLIN-3601 URL: https://issues.apache.org/jira/browse/KYLIN-3601 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: v2.5.0 Reporter: huaicui Attachments: FirstResponseDistribute.jpg, SixthResponseDistribute.jpg, image-2018-09-28-15-14-00-288.png Because concurrent query performance was insufficient, I tested with the PrepareStatement approach provided by magang. Performance improved somewhat, but as the number of test runs increased, throughput kept dropping and more and more requests timed out. I modified the code to add a log statement just before the final return in queryAndUpdateCache: logger.debug("BorrowedCount:"+preparedContextPool.getBorrowedCount() +",DestroyedCount:"+preparedContextPool.getDestroyedCount() +",CreatedCount:"+preparedContextPool.getCreatedCount() +",ReturnedCount:"+preparedContextPool.getReturnedCount() I also added this setting to the configuration file: kylin.query.statement-cache-max-num-per-key=200 The log shows that after the same SQL has been running concurrently for a while, PreparedContextPool creates more and more PrepareStatement instances and does not block subsequent requests. !image-2018-09-28-15-14-00-288.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
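The behaviour the reporter expected, a hard cap per key with later requests waiting instead of new objects being created, can be sketched with a plain `java.util.concurrent.Semaphore`. This is only a minimal illustration under stated assumptions: it is not Kylin's PreparedContextPool or its underlying pool library, and the class name, method names, and the `maxPerKey` parameter are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a bounded borrow/return gate; not Kylin's PreparedContextPool. */
final class BoundedBorrowSketch {
    private final Semaphore permits;
    private final AtomicInteger outstanding = new AtomicInteger();

    BoundedBorrowSketch(int maxPerKey) {
        // Fair semaphore: waiting borrowers are served in arrival order.
        this.permits = new Semaphore(maxPerKey, true);
    }

    /** Waits up to timeoutMs for a free slot; returns false if none became available. */
    boolean tryBorrow(long timeoutMs) {
        try {
            if (permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                outstanding.incrementAndGet();
                return true;
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Returns a previously borrowed slot so that a waiting borrower can proceed. */
    void giveBack() {
        outstanding.decrementAndGet();
        permits.release();
    }

    /** Number of slots currently borrowed and not yet returned. */
    int outstanding() {
        return outstanding.get();
    }

    public static void main(String[] args) {
        BoundedBorrowSketch gate = new BoundedBorrowSketch(2);
        System.out.println(gate.tryBorrow(10)); // first slot
        System.out.println(gate.tryBorrow(10)); // second slot
        System.out.println(gate.tryBorrow(10)); // cap reached: waits 10 ms, then gives up
    }
}
```

With a gate like this the number of outstanding objects can never exceed the configured cap, in contrast to the ever-growing created count described in the report.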
[GitHub] coveralls edited a comment on issue #262: KYLIN-3597 Improve code smell
coveralls edited a comment on issue #262: KYLIN-3597 Improve code smell
URL: https://github.com/apache/kylin/pull/262#issuecomment-425150222

## Pull Request Test Coverage Report for [Build 3701](https://coveralls.io/builds/19241540)

* **4** of **70** **(5.71%)** changed or added relevant lines in **4** files are covered.
* No unchanged relevant lines lost coverage.
* Overall coverage increased (+**0.005%**) to **23.174%**

---

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| :-- | -- | -- | --: |
| [source-kafka/src/main/java/org/apache/kylin/source/kafka/config/KafkaConsumerProperties.java](https://coveralls.io/builds/19241540/source?filename=source-kafka%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fsource%2Fkafka%2Fconfig%2FKafkaConsumerProperties.java#L82) | 4 | 12 | 33.33% |
| [core-storage/src/main/java/org/apache/kylin/storage/gtrecord/SegmentCubeTupleIterator.java](https://coveralls.io/builds/19241540/source?filename=core-storage%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fgtrecord%2FSegmentCubeTupleIterator.java#L104) | 0 | 9 | 0.0% |
| [storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/RowCounterCLI.java](https://coveralls.io/builds/19241540/source?filename=storage-hbase%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstorage%2Fhbase%2Futil%2FRowCounterCLI.java#L45) | 0 | 20 | 0.0% |
| [source-kafka/src/main/java/org/apache/kylin/source/kafka/util/KafkaSampleProducer.java](https://coveralls.io/builds/19241540/source?filename=source-kafka%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fsource%2Fkafka%2Futil%2FKafkaSampleProducer.java#L58) | 0 | 29 | 0.0% |

| Totals | [![Coverage Status](https://coveralls.io/builds/19241540/badge)](https://coveralls.io/builds/19241540) |
| :-- | --: |
| Change from base [Build 3700](https://coveralls.io/builds/19241264): | 0.005% |
| Covered Lines: | 16163 |
| Relevant Lines: | 69745 |

---

# 💛 - [Coveralls](https://coveralls.io)
[GitHub] codecov-io commented on issue #262: KYLIN-3597 Improve code smell
codecov-io commented on issue #262: KYLIN-3597 Improve code smell
URL: https://github.com/apache/kylin/pull/262#issuecomment-425344030

# [Codecov](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=h1) Report
> Merging [#262](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=desc) into [master](https://codecov.io/gh/apache/kylin/commit/8e7d2a90fa8488e6720130c511d139296939b32d?src=pr&el=desc) will **increase** coverage by `0.01%`.
> The diff coverage is `5.71%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/262/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=tree)

```diff
@@             Coverage Diff             @@
##             master    #262      +/-   ##
+ Coverage     21.15%   21.16%   +0.01%
- Complexity     4405     4407       +2
  Files          1086     1086
  Lines         69743    69745       +2
  Branches      10088    10088
+ Hits          14757    14765       +8
+ Misses        53586    53584       -2
+ Partials       1400     1396       -4
```

| [Impacted Files](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...apache/kylin/storage/hbase/util/RowCounterCLI.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-c3RvcmFnZS1oYmFzZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc3RvcmFnZS9oYmFzZS91dGlsL1Jvd0NvdW50ZXJDTEkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...e/kylin/source/kafka/util/KafkaSampleProducer.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-c291cmNlLWthZmthL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9zb3VyY2Uva2Fma2EvdXRpbC9LYWZrYVNhbXBsZVByb2R1Y2VyLmphdmE=) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...lin/storage/gtrecord/SegmentCubeTupleIterator.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-Y29yZS1zdG9yYWdlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9zdG9yYWdlL2d0cmVjb3JkL1NlZ21lbnRDdWJlVHVwbGVJdGVyYXRvci5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...n/source/kafka/config/KafkaConsumerProperties.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-c291cmNlLWthZmthL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9zb3VyY2Uva2Fma2EvY29uZmlnL0thZmthQ29uc3VtZXJQcm9wZXJ0aWVzLmphdmE=) | `62.85% <33.33%> (ø)` | `12 <0> (ø)` | :arrow_down: |
| [...lin/dict/lookup/cache/RocksDBLookupTableCache.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-Y29yZS1kaWN0aW9uYXJ5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9kaWN0L2xvb2t1cC9jYWNoZS9Sb2Nrc0RCTG9va3VwVGFibGVDYWNoZS5qYXZh) | `76.68% <0%> (+0.51%)` | `27% <0%> (ø)` | :arrow_down: |
| [.../apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2N1Ym9pZC9UcmVlQ3Vib2lkU2NoZWR1bGVyLmphdmE=) | `66.15% <0%> (+2.3%)` | `0% <0%> (ø)` | :arrow_down: |
| [...org/apache/kylin/rest/util/QueryRequestLimits.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-c2VydmVyLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL3Jlc3QvdXRpbC9RdWVyeVJlcXVlc3RMaW1pdHMuamF2YQ==) | `40.47% <0%> (+4.76%)` | `6% <0%> (+1%)` | :arrow_up: |
| [...he/kylin/dict/lookup/cache/RocksDBLookupTable.java](https://codecov.io/gh/apache/kylin/pull/262/diff?src=pr&el=tree#diff-Y29yZS1kaWN0aW9uYXJ5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9kaWN0L2xvb2t1cC9jYWNoZS9Sb2Nrc0RCTG9va3VwVGFibGUuamF2YQ==) | `78.37% <0%> (+5.4%)` | `6% <0%> (+1%)` | :arrow_up: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=footer). Last update [8e7d2a9...25bf7c8](https://codecov.io/gh/apache/kylin/pull/262?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).