[jira] [Comment Edited] (KYLIN-3487) Create a new measure for precise count distinct
[ https://issues.apache.org/jira/browse/KYLIN-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596105#comment-16596105 ]

Zhong Yanghong edited comment on KYLIN-3487 at 8/30/18 6:02 AM:

Hi [~kangkaisen], this feature is an extension of [KYLIN-2622]. The feature introduced by [KYLIN-2622] solves the unbounded growth of the global dictionary, but cube segments cannot be merged. This extension allows segment merge, which brings three advantages:
* reduce the number of HTables
* reduce storage cost, because the same row key across segments can be merged (limited when the partition column is mandatory)
* improve query efficiency, because fewer RPCs go to multiple segments for the same row key (limited when the partition column is mandatory)

was (Author: yaho):
Hi [~kangkaisen], this feature is an extension of [KYLIN-2622]. The feature introduced by [KYLIN-2622] solves the unbounded growth of the global dictionary, but cube segments cannot be merged. This extension allows segment merge, which brings two advantages:
* reduce storage cost, because the same row key across segments can be merged
* improve query efficiency, because fewer RPCs go to multiple segments for the same row key

> Create a new measure for precise count distinct
> ---
>
> Key: KYLIN-3487
> URL: https://issues.apache.org/jira/browse/KYLIN-3487
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Priority: Major
> Fix For: Backlog
>
> To compute a precise count distinct, we can use a bitmap and a global
> dictionary. However, the global dictionary has a limitation: it maps values
> to integer ids, which means the number of ids must be less than 2B. It is
> also like a Pixiu: it only grows and never shrinks.
> At eBay, there is a requirement to calculate the precise count distinct of
> sessions. The session cardinality is large and grows over time. The global
> dictionary is no longer feasible once its cardinality exceeds the 2B upper
> bound. How can we deal with this?
> The good news is that a session never crosses days. Given this property, we
> don't need to merge bitmaps across days. To calculate the precise session
> cardinality, we can assign each day a bitmap and directly sum the
> cardinalities computed from each bitmap. No bitmap merge is needed.
> To use a bitmap for cardinality calculation, we need to map each raw value
> to an integer id, which is achieved by encoding the value with a dictionary.
> Previously, the global dictionary was used so that bitmaps from multiple
> segments could be merged. In this case, however, no bitmap merge is needed,
> so the global dictionary is not needed either.
> We also don't need to filter by or group by session, so there is no need to
> map between values and ids after the related bitmap is constructed.
> Therefore, we don't need to store dictionaries for session; the bitmap alone
> is enough.
> To deal with segment merge, since the bitmaps of different segments cannot
> be merged into one bitmap, we use a map to store multiple bitmaps. In the
> map, the key is the segment name and the value is the segment-level bitmap.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
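The map-of-bitmaps idea described in KYLIN-3487 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the measure implemented in Kylin: the class name `SegmentBitmapCounter` and its methods are hypothetical, `java.util.BitSet` stands in for whatever bitmap type Kylin uses, and the ids are assumed to be assigned per segment so that no value is shared between segments.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: one bitmap per segment, kept side by side after a segment merge. */
final class SegmentBitmapCounter {
    // Key: segment name; value: bitmap of ids that are only meaningful inside that segment.
    private final Map<String, BitSet> bitmapsBySegment = new LinkedHashMap<>();

    /** Records a segment-local integer id for the given segment. */
    void add(String segmentName, int localId) {
        bitmapsBySegment.computeIfAbsent(segmentName, k -> new BitSet()).set(localId);
    }

    /** Segment merge: bitmaps of different segments stay under their own keys. */
    void mergeFrom(SegmentBitmapCounter other) {
        for (Map.Entry<String, BitSet> e : other.bitmapsBySegment.entrySet()) {
            // Only bitmaps with the same segment name share an id space and may be OR-ed.
            bitmapsBySegment.merge(e.getKey(), (BitSet) e.getValue().clone(),
                    (mine, theirs) -> { mine.or(theirs); return mine; });
        }
    }

    /** Sum of per-segment cardinalities; exact only if no value spans two segments. */
    long countDistinct() {
        long total = 0;
        for (BitSet bitmap : bitmapsBySegment.values()) {
            total += bitmap.cardinality();
        }
        return total;
    }
}
```

Because the per-segment id spaces are independent, the same id in two segments counts twice, which is the intended behavior when a session never crosses days.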
[jira] [Updated] (KYLIN-3487) Create a new measure for precise count distinct
[ https://issues.apache.org/jira/browse/KYLIN-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaofeng SHI updated KYLIN-3487:
    Fix Version/s: (was: v2.5.0)
                   Backlog
[jira] [Commented] (KYLIN-3518) Coprocessor reports NPE when execute a query on HBase 2.0
[ https://issues.apache.org/jira/browse/KYLIN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597036#comment-16597036 ]

ASF GitHub Bot commented on KYLIN-3518:
---
codecov-io commented on issue #215: KYLIN-3518 Coprocessor reports NPE when execute a query on HBase 2.0
URL: https://github.com/apache/kylin/pull/215#issuecomment-417182476

# [Codecov](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=h1) Report
> :exclamation: No coverage uploaded for pull request base (`master-hadoop3.1-2.5.0@d707a81`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
> The diff coverage is `0%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/215/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=tree)

```diff
@@                  Coverage Diff                  @@
##             master-hadoop3.1-2.5.0    #215   +/-   ##
======================================================
  Coverage                        ?   21.07%
  Complexity                      ?     4290
======================================================
  Files                           ?     1071
  Lines                           ?    67874
  Branches                        ?     9834
======================================================
  Hits                            ?    14304
  Misses                          ?    52207
  Partials                        ?     1363
```

| [Impacted Files](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...cube/v2/coprocessor/endpoint/CubeVisitService.java](https://codecov.io/gh/apache/kylin/pull/215/diff?src=pr&el=tree#diff-c3RvcmFnZS1oYmFzZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc3RvcmFnZS9oYmFzZS9jdWJlL3YyL2NvcHJvY2Vzc29yL2VuZHBvaW50L0N1YmVWaXNpdFNlcnZpY2UuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |

--
[Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=footer). Last update [d707a81...8b78b9b](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (KYLIN-3482) Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer
[ https://issues.apache.org/jira/browse/KYLIN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597024#comment-16597024 ]

ASF subversion and git services commented on KYLIN-3482:

Commit c8972772af60d0a6736acb063ff6c4b775790b4a in kylin's branch refs/heads/master from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c897277 ]

KYLIN-3482 Unclosed SetAndUnsetThreadLocalConfig in Spark engine

> Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer
> ---
>
> Key: KYLIN-3482
> URL: https://issues.apache.org/jira/browse/KYLIN-3482
> Project: Kylin
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: jiatao.tao
> Priority: Minor
> Fix For: v2.5.0
>
> Here is related code:
> {code}
> KylinConfig kylinConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(sConf, metaUrl);
> KylinConfig.setAndUnsetThreadLocalConfig(kylinConfig);
> {code}
> The return value from setAndUnsetThreadLocalConfig should be closed.
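The pattern that KYLIN-3482 asks for, closing the handle returned when a thread-local config is set, is usually written with try-with-resources. The sketch below is self-contained and uses hypothetical stand-in names (`ThreadLocalConfigDemo`, `SetAndUnset`, `runJob`) rather than Kylin's own classes; it only assumes that the returned handle implements `AutoCloseable` and clears the thread-local value on `close()`.

```java
/** Hypothetical stand-in for a thread-local config holder; not Kylin's actual class. */
final class ThreadLocalConfigDemo {
    private static final ThreadLocal<String> CONFIG = new ThreadLocal<>();

    /** Handle returned by setAndUnset(); closing it clears the thread-local value. */
    static final class SetAndUnset implements AutoCloseable {
        @Override
        public void close() {
            CONFIG.remove();
        }
    }

    /** Sets the config for the current thread and returns a handle that unsets it. */
    static SetAndUnset setAndUnset(String config) {
        CONFIG.set(config);
        return new SetAndUnset();
    }

    static String current() {
        return CONFIG.get();
    }

    /** try-with-resources guarantees the unset runs even if the job body throws. */
    static String runJob(String config) {
        try (SetAndUnset handle = setAndUnset(config)) {
            return "ran with " + current();
        }
    }
}
```

If the handle is dropped instead of closed, the config stays attached to the thread, which matters when threads are pooled and reused by later jobs.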
[jira] [Commented] (KYLIN-3488) Support MySQL as Kylin metadata storage
[ https://issues.apache.org/jira/browse/KYLIN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597022#comment-16597022 ]

ASF GitHub Bot commented on KYLIN-3488:
---
coveralls commented on issue #216: KYLIN-3488 Support MySQL as Kylin metadata storage
URL: https://github.com/apache/kylin/pull/216#issuecomment-417178965

## Pull Request Test Coverage Report for [Build 3494](https://coveralls.io/builds/18743844)

* **1** of **622** **(0.16%)** changed or added relevant lines in **12** files are covered.
* No unchanged relevant lines lost coverage.
* Overall coverage decreased (**-0.2%**) to **22.754%**

---

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| :-----|--------------|--------|---: |
| [core-common/src/main/java/org/apache/kylin/common/persistence/ResourceStore.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FResourceStore.java#L236) | 0 | 1 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/util/HadoopUtil.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Futil%2FHadoopUtil.java#L88) | 0 | 4 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/KylinConfig.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2FKylinConfig.java#L538) | 0 | 7 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/persistence/BrokenEntity.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FBrokenEntity.java#L25) | 0 | 13 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/persistence/BrokenInputStream.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FBrokenInputStream.java#L32) | 0 | 15 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCSqlQueryFormatProvider.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCSqlQueryFormatProvider.java#L28) | 0 | 15 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCResource.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCResource.java#L31) | 0 | 16 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCSqlQueryFormat.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCSqlQueryFormat.java#L26) | 0 | 22 | 0.0%
| [core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2FKylinConfigBase.java#L265) | 1 | 41 | 2.44%
| [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCConnectionManager.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCConnectionManager.java#L41) | 0 | 60 | 0.0%

| Totals | [![Coverage Status](https://coveralls.io/builds/18743844/badge)](https://coveralls.io/builds/18743844) |
| :-- | --: |
| Change from base [Build 3490](https://coveralls.io/builds/18723455): | -0.2% |
| Covered Lines: | 15850 |
| Relevant Lines: | 69659 |

---
# 💛 - [Coveralls](https://coveralls.io)

> Support MySQL as Kylin metadata storage
> ---
>
> Key: KYLIN-3488
> URL: https://issues.apache.org/jira/browse/KYLIN-3488
> Project: Kylin
> Issue Type: New Feature
> Components: Metadata
> Reporter: Shaofeng SHI
> Priority: Major
>
> Kylin uses HBase as the metastore, but in some cases users want the
> metadata stored outside HBase.
> Sonny Heer from mailing list mentioned:
> "I'm fairly certain anyone using Kylin with AWS EMR will benefit from this.
> Having multiple hbase clusters across AZs is a huge benefit. BTW only thing
> blocking at the moment is write operations happening from kylin query nodes."
[jira] [Commented] (KYLIN-3488) Support MySQL as Kylin metadata storage
[ https://issues.apache.org/jira/browse/KYLIN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597020#comment-16597020 ]

ASF GitHub Bot commented on KYLIN-3488:
---
codecov-io commented on issue #216: KYLIN-3488 Support MySQL as Kylin metadata storage
URL: https://github.com/apache/kylin/pull/216#issuecomment-417178593

# [Codecov](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=h1) Report
> :exclamation: No coverage uploaded for pull request base (`master@2889e36`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
> The diff coverage is `0.16%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/216/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=tree)

```diff
@@            Coverage Diff            @@
##             master    #216   +/-   ##
=========================================
  Coverage          ?   20.76%
  Complexity        ?     4337
=========================================
  Files             ?     1087
  Lines             ?    69659
  Branches          ?    10076
=========================================
  Hits              ?    14466
  Misses            ?    53811
  Partials          ?     1382
```

| [Impacted Files](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...he/kylin/common/persistence/BrokenInputStream.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9Ccm9rZW5JbnB1dFN0cmVhbS5qYXZh) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [...apache/kylin/common/persistence/ResourceStore.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9SZXNvdXJjZVN0b3JlLmphdmE=) | `62.18% <0%> (ø)` | `29 <0> (?)` | |
| [.../java/org/apache/kylin/common/util/HadoopUtil.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi91dGlsL0hhZG9vcFV0aWwuamF2YQ==) | `17.44% <0%> (ø)` | `10 <0> (?)` | |
| [...e/kylin/common/persistence/JDBCSqlQueryFormat.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDU3FsUXVlcnlGb3JtYXQuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [.../apache/kylin/common/persistence/BrokenEntity.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9Ccm9rZW5FbnRpdHkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [...he/kylin/common/persistence/JDBCResourceStore.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDUmVzb3VyY2VTdG9yZS5qYXZh) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [...ache/kylin/common/persistence/JDBCResourceDAO.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDUmVzb3VyY2VEQU8uamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [...common/persistence/JDBCSqlQueryFormatProvider.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDU3FsUXVlcnlGb3JtYXRQcm92aWRlci5qYXZh) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [.../apache/kylin/common/persistence/JDBCResource.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDUmVzb3VyY2UuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
| [...main/java/org/apache/kylin/common/KylinConfig.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9LeWxpbkNvbmZpZy5qYXZh) | `33.73% <0%> (ø)` | `22 <0> (?)` | |
| ... and [2 more](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree-more) | |

--
[Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=footer). Last update [2889e36...69ef6ac](https://codecov.io/gh/apache/kyli
[jira] [Commented] (KYLIN-3488) Support MySQL as Kylin metadata storage
[ https://issues.apache.org/jira/browse/KYLIN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597012#comment-16597012 ]

ASF GitHub Bot commented on KYLIN-3488:
---
GinaZhai opened a new pull request #216: KYLIN-3488 Support MySQL as Kylin metadata storage
URL: https://github.com/apache/kylin/pull/216
[jira] [Updated] (KYLIN-3447) Upgrade zookeeper to 3.4.13
[ https://issues.apache.org/jira/browse/KYLIN-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated KYLIN-3447:
--
Description:
zookeeper 3.4.13 is being released with the following fixes:
ZOOKEEPER-2959 fixes data loss when observer is used
ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / cloud) environment

was:
zookeeper 3.4.13 is being released.
ZOOKEEPER-2959 fixes data loss when observer is used
ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / cloud) environment

> Upgrade zookeeper to 3.4.13
> ---
>
> Key: KYLIN-3447
> URL: https://issues.apache.org/jira/browse/KYLIN-3447
> Project: Kylin
> Issue Type: Improvement
> Reporter: Ted Yu
> Priority: Major
>
> zookeeper 3.4.13 is being released with the following fixes:
> ZOOKEEPER-2959 fixes data loss when observer is used
> ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP
> (container / cloud) environment
[jira] [Updated] (KYLIN-3484) Update Hadoop version to 2.7.7
[ https://issues.apache.org/jira/browse/KYLIN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated KYLIN-3484:
--
Description: We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick up bug and security fixes.
(was: We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick up bug and security fixes .)

> Update Hadoop version to 2.7.7
> --
>
> Key: KYLIN-3484
> URL: https://issues.apache.org/jira/browse/KYLIN-3484
> Project: Kylin
> Issue Type: Task
> Reporter: Ted Yu
> Priority: Minor
>
> We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick up bug and
> security fixes.
[jira] [Commented] (KYLIN-3518) Coprocessor reports NPE when execute a query on HBase 2.0
[ https://issues.apache.org/jira/browse/KYLIN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596977#comment-16596977 ]

ASF GitHub Bot commented on KYLIN-3518:
---
caolijun1166 opened a new pull request #215: KYLIN-3518 Coprocessor reports NPE when execute a query on HBase 2.0
URL: https://github.com/apache/kylin/pull/215

> Coprocessor reports NPE when execute a query on HBase 2.0
> -
>
> Key: KYLIN-3518
> URL: https://issues.apache.org/jira/browse/KYLIN-3518
> Project: Kylin
> Issue Type: Bug
> Components: Storage - HBase
> Reporter: Shaofeng SHI
> Priority: Major
>
> On HDP 3.0, build a cube and then run a simple count query, NPE occurred:
>
> {code:java}
> 2018-08-28 01:30:16,969 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] v2.CubeHBaseRPC:315 : hbase.rpc.timeout = 9 ms, use 81000 ms as timeout for coprocessor
> 2018-08-28 01:30:16,983 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] v2.CubeHBaseEndpointRPC:141 : Serialized scanRequestBytes 522 bytes, rawScanBytesString 44 bytes
> 2018-08-28 01:30:16,984 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] v2.CubeHBaseEndpointRPC:143 : The scan 67b41fc6 for segment kylin_sales_cube_clone[2012010100_2013010100] is as below with 1 separate raw scans, shard part of start/end key is set to 0
> 2018-08-28 01:30:16,991 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] v2.CubeHBaseRPC:288 : Visiting hbase table KYLIN_5Q088VO5I0: cuboid require post aggregation, from 0 to 16384 Start: \x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00 (\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00) Stop: \x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\xFF\xFF\xFF\x00 (\x00\x00\x00\x00\x00\x00\x00\x00@\x00\xFF\xFF\xFF\x00), No Fuzzy Key
> 2018-08-28 01:30:16,991 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] v2.CubeHBaseEndpointRPC:148 : Submitting rpc to 1 shards starting from shard 0, scan range count 1
> 2018-08-28 01:30:17,010 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] common.KylinConfig:332 : Loading kylin-defaults.properties from file:/root/shaofengshi/apache-kylin-2.5.0-SNAPSHOT-bin/tomcat/webapps/kylin/WEB-INF/lib/kylin-core-common-2.5.0-SNAPSHOT.jar!/kylin-defaults.properties
> 2018-08-28 01:30:17,033 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] common.KylinConfig:291 : KYLIN_CONF property was not set, will seek KYLIN_HOME env variable
> 2018-08-28 01:30:17,051 INFO [pool-14-thread-1] hbase.HBaseConnection:110 : Creating coprocessor thread pool with max of 2048, core of 2048
> 2018-08-28 01:30:17,094 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] gtrecord.SequentialCubeTupleIterator:73 : Using SortedIteratorMergerWithLimit to merge segment results
> 2018-08-28 01:30:17,097 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] enumerator.OLAPEnumerator:117 : return TupleIterator...
> 2018-08-28 01:30:21,607 INFO [kylin-coproc--pool9-t1] client.RpcRetryingCallerImpl:134 : Call exception, tries=6, retries=6, started=4410 ms ago, cancelled=false, msg=java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:468)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:253)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:)
> at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8032)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2426)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2408)
> at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42010)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> ... 3 more
> , details=row '' on table 'KYLIN_5Q088VO5I0' at region=KYLIN_5Q088VO5I0,,1535417272444.27b82cb4702db4557a98b9a7e60b7692., hostname=ignite03.com,16020,1534313612401, seqNum=2
> 2018-08-28 01:30:25,633 INFO [kylin-coproc--pool9-t1] cli
[jira] [Updated] (KYLIN-3515) Cubing jobs may interfere with each other if use same hive view
[ https://issues.apache.org/jira/browse/KYLIN-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Casandra julie mitchell updated KYLIN-3515:
---
Attachment: Getting started

> Cubing jobs may interfere with each other if use same hive view
> ---
>
> Key: KYLIN-3515
> URL: https://issues.apache.org/jira/browse/KYLIN-3515
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.4.0
> Reporter: nichunen
> Assignee: nichunen
> Priority: Major
> Fix For: Future
>
> Attachments: Getting started
>
> The root cause is that, for a Hive view, Kylin materializes the view during
> cubing by creating an intermediate table (dropping any existing
> intermediate table first). The intermediate table's name is like
> kylin_intermediate_\{view_name}, so jobs that reference the same view
> create tables with the same name. One job's intermediate table may
> therefore be dropped by another job, which causes errors such as
> "table not found".
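The name collision described in KYLIN-3515 can be illustrated with a small sketch. Only the `kylin_intermediate_{view_name}` pattern comes from the issue; the class `IntermediateTableNames`, its methods, and the idea of appending a job id are hypothetical and shown purely as one possible way to keep concurrent jobs apart, not as the fix adopted in Kylin.

```java
/** Hypothetical helper illustrating the naming collision and one way to avoid it. */
final class IntermediateTableNames {
    /** Name as described in the issue: every job on the same view gets the same table. */
    static String sharedName(String viewName) {
        return "kylin_intermediate_" + viewName;
    }

    /** Assumed alternative: append the job id so concurrent jobs never share a table. */
    static String perJobName(String viewName, String jobId) {
        // Hyphens are replaced because they are not valid in unquoted Hive identifiers.
        return sharedName(viewName) + "_" + jobId.replace('-', '_');
    }
}
```

With the shared name, a second job's initial drop removes the table the first job is still reading; with a per-job name, each job only ever drops its own table.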
[jira] [Created] (KYLIN-3520) Deal with NULL values of measures for inmem cubing
Zhong Yanghong created KYLIN-3520:
-
Summary: Deal with NULL values of measures for inmem cubing
Key: KYLIN-3520
URL: https://issues.apache.org/jira/browse/KYLIN-3520
Project: Kylin
Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong
Fix For: v2.5.0
[jira] [Commented] (KYLIN-3487) Create a new measure for precise count distinct
[ https://issues.apache.org/jira/browse/KYLIN-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596105#comment-16596105 ]

Zhong Yanghong commented on KYLIN-3487:
---
Hi [~kangkaisen], the feature is kind of extension of [KYLIN-2622]. By the feature introduced by [KYLIN-2622], we can solve the infinite growth issue of global dictionary. However, segments of cubes cannot be merged. By introducing this extension, segment merge is allowed. And it will bring two advantages caused by segment merge:
* reduce storage cost, the same row key across segments can be merged
* improve query efficiency, reduce rpcs to multiple segments for the same row key
[jira] [Commented] (KYLIN-3451) Cloned cube doesn't have Mandatory Cuboids copied
[ https://issues.apache.org/jira/browse/KYLIN-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596022#comment-16596022 ]

ASF subversion and git services commented on KYLIN-3451:

Commit 9b762f5365fccdce01ddb8f18ea1a5bb209be261 in kylin's branch refs/heads/2.4.x from xingpeng1
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=9b762f5 ]

KYLIN-3451 the cloned cube don't have Mandatory Cuboids

> Cloned cube doesn't have Mandatory Cuboids copied
> -
>
> Key: KYLIN-3451
> URL: https://issues.apache.org/jira/browse/KYLIN-3451
> Project: Kylin
> Issue Type: Bug
> Components: Web
> Affects Versions: v2.3.0
> Reporter: Peng Xing
> Assignee: Peng Xing
> Priority: Minor
> Fix For: v2.4.1, v2.5.0
>
> Attachments: 0001-KYLIN-3451-the-cloned-cube-don-t-have-Mandatory-Cubo.patch