[jira] [Comment Edited] (KYLIN-3487) Create a new measure for precise count distinct

2018-08-29 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596105#comment-16596105
 ] 

Zhong Yanghong edited comment on KYLIN-3487 at 8/30/18 6:02 AM:


Hi [~kangkaisen], this feature is an extension of [KYLIN-2622]. The 
feature introduced by [KYLIN-2622] solves the unbounded growth of the 
global dictionary, but segments of such cubes cannot be merged. This 
extension allows segment merge, which brings three advantages:
* fewer HTables
* lower storage cost, because the same row key across segments can be merged 
(limited when the partition column is mandatory)
* better query efficiency, with fewer RPCs to multiple segments for the same 
row key (limited when the partition column is mandatory)


was (Author: yaho):
Hi [~kangkaisen], this feature is an extension of [KYLIN-2622]. The 
feature introduced by [KYLIN-2622] solves the unbounded growth of the 
global dictionary, but segments of such cubes cannot be merged. This 
extension allows segment merge, which brings two advantages:
* lower storage cost, because the same row key across segments can be merged
* better query efficiency, with fewer RPCs to multiple segments for the same 
row key

> Create a new measure for precise count distinct
> ---
>
> Key: KYLIN-3487
> URL: https://issues.apache.org/jira/browse/KYLIN-3487
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Priority: Major
> Fix For: Backlog
>
>
> To compute a precise count distinct, we can use a bitmap together with a 
> global dictionary. However, the global dictionary has a limitation: it maps 
> values to ids of integer type, so the number of ids must be less than 2B. 
> It is also like a Pixiu: it only grows and never shrinks. 
> At eBay, there is a requirement to calculate the precise count distinct of 
> sessions. The session cardinality is large and keeps growing over time, so 
> the global dictionary becomes infeasible once its cardinality exceeds the 
> 2B upper bound. How can we deal with this?
> The good news is that a session never crosses days. Given this property, we 
> don't need to merge bitmaps across days. To calculate the precise session 
> cardinality, we can assign each day its own bitmap and sum the 
> cardinalities of the per-day bitmaps. No bitmap merge is needed. 
> To use a bitmap for cardinality calculation, we need to map each raw value 
> to an integer id, which is done by encoding the value with a dictionary. 
> Previously, a global dictionary was used so that bitmaps from multiple 
> segments could be merged. In this case, however, no bitmap merge is needed, 
> so the global dictionary is not needed either. 
> We also don't need to filter by or group by session, so there is no need to 
> map between values and ids once the bitmap has been constructed. 
> Therefore, we don't need to store dictionaries for session; the bitmap 
> alone is enough.
> To deal with segment merge, since the bitmaps of different segments cannot 
> be merged into one bitmap, we use a map to store multiple bitmaps. In the 
> map, the key is the segment name and the value is the segment-level bitmap.
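
The per-segment map described in the issue can be sketched as follows. This is only an illustration under stated assumptions, not Kylin's implementation: the class `SegmentBitmapCounter` and its methods `add`, `merge`, and `count` are hypothetical names, and `java.util.BitSet` stands in for whatever bitmap type the measure would actually use. Ids are assumed to be segment-local (no global dictionary), so bitmaps that belong to different segments are never OR-ed together; the distinct count is the sum of the per-segment cardinalities, which is exact only because a session never spans two segments.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one bitmap per segment, keyed by segment name. */
class SegmentBitmapCounter {
    private final Map<String, BitSet> bitmaps = new HashMap<>();

    /** Records a segment-local integer id for the given segment. */
    void add(String segmentName, int localId) {
        bitmaps.computeIfAbsent(segmentName, k -> new BitSet()).set(localId);
    }

    /**
     * Merges another counter into this one by combining the maps.
     * Bitmaps are OR-ed only when they carry the same segment name,
     * because only then do their ids come from the same id space.
     */
    void merge(SegmentBitmapCounter other) {
        for (Map.Entry<String, BitSet> e : other.bitmaps.entrySet()) {
            bitmaps.computeIfAbsent(e.getKey(), k -> new BitSet()).or(e.getValue());
        }
    }

    /** Sums the cardinalities of all segment-level bitmaps. */
    long count() {
        long total = 0;
        for (BitSet bitmap : bitmaps.values()) {
            total += bitmap.cardinality();
        }
        return total;
    }
}
```

With two daily segments holding two and three distinct sessions, merging the counters keeps two map entries and `count()` returns their sum, even though both segments reuse ids starting from 0.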



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-3487) Create a new measure for precise count distinct

2018-08-29 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3487:

Fix Version/s: (was: v2.5.0)
   Backlog



[jira] [Commented] (KYLIN-3518) Coprocessor reports NPE when execute a query on HBase 2.0

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597036#comment-16597036
 ] 

ASF GitHub Bot commented on KYLIN-3518:
---

codecov-io commented on issue #215: KYLIN-3518 Coprocessor reports NPE when 
execute a query on HBase 2.0
URL: https://github.com/apache/kylin/pull/215#issuecomment-417182476
 
 
   # [Codecov](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`master-hadoop3.1-2.5.0@d707a81`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `0%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/215/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=tree)
   
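   ```diff
   @@                    Coverage Diff                    @@
   ##             master-hadoop3.1-2.5.0     #215   +/-   ##
   =========================================================
     Coverage                          ?   21.07%
     Complexity                        ?     4290
   =========================================================
     Files                             ?     1071
     Lines                             ?    67874
     Branches                          ?     9834
   =========================================================
     Hits                              ?    14304
     Misses                            ?    52207
     Partials                          ?     1363
   ```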
   
   
   | [Impacted Files](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...cube/v2/coprocessor/endpoint/CubeVisitService.java](https://codecov.io/gh/apache/kylin/pull/215/diff?src=pr&el=tree#diff-c3RvcmFnZS1oYmFzZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vc3RvcmFnZS9oYmFzZS9jdWJlL3YyL2NvcHJvY2Vzc29yL2VuZHBvaW50L0N1YmVWaXNpdFNlcnZpY2UuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=footer). Last update [d707a81...8b78b9b](https://codecov.io/gh/apache/kylin/pull/215?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KYLIN-3482) Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer

2018-08-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597024#comment-16597024
 ] 

ASF subversion and git services commented on KYLIN-3482:


Commit c8972772af60d0a6736acb063ff6c4b775790b4a in kylin's branch 
refs/heads/master from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c897277 ]

KYLIN-3482 Unclosed SetAndUnsetThreadLocalConfig in Spark engine


> Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer
> ---
>
> Key: KYLIN-3482
> URL: https://issues.apache.org/jira/browse/KYLIN-3482
> Project: Kylin
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: jiatao.tao
> Priority: Minor
> Fix For: v2.5.0
>
>
> Here is related code:
> {code}
> KylinConfig kylinConfig = 
> AbstractHadoopJob.loadKylinConfigFromHdfs(sConf, metaUrl);
> 
> KylinConfig.setAndUnsetThreadLocalConfig(kylinConfig);
> {code}
> The return value from setAndUnsetThreadLocalConfig should be closed.
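
Closing the returned handle is most easily done with try-with-resources. The sketch below shows the pattern in a self-contained form. `DemoConfig`, `Scope`, `setAndUnset`, and `runWithConfig` are hypothetical stand-ins written for this example, not Kylin classes; the only assumption carried over from the issue is that the method returns a closeable handle whose `close()` unsets the thread-local value.

```java
/** Hypothetical stand-in for a config class with a thread-local override. */
class DemoConfig {
    private static final ThreadLocal<DemoConfig> THREAD_LOCAL = new ThreadLocal<>();

    final String name;

    DemoConfig(String name) {
        this.name = name;
    }

    /** Returns the config bound to the current thread, or null if none is set. */
    static DemoConfig current() {
        return THREAD_LOCAL.get();
    }

    /** Sets the thread-local config and returns a handle whose close() unsets it. */
    static Scope setAndUnset(DemoConfig config) {
        THREAD_LOCAL.set(config);
        return new Scope();
    }

    /** Handle that removes the thread-local value when closed. */
    static final class Scope implements AutoCloseable {
        @Override
        public void close() {
            THREAD_LOCAL.remove();
        }
    }

    /** Runs work under the given config; the thread-local is cleared on exit, even on exceptions. */
    static String runWithConfig(DemoConfig config) {
        try (Scope ignored = setAndUnset(config)) {
            return current().name;
        }
    }
}
```

After `runWithConfig` returns, `DemoConfig.current()` is `null` again, so a pooled thread that is reused for another job does not see a stale configuration.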





[jira] [Commented] (KYLIN-3488) Support MySQL as Kylin metadata storage

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597022#comment-16597022
 ] 

ASF GitHub Bot commented on KYLIN-3488:
---

coveralls commented on issue #216: KYLIN-3488 Support MySQL as Kylin metadata 
storage
URL: https://github.com/apache/kylin/pull/216#issuecomment-417178965
 
 
   ## Pull Request Test Coverage Report for [Build 3494](https://coveralls.io/builds/18743844)
   
   * **1** of **622**   **(0.16%)**  changed or added relevant lines in **12** files are covered.
   * No unchanged relevant lines lost coverage.
   * Overall coverage decreased (**-0.2%**) to **22.754%**
   
   ---
   
   |  Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
   | :-----|--------------|--------|---: |
   | [core-common/src/main/java/org/apache/kylin/common/persistence/ResourceStore.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FResourceStore.java#L236) | 0 | 1 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/util/HadoopUtil.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Futil%2FHadoopUtil.java#L88) | 0 | 4 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/KylinConfig.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2FKylinConfig.java#L538) | 0 | 7 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/persistence/BrokenEntity.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FBrokenEntity.java#L25) | 0 | 13 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/persistence/BrokenInputStream.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FBrokenInputStream.java#L32) | 0 | 15 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCSqlQueryFormatProvider.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCSqlQueryFormatProvider.java#L28) | 0 | 15 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCResource.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCResource.java#L31) | 0 | 16 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCSqlQueryFormat.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCSqlQueryFormat.java#L26) | 0 | 22 | 0.0%
   | [core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2FKylinConfigBase.java#L265) | 1 | 41 | 2.44%
   | [core-common/src/main/java/org/apache/kylin/common/persistence/JDBCConnectionManager.java](https://coveralls.io/builds/18743844/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fpersistence%2FJDBCConnectionManager.java#L41) | 0 | 60 | 0.0%
   
   
   
   |  Totals | [![Coverage Status](https://coveralls.io/builds/18743844/badge)](https://coveralls.io/builds/18743844) |
   | :-- | --: |
   | Change from base [Build 3490](https://coveralls.io/builds/18723455): |  -0.2% |
   | Covered Lines: | 15850 |
   | Relevant Lines: | 69659 |
   
   ---
   # 💛  - [Coveralls](https://coveralls.io)
   




> Support MySQL as Kylin metadata storage
> ---
>
> Key: KYLIN-3488
> URL: https://issues.apache.org/jira/browse/KYLIN-3488
> Project: Kylin
> Issue Type: New Feature
> Components: Metadata
> Reporter: Shaofeng SHI
> Priority: Major
>
> Kylin uses HBase as the metastore, but in some cases users prefer the 
> metadata not to be stored in HBase.
> Sonny Heer from mailing list mentioned:
> "I'm fairly certain anyone using Kylin with AWS EMR will benefit from this.   
> Having multiple hbase clusters across AZs is a huge benefit.  BTW only thing 
> blocking at the moment is write operations happening from kylin query nodes."





[jira] [Commented] (KYLIN-3488) Support MySQL as Kylin metadata storage

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597020#comment-16597020
 ] 

ASF GitHub Bot commented on KYLIN-3488:
---

codecov-io commented on issue #216: KYLIN-3488 Support MySQL as Kylin metadata 
storage
URL: https://github.com/apache/kylin/pull/216#issuecomment-417178593
 
 
   # [Codecov](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`master@2889e36`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `0.16%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/216/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=tree)
   
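   ```diff
   @@            Coverage Diff            @@
   ##             master     #216   +/-   ##
   =========================================
     Coverage          ?   20.76%
     Complexity        ?     4337
   =========================================
     Files             ?     1087
     Lines             ?    69659
     Branches          ?    10076
   =========================================
     Hits              ?    14466
     Misses            ?    53811
     Partials          ?     1382
   ```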
   
   
   | [Impacted Files](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...he/kylin/common/persistence/BrokenInputStream.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9Ccm9rZW5JbnB1dFN0cmVhbS5qYXZh) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...apache/kylin/common/persistence/ResourceStore.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9SZXNvdXJjZVN0b3JlLmphdmE=) | `62.18% <0%> (ø)` | `29 <0> (?)` | |
   | [.../java/org/apache/kylin/common/util/HadoopUtil.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi91dGlsL0hhZG9vcFV0aWwuamF2YQ==) | `17.44% <0%> (ø)` | `10 <0> (?)` | |
   | [...e/kylin/common/persistence/JDBCSqlQueryFormat.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDU3FsUXVlcnlGb3JtYXQuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [.../apache/kylin/common/persistence/BrokenEntity.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9Ccm9rZW5FbnRpdHkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...he/kylin/common/persistence/JDBCResourceStore.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDUmVzb3VyY2VTdG9yZS5qYXZh) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...ache/kylin/common/persistence/JDBCResourceDAO.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDUmVzb3VyY2VEQU8uamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...common/persistence/JDBCSqlQueryFormatProvider.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDU3FsUXVlcnlGb3JtYXRQcm92aWRlci5qYXZh) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [.../apache/kylin/common/persistence/JDBCResource.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9wZXJzaXN0ZW5jZS9KREJDUmVzb3VyY2UuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...main/java/org/apache/kylin/common/KylinConfig.java](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9LeWxpbkNvbmZpZy5qYXZh) | `33.73% <0%> (ø)` | `22 <0> (?)` | |
   | ... and [2 more](https://codecov.io/gh/apache/kylin/pull/216/diff?src=pr&el=tree-more) | |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/216?src=pr&el=footer). Last update [2889e36...69ef6ac](https://codecov.io/gh/apache/kyli

[jira] [Commented] (KYLIN-3488) Support MySQL as Kylin metadata storage

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597012#comment-16597012
 ] 

ASF GitHub Bot commented on KYLIN-3488:
---

GinaZhai opened a new pull request #216: KYLIN-3488 Support MySQL as Kylin 
metadata storage
URL: https://github.com/apache/kylin/pull/216
 
 
   




[jira] [Updated] (KYLIN-3447) Upgrade zookeeper to 3.4.13

2018-08-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3447:
--
Description: 
zookeeper 3.4.13 is being released with the following fixes:

ZOOKEEPER-2959 fixes data loss when observer is used

ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / 
cloud)
environment

  was:
zookeeper 3.4.13 is being released.

ZOOKEEPER-2959 fixes data loss when observer is used

ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / 
cloud)
environment


> Upgrade zookeeper to 3.4.13
> ---
>
> Key: KYLIN-3447
> URL: https://issues.apache.org/jira/browse/KYLIN-3447
> Project: Kylin
> Issue Type: Improvement
> Reporter: Ted Yu
> Priority: Major
>
> zookeeper 3.4.13 is being released with the following fixes:
> ZOOKEEPER-2959 fixes data loss when observer is used
> ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container 
> / cloud)
> environment





[jira] [Updated] (KYLIN-3484) Update Hadoop version to 2.7.7

2018-08-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3484:
--
Description: We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick 
up bug and security fixes.  (was: We should upgrade the Hadoop 2.7 dependency 
to 2.7.7, to pick up bug and security fixes .)

> Update Hadoop version to 2.7.7
> --
>
> Key: KYLIN-3484
> URL: https://issues.apache.org/jira/browse/KYLIN-3484
> Project: Kylin
> Issue Type: Task
> Reporter: Ted Yu
> Priority: Minor
>
> We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick up bug and 
> security fixes.





[jira] [Commented] (KYLIN-3518) Coprocessor reports NPE when execute a query on HBase 2.0

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596977#comment-16596977
 ] 

ASF GitHub Bot commented on KYLIN-3518:
---

caolijun1166 opened a new pull request #215: KYLIN-3518 Coprocessor reports NPE 
when execute a query on HBase 2.0
URL: https://github.com/apache/kylin/pull/215
 
 
   




> Coprocessor reports NPE when execute a query on HBase 2.0
> -
>
> Key: KYLIN-3518
> URL: https://issues.apache.org/jira/browse/KYLIN-3518
> Project: Kylin
> Issue Type: Bug
> Components: Storage - HBase
> Reporter: Shaofeng SHI
> Priority: Major
>
> On HDP 3.0, after building a cube and running a simple count query, an NPE occurred:
>  
> {code:java}
> 2018-08-28 01:30:16,969 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> v2.CubeHBaseRPC:315 : hbase.rpc.timeout = 9 ms, use 81000 ms as timeout 
> for coprocessor
> 2018-08-28 01:30:16,983 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> v2.CubeHBaseEndpointRPC:141 : Serialized scanRequestBytes 522 bytes, 
> rawScanBytesString 44 bytes
> 2018-08-28 01:30:16,984 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> v2.CubeHBaseEndpointRPC:143 : The scan 67b41fc6 for segment 
> kylin_sales_cube_clone[2012010100_2013010100] is as below with 1 
> separate raw scans, shard part of start/end key is set to 0
> 2018-08-28 01:30:16,991 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> v2.CubeHBaseRPC:288 : Visiting hbase table KYLIN_5Q088VO5I0: cuboid require 
> post aggregation, from 0 to 16384 Start: 
> \x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00 
> (\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00) Stop: 
> \x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\xFF\xFF\xFF\x00 
> (\x00\x00\x00\x00\x00\x00\x00\x00@\x00\xFF\xFF\xFF\x00), No Fuzzy Key
> 2018-08-28 01:30:16,991 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> v2.CubeHBaseEndpointRPC:148 : Submitting rpc to 1 shards starting from shard 
> 0, scan range count 1
> 2018-08-28 01:30:17,010 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> common.KylinConfig:332 : Loading kylin-defaults.properties from 
> file:/root/shaofengshi/apache-kylin-2.5.0-SNAPSHOT-bin/tomcat/webapps/kylin/WEB-INF/lib/kylin-core-common-2.5.0-SNAPSHOT.jar!/kylin-defaults.properties
> 2018-08-28 01:30:17,033 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> common.KylinConfig:291 : KYLIN_CONF property was not set, will seek 
> KYLIN_HOME env variable
> 2018-08-28 01:30:17,051 INFO [pool-14-thread-1] hbase.HBaseConnection:110 : 
> Creating coprocessor thread pool with max of 2048, core of 2048
> 2018-08-28 01:30:17,094 INFO [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> gtrecord.SequentialCubeTupleIterator:73 : Using SortedIteratorMergerWithLimit 
> to merge segment results
> 2018-08-28 01:30:17,097 DEBUG [Query f7bf8004-b516-e372-18df-0d507075d471-71] 
> enumerator.OLAPEnumerator:117 : return TupleIterator...
> 2018-08-28 01:30:21,607 INFO [kylin-coproc--pool9-t1] 
> client.RpcRetryingCallerImpl:134 : Call exception, tries=6, retries=6, 
> started=4410 ms ago, cancelled=false, msg=java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:468)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:253)
> at 
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:)
> at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8032)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2426)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2408)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42010)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> ... 3 more
> , details=row '' on table 'KYLIN_5Q088VO5I0' at 
> region=KYLIN_5Q088VO5I0,,1535417272444.27b82cb4702db4557a98b9a7e60b7692., 
> hostname=ignite03.com,16020,1534313612401, seqNum=2
> 2018-08-28 01:30:25,633 INFO [kylin-coproc--pool9-t1] 
> cli

[jira] [Updated] (KYLIN-3515) Cubing jobs may interfere with each other if use same hive view

2018-08-29 Thread Casandra julie mitchell (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casandra julie mitchell updated KYLIN-3515:
---
Attachment: Getting started

> Cubing jobs may interfere with each other if use same hive view
> 
>
> Key: KYLIN-3515
> URL: https://issues.apache.org/jira/browse/KYLIN-3515
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.4.0
> Reporter: nichunen
> Assignee: nichunen
> Priority: Major
> Fix For: Future
>
> Attachments: Getting started
>
>
> The root cause is that, for a Hive view, Kylin materializes the view during 
> cubing by creating an intermediate table (dropping any existing intermediate 
> table first). The intermediate table's name has the form 
> kylin_intermediate_{view_name}, so jobs that reference the same view create 
> tables with the same name. One job's intermediate table may therefore be 
> dropped by another job, in which case an error like "table not found" occurs.
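
One way to avoid the collision is to make the intermediate table name unique per job. The sketch below only illustrates that idea and is not the fix adopted for this issue: the class `IntermediateTableNames`, the methods `shared` and `perJob`, and the job-id suffix scheme are all assumptions made for this example.

```java
import java.util.Locale;

/** Hypothetical naming helper; not part of Kylin. */
class IntermediateTableNames {

    /** Naming scheme described in the issue: every job on the same view gets the same name. */
    static String shared(String viewName) {
        return "kylin_intermediate_" + viewName.toLowerCase(Locale.ROOT);
    }

    /**
     * Assumed alternative: append the job id so that concurrent jobs on the
     * same view never create or drop each other's table.
     */
    static String perJob(String viewName, String jobId) {
        // Keep the suffix to letters, digits and underscores so it stays a plain identifier.
        String safeJobId = jobId.replaceAll("[^A-Za-z0-9_]", "_");
        return shared(viewName) + "_" + safeJobId;
    }
}
```

With this scheme, two jobs that reference the view `sales_view` would materialize it into differently named tables, at the cost of materializing the view once per job.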





[jira] [Created] (KYLIN-3520) Deal with NULL values of measures for inmem cubing

2018-08-29 Thread Zhong Yanghong (JIRA)
Zhong Yanghong created KYLIN-3520:
-

 Summary: Deal with NULL values of measures for inmem cubing
 Key: KYLIN-3520
 URL: https://issues.apache.org/jira/browse/KYLIN-3520
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong
 Fix For: v2.5.0








[jira] [Commented] (KYLIN-3487) Create a new measure for precise count distinct

2018-08-29 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596105#comment-16596105
 ] 

Zhong Yanghong commented on KYLIN-3487:
---

Hi [~kangkaisen], this feature is an extension of [KYLIN-2622]. The 
feature introduced by [KYLIN-2622] solves the unbounded growth of the 
global dictionary, but segments of such cubes cannot be merged. This 
extension allows segment merge, which brings two advantages:
* lower storage cost, because the same row key across segments can be merged
* better query efficiency, with fewer RPCs to multiple segments for the same 
row key

> Create a new measure for precise count distinct
> ---
>
> Key: KYLIN-3487
> URL: https://issues.apache.org/jira/browse/KYLIN-3487
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Priority: Major
> Fix For: v2.5.0
>
>


[jira] [Commented] (KYLIN-3451) Cloned cube doesn't have Mandatory Cuboids copied

2018-08-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596022#comment-16596022
 ] 

ASF subversion and git services commented on KYLIN-3451:


Commit 9b762f5365fccdce01ddb8f18ea1a5bb209be261 in kylin's branch 
refs/heads/2.4.x from xingpeng1
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=9b762f5 ]

KYLIN-3451 the cloned cube don't have Mandatory Cuboids


> Cloned cube doesn't have Mandatory Cuboids copied
> -
>
> Key: KYLIN-3451
> URL: https://issues.apache.org/jira/browse/KYLIN-3451
> Project: Kylin
> Issue Type: Bug
> Components: Web
> Affects Versions: v2.3.0
> Reporter: Peng Xing
> Assignee: Peng Xing
> Priority: Minor
> Fix For: v2.4.1, v2.5.0
>
> Attachments: 
> 0001-KYLIN-3451-the-cloned-cube-don-t-have-Mandatory-Cubo.patch
>
>



