[jira] [Commented] (KYLIN-3728) unexpected behavior when do fix holes for steaming cube

2018-12-24 Thread Lijun Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728610#comment-16728610
 ] 

Lijun Cao commented on KYLIN-3728:
--

I was able to fill the hole using a sample streaming cube. Could you provide 
more details?

> unexpected behavior when do fix holes for steaming cube
> ---
>
> Key: KYLIN-3728
> URL: https://issues.apache.org/jira/browse/KYLIN-3728
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.1
>Reporter: wangxianbin
>Priority: Major
> Attachments: after new segment ready.png, fix hole finished.png, 
> fix_holes_kylin.log, in process of fix holes.png
>
>
> After the fix-holes job finished, the existing holes had not been filled, and 
> sometimes additional existing segments became holes. See the fix-holes log in 
> the attachments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread Chao Long (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728600#comment-16728600
 ] 

Chao Long commented on KYLIN-3731:
--

Sorry, I don't understand what you mean. If you use a Hive view, you can 
convert the array to a string in the view; Kylin will then load string-type 
data rather than array-type data, right?

If you provide the query SQL, I may be able to understand better.

> java.lang.IllegalArgumentException: Unsupported data type array at 
> ---
>
> Key: KYLIN-3731
> URL: https://issues.apache.org/jira/browse/KYLIN-3731
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.1
>Reporter: HongBo  Dai
>Assignee: Chao Long
>Priority: Critical
>  Labels: build
> Fix For: v2.5.1
>
> Attachments: error of kylin.txt, image-2018-12-20-10-59-04-060.png
>
>
> After Kylin was recently upgraded from 2.3 to 2.5.1, the array data type in 
> the metadata was found to be unsupported, and the following exception occurred:
> "java.lang.IllegalArgumentException: Unsupported data type array". In Kylin 
> 2.3, running against Hive data stored as arrays caused no problem. The error 
> now leads to a failure at the third step of building a cube:
> "org.apache.kylin.engine.mr.exception.MapReduceException: no counters for the 
> job". Could you tell me how to solve the problem without changing the current 
> data structure? Please see the attachment.
> !image-2018-12-20-10-59-04-060.png! 





[GitHub] codecov-io commented on issue #398: Kylin 3597 fix sonar issues

2018-12-24 Thread GitBox
codecov-io commented on issue #398: Kylin 3597 fix sonar issues
URL: https://github.com/apache/kylin/pull/398#issuecomment-449808086
 
 
   # [Codecov](https://codecov.io/gh/apache/kylin/pull/398?src=pr=h1) Report
   > Merging [#398](https://codecov.io/gh/apache/kylin/pull/398?src=pr=desc) into [master](https://codecov.io/gh/apache/kylin/commit/1af62e46516b7b9a31d3de5a1a7867f9cb51799b?src=pr=desc) will **decrease** coverage by `<.01%`.
   > The diff coverage is `0%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/398/graphs/tree.svg?width=650=JawVgbgsVo=150=pr)](https://codecov.io/gh/apache/kylin/pull/398?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #398     +/-   ##
   
   - Coverage    24.39%   24.38%   -0.01%
   + Complexity    4935     4934      -1
   
     Files         1143     1143
     Lines        69259    69259
     Branches      9859     9859
   
   - Hits         16897    16892      -5
   - Misses       50665    50668      +3
   - Partials      1697     1699      +2
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/kylin/pull/398?src=pr=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...che/kylin/cube/inmemcubing2/InMemCubeBuilder2.java](https://codecov.io/gh/apache/kylin/pull/398/diff?src=pr=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2lubWVtY3ViaW5nMi9Jbk1lbUN1YmVCdWlsZGVyMi5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | [...he/kylin/dict/lookup/cache/RocksDBLookupTable.java](https://codecov.io/gh/apache/kylin/pull/398/diff?src=pr=tree#diff-Y29yZS1kaWN0aW9uYXJ5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9kaWN0L2xvb2t1cC9jYWNoZS9Sb2Nrc0RCTG9va3VwVGFibGUuamF2YQ==) | `72.97% <0%> (-5.41%)` | `6% <0%> (-1%)` | |
   | [...rg/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://codecov.io/gh/apache/kylin/pull/398/diff?src=pr=tree#diff-Y29yZS1jdWJlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9reWxpbi9jdWJlL2lubWVtY3ViaW5nL01lbURpc2tTdG9yZS5qYXZh) | `69.3% <0%> (-0.92%)` | `7% <0%> (ø)` | |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/398?src=pr=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/398?src=pr=footer). Last update [1af62e4...37389bf](https://codecov.io/gh/apache/kylin/pull/398?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] hit-lacus commented on a change in pull request #397: KYLIN-3722 Disable limit push down after join

2018-12-24 Thread GitBox
hit-lacus commented on a change in pull request #397: KYLIN-3722 Disable limit 
push down after join
URL: https://github.com/apache/kylin/pull/397#discussion_r243877021
 
 

 ##
 File path: query/src/main/java/org/apache/kylin/query/relnode/OLAPLimitRel.java
 ##
 @@ -82,7 +82,8 @@ public void implementOLAP(OLAPImplementor implementor) {
 // ignore limit after having clause
 // ignore limit after another limit, e.g. select A, count(*) from (select A,B from fact group by A,B limit 100) limit 10
 // ignore limit after outer aggregate, e.g. select count(1) from (select A,B from fact group by A,B ) limit 10
-if (!context.afterHavingClauseFilter && !context.afterLimit && !context.afterOuterAggregate) {
+// ignore limit after join
+if (!context.afterHavingClauseFilter && !context.afterLimit && !context.afterOuterAggregate && !context.afterJoin) {
 
 Review comment:
   Good point. I am looking for a better solution now.




[GitHub] shaofengshi commented on a change in pull request #397: KYLIN-3722 Disable limit push down after join

2018-12-24 Thread GitBox
shaofengshi commented on a change in pull request #397: KYLIN-3722 Disable 
limit push down after join
URL: https://github.com/apache/kylin/pull/397#discussion_r243876893
 
 

 ##
 File path: query/src/main/java/org/apache/kylin/query/relnode/OLAPLimitRel.java
 ##
 @@ -82,7 +82,8 @@ public void implementOLAP(OLAPImplementor implementor) {
 // ignore limit after having clause
 // ignore limit after another limit, e.g. select A, count(*) from (select A,B from fact group by A,B limit 100) limit 10
 // ignore limit after outer aggregate, e.g. select count(1) from (select A,B from fact group by A,B ) limit 10
-if (!context.afterHavingClauseFilter && !context.afterLimit && !context.afterOuterAggregate) {
+// ignore limit after join
+if (!context.afterHavingClauseFilter && !context.afterLimit && !context.afterOuterAggregate && !context.afterJoin) {
 
 Review comment:
   My concern is that this change may make many normal queries (joins with a 
limit) much less efficient.
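
The guard under review can be read as a single predicate over the context flags. The following is a minimal Python sketch of that reading, not Kylin code: `LimitContext` and `can_push_down_limit` are hypothetical names, and the fields only mirror the flags that appear in the diff above.

```python
from dataclasses import dataclass


@dataclass
class LimitContext:
    # Hypothetical stand-in for the context flags named in the diff.
    after_having_clause_filter: bool = False
    after_limit: bool = False
    after_outer_aggregate: bool = False
    after_join: bool = False


def can_push_down_limit(ctx: LimitContext, check_join: bool = True) -> bool:
    """Return True when the limit may be pushed down.

    check_join=False mimics the condition before the patch,
    check_join=True mimics the condition after it.
    """
    blocked = (
        ctx.after_having_clause_filter
        or ctx.after_limit
        or ctx.after_outer_aggregate
        or (check_join and ctx.after_join)
    )
    return not blocked


join_query = LimitContext(after_join=True)
print(can_push_down_limit(join_query, check_join=False))  # guard before the patch
print(can_push_down_limit(join_query, check_join=True))   # guard after the patch
```

Under this reading, every query that applies a limit after a join loses the push-down once the new flag is checked, which is the efficiency concern raised in the comment.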




[jira] [Commented] (KYLIN-3739) Use table alias rather than table identity for snapshots in CubeSegment, CubeInstance, CubeDesc

2018-12-24 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728578#comment-16728578
 ] 

Shaofeng SHI commented on KYLIN-3739:
-

Today the table snapshots are shared across cubes, because the table ID is the 
same. If we change to this approach, can the snapshots still be shared as before?

> Use table alias rather than table identity for snapshots in CubeSegment, 
> CubeInstance, CubeDesc
> ---
>
> Key: KYLIN-3739
> URL: https://issues.apache.org/jira/browse/KYLIN-3739
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: Zhong Yanghong
>Priority: Major
>
> In 2.0.0, Kylin introduced table aliases in DataModelDesc. Most of the elements 
> in CubeDesc, CubeInstance & CubeSegment use the table alias rather than the 
> table identity. However, the lookup table snapshots still use the table 
> identity. 
> It would be better to use only the table alias instead of the table identity in 
> CubeDesc, CubeInstance & CubeSegment. Doing so provides several 
> advantages:
> # For CubeDesc, CubeInstance & CubeSegment, only the snowflake model is exposed 
> to them, and they don't need to care which real table is used.
> # If users want to change the table name in the snowflake model, we can 
> keep the table alias unchanged and change only the DataModelDesc. We don't 
> need to make any change to CubeDesc, CubeInstance & CubeSegment.
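
To make the proposal above concrete, here is a rough Python sketch of the indirection it describes. Every name in it (`model_alias_to_table`, `segment_snapshots`, `resolve_snapshot`, the table names, and the snapshot path) is hypothetical and chosen only for illustration; none of it reflects Kylin's actual metadata classes or storage layout.

```python
from typing import Tuple

# Hypothetical model-level mapping: table alias -> real table identity.
model_alias_to_table = {"BUYER_COUNTRY": "DEFAULT.KYLIN_COUNTRY"}

# Hypothetical segment-level mapping, keyed by alias rather than by identity.
segment_snapshots = {"BUYER_COUNTRY": "/table_snapshot/buyer_country/snap-001"}


def resolve_snapshot(alias: str) -> Tuple[str, str]:
    """Return (real table identity, snapshot path) for a table alias."""
    return model_alias_to_table[alias], segment_snapshots[alias]


before = resolve_snapshot("BUYER_COUNTRY")

# Renaming the real table touches only the model-level mapping; the
# segment-level entry, keyed by alias, stays unchanged.
model_alias_to_table["BUYER_COUNTRY"] = "DEFAULT.COUNTRY_V2"
after = resolve_snapshot("BUYER_COUNTRY")

print(before)
print(after)
```

The sketch does not answer whether snapshots keyed this way could still be shared across cubes; that would depend on how the alias is resolved back to the table identity when snapshots are looked up.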





[jira] [Commented] (KYLIN-3739) Use table alias rather than table identity for snapshots in CubeSegment, CubeInstance, CubeDesc

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728577#comment-16728577
 ] 

Zhong Yanghong commented on KYLIN-3739:
---

Hi [~liyang.g...@gmail.com] & [~Shaofengshi], what do you think?



[jira] [Resolved] (KYLIN-3559) Use Splitter for splitting String

2018-12-24 Thread Wu Bin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wu Bin resolved KYLIN-3559.
---
Resolution: Fixed

> Use Splitter for splitting String
> -
>
> Key: KYLIN-3559
> URL: https://issues.apache.org/jira/browse/KYLIN-3559
> Project: Kylin
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Wu Bin
>Priority: Major
> Fix For: v2.6.0
>
>
> See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
> preferred.





[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728569#comment-16728569
 ] 

ASF GitHub Bot commented on KYLIN-3738:
---

shaofengshi commented on pull request #414: KYLIN-3738 Edit cube measure may 
make the decimal type change unexpectly
URL: https://github.com/apache/kylin/pull/414
 
 
   
 



> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>
> When editing a cube's measure and clicking save, the original return type may 
> be changed from decimal(19,4) to decimal(19), which will cause incorrect cube 
> build results and incorrect query results.





[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728568#comment-16728568
 ] 

Shaofeng SHI commented on KYLIN-3738:
-

Julian, thanks for the confirmation. Please go ahead.



[jira] [Commented] (KYLIN-2243) TopN memory estimation is inaccurate in some cases

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728571#comment-16728571
 ] 

ASF subversion and git services commented on KYLIN-2243:


Commit 1af62e46516b7b9a31d3de5a1a7867f9cb51799b in kylin's branch 
refs/heads/master from liapan
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=1af62e4 ]

KYLIN-3738 Edit cube measure may make the decimal type change unexpectly
revert KYLIN-2243 8c0c44b887e2caa21b097c2334f8d21c42462e80


> TopN memory estimation is inaccurate in some cases
> --
>
> Key: KYLIN-2243
> URL: https://issues.apache.org/jira/browse/KYLIN-2243
> Project: Kylin
>  Issue Type: Bug
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.0.0
>
>
> TopNCounterSerializer.maxLength() and 
> TopNCounterSerializer.getStorageBytesEstimate() might be inaccurate, 
> especially when there are multiple "group by" columns in one TopN measure and 
> some of them use a long byte encoding like "fixed_length:16".
> The inaccurate estimation may cause memory issues when using in-mem cubing, 
> and will make the estimate of the final cube size inaccurate.
> The root cause is that a data type like "top(100)" doesn't carry any 
> information about how long a key can be. So far it uses a default value of 4, 
> which is too small when the encoding is something like "fixed_length:16". The 
> solution is to extend the data type expression to "top(100, 16)" to indicate 
> that one key can be 16 bytes long. If the "scale" is absent, 4 bytes is used 
> as the default.
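
The extended type expression described above can be illustrated with a small parser. This is only a Python sketch of the proposed syntax, assuming the second argument is the key length in bytes and that 4 is the fallback; `parse_top_type` and `DEFAULT_KEY_LENGTH` are hypothetical names, and this is not Kylin's data type parser.

```python
import re

# Default key length from the description, used when the "scale" is absent.
DEFAULT_KEY_LENGTH = 4

# Accepts "top(N)" or "top(N, K)" with optional whitespace around the numbers.
_TOP_PATTERN = re.compile(r"^top\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)$")


def parse_top_type(type_name: str):
    """Return (top_n, key_length_bytes) for a TopN type expression."""
    match = _TOP_PATTERN.match(type_name.strip().lower())
    if match is None:
        raise ValueError("not a TopN type expression: %r" % type_name)
    top_n = int(match.group(1))
    key_length = int(match.group(2)) if match.group(2) else DEFAULT_KEY_LENGTH
    return top_n, key_length


print(parse_top_type("top(100)"))
print(parse_top_type("top(100, 16)"))
```

With the key length available, a size estimate could scale with the real encoding width instead of assuming 4 bytes per key; how the serializer would combine the two numbers is left out here because the description does not spell it out.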





[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728570#comment-16728570
 ] 

ASF subversion and git services commented on KYLIN-3738:


Commit 1af62e46516b7b9a31d3de5a1a7867f9cb51799b in kylin's branch 
refs/heads/master from liapan
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=1af62e4 ]

KYLIN-3738 Edit cube measure may make the decimal type change unexpectly
revert KYLIN-2243 8c0c44b887e2caa21b097c2334f8d21c42462e80



[GitHub] shaofengshi closed pull request #414: KYLIN-3738 Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread GitBox
shaofengshi closed pull request #414: KYLIN-3738 Edit cube measure may make the 
decimal type change unexpectly
URL: https://github.com/apache/kylin/pull/414
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/webapp/app/js/controllers/cubeMeasures.js b/webapp/app/js/controllers/cubeMeasures.js
index f1821dda87..fb5610f571 100644
--- a/webapp/app/js/controllers/cubeMeasures.js
+++ b/webapp/app/js/controllers/cubeMeasures.js
@@ -58,7 +58,6 @@ KylinApp.controller('CubeMeasuresCtrl', function ($scope, $modal,MetaModel,cubes
 $scope.nextParameters = [];
 $scope.measureParamValueColumn=$scope.getCommonMetricColumns();
 $scope.newMeasure = (!!measure)? jQuery.extend(true, {},measure):CubeDescModel.createMeasure();
-$scope.newMeasure.function.returntype=$scope.newMeasure.function.returntype.replace(/\,\d+/,'');
 if(!!measure && measure.function.parameter.next_parameter){
   $scope.nextPara.value = measure.function.parameter.next_parameter.value;
 }
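
The effect of the line removed by this diff can be reproduced outside the web app. The following is a small Python sketch, assuming that `re.sub` with `count=1` is an adequate stand-in for the JavaScript `replace` call with a non-global regex; `strip_scale` is a hypothetical helper name.

```python
import re


def strip_scale(return_type: str) -> str:
    # JavaScript's String.replace with a non-global regex replaces only the
    # first match, so count=1 mirrors that behaviour.
    return re.sub(r",\d+", "", return_type, count=1)


print(strip_scale("decimal(19,4)"))
print(strip_scale("bigint"))
```

A type with a scale loses it, while a type without a comma passes through unchanged, which matches the change from decimal(19,4) to decimal(19) reported in the issue.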


 




[jira] [Commented] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread HongBo Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728560#comment-16728560
 ] 

HongBo  Dai commented on KYLIN-3731:


Hi, an array column (a complex data type) cannot be used directly in a Kylin 
dimension table query; it needs to be converted through a Hive view, which is 
then joined to the fact table for querying. In the Hive view, the explode 
function can convert multidimensional arrays into a queryable form. This 
applies to version 2.3 and higher. In 2.3, which did not yet include the 
DataTypeOrder class, this caused no problem; in version 2.5.x the problem now 
occurs.



[jira] [Comment Edited] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread HongBo Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728560#comment-16728560
 ] 

HongBo  Dai edited comment on KYLIN-3731 at 12/25/18 1:47 AM:
--

Hi, an array column (a complex data type) cannot be used directly in a Kylin 
dimension table query; it needs to be converted through a Hive view, which is 
then joined to the fact table for querying. In the Hive view, the explode 
function can convert multidimensional arrays into a queryable form. This 
applies to version 2.3 and higher. In 2.3, which did not yet include the 
DataTypeOrder class, this caused no problem; in version 2.5.x the problem now 
occurs.


was (Author: ville):
Hi, with complex data types in kylin array column as dimension table query, 
just can't directly use need to be converted into the hive view, through the 
link to query the fact table, in the hive view can use function explodes 
multidimensional arrays can be converted to the form of query, whether in 
version 2.3 or higher, if again in 2.3 using did not join DataTypeOrder when 
the class is no problem now in 2.5 x version will have that problem.



[jira] [Resolved] (KYLIN-3540) Improve Mandatory Cuboid Recommendation Algorithm

2018-12-24 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong resolved KYLIN-3540.
---
Resolution: Resolved

> Improve Mandatory Cuboid Recommendation Algorithm
> -
>
> Key: KYLIN-3540
> URL: https://issues.apache.org/jira/browse/KYLIN-3540
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> Previously, to add cuboids that are not prebuilt, the cube planner turned to 
> mandatory cuboids, which are selected if their rollup row count is above some 
> threshold. There are two shortcomings:
> * The way the rollup row count is estimated is inaccurate
> * It's hard to determine the rollup row count threshold for recommending 
> mandatory cuboids
> bq. {color:#f79232}The improved way to estimate the rollup row count is as 
> follows:{color}
> The current criterion for recommending mandatory cuboids is based on the 
> average rollup count collected with query metrics. This has a disadvantage. An 
> example is as follows:
> Cuboid (A,B) has 1000 rows and is prebuilt; Cuboid (B) has 10 rows and is not 
> prebuilt. The ground truth for the rollup count from Cuboid (A,B) to Cuboid (B) 
> is
> {code}
> Cuboid (A,B) - Cuboid (B) = 1000 - 10 = 990
> {code}
> Suppose the values of B are evenly distributed across A. Then for each value 
> of B, the row count in Cuboid (A,B) is 1000 / 10 = 100.
> Now for the sql 
> {code}
> select B, count(*)
> from T
> where B = 'e1'
> group by B
> {code}
> the rollup count by the current algorithm will be
> {code}
> Cuboid (A,{'e1'}) - return count = 100 - 1 = 99
> {code}
> which is much smaller than 990, because most rows were already removed by the 
> filter.
> It's better to calculate the rollup rate first and then multiply it by the 
> parent cuboid row count to estimate the rollup count. The refined formula is 
> as follows:
> {code}
> Cuboid (A,B) - Cuboid (A,B) * (return count) / Cuboid (A,{'e1'}) = 
> 1000-1000*1/100 = 990
> {code}
> Another sql
> {code}
> select count(*)
> from T
> where B in ('e1','e2')
> {code}
> The rollup count by the current algorithm will be
> {code}
> Cuboid (A,{'e1','e2'}) - return count = 100*2 - 1 = 199
> {code}
> The rollup count by the refined algorithm will be
> {code}
> Cuboid (A,B) - Cuboid (A,B) * (return count) / Cuboid (A,{'e1','e2'}) = 
> 1000-1000*1/(100*2) = 995
> {code}
> Overall, the refined algorithm is much less influenced by filters in the 
> sql.
> bq. {color:#f79232}Don't recommend mandatory cuboids & don't need the 
> threshold
> {color}
> Previously the reason for recommending mandatory cuboids was that they are not 
> prebuilt and their row count statistics are not known, which makes it 
> impossible to apply the cube planner algorithm to them. Now, with the improved 
> way of estimating the rollup row count, we can better estimate the row count 
> statistics for those cuboids which are not prebuilt. The cost-based cube 
> planner algorithm will then decide which cuboids to build, and the threshold 
> is no longer needed.
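
The two estimates compared in the description can be checked with a few lines of Python. This is a sketch of the arithmetic only, using the numbers from the example; the function and parameter names (`current_rollup`, `refined_rollup`, `scanned_rows`, `returned_rows`, `parent_rows`) are hypothetical and do not come from the cube planner code.

```python
def current_rollup(scanned_rows: int, returned_rows: int) -> int:
    """Current estimate: rows scanned after filtering minus rows returned."""
    return scanned_rows - returned_rows


def refined_rollup(parent_rows: int, scanned_rows: int, returned_rows: int) -> float:
    """Refined estimate: scale the returned/scanned ratio up to the parent cuboid."""
    return parent_rows - parent_rows * returned_rows / scanned_rows


PARENT_ROWS = 1000  # rows in Cuboid (A,B)

# where B = 'e1': 100 rows scanned, 1 row returned
print(current_rollup(100, 1), refined_rollup(PARENT_ROWS, 100, 1))

# where B in ('e1','e2'): 200 rows scanned, 1 row returned
print(current_rollup(200, 1), refined_rollup(PARENT_ROWS, 200, 1))
```

For the first query the refined value matches the ground truth of 990 from the example, while the current estimate tracks how many rows survived the filter rather than the size of the parent cuboid.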





[jira] [Commented] (KYLIN-3540) Improve Mandatory Cuboid Recommendation Algorithm

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728555#comment-16728555
 ] 

ASF subversion and git services commented on KYLIN-3540:


Commit 8cfe32cf4f61d218439e57e134ccc3413aa98f89 in kylin's branch 
refs/heads/master from kyotoYaho
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=8cfe32c ]

KYLIN-3540 move queryService of CubeController to CubeService




[jira] [Commented] (KYLIN-3540) Improve Mandatory Cuboid Recommendation Algorithm

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728556#comment-16728556
 ] 

ASF subversion and git services commented on KYLIN-3540:


Commit 4db6a37c7220c122cedb9fac1a2c735f10e27226 in kylin's branch 
refs/heads/master from kyotoYaho
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=4db6a37 ]

KYLIN-3540 refactor the interface of querying on SYSTEM project


> Improve Mandatory Cuboid Recommendation Algorithm
> -
>
> Key: KYLIN-3540
> URL: https://issues.apache.org/jira/browse/KYLIN-3540
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> Previously to add cuboids which are not prebuilt,  the cube planner turns to 
> mandatory cuboids which are selected if its rollup row count is above some 
> threshold. There are two shortcomings:
> * The way to estimate the rollup row count is not good
> * It's hard to determine the threshold of rollup row count for recommending 
> mandatory cuboids
> bq. {color:#f79232}The improved way to estimate the rollup row count is as 
> follows:{color}
> Current criteria to recommend mandatory cuboids is based on the average 
> rollup count collected with query metrics. There's a disadvantage. An example 
> is as follows:
> Cuboid (A,B) has 1000 rows, prebuilt; Cuboid (B) has 10 rows, not prebuilt; 
> The ground truth for the rollup count from Cuboid (A,B) to Cuboid (B) is
> {code}
> Cuboid (A,B) - Cuboid (A) = 1000 - 10 = 990
> {code}
> Suppose B is evenly composed with A. Then for each value of B with A, the row 
> count is 1000 * (10/100) = 100.
> Now for the SQL 
> {code}
> select B, count(*)
> from T
> where B = 'e1'
> group by B
> {code}
> the rollup count by the current algorithm will be
> {code}
> Cuboid (A,{'e1'}) - return count = 100 - 1 = 99
> {code}
> which is much smaller than 990, because the filter excludes most rows from 
> the count.
> It's better to calculate the rollup rate first and then multiply it by the 
> parent cuboid row count to estimate the rollup count. The refined formula is 
> as follows:
> {code}
> Cuboid (A,B) - Cuboid (A,B) * (return count) / Cuboid (A,{'e1'}) = 
> 1000-1000*1/100 = 990
> {code}
> Another sql
> {code}
> select count(*)
> from T
> where B in ('e1','e2')
> {code}
> The rollup count by current algorithm will be
> {code}
> Cuboid (A,{'e1','e2'}) - return count = 100*2 - 1 = 199
> {code}
> The rollup count by refined algorithm will be
> {code}
> Cuboid (A,B) - Cuboid (A,B) * (return count) / Cuboid (A,{'e1','e2'}) = 
> 1000-1000*1/(100*2) = 995
> {code}
> Overall, the refined algorithm is much less influenced by filters in 
> SQL.
>
> Don't recommend mandatory cuboids & don't need the threshold
> Previously, the reason to recommend mandatory cuboids was that they are not 
> prebuilt and their row count statistics are not known, which makes it 
> impossible to apply the cube planner algorithm to them. Now, with the improved 
> way of estimating rollup row count, we can better estimate the row count 
> statistics for those cuboids which are not prebuilt. The cost-based cube 
> planner algorithm then decides which cuboids to build, so the threshold is 
> no longer needed.
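
The arithmetic in the quoted description can be checked with a small Java sketch. The class and method names here are hypothetical and are not part of Kylin; the inputs are the numbers from the example (Cuboid (A,B) has 1000 rows, each value of B covers 100 of them, and each query returns 1 row).

```java
// Illustrative only: names are assumptions, not Kylin APIs.
final class RollupEstimate {

    // Current approach: rows scanned by the query minus rows returned.
    static long current(long scannedRows, long returnedRows) {
        return scannedRows - returnedRows;
    }

    // Refined approach: compute the return ratio of the query first,
    // then apply it to the full parent cuboid row count.
    static long refined(long parentRows, long scannedRows, long returnedRows) {
        double returnRatio = (double) returnedRows / scannedRows;
        return Math.round(parentRows - parentRows * returnRatio);
    }

    public static void main(String[] args) {
        long parentRows = 1000; // Cuboid (A,B)

        // where B = 'e1' group by B: 100 rows scanned, 1 row returned
        System.out.println(current(100, 1));             // 99
        System.out.println(refined(parentRows, 100, 1)); // 990

        // where B in ('e1','e2'), no group by: 200 rows scanned, 1 row returned
        System.out.println(current(200, 1));             // 199
        System.out.println(refined(parentRows, 200, 1)); // 995
    }
}
```

With these inputs the refined estimate stays close to the ground truth of 990 for both queries, while the current estimate tracks the size of the filtered slice.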





[jira] [Commented] (KYLIN-3540) Improve Mandatory Cuboid Recommendation Algorithm

2018-12-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728554#comment-16728554
 ] 

ASF GitHub Bot commented on KYLIN-3540:
---

kyotoYaho commented on pull request #407: KYLIN-3540 estimate the row counts of 
source cuboids which are not built & remove mandatory cuboids recommendation
URL: https://github.com/apache/kylin/pull/407
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve Mandatory Cuboid Recommendation Algorithm
> -
>
> Key: KYLIN-3540
> URL: https://issues.apache.org/jira/browse/KYLIN-3540
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>


[jira] [Commented] (KYLIN-3540) Improve Mandatory Cuboid Recommendation Algorithm

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728557#comment-16728557
 ] 

ASF subversion and git services commented on KYLIN-3540:


Commit 4850dacece8a2e90f4333b50e2e8304635a730a2 in kylin's branch 
refs/heads/master from kyotoYaho
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=4850dac ]

KYLIN-3540 estimate the row counts of source cuboids which are not built & 
remove mandatory cuboids recommendation


> Improve Mandatory Cuboid Recommendation Algorithm
> -
>
> Key: KYLIN-3540
> URL: https://issues.apache.org/jira/browse/KYLIN-3540
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>


[GitHub] kyotoYaho closed pull request #407: KYLIN-3540 estimate the row counts of source cuboids which are not built & remove mandatory cuboids recommendation

2018-12-24 Thread GitBox
kyotoYaho closed pull request #407: KYLIN-3540 estimate the row counts of 
source cuboids which are not built & remove mandatory cuboids recommendation
URL: https://github.com/apache/kylin/pull/407
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java b/core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java
index f67f6b3479..b63062e31a 100644
--- a/core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java
+++ b/core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java
@@ -316,8 +316,7 @@ public String getMetastoreBigCellHdfsDirectory() {
     public String getReadHdfsWorkingDirectory() {
         if (StringUtils.isNotEmpty(getHBaseClusterFs())) {
             Path workingDir = new Path(getHdfsWorkingDirectory());
-            return new Path(getHBaseClusterFs(), Path.getPathWithoutSchemeAndAuthority(workingDir)).toString()
-                    + "/";
+            return new Path(getHBaseClusterFs(), Path.getPathWithoutSchemeAndAuthority(workingDir)).toString() + "/";
         }
 
         return getHdfsWorkingDirectory();
@@ -644,8 +643,12 @@ public int getCubePlannerRecommendCuboidCacheMaxSize() {
         return Integer.parseInt(getOptional("kylin.cube.cubeplanner.recommend-cache-max-size", "200"));
     }
 
-    public long getCubePlannerMandatoryRollUpThreshold() {
-        return Long.parseLong(getOptional("kylin.cube.cubeplanner.mandatory-rollup-threshold", "1000"));
+    public double getCubePlannerQueryUncertaintyRatio() {
+        return Double.parseDouble(getOptional("kylin.cube.cubeplanner.query-uncertainty-ratio", "0.1"));
+    }
+
+    public double getCubePlannerBPUSMinBenefitRatio() {
+        return Double.parseDouble(getOptional("kylin.cube.cubeplanner.bpus-min-benefit-ratio", "0.01"));
     }
 
     public int getCubePlannerAgreedyAlgorithmAutoThreshold() {
@@ -1910,12 +1913,13 @@ public boolean isJsonAlwaysSmallCell() {
     }
 
     public int getSmallCellMetadataWarningThreshold() {
-        return Integer.parseInt(getOptional("kylin.metadata.jdbc.small-cell-meta-size-warning-threshold",
-                String.valueOf(100 << 20))); //100mb
+        return Integer.parseInt(
+                getOptional("kylin.metadata.jdbc.small-cell-meta-size-warning-threshold", String.valueOf(100 << 20))); //100mb
     }
 
     public int getSmallCellMetadataErrorThreshold() {
-        return Integer.parseInt(getOptional("kylin.metadata.jdbc.small-cell-meta-size-error-threshold", String.valueOf(1 << 30))); // 1gb
+        return Integer.parseInt(
+                getOptional("kylin.metadata.jdbc.small-cell-meta-size-error-threshold", String.valueOf(1 << 30))); // 1gb
     }
 
     public int getJdbcResourceStoreMaxCellSize() {
diff --git a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java
index 6316858d58..39c52dafe9 100755
--- a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java
+++ b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java
@@ -142,7 +142,7 @@ public boolean ifEfficient(CuboidBenefitModel best) {
     }
 
     public double getMinBenefitRatio() {
-        return 0.01;
+        return cuboidStats.getBpusMinBenefitRatio();
     }
 
     @Override
diff --git a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidRecommender.java b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidRecommender.java
index baacb51791..0e6a844a95 100644
--- a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidRecommender.java
+++ b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidRecommender.java
@@ -154,12 +154,11 @@ public static CuboidRecommender getInstance() {
 
         Map recommendCuboidsWithStats = Maps.newLinkedHashMap();
         for (Long cuboid : recommendCuboidList) {
-            if (cuboid.equals(cuboidStats.getBaseCuboid())) {
-                recommendCuboidsWithStats.put(cuboid, cuboidStats.getCuboidCount(cuboid));
-            } else if (cuboidStats.getAllCuboidsForSelection().contains(cuboid)) {
-                recommendCuboidsWithStats.put(cuboid, cuboidStats.getCuboidCount(cuboid));
+            if (cuboid == 0L) {
+                // for zero cuboid, just simply recommend the cheapest cuboid.
+                handleCuboidZeroRecommend(cuboidStats, recommendCuboidsWithStats);
             } else {
-                recommendCuboidsWithStats.put(cuboid, -1L);
+                recommendCuboidsWithStats.put(cuboid, cuboidStats.getCuboidCount(cuboid));
             }
 

[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Pan, Julian (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728548#comment-16728548
 ] 

Pan, Julian commented on KYLIN-3738:


Hi Shaofeng, I tested locally: after reverting the previous commit, the topN 
measure ($scope.newMeasure) is the same as before and the decimal issue is 
resolved.

> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>
> When editing a cube's measure and clicking save, the original return type may 
> be changed from decimal(19,4) to decimal(19), which causes incorrect cube 
> build results and incorrect query results.





[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728400#comment-16728400
 ] 

Shaofeng SHI commented on KYLIN-3738:
-

Hi Julian, thanks for reporting. Is there any drawback to reverting the 
previous commit? Can it be fixed?

> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>


[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728334#comment-16728334
 ] 

ASF GitHub Bot commented on KYLIN-3738:
---

sanjulian commented on pull request #414: KYLIN-3738 Edit cube measure may make 
the decimal type change unexpectly
URL: https://github.com/apache/kylin/pull/414
 
 
   revert KYLIN-2243 8c0c44b887e2caa21b097c2334f8d21c42462e80
 



> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>


[GitHub] asfgit commented on issue #414: KYLIN-3738 Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread GitBox
asfgit commented on issue #414: KYLIN-3738 Edit cube measure may make the 
decimal type change unexpectly
URL: https://github.com/apache/kylin/pull/414#issuecomment-449721083
 
 
   Can one of the admins verify this patch?




With regards,
Apache Git Services


[GitHub] sanjulian opened a new pull request #414: KYLIN-3738 Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread GitBox
sanjulian opened a new pull request #414: KYLIN-3738 Edit cube measure may make 
the decimal type change unexpectly
URL: https://github.com/apache/kylin/pull/414
 
 
   revert KYLIN-2243 8c0c44b887e2caa21b097c2334f8d21c42462e80




[jira] [Commented] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread Chao Long (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728330#comment-16728330
 ] 

Chao Long commented on KYLIN-3731:
--

Yes, Kylin supports loading array type data from Hive and the build can 
succeed, but when I query an array type column, I get an error like 
"String cannot cast to array". So I think Kylin doesn't really support complex 
data types. 

So I would like to know what your query SQL looked like when you were using 
Kylin 2.3.

> java.lang.IllegalArgumentException: Unsupported data type array at 
> ---
>
> Key: KYLIN-3731
> URL: https://issues.apache.org/jira/browse/KYLIN-3731
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.1
>Reporter: HongBo  Dai
>Assignee: Chao Long
>Priority: Critical
>  Labels: build
> Fix For: v2.5.1
>
> Attachments: error of kylin.txt, image-2018-12-20-10-59-04-060.png
>
>
> After Kylin was recently upgraded from 2.3 to 2.5.1, the array data type in 
> the metadata was found to be unsupported and the following exception occurred: 
> "java.lang.IllegalArgumentException: Unsupported data type array". In 
> Kylin 2.3, Hive tables storing array type data ran without problems. 
> Now the third step of building a cube fails as follows: 
> "org.apache.kylin.engine.mr.exception.MapReduceException: no counters 
> for the job". Could you tell me how to solve the problem without changing 
> the data structure? Please see the attachment. 
> !image-2018-12-20-10-59-04-060.png! 





[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Pan, Julian (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728319#comment-16728319
 ] 

Pan, Julian commented on KYLIN-3738:


Hi [~Shaofengshi], should we revert the commit? I tested locally: the 
configuration attribute will override the encoding.

> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>


[jira] [Updated] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Pan, Julian (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pan, Julian updated KYLIN-3738:
---
Description: When edit cube's measure and click save, the origin return 
type maybe changed from decimal(19,4) to decimal(19), that will cause cube 
build result not correct and query result incorrectly.  (was: When edit cube's 
measure and click save, the origin return type maybe changed from decimal(19,4) 
to decimal(19), that will cause cube build result not correct and query 
incorrectly.)

> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>


[jira] [Commented] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread HongBo Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728308#comment-16728308
 ] 

HongBo  Dai commented on KYLIN-3731:


Hi, the situation is that the Hive fact table stores a complex data type (an 
array). I created a Hive view on it, and the Kylin cube is built on fields 
from that view, which fails directly. From reading the class source code, it 
seems to lack the logic to handle this type, which causes the error at the 
third step when building the cube.

> java.lang.IllegalArgumentException: Unsupported data type array at 
> ---
>
> Key: KYLIN-3731
> URL: https://issues.apache.org/jira/browse/KYLIN-3731
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.1
>Reporter: HongBo  Dai
>Assignee: Chao Long
>Priority: Critical
>  Labels: build
> Fix For: v2.5.1
>
> Attachments: error of kylin.txt, image-2018-12-20-10-59-04-060.png
>
>


[jira] [Comment Edited] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread HongBo Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728308#comment-16728308
 ] 

HongBo  Dai edited comment on KYLIN-3731 at 12/24/18 10:12 AM:
---

Hi, the situation is that the Hive fact table stores a complex data type (an 
array). I created a Hive view on it, and the Kylin cube is built on fields 
from that view, which fails directly. From reading the class source code, it 
seems to lack the logic to handle this type, which causes the error at the 
third step when building the cube.
Kylin itself supports complex data types; otherwise, Hive tables containing 
them could not be loaded and used in Kylin's web interface.


was (Author: ville):
Hi, the situation is that the Hive fact table stores a complex data type (an 
array). I created a Hive view on it, and the Kylin cube is built on fields 
from that view, which fails directly. From reading the class source code, it 
seems to lack the logic to handle this type, which causes the error at the 
third step when building the cube.

> java.lang.IllegalArgumentException: Unsupported data type array at 
> ---
>
> Key: KYLIN-3731
> URL: https://issues.apache.org/jira/browse/KYLIN-3731
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.1
>Reporter: HongBo  Dai
>Assignee: Chao Long
>Priority: Critical
>  Labels: build
> Fix For: v2.5.1
>
> Attachments: error of kylin.txt, image-2018-12-20-10-59-04-060.png
>
>


[jira] [Commented] (KYLIN-3731) java.lang.IllegalArgumentException: Unsupported data type array at

2018-12-24 Thread Chao Long (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728302#comment-16728302
 ] 

Chao Long commented on KYLIN-3731:
--

As far as I know, Kylin doesn't support complex data types such as array or 
map.

How does it work in your scenario? In other words, what is your query pattern 
for the column whose data type is array?

> java.lang.IllegalArgumentException: Unsupported data type array at 
> ---
>
> Key: KYLIN-3731
> URL: https://issues.apache.org/jira/browse/KYLIN-3731
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.1
>Reporter: HongBo  Dai
>Assignee: Chao Long
>Priority: Critical
>  Labels: build
> Fix For: v2.5.1
>
> Attachments: error of kylin.txt, image-2018-12-20-10-59-04-060.png
>
>


[jira] [Commented] (KYLIN-3021) Check MapReduce job failed reason and include the diagnostics into email notification

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728289#comment-16728289
 ] 

ASF subversion and git services commented on KYLIN-3021:


Commit 0c60c6b9cad6ffefd716fd2a3e41a3b9b45788b5 in kylin's branch 
refs/heads/master from Wang Ken
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=0c60c6b ]

KYLIN-3021 check MapReduce job failed reason and include the diagnostics into 
email notification


> Check MapReduce job failed reason and include the diagnostics into email 
> notification
> -
>
> Key: KYLIN-3021
> URL: https://issues.apache.org/jira/browse/KYLIN-3021
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> In the current kylin.log and failed-job email notification, we do not have 
> detailed error info on why the MapReduce jobs failed. We just log "no 
> counters for job" or "Counters: 0".
>  
> 2017-08-03 18:24:10,197 WARN  [pool-10-thread-17] common.HadoopCmdOutput:90 : 
> no counters for job job_1497957612021_709431
>  
> 2017-08-03 15:08:02,351 DEBUG [pool-10-thread-3] common.HadoopCmdOutput:95 : 
> Counters: 0





[jira] [Assigned] (KYLIN-3291) Logically equivalent SQL queries on a built cube return different results

2018-12-24 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-3291:
-

Assignee: Zhong Yanghong

> Logically equivalent SQL queries on a built cube return different results
> --
>
> Key: KYLIN-3291
> URL: https://issues.apache.org/jira/browse/KYLIN-3291
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.0.0
> Environment: kylin 2.0, hbase 1.2.0
>Reporter: zhang
>Assignee: Zhong Yanghong
>Priority: Blocker
>  Labels: patch
> Fix For: v2.6.0
>
>
> select 
> a.agent
>   ,b.channel_name
>   ,a.ONLINE_SECONDS_TYPE
>   ,a.pt_dt
>   ,count(*) ct
> from (
> select
> agent , ONLINE_SECONDS_TYPE ,pt_dt
> from zhangyc02.DM_CHL_REGUSER_1D_WIDETABLE_D 
> where pt_dt>='2017-12-01' and pt_dt<='2017-12-03' and agent in 
> (6,3)
> ) a
> left join zhangyc02.dim_res_info b
> on a.agent = b.channel_id
> group by a.agent,b.channel_name,a.ONLINE_SECONDS_TYPE,a.pt_dt
> order by agent, pt_dt , ONLINE_SECONDS_TYPE   ;
> This query returns an incorrect result.
> select
> a.agent
> ,b.channel_name agent_name
> ,a.ONLINE_SECONDS_TYPE
> ,pt_dt
> ,count(*) ct
> from zhangyc02.DM_CHL_REGUSER_1D_WIDETABLE_D a
> left join zhangyc02.dim_res_info b
> on a.agent = b.channel_id
> where pt_dt>='2017-12-01' and pt_dt<='2017-12-03' and a.agent in (6,3)
> group by a.agent,b.channel_name,a.ONLINE_SECONDS_TYPE,a.pt_dt
> order by agent, pt_dt , ONLINE_SECONDS_TYPE   ;
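> This query returns the correct result.
> Verification: I ran both SQL statements in Impala; their results are 
> identical and match the result of the second SQL statement in Kylin.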





[jira] [Resolved] (KYLIN-3021) Check MapReduce job failed reason and include the diagnostics into email notification

2018-12-24 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong resolved KYLIN-3021.
---
Resolution: Resolved

> Check MapReduce job failed reason and include the diagnostics into email 
> notification
> -
>
> Key: KYLIN-3021
> URL: https://issues.apache.org/jira/browse/KYLIN-3021
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> In the current kylin.log and the failed-job email notification, we do not have 
> detailed error info explaining why the MapReduce jobs failed. We just log "no 
> counters for job" or "Counters: 0".
>  
> 2017-08-03 18:24:10,197 WARN  [pool-10-thread-17] common.HadoopCmdOutput:90 : 
> no counters for job job_1497957612021_709431
>  
> 2017-08-03 15:08:02,351 DEBUG [pool-10-thread-3] common.HadoopCmdOutput:95 : 
> Counters: 0





[jira] [Commented] (KYLIN-3559) Use Splitter for splitting String

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728294#comment-16728294
 ] 

Zhong Yanghong commented on KYLIN-3559:
---

Hi [~skywind2006], thanks very much. I think you can mark it yourself :D

> Use Splitter for splitting String
> -
>
> Key: KYLIN-3559
> URL: https://issues.apache.org/jira/browse/KYLIN-3559
> Project: Kylin
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Wu Bin
>Priority: Major
> Fix For: v2.6.0
>
>
> See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
> preferred.
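
As a brief illustration of the kind of pitfall the linked bug pattern describes, the sketch below uses only the JDK. The class name `SplitPitfalls` and both helper methods are illustrative assumptions, not Kylin code. It shows that `String.split` silently drops trailing empty strings, and that a negative limit is one standard-library way to keep them; Guava's `Splitter`, which the issue recommends, makes that choice explicit instead.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: class and method names are assumptions, not Kylin code.
class SplitPitfalls {

    // Default split: trailing empty strings are silently discarded.
    static List<String> splitDefault(String input) {
        return Arrays.asList(input.split(","));
    }

    // A negative limit keeps every field, including trailing empty ones.
    static List<String> splitKeepAll(String input) {
        return Arrays.asList(input.split(",", -1));
    }

    public static void main(String[] args) {
        System.out.println(splitDefault("a,b,,"));   // [a, b]
        System.out.println(splitKeepAll("a,b,,"));   // [a, b, , ]
        // Splitting an empty string yields one empty element, not zero.
        System.out.println(splitDefault("").size()); // 1
    }
}
```

The two helpers differ only in the limit argument, which is easy to overlook at a call site; that is presumably part of why a dedicated splitter API is preferred.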





[jira] [Commented] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Pan, Julian (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728292#comment-16728292
 ] 

Pan, Julian commented on KYLIN-3738:


The bug is related to KYLIN-2243.
{code:java}
$scope.newMeasure.function.returntype=$scope.newMeasure.function.returntype.replace(/\,\d+/,'');
{code}
The code in addNewMeasure will change decimal(19,4) to decimal(19).
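
To make the effect concrete, here is a minimal Java sketch of the same substitution. It is a translation made under assumptions: `replaceFirst` stands in for the JavaScript `replace` with a non-global regex, and the class and method names are hypothetical.

```java
// Hypothetical helper mirroring the JavaScript replace(/\,\d+/, '') call.
class ReturnTypeRegex {

    // Removes the first ",<digits>" group, which is what drops the scale.
    static String stripScale(String returnType) {
        return returnType.replaceFirst(",\\d+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripScale("decimal(19,4)")); // decimal(19)
        System.out.println(stripScale("bigint"));        // bigint
    }
}
```

Types without a comma pass through unchanged, so only parameterized types such as decimal(19,4) are affected.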

> Edit cube measure may make the decimal type change unexpectly
> -
>
> Key: KYLIN-3738
> URL: https://issues.apache.org/jira/browse/KYLIN-3738
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v2.5.2
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
>
> When editing a cube's measure and clicking save, the original return type may be changed 
> from decimal(19,4) to decimal(19), which will make the cube build result 
> incorrect.





[jira] [Updated] (KYLIN-2924) Utilize error-prone to discover common coding mistakes

2018-12-24 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-2924:

Fix Version/s: (was: v2.6.0)

> Utilize error-prone to discover common coding mistakes
> --
>
> Key: KYLIN-2924
> URL: https://issues.apache.org/jira/browse/KYLIN-2924
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Billy Liu
>Priority: Major
>
> http://errorprone.info/ is a tool which detects common coding mistakes.
> We should incorporate it into the Kylin build.





[jira] [Commented] (KYLIN-3698) check-env.sh should print more details about checking items

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728279#comment-16728279
 ] 

Zhong Yanghong commented on KYLIN-3698:
---

Hi [~DDDQ], what's the status of this? Can we move this to the next release?

> check-env.sh should print more details about checking items
> ---
>
> Key: KYLIN-3698
> URL: https://issues.apache.org/jira/browse/KYLIN-3698
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v2.5.1
>Reporter: May Zhou
>Assignee: May Zhou
>Priority: Minor
> Fix For: v2.6.0
>
>
> In the current version, when users run _check-env.sh_, if there's no error 
> message, it means everything is OK.
> From my perspective, it would be better for check-env.sh to print more 
> details about the items it checks while it runs.





[jira] [Created] (KYLIN-3738) Edit cube measure may make the decimal type change unexpectly

2018-12-24 Thread Pan, Julian (JIRA)
Pan, Julian created KYLIN-3738:
--

 Summary: Edit cube measure may make the decimal type change 
unexpectly
 Key: KYLIN-3738
 URL: https://issues.apache.org/jira/browse/KYLIN-3738
 Project: Kylin
  Issue Type: Bug
  Components: Web 
Affects Versions: v2.5.2
Reporter: Pan, Julian
Assignee: Pan, Julian


When editing a cube's measure and clicking save, the original return type may be changed 
from decimal(19,4) to decimal(19), which will make the cube build result incorrect.





[jira] [Commented] (KYLIN-3021) Check MapReduce job failed reason and include the diagnostics into email notification

2018-12-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728288#comment-16728288
 ] 

ASF GitHub Bot commented on KYLIN-3021:
---

shaofengshi commented on pull request #413: KYLIN-3021 check MapReduce job 
failed reason and include the diagnost…
URL: https://github.com/apache/kylin/pull/413
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Check MapReduce job failed reason and include the diagnostics into email 
> notification
> -
>
> Key: KYLIN-3021
> URL: https://issues.apache.org/jira/browse/KYLIN-3021
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> In the current kylin.log and the failed-job email notification, we do not have 
> detailed error info explaining why the MapReduce jobs failed. We just log "no 
> counters for job" or "Counters: 0".
>  
> 2017-08-03 18:24:10,197 WARN  [pool-10-thread-17] common.HadoopCmdOutput:90 : 
> no counters for job job_1497957612021_709431
>  
> 2017-08-03 15:08:02,351 DEBUG [pool-10-thread-3] common.HadoopCmdOutput:95 : 
> Counters: 0





[GitHub] shaofengshi closed pull request #413: KYLIN-3021 check MapReduce job failed reason and include the diagnost…

2018-12-24 Thread GitBox
shaofengshi closed pull request #413: KYLIN-3021 check MapReduce job failed 
reason and include the diagnost…
URL: https://github.com/apache/kylin/pull/413
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/HadoopCmdOutput.java
 
b/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/HadoopCmdOutput.java
index d82b988665..df89ed8553 100644
--- 
a/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/HadoopCmdOutput.java
+++ 
b/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/HadoopCmdOutput.java
@@ -18,6 +18,7 @@
 
 package org.apache.kylin.engine.mr.common;
 
+import java.io.IOException;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.Map;
@@ -26,6 +27,8 @@
 import org.apache.hadoop.mapreduce.Counters;
 import org.apache.hadoop.mapreduce.FileSystemCounter;
 import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.JobStatus;
+import org.apache.hadoop.mapreduce.TaskCompletionEvent;
 import org.apache.hadoop.mapreduce.TaskCounter;
 import org.apache.kylin.common.KylinConfig;
 import 
org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.RawDataCounter;
@@ -92,29 +95,66 @@ public void updateJobCounter() {
 String errorMsg = "no counters for job " + getMrJobId();
 logger.warn(errorMsg);
 output.append(errorMsg);
-return;
+} else {
+this.output.append(counters.toString()).append("\n");
+logger.debug(counters.toString());
+
+mapInputRecords = 
String.valueOf(counters.findCounter(TaskCounter.MAP_INPUT_RECORDS).getValue());
+rawInputBytesRead = 
String.valueOf(counters.findCounter(RawDataCounter.BYTES).getValue());
+
+String outputFolder = 
job.getConfiguration().get("mapreduce.output.fileoutputformat.outputdir",
+
KylinConfig.getInstanceFromEnv().getHdfsWorkingDirectory());
+logger.debug("outputFolder is " + outputFolder);
+Path outputPath = new Path(outputFolder);
+String fsScheme = 
outputPath.getFileSystem(job.getConfiguration()).getScheme();
+long bytesWritten = counters.findCounter(fsScheme, 
FileSystemCounter.BYTES_WRITTEN).getValue();
+if (bytesWritten == 0) {
+logger.debug("Seems no counter found for " + fsScheme);
+bytesWritten = counters.findCounter("FileSystemCounters", 
"HDFS_BYTES_WRITTEN").getValue();
+}
+hdfsBytesWritten = String.valueOf(bytesWritten);
 }
-this.output.append(counters.toString()).append("\n");
-logger.debug(counters.toString());
-
-mapInputRecords = 
String.valueOf(counters.findCounter(TaskCounter.MAP_INPUT_RECORDS).getValue());
-rawInputBytesRead = 
String.valueOf(counters.findCounter(RawDataCounter.BYTES).getValue());
-
-String outputFolder = 
job.getConfiguration().get("mapreduce.output.fileoutputformat.outputdir", 
KylinConfig.getInstanceFromEnv().getHdfsWorkingDirectory());
-logger.debug("outputFolder is " + outputFolder);
-Path outputPath = new Path(outputFolder);
-String fsScheme = 
outputPath.getFileSystem(job.getConfiguration()).getScheme();
-long bytesWritten = counters.findCounter(fsScheme, 
FileSystemCounter.BYTES_WRITTEN).getValue();
-if (bytesWritten == 0) {
-logger.debug("Seems no counter found for " + fsScheme);
-bytesWritten = counters.findCounter("FileSystemCounters", 
"HDFS_BYTES_WRITTEN").getValue();
+JobStatus jobStatus = job.getStatus();
+if (jobStatus.getState() == JobStatus.State.FAILED) {
+logger.warn("Job Diagnostics:" + jobStatus.getFailureInfo());
+output.append("Job 
Diagnostics:").append(jobStatus.getFailureInfo()).append("\n");
+TaskCompletionEvent taskEvent = getOneTaskFailure(job);
+if (taskEvent != null) {
+String[] fails = 
job.getTaskDiagnostics(taskEvent.getTaskAttemptId());
+logger.warn("Failure task Diagnostics:");
+output.append("Failure task Diagnostics:").append("\n");
+for (String failure : fails) {
+logger.warn(failure);
+output.append(failure).append("\n");
+}
+}
 }
-hdfsBytesWritten = String.valueOf(bytesWritten);
-
 } catch (Exception e) {
 logger.error(e.getLocalizedMessage(), e);
 

[GitHub] shaofengshi commented on issue #382: Kylin-3654 New Kylin Streaming

2018-12-24 Thread GitBox
shaofengshi commented on issue #382: Kylin-3654 New Kylin Streaming
URL: https://github.com/apache/kylin/pull/382#issuecomment-449708009
 
 
   Staged in realtime-streaming branch; Close this PR now.




With regards,
Apache Git Services


[GitHub] shaofengshi closed pull request #386: Kylin on Druid blog

2018-12-24 Thread GitBox
shaofengshi closed pull request #386: Kylin on Druid blog
URL: https://github.com/apache/kylin/pull/386
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/website/_posts/blog/2018-12-12-why-did-meituan-develop-kylin-on-druid-part1-of-2.md
 
b/website/_posts/blog/2018-12-12-why-did-meituan-develop-kylin-on-druid-part1-of-2.md
new file mode 100644
index 00..11c7df5527
--- /dev/null
+++ 
b/website/_posts/blog/2018-12-12-why-did-meituan-develop-kylin-on-druid-part1-of-2.md
@@ -0,0 +1,186 @@
+---
+layout: post-blog
+title:  Why did Meituan develop Kylin On Druid (part 1 of 2)?
+date:   2018-12-12 17:30:00
+author: Xiaoxiang Yu
+categories: blog
+---
+
+## Preface
+
+In the Big Data field, Apache Kylin and Apache Druid(incubating) are two 
commonly adopted OLAP engines, both of which enable fast querying on huge 
datasets. In the enterprises that heavily rely on big data analytics, they 
often run both for different use cases.
+
+During the Apache Kylin Meetup in August 2018, the Meituan team shared their 
Kylin on Druid (KoD) solution. Why did they develop this hybrid system? What’s 
the rationale behind it? This article will answer these questions and help you 
to understand the differences and the pros and cons of each OLAP engine.
+
+## 01 Introduction to Apache Kylin 
+Apache Kylin is an open source distributed big data analytics engine. It 
constructs data models on top of huge datasets, builds pre-calculated Cubes to 
support multi-dimensional analysis, and provides a SQL query interface and 
multi-dimensional analysis on top of Hadoop, with general ODBC, JDBC, and 
RESTful API interfaces. Apache Kylin’s unique pre-calculation ability enables 
it to handle extremely large datasets with sub-second query response times.
+![](/images/blog/Kylin-On-Durid/1 kylin_architecture.png)
+Graphic  1 Kylin Architecture
+
+## 02 Apache Kylin’s Advantage
+1. The mature, Hadoop-based computing engines (MapReduce and Spark) that 
provide strong capability of pre-calculation on super large datasets, which can 
be deployed out-of-the-box on any mainstream Hadoop platform.
+2. Support of ANSI SQL that allows users to do data analysis with SQL directly.
+3. Sub-second, low-latency query response times.
+4. Common OLAP Star/Snowflake Schema data modeling.
+5. A rich OLAP function set including Sum, Count Distinct, Top N, Percentile, 
etc.
+6. Intelligent trimming of Cuboids that reduces consumption of storage and 
computing power.
+7. Direct integration with mainstream BI tools and rich interfaces.
+8. Support of both batch loading of super large historical datasets and 
micro-batches of data streams.
+
+## 03 Introduction to Apache Druid (incubating)
+Druid was created in 2012. It’s an open source distributed data store. Its 
core design combines the concept of analytical databases, time-series 
databases, and search systems, and it can support data collection and analytics 
on fairly large datasets. Druid uses an Apache V2 license and is an Apache 
incubator project.
+
+Druid Architecture
+From the perspective of deployment architectures, Druid’s processes mostly 
fall into 3 categories based on their roles.
+
+### •  Data Node (Slave node for data ingestion and calculation)
+The Historical node is in charge of loading segments (committed immutable 
data) and receiving queries on historical data.
+Middle Manager is in charge of data ingestion and commit segments. Each task 
is done by a separate JVM. 
+Peon is in charge of completing a single task, which is managed and monitored 
by the Middle Manager.
+
+### •  Query Node
+Broker receives query requests, determines on which segment the data resides, 
and distributes sub-queries and merges query results.
+
+### •  Master Node (Task Coordinator and Cluster Manager)
+Coordinator monitors Historical nodes, dispatches segments and monitor 
workload.
+Overlord monitors Middle Manager, dispatches tasks to Middle Manager, and 
assists releasing of segments.
+
+
+### External Dependency
+At the same time, Druid has 3 replaceable external dependencies.
+
+### •  Deep Storage (distributed storage)
+Druid uses Deep storage to transfer data files between nodes.
+ 
+### •  Metadata Storage
+Metadata Storage stores the metadata about segment positions and task output.
+
+### •  Zookeeper (cluster management and task coordination)
+Druid uses Zookeeper (ZK) to ensure consistency of the cluster status.
+![](/images/blog/Kylin-On-Durid/2 druid_architecture.png)
+Graphic 2 Druid Architecture
+
+## Data Source and Segment
+Druid stores data in Data Source. Data Source is equivalent to Table in RDBMS. 
Data Source is divided into multiple Chunks based on timestamps, and data 
within the same time range will be organized into the same Chunk. Each 

[jira] [Commented] (KYLIN-3559) Use Splitter for splitting String

2018-12-24 Thread Wu Bin (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728262#comment-16728262
 ] 

Wu Bin commented on KYLIN-3559:
---

[~yaho] I think it's resolved. Should I mark it or wait for the admin?

 

> Use Splitter for splitting String
> -
>
> Key: KYLIN-3559
> URL: https://issues.apache.org/jira/browse/KYLIN-3559
> Project: Kylin
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Wu Bin
>Priority: Major
> Fix For: v2.6.0
>
>
> See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
> preferred.





[jira] [Commented] (KYLIN-3597) Fix sonar reported static code issues

2018-12-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728271#comment-16728271
 ] 

ASF GitHub Bot commented on KYLIN-3597:
---

shaofengshi commented on pull request #401: KYLIN-3597 fix sonar issues
URL: https://github.com/apache/kylin/pull/401
 
 
   
 



> Fix sonar reported static code issues
> -
>
> Key: KYLIN-3597
> URL: https://issues.apache.org/jira/browse/KYLIN-3597
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Reporter: Shaofeng SHI
>Priority: Major
> Fix For: v2.6.0
>
>






[jira] [Commented] (KYLIN-3597) Fix sonar reported static code issues

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728272#comment-16728272
 ] 

ASF subversion and git services commented on KYLIN-3597:


Commit 3b9c5a55139ca85e60e45bb5c748f480178a93d5 in kylin's branch 
refs/heads/master from whuwb
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=3b9c5a5 ]

KYLIN-3597 fix sonar issues


> Fix sonar reported static code issues
> -
>
> Key: KYLIN-3597
> URL: https://issues.apache.org/jira/browse/KYLIN-3597
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Reporter: Shaofeng SHI
>Priority: Major
> Fix For: v2.6.0
>
>






[GitHub] shaofengshi closed pull request #401: KYLIN-3597 fix sonar issues

2018-12-24 Thread GitBox
shaofengshi closed pull request #401: KYLIN-3597 fix sonar issues
URL: https://github.com/apache/kylin/pull/401
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidStatsUtil.java
 
b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidStatsUtil.java
index def1e68116..dc3471b4b7 100644
--- 
a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidStatsUtil.java
+++ 
b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidStatsUtil.java
@@ -55,25 +55,26 @@
 
 for (Map.Entry hitFrequency : hitFrequencyMap.entrySet()) {
 long cuboid = hitFrequency.getKey();
-if (statistics.get(cuboid) != null) {
-continue;
-}
-if (rollingUpCountSourceMap.get(cuboid) == null || 
rollingUpCountSourceMap.get(cuboid).isEmpty()) {
-continue;
-}
-long totalEstScanCount = 0L;
-for (long estScanCount : 
rollingUpCountSourceMap.get(cuboid).values()) {
-totalEstScanCount += estScanCount;
-}
-totalEstScanCount /= rollingUpCountSourceMap.get(cuboid).size();
-if ((hitFrequency.getValue() * 1.0 / totalHitFrequency)
-* totalEstScanCount >= rollUpThresholdForMandatory) {
-mandatoryCuboidSet.add(cuboid);
+
+if (isCuboidMandatory(cuboid, statistics, 
rollingUpCountSourceMap)) {
+long totalEstScanCount = 0L;
+for (long estScanCount : 
rollingUpCountSourceMap.get(cuboid).values()) {
+totalEstScanCount += estScanCount;
+}
+totalEstScanCount /= 
rollingUpCountSourceMap.get(cuboid).size();
+if ((hitFrequency.getValue() * 1.0 / totalHitFrequency)
+* totalEstScanCount >= rollUpThresholdForMandatory) {
+mandatoryCuboidSet.add(cuboid);
+}
 }
 }
 return mandatoryCuboidSet;
 }
 
+private static boolean isCuboidMandatory(Long cuboid, Map 
statistics, Map> rollingUpCountSourceMap) {
+return !statistics.containsKey(cuboid) && 
rollingUpCountSourceMap.containsKey(cuboid) && 
!rollingUpCountSourceMap.get(cuboid).isEmpty();
+}
+
 /**
  * Complement row count for mandatory cuboids
  * with its best parent's row count
@@ -81,7 +82,7 @@
 public static void complementRowCountForMandatoryCuboids(Map 
statistics, long baseCuboid,
 Set mandatoryCuboidSet) {
 // Sort entries order by row count asc
-SortedSet> sortedStatsSet = new 
TreeSet>(
+SortedSet> sortedStatsSet = new TreeSet<>(
 new Comparator>() {
 public int compare(Map.Entry o1, 
Map.Entry o2) {
 return o1.getValue().compareTo(o2.getValue());
diff --git 
a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/greedy/GreedyAlgorithm.java
 
b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/greedy/GreedyAlgorithm.java
index 7f415de0bc..e8b0ae894a 100755
--- 
a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/greedy/GreedyAlgorithm.java
+++ 
b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/greedy/GreedyAlgorithm.java
@@ -110,12 +110,12 @@ public GreedyAlgorithm(final long timeout, BenefitPolicy 
benefitPolicy, CuboidSt
 
 List excluded = Lists.newArrayList(remaining);
 remaining.retainAll(selected);
-Preconditions.checkArgument(remaining.size() == 0,
+Preconditions.checkArgument(remaining.isEmpty(),
 "There should be no intersection between excluded list and 
selected list.");
 logger.info("Greedy Algorithm finished.");
 
 if (logger.isTraceEnabled()) {
-logger.trace("Excluded cuboidId size:" + excluded.size());
+logger.trace(String.format(Locale.ROOT, "Excluded cuboidId 
size:%d", excluded.size()));
 logger.trace("Excluded cuboidId detail:");
 for (Long cuboid : excluded) {
 logger.trace(String.format(Locale.ROOT, "cuboidId %d and Cost: 
%d and Space: %f", cuboid,
diff --git 
a/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/CubeScanRangePlanner.java
 
b/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/CubeScanRangePlanner.java
index 1a02e1aa3a..3095c8f708 100644
--- 
a/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/CubeScanRangePlanner.java
+++ 
b/core-storage/src/main/java/org/apache/kylin/storage/gtrecord/CubeScanRangePlanner.java
@@ -24,6 +24,7 @@
 import java.util.Collections;
 import java.util.Comparator;
 

[jira] [Commented] (KYLIN-3737) Refactor cache part for RDBMS

2018-12-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728266#comment-16728266
 ] 

ASF subversion and git services commented on KYLIN-3737:


Commit 5982bb79bacc5e0728df52a6e602239e1d2c6b26 in kylin's branch 
refs/heads/master from woyumen4597
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=5982bb7 ]

KYLIN-3737 refactor cache part for RDBMS


> Refactor cache part for RDBMS
> -
>
> Key: KYLIN-3737
> URL: https://issues.apache.org/jira/browse/KYLIN-3737
> Project: Kylin
>  Issue Type: Improvement
>  Components: RDBMS Source
>Affects Versions: v2.6.0
> Environment: MacOSx,JDK1.8
>Reporter: rongchuan.jin
>Assignee: rongchuan.jin
>Priority: Major
> Fix For: v2.6.0
>
>
> Currently, the Kylin cache for RDBMS performs poorly when loading many 
> tables with sql-case-sensitive: it takes a long time to load the 
> database, table, and column identifiers into the cache in order to fix the 
> SQL case-sensitivity problem for RDBMS. I found there is room to improve, so 
> I'd like to contribute a patch.





[jira] [Commented] (KYLIN-3737) Refactor cache part for RDBMS

2018-12-24 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728265#comment-16728265
 ] 

ASF GitHub Bot commented on KYLIN-3737:
---

shaofengshi commented on pull request #411: KYLIN-3737 refactor cache part for 
RDBMS
URL: https://github.com/apache/kylin/pull/411
 
 
   
 



> Refactor cache part for RDBMS
> -
>
> Key: KYLIN-3737
> URL: https://issues.apache.org/jira/browse/KYLIN-3737
> Project: Kylin
>  Issue Type: Improvement
>  Components: RDBMS Source
>Affects Versions: v2.6.0
> Environment: MacOSx,JDK1.8
>Reporter: rongchuan.jin
>Assignee: rongchuan.jin
>Priority: Major
> Fix For: v2.6.0
>
>
> Currently, the Kylin cache for RDBMS performs poorly when loading many 
> tables with sql-case-sensitive: it takes a long time to load the 
> database, table, and column identifiers into the cache in order to fix the 
> SQL case-sensitivity problem for RDBMS. I found there is room to improve, so 
> I'd like to contribute a patch.





[GitHub] shaofengshi closed pull request #411: KYLIN-3737 refactor cache part for RDBMS

2018-12-24 Thread GitBox
shaofengshi closed pull request #411: KYLIN-3737 refactor cache part for RDBMS
URL: https://github.com/apache/kylin/pull/411
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/AbstractJdbcAdaptor.java
 
b/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/AbstractJdbcAdaptor.java
index 3e36faedf3..3a66499b33 100644
--- 
a/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/AbstractJdbcAdaptor.java
+++ 
b/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/AbstractJdbcAdaptor.java
@@ -53,11 +53,11 @@
 protected final DataSourceDef dataSourceDef;
 protected SqlConverter.IConfigurer configurer;
 protected final Cache> columnsCache = 
CacheBuilder.newBuilder()
-.expireAfterWrite(1, TimeUnit.DAYS).maximumSize(30).build();
+.expireAfterWrite(1, TimeUnit.DAYS).maximumSize(4096).build();
 protected final Cache> databasesCache = 
CacheBuilder.newBuilder()
-.expireAfterWrite(1, TimeUnit.DAYS).maximumSize(30).build();
+.expireAfterWrite(1, TimeUnit.DAYS).maximumSize(4096).build();
 protected final Cache> tablesCache = 
CacheBuilder.newBuilder()
-.expireAfterWrite(1, TimeUnit.DAYS).maximumSize(30).build();
+.expireAfterWrite(1, TimeUnit.DAYS).maximumSize(4096).build();
 
 private static Joiner joiner = Joiner.on("_");
 
@@ -308,7 +308,7 @@ public String getDataSourceId() {
  */
 public List listDatabasesWithCache(boolean init) throws 
SQLException {
 if (configurer.enableCache()) {
-String cacheKey = config.datasourceId + config.url + "_databases";
+String cacheKey = joiner.join(config.datasourceId, config.url, 
"databases");
 List cachedDatabases;
 if (init || (cachedDatabases = 
databasesCache.getIfPresent(cacheKey)) == null) {
 cachedDatabases = listDatabases();
@@ -429,7 +429,7 @@ public String getDataSourceId() {
  */
 public List listColumnsWithCache(String database, String 
tableName, boolean init) throws SQLException {
 if (configurer.enableCache()) {
-String cacheKey = config.datasourceId + config.url + "_" + 
tableName + "_columns";
+String cacheKey = joiner.join(config.datasourceId, config.url, 
database, tableName, "columns");
 List cachedColumns;
 if (init || (cachedColumns = columnsCache.getIfPresent(cacheKey)) 
== null) {
 cachedColumns = listColumns(database, tableName);
diff --git 
a/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/DefaultAdaptor.java
 
b/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/DefaultAdaptor.java
index 66c45e1dcf..da24e9831b 100644
--- 
a/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/DefaultAdaptor.java
+++ 
b/datasource-sdk/src/main/java/org/apache/kylin/sdk/datasource/adaptor/DefaultAdaptor.java
@@ -28,6 +28,7 @@
 import java.util.Map;
 import javax.sql.rowset.CachedRowSet;
 
+import com.google.common.base.Joiner;
 import org.apache.commons.lang.StringUtils;
 
 /**
@@ -36,9 +37,7 @@
  */
 public class DefaultAdaptor extends AbstractJdbcAdaptor {
 
-protected static final String QUOTE_REG_LFT = "[`\"\\[]*";
-protected static final String QUOTE_REG_RHT = "[`\"\\]]*";
-private final static String [] POSSIBLE_TALBE_END= {",", " ", ")", "\r", 
"\n", "."};
+private static Joiner joiner = Joiner.on('_');
 
 public DefaultAdaptor(AdaptorConfig config) throws Exception {
 super(config);
@@ -140,19 +139,6 @@ public String fixSql(String sql) {
 return sql;
 }
 
-private boolean checkSqlContainstable(String orig, String table) {
-// ensure table is single match(e.g match account but not match 
accountant)
-if (orig.endsWith(table.toUpperCase(Locale.ROOT))) {
-return true;
-}
-for (String end:POSSIBLE_TALBE_END) {
-if (orig.contains(table.toUpperCase(Locale.ROOT) + end)){
-return true;
-}
-}
-return false;
-}
-
 /**
  * By default, use schema as database of kylin.
  * @return
@@ -270,26 +256,100 @@ public CachedRowSet getTableColumns(String schema, 
String table) throws SQLExcep
 public String fixIdentifierCaseSensitve(String identifier) {
 try {
 List databases = listDatabasesWithCache();
-for (String db : databases) {
-if (db.equalsIgnoreCase(identifier)) {
-return db;
+for (String database : databases) {
+if (identifier.equalsIgnoreCase(database)) {
+  
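
One reading of the cache-key change in `AbstractJdbcAdaptor` above is that the old column key omitted the database name, so two tables with the same name in different databases would share a cache entry. The sketch below illustrates that reading with plain JDK string handling. `String.join` stands in for Guava's `Joiner`, and the class name, method names, and sample values are all hypothetical.

```java
// Illustrative sketch of the two key shapes; not the Kylin implementation.
class CacheKeyDemo {

    // Shape of the old key: the database argument is deliberately ignored,
    // mirroring the concatenation removed by the diff.
    static String oldColumnsKey(String datasourceId, String url, String database, String table) {
        return datasourceId + url + "_" + table + "_columns";
    }

    // Shape of the new key: the database is part of the key.
    static String newColumnsKey(String datasourceId, String url, String database, String table) {
        return String.join("_", datasourceId, url, database, table, "columns");
    }

    public static void main(String[] args) {
        String id = "ds1";                       // sample value, made up
        String url = "jdbc:mysql://example/";    // sample value, made up

        // Same table name in two databases: the old keys collide.
        System.out.println(oldColumnsKey(id, url, "db_a", "orders")
                .equals(oldColumnsKey(id, url, "db_b", "orders"))); // true

        // With the database in the key, the entries stay distinct.
        System.out.println(newColumnsKey(id, url, "db_a", "orders")
                .equals(newColumnsKey(id, url, "db_b", "orders"))); // false
    }
}
```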

[jira] [Resolved] (KYLIN-3724) Kylin IT test sql is unreasonable

2018-12-24 Thread XiaoXiang Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiaoXiang Yu resolved KYLIN-3724.
-
Resolution: Fixed

> Kylin IT test sql is unreasonable
> -
>
> Key: KYLIN-3724
> URL: https://issues.apache.org/jira/browse/KYLIN-3724
> Project: Kylin
>  Issue Type: Bug
>Reporter: XiaoXiang Yu
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: image-2018-12-17-16-55-13-816.png, 
> image-2018-12-17-16-58-28-349.png
>
>
> In *kylin-it*, we use the queries under the 
> _sql_distinct_precisely_ folder to test *COUNT_DISTINCT(Bitmap)*. 
> But we find that query04 uses a COUNT_DISTINCT(HLL) in its having condition, which 
> is unreasonable and can cause some data reduction. I think it may be 
> causing some unpredictable test failures.
>  
> {quote}select test_cal_dt.cal_dt,sum(test_kylin_fact.price) as GMV
>  , count(1) as TRANS_CNT
>  , count(distinct TEST_COUNT_DISTINCT_BITMAP) as user_count
>  , count(distinct site_name) as site_count
>  from test_kylin_fact
>  inner JOIN edw.test_cal_dt as test_cal_dt
>  ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
>  inner JOIN test_category_groupings
>  on test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id and
>  test_kylin_fact.lstg_site_id = test_category_groupings.site_id
>  inner JOIN edw.test_sites as test_sites
>  on test_kylin_fact.lstg_site_id = test_sites.site_id
>  inner JOIN edw.test_seller_type_dim as test_seller_type_dim
>  on test_kylin_fact.slr_segment_cd = test_seller_type_dim.seller_type_cd
>  where test_kylin_fact.lstg_format_name='FP-GTC'
>  and test_cal_dt.cal_dt between DATE '2013-05-01' and DATE '2013-08-01'
>  group by test_cal_dt.cal_dt
>  having count(distinct seller_id) > 2
> {quote}
>  
>  
>  
> !image-2018-12-17-16-58-28-349.png!  
>  On our Jenkins server, we sometimes get a build failure, but when I rerun it 
> without modifying the code, the CI test passes.
> !image-2018-12-17-16-55-13-816.png!
>  
>  





[jira] [Updated] (KYLIN-3411) kylin scan different in same sql

2018-12-24 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-3411:
--
Issue Type: Improvement  (was: Bug)

> kylin scan different in same sql
> 
>
> Key: KYLIN-3411
> URL: https://issues.apache.org/jira/browse/KYLIN-3411
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Affects Versions: v2.3.1
>Reporter: Lemont
>Assignee: Zhong Yanghong
>Priority: Minor
> Fix For: v2.6.0
>
>
> There are two SQL queries:
> select sum(value) from test where time > 1524326400 group by id
> and
> select sum(value) from test where time > (1524931200-7*86400) group by id
> As we can see, 1524326400 = (1524931200-7*86400),
> but the second query is slower than the first.
> Cuboid Ids: [3904]
> Total scan count: 1157959
> Total scan bytes: 265530668
> Result row count: 34991
> Cuboid Ids: [3904]
> Total scan count: 611795
> Total scan bytes: 140681855
> Result row count: 34991
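
The equality claimed in the report can be checked with plain arithmetic: 7*86400 = 604800, and 1524931200 - 604800 = 1524326400. As a minimal sketch (not a Kylin API; the class and method names here are hypothetical), a client could evaluate the constant expression itself and send the query with a literal bound, which is the form the reporter found to be faster:

```java
final class ConstantFoldingCheck {
    /** Evaluates the constant expression from the second query on the client side. */
    static long foldedLowerBound() {
        return 1524931200L - 7L * 86400L;
    }

    /** Builds the query text with the pre-computed literal instead of the expression. */
    static String queryWithLiteralBound() {
        return "select sum(value) from test where time > " + foldedLowerBound() + " group by id";
    }

    public static void main(String[] args) {
        // Confirms that both filters in the report describe the same bound.
        System.out.println(foldedLowerBound() == 1524326400L);
        System.out.println(queryWithLiteralBound());
    }
}
```

This only works around the difference from the client side; it does not explain why the two forms produce different scan counts inside the query engine.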





[jira] [Commented] (KYLIN-3411) kylin scan different in same sql

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728260#comment-16728260
 ] 

Zhong Yanghong commented on KYLIN-3411:
---

Good catch.

> kylin scan different in same sql
> 
>
> Key: KYLIN-3411
> URL: https://issues.apache.org/jira/browse/KYLIN-3411
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.3.1
>Reporter: Lemont
>Priority: Minor
> Fix For: v2.6.0
>
>
> There are two SQL queries:
> select sum(value) from test where time > 1524326400 group by id
> and
> select sum(value) from test where time > (1524931200-7*86400) group by id
> As we can see, 1524326400 = (1524931200-7*86400),
> but the second query is slower than the first.
> Cuboid Ids: [3904]
> Total scan count: 1157959
> Total scan bytes: 265530668
> Result row count: 34991
> Cuboid Ids: [3904]
> Total scan count: 611795
> Total scan bytes: 140681855
> Result row count: 34991





[jira] [Assigned] (KYLIN-3411) kylin scan different in same sql

2018-12-24 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-3411:
-

Assignee: Zhong Yanghong

> kylin scan different in same sql
> 
>
> Key: KYLIN-3411
> URL: https://issues.apache.org/jira/browse/KYLIN-3411
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.3.1
>Reporter: Lemont
>Assignee: Zhong Yanghong
>Priority: Minor
> Fix For: v2.6.0
>
>
> There are two SQL queries:
> select sum(value) from test where time > 1524326400 group by id
> and
> select sum(value) from test where time > (1524931200-7*86400) group by id
> As we can see, 1524326400 = (1524931200-7*86400),
> but the second query is slower than the first.
> Cuboid Ids: [3904]
> Total scan count: 1157959
> Total scan bytes: 265530668
> Result row count: 34991
> Cuboid Ids: [3904]
> Total scan count: 611795
> Total scan bytes: 140681855
> Result row count: 34991





[jira] [Commented] (KYLIN-3724) Kylin IT test sql is unreasonable

2018-12-24 Thread XiaoXiang Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728253#comment-16728253
 ] 

XiaoXiang Yu commented on KYLIN-3724:
-

[~yaho], I have marked this as Resolved.

> Kylin IT test sql is unreasonable
> -
>
> Key: KYLIN-3724
> URL: https://issues.apache.org/jira/browse/KYLIN-3724
> Project: Kylin
>  Issue Type: Bug
>Reporter: XiaoXiang Yu
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: image-2018-12-17-16-55-13-816.png, 
> image-2018-12-17-16-58-28-349.png
>
>
> In *kylin-it*, we use the queries under the +_sql_distinct_precisely_+ folder 
> to test +*COUNT_DISTINCT(Bitmap)*+. However, query04 uses a 
> COUNT_DISTINCT(HLL) measure in its HAVING condition. This is unreasonable and 
> can cause some data reduction, and I think it may be causing some 
> unpredictable test failures.
>  
> {quote}select test_cal_dt.cal_dt,sum(test_kylin_fact.price) as GMV
>  , count(1) as TRANS_CNT
>  , count(distinct TEST_COUNT_DISTINCT_BITMAP) as user_count
>  , count(distinct site_name) as site_count
>  from test_kylin_fact
>  inner JOIN edw.test_cal_dt as test_cal_dt
>  ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
>  inner JOIN test_category_groupings
>  on test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id and
>  test_kylin_fact.lstg_site_id = test_category_groupings.site_id
>  inner JOIN edw.test_sites as test_sites
>  on test_kylin_fact.lstg_site_id = test_sites.site_id
>  inner JOIN edw.test_seller_type_dim as test_seller_type_dim
>  on test_kylin_fact.slr_segment_cd = test_seller_type_dim.seller_type_cd
>  where test_kylin_fact.lstg_format_name='FP-GTC'
>  and test_cal_dt.cal_dt between DATE '2013-05-01' and DATE '2013-08-01'
>  group by test_cal_dt.cal_dt
>  having count(distinct seller_id) > 2
> {quote}
>  
>  
>  
> !image-2018-12-17-16-58-28-349.png!  
>  On our Jenkins server, we sometimes get a build failure, but when I rerun it 
> without modifying the code, the CI test passes.
> !image-2018-12-17-16-55-13-816.png!
>  
>  





[jira] [Commented] (KYLIN-2924) Utilize error-prone to discover common coding mistakes

2018-12-24 Thread Billy Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728249#comment-16728249
 ] 

Billy Liu commented on KYLIN-2924:
--

[~yaho] I think [~Shaofengshi] has disabled this feature to avoid excessive 
logging.

> Utilize error-prone to discover common coding mistakes
> --
>
> Key: KYLIN-2924
> URL: https://issues.apache.org/jira/browse/KYLIN-2924
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Billy Liu
>Priority: Major
> Fix For: v2.6.0
>
>
> http://errorprone.info/ is a tool which detects common coding mistakes.
> We should incorporate it into the Kylin build.





[jira] [Commented] (KYLIN-3724) Kylin IT test sql is unreasonable

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728252#comment-16728252
 ] 

Zhong Yanghong commented on KYLIN-3724:
---

Hi [~hit_lacus], should we mark this as Resolved?

> Kylin IT test sql is unreasonable
> -
>
> Key: KYLIN-3724
> URL: https://issues.apache.org/jira/browse/KYLIN-3724
> Project: Kylin
>  Issue Type: Bug
>Reporter: XiaoXiang Yu
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: image-2018-12-17-16-55-13-816.png, 
> image-2018-12-17-16-58-28-349.png
>
>
> In *kylin-it*, we use the queries under the +_sql_distinct_precisely_+ folder 
> to test +*COUNT_DISTINCT(Bitmap)*+. However, query04 uses a 
> COUNT_DISTINCT(HLL) measure in its HAVING condition. This is unreasonable and 
> can cause some data reduction, and I think it may be causing some 
> unpredictable test failures.
>  
> {quote}select test_cal_dt.cal_dt,sum(test_kylin_fact.price) as GMV
>  , count(1) as TRANS_CNT
>  , count(distinct TEST_COUNT_DISTINCT_BITMAP) as user_count
>  , count(distinct site_name) as site_count
>  from test_kylin_fact
>  inner JOIN edw.test_cal_dt as test_cal_dt
>  ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
>  inner JOIN test_category_groupings
>  on test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id and
>  test_kylin_fact.lstg_site_id = test_category_groupings.site_id
>  inner JOIN edw.test_sites as test_sites
>  on test_kylin_fact.lstg_site_id = test_sites.site_id
>  inner JOIN edw.test_seller_type_dim as test_seller_type_dim
>  on test_kylin_fact.slr_segment_cd = test_seller_type_dim.seller_type_cd
>  where test_kylin_fact.lstg_format_name='FP-GTC'
>  and test_cal_dt.cal_dt between DATE '2013-05-01' and DATE '2013-08-01'
>  group by test_cal_dt.cal_dt
>  having count(distinct seller_id) > 2
> {quote}
>  
>  
>  
> !image-2018-12-17-16-58-28-349.png!  
>  On our Jenkins server, we sometimes get a build failure, but when I rerun it 
> without modifying the code, the CI test passes.
> !image-2018-12-17-16-55-13-816.png!
>  
>  





[jira] [Commented] (KYLIN-3597) Fix sonar reported static code issues

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728242#comment-16728242
 ] 

Zhong Yanghong commented on KYLIN-3597:
---

Hi [~Shaofengshi], what's the status of this? Can it be marked as resolved?

> Fix sonar reported static code issues
> -
>
> Key: KYLIN-3597
> URL: https://issues.apache.org/jira/browse/KYLIN-3597
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Reporter: Shaofeng SHI
>Priority: Major
> Fix For: v2.6.0
>
>






[jira] [Commented] (KYLIN-3658) The keywords of Hive are not supported By Kylin

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728250#comment-16728250
 ] 

Zhong Yanghong commented on KYLIN-3658:
---

Hi [~zhixin], what's the status of this issue? Should we move this to next 
release?

> The keywords of Hive are not supported By Kylin
> ---
>
> Key: KYLIN-3658
> URL: https://issues.apache.org/jira/browse/KYLIN-3658
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.5.0
>Reporter: liuzhixin
>Priority: Major
> Fix For: v2.6.0
>
>
> Hive 2.x is strict about SQL keywords: when used as identifiers, they must be 
> enclosed in backquotes, e.g. `date`, `timestamp` ...
> When Kylin accesses Hive, the generated SQL statement does not enclose these 
> keywords in backquotes, which causes problems.
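
The quoting the reporter asks for can be sketched as follows. This is an illustration only, not Kylin's actual SQL generation code: the class `HiveIdentifierQuoting` and its methods are hypothetical, and the sketch assumes every column and table name is quoted unconditionally (and that an embedded backquote is escaped by doubling it) rather than checking names against a keyword list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class HiveIdentifierQuoting {
    /** Hypothetical helper: wraps one identifier in backquotes, doubling any embedded backquote. */
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    /** Builds a simple SELECT in which every column and the (unqualified) table name are quoted. */
    static String buildSelect(String table, List<String> columns) {
        String cols = columns.stream()
                .map(HiveIdentifierQuoting::quote)
                .collect(Collectors.joining(", "));
        return "SELECT " + cols + " FROM " + quote(table);
    }

    public static void main(String[] args) {
        // "date" and "timestamp" are the keyword examples given in the issue.
        System.out.println(buildSelect("test_kylin_fact", Arrays.asList("date", "timestamp", "price")));
    }
}
```

Quoting every identifier avoids maintaining a list of reserved words per Hive version, at the cost of slightly noisier generated SQL. A qualified name such as `db.table` would need each part quoted separately, which this sketch does not handle.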





[jira] [Commented] (KYLIN-3559) Use Splitter for splitting String

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728240#comment-16728240
 ] 

Zhong Yanghong commented on KYLIN-3559:
---

Hi [~skywind2006], what's the status of this?

> Use Splitter for splitting String
> -
>
> Key: KYLIN-3559
> URL: https://issues.apache.org/jira/browse/KYLIN-3559
> Project: Kylin
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Wu Bin
>Priority: Major
> Fix For: v2.6.0
>
>
> See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
> preferred.
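
For context, the behavior that motivates this task can be reproduced with the JDK alone. The sketch below (class and method names are hypothetical) shows that `String.split` silently drops trailing empty strings but keeps leading ones, and that passing a negative limit keeps all fields. The linked page recommends Guava's `Splitter` as the replacement; it is not used here so that the example has no dependencies.

```java
import java.util.Arrays;

final class SplitPitfalls {
    /** Plain String.split: trailing empty strings are removed from the result. */
    static String[] naive(String s) {
        return s.split(",");
    }

    /** String.split with a negative limit: every field is kept, including trailing empties. */
    static String[] keepAll(String s) {
        return s.split(",", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(naive("a,b,,")));   // trailing empty fields are dropped
        System.out.println(Arrays.toString(naive(",a,b")));    // the leading empty field is kept
        System.out.println(naive("").length);                  // an empty input still yields one element
        System.out.println(Arrays.toString(keepAll("a,b,,"))); // all four fields are kept
    }
}
```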





[jira] [Commented] (KYLIN-3628) The wrong result when a query with one lookup table

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728247#comment-16728247
 ] 

Zhong Yanghong commented on KYLIN-3628:
---

Hi [~Na Zhai], could you explain your solution for this issue in more detail? 
And what's the status of the PR now?

> The wrong result when a query with one lookup table
> ---
>
> Key: KYLIN-3628
> URL: https://issues.apache.org/jira/browse/KYLIN-3628
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Na Zhai
>Assignee: Na Zhai
>Priority: Major
> Fix For: v2.6.0
>
>
> Two cubes use the same lookup table, and then the lookup table data in Hive 
> changes. One of Kylin's cubes builds the new data, and the other doesn't. 
> When Kylin queries the lookup table, the data gets confused.





[jira] [Updated] (KYLIN-1295) Add new document to describe key concepts like model/cube/segment etc

2018-12-24 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-1295:

Fix Version/s: (was: v2.6.0)

Move to future release.

> Add new document to describe key concepts like model/cube/segment etc
> -
>
> Key: KYLIN-1295
> URL: https://issues.apache.org/jira/browse/KYLIN-1295
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: liyang
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3571) Not build Spark in Kylin's binary package

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728241#comment-16728241
 ] 

Zhong Yanghong commented on KYLIN-3571:
---

Hi [~Wayne0101], what's the status of this issue?

> Not build Spark in Kylin's binary package
> -
>
> Key: KYLIN-3571
> URL: https://issues.apache.org/jira/browse/KYLIN-3571
> Project: Kylin
>  Issue Type: Improvement
>  Components: Environment 
>Reporter: Shaofeng SHI
>Assignee: Chao Long
>Priority: Major
> Fix For: v2.6.0
>
>






[jira] [Comment Edited] (KYLIN-3559) Use Splitter for splitting String

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728240#comment-16728240
 ] 

Zhong Yanghong edited comment on KYLIN-3559 at 12/24/18 8:20 AM:
-

Hi [~skywind2006], what's the status of this? And can it be marked as Resolved?


was (Author: yaho):
Hi [~skywind2006], what's the status of this?

> Use Splitter for splitting String
> -
>
> Key: KYLIN-3559
> URL: https://issues.apache.org/jira/browse/KYLIN-3559
> Project: Kylin
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Wu Bin
>Priority: Major
> Fix For: v2.6.0
>
>
> See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
> preferred.





[GitHub] woyumen4597 commented on issue #411: KYLIN-3737 refactor cache part for RDBMS

2018-12-24 Thread GitBox
woyumen4597 commented on issue #411: KYLIN-3737 refactor cache part for RDBMS
URL: https://github.com/apache/kylin/pull/411#issuecomment-449701482
 
 
   Local CI has passed.
   
![image](https://user-images.githubusercontent.com/24585832/50394312-a74fed00-0797-11e9-935e-dca7fe728b50.png)
   




[jira] [Commented] (KYLIN-3310) Use lint for maven-compiler-plugin

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728236#comment-16728236
 ] 

Zhong Yanghong commented on KYLIN-3310:
---

Hi [~Aron.tao], what's the status of this?

> Use lint for maven-compiler-plugin
> --
>
> Key: KYLIN-3310
> URL: https://issues.apache.org/jira/browse/KYLIN-3310
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Reporter: Ted Yu
>Assignee: Jiatao Tao
>Priority: Major
> Fix For: v2.6.0
>
>
> lint helps identify structural problems.
> We should enable lint for maven-compiler-plugin
> {code}
>   maven-compiler-plugin
>   ${maven-compiler-plugin.version}
>   
> 1.8
> 1.8
> 
>   -Xlint:all
>   ${compiler.error.flag}
>   
>   -Xlint:-options
>   
>   -Xlint:-cast
>   -Xlint:-deprecation
>   -Xlint:-processing
>   -Xlint:-rawtypes
>   -Xlint:-serial
>   -Xlint:-try
>   -Xlint:-unchecked
>   -Xlint:-varargs
>   
>   
>   
> 
> true
> 
> false
>   
> {code}





[jira] [Commented] (KYLIN-3093) Upgrade curator to 2.12

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728234#comment-16728234
 ] 

Zhong Yanghong commented on KYLIN-3093:
---

Hi [~Shaofengshi], what's the status of this issue? Will it be fixed in the future?

> Upgrade curator to 2.12
> ---
>
> Key: KYLIN-3093
> URL: https://issues.apache.org/jira/browse/KYLIN-3093
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Reporter: Ted Yu
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.6.0
>
>
> curator-2.10.0 has several bug fixes over the current version (2.7.1); 
> updating would help improve stability.





[jira] [Resolved] (KYLIN-2973) Potential issue of not atomically update cube instance map

2018-12-24 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong resolved KYLIN-2973.
---
Resolution: Fixed

> Potential issue of not atomically update cube instance map
> --
>
> Key: KYLIN-2973
> URL: https://issues.apache.org/jira/browse/KYLIN-2973
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> P1
> {code}
> try {
>     getStore().putResource(cube.getResourcePath(), cube, CUBE_SERIALIZER);
> } catch (IllegalStateException ise) {
>     logger.warn("Write conflict to update cube " + cube.getName() + " at try " + retry + ", will retry...");
>     if (retry >= 7) {
>         logger.error("Retried 7 times till got error, abandoning...", ise);
>         throw ise;
>     }
>     cube = reloadCubeLocal(cube.getName());
>     update.setCubeInstance(cube);
>     retry++;
>     cube = updateCubeWithRetry(update, retry);
> }
> {code}
> P2
> {code}
> if (toRemoveResources.size() > 0) {
>     for (String resource : toRemoveResources) {
>         try {
>             getStore().deleteResource(resource);
>         } catch (IOException ioe) {
>             logger.error("Failed to delete resource " + toRemoveResources.toString());
>         }
>     }
> }
> {code}
> P3
> {code}
> cubeMap.put(cube.getName(), cube);
> {code}
> The following interleaving is possible:
> # Thread t1 enters P2;
> # Then thread t2 goes through P1, P2 and P3; the cube instance in the map is 
> updated by t2
> # Then thread t1 enters P3; the cube instance in the map is overwritten by t1 
> with its older instance, which is not correct
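
One way to make the final map update (P3) robust against this interleaving is to publish an instance only if it is at least as new as the cached one. The sketch below is a simplified illustration, not the fix that was applied in Kylin: the `Cube` class, its `version` field, and the `publish` method are hypothetical stand-ins, and it assumes that each successful store write produces a monotonically increasing version.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class CubeCacheSketch {
    /** Hypothetical stand-in for a cube instance: a name plus the version written to the store. */
    static final class Cube {
        final String name;
        final long version;

        Cube(String name, long version) {
            this.name = name;
            this.version = version;
        }
    }

    private final ConcurrentMap<String, Cube> cubeMap = new ConcurrentHashMap<>();

    /** Atomically caches the candidate unless a newer instance is already cached. */
    Cube publish(Cube candidate) {
        return cubeMap.merge(candidate.name, candidate,
                (cached, incoming) -> incoming.version >= cached.version ? incoming : cached);
    }

    Cube get(String name) {
        return cubeMap.get(name);
    }

    public static void main(String[] args) {
        CubeCacheSketch cache = new CubeCacheSketch();
        Cube fromT1 = new Cube("sales", 1); // t1 wrote to the store first
        Cube fromT2 = new Cube("sales", 2); // t2 wrote to the store afterwards
        cache.publish(fromT2);              // t2 reaches P3 first
        cache.publish(fromT1);              // t1 reaches P3 late with its older instance
        System.out.println(cache.get("sales").version); // the newer instance from t2 is still cached
    }
}
```

`ConcurrentMap.merge` performs the compare-and-replace as a single atomic step per key, so a late writer cannot overwrite a newer entry. An alternative with the same effect would be to hold one lock per cube across P1 to P3.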





[jira] [Commented] (KYLIN-2924) Utilize error-prone to discover common coding mistakes

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728231#comment-16728231
 ] 

Zhong Yanghong commented on KYLIN-2924:
---

Hi [~yimingliu], what's the status of this?

> Utilize error-prone to discover common coding mistakes
> --
>
> Key: KYLIN-2924
> URL: https://issues.apache.org/jira/browse/KYLIN-2924
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Billy Liu
>Priority: Major
> Fix For: v2.6.0
>
>
> http://errorprone.info/ is a tool which detects common coding mistakes.
> We should incorporate it into the Kylin build.





[jira] [Commented] (KYLIN-1577) make kylin metadata store support multiple replication

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728224#comment-16728224
 ] 

Zhong Yanghong commented on KYLIN-1577:
---

Hi [~liyang.g...@gmail.com] and [~Shaofengshi], what's the status of this patch?

> make kylin metadata store support multiple replication
> --
>
> Key: KYLIN-1577
> URL: https://issues.apache.org/jira/browse/KYLIN-1577
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: KYLIN-1577.patch
>
>






[jira] [Comment Edited] (KYLIN-1295) Add new document to describe key concepts like model/cube/segment etc

2018-12-24 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728220#comment-16728220
 ] 

Zhong Yanghong edited comment on KYLIN-1295 at 12/24/18 7:59 AM:
-

Hi [~Shaofengshi], how is this progressing? And is it feasible to finish this by 
v2.6.0?


was (Author: yaho):
Hi [~Shaofengshi], how is this progressing?

> Add new document to describe key concepts like model/cube/segment etc
> -
>
> Key: KYLIN-1295
> URL: https://issues.apache.org/jira/browse/KYLIN-1295
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: liyang
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.6.0
>
>



