[jira] [Updated] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2604: -- Attachment: (was: KYLIN-2604.patch) > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Attachments: KYLIN-2604.patch > > > we should use global dict as the default encoding for precise distinct count > in web, which more easy-to-use for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen updated KYLIN-2604:
    Attachment: KYLIN-2604.patch

Updated the patch: AdvancedDict is now regenerated whenever the measures change.

> Use global dict as the default encoding for precise distinct count in web
>
> Key: KYLIN-2604
> URL: https://issues.apache.org/jira/browse/KYLIN-2604
> Project: Kylin
> Issue Type: Improvement
> Components: Web
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2604.patch
>
> We should use global dict as the default encoding for precise distinct count
> in web, which is easier for users.
[jira] [Issue Comment Deleted] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2604: -- Comment: was deleted (was: Update the patch.) > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Attachments: KYLIN-2604.patch > > > we should use global dict as the default encoding for precise distinct count > in web, which more easy-to-use for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-1926) Loosen the constraint on FK-PK data type matching
[ https://issues.apache.org/jira/browse/KYLIN-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094130#comment-16094130 ]

kangkaisen commented on KYLIN-1926:

Yes, I know. What I don't understand is why compatible data types (int and tinyint) result in a wrong execution plan.

> Loosen the constraint on FK-PK data type matching
>
> Key: KYLIN-1926
> URL: https://issues.apache.org/jira/browse/KYLIN-1926
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: all
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Priority: Minor
> Fix For: v1.5.4
> Attachments: 0001-KYLIN-1926-FK-PK-data-type-matching.patch
>
> If the lookup table's PK data type isn't equal to the fact table's FK data type, Kylin
> reports an error saying "Primary key are not consistent with Foreign key".
> This constraint is too strong. Users should be allowed to disable this check.
[jira] [Commented] (KYLIN-2653) Spark cubing support HBase cluster with kerberos
[ https://issues.apache.org/jira/browse/KYLIN-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097154#comment-16097154 ]

kangkaisen commented on KYLIN-2653:

liyang, thanks for your review. I agree with you.

1. There is no doubt that {{KylinConfig}} and {{KylinConfigBase}} should be read-only. I have rolled back the signatures of {{getAllProperties}} and {{reloadKylinConfig}}, but I changed {{getAllProperties}} in {{KylinConfigExt}} to public. I think this is reasonable because we need at least one way to get all the properties, and this operation is read-only. What do you think?
2. This commit doesn't invoke {{KylinConfigBase.setProperty()}}.

> Spark cubing support HBase cluster with kerberos
>
> Key: KYLIN-2653
> URL: https://issues.apache.org/jira/browse/KYLIN-2653
> Project: Kylin
> Issue Type: Bug
> Components: Spark Engine
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
>
> Currently, Spark cubing doesn't support HBase clusters with Kerberos.
> As a temporary measure, we could support HBase clusters with Kerberos in YARN client mode,
> because that is easy.
> In the long term, we should avoid accessing HBase in Spark cubing.
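The point in the KYLIN-2653 comment above, that exposing all properties can still be a read-only operation, can be illustrated with a small sketch. This is not Kylin's actual {{KylinConfigExt}}: the class {{ReadOnlyConfigSketch}} and the method {{getOptional}} are hypothetical names, and the sketch only assumes that the configuration is backed by a {{java.util.Properties}} object. The getter hands out a copy, so callers cannot change the configuration through it.

```java
import java.util.Properties;

/** Hypothetical config holder for illustration; not Kylin's actual KylinConfigExt. */
final class ReadOnlyConfigSketch {
    private final Properties properties = new Properties();

    ReadOnlyConfigSketch(Properties initial) {
        // Copy on the way in so later changes to 'initial' do not leak into this object.
        properties.putAll(initial);
    }

    /** Returns a snapshot; mutating the returned object does not affect this config. */
    public Properties getAllProperties() {
        Properties copy = new Properties();
        copy.putAll(properties);
        return copy;
    }

    /** Reads a single value, or null if the key is absent. */
    public String getOptional(String key) {
        return properties.getProperty(key);
    }
}
```

With this shape a public getter does not weaken the read-only guarantee, at the cost of one copy per call.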
[jira] [Commented] (KYLIN-2744) Should return correct type for SUM measure in web
[ https://issues.apache.org/jira/browse/KYLIN-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097168#comment-16097168 ]

kangkaisen commented on KYLIN-2744:

Thanks, liyang and Billy, for your comments.

I think the column type and the SUM measure type should be the same in the web UI, while the back end keeps using the current Ingester and Aggregator to handle the SUM measure. The final result will then be consistent with Presto and Hive, which is reasonable, user-friendly and easy.

As for the sum of a double column not being fully precise: if users want a fully precise result, they should use the decimal type in Hive.

> Should return correct type for SUM measure in web
>
> Key: KYLIN-2744
> URL: https://issues.apache.org/jira/browse/KYLIN-2744
> Project: Kylin
> Issue Type: Bug
> Components: Web
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2744.patch
>
> Currently, Kylin returns the decimal type for the SUM measure of a double column,
> which leads to wrong results. So we should return the correct type for the SUM
> measure in web.
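The remark in the KYLIN-2744 comment above about double sums not being fully precise can be seen in plain Java, independent of Kylin. The sketch below is only an illustration: the class and method names are hypothetical, and {{BigDecimal}} stands in for a decimal column type. It sums the same textual values once as {{double}} and once as {{BigDecimal}}.

```java
import java.math.BigDecimal;

/** Illustration only: contrasts binary floating-point and decimal summation. */
final class SumPrecisionSketch {
    /** Sums the values as doubles; rounding error can accumulate. */
    static double sumAsDouble(String[] values) {
        double sum = 0.0;
        for (String v : values) {
            sum += Double.parseDouble(v);
        }
        return sum;
    }

    /** Sums the values as decimals; the result is exact for decimal inputs. */
    static BigDecimal sumAsDecimal(String[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String v : values) {
            sum = sum.add(new BigDecimal(v));
        }
        return sum;
    }
}
```

Summing "0.1" ten times as doubles does not give exactly 1.0, while the decimal sum compares equal to 1.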
[jira] [Commented] (KYLIN-2672) Only clean necessary cache for CubeMigrationCLI
[ https://issues.apache.org/jira/browse/KYLIN-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097175#comment-16097175 ]

kangkaisen commented on KYLIN-2672:

KYLIN-2717 is great. After KYLIN-2717, we will be able to reload only the related tables. I will update this patch when you finish KYLIN-2717. Thank you.

> Only clean necessary cache for CubeMigrationCLI
>
> Key: KYLIN-2672
> URL: https://issues.apache.org/jira/browse/KYLIN-2672
> Project: Kylin
> Issue Type: Improvement
> Components: Tools, Build and Test
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2672.patch
>
> Currently, we simply clear ALL caches in CubeMigrationCLI, which makes some
> queries slower in the production environment when we have many tables, models
> and cubes and migrate cubes often.
> So we could clean only the necessary cache for CubeMigrationCLI.
[jira] [Commented] (KYLIN-2740) FileNotFoundException on base cuboid build if GlobalDictionary is used
[ https://issues.apache.org/jira/browse/KYLIN-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097539#comment-16097539 ]

kangkaisen commented on KYLIN-2740:

Hi, sterligovak. Thank you. KYLIN-2506 has already fixed this issue. After KYLIN-2506, the GlobalDictionary is more robust.

> FileNotFoundException on base cuboid build if GlobalDictionary is used
>
> Key: KYLIN-2740
> URL: https://issues.apache.org/jira/browse/KYLIN-2740
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v2.0.0
> Reporter: Alexander Sterligov
> Assignee: kangkaisen
> Attachments: KYLIN-2740-patch
>
> 2017-07-13 15:25:20,515 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.io.FileNotFoundException: No such file or directory: 'home/production/bi/kylin/kylin_metadata/resources/GlobalDict/dict/MART.STAR_MAIN_EVENT/DEVICE_ID/version_1499959477799/.index'
>     at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:129)
>     at org.apache.kylin.cube.CubeManager.getDictionary(CubeManager.java:264)
>     at org.apache.kylin.cube.CubeSegment.getDictionary(CubeSegment.java:329)
>     at org.apache.kylin.cube.CubeSegment.buildDictionaryMap(CubeSegment.java:321)
>     at org.apache.kylin.engine.mr.common.BaseCuboidBuilder.<init>(BaseCuboidBuilder.java:86)
>     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.setup(BaseCuboidMapperBase.java:70)
>     at org.apache.kylin.engine.mr.steps.HiveToBaseCuboidMapper.setup(HiveToBaseCuboidMapper.java:36)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:796)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.io.FileNotFoundException: No such file or directory: 'home/production/bi/kylin/kylin_metadata/resources/GlobalDict/dict/MART.STAR_MAIN_EVENT/DEVICE_ID/version_1499959477799/.index'
>
> The reason for the exception is that flushIndex in
> org.apache.kylin.dict.AppendTrieDictionary flushes and closes the file after
> CachedTreeMap is committed. The .index file is left in the working directory.
[jira] [Commented] (KYLIN-2706) Should disable Storage limit push down when singleValuesD doesn't containsAll othersD
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097583#comment-16097583 ]

kangkaisen commented on KYLIN-2706:

After discussing with [~mahongbin], we think the root cause of this issue is a bug in the Comparator of SortedIteratorMergerWithLimit. I think we only need to compare the group columns in SortedIteratorMergerWithLimit.

> Should disable Storage limit push down when singleValuesD doesn't containsAll othersD
>
> Key: KYLIN-2706
> URL: https://issues.apache.org/jira/browse/KYLIN-2706
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.1.0
> Attachments: KYLIN-2706.patch
>
> For the SQL below, storage limit push-down should be disabled, because the query
> returns more than one record from the HBase tables but
> SortedIteratorMergerWithLimit returns only one record, which gives a wrong result.
> {code:java}
> SELECT sum(A)
> FROM TABLE
> WHERE date_id >= 20170624 and date_id <= 20170626
> limit 1
> {code}
> We should disable Storage limit push down when singleValuesD doesn't containsAll othersD.
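The idea in the KYLIN-2706 comment above, comparing only the group columns when merging sorted records under a limit, can be sketched as follows. This is not the code of Kylin's SortedIteratorMergerWithLimit: the class {{GroupLimitMergeSketch}}, the record type {{Rec}} and every method name are hypothetical, and the sketch assumes that each storage shard returns records sorted by their group columns and that the limit counts groups rather than raw records. Records that share a group key compare as equal, so all of them are emitted before the limit cuts the merge off, and the later aggregation sees every partial sum.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a k-way merge whose limit counts groups, not raw records. */
final class GroupLimitMergeSketch {
    /** A hypothetical storage record: one partial SUM plus its group-by values. */
    static final class Rec {
        final long partialSum;
        final String[] groupCols;

        Rec(long partialSum, String... groupCols) {
            this.partialSum = partialSum;
            this.groupCols = groupCols;
        }
    }

    /** The head record of one shard together with the rest of that shard. */
    private static final class Head {
        final Rec rec;
        final Iterator<Rec> rest;

        Head(Rec rec, Iterator<Rec> rest) {
            this.rec = rec;
            this.rest = rest;
        }
    }

    /** Compares group columns only; measures are deliberately ignored. */
    static final Comparator<Rec> GROUP_ONLY = (a, b) -> {
        for (int i = 0; i < a.groupCols.length; i++) {
            int c = a.groupCols[i].compareTo(b.groupCols[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    };

    /** Merges shards sorted by group columns and stops after limitGroups distinct groups. */
    static List<Rec> mergeWithLimit(List<List<Rec>> sortedShards, int limitGroups) {
        PriorityQueue<Head> heap =
                new PriorityQueue<>((x, y) -> GROUP_ONLY.compare(x.rec, y.rec));
        for (List<Rec> shard : sortedShards) {
            Iterator<Rec> it = shard.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        List<Rec> out = new ArrayList<>();
        Rec lastGroup = null;
        int groups = 0;
        while (!heap.isEmpty()) {
            Head h = heap.poll();
            boolean newGroup = lastGroup == null || GROUP_ONLY.compare(lastGroup, h.rec) != 0;
            if (newGroup) {
                if (groups == limitGroups) {
                    break; // the limit is reached only once a further group would start
                }
                groups++;
                lastGroup = h.rec;
            }
            out.add(h.rec);
            if (h.rest.hasNext()) {
                heap.add(new Head(h.rest.next(), h.rest));
            }
        }
        return out;
    }

    /** Adds up the partial sums, as the downstream aggregation would. */
    static long total(List<Rec> recs) {
        long sum = 0;
        for (Rec r : recs) {
            sum += r.partialSum;
        }
        return sum;
    }
}
```

For a query like the one quoted above, which has no group-by columns, every record has an empty group key, so with a limit of 1 all partial sums from all shards are still returned and summed.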
[jira] [Updated] (KYLIN-2706) Should disable Storage limit push down when singleValuesD doesn't containsAll othersD
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen updated KYLIN-2706:
    Attachment: KYLIN-2706.patch

This patch fixes the bug in the comparator of SortedIteratorMergerWithLimit.

> Should disable Storage limit push down when singleValuesD doesn't containsAll othersD
>
> Key: KYLIN-2706
> URL: https://issues.apache.org/jira/browse/KYLIN-2706
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.1.0
> Attachments: KYLIN-2706.patch, KYLIN-2706.patch
>
> For the SQL below, storage limit push-down should be disabled, because the query
> returns more than one record from the HBase tables but
> SortedIteratorMergerWithLimit returns only one record, which gives a wrong result.
> {code:java}
> SELECT sum(A)
> FROM TABLE
> WHERE date_id >= 20170624 and date_id <= 20170626
> limit 1
> {code}
> We should disable Storage limit push down when singleValuesD doesn't containsAll othersD.
[jira] [Updated] (KYLIN-2706) Should disable Storage limit push down when singleValuesD doesn't containsAll othersD
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2706: -- Attachment: (was: KYLIN-2706.patch) > Should disable Storage limit push down when singleValuesD doesn't containsAll > othersD > - > > Key: KYLIN-2706 > URL: https://issues.apache.org/jira/browse/KYLIN-2706 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.1.0 > > Attachments: KYLIN-2706.patch > > > For this SQL, which should disable Storage limit push. Because this SQL will > return more than one record from HBase tables, but the > SortedIteratorMergerWithLimit only return one record, which will get wrong > result. > {code:java} > SELECT sum(A) > FROM TABLE > WHERE date_id >= 20170624 and date_id <= 20170626 > limit 1 > {code} > We should disable Storage limit push down when singleValuesD doesn't > containsAll othersD -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2706) Fix the bug for the comparator in SortedIteratorMergerWithLimit
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2706: -- Summary: Fix the bug for the comparator in SortedIteratorMergerWithLimit (was: Should disable Storage limit push down when singleValuesD doesn't containsAll othersD) > Fix the bug for the comparator in SortedIteratorMergerWithLimit > --- > > Key: KYLIN-2706 > URL: https://issues.apache.org/jira/browse/KYLIN-2706 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.1.0 > > Attachments: KYLIN-2706.patch > > > For this SQL, which should disable Storage limit push. Because this SQL will > return more than one record from HBase tables, but the > SortedIteratorMergerWithLimit only return one record, which will get wrong > result. > {code:java} > SELECT sum(A) > FROM TABLE > WHERE date_id >= 20170624 and date_id <= 20170626 > limit 1 > {code} > We should disable Storage limit push down when singleValuesD doesn't > containsAll othersD -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-1926) Loosen the constraint on FK-PK data type matching
[ https://issues.apache.org/jira/browse/KYLIN-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097588#comment-16097588 ] kangkaisen commented on KYLIN-1926: --- OK. Thanks liyang. > Loosen the constraint on FK-PK data type matching > - > > Key: KYLIN-1926 > URL: https://issues.apache.org/jira/browse/KYLIN-1926 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: all >Reporter: Shaofeng SHI >Assignee: Shaofeng SHI >Priority: Minor > Fix For: v1.5.4 > > Attachments: 0001-KYLIN-1926-FK-PK-data-type-matching.patch > > > If lookup table's PK datatype isn't equal to fact table's FK datatype, Kylin > will report error saying "Primary key are not consistent with Foreign key". > This constraint is too strong. Should allow user to disable this check. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2653) Spark cubing support HBase cluster with kerberos
[ https://issues.apache.org/jira/browse/KYLIN-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097589#comment-16097589 ] kangkaisen commented on KYLIN-2653: --- Update the commit: https://github.com/apache/kylin/commit/d8d0395a80cc50fcb59bab4d402c7675aef6cd22 > Spark cubing support HBase cluster with kerberos > > > Key: KYLIN-2653 > URL: https://issues.apache.org/jira/browse/KYLIN-2653 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > > Currently, Spark cubing doesn't support HBase cluster with kerberos. > Temporarily,we could support HBase cluster with kerberos on Yarn client mode, > because which is easy. > In the long term,we should avoid access HBase in Spark cubing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (KYLIN-1926) Loosen the constraint on FK-PK data type matching
[ https://issues.apache.org/jira/browse/KYLIN-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen reopened KYLIN-1926: --- > Loosen the constraint on FK-PK data type matching > - > > Key: KYLIN-1926 > URL: https://issues.apache.org/jira/browse/KYLIN-1926 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: all >Reporter: Shaofeng SHI >Assignee: Shaofeng SHI >Priority: Minor > Fix For: v1.5.4 > > Attachments: 0001-KYLIN-1926-FK-PK-data-type-matching.patch > > > If lookup table's PK datatype isn't equal to fact table's FK datatype, Kylin > will report error saying "Primary key are not consistent with Foreign key". > This constraint is too strong. Should allow user to disable this check. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-1926) Loosen the constraint on FK-PK data type matching
[ https://issues.apache.org/jira/browse/KYLIN-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101734#comment-16101734 ]

kangkaisen commented on KYLIN-1926:

We can reproduce this issue easily by changing KYLIN_SALES.BUYER_ID from bigint to int and then running this SQL:
{code:java}
select SUM(price) from KYLIN_SALES inner join KYLIN_ACCOUNT on KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
{code}
The "No model found" error then occurs.

The logic chain for this error is:
1. The data types of KYLIN_SALES.BUYER_ID and KYLIN_ACCOUNT.ACCOUNT_ID are inconsistent.
2. Calcite casts BUYER_ID from int to bigint.
3. Calcite runs pushDownJoinConditions on the join.
4. Calcite creates a Project with all KYLIN_SALES columns plus the BUYER_ID cast column.
5. Kylin adds the column starting with _KY_ to context.allColumns.
6. In ModelChooser, real.getAllColumnDescs() doesn't contain all of context.allColumns, because real.getAllColumnDescs() doesn't contain the column starting with _KY_.
7. The "No model found" error occurs in ModelChooser.

> Loosen the constraint on FK-PK data type matching
>
> Key: KYLIN-1926
> URL: https://issues.apache.org/jira/browse/KYLIN-1926
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: all
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Priority: Minor
> Fix For: v1.5.4
> Attachments: 0001-KYLIN-1926-FK-PK-data-type-matching.patch
>
> If the lookup table's PK data type isn't equal to the fact table's FK data type, Kylin
> reports an error saying "Primary key are not consistent with Foreign key".
> This constraint is too strong. Users should be allowed to disable this check.
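Steps 5 to 7 of the logic chain in the KYLIN-1926 comment above come down to a set-containment check that fails because of one synthetic column. The sketch below is not Kylin's ModelChooser code: the class {{ModelMatchSketch}}, its methods and the example column name used with it are hypothetical, and the only thing taken from the comment is that the extra column's name starts with _KY_. Filtering such columns out is shown as one conceivable remedy, not as the fix that was adopted.

```java
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch of a model-matching check based on column containment. */
final class ModelMatchSketch {
    /** Prefix of the internally generated columns, as named in the comment. */
    static final String INTERNAL_PREFIX = "_KY_";

    /** A model matches only if it knows every column the query context collected. */
    static boolean modelCovers(Set<String> modelColumns, Set<String> queryColumns) {
        return modelColumns.containsAll(queryColumns);
    }

    /** One conceivable remedy (an assumption): ignore internal columns before matching. */
    static Set<String> dropInternalColumns(Set<String> queryColumns) {
        return queryColumns.stream()
                .filter(c -> !c.startsWith(INTERNAL_PREFIX))
                .collect(Collectors.toSet());
    }
}
```

With model columns {BUYER_ID, PRICE} and query columns {BUYER_ID, PRICE, _KY_CAST_BUYER_ID} (a made-up name), {{modelCovers}} returns false, which corresponds to the "No model found" outcome; after {{dropInternalColumns}} it returns true.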
[jira] [Commented] (KYLIN-1926) Loosen the constraint on FK-PK data type matching
[ https://issues.apache.org/jira/browse/KYLIN-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101743#comment-16101743 ]

kangkaisen commented on KYLIN-1926:

So I'm sure this is a bug in Kylin. I notice that Kylin handles columns starting with "_KY_" specially in SqlToRelConverter.hackSelectStar, but it doesn't handle this case.

> Loosen the constraint on FK-PK data type matching
>
> Key: KYLIN-1926
> URL: https://issues.apache.org/jira/browse/KYLIN-1926
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: all
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Priority: Minor
> Fix For: v1.5.4
> Attachments: 0001-KYLIN-1926-FK-PK-data-type-matching.patch
>
> If the lookup table's PK data type isn't equal to the fact table's FK data type, Kylin
> reports an error saying "Primary key are not consistent with Foreign key".
> This constraint is too strong. Users should be allowed to disable this check.
[jira] [Comment Edited] (KYLIN-1926) Loosen the constraint on FK-PK data type matching
[ https://issues.apache.org/jira/browse/KYLIN-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101734#comment-16101734 ]

kangkaisen edited comment on KYLIN-1926 at 7/26/17 2:40 PM:

We can reproduce this issue easily by changing KYLIN_SALES.BUYER_ID from bigint to int and then running this SQL:
{code:java}
select SUM(price) from KYLIN_SALES inner join KYLIN_ACCOUNT on KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
{code}
The "No model found" error then occurs.

The logic chain for this error is:
1. The data types of KYLIN_SALES.BUYER_ID and KYLIN_ACCOUNT.ACCOUNT_ID are inconsistent.
2. Calcite casts BUYER_ID from int to bigint.
3. Calcite runs pushDownJoinConditions on the join.
4. Calcite creates a Project with all KYLIN_SALES columns plus the BUYER_ID cast column.
5. In OLAPProjectRel, Kylin adds the column starting with _KY_ to context.allColumns.
6. In ModelChooser, real.getAllColumnDescs() doesn't contain all of context.allColumns, because real.getAllColumnDescs() doesn't contain the column starting with _KY_.
7. The "No model found" error occurs in ModelChooser.

> Loosen the constraint on FK-PK data type matching
>
> Key: KYLIN-1926
> URL: https://issues.apache.org/jira/browse/KYLIN-1926
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: all
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Priority: Minor
> Fix For: v1.5.4
> Attachments: 0001-KYLIN-1926-FK-PK-data-type-matching.patch
>
> If the lookup table's PK data type isn't equal to the fact table's FK data type, Kylin
> reports an error saying "Primary key are not consistent with Foreign key".
> This constraint is too strong. Users should be allowed to disable this check.
[jira] [Resolved] (KYLIN-2653) Spark cubing support HBase cluster with kerberos
[ https://issues.apache.org/jira/browse/KYLIN-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-2653. --- Resolution: Fixed Fix Version/s: v2.1.0 > Spark cubing support HBase cluster with kerberos > > > Key: KYLIN-2653 > URL: https://issues.apache.org/jira/browse/KYLIN-2653 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.1.0 > > > Currently, Spark cubing doesn't support HBase cluster with kerberos. > Temporarily,we could support HBase cluster with kerberos on Yarn client mode, > because which is easy. > In the long term,we should avoid access HBase in Spark cubing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (KYLIN-2740) FileNotFoundException on base cuboid build if GlobalDictionary is used
[ https://issues.apache.org/jira/browse/KYLIN-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen closed KYLIN-2740.
    Resolution: Duplicate

> FileNotFoundException on base cuboid build if GlobalDictionary is used
>
> Key: KYLIN-2740
> URL: https://issues.apache.org/jira/browse/KYLIN-2740
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v2.0.0
> Reporter: Alexander Sterligov
> Assignee: kangkaisen
> Attachments: KYLIN-2740-patch
>
> 2017-07-13 15:25:20,515 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.io.FileNotFoundException: No such file or directory: 'home/production/bi/kylin/kylin_metadata/resources/GlobalDict/dict/MART.STAR_MAIN_EVENT/DEVICE_ID/version_1499959477799/.index'
>     at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:129)
>     at org.apache.kylin.cube.CubeManager.getDictionary(CubeManager.java:264)
>     at org.apache.kylin.cube.CubeSegment.getDictionary(CubeSegment.java:329)
>     at org.apache.kylin.cube.CubeSegment.buildDictionaryMap(CubeSegment.java:321)
>     at org.apache.kylin.engine.mr.common.BaseCuboidBuilder.<init>(BaseCuboidBuilder.java:86)
>     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.setup(BaseCuboidMapperBase.java:70)
>     at org.apache.kylin.engine.mr.steps.HiveToBaseCuboidMapper.setup(HiveToBaseCuboidMapper.java:36)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:796)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.io.FileNotFoundException: No such file or directory: 'home/production/bi/kylin/kylin_metadata/resources/GlobalDict/dict/MART.STAR_MAIN_EVENT/DEVICE_ID/version_1499959477799/.index'
>
> The reason for the exception is that flushIndex in
> org.apache.kylin.dict.AppendTrieDictionary flushes and closes the file after
> CachedTreeMap is committed. The .index file is left in the working directory.
[jira] [Created] (KYLIN-2764) Build the dict for UHC column with MR
kangkaisen created KYLIN-2764:

Summary: Build the dict for UHC column with MR
Key: KYLIN-2764
URL: https://issues.apache.org/jira/browse/KYLIN-2764
Project: Kylin
Issue Type: Improvement
Components: Job Engine
Affects Versions: v2.0.0
Reporter: kangkaisen
Assignee: kangkaisen

KYLIN-2217 builds the dictionaries for normal columns with MR, but the dictionaries for UHC columns are still built in the JobServer. As in KYLIN-2217, we could also use MR to build the dictionaries for UHC columns. This would relieve the memory pressure on the JobServer, improve its job concurrency, and speed up the build when there are multiple UHC columns.

The MR input is the output of "Extract Fact Table Distinct Columns"; the MR output is the UHC column dictionaries.

Because it is very hard to build a global dict with multiple reducers, I use one reducer per UHC column and allocate enough memory to each reducer. According to my test, 8G of memory is enough.
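The "one reducer per UHC column" design described in KYLIN-2764 above can be sketched in plain Java, without the Hadoop API. Everything here is a hypothetical illustration: the class {{UhcDictPartitionSketch}} and its methods are invented names, the sketch assumes that the UHC columns are numbered from 0 and that the job is configured with one reducer per column, and a sorted map of value to ID stands in for the real dictionary, which Kylin builds differently.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch: route every value of one UHC column to the same reducer. */
final class UhcDictPartitionSketch {
    /** Each UHC column gets its own reducer, so the column index is the partition. */
    static int partitionFor(int uhcColumnIndex, int numReducers) {
        if (uhcColumnIndex < 0 || uhcColumnIndex >= numReducers) {
            throw new IllegalArgumentException(
                    "expected one reducer per UHC column, got index " + uhcColumnIndex
                            + " with " + numReducers + " reducers");
        }
        return uhcColumnIndex;
    }

    /** What a single reducer would do: assign IDs to the column's distinct values in sorted order. */
    static Map<String, Integer> buildDict(Collection<String> columnValues) {
        Map<String, Integer> dict = new LinkedHashMap<>();
        for (String value : new TreeSet<>(columnValues)) {
            dict.put(value, dict.size()); // the next free ID is the current size
        }
        return dict;
    }
}
```

Because a single reducer sees every distinct value of its column, no coordination of IDs across reducers is needed, which is the difficulty the issue text wants to avoid; the trade-off is that the reducer needs enough memory for the whole column.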
[jira] [Commented] (KYLIN-2653) Spark cubing support HBase cluster with kerberos
[ https://issues.apache.org/jira/browse/KYLIN-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104358#comment-16104358 ] kangkaisen commented on KYLIN-2653: --- Yes. HBase jars can be removed. > Spark cubing support HBase cluster with kerberos > > > Key: KYLIN-2653 > URL: https://issues.apache.org/jira/browse/KYLIN-2653 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.1.0 > > > Currently, Spark cubing doesn't support HBase cluster with kerberos. > Temporarily,we could support HBase cluster with kerberos on Yarn client mode, > because which is easy. > In the long term,we should avoid access HBase in Spark cubing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2706) Fix the bug for the comparator in SortedIteratorMergerWithLimit
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2706: -- Fix Version/s: (was: v2.1.0) > Fix the bug for the comparator in SortedIteratorMergerWithLimit > --- > > Key: KYLIN-2706 > URL: https://issues.apache.org/jira/browse/KYLIN-2706 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2706.patch > > > For this SQL, which should disable Storage limit push. Because this SQL will > return more than one record from HBase tables, but the > SortedIteratorMergerWithLimit only return one record, which will get wrong > result. > {code:java} > SELECT sum(A) > FROM TABLE > WHERE date_id >= 20170624 and date_id <= 20170626 > limit 1 > {code} > We should disable Storage limit push down when singleValuesD doesn't > containsAll othersD -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2706) Fix the bug for the comparator in SortedIteratorMergerWithLimit
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106374#comment-16106374 ]

kangkaisen commented on KYLIN-2706:

No. This patch still needs to be reviewed.

> Fix the bug for the comparator in SortedIteratorMergerWithLimit
>
> Key: KYLIN-2706
> URL: https://issues.apache.org/jira/browse/KYLIN-2706
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2706.patch
>
> For the SQL below, storage limit push-down should be disabled, because the query
> returns more than one record from the HBase tables but
> SortedIteratorMergerWithLimit returns only one record, which gives a wrong result.
> {code:java}
> SELECT sum(A)
> FROM TABLE
> WHERE date_id >= 20170624 and date_id <= 20170626
> limit 1
> {code}
> We should disable Storage limit push down when singleValuesD doesn't containsAll othersD.
[jira] [Commented] (KYLIN-2765) Eliminate restriction on Global Dictionary of Dim columns
[ https://issues.apache.org/jira/browse/KYLIN-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108767#comment-16108767 ]

kangkaisen commented on KYLIN-2765:

What's the goal of this JIRA? Is it to use Global Dict for dimension columns, or to let one column have two types of dict when it needs to be both a dimension and a measure at the same time?

> Eliminate restriction on Global Dictionary of Dim columns
>
> Key: KYLIN-2765
> URL: https://issues.apache.org/jira/browse/KYLIN-2765
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine, Metadata, Query Engine
> Reporter: Roger Shi
> Assignee: Dong Li
>
> A cube dimension column is not allowed to be used in an accurate-count-distinct measure.
> Global Dictionary encoding is a kind of dict in the metadata. Dict encoding was
> created for dimensions at the beginning, so Global Dictionary is a special one for
> measures.
> To eliminate the restriction, there are two possible ways in my view. One is to
> move the Global Dictionary metadata out of the dict section into a new section such as
> "measure dict" (not there now, create a new one). The other is to handle
> Global Dictionary differently in both the cubing engine and the query engine.
> There might be other, better methods. Let's discuss here and find a good way
> out.
[jira] [Commented] (KYLIN-2653) Spark cubing support HBase cluster with kerberos
[ https://issues.apache.org/jira/browse/KYLIN-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116526#comment-16116526 ] kangkaisen commented on KYLIN-2653: --- Hi, liyang. In that case, how could we get all of the Kylin config? By reflection? > Spark cubing support HBase cluster with kerberos > > > Key: KYLIN-2653 > URL: https://issues.apache.org/jira/browse/KYLIN-2653 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > Currently, Spark cubing doesn't support HBase cluster with kerberos. > Temporarily,we could support HBase cluster with kerberos on Yarn client mode, > because which is easy. > In the long term,we should avoid access HBase in Spark cubing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2604: -- Fix Version/s: (was: v2.1.0) v2.2.0 > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.2.0 > > Attachments: KYLIN-2604.patch > > > we should use global dict as the default encoding for precise distinct count > in web, which more easy-to-use for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119756#comment-16119756 ] kangkaisen commented on KYLIN-2604: --- Thanks, zhixiong. I am sorry that KYLIN-2604 has to be delayed to 2.2.0 because KYLIN-2622 is still open. > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.2.0 > > Attachments: KYLIN-2604.patch > > > we should use global dict as the default encoding for precise distinct count > in web, which more easy-to-use for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2706) Fix the bug for the comparator in SortedIteratorMergerWithLimit
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119781#comment-16119781 ] kangkaisen commented on KYLIN-2706: --- Thanks hongbin. This is the commit: https://github.com/apache/kylin/commit/659eeaedd571c837df3beae44456dadde3036c3d > Fix the bug for the comparator in SortedIteratorMergerWithLimit > --- > > Key: KYLIN-2706 > URL: https://issues.apache.org/jira/browse/KYLIN-2706 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2706.patch > > > For this SQL, which should disable Storage limit push. Because this SQL will > return more than one record from HBase tables, but the > SortedIteratorMergerWithLimit only return one record, which will get wrong > result. > {code:java} > SELECT sum(A) > FROM TABLE > WHERE date_id >= 20170624 and date_id <= 20170626 > limit 1 > {code} > We should disable Storage limit push down when singleValuesD doesn't > containsAll othersD -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KYLIN-2706) Fix the bug for the comparator in SortedIteratorMergerWithLimit
[ https://issues.apache.org/jira/browse/KYLIN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-2706. --- Resolution: Fixed Fix Version/s: v2.2.0 > Fix the bug for the comparator in SortedIteratorMergerWithLimit > --- > > Key: KYLIN-2706 > URL: https://issues.apache.org/jira/browse/KYLIN-2706 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > Attachments: KYLIN-2706.patch > > > For this SQL, which should disable Storage limit push. Because this SQL will > return more than one record from HBase tables, but the > SortedIteratorMergerWithLimit only return one record, which will get wrong > result. > {code:java} > SELECT sum(A) > FROM TABLE > WHERE date_id >= 20170624 and date_id <= 20170626 > limit 1 > {code} > We should disable Storage limit push down when singleValuesD doesn't > containsAll othersD -- This message was sent by Atlassian JIRA (v6.4.14#64029)
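To illustrate the failure mode described in KYLIN-2706, here is a minimal, self-contained sketch. It is not Kylin's code: it only assumes that storage returns several partial SUM records for the query (for example one per date_id) and compares applying `limit 1` before the final aggregation with applying it afterwards. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: models partial SUM records coming back from storage.
class LimitPushDownSketch {

    // Wrong for aggregates: the limit cuts partial records before they are merged.
    static long sumWithLimitPushedDown(List<Long> partialSums, int limit) {
        return partialSums.stream().limit(limit).mapToLong(Long::longValue).sum();
    }

    // Correct: merge every partial record first; the single aggregated row
    // then trivially satisfies "limit 1".
    static long sumThenLimit(List<Long> partialSums) {
        return partialSums.stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        // Hypothetical partial sums for date_id 20170624, 20170625 and 20170626.
        List<Long> partials = Arrays.asList(10L, 20L, 30L);
        System.out.println("limit pushed down: " + sumWithLimitPushedDown(partials, 1));
        System.out.println("limit after aggregation: " + sumThenLimit(partials));
    }
}
```

With the limit pushed down, only the first partial record contributes to the sum, which is the kind of wrong result the issue reports.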
[jira] [Commented] (KYLIN-2653) Spark cubing support HBase cluster with kerberos
[ https://issues.apache.org/jira/browse/KYLIN-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128214#comment-16128214 ] kangkaisen commented on KYLIN-2653: --- OK, I see. Thank you, liyang. > Spark cubing support HBase cluster with kerberos > > > Key: KYLIN-2653 > URL: https://issues.apache.org/jira/browse/KYLIN-2653 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > Currently, Spark cubing doesn't support HBase cluster with kerberos. > Temporarily,we could support HBase cluster with kerberos on Yarn client mode, > because which is easy. > In the long term,we should avoid access HBase in Spark cubing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2606) Only return counter for precise count_distinct if query is exactAggregate
[ https://issues.apache.org/jira/browse/KYLIN-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133967#comment-16133967 ] kangkaisen commented on KYLIN-2606: --- Thanks hongbin. > Only return counter for precise count_distinct if query is exactAggregate > - > > Key: KYLIN-2606 > URL: https://issues.apache.org/jira/browse/KYLIN-2606 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > > If the query is exactAggregation and has some memory hungry measures, we > could directly return final result to speed up the query , reduce the RPC > data size and memory usage in queryServer. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KYLIN-2606) Only return counter for precise count_distinct if query is exactAggregate
[ https://issues.apache.org/jira/browse/KYLIN-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-2606. --- Resolution: Fixed Fix Version/s: v2.2.0 > Only return counter for precise count_distinct if query is exactAggregate > - > > Key: KYLIN-2606 > URL: https://issues.apache.org/jira/browse/KYLIN-2606 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > If the query is exactAggregation and has some memory hungry measures, we > could directly return final result to speed up the query , reduce the RPC > data size and memory usage in queryServer. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2622) AppendTrieDictionary support not global
[ https://issues.apache.org/jira/browse/KYLIN-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151476#comment-16151476 ] kangkaisen commented on KYLIN-2622: --- This is the commit: https://github.com/apache/kylin/commit/ec5dd54e9ea5e373569cd65cab322a17716718ff > AppendTrieDictionary support not global > --- > > Key: KYLIN-2622 > URL: https://issues.apache.org/jira/browse/KYLIN-2622 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > > Currently, AppendTrieDictionary only support global dict, which means the > dict will grow continuously. But for the cube doesn't have Partition Date > Column and the cube doesn't need aggregate query across segments, we could > build AppendTrieDictionary from empty dict every time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2622) AppendTrieDictionary support not global
[ https://issues.apache.org/jira/browse/KYLIN-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151477#comment-16151477 ] kangkaisen commented on KYLIN-2622: --- The main idea is to add a new DictionaryBuilder, SegmentAppendTrieDictBuilder, which builds the AppendTrieDictionary from an empty dict every time, each time in a different HDFS dir, so SegmentAppendTrieDictBuilder needs no lock and supports concurrent builds. > AppendTrieDictionary support not global > --- > > Key: KYLIN-2622 > URL: https://issues.apache.org/jira/browse/KYLIN-2622 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > > Currently, AppendTrieDictionary only support global dict, which means the > dict will grow continuously. But for the cube doesn't have Partition Date > Column and the cube doesn't need aggregate query across segments, we could > build AppendTrieDictionary from empty dict every time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151478#comment-16151478 ] kangkaisen commented on KYLIN-2764: --- This is the commit: https://github.com/apache/kylin/commit/2607e18b5e17d2a68f4079a76b8c990f144cbbd6. The core idea is simple, but there are four special points we should note: 1. The FK column in fact table could be UHC column. 2. We cannot get the correct HDFS working dir from KylinConfig in MR. 3. One or all of the UHC columns may be NULL. 4. There may be a timeout in the setup phase of the Reducer because of the global dict copy and lock. > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
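The "one reducer handles one UHC column" design from the KYLIN-2764 description can be sketched as follows. This is a plain-Java illustration under stated assumptions, not the actual MR job: records are modelled as (UHC column index, value) pairs, the reducer id is assumed to equal the column index, and NULL values are simply skipped. All names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: routes distinct column values to one "reducer" per UHC column.
class UhcColumnRoutingSketch {

    // One reducer per UHC column: the reducer id is the column's position
    // in the list of UHC columns.
    static int reducerFor(int uhcColumnIndex, int numUhcColumns) {
        if (uhcColumnIndex < 0 || uhcColumnIndex >= numUhcColumns) {
            throw new IllegalArgumentException("unknown UHC column index: " + uhcColumnIndex);
        }
        return uhcColumnIndex;
    }

    // Groups (columnIndex, value) records by reducer. Each reducer ends up with
    // the sorted distinct values of exactly one column, which is the input a
    // single dictionary builder needs. A column with only NULLs yields an empty set.
    static Map<Integer, SortedSet<String>> route(List<Map.Entry<Integer, String>> records,
                                                 int numUhcColumns) {
        Map<Integer, SortedSet<String>> perReducer = new TreeMap<>();
        for (int i = 0; i < numUhcColumns; i++) {
            perReducer.put(i, new TreeSet<String>());
        }
        for (Map.Entry<Integer, String> record : records) {
            if (record.getValue() == null) {
                continue; // NULL values contribute nothing to the dictionary
            }
            perReducer.get(reducerFor(record.getKey(), numUhcColumns)).add(record.getValue());
        }
        return perReducer;
    }
}
```

Because each column's values land on a single reducer, no dictionary has to be merged across reducers, at the cost of needing enough memory on that one reducer.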
[jira] [Updated] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2764: -- Attachment: job-memory-before.png job-memory-after.png This commit has been running in our prod env for a long time. The two pictures show that it remarkably reduces memory usage on the Kylin JobServer and, in addition, remarkably improves the JobServer's concurrency. After applying this commit, we removed one of our three JobServers. > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: job-memory-after.png, job-memory-before.png > > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2604: -- Attachment: KYLIN-2602-Non-Int-type-precise-count-distinct-measure-must-set-advanced-dict.patch This patch adds a check in web. Hi, Zhixiong, please review this patch. Thank you. > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.2.0 > > Attachments: > KYLIN-2602-Non-Int-type-precise-count-distinct-measure-must-set-advanced-dict.patch, > KYLIN-2604.patch > > > we should use global dict as the default encoding for precise distinct count > in web, which more easy-to-use for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2838) Should get storageType in changeHtableHost of CubeMigrationCLI
kangkaisen created KYLIN-2838: - Summary: Should get storageType in changeHtableHost of CubeMigrationCLI Key: KYLIN-2838 URL: https://issues.apache.org/jira/browse/KYLIN-2838 Project: Kylin Issue Type: Bug Components: Tools, Build and Test Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Fix For: v2.2.0 We should get storageType in changeHtableHost of CubeMigrationCLI, not engineType. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KYLIN-2838) Should get storageType in changeHtableHost of CubeMigrationCLI
[ https://issues.apache.org/jira/browse/KYLIN-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-2838. --- Resolution: Fixed > Should get storageType in changeHtableHost of CubeMigrationCLI > -- > > Key: KYLIN-2838 > URL: https://issues.apache.org/jira/browse/KYLIN-2838 > Project: Kylin > Issue Type: Bug > Components: Tools, Build and Test >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > We should get storageType in changeHtableHost of CubeMigrationCLI, not > engineType. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2838) Should get storageType in changeHtableHost of CubeMigrationCLI
[ https://issues.apache.org/jira/browse/KYLIN-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151762#comment-16151762 ] kangkaisen commented on KYLIN-2838: --- This is the commit: https://github.com/apache/kylin/commit/78543d6e970cfb9dc85bcc48775681afcdb1c0e9 > Should get storageType in changeHtableHost of CubeMigrationCLI > -- > > Key: KYLIN-2838 > URL: https://issues.apache.org/jira/browse/KYLIN-2838 > Project: Kylin > Issue Type: Bug > Components: Tools, Build and Test >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > We should get storageType in changeHtableHost of CubeMigrationCLI, not > engineType. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (KYLIN-2838) Should get storageType in changeHtableHost of CubeMigrationCLI
[ https://issues.apache.org/jira/browse/KYLIN-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen closed KYLIN-2838. - > Should get storageType in changeHtableHost of CubeMigrationCLI > -- > > Key: KYLIN-2838 > URL: https://issues.apache.org/jira/browse/KYLIN-2838 > Project: Kylin > Issue Type: Bug > Components: Tools, Build and Test >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > We should get storageType in changeHtableHost of CubeMigrationCLI, not > engineType. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151765#comment-16151765 ] kangkaisen commented on KYLIN-2604: --- An int value is already the input type of RoaringBitmap, so an int value does not need dict encoding. In other words, an Int-type precise distinct count measure does not need a global dict. > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.2.0 > > Attachments: > KYLIN-2602-Non-Int-type-precise-count-distinct-measure-must-set-advanced-dict.patch, > KYLIN-2604.patch > > > we should use global dict as the default encoding for precise distinct count > in web, which more easy-to-use for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
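The point made in the comment above can be sketched in a few lines. This is a rough illustration only: `java.util.BitSet` stands in for RoaringBitmap so that the example has no dependencies, int values are assumed to be non-negative, and the `HashMap` is a toy stand-in for a dictionary, not Kylin's global dict. All names are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: BitSet stands in for RoaringBitmap.
class PreciseDistinctSketch {

    // Int values can be set in the bitmap directly, so no dictionary is involved.
    // Assumes non-negative values, since BitSet rejects negative indexes.
    static int countDistinctInts(int[] values) {
        BitSet bitmap = new BitSet();
        for (int v : values) {
            bitmap.set(v);
        }
        return bitmap.cardinality();
    }

    // Non-int values first need a dictionary that maps every distinct value to an
    // int id; only the id can be stored in the bitmap. The dict is passed in so
    // the same mapping can be reused across calls, which is what makes bitmaps
    // from different batches comparable.
    static int countDistinctStrings(String[] values, Map<String, Integer> dict) {
        BitSet bitmap = new BitSet();
        for (String v : values) {
            Integer id = dict.get(v);
            if (id == null) {
                id = dict.size(); // next unused id
                dict.put(v, id);
            }
            bitmap.set(id);
        }
        return bitmap.cardinality();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctInts(new int[] {3, 7, 3, 1}));
        System.out.println(countDistinctStrings(new String[] {"a", "b", "a"},
                new HashMap<String, Integer>()));
    }
}
```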
[jira] [Commented] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152630#comment-16152630 ] kangkaisen commented on KYLIN-2764: --- I have rebased KYLIN-2622 and KYLIN-2764 on the master branch. KYLIN-2622 and KYLIN-2764 are both about the global dict, so I put those two commits on one branch, 2622-2764, and ran the IT on them together. > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: job-memory-after.png, job-memory-before.png > > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2841) LIMIT is buggy with subquery
[ https://issues.apache.org/jira/browse/KYLIN-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174348#comment-16174348 ] kangkaisen commented on KYLIN-2841: --- Hi, [~zhengd]. Thank you. I think using context.afterAggregate may be enough, so we needn't add an afterOuterAggregate variable. What do you think? > LIMIT is buggy with subquery > > > Key: KYLIN-2841 > URL: https://issues.apache.org/jira/browse/KYLIN-2841 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.1.0 >Reporter: Mu Kong >Assignee: zhengdong > Labels: scope > Attachments: 0001-KYLIN-2841-LIMIT-is-buggy-with-subquery.patch > > > Hi, all. > I found that limit in the web UI seems not behaving as expected. > When I run a query like the follows: > {code:sql} > SELECT > SUM(col3) AS col4, > SUM(col5) AS total_col5, > col1 > FROM > ( > SELECT > col1, > col2, > MAX(col3) AS col3, > COUNT(*) AS col5 > FROM db.table > WHERE col6 = 'somestring' > GROUP BY col1, col2 > ) > GROUP BY col1 > {code} > When I specify the limit as 50, the result has 19 records, and when I specify > the limit as 50, there are 90+ records in the result and each record has > higher col4 and total_col5. > But for query that doesn't have subquery, the result remains the same no > matter how I change the limit. > I guess for the query with subquery, limit somehow limits the number of the > result from the inner query instead of the result of the outer query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2841) LIMIT is buggy with subquery
[ https://issues.apache.org/jira/browse/KYLIN-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176016#comment-16176016 ] kangkaisen commented on KYLIN-2841: --- Yes, you are right! I was wrong. I am sorry. Thank you. > LIMIT is buggy with subquery > > > Key: KYLIN-2841 > URL: https://issues.apache.org/jira/browse/KYLIN-2841 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.1.0 >Reporter: Mu Kong >Assignee: zhengdong > Labels: scope > Attachments: 0001-KYLIN-2841-LIMIT-is-buggy-with-subquery.patch > > > Hi, all. > I found that limit in the web UI seems not behaving as expected. > When I run a query like the follows: > {code:sql} > SELECT > SUM(col3) AS col4, > SUM(col5) AS total_col5, > col1 > FROM > ( > SELECT > col1, > col2, > MAX(col3) AS col3, > COUNT(*) AS col5 > FROM db.table > WHERE col6 = 'somestring' > GROUP BY col1, col2 > ) > GROUP BY col1 > {code} > When I specify the limit as 50, the result has 19 records, and when I specify > the limit as 50, there are 90+ records in the result and each record has > higher col4 and total_col5. > But for query that doesn't have subquery, the result remains the same no > matter how I change the limit. > I guess for the query with subquery, limit somehow limits the number of the > result from the inner query instead of the result of the outer query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177724#comment-16177724 ] kangkaisen commented on KYLIN-2764: --- liyang, thanks very much for your review. In KYLIN-2135, we use multiple reducers to speed up "Extract Fact Table Distinct Columns" for UHC columns. This is the reason why I couldn't build the global dict in {{FactDistinctColumnsReducer}}. Apart from this point, do you have any other suggestions? > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: job-memory-after.png, job-memory-before.png > > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2622) AppendTrieDictionary support not global
[ https://issues.apache.org/jira/browse/KYLIN-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177729#comment-16177729 ] kangkaisen commented on KYLIN-2622: --- Thanks very much for your review. I believe this feature is useful, and it is widely used in our prod env. I will write a post about global dict and precise distinct count; in that post, I will use data and facts to explain why this feature is necessary. > AppendTrieDictionary support not global > --- > > Key: KYLIN-2622 > URL: https://issues.apache.org/jira/browse/KYLIN-2622 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.2.0 > > > Currently, AppendTrieDictionary only support global dict, which means the > dict will grow continuously. But for the cube doesn't have Partition Date > Column and the cube doesn't need aggregate query across segments, we could > build AppendTrieDictionary from empty dict every time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178081#comment-16178081 ] kangkaisen commented on KYLIN-2764: --- If the UHC columns have hundreds of billions of rows and we use one reducer to handle them, the {{FactDistinctColumnsReducer}} will be very slow. In other words, if we could use multiple reducers to build one global dict, we wouldn't need to add a new UHCDictionaryJob, but it is very hard to build a global dict with multiple reducers. > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: job-memory-after.png, job-memory-before.png > > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2180) Add project config and make config priority become "cube > project > server"
[ https://issues.apache.org/jira/browse/KYLIN-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205374#comment-16205374 ] kangkaisen commented on KYLIN-2180: --- Hi, julian. I don't see anywhere in this patch where I change the ACL. Could you point out the concrete code? > Add project config and make config priority become "cube > project > server" > > > Key: KYLIN-2180 > URL: https://issues.apache.org/jira/browse/KYLIN-2180 > Project: Kylin > Issue Type: New Feature > Components: Metadata >Affects Versions: v1.5.4.1 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.0.0 > > Attachments: KYLIN-2180-refactor-ProjectRequest.patch, > KYLIN-2180.patch > > > There are cases we want to override global kylin.properties in the scope of a > project. E.g. the queue name of Hadoop job. > Finally, the config priority for Kylin should be "cube > project > server". I > think which is reasonable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
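The "cube > project > server" priority from the KYLIN-2180 description amounts to a layered lookup. The sketch below is a minimal illustration with plain maps, not Kylin's KylinConfig implementation. The class name is hypothetical, and the queue-name property key in the usage example is only an example of a setting one might override per project.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: resolves a config key with "cube > project > server" priority.
class LayeredConfigSketch {

    // The most specific scope that defines the key wins; the server-level
    // value (kylin.properties) is the fallback. Returns null if no scope has it.
    static String resolve(String key,
                          Map<String, String> cubeOverrides,
                          Map<String, String> projectOverrides,
                          Map<String, String> serverConfig) {
        if (cubeOverrides.containsKey(key)) {
            return cubeOverrides.get(key);
        }
        if (projectOverrides.containsKey(key)) {
            return projectOverrides.get(key);
        }
        return serverConfig.get(key);
    }

    public static void main(String[] args) {
        String queue = "kylin.engine.mr.config-override.mapreduce.job.queuename";
        Map<String, String> server = new HashMap<>();
        Map<String, String> project = new HashMap<>();
        Map<String, String> cube = new HashMap<>();
        server.put(queue, "default");
        project.put(queue, "project_queue");
        // No cube-level override, so the project-level value is used.
        System.out.println(resolve(queue, cube, project, server));
    }
}
```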
[jira] [Commented] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214197#comment-16214197 ] kangkaisen commented on KYLIN-2764: --- Thank you very much, liyang and shaofeng. Shaofeng, you should let me do the merge work, thank you. I don't have further changes, but there is an issue in the 2764 branch: after KYLIN-2800 (https://github.com/apache/kylin/commit/ac77008ee81d4dcc2956b1a2cfd6eaa7ae9fc5d9), the first point I raised in my earlier comment no longer applies: {quote} 1. The FK column in fact table could be UHC column. {quote} So the latest commit in the 2764 branch could be simplified. This is the commit to apply KYLIN-2800: https://github.com/apache/kylin/commit/48f3fb1953a413acfdd405539a7cfd211a5e85de. > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.3.0 > > Attachments: job-memory-after.png, job-memory-before.png > > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2744) Should return correct type for SUM measure in web
[ https://issues.apache.org/jira/browse/KYLIN-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214201#comment-16214201 ] kangkaisen commented on KYLIN-2744: --- Hi, Zhixiong. OK, I see. I will update the patch later. > Should return correct type for SUM measure in web > - > > Key: KYLIN-2744 > URL: https://issues.apache.org/jira/browse/KYLIN-2744 > Project: Kylin > Issue Type: Bug > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2744.patch > > > Currently, Kylin return decimal type for the sum measure of double type, > which will result in wrong result. So, We should return correct type for SUM > measure in web. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2764) Build the dict for UHC column with MR
[ https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214609#comment-16214609 ] kangkaisen commented on KYLIN-2764: --- OK. Thank you, shaofeng! > Build the dict for UHC column with MR > - > > Key: KYLIN-2764 > URL: https://issues.apache.org/jira/browse/KYLIN-2764 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.3.0 > > Attachments: job-memory-after.png, job-memory-before.png > > > KYLIN-2217 has built dict for normal column with MR, but the UHC column > still build dict in JobServer. Like KYLIN-2217, we also could use MR build > dict for UHC column. which could thoroughly release the memory pressure and > improve job concurrent for JobServer as well as speed up multi UHC columns > procedure. > The MR input is the output of "Extract Fact Table Distinct Columns", the MR > output is the UHC column dict. Because it is very hard build global dict with > multi reducers, I use one reducer handle one UHC column and allocate enough > memory to the reducer. According to my test, 8G memory is enough. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2992) Avoid OOM in CubeHFileJob.Reducer
kangkaisen created KYLIN-2992: - Summary: Avoid OOM in CubeHFileJob.Reducer Key: KYLIN-2992 URL: https://issues.apache.org/jira/browse/KYLIN-2992 Project: Kylin Issue Type: Improvement Components: Storage - HBase Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Priority: Major Refer to HBASE-13897, we also could improve CubeHFileJob.Reducer and avoid OOM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2992) Avoid OOM in CubeHFileJob.Reducer
[ https://issues.apache.org/jira/browse/KYLIN-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237137#comment-16237137 ] kangkaisen commented on KYLIN-2992: --- The main idea is to replace the sort in the reducer with the shuffle sort. There are two key points: # implement a {{RowKeyWritable}} class that compares KeyValues with KeyValue.KVComparator() # construct a KeyValue based on the cuboid with KeyValue.createFirstOnRow For more detail, refer to this Chinese blog: https://blog.bcmeng.com/post/kylin-hfile-improve.html > Avoid OOM in CubeHFileJob.Reducer > -- > > Key: KYLIN-2992 > URL: https://issues.apache.org/jira/browse/KYLIN-2992 > Project: Kylin > Issue Type: Improvement > Components: Storage - HBase >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > > Refer to HBASE-13897, we also could improve CubeHFileJob.Reducer and avoid > OOM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
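The idea in the KYLIN-2992 comment, letting the shuffle sort order the keys so the reducer does not have to buffer and sort them in memory, can be sketched without Hadoop or HBase on the classpath. The classes below are hypothetical stand-ins, not Kylin's {{RowKeyWritable}}: the sketch assumes that only the row bytes are compared, in unsigned lexicographic order, whereas a real KeyValue comparator also considers other fields of the cell.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a shuffle key; not Kylin's RowKeyWritable.
final class RowKeySketch implements Comparable<RowKeySketch> {
    private final byte[] row;

    RowKeySketch(byte[] row) {
        this.row = row.clone();
    }

    byte[] row() {
        return row.clone();
    }

    // Unsigned lexicographic comparison of the row bytes (assumption: row bytes only).
    @Override
    public int compareTo(RowKeySketch other) {
        int n = Math.min(row.length, other.row.length);
        for (int i = 0; i < n; i++) {
            int a = row[i] & 0xff;
            int b = other.row[i] & 0xff;
            if (a != b) {
                return Integer.compare(a, b);
            }
        }
        return Integer.compare(row.length, other.row.length);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof RowKeySketch && Arrays.equals(row, ((RowKeySketch) o).row);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(row);
    }
}

final class ShuffleSortSketch {
    // Simulates the framework-side sort: keys reach the "reducer" already ordered,
    // so the reducer can write them out one by one instead of collecting them first.
    static List<RowKeySketch> shuffleSort(List<byte[]> rawRows) {
        List<RowKeySketch> keys = new ArrayList<RowKeySketch>();
        for (byte[] raw : rawRows) {
            keys.add(new RowKeySketch(raw));
        }
        Collections.sort(keys);
        return keys;
    }
}
```

The design point is that the ordering lives in the key type, so whichever component sorts the keys (here `Collections.sort`, in MapReduce the shuffle) produces the order the writer needs, and the reducer's memory use no longer grows with the number of keys.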
[jira] [Created] (KYLIN-2993) Add special mr config for base cuboid step
kangkaisen created KYLIN-2993: - Summary: Add special mr config for base cuboid step Key: KYLIN-2993 URL: https://issues.apache.org/jira/browse/KYLIN-2993 Project: Kylin Issue Type: Improvement Components: Job Engine Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Priority: Major Refer to http://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/. Currently, if users want to enlarge the MR memory for the global dict, they must use kylin.engine.mr.config-override., which enlarges the memory of all MR jobs. In fact, we only need to enlarge the memory for "Build Base Cuboid", so we could add a special MR config for the base cuboid step. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
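The step-specific override proposed in KYLIN-2993 could be resolved roughly as in the sketch below. Only the `kylin.engine.mr.config-override.` prefix comes from the issue text; the step-specific prefix, the class `StepConfigSketch`, and its methods are assumptions made for illustration, and the patch may name or structure things differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of layering a step-specific override on top of the general one.
final class StepConfigSketch {
    // Prefix mentioned in the issue description.
    static final String GENERAL_PREFIX = "kylin.engine.mr.config-override.";
    // Assumed prefix for the base cuboid step; not confirmed by the issue text.
    static final String BASE_CUBOID_PREFIX = "kylin.engine.mr.base-cuboid-config-override.";

    // Returns the Hadoop properties for one step: general overrides first,
    // then base-cuboid overrides on top when the step is "Build Base Cuboid".
    static Map<String, String> resolve(Map<String, String> kylinProps, boolean baseCuboidStep) {
        Map<String, String> hadoopConf = new LinkedHashMap<String, String>();
        copyWithPrefix(kylinProps, GENERAL_PREFIX, hadoopConf);
        if (baseCuboidStep) {
            copyWithPrefix(kylinProps, BASE_CUBOID_PREFIX, hadoopConf);
        }
        return hadoopConf;
    }

    // Copies every property that starts with the prefix, with the prefix stripped from the key.
    private static void copyWithPrefix(Map<String, String> source, String prefix, Map<String, String> target) {
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                target.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
    }
}
```

With this layering, a large `mapreduce.map.memory.mb` set under the base-cuboid prefix applies only to that step, while every other MR step keeps the value from the general prefix.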
[jira] [Created] (KYLIN-2994) Handle NPE when load dict in DictionaryManager
kangkaisen created KYLIN-2994: - Summary: Handle NPE when load dict in DictionaryManager Key: KYLIN-2994 URL: https://issues.apache.org/jira/browse/KYLIN-2994 Project: Kylin Issue Type: Bug Components: Metadata Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Priority: Minor Currently, the argument {{resourcePath}} in {{DictionaryManager.getDictionaryInfo}} could be NULL -- This message was sent by Atlassian JIRA (v6.4.14#64029)
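One way to picture the guard suggested by KYLIN-2994 is the sketch below. It is not the attached patch: `DictLookupSketch` is a hypothetical class, the cache is a plain `ConcurrentHashMap` standing in for whatever cache the real `DictionaryManager` uses, and returning `null` for a null path is an assumed policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not Kylin's DictionaryManager.
final class DictLookupSketch {
    // ConcurrentHashMap rejects null keys, so an unguarded get(null) throws NullPointerException.
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();

    void put(String resourcePath, String dictInfo) {
        cache.put(resourcePath, dictInfo);
    }

    // Guarded lookup: a null resourcePath yields null instead of an NPE from the cache.
    String getDictionaryInfo(String resourcePath) {
        if (resourcePath == null) {
            return null;
        }
        return cache.get(resourcePath);
    }
}
```

Callers then have to handle a `null` result, which makes the missing path explicit at the call site instead of surfacing as an exception deep inside the cache.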
[jira] [Created] (KYLIN-2995) Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing
kangkaisen created KYLIN-2995: - Summary: Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing Key: KYLIN-2995 URL: https://issues.apache.org/jira/browse/KYLIN-2995 Project: Kylin Issue Type: Bug Components: Spark Engine Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Priority: Major Currently, we load metadata from HDFS in SparkCubing: {{AbstractHadoopJob.loadKylinConfigFromHdfs}}, but HadoopUtil uses a new Configuration; we should use SparkContext.hadoopConfiguration instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2996) DeployCoprocessorCLI Log failed tables info
kangkaisen created KYLIN-2996: - Summary: DeployCoprocessorCLI Log failed tables info Key: KYLIN-2996 URL: https://issues.apache.org/jira/browse/KYLIN-2996 Project: Kylin Issue Type: Improvement Components: Storage - HBase Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Currently, updating the coprocessor occasionally fails for some tables; we should report the failed tables to the user in the final output. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2997) Allow change engineType even if there are segments in cube
kangkaisen created KYLIN-2997: - Summary: Allow change engineType even if there are segments in cube Key: KYLIN-2997 URL: https://issues.apache.org/jira/browse/KYLIN-2997 Project: Kylin Issue Type: Bug Components: Metadata, Web Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Priority: Major Currently, the cube signature contains engineType, so if users want to switch engines, they must purge the cube first. I think this is unreasonable because the engine doesn't affect queries or existing segments. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2998) Kill spark app when job was discarded
kangkaisen created KYLIN-2998: - Summary: Kill spark app when job was discarded Key: KYLIN-2998 URL: https://issues.apache.org/jira/browse/KYLIN-2998 Project: Kylin Issue Type: Improvement Components: Spark Engine Affects Versions: v2.1.0 Reporter: kangkaisen Assignee: kangkaisen Priority: Major Currently, when we discard a Spark job, the Spark job keeps running, and when we restart JobServer, the SparkExecutable submits a new Spark job. We should handle Spark jobs the same way as MR jobs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2998) Kill spark app when cube job was discarded
[ https://issues.apache.org/jira/browse/KYLIN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2998: -- Summary: Kill spark app when cube job was discarded (was: Kill spark app when job was discarded) > Kill spark app when cube job was discarded > -- > > Key: KYLIN-2998 > URL: https://issues.apache.org/jira/browse/KYLIN-2998 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > > Currently, when we discard a Spark job, the Spark job keeps running, and > when we restart JobServer, the SparkExecutable submits a new Spark job. > We should handle Spark jobs the same way as MR jobs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2999) One click migrate cube in web
kangkaisen created KYLIN-2999: - Summary: One click migrate cube in web Key: KYLIN-2999 URL: https://issues.apache.org/jira/browse/KYLIN-2999 Project: Kylin Issue Type: New Feature Components: Tools, Build and Test, Web Reporter: kangkaisen Assignee: kangkaisen Priority: Major Currently, cube migration must be done by the Kylin admin, which costs the admin a lot of time. So we should allow users to migrate a cube with one click in web. Of course, this should be configurable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-3000) Add a tool supporting migrate Cubedesc across different HBase cluster
kangkaisen created KYLIN-3000: - Summary: Add a tool supporting migrate Cubedesc across different HBase cluster Key: KYLIN-3000 URL: https://issues.apache.org/jira/browse/KYLIN-3000 Project: Kylin Issue Type: New Feature Components: Tools, Build and Test Reporter: kangkaisen Assignee: kangkaisen Priority: Major Add a tool that supports migrating Cubedesc across different HBase clusters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-3002) Use Spark as default engine for none-global-dict cube
kangkaisen created KYLIN-3002: - Summary: Use Spark as default engine for none-global-dict cube Key: KYLIN-3002 URL: https://issues.apache.org/jira/browse/KYLIN-3002 Project: Kylin Issue Type: Improvement Components: Web Reporter: kangkaisen Assignee: kangkaisen After KYLIN-2997, like KYLIN-2963, we could use Spark as default engine for none-global-dict cube. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2992) Avoid OOM in CubeHFileJob.Reducer
[ https://issues.apache.org/jira/browse/KYLIN-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238898#comment-16238898 ] kangkaisen commented on KYLIN-2992: --- This is the commit: https://github.com/apache/kylin/commit/b837071a6048433a0ec1708f358a62a8e90c2d1a. This commit has passed the IT. > Avoid OOM in CubeHFileJob.Reducer > -- > > Key: KYLIN-2992 > URL: https://issues.apache.org/jira/browse/KYLIN-2992 > Project: Kylin > Issue Type: Improvement > Components: Storage - HBase >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > > Refer to HBASE-13897, we also could improve CubeHFileJob.Reducer and avoid > OOM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3002) Use Spark as default engine for none-global-dict cube
[ https://issues.apache.org/jira/browse/KYLIN-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239828#comment-16239828 ] kangkaisen commented on KYLIN-3002: --- I agree with you. Thanks for the reminder. I can do this work. > Use Spark as default engine for none-global-dict cube > - > > Key: KYLIN-3002 > URL: https://issues.apache.org/jira/browse/KYLIN-3002 > Project: Kylin > Issue Type: Improvement > Components: Web >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Trivial > > After KYLIN-2997, like KYLIN-2963, we could use Spark as default engine for > none-global-dict cube. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3002) Use Spark as default engine in web
[ https://issues.apache.org/jira/browse/KYLIN-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-3002: -- Summary: Use Spark as default engine in web (was: Use Spark as default engine for none-global-dict cube) > Use Spark as default engine in web > -- > > Key: KYLIN-3002 > URL: https://issues.apache.org/jira/browse/KYLIN-3002 > Project: Kylin > Issue Type: Improvement > Components: Web >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Trivial > > After KYLIN-2997, like KYLIN-2963, we could use Spark as default engine for > none-global-dict cube. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2992) Avoid OOM in CubeHFileJob.Reducer
[ https://issues.apache.org/jira/browse/KYLIN-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251339#comment-16251339 ] kangkaisen commented on KYLIN-2992: --- Thanks, liyang. Will you review this commit? > Avoid OOM in CubeHFileJob.Reducer > -- > > Key: KYLIN-2992 > URL: https://issues.apache.org/jira/browse/KYLIN-2992 > Project: Kylin > Issue Type: Improvement > Components: Storage - HBase >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > > Refer to HBASE-13897, we also could improve CubeHFileJob.Reducer and avoid > OOM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3055) NullPointerException in MutableRoaringBitmap.or
[ https://issues.apache.org/jira/browse/KYLIN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274249#comment-16274249 ] kangkaisen commented on KYLIN-3055: --- Hi Chuqian, Thank you very much. This bug was introduced by my change in KYLIN-2606. This patch looks good to me; I will test it and merge it to master. > NullPointerException in MutableRoaringBitmap.or > --- > > Key: KYLIN-3055 > URL: https://issues.apache.org/jira/browse/KYLIN-3055 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: yuchuqian >Assignee: yuchuqian > Fix For: v2.3.0 > > Attachments: KYLIN-3055.patch > > > 2017-11-21 19:55:17,363 ERROR [Query > b1fbcd45-6524-4b1e-8844-1d6d6277a1bf-120] service.QueryService:459 : > Exception while executing query > java.sql.SQLException: Error while executing SQL "select part_dt, > intersect_count(item_count, part_dt, array[date'2012-01-01']) as first_day, > intersect_count(item_count, part_dt, array[date'2012-01-02']) as second_day, > intersect_count(item_count, part_dt, array[date'2012-01-03']) as third_day, > intersect_count(item_count, part_dt, > array[date'2012-01-01',date'2012-01-02']) as retention_oneday, > intersect_count(item_count, part_dt, > array[date'2012-01-01',date'2012-01-02',date'2012-01-03']) as retention_twoday > from kylin_sales > where part_dt in (date'2012-01-01',date'2012-01-02',date'2012-01-03') > group by PART_DT > LIMIT 5": null > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > at > org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:218) > at org.apache.kylin.rest.service.QueryService.execute(QueryService.java:834) > at > org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:561) > at
org.apache.kylin.rest.service.QueryService.query(QueryService.java:181) > at > org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:415) > at > org.apache.kylin.rest.controller.QueryController.query(QueryController.java:78) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > .. > Caused by: java.lang.NullPointerException > at > org.roaringbitmap.buffer.MutableRoaringBitmap.or(MutableRoaringBitmap.java:1041) > at > org.apache.kylin.measure.bitmap.RoaringBitmapCounter.orWith(RoaringBitmapCounter.java:72) > at > org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc$RetentionPartialResult.add(BitmapIntersectDistinctCountAggFunc.java:57) > at > org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc.add(BitmapIntersectDistinctCountAggFunc.java:90) > at Baz$4.apply(ANONYMOUS.java:136) > at Baz$4.apply(ANONYMOUS.java:158) > at Baz$4.apply(ANONYMOUS.java) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:832) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Baz.java:99) > How to re-produce: > 1. run $KYLIN_HOME/bin/sample.sh > 2. 
then create a cube like > { > "uuid": "9554f6f6-74dc-489e-b780-2f48f281576c", > "last_modified": 1511247707372, > "version": "2.2.0.0", > "name": "test", > "is_draft": false, > "model_name": "kylin_sales_model", > "description": "", > "null_string": null, > "dimensions": [ > { > "name": "PART_DT", > "table": "KYLIN_SALES", > "column": "PART_DT", > "derived": null > }, > { > "name": "LEAF_CATEG_ID", > "table": "KYLIN_SALES", > "column": "LEAF_CATEG_ID", > "derived": null > }, > { > "name": "LSTG_SITE_ID", > "table": "KYLIN_SALES", > "column": "LSTG_SITE_ID", > "derived": null > }, > { > "name": "CAL_DT", > "table": "KYLIN_CAL_DT", > "column": null, > "derived": [ > "CAL_DT" > ] > }, > { > "name": "LEAF_CATEG_ID", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "LEAF_CATEG_ID" > ] > }, > { > "name": "USER_DEFINED_FIELD1", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "USER_DEFINED_FIELD1" > ] >
[jira] [Assigned] (KYLIN-3055) NullPointerException in MutableRoaringBitmap.or
[ https://issues.apache.org/jira/browse/KYLIN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen reassigned KYLIN-3055: - Assignee: kangkaisen (was: yuchuqian) > NullPointerException in MutableRoaringBitmap.or > --- > > Key: KYLIN-3055 > URL: https://issues.apache.org/jira/browse/KYLIN-3055 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: yuchuqian >Assignee: kangkaisen > Fix For: v2.3.0 > > Attachments: KYLIN-3055.patch
[jira] [Assigned] (KYLIN-3055) NullPointerException in MutableRoaringBitmap.or
[ https://issues.apache.org/jira/browse/KYLIN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen reassigned KYLIN-3055: - Assignee: yuchuqian (was: kangkaisen) > NullPointerException in MutableRoaringBitmap.or > --- > > Key: KYLIN-3055 > URL: https://issues.apache.org/jira/browse/KYLIN-3055 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: yuchuqian >Assignee: yuchuqian > Fix For: v2.3.0 > > Attachments: KYLIN-3055.patch
[jira] [Commented] (KYLIN-3055) NullPointerException in MutableRoaringBitmap.or
[ https://issues.apache.org/jira/browse/KYLIN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276662#comment-16276662 ] kangkaisen commented on KYLIN-3055: --- Hi Chuqian, I have added a test case for this bug, and it passed the IT. Thank you. Could you re-submit a new patch with your commit author info, or give me your commit author info directly? I will commit your patch to the master branch. > NullPointerException in MutableRoaringBitmap.or > --- > > Key: KYLIN-3055 > URL: https://issues.apache.org/jira/browse/KYLIN-3055 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: yuchuqian >Assignee: yuchuqian > Fix For: v2.3.0 > > Attachments: KYLIN-3055.patch
[jira] [Resolved] (KYLIN-2992) Avoid OOM in CubeHFileJob.Reducer
[ https://issues.apache.org/jira/browse/KYLIN-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-2992. --- Resolution: Fixed > Avoid OOM in CubeHFileJob.Reducer > -- > > Key: KYLIN-2992 > URL: https://issues.apache.org/jira/browse/KYLIN-2992 > Project: Kylin > Issue Type: Improvement > Components: Storage - HBase >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.3.0 > > > Refer to HBASE-13897, we also could improve CubeHFileJob.Reducer and avoid > OOM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2997) Allow change engineType even if there are segments in cube
[ https://issues.apache.org/jira/browse/KYLIN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2997: -- Attachment: KYLIN-2997.patch This is the patch. > Allow change engineType even if there are segments in cube > -- > > Key: KYLIN-2997 > URL: https://issues.apache.org/jira/browse/KYLIN-2997 > Project: Kylin > Issue Type: Bug > Components: Metadata, Web >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2997.patch > > > Currently, the cube signature contains engineType, so if users want to switch > engines, they must purge the cube first. I think this is unreasonable > because the engine doesn't affect queries or existing segments. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2996) DeployCoprocessorCLI Log failed tables info
[ https://issues.apache.org/jira/browse/KYLIN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2996: -- Attachment: KYLIN-2996.patch This is the patch. > DeployCoprocessorCLI Log failed tables info > --- > > Key: KYLIN-2996 > URL: https://issues.apache.org/jira/browse/KYLIN-2996 > Project: Kylin > Issue Type: Improvement > Components: Storage - HBase >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Trivial > Attachments: KYLIN-2996.patch > > > Currently, updating the coprocessor occasionally fails for some tables; we should > report the failed tables to the user in the final output. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2993) Add special mr config for base cuboid step
[ https://issues.apache.org/jira/browse/KYLIN-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2993: -- Attachment: KYLIN-2993.patch This is the patch. > Add special mr config for base cuboid step > -- > > Key: KYLIN-2993 > URL: https://issues.apache.org/jira/browse/KYLIN-2993 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2993.patch > > > Refer to http://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/. > Currently, if users want to enlarge the MR memory for the global dict, they must use > kylin.engine.mr.config-override., which enlarges the memory of all MR > jobs. In fact, we only need to enlarge the memory for "Build Base Cuboid", so > we could add a special MR config for the base cuboid step. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2994) Handle NPE when load dict in DictionaryManager
[ https://issues.apache.org/jira/browse/KYLIN-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2994: -- Attachment: KYLIN-2994.patch This is the patch. > Handle NPE when load dict in DictionaryManager > -- > > Key: KYLIN-2994 > URL: https://issues.apache.org/jira/browse/KYLIN-2994 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Attachments: KYLIN-2994.patch > > > Currently, the argument {{resourcePath}} in > {{DictionaryManager.getDictionaryInfo}} could be NULL -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276755#comment-16276755 ] kangkaisen commented on KYLIN-2604: --- Hi, Zhixiong. If you don't have any other questions or advice, I will merge KYLIN-2602-Non-Int-type-precise-count-distinct-measure-must-set-advanced-dict.patch to master. > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.2.0 > > Attachments: > KYLIN-2602-Non-Int-type-precise-count-distinct-measure-must-set-advanced-dict.patch, > KYLIN-2604.patch > > > We should use global dict as the default encoding for precise distinct count > in web, which is easier for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2604) Use global dict as the default encoding for precise distinct count in web
[ https://issues.apache.org/jira/browse/KYLIN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16277914#comment-16277914 ] kangkaisen commented on KYLIN-2604: --- OK, thanks Zhixiong. I have merged the second patch to master. > Use global dict as the default encoding for precise distinct count in web > - > > Key: KYLIN-2604 > URL: https://issues.apache.org/jira/browse/KYLIN-2604 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.2.0 > > Attachments: > KYLIN-2602-Non-Int-type-precise-count-distinct-measure-must-set-advanced-dict.patch, > KYLIN-2604.patch > > > We should use global dict as the default encoding for precise distinct count > in web, which is easier for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2995) Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing
[ https://issues.apache.org/jira/browse/KYLIN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2995: -- Attachment: KYLIN-2995.patch This is the patch. This patch has passed IT. > Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing > - > > Key: KYLIN-2995 > URL: https://issues.apache.org/jira/browse/KYLIN-2995 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2995.patch > > > Currently, we load metadata from HDFS in > SparkCubing:{{AbstractHadoopJob.loadKylinConfigFromHdfs}}, but HadoopUtil > will use a new Configuration; we should use SparkContext.hadoopConfiguration. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-2999: -- Attachment: KYLIN-2999.patch This is the patch > One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2999.patch > > > Currently, cube migration must be done by the Kylin Admin, which wastes > a lot of the Kylin Admin's time. So we should allow users to migrate a cube with one > click in the web UI. Of course, this is configurable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3055) NullPointerException in MutableRoaringBitmap.or
[ https://issues.apache.org/jira/browse/KYLIN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280098#comment-16280098 ] kangkaisen commented on KYLIN-3055: --- Hi Chuqian, This is the commit: https://github.com/apache/kylin/commit/9265e150d80519d3e4f532c5f106e6718543daba. Thanks you. Welcome more contributions! > NullPointerException in MutableRoaringBitmap.or > --- > > Key: KYLIN-3055 > URL: https://issues.apache.org/jira/browse/KYLIN-3055 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: yuchuqian >Assignee: yuchuqian > Fix For: v2.3.0 > > Attachments: KYLIN-3055.patch > > > 2017-11-21 19:55:17,363 ERROR [Query > b1fbcd45-6524-4b1e-8844-1d6d6277a1bf-120] service.QueryService:459 : > Exception while executing query > java.sql.SQLException: Error while executing SQL "select part_dt, > intersect_count(item_count, part_dt, array[date'2012-01-01']) as first_day, > intersect_count(item_count, part_dt, array[date'2012-01-02']) as second_day, > intersect_count(item_count, part_dt, array[date'2012-01-03']) as third_day, > intersect_count(item_count, part_dt, > array[date'2012-01-01',date'2012-01-02']) as retention_oneday, > intersect_count(item_count, part_dt, > array[date'2012-01-01',date'2012-01-02',date'2012-01-03']) as retention_twoday > from kylin_sales > where part_dt in (date'2012-01-01',date'2012-01-02',date'2012-01-03') > group by PART_DT > LIMIT 5": null > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > at > org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:218) > at org.apache.kylin.rest.service.QueryService.execute(QueryService.java:834) > at > org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:561) > at 
org.apache.kylin.rest.service.QueryService.query(QueryService.java:181) > at > org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:415) > at > org.apache.kylin.rest.controller.QueryController.query(QueryController.java:78) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > .. > Caused by: java.lang.NullPointerException > at > org.roaringbitmap.buffer.MutableRoaringBitmap.or(MutableRoaringBitmap.java:1041) > at > org.apache.kylin.measure.bitmap.RoaringBitmapCounter.orWith(RoaringBitmapCounter.java:72) > at > org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc$RetentionPartialResult.add(BitmapIntersectDistinctCountAggFunc.java:57) > at > org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc.add(BitmapIntersectDistinctCountAggFunc.java:90) > at Baz$4.apply(ANONYMOUS.java:136) > at Baz$4.apply(ANONYMOUS.java:158) > at Baz$4.apply(ANONYMOUS.java) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:832) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Baz.java:99) > How to re-produce: > 1. run $KYLIN_HOME/bin/sample.sh > 2. 
then create a cube like > { > "uuid": "9554f6f6-74dc-489e-b780-2f48f281576c", > "last_modified": 1511247707372, > "version": "2.2.0.0", > "name": "test", > "is_draft": false, > "model_name": "kylin_sales_model", > "description": "", > "null_string": null, > "dimensions": [ > { > "name": "PART_DT", > "table": "KYLIN_SALES", > "column": "PART_DT", > "derived": null > }, > { > "name": "LEAF_CATEG_ID", > "table": "KYLIN_SALES", > "column": "LEAF_CATEG_ID", > "derived": null > }, > { > "name": "LSTG_SITE_ID", > "table": "KYLIN_SALES", > "column": "LSTG_SITE_ID", > "derived": null > }, > { > "name": "CAL_DT", > "table": "KYLIN_CAL_DT", > "column": null, > "derived": [ > "CAL_DT" > ] > }, > { > "name": "LEAF_CATEG_ID", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "LEAF_CATEG_ID" > ] > }, > { > "name": "USER_DEFINED_FIELD1", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "USER_DEFINED_FIELD1" > ]
[jira] [Resolved] (KYLIN-3055) NullPointerException in MutableRoaringBitmap.or
[ https://issues.apache.org/jira/browse/KYLIN-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-3055. --- Resolution: Fixed > NullPointerException in MutableRoaringBitmap.or > --- > > Key: KYLIN-3055 > URL: https://issues.apache.org/jira/browse/KYLIN-3055 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: yuchuqian >Assignee: yuchuqian > Fix For: v2.3.0 > > Attachments: KYLIN-3055.patch > > > 2017-11-21 19:55:17,363 ERROR [Query > b1fbcd45-6524-4b1e-8844-1d6d6277a1bf-120] service.QueryService:459 : > Exception while executing query > java.sql.SQLException: Error while executing SQL "select part_dt, > intersect_count(item_count, part_dt, array[date'2012-01-01']) as first_day, > intersect_count(item_count, part_dt, array[date'2012-01-02']) as second_day, > intersect_count(item_count, part_dt, array[date'2012-01-03']) as third_day, > intersect_count(item_count, part_dt, > array[date'2012-01-01',date'2012-01-02']) as retention_oneday, > intersect_count(item_count, part_dt, > array[date'2012-01-01',date'2012-01-02',date'2012-01-03']) as retention_twoday > from kylin_sales > where part_dt in (date'2012-01-01',date'2012-01-02',date'2012-01-03') > group by PART_DT > LIMIT 5": null > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > at > org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:218) > at org.apache.kylin.rest.service.QueryService.execute(QueryService.java:834) > at > org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:561) > at org.apache.kylin.rest.service.QueryService.query(QueryService.java:181) > at > org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:415) > at > 
org.apache.kylin.rest.controller.QueryController.query(QueryController.java:78) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) > .. > Caused by: java.lang.NullPointerException > at > org.roaringbitmap.buffer.MutableRoaringBitmap.or(MutableRoaringBitmap.java:1041) > at > org.apache.kylin.measure.bitmap.RoaringBitmapCounter.orWith(RoaringBitmapCounter.java:72) > at > org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc$RetentionPartialResult.add(BitmapIntersectDistinctCountAggFunc.java:57) > at > org.apache.kylin.measure.bitmap.BitmapIntersectDistinctCountAggFunc.add(BitmapIntersectDistinctCountAggFunc.java:90) > at Baz$4.apply(ANONYMOUS.java:136) > at Baz$4.apply(ANONYMOUS.java:158) > at Baz$4.apply(ANONYMOUS.java) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:832) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Baz.java:99) > How to re-produce: > 1. run $KYLIN_HOME/bin/sample.sh > 2. 
then create a cube like > { > "uuid": "9554f6f6-74dc-489e-b780-2f48f281576c", > "last_modified": 1511247707372, > "version": "2.2.0.0", > "name": "test", > "is_draft": false, > "model_name": "kylin_sales_model", > "description": "", > "null_string": null, > "dimensions": [ > { > "name": "PART_DT", > "table": "KYLIN_SALES", > "column": "PART_DT", > "derived": null > }, > { > "name": "LEAF_CATEG_ID", > "table": "KYLIN_SALES", > "column": "LEAF_CATEG_ID", > "derived": null > }, > { > "name": "LSTG_SITE_ID", > "table": "KYLIN_SALES", > "column": "LSTG_SITE_ID", > "derived": null > }, > { > "name": "CAL_DT", > "table": "KYLIN_CAL_DT", > "column": null, > "derived": [ > "CAL_DT" > ] > }, > { > "name": "LEAF_CATEG_ID", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "LEAF_CATEG_ID" > ] > }, > { > "name": "USER_DEFINED_FIELD1", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "USER_DEFINED_FIELD1" > ] > }, > { > "name": "USER_DEFINED_FIELD3", > "table": "KYLIN_CATEGORY_GROUPINGS", > "column": null, > "derived": [ > "USER_DEFINED_FIELD3" > ]
[jira] [Commented] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280117#comment-16280117 ] kangkaisen commented on KYLIN-2999: --- This feature has the following configuration properties:
||Property||Description||Default value||isRequired||
|kylin.tool.auto-migrate-cube.enabled|Whether to enable this feature|false|true|
|kylin.tool.auto-migrate-cube.src-config|The kylin.properties file path for the source server|""|true|
|kylin.tool.auto-migrate-cube.dest-config|The kylin.properties file path for the target server|""|true|
|kylin.tool.auto-migrate-cube.copy-acl|Whether to copy the cube ACL to the target server|true|false|
|kylin.tool.auto-migrate-cube.purge-src-cube|Whether to purge the cube from the source server after the migration|true|false|
> One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2999.patch > > > Currently, cube migration must be done by the Kylin Admin, which wastes > a lot of the Kylin Admin's time. So we should allow users to migrate a cube with one > click in the web UI. Of course, this is configurable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
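As a rough illustration of how the properties listed in the comment above could be read with their stated defaults, here is a minimal sketch using only java.util.Properties. The class name AutoMigrateCubeConfig and the rule that the two paths are checked only when the feature is enabled are assumptions made for the example; the patch itself may handle this differently.

```java
import java.util.Properties;

/** Hypothetical holder for the auto-migrate settings; not taken from KYLIN-2999.patch. */
final class AutoMigrateCubeConfig {
    static final String PREFIX = "kylin.tool.auto-migrate-cube.";

    final boolean enabled;
    final String srcConfig;
    final String destConfig;
    final boolean copyAcl;
    final boolean purgeSrcCube;

    AutoMigrateCubeConfig(Properties props) {
        // Defaults follow the "Default value" column of the comment.
        enabled = Boolean.parseBoolean(props.getProperty(PREFIX + "enabled", "false").trim());
        srcConfig = props.getProperty(PREFIX + "src-config", "").trim();
        destConfig = props.getProperty(PREFIX + "dest-config", "").trim();
        copyAcl = Boolean.parseBoolean(props.getProperty(PREFIX + "copy-acl", "true").trim());
        purgeSrcCube = Boolean.parseBoolean(props.getProperty(PREFIX + "purge-src-cube", "true").trim());
        // Assumption: the required paths only matter once the feature is switched on.
        if (enabled && (srcConfig.isEmpty() || destConfig.isEmpty())) {
            throw new IllegalArgumentException(
                    "src-config and dest-config are required when auto-migrate is enabled");
        }
    }
}
```

Failing fast on a missing path keeps a half-configured server from offering a migrate button that cannot work.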
[jira] [Commented] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280123#comment-16280123 ] kangkaisen commented on KYLIN-2999: --- The Kylin Admin could enable this feature project by project (or even at cube level) according to their users' familiarity. > One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2999.patch > > > Currently, cube migration must be done by the Kylin Admin, which wastes > a lot of the Kylin Admin's time. So we should allow users to migrate a cube with one > click in the web UI. Of course, this is configurable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2995) Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing
[ https://issues.apache.org/jira/browse/KYLIN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281254#comment-16281254 ] kangkaisen commented on KYLIN-2995: --- This is not about performance; it's a bug. Consider the method {{bindCurrentConfiguration}} in {{KylinMapper}} and {{KylinReducer}}: every MR job must call this method first, because we must ensure that we use {{context.getConfiguration()}} for HDFS, not the default Configuration. The same applies to Spark. For example, suppose the following config exists in the Kylin server's mountTable.xml but not in the DataNodes' mountTable.xml. When the Kylin Spark job visits hdfs:///kylin, a {{FileNotFoundException}} will be thrown. {code:java} fs.viewfs.mounttable..link./kylin hdfs:///kylin {code} > Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing > -- > > Key: KYLIN-2995 > URL: https://issues.apache.org/jira/browse/KYLIN-2995 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2995.patch > > > Currently, we load metadata from HDFS in > SparkCubing:{{AbstractHadoopJob.loadKylinConfigFromHdfs}}, but HadoopUtil > will use a new Configuration; we should use SparkContext.hadoopConfiguration. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
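The pattern described in the comment above (the job binds its own configuration before any file-system access, so later lookups do not fall back to a freshly created default) can be sketched with a thread-local holder. This is only an illustration under stated assumptions: ConfHolderSketch is a hypothetical stand-in, a plain Map replaces Hadoop's Configuration so the example has no dependencies, and the key example.mount.link is made up.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for a Hadoop-style configuration helper; NOT Kylin's real HadoopUtil. */
final class ConfHolderSketch {
    // Configuration bound to the current thread by the running job, if any.
    private static final ThreadLocal<Map<String, String>> CURRENT = new ThreadLocal<>();

    /** A job (MR task or Spark driver/executor code) calls this first with its own configuration. */
    static void bindCurrentConfiguration(Map<String, String> jobConf) {
        CURRENT.set(jobConf);
    }

    /** Returns the bound configuration, or a fresh empty default when nothing was bound. */
    static Map<String, String> getCurrentConfiguration() {
        Map<String, String> conf = CURRENT.get();
        return conf != null ? conf : new HashMap<>();
    }

    /** Removes the binding, for example at the end of a task. */
    static void clear() {
        CURRENT.remove();
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("example.mount.link", "/kylin"); // hypothetical key, only present in the job's conf

        // Without binding, the lookup falls back to the default and misses the entry.
        System.out.println(getCurrentConfiguration().containsKey("example.mount.link"));

        // After binding, the job's own configuration is what file-system code would see.
        bindCurrentConfiguration(jobConf);
        System.out.println(getCurrentConfiguration().containsKey("example.mount.link"));
        clear();
    }
}
```

In the Spark case from the issue, the value to bind would be the one obtained from SparkContext.hadoopConfiguration rather than a new Configuration.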
[jira] [Commented] (KYLIN-3087) DistributedLock in GlobalDictionaryBuilder may not release
[ https://issues.apache.org/jira/browse/KYLIN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283535#comment-16283535 ] kangkaisen commented on KYLIN-3087: --- Hi Fangyuan, thank you. This patch looks good to me, but it doesn't have your author info. Please re-submit a new patch with your author info by following the guide here: https://kylin.apache.org/development/howto_contribute.html. I will then merge your patch to the master branch. Thank you. > DistributedLock in GlobalDictionaryBuilder may not release > -- > > Key: KYLIN-3087 > URL: https://issues.apache.org/jira/browse/KYLIN-3087 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.2.0 >Reporter: Fangyuan Deng >Assignee: Fangyuan Deng > Attachments: KYLIN-3087.patch > > > In GlobalDictionaryBuilder.init(), > this.builder = new AppendTrieDictionaryBuilder(baseDir, maxEntriesPerSlice, > true); > if this line throws an exception, the DistributedLock will not be released, and other > jobs cannot run. > So I added a try/catch. > try { > this.builder = new AppendTrieDictionaryBuilder(baseDir, > maxEntriesPerSlice, true); > } catch (Throwable e) { > lock.unlock(getLockPath(sourceColumn)); > throw new RuntimeException(String.format("Failed to create global > dictionary on %s ", sourceColumn), e); > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
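The fix quoted in the issue follows a common pattern: a lock is taken during initialization and normally released later, so it must be released explicitly if initialization itself fails. A minimal self-contained sketch of that pattern follows. It uses java.util.concurrent's ReentrantLock as a stand-in for Kylin's DistributedLock, and the names LockReleaseSketch and initUnderLock are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Sketch of "unlock if init fails"; ReentrantLock stands in for a distributed lock. */
final class LockReleaseSketch {
    /**
     * Acquires the lock and creates the builder. On success the lock stays held
     * (the caller releases it when the build finishes); on failure it is released
     * here so that other jobs are not blocked.
     */
    static <T> T initUnderLock(ReentrantLock lock, String sourceColumn, Supplier<T> builderFactory) {
        lock.lock();
        try {
            return builderFactory.get();
        } catch (Throwable e) {
            lock.unlock();
            throw new RuntimeException(
                    String.format("Failed to create global dictionary on %s", sourceColumn), e);
        }
    }
}
```

A try/finally would not fit here, because on the success path the lock is meant to remain held until the dictionary build completes.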
[jira] [Commented] (KYLIN-2997) Allow change engineType even if there are segments in cube
[ https://issues.apache.org/jira/browse/KYLIN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287299#comment-16287299 ] kangkaisen commented on KYLIN-2997: --- I think this patch won't. For the {{checkSignature}} method: {code:java} if (!kylinVersion.isCompatibleWith(cubeVersion)) { logger.info("checkSignature on {} is skipped as the its version {} is different from kylin version {}", getName(), cubeVersion, kylinVersion); return true; } {code} For the {{consistentWith}} method, {{calculateSignature}} won't include engineType any more. > Allow change engineType even if there are segments in cube > -- > > Key: KYLIN-2997 > URL: https://issues.apache.org/jira/browse/KYLIN-2997 > Project: Kylin > Issue Type: Bug > Components: Metadata, Web >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-2997.patch > > > Currently, the cube signature contains engineType, so if users want to switch > engine, they must purge the cube first. I think this is unreasonable > because the engine doesn't affect queries or existing segments. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3089) Query exception on SortedIteratorMergerWithLimit
[ https://issues.apache.org/jira/browse/KYLIN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288656#comment-16288656 ] kangkaisen commented on KYLIN-3089: --- Hi, Yang Hao. Thank you for reporting this issue. I think the root cause is that, for fixed-length strings, the Comparator in SortMergedPartitionResultIterator is different from the Comparator in SortedIteratorMergerWithLimit. I think we could fix this bug by disabling limit push-down for fixed-length strings. Please go ahead. Thank you. > Query exception on SortedIteratorMergerWithLimit > > > Key: KYLIN-3089 > URL: https://issues.apache.org/jira/browse/KYLIN-3089 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.1.0 >Reporter: Yang Hao > > The execution error occurs only in some special cases. I have a simple SQL query, > and it is routed to SortedIteratorMergerWithLimit. When iterating over > the data, it triggers this error: > {code:java} >//TODO: remove this check when validated > if (last != null) { > if (comparator.compare(last, fetched) > 0) > throw new IllegalStateException("Not sorted! last: " + > last + " fetched: " + fetched); > } > {code} > The SQL is as follows. > {code:java} > select "DATE",appid,dim_1,dim_2, sum(uv) as uv > from table_1 > where appid = and "DATE" = 2017 > group by "DATE",appid,dim_1,dim_2 > limit 5 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
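To see why two disagreeing comparators can trigger the "Not sorted!" check quoted in the issue, consider a generic k-way merge with a limit. The sketch below is not Kylin's SortedIteratorMergerWithLimit; the class MergeWithLimitSketch is hypothetical and only reproduces the sortedness check from the issue. It assumes each input is already sorted by the same comparator the merger uses, and fails once an input was sorted by a different one.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Generic k-way merge with limit and a sortedness check; not Kylin's implementation. */
final class MergeWithLimitSketch {
    static <T> List<T> merge(List<List<T>> sortedInputs, Comparator<T> comparator, int limit) {
        // Each heap entry pairs the current head of an input with the rest of that input.
        PriorityQueue<Map.Entry<T, Iterator<T>>> heap =
                new PriorityQueue<>((a, b) -> comparator.compare(a.getKey(), b.getKey()));
        for (List<T> input : sortedInputs) {
            Iterator<T> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new AbstractMap.SimpleEntry<>(it.next(), it));
            }
        }
        List<T> out = new ArrayList<>();
        T last = null;
        while (!heap.isEmpty() && out.size() < limit) {
            Map.Entry<T, Iterator<T>> entry = heap.poll();
            T fetched = entry.getKey();
            // Same check as in the issue: output must be non-decreasing under the merger's comparator.
            if (last != null && comparator.compare(last, fetched) > 0) {
                throw new IllegalStateException("Not sorted! last: " + last + " fetched: " + fetched);
            }
            out.add(fetched);
            last = fetched;
            Iterator<T> rest = entry.getValue();
            if (rest.hasNext()) {
                heap.add(new AbstractMap.SimpleEntry<>(rest.next(), rest));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Comparator<String> numeric = Comparator.comparingInt(Integer::parseInt);
        // "10" < "9" lexicographically, so this input is sorted as text but not numerically.
        List<List<String>> inputs = Arrays.asList(Arrays.asList("10", "9"), Arrays.asList("5"));
        System.out.println(merge(inputs, numeric, 2)); // the limit is reached before the bad element
        try {
            merge(inputs, numeric, 5);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The small limit succeeding while the larger one fails matches the report that the error appears only in some special cases.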
[jira] [Commented] (KYLIN-2996) DeployCoprocessorCLI Log failed tables info
[ https://issues.apache.org/jira/browse/KYLIN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290467#comment-16290467 ] kangkaisen commented on KYLIN-2996: --- OK, thanks Liyang! > DeployCoprocessorCLI Log failed tables info > --- > > Key: KYLIN-2996 > URL: https://issues.apache.org/jira/browse/KYLIN-2996 > Project: Kylin > Issue Type: Improvement > Components: Storage - HBase >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Trivial > Fix For: v2.3.0 > > Attachments: KYLIN-2996.patch > > > Currently, updating the coprocessor occasionally fails for some tables; we should report > the failed tables to the user in the final output. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2994) Handle NPE when load dict in DictionaryManager
[ https://issues.apache.org/jira/browse/KYLIN-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290469#comment-16290469 ] kangkaisen commented on KYLIN-2994: --- OK, Thanks Liyang! > Handle NPE when load dict in DictionaryManager > -- > > Key: KYLIN-2994 > URL: https://issues.apache.org/jira/browse/KYLIN-2994 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-2994.patch > > > Currently, the argument {{resourcePath}} in > {{DictionaryManager.getDictionaryInfo}} could be NULL -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2993) Add special mr config for base cuboid step
[ https://issues.apache.org/jira/browse/KYLIN-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290471#comment-16290471 ] kangkaisen commented on KYLIN-2993: --- OK, thanks Liyang! > Add special mr config for base cuboid step > -- > > Key: KYLIN-2993 > URL: https://issues.apache.org/jira/browse/KYLIN-2993 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Affects Versions: v2.1.0 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v2.3.0 > > Attachments: KYLIN-2993.patch > > > According to http://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/, > if users want to enlarge MR memory for the global dict, they currently must use > kylin.engine.mr.config-override., which enlarges the memory of all MR > jobs. In fact, we only need to enlarge the memory for the "Build Base Cuboid" step, so > we could add a special MR config for the base cuboid step. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-3091) A problem about retention rate analyze
[ https://issues.apache.org/jira/browse/KYLIN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen reassigned KYLIN-3091: - Assignee: kangkaisen (was: Yerui Sun) > A problem about retention rate analyze > -- > > Key: KYLIN-3091 > URL: https://issues.apache.org/jira/browse/KYLIN-3091 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 > Environment: hbase 0.98.8-hadoop2 >Reporter: WangSheng >Assignee: kangkaisen > > I found that Kylin supports a retention rate analysis function, so I ran some > tests on it. The following SQL executed successfully: > {code:java} > select city, version, > intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday, > intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as > retention_twoday > from visit_log > where dt in ('2016104', '20161015', '20161016') > group by city, version > {code} > but other SQL statements like these failed: > {code:java} > select city, > intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday > from visit_log > where dt in ('2016104', '20161015',) > group by city, version > select city, version, > intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as > retention_twoday > from visit_log > where dt in ('2016104', '20161015', '20161016') > group by city, version > {code} > which means I cannot use just one intersect_count UDAF in a SQL statement; at least two > intersect_count calls are needed. 
My kylin version is kylin 2.0.0-hbase 0.98.8, and here is > the error log: > {code:java} > Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.kylin.query.relnode.ColumnRowType.getColumnByIndex(ColumnRowType.java:49) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.fillbackOptimizedColumn(OLAPAggregateRel.java:396) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:347) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:283) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:107) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:100) > at > org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108) > at > org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92) > at > org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1248) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:306) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:203) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:776) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214) > at > 
org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:595) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
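For readers unfamiliar with intersect_count, the retention semantics used in the queries above can be sketched with plain bitmaps: the result counts the distinct ids that appear under every listed filter value. The sketch below uses java.util.BitSet instead of the RoaringBitmap classes Kylin uses, assumes the ids are already dictionary-encoded as small integers, and the class name IntersectCountSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Bitmap-intersection sketch of retention counting; not Kylin's implementation. */
final class IntersectCountSketch {
    /**
     * bitmapsByDay maps a day to the bitmap of encoded user ids seen on that day.
     * Returns how many ids are present on every one of the requested days.
     */
    static int intersectCount(Map<String, BitSet> bitmapsByDay, List<String> days) {
        BitSet result = null;
        for (String day : days) {
            BitSet bitmap = bitmapsByDay.get(day);
            if (bitmap == null) {
                return 0; // no rows for that day, so the intersection is empty
            }
            if (result == null) {
                result = (BitSet) bitmap.clone(); // copy so the stored bitmap is not modified
            } else {
                result.and(bitmap);
            }
        }
        return result == null ? 0 : result.cardinality();
    }

    private static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        Map<String, BitSet> byDay = new HashMap<>();
        byDay.put("20161014", bits(1, 2, 3));
        byDay.put("20161015", bits(2, 3, 4));
        byDay.put("20161016", bits(3));
        // Users seen on both the first and the second day (one-day retention).
        System.out.println(intersectCount(byDay, Arrays.asList("20161014", "20161015")));
        // Users seen on all three days (two-day retention).
        System.out.println(intersectCount(byDay, Arrays.asList("20161014", "20161015", "20161016")));
    }
}
```

Treating a missing day as an empty set, rather than passing a null bitmap into the merge, is a design choice of this sketch only.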
[jira] [Commented] (KYLIN-3087) DistributedLock in GlobalDictionaryBuilder may not release
[ https://issues.apache.org/jira/browse/KYLIN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292142#comment-16292142 ] kangkaisen commented on KYLIN-3087: --- This is the commit: https://github.com/apache/kylin/commit/60431f46494aaa1297d8da87bbf49bc78312fcb4. Thanks Fangyuan. > DistributedLock in GlobalDictionaryBuilder may not release > -- > > Key: KYLIN-3087 > URL: https://issues.apache.org/jira/browse/KYLIN-3087 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.2.0 >Reporter: Fangyuan Deng >Assignee: Fangyuan Deng > Fix For: v2.3.0 > > Attachments: KYLIN-3087.1.patch, KYLIN-3087.patch > > > In GlobalDictionaryBuilder.init(), > this.builder = new AppendTrieDictionaryBuilder(baseDir, maxEntriesPerSlice, > true); > if this line has exception, the DistributedLock will not release, and other > jobs can not run. > so,I added a try catch. > try { > this.builder = new AppendTrieDictionaryBuilder(baseDir, > maxEntriesPerSlice, true); > } catch (Throwable e) { > lock.unlock(getLockPath(sourceColumn)); > throw new RuntimeException(String.format("Failed to create global > dictionary on %s ", sourceColumn), e); > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KYLIN-3087) DistributedLock in GlobalDictionaryBuilder may not release
[ https://issues.apache.org/jira/browse/KYLIN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-3087. --- Resolution: Fixed > DistributedLock in GlobalDictionaryBuilder may not release > -- > > Key: KYLIN-3087 > URL: https://issues.apache.org/jira/browse/KYLIN-3087 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.2.0 >Reporter: Fangyuan Deng >Assignee: Fangyuan Deng > Fix For: v2.3.0 > > Attachments: KYLIN-3087.1.patch, KYLIN-3087.patch > > > In GlobalDictionaryBuilder.init(), > this.builder = new AppendTrieDictionaryBuilder(baseDir, maxEntriesPerSlice, > true); > if this line has exception, the DistributedLock will not release, and other > jobs can not run. > so,I added a try catch. > try { > this.builder = new AppendTrieDictionaryBuilder(baseDir, > maxEntriesPerSlice, true); > } catch (Throwable e) { > lock.unlock(getLockPath(sourceColumn)); > throw new RuntimeException(String.format("Failed to create global > dictionary on %s ", sourceColumn), e); > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-3113) Editing Measure supports fuzzy search in web
kangkaisen created KYLIN-3113: - Summary: Editing Measure supports fuzzy search in web Key: KYLIN-3113 URL: https://issues.apache.org/jira/browse/KYLIN-3113 Project: Kylin Issue Type: Improvement Components: Web Affects Versions: v2.2.0 Reporter: kangkaisen Assignee: kangkaisen After Kylin 2.0, a column in the web UI contains both the table name and the column name, so prefix search is useless, which is a bad user experience. So we should support fuzzy search when editing a measure. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3113) Editing Measure supports fuzzy search in web
[ https://issues.apache.org/jira/browse/KYLIN-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen updated KYLIN-3113: -- Attachment: KYLIN-3113.patch This is the patch. > Editing Measure supports fuzzy search in web > > > Key: KYLIN-3113 > URL: https://issues.apache.org/jira/browse/KYLIN-3113 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v2.2.0 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: KYLIN-3113.patch > > > After Kylin 2.0, a column in the web UI contains both the table name and the column name, so > prefix search is useless, which is a bad user experience. So we should > support fuzzy search when editing a measure. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3091) A problem about retention rate analyze
[ https://issues.apache.org/jira/browse/KYLIN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292431#comment-16292431 ] kangkaisen commented on KYLIN-3091: --- This commit has fixed this bug: https://github.com/apache/kylin/commit/6b4f70d257e1eb363a7b792cde8f6f59821094a6 I added a test case for this bug: https://github.com/apache/kylin/commit/f0e5e376d6466891873514f76c7b34c73c0ea28f > A problem about retention rate analyze > -- > > Key: KYLIN-3091 > URL: https://issues.apache.org/jira/browse/KYLIN-3091 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 > Environment: hbase 0.98.8-hadoop2 >Reporter: WangSheng >Assignee: kangkaisen > > I found that Kylin supports a retention rate analysis function, so I ran some > tests of it. The following SQL executed successfully: > {code:java} > select city, version, > intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday, > intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday > from visit_log > where dt in ('2016104', '20161015', '20161016') > group by city, version > {code} > But other SQL statements failed, like these: > {code:java} > select city, > intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday > from visit_log > where dt in ('2016104', '20161015',) > group by city, version > select city, version, > intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday > from visit_log > where dt in ('2016104', '20161015', '20161016') > group by city, version > {code} > This means I cannot use just one intersect_count UDAF in a SQL statement; at least two > intersect_count calls are needed. 
My kylin version is kylin 2.0.0-hbase 0.98.8, and here is > the error log: > {code:java} > Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.kylin.query.relnode.ColumnRowType.getColumnByIndex(ColumnRowType.java:49) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.fillbackOptimizedColumn(OLAPAggregateRel.java:396) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:347) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:283) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:107) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:100) > at > org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108) > at > org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92) > at > org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1248) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:306) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:203) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:776) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214) > at > 
org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:595) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148) > {code}
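For readers unfamiliar with the function in the report: as I understand it, intersect_count(uuid, dt, array[...]) returns the number of distinct uuid values that appear under every one of the listed dt values, which is what makes it usable for retention analysis. The sketch below only models that meaning with plain Java sets; Kylin itself works on bitmap measures, and RetentionSketch, intersectCount and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RetentionSketch {

    // Counts the distinct ids present under every one of the given dates.
    static int intersectCount(Map<String, Set<String>> idsByDate, List<String> dates) {
        Set<String> result = null;
        for (String date : dates) {
            Set<String> ids = idsByDate.getOrDefault(date, Collections.<String>emptySet());
            if (result == null) {
                result = new HashSet<>(ids);   // start from the first date's ids
            } else {
                result.retainAll(ids);         // keep only ids also seen on this date
            }
        }
        return result == null ? 0 : result.size();
    }

    public static void main(String[] args) {
        // Made-up visit data keyed by dt.
        Map<String, Set<String>> idsByDate = new HashMap<>();
        idsByDate.put("20161014", new HashSet<>(Arrays.asList("a", "b", "c")));
        idsByDate.put("20161015", new HashSet<>(Arrays.asList("b", "c", "d")));
        idsByDate.put("20161016", new HashSet<>(Arrays.asList("c")));

        // One-day retention: users seen on both the 14th and the 15th.
        System.out.println(intersectCount(idsByDate, Arrays.asList("20161014", "20161015")));
        // Two-day retention: users seen on all three days.
        System.out.println(intersectCount(idsByDate, Arrays.asList("20161014", "20161015", "20161016")));
    }
}
```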
[jira] [Resolved] (KYLIN-3091) A problem about retention rate analyze
[ https://issues.apache.org/jira/browse/KYLIN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kangkaisen resolved KYLIN-3091. --- Resolution: Resolved Fix Version/s: v2.2.0 > A problem about retention rate analyze > -- > > Key: KYLIN-3091 > URL: https://issues.apache.org/jira/browse/KYLIN-3091 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v2.0.0 > Environment: hbase 0.98.8-hadoop2 >Reporter: WangSheng >Assignee: kangkaisen > Fix For: v2.2.0 > > > I found that Kylin supports a retention rate analysis function, so I ran some > tests of it. The following SQL executed successfully: > {code:java} > select city, version, > intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday, > intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday > from visit_log > where dt in ('2016104', '20161015', '20161016') > group by city, version > {code} > But other SQL statements failed, like these: > {code:java} > select city, > intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday > from visit_log > where dt in ('2016104', '20161015',) > group by city, version > select city, version, > intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday > from visit_log > where dt in ('2016104', '20161015', '20161016') > group by city, version > {code} > This means I cannot use just one intersect_count UDAF in a SQL statement; at least two > intersect_count calls are needed. 
My kylin version is kylin 2.0.0-hbase 0.98.8, and here is > the error log: > {code:java} > Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.kylin.query.relnode.ColumnRowType.getColumnByIndex(ColumnRowType.java:49) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.fillbackOptimizedColumn(OLAPAggregateRel.java:396) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:347) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:283) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:107) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:100) > at > org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108) > at > org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92) > at > org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1248) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:306) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:203) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:776) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214) > at > 
org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:595) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148) > {code}