[jira] [Commented] (KYLIN-3271) Optimize sub-path check of ResourceTool
[ https://issues.apache.org/jira/browse/KYLIN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394406#comment-16394406 ]

ASF subversion and git services commented on KYLIN-3271:

Commit 24d2d6590d1e31edb5f5b7da11c51c477f9d2dcf in kylin's branch refs/heads/sync from [~nichunen]
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=24d2d65 ]
KYLIN-3271 Optimize sub-path check of ResourceTool

> Optimize sub-path check of ResourceTool
> ---
>
> Key: KYLIN-3271
> URL: https://issues.apache.org/jira/browse/KYLIN-3271
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v2.2.0
> Reporter: nichunen
> Assignee: nichunen
> Priority: Minor
> Fix For: v2.4.0
>
> Kylin uses the class org.apache.kylin.common.persistence.ResourceTool to
> download, upload, and remove metadata, among other operations. The algorithm
> for resource traversal is not very efficient. For instance, for an
> "execute_output" entry with key "/execute_output/{uuid}", the algorithm tries
> to check whether it is a folder with sub-resources. This adds unnecessary time
> cost, and for metadata with many jobs the operation can take a long time to
> finish.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
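The cost described in the issue can be illustrated with a minimal sketch. It is not the code from the commit: the class name, the in-memory map standing in for the metadata store, and the rule "anything directly under /execute_output is a plain resource" are all assumptions made for illustration. It counts how many "is this a folder?" listings a recursive traversal issues with and without that shortcut.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubPathCheckSketch {
    // Assumption: entries under this folder never have sub-resources.
    static final String LEAF_ONLY_FOLDER = "/execute_output";

    // Hypothetical in-memory store: folder path -> direct children.
    final Map<String, List<String>> folders = new HashMap<>();
    // Number of listing calls issued, i.e. the cost we want to reduce.
    int listCalls = 0;

    // Returns the children of a folder, or null if the path is a plain resource.
    List<String> listChildren(String path) {
        listCalls++;
        return folders.get(path);
    }

    static boolean isKnownLeaf(String path) {
        return path.startsWith(LEAF_ONLY_FOLDER + "/");
    }

    // Collects all plain resources under path, optionally skipping the
    // folder check for paths that are known to be leaves.
    List<String> collect(String path, boolean skipKnownLeaves) {
        List<String> out = new ArrayList<>();
        collectInto(path, skipKnownLeaves, out);
        return out;
    }

    private void collectInto(String path, boolean skipKnownLeaves, List<String> out) {
        if (skipKnownLeaves && isKnownLeaf(path)) {
            out.add(path); // no listing call needed
            return;
        }
        List<String> children = listChildren(path);
        if (children == null) {
            out.add(path);
            return;
        }
        for (String child : children) {
            collectInto(child, skipKnownLeaves, out);
        }
    }

    // Builds a tiny sample store with two job outputs and one cube resource.
    static SubPathCheckSketch sample() {
        SubPathCheckSketch s = new SubPathCheckSketch();
        s.folders.put("/", Arrays.asList("/execute_output", "/cube"));
        s.folders.put("/execute_output",
                Arrays.asList("/execute_output/uuid-a", "/execute_output/uuid-b"));
        s.folders.put("/cube", Arrays.asList("/cube/c1.json"));
        return s;
    }

    public static void main(String[] args) {
        SubPathCheckSketch naive = sample();
        naive.collect("/", false);
        SubPathCheckSketch optimized = sample();
        optimized.collect("/", true);
        System.out.println("listing calls, naive: " + naive.listCalls);
        System.out.println("listing calls, optimized: " + optimized.listCalls);
    }
}
```

In this sketch the saving is one listing call per job output, so the gap between the two traversals grows linearly with the number of jobs in the metadata.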
[jira] [Commented] (KYLIN-3174) Default scheduler enhancement
[ https://issues.apache.org/jira/browse/KYLIN-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394409#comment-16394409 ]

ASF subversion and git services commented on KYLIN-3174:

Commit 06f497eb603134a6b0ddbfd88739ccaddab007fb in kylin's branch refs/heads/sync from [~Aron.tao]
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=06f497e ]
KYLIN-3174, Default scheduler enhancement.

> Default scheduler enhancement
> -
>
> Key: KYLIN-3174
> URL: https://issues.apache.org/jira/browse/KYLIN-3174
> Project: Kylin
> Issue Type: Improvement
> Reporter: jiatao.tao
> Assignee: jiatao.tao
> Priority: Minor
[jira] [Commented] (KYLIN-3271) Optimize sub-path check of ResourceTool
[ https://issues.apache.org/jira/browse/KYLIN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394408#comment-16394408 ]

ASF subversion and git services commented on KYLIN-3271:

Commit c2ef3c62a96ca52ef673335f621232d6d664c24a in kylin's branch refs/heads/sync from [~nichunen]
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c2ef3c6 ]
KYLIN-3271 Minor, avoid null pointer exception
[jira] [Commented] (KYLIN-3275) Add unit test for StorageCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394405#comment-16394405 ]

ASF subversion and git services commented on KYLIN-3275:

Commit 9ab6b52c1a40a69e6793a76e3fc2f53d18d1b57e in kylin's branch refs/heads/sync from [~Aron.tao]
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=9ab6b52 ]
KYLIN-3275, add ut for storage clean job.

> Add unit test for StorageCleanupJob
> ---
>
> Key: KYLIN-3275
> URL: https://issues.apache.org/jira/browse/KYLIN-3275
> Project: Kylin
> Issue Type: Improvement
> Reporter: jiatao.tao
> Assignee: jiatao.tao
> Priority: Trivial
[jira] [Commented] (KYLIN-3285) "Value NNN not exists" error run executing query
[ https://issues.apache.org/jira/browse/KYLIN-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394375#comment-16394375 ]

Shaofeng SHI commented on KYLIN-3285:

Another workaround would be to use "date" encoding for the "day_time" column.
By default Kylin uses "Date" encoding for a column whose type is "Date" or
"Timestamp". "Date" encoding converts a "yyyy-MM-dd" value to a long value, so
it doesn't need a dictionary. In your case the column is of type "string" and
you selected "dict" for it, so Kylin doesn't know it is a date and uses a
dictionary for it.

> "Value NNN not exists" error run executing query
> -
>
> Key: KYLIN-3285
> URL: https://issues.apache.org/jira/browse/KYLIN-3285
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.3.0
> Reporter: Shaofeng SHI
> Priority: Major
> Attachments: cube (1).json, kylin.log
>
> Reported by community user zxxb...@163.com:
>
> Kylin was upgraded from version 2.2.0 to 2.3.0; the cube was built before
> the upgrade and could be queried without error.
> After upgrading to version 2.3.0, the query "select count(userid)
> num,day_time from record_ap group by day_time LIMIT 1000"
> shows the error below:
> Column 0 value '2018-03-06' met dictionary error: Value '2018-03-06'
> (2018-03-06) not exists! while executing SQL: "select count(userid)
> num,day_time from record_ap group by day_time LIMIT 1000"
>
> Where does '2018-03-06' come from? I was fully confused.
> The cube has several segments:
> Starttime-endtime
> 20180101-20180225
> 20180225-20180304
>
> Attached the log and cube JSON.
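To illustrate why a date encoding needs no dictionary, here is a minimal sketch. It is not Kylin's implementation: the class name is hypothetical, and using the day count since 1970-01-01 as the long value is an assumption chosen for simplicity. The point is only that any valid "yyyy-MM-dd" string can be turned into a number and back without a lookup table built from the data.

```java
import java.time.LocalDate;

public class DateEncodingSketch {
    // Encodes a "yyyy-MM-dd" string as a long (assumption: days since 1970-01-01).
    static long encode(String day) {
        return LocalDate.parse(day).toEpochDay();
    }

    // Decodes the long back into its "yyyy-MM-dd" string.
    static String decode(long code) {
        return LocalDate.ofEpochDay(code).toString();
    }

    public static void main(String[] args) {
        // A value that was never seen at cube build time still encodes fine,
        // because the mapping is computed rather than looked up.
        long code = encode("2018-03-06");
        System.out.println(code + " -> " + decode(code));
    }
}
```

Because the mapping is a pure function of the value, a segment boundary such as a start or end date can always be encoded, whether or not any row in the segment carries that date.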
[jira] [Commented] (KYLIN-3285) "Value NNN not exists" error run executing query
[ https://issues.apache.org/jira/browse/KYLIN-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394374#comment-16394374 ]

Shaofeng SHI commented on KYLIN-3285:

From the log, the root cause is a value that was not encoded by the dictionary.
The value '2018-03-01' is the start of a cube segment; the partition column
'day_time' is of type "string", and the selected encoding is "dict". But as
Xixin mentioned, there are records for '2018-03-01'. Further investigation is
needed into how "shard by" can affect this.

{code:java}
Caused by: java.lang.IllegalArgumentException: Column 1 value '2018-03-01' met dictionary error: Value '2018-03-01' (2018-03-01) not exists!
    at org.apache.kylin.dict.TrieDictionaryForest.getIdFromValueBytesWithoutCache(TrieDictionaryForest.java:127)
    at org.apache.kylin.dict.CacheDictionary.getIdFromValueImpl(CacheDictionary.java:63)
    at org.apache.kylin.common.util.Dictionary.getIdFromValue(Dictionary.java:102)
    at org.apache.kylin.dimension.DictionaryDimEnc$DictionarySerializer.serialize(DictionaryDimEnc.java:127)
    at org.apache.kylin.cube.gridtable.CubeCodeSystem.encodeColumnValue(CubeCodeSystem.java:124)
    at org.apache.kylin.cube.gridtable.SegmentGTStartAndEnd.encodeTime(SegmentGTStartAndEnd.java:85)
    at org.apache.kylin.cube.gridtable.SegmentGTStartAndEnd.getSegmentStartAndEnd(SegmentGTStartAndEnd.java:51)
    at org.apache.kylin.storage.gtrecord.CubeScanRangePlanner.<init>(CubeScanRangePlanner.java:114)
    at org.apache.kylin.storage.gtrecord.CubeSegmentScanner.<init>(CubeSegmentScanner.java:73)
    at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:93)
    at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
    at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:62)
{code}
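The failure mode in the stack trace can be mimicked with a toy dictionary. This is a sketch under stated assumptions, not Kylin's TrieDictionaryForest: the class and its constructor are hypothetical, and it assumes the dictionary contains only the values it was built from. Asking it to encode any other value, such as a segment boundary date, fails in the same way as the exception above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

public class DictLookupSketch {
    // Value -> id, assigned in sorted order of the values seen at build time.
    private final Map<String, Integer> ids = new HashMap<>();

    DictLookupSketch(Collection<String> valuesSeenAtBuildTime) {
        int next = 0;
        for (String v : new TreeSet<>(valuesSeenAtBuildTime)) {
            ids.put(v, next++);
        }
    }

    // Returns the id of a known value; unknown values cannot be encoded.
    int getIdFromValue(String value) {
        Integer id = ids.get(value);
        if (id == null) {
            throw new IllegalArgumentException("Value '" + value + "' not exists!");
        }
        return id;
    }

    public static void main(String[] args) {
        DictLookupSketch dict =
                new DictLookupSketch(Arrays.asList("2018-02-25", "2018-02-26"));
        System.out.println("id of 2018-02-25: " + dict.getIdFromValue("2018-02-25"));
        try {
            // A boundary value that is absent from the dictionary.
            dict.getIdFromValue("2018-03-01");
        } catch (IllegalArgumentException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

Why the real dictionary lacks a value for which records reportedly exist is the open question in this thread; the sketch only shows the mechanism by which a missing entry surfaces as an exception at query time.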