[jira] [Created] (KYLIN-4782) Verify if the query hit the true cuboid in IT
wangrupeng created KYLIN-4782: - Summary: Verify if the query hit the true cuboid in IT Key: KYLIN-4782 URL: https://issues.apache.org/jira/browse/KYLIN-4782 Project: Kylin Issue Type: Improvement Components: Integration Affects Versions: v4.0.0-alpha Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng edited comment on KYLIN-4776 at 9/28/20, 6:53 AM: - ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|Not passed| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|Yes|No need|No| |KYLIN-4656|No Need|No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| was (Author: wangrupeng): ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|Not passed| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No Need|No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng edited comment on KYLIN-4776 at 9/28/20, 3:08 AM: - ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|Not passed| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No Need|No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| was (Author: wangrupeng): ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|Not passed| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No |No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng edited comment on KYLIN-4776 at 9/28/20, 2:47 AM: - ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|Not passed| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No |No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| was (Author: wangrupeng): ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|No| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No |No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng edited comment on KYLIN-4776 at 9/28/20, 1:55 AM: - ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|No| |KYLIN-4709|No need|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No |No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| was (Author: wangrupeng): ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|No| |KYLIN-4709|Yes|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No |No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng edited comment on KYLIN-4776 at 9/28/20, 1:53 AM: - ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|Yes|Yes|No| |KYLIN-4709|Yes|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No |No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No need|No need|No| was (Author: wangrupeng): ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|No|Yes|No| |KYLIN-4709|No|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No|No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng edited comment on KYLIN-4776 at 9/27/20, 12:59 PM: -- ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|No|Yes|No| |KYLIN-4709|No|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No|No need|No| |KYLIN-4648|Yes|No need|No| |KYLIN-4581|No|No need|No| was (Author: wangrupeng): ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|No|Yes|No| |KYLIN-4709|No|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No|No need|No| |KYLIN-4648|No|No need|No| |KYLIN-4581|No|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4776) Release Kylin v3.1.1
[ https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823 ] wangrupeng commented on KYLIN-4776: --- ||Issue ID||Verified?||Documentation updated?||Others|| |KYLIN-4712|No|Yes|No| |KYLIN-4709|No|No need|No| |KYLIN-4688|Yes|No need|No| |KYLIN-4657|No|No need|No| |KYLIN-4656|No|No need|No| |KYLIN-4648|No|No need|No| |KYLIN-4581|No|No need|No| > Release Kylin v3.1.1 > > > Key: KYLIN-4776 > URL: https://issues.apache.org/jira/browse/KYLIN-4776 > Project: Kylin > Issue Type: Test > Components: Release >Affects Versions: v3.1.0 >Reporter: Xiaoxiang Yu >Assignee: Xiaoxiang Yu >Priority: Critical > Original Estimate: 336h > Remaining Estimate: 336h > > h2. Release Plan for Kylin v3.1.1 > > ||Information||Heading 2|| > |Release Manager|Xiaoxiang Yu| > |Voting Date|2020/10/15| > h3. Issue List > https://issues.apache.org/jira/projects/KYLIN/versions/12348354 > h3. Issue Verification Assignee > ||Assignee ||Issue||Count|| > |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > tianhui5 OR assignee = xxyu )|9| > |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = > gxcheng )|13| > |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = > itzhangqiang or assignee = zhangyaqian or assignee = zhangzc and assignee = > julianpan )|10| > |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = > xiaoge )|14| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4725) NSparkCubingStep returns error state when pause build job
[ https://issues.apache.org/jira/browse/KYLIN-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4725: -- Fix Version/s: (was: v3.1.1) v4.0.0-alpha > NSparkCubingStep returns error state when pause build job > - > > Key: KYLIN-4725 > URL: https://issues.apache.org/jira/browse/KYLIN-4725 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v4.0.0-alpha >Reporter: Zhichao Zhang >Assignee: Yaqian Zhang >Priority: Major > Fix For: v4.0.0-alpha > > > When pausing a build job, NSparkCubingStep returns the ExecuteResult.State.ERROR > state; it should be ExecuteResult.State.STOPPED. -- This message was sent by Atlassian Jira (v8.3.4#803005)
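The fix described above amounts to mapping a user-initiated pause to a distinct terminal state instead of a generic failure. A minimal sketch of that state mapping (names and the helper itself are illustrative, not Kylin's actual API):

```python
from enum import Enum

class State(Enum):
    SUCCEED = "SUCCEED"
    ERROR = "ERROR"
    STOPPED = "STOPPED"

def resolve_result(spark_job_failed: bool, paused_by_user: bool) -> State:
    # If the Spark application terminated because the user paused the job,
    # the step should report STOPPED rather than ERROR, so the job can be
    # resumed later instead of being marked as failed.
    if paused_by_user:
        return State.STOPPED
    return State.ERROR if spark_job_failed else State.SUCCEED
```

The key point is that the pause flag is checked before the failure flag, since a paused Spark job also looks "failed" from the step's perspective.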
[jira] [Created] (KYLIN-4765) Set spark.sql.shuffle.partition to 1 for debug on local
wangrupeng created KYLIN-4765: - Summary: Set spark.sql.shuffle.partition to 1 for debug on local Key: KYLIN-4765 URL: https://issues.apache.org/jira/browse/KYLIN-4765 Project: Kylin Issue Type: Improvement Reporter: wangrupeng Assignee: wangrupeng Currently, spark.sql.shuffle.partition is set automatically in cluster mode, but when debugging locally it falls back to the default value of 200, which makes the build slow and inefficient. -- This message was sent by Atlassian Jira (v8.3.4#803005)
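The intended behavior can be sketched as a small helper that picks the shuffle-partition count by run mode (a hypothetical helper for illustration, not Kylin's actual code):

```python
def choose_shuffle_partitions(is_local_debug: bool, cluster_value: int = 200) -> int:
    # A local debug run has a single executor, so one shuffle partition
    # avoids scheduling hundreds of near-empty tasks per shuffle stage.
    return 1 if is_local_debug else cluster_value

# The chosen value would then be applied as a Spark SQL setting, e.g.:
# spark.conf.set("spark.sql.shuffle.partitions", str(choose_shuffle_partitions(True)))
```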
[jira] [Assigned] (KYLIN-4760) Optimize TopN measure
[ https://issues.apache.org/jira/browse/KYLIN-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng reassigned KYLIN-4760: - Component/s: Measure - TopN Fix Version/s: v4.0.0-beta Affects Version/s: v4.0.0-alpha Assignee: wangrupeng Description: Now, each time the buffer of the TopN update function inserts one row, it is re-sorted, which slows down the build. Summary: Optimize TopN measure (was: Optimize TopN) > Optimize TopN measure > - > > Key: KYLIN-4760 > URL: https://issues.apache.org/jira/browse/KYLIN-4760 > Project: Kylin > Issue Type: Improvement > Components: Measure - TopN >Affects Versions: v4.0.0-alpha >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > Now, each time the buffer of the TopN update function inserts one row, it is > re-sorted, which slows down the build. -- This message was sent by Atlassian Jira (v8.3.4#803005)
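One common way to avoid re-sorting the buffer on every insert is to keep a bounded min-heap, so each update costs O(log n) instead of a full sort. A sketch of that idea (illustrative only, not the actual Kylin TopN counter implementation):

```python
import heapq

def top_n(pairs, n):
    # Keep a bounded min-heap of (measure, key) instead of re-sorting the
    # whole buffer on every insert: each update is O(log n) rather than a
    # full O(capacity * log capacity) sort.
    heap = []
    for key, measure in pairs:
        if len(heap) < n:
            heapq.heappush(heap, (measure, key))
        elif measure > heap[0][0]:
            # The smallest retained measure sits at heap[0]; replace it.
            heapq.heapreplace(heap, (measure, key))
    # Only one final sort, when the result is materialized.
    return sorted(heap, reverse=True)
```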
[jira] [Created] (KYLIN-4760) Optimize TopN
wangrupeng created KYLIN-4760: - Summary: Optimize TopN Key: KYLIN-4760 URL: https://issues.apache.org/jira/browse/KYLIN-4760 Project: Kylin Issue Type: Improvement Reporter: wangrupeng -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KYLIN-4459) Continuous print warning log-DFSInputStream has been closed already
[ https://issues.apache.org/jira/browse/KYLIN-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng resolved KYLIN-4459. --- Fix Version/s: v4.0.0-alpha Resolution: Fixed > Continuous print warning log-DFSInputStream has been closed already > --- > > Key: KYLIN-4459 > URL: https://issues.apache.org/jira/browse/KYLIN-4459 > Project: Kylin > Issue Type: Improvement > Components: Storage - Parquet >Reporter: xuekaiqi >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-alpha > > > when starting kylin with debug tomcat mode, we can see these logs. > {code:java} > 2020-03-12 10:17:06,082 ERROR [pool-12-thread-1] curator.CuratorScheduler:205 > : Node(127.0.0.1) job server state conflict. Is ZK leader: true; Is active > job server: false > 2020-03-12 10:17:06,830 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:06,830 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:06,830 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:08,717 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:08,717 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:08,717 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:12,936 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:12,936 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:12,936 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:14,152 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 
10:17:14,152 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:14,152 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:17,603 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:17,603 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:17,603 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : > DFSInputStream has been closed already > 2020-03-12 10:17:17,603 INFO [Curator-LeaderSelector-0] > threadpool.DefaultScheduler:166 : Finishing resume all running jobs. > 2020-03-12 10:17:17,603 INFO [Curator-LeaderSelector-0] > threadpool.DefaultScheduler:170 : Fetching jobs every 30 seconds > 2020-03-12 10:17:17,603 INFO [Curator-LeaderSelector-0] > threadpool.DefaultScheduler:180 : Creating fetcher pool instance:2094578449 > {code} > Upgrading the Hadoop version to 2.7.2 can fix it, but it needs more testing in case of > unpredictable problems: > [https://community.cloudera.com/t5/Support-Questions/DFSInputStream-has-been-closed-already/td-p/125487] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4748) Optimize metadata for debug on local
wangrupeng created KYLIN-4748: - Summary: Optimize metadata for debug on local Key: KYLIN-4748 URL: https://issues.apache.org/jira/browse/KYLIN-4748 Project: Kylin Issue Type: Improvement Reporter: wangrupeng Assignee: wangrupeng * Add count distinct and percentile measure * Add a new column KYLIN_SALES.ITEM_ID for count distinct * Set SELLER_ID as shard by column * Add cube configuration *kylin.storage.columnar.shard-countdistinct-rowcount=1000* for file pruner by shard -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4742) NullPointerException when auto merge segments if exist discard jobs
[ https://issues.apache.org/jira/browse/KYLIN-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4742: -- Component/s: Tools, Build and Test Fix Version/s: v4.0.0-alpha > NullPointerException when auto merge segments if exist discard jobs > --- > > Key: KYLIN-4742 > URL: https://issues.apache.org/jira/browse/KYLIN-4742 > Project: Kylin > Issue Type: Bug > Components: Tools, Build and Test >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-alpha > > Attachments: image-2020-09-03-14-05-56-127.png > > > This is because the merge job does not set the segment name, so an NPE is thrown > when the job gets the segment name. > !image-2020-09-03-14-05-56-127.png|width=712,height=147! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4742) NullPointerException when auto merge segments if exist discard jobs
wangrupeng created KYLIN-4742: - Summary: NullPointerException when auto merge segments if exist discard jobs Key: KYLIN-4742 URL: https://issues.apache.org/jira/browse/KYLIN-4742 Project: Kylin Issue Type: Bug Reporter: wangrupeng Assignee: wangrupeng Attachments: image-2020-09-03-14-05-56-127.png This is because the merge job does not set the segment name, so an NPE is thrown when the job gets the segment name. !image-2020-09-03-14-05-56-127.png|width=712,height=147! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4729) The hive table will be overwritten when adding a CSV table with the same name
[ https://issues.apache.org/jira/browse/KYLIN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4729: -- Fix Version/s: v4.0.0-alpha Affects Version/s: v4.0.0-alpha > The hive table will be overwritten when adding a CSV table with the same name > --- > > Key: KYLIN-4729 > URL: https://issues.apache.org/jira/browse/KYLIN-4729 > Project: Kylin > Issue Type: Bug >Affects Versions: v4.0.0-alpha >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-alpha > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4729) The hive table will be overwritten when adding a CSV table with the same name
[ https://issues.apache.org/jira/browse/KYLIN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4729: -- Summary: The hive table will be overwritten when adding a CSV table with the same name (was: The hive) > The hive table will be overwritten when adding a CSV table with the same name > --- > > Key: KYLIN-4729 > URL: https://issues.apache.org/jira/browse/KYLIN-4729 > Project: Kylin > Issue Type: Bug >Reporter: wangrupeng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4729) The hive table will be overwritten when adding a CSV table with the same name
[ https://issues.apache.org/jira/browse/KYLIN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng reassigned KYLIN-4729: - Assignee: wangrupeng > The hive table will be overwritten when adding a CSV table with the same name > --- > > Key: KYLIN-4729 > URL: https://issues.apache.org/jira/browse/KYLIN-4729 > Project: Kylin > Issue Type: Bug >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4729) The hive
wangrupeng created KYLIN-4729: - Summary: The hive Key: KYLIN-4729 URL: https://issues.apache.org/jira/browse/KYLIN-4729 Project: Kylin Issue Type: Bug Reporter: wangrupeng -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4723) Set the configurations about shard by to cube level
wangrupeng created KYLIN-4723: - Summary: Set the configurations about shard by to cube level Key: KYLIN-4723 URL: https://issues.apache.org/jira/browse/KYLIN-4723 Project: Kylin Issue Type: Improvement Components: Tools, Build and Test Affects Versions: v4.0.0-alpha Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-alpha Now the shard-by related configurations, like kylin.storage.columnar.shard-rowcount, are at the Global Level; as they are important for query efficiency, it is better to set them at the Cube Level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
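Cube-level settings in Kylin are conventionally expressed as per-cube overrides that take precedence over the global kylin.properties value. A minimal sketch of that lookup order (the helper is hypothetical, for illustration only):

```python
def effective_value(key, global_conf, cube_overrides):
    # A cube-level override, when present, wins over the global setting;
    # otherwise fall back to the value from kylin.properties.
    return cube_overrides.get(key, global_conf.get(key))
```

With this precedence, an important tuning knob like kylin.storage.columnar.shard-rowcount can be adjusted per cube without touching the global configuration.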
[jira] [Updated] (KYLIN-4722) Add more statistics to the query results
[ https://issues.apache.org/jira/browse/KYLIN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4722: -- Description: Now, the query result contains scanned rows and scanned bytes. There are some other statistics that can be added, like the number of scanned files, Spark scan time, etc. It will be useful to add the number of Parquet files scanned when querying, especially when the shard-by column is configured, which will decrease the number of scanned Parquet files and improve query efficiency. Read more about the shard-by column at the link below. [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] was: Now, the query result contains scanned rows and scanned bytes. It will be useful to add the number of Parquet files scanned when querying, especially when the shard-by column is configured, which will decrease the number of scanned Parquet files and improve query efficiency. Read more about the shard-by column at the link below. [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] > Add more statistics to the query results > > > Key: KYLIN-4722 > URL: https://issues.apache.org/jira/browse/KYLIN-4722 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Affects Versions: v4.0.0-alpha >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-alpha > > > Now, the query result contains scanned rows and scanned bytes. There are some > other statistics that can be added, like the number of scanned files, Spark scan time, > etc. It will be useful to add the number of Parquet files scanned when > querying, especially when the shard-by column is configured, which will decrease > the number of scanned Parquet files and improve query efficiency. > Read more about the shard-by column at the link below. > [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] -- This message was sent by Atlassian Jira (v8.3.4#803005)
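The scanned-files statistic is most interesting when shard-by pruning kicks in: an equality filter on the shard-by column only needs the files of one shard. A toy illustration of how pruning shrinks the scanned-file count (the modulo partitioner is illustrative, not Kylin's actual shard function):

```python
def scan_stats(num_files, num_shards, shard_key):
    # Files are assigned round-robin to shards; an equality filter on the
    # shard-by column only has to read the files of the matching shard.
    target = shard_key % num_shards
    scanned = [i for i in range(num_files) if i % num_shards == target]
    return {"total_files": num_files, "scanned_files": len(scanned)}
```

Surfacing both numbers in the query result makes the effect of a shard-by column directly visible to the user.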
[jira] [Updated] (KYLIN-4722) Add more statistics to the query results
[ https://issues.apache.org/jira/browse/KYLIN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4722: -- Description: Now, the query result contains scanned rows and scanned bytes. There are some other statistics that can be added, like the number of scanned files, Spark scan time, etc. It will be useful to add the number of Parquet files scanned when querying, especially when the shard-by column is configured, which will decrease the number of scanned Parquet files and improve query efficiency. Read more about the shard-by column at the link below. [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] was: Now, the query result contains scanned rows and scanned bytes. There are some other statistics that can be added, like the number of scanned files, Spark scan time, etc. It will be useful to add the number of Parquet files scanned when querying, especially when the shard-by column is configured, which will decrease the number of scanned Parquet files and improve query efficiency. Read more about the shard-by column at the link below. [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] > Add more statistics to the query results > > > Key: KYLIN-4722 > URL: https://issues.apache.org/jira/browse/KYLIN-4722 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Affects Versions: v4.0.0-alpha >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-alpha > > > Now, the query result contains scanned rows and scanned bytes. There are some > other statistics that can be added, like the number of scanned files, Spark scan time, > etc. > It will be useful to add the number of Parquet files scanned when querying, > especially when the shard-by column is configured, which will decrease the > number of scanned Parquet files and improve query efficiency. > Read more about the shard-by column at the link below. > [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4722) Add more statistics to the query results
[ https://issues.apache.org/jira/browse/KYLIN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4722: -- Summary: Add more statistics to the query results (was: Add the number of files scanned when querying) > Add more statistics to the query results > > > Key: KYLIN-4722 > URL: https://issues.apache.org/jira/browse/KYLIN-4722 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Affects Versions: v4.0.0-alpha >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-alpha > > > Now, the query result contains scanned rows and scanned bytes. It will be useful > to add the number of Parquet files scanned when querying, especially when the > shard-by column is configured, which will decrease the number of scanned > Parquet files and improve query efficiency. > Read more about the shard-by column at the link below. > [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4722) Add the number of files scanned when querying
wangrupeng created KYLIN-4722: - Summary: Add the number of files scanned when querying Key: KYLIN-4722 URL: https://issues.apache.org/jira/browse/KYLIN-4722 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: v4.0.0-alpha Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-alpha Now, the query result contains scanned rows and scanned bytes. It will be useful to add the number of Parquet files scanned when querying, especially when the shard-by column is configured, which will decrease the number of scanned Parquet files and improve query efficiency. Read more about the shard-by column at the link below. [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4721) The default source type should be CSV not Hive in the local debug mode
wangrupeng created KYLIN-4721: - Summary: The default source type should be CSV not Hive in the local debug mode Key: KYLIN-4721 URL: https://issues.apache.org/jira/browse/KYLIN-4721 Project: Kylin Issue Type: Bug Components: Metadata Affects Versions: v4.0.0-alpha Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-alpha When debugging Kylin 4.0 in Tomcat local mode, Kylin will use the metadata located in $KYLIN_SOURCE/examples/test_case_data/sample_local, where the source type of the tables is Hive. The build task will remain pending because it cannot connect to the remote Hadoop cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build
[ https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4715: -- Description: [http://kylin.apache.org/docs/howto/howto_optimize_build.html] The number of cuboids with (N-2) dimensions should be N*(N-1)/2. !image-2020-08-25-11-13-55-160.png|width=660,height=337! !image-2020-08-25-11-14-14-556.png! was: [http://kylin.apache.org/docs/howto/howto_optimize_build.html] The number of cuboids with (N-2) dimensions should be N*(N-2)/2. !image-2020-08-25-11-13-55-160.png|width=660,height=337! !image-2020-08-25-11-14-14-556.png! > Wrong function with kylin document about how to optimize cube build > --- > > Key: KYLIN-4715 > URL: https://issues.apache.org/jira/browse/KYLIN-4715 > Project: Kylin > Issue Type: Bug > Components: Documentation >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v3.1.1 > > Attachments: image-2020-08-25-11-13-55-160.png, > image-2020-08-25-11-14-14-556.png > > > [http://kylin.apache.org/docs/howto/howto_optimize_build.html] > The number of cuboids with (N-2) dimensions should be N*(N-1)/2. > !image-2020-08-25-11-13-55-160.png|width=660,height=337! > !image-2020-08-25-11-14-14-556.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
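The corrected count follows from a simple combinatorial argument: a cuboid with (N-2) dimensions is obtained by dropping 2 of the N dimensions, and there are C(N,2) = N*(N-1)/2 ways to do that. A quick check:

```python
from math import comb

def cuboids_with_dims(n_total: int, n_dims: int) -> int:
    # Choosing which (n_total - n_dims) dimensions to drop gives
    # C(n_total, n_total - n_dims) cuboids of that size.
    return comb(n_total, n_total - n_dims)
```

For N = 10, this gives 45 = 10*9/2 cuboids with 8 dimensions, matching the corrected formula rather than the N*(N-2)/2 originally in the document.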
[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build
[ https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4715: -- Description: [http://kylin.apache.org/docs/howto/howto_optimize_build.html] The number of cuboids should be N*(N-2)/2 when with the (N-2) dimensions. !image-2020-08-25-11-13-55-160.png|width=660,height=337! !image-2020-08-25-11-14-14-556.png! was: [http://kylin.apache.org/docs/howto/howto_optimize_build.html] The number of cuboids should be N*(N-2)/2 when with the (N-2) dimensions. !image-2020-08-25-11-07-32-579.png|width=591,height=334! !image-2020-08-25-11-09-33-205.png|width=298,height=132! > Wrong function with kylin document about how to optimize cube build > --- > > Key: KYLIN-4715 > URL: https://issues.apache.org/jira/browse/KYLIN-4715 > Project: Kylin > Issue Type: Bug > Components: Documentation >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v3.1.1 > > Attachments: image-2020-08-25-11-13-55-160.png, > image-2020-08-25-11-14-14-556.png > > > [http://kylin.apache.org/docs/howto/howto_optimize_build.html] > The number of cuboids should be N*(N-2)/2 when with the (N-2) dimensions. > !image-2020-08-25-11-13-55-160.png|width=660,height=337! > !image-2020-08-25-11-14-14-556.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build
[ https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4715: -- Attachment: image-2020-08-25-11-14-14-556.png > Wrong function with kylin document about how to optimize cube build > --- > > Key: KYLIN-4715 > URL: https://issues.apache.org/jira/browse/KYLIN-4715 > Project: Kylin > Issue Type: Bug > Components: Documentation >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v3.1.1 > > Attachments: image-2020-08-25-11-13-55-160.png, > image-2020-08-25-11-14-14-556.png > > > [http://kylin.apache.org/docs/howto/howto_optimize_build.html] > The number of cuboids should be N*(N-2)/2 when with the (N-2) dimensions. > !image-2020-08-25-11-07-32-579.png|width=591,height=334! > !image-2020-08-25-11-09-33-205.png|width=298,height=132! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build
[ https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4715: -- Attachment: image-2020-08-25-11-13-55-160.png > Wrong function with kylin document about how to optimize cube build > --- > > Key: KYLIN-4715 > URL: https://issues.apache.org/jira/browse/KYLIN-4715 > Project: Kylin > Issue Type: Bug > Components: Documentation >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v3.1.1 > > Attachments: image-2020-08-25-11-13-55-160.png > > > [http://kylin.apache.org/docs/howto/howto_optimize_build.html] > The number of cuboids should be N*(N-2)/2 when with the (N-2) dimensions. > !image-2020-08-25-11-07-32-579.png|width=591,height=334! > !image-2020-08-25-11-09-33-205.png|width=298,height=132! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build
wangrupeng created KYLIN-4715: - Summary: Wrong function with kylin document about how to optimize cube build Key: KYLIN-4715 URL: https://issues.apache.org/jira/browse/KYLIN-4715 Project: Kylin Issue Type: Bug Components: Documentation Reporter: wangrupeng Assignee: wangrupeng Fix For: v3.1.1 Attachments: image-2020-08-25-11-13-55-160.png [http://kylin.apache.org/docs/howto/howto_optimize_build.html] The number of cuboids should be N*(N-2)/2 when with the (N-2) dimensions. !image-2020-08-25-11-07-32-579.png|width=591,height=334! !image-2020-08-25-11-09-33-205.png|width=298,height=132! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4700) Wrong engine type for realtime streaming
[ https://issues.apache.org/jira/browse/KYLIN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng reassigned KYLIN-4700: - Attachment: image-2020-08-14-20-34-56-168.png Component/s: Website Fix Version/s: v3.1.1 Affects Version/s: v3.1.0 Assignee: wangrupeng Description: As of now, real-time streaming only supports MapReduce for building, but there is an error that the Flink engine can be selected when creating a real-time streaming cube. !image-2020-08-14-20-34-56-168.png|width=499,height=263! > Wrong engine type for realtime streaming > - > > Key: KYLIN-4700 > URL: https://issues.apache.org/jira/browse/KYLIN-4700 > Project: Kylin > Issue Type: Bug > Components: Website >Affects Versions: v3.1.0 >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v3.1.1 > > Attachments: image-2020-08-14-20-34-56-168.png > > > As of now, real-time streaming only supports MapReduce for building, but there is > an error that the Flink engine can be selected when creating a real-time streaming > cube. > !image-2020-08-14-20-34-56-168.png|width=499,height=263! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4700) Wrong engine type for realtime streaming
wangrupeng created KYLIN-4700: - Summary: Wrong engine type for realtime streaming Key: KYLIN-4700 URL: https://issues.apache.org/jira/browse/KYLIN-4700 Project: Kylin Issue Type: Bug Reporter: wangrupeng -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4626) add set kylin home sh
[ https://issues.apache.org/jira/browse/KYLIN-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176046#comment-17176046 ] wangrupeng commented on KYLIN-4626: --- That's great! > add set kylin home sh > - > > Key: KYLIN-4626 > URL: https://issues.apache.org/jira/browse/KYLIN-4626 > Project: Kylin > Issue Type: Improvement >Reporter: chuxiao >Assignee: chuxiao >Priority: Major > > KYLIN_HOME is important; almost every script depends on it. But setting environment variables arbitrarily is not a best practice, for example when multiple instances are installed. With an added set-kylin-home.sh, > each Kylin instance can set its own environment variables. > This is mainly for deploying multiple Kylin services on one server. > In addition, our operations guidelines now require that environment variables live in each service's own file to avoid conflicts. > All other environment variables can go into setenv.sh, but KYLIN_HOME is needed before setenv.sh is loaded. So by default it takes the system environment variable, and it can be modified as needed -- This message was sent by Atlassian Jira (v8.3.4#803005)
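The per-instance setup proposed above can be sketched as a tiny shell file. This is a hypothetical sketch: the file name comes from the proposal, and the path /opt/kylin/instance-a is a made-up example, not anything shipped with Kylin.

```shell
#!/usr/bin/env bash
# Hypothetical set-kylin-home.sh for one Kylin instance.
# Default to the system environment variable (the described default behavior);
# an instance can hard-code its own directory here to avoid clashing with
# other instances deployed on the same server.
KYLIN_HOME="${KYLIN_HOME:-/opt/kylin/instance-a}"
export KYLIN_HOME
echo "KYLIN_HOME=${KYLIN_HOME}"
```

Because this file is sourced before setenv.sh, KYLIN_HOME is available to every later script without polluting the global environment.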
[jira] [Comment Edited] (KYLIN-4690) BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with Spark
[ https://issues.apache.org/jira/browse/KYLIN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174300#comment-17174300 ] wangrupeng edited comment on KYLIN-4690 at 8/10/20, 12:57 PM: -- Could you provide more information about your environment? Like the hadoop version, spark version, etc. I test it in my CDH5.7 with cluster mode and it works fine. was (Author: wangrupeng): Could you provide more information about your environment? I test it in my CDH5.7 with cluster mode and it works fine. > BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with > Spark > --- > > Key: KYLIN-4690 > URL: https://issues.apache.org/jira/browse/KYLIN-4690 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v3.1.0 >Reporter: James >Priority: Critical > > BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with > Spark > Executor: > export > HADOOP_CONF_DIR=/app/kylin/apache-kylin-3.1.0-bin-hbase1x/kylin_hadoop_conf_dir > && /usr/hdp/current/spark2-client/bin/spark-submit --class > org.apache.kylin.common.util.SparkEntry --name "Build Cube with > Spark:CBE_DEV[2020010200_2020010300]" --conf spark.executor.cores=5 > --conf spark.hadoop.yarn.timeline-service.enabled=false --conf > spark.hadoop.mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec > --conf spark.executor.memoryOverhead=1024 --conf > spark.executor.extraJavaOptions=-Dhdp.version=2.6.4.149-3 --conf > spark.master=yarn --conf > spark.hadoop.mapreduce.output.fileoutputformat.compress=true --conf > spark.executor.instances=5 --conf > spark.kryo.register=org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage > --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.4.149-3 --conf > spark.executor.memory=4G --conf spark.yarn.queue=sgz1-criskapp-haas_dev > --conf spark.submit.deployMode=cluster --conf > spark.dynamicAllocation.minExecutors=0 --conf spark.network.timeout=600 > --conf 
spark.hadoop.dfs.replication=2 --conf > spark.yarn.executor.memoryOverhead=1024 --conf > spark.dynamicAllocation.executorIdleTimeout=300 --conf > spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf > spark.driver.memory=5G --conf > spark.driver.extraJavaOptions=-Dhdp.version=2.6.4.149-3 --conf > spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec --conf > spark.eventLog.enabled=true --conf spark.shuffle.service.enabled=true > --conf spark.eventLog.dir=hdfs:///kylin/spark-history --conf > spark.dynamicAllocation.maxExecutors=15 --conf > spark.dynamicAllocation.enabled=true --jars > /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar > /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar -className > org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable > kylin310.kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d > -output > hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/cuboid/ > -input > hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d > -segmentId 02f32a29-1d51-0cb0-37ba-825333d38c8d -metaUrl > crr_kylin_dev240@hdfs,path=hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/metadata > -cubename CBE_DEV > > Step Name: > #7 Step Name: Build Cube with Spark:CBE_DEV[2020010200_2020010300] > > Error: > 20/08/08 09:23:54 ERROR ApplicationMaster: User class threw exception: > java.lang.RuntimeException: error execute > org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > java.lang.RuntimeException: error execute > org.apache.kylin.engine.spark.SparkCubingByLayer. 
Root cause: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42) > at > org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646) > Caused by: java.lang.IllegalArgumentException: Error while instantiating >
[jira] [Commented] (KYLIN-4690) BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with Spark
[ https://issues.apache.org/jira/browse/KYLIN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174300#comment-17174300 ] wangrupeng commented on KYLIN-4690: --- Could you provide more information about your environment? I test it in my CDH5.7 with cluster mode and it works fine. > BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with > Spark > --- > > Key: KYLIN-4690 > URL: https://issues.apache.org/jira/browse/KYLIN-4690 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v3.1.0 >Reporter: James >Priority: Critical > > BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with > Spark > Executor: > export > HADOOP_CONF_DIR=/app/kylin/apache-kylin-3.1.0-bin-hbase1x/kylin_hadoop_conf_dir > && /usr/hdp/current/spark2-client/bin/spark-submit --class > org.apache.kylin.common.util.SparkEntry --name "Build Cube with > Spark:CBE_DEV[2020010200_2020010300]" --conf spark.executor.cores=5 > --conf spark.hadoop.yarn.timeline-service.enabled=false --conf > spark.hadoop.mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec > --conf spark.executor.memoryOverhead=1024 --conf > spark.executor.extraJavaOptions=-Dhdp.version=2.6.4.149-3 --conf > spark.master=yarn --conf > spark.hadoop.mapreduce.output.fileoutputformat.compress=true --conf > spark.executor.instances=5 --conf > spark.kryo.register=org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage > --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.4.149-3 --conf > spark.executor.memory=4G --conf spark.yarn.queue=sgz1-criskapp-haas_dev > --conf spark.submit.deployMode=cluster --conf > spark.dynamicAllocation.minExecutors=0 --conf spark.network.timeout=600 > --conf spark.hadoop.dfs.replication=2 --conf > spark.yarn.executor.memoryOverhead=1024 --conf > spark.dynamicAllocation.executorIdleTimeout=300 --conf > spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf > 
spark.driver.memory=5G --conf > spark.driver.extraJavaOptions=-Dhdp.version=2.6.4.149-3 --conf > spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec --conf > spark.eventLog.enabled=true --conf spark.shuffle.service.enabled=true > --conf spark.eventLog.dir=hdfs:///kylin/spark-history --conf > spark.dynamicAllocation.maxExecutors=15 --conf > spark.dynamicAllocation.enabled=true --jars > /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar > /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar -className > org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable > kylin310.kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d > -output > hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/cuboid/ > -input > hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d > -segmentId 02f32a29-1d51-0cb0-37ba-825333d38c8d -metaUrl > crr_kylin_dev240@hdfs,path=hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/metadata > -cubename CBE_DEV > > Step Name: > #7 Step Name: Build Cube with Spark:CBE_DEV[2020010200_2020010300] > > Error: > 20/08/08 09:23:54 ERROR ApplicationMaster: User class threw exception: > java.lang.RuntimeException: error execute > org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > java.lang.RuntimeException: error execute > org.apache.kylin.engine.spark.SparkCubingByLayer. 
Root cause: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42) > at > org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646) > Caused by: java.lang.IllegalArgumentException: Error while instantiating > 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1075) > at >
[jira] [Commented] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary
[ https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174298#comment-17174298 ] wangrupeng commented on KYLIN-4688: --- Regarding the third problem you mentioned: it is not cleaning up a tmp file, but deleting the partition file if it already exists before creating it. !image-2020-08-10-20-49-46-354.png|width=610,height=256! > Too many tmp files in HDFS tmp dictionary > - > > Key: KYLIN-4688 > URL: https://issues.apache.org/jira/browse/KYLIN-4688 > Project: Kylin > Issue Type: Bug > Components: Others >Affects Versions: all >Reporter: QiangZhang >Priority: Major > Attachments: image-2020-08-06-18-07-00-377.png, > image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, > image-2020-08-06-19-34-22-515.png, image-2020-08-10-19-38-28-913.png, > image-2020-08-10-20-06-01-526.png, image-2020-08-10-20-14-53-854.png, > image-2020-08-10-20-18-10-513.png, image-2020-08-10-20-20-07-899.png, > image-2020-08-10-20-21-44-137.png, image-2020-08-10-20-49-46-354.png > > > Too many tmp files in HDFS tmp dictionary,and Kylin doesn't clean up > automatically > !image-2020-08-06-18-07-00-377.png! > 2. when I debug ,I found : !image-2020-08-06-18-29-28-503.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
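The delete-if-exists-before-create behavior described in the comment above can be sketched with java.nio.file. This is an illustrative sketch, not Kylin's actual code; the class and method names are hypothetical, and a plain local path stands in for the HDFS path Kylin actually uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PartitionFileWriter {
    // Remove any stale partition file left by a previous run, then write the
    // new one. This mirrors "delete the partition file if it exists before
    // creating it", as opposed to an after-the-fact tmp cleanup.
    static void writePartitionFile(Path path, byte[] content) throws IOException {
        Files.deleteIfExists(path);  // no-op when the file is absent
        Files.write(path, content);  // create the fresh partition file
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempDirectory("kylin-demo").resolve("_partition.lst");
        writePartitionFile(p, "cuboid-1".getBytes());
        writePartitionFile(p, "cuboid-2".getBytes()); // stale copy is replaced
        System.out.println(new String(Files.readAllBytes(p))); // cuboid-2
    }
}
```

The same pattern with Hadoop's FileSystem API would use fs.exists/fs.delete before creating the output file.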
[jira] [Updated] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary
[ https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4688: -- Attachment: image-2020-08-10-20-49-46-354.png > Too many tmp files in HDFS tmp dictionary > - > > Key: KYLIN-4688 > URL: https://issues.apache.org/jira/browse/KYLIN-4688 > Project: Kylin > Issue Type: Bug > Components: Others >Affects Versions: all >Reporter: QiangZhang >Priority: Major > Attachments: image-2020-08-06-18-07-00-377.png, > image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, > image-2020-08-06-19-34-22-515.png, image-2020-08-10-19-38-28-913.png, > image-2020-08-10-20-06-01-526.png, image-2020-08-10-20-14-53-854.png, > image-2020-08-10-20-18-10-513.png, image-2020-08-10-20-20-07-899.png, > image-2020-08-10-20-21-44-137.png, image-2020-08-10-20-49-46-354.png > > > Too many tmp files in HDFS tmp dictionary,and Kylin doesn't clean up > automatically > !image-2020-08-06-18-07-00-377.png! > 2. when I debug ,I found : !image-2020-08-06-18-29-28-503.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary
[ https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174259#comment-17174259 ] wangrupeng commented on KYLIN-4688: --- Where is the default value of "hbase.fs.tmp.dir" set? From the Kylin source code, it will be set to "/tmp" when "conf.get('hbase.fs.tmp.dir')" is blank. > Too many tmp files in HDFS tmp dictionary > - > > Key: KYLIN-4688 > URL: https://issues.apache.org/jira/browse/KYLIN-4688 > Project: Kylin > Issue Type: Bug > Components: Others >Affects Versions: all >Reporter: QiangZhang >Priority: Major > Attachments: image-2020-08-06-18-07-00-377.png, > image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, > image-2020-08-06-19-34-22-515.png > > > Too many tmp files in HDFS tmp dictionary,and Kylin doesn't clean up > automatically > !image-2020-08-06-18-07-00-377.png! > 2. when I debug ,I found : !image-2020-08-06-18-29-28-503.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
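The blank-value fallback described in the comment above boils down to a small guard. A minimal, hypothetical sketch: a plain Map stands in for the Hadoop Configuration object Kylin actually reads, and the method name is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class TmpDirLookup {
    // Sketch of the described behavior: when "hbase.fs.tmp.dir" is unset or
    // blank in the configuration, fall back to "/tmp".
    static String hbaseTmpDir(Map<String, String> conf) {
        String dir = conf.get("hbase.fs.tmp.dir");
        return (dir == null || dir.trim().isEmpty()) ? "/tmp" : dir;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(hbaseTmpDir(conf));                 // /tmp
        conf.put("hbase.fs.tmp.dir", "/user/kylin/hbase-tmp");
        System.out.println(hbaseTmpDir(conf));                 // /user/kylin/hbase-tmp
    }
}
```

So the answer to "where is the default set" is: nowhere in the cluster config; the fallback lives in the lookup code itself.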
[jira] [Commented] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary
[ https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174256#comment-17174256 ] wangrupeng commented on KYLIN-4688: --- As far as I know, HDFS will not delete files in /tmp/ on its own. I agree with you that Kylin should clean up tmp files at the end of the task. > Too many tmp files in HDFS tmp dictionary > - > > Key: KYLIN-4688 > URL: https://issues.apache.org/jira/browse/KYLIN-4688 > Project: Kylin > Issue Type: Bug > Components: Others >Affects Versions: all >Reporter: QiangZhang >Priority: Major > Attachments: image-2020-08-06-18-07-00-377.png, > image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, > image-2020-08-06-19-34-22-515.png > > > Too many tmp files in HDFS tmp dictionary,and Kylin doesn't clean up > automatically > !image-2020-08-06-18-07-00-377.png! > 2. when I debug ,I found : !image-2020-08-06-18-29-28-503.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4680) Avoid annoying log messages of unit test and integration test
wangrupeng created KYLIN-4680: - Summary: Avoid annoying log messages of unit test and integration test Key: KYLIN-4680 URL: https://issues.apache.org/jira/browse/KYLIN-4680 Project: Kylin Issue Type: Improvement Components: Integration, Tools, Build and Test Affects Versions: v4.0.0-beta Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta When running a unit test case, it will output too much annoying log messages. -- This message was sent by Atlassian Jira (v8.3.4#803005)
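A common way to achieve what this issue asks for is to raise logger thresholds in the test classpath's log4j configuration. The fragment below is a hypothetical sketch only: the property names follow log4j 1.x conventions, and the concrete logger names and file location in Kylin's test resources may differ.

```properties
# Hypothetical log4j.properties for the test classpath: keep WARN and above
# on the console, and silence the chattiest frameworks during UT/IT runs.
log4j.rootLogger=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
log4j.logger.org.apache.spark=ERROR
log4j.logger.org.apache.hadoop=ERROR
log4j.logger.org.apache.kylin=INFO
```

Placing such a file under the test resources directory overrides the main configuration only for test runs, so production logging is unaffected.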
[jira] [Resolved] (KYLIN-4644) New tool to clean up intermediate files for Kylin 4.0
[ https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng resolved KYLIN-4644. --- Resolution: Fixed > New tool to clean up intermediate files for Kylin 4.0 > -- > > Key: KYLIN-4644 > URL: https://issues.apache.org/jira/browse/KYLIN-4644 > Project: Kylin > Issue Type: Improvement > Components: Client - CLI >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > Original Estimate: 72h > Remaining Estimate: 72h > > As the change of storage, Kylin 4.x needs a new tool to clean up intermediate > data and temporary files generated during cube building. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4644) New tool to clean up intermediate files for Kylin 4.0
[ https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159843#comment-17159843 ] wangrupeng commented on KYLIN-4644: --- [https://github.com/apache/kylin/pull/1323] > New tool to clean up intermediate files for Kylin 4.0 > -- > > Key: KYLIN-4644 > URL: https://issues.apache.org/jira/browse/KYLIN-4644 > Project: Kylin > Issue Type: Improvement > Components: Client - CLI >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > Original Estimate: 72h > Remaining Estimate: 72h > > As the change of storage, Kylin 4.x needs a new tool to clean up intermediate > data and temporary files generated during cube building. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4644) New tool to clean up intermediate files for Kylin 4.0
[ https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4644: -- Issue Type: Improvement (was: Bug) > New tool to clean up intermediate files for Kylin 4.0 > -- > > Key: KYLIN-4644 > URL: https://issues.apache.org/jira/browse/KYLIN-4644 > Project: Kylin > Issue Type: Improvement > Components: Client - CLI >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > Original Estimate: 72h > Remaining Estimate: 72h > > As the change of storage, Kylin 4.x needs a new tool to clean up intermediate > data and temporary files generated during cube building. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4644) New tool to clean up intermediate files for Kylin 4.0
[ https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159600#comment-17159600 ] wangrupeng commented on KYLIN-4644: --- Which will be deleted: * {{temp job files}} hdfs:///kylin/${metadata_url}/${project}/job_tmp * {{unused segment cuboid files}} hdfs:///kylin/${metadata_url}/${project}/${cube_name}/${non_used_segment} Usage: # {{ Check which resources can be cleaned up; this will not remove anything:}} {code:java} export KYLIN_HOME=/path/to/kylin_home ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false{code} {{ 2. You can pick one or two resources to check whether they are no longer referred to; then add the “--delete true” option to start the cleanup:}} {code:java} ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true {code} > New tool to clean up intermediate files for Kylin 4.0 > -- > > Key: KYLIN-4644 > URL: https://issues.apache.org/jira/browse/KYLIN-4644 > Project: Kylin > Issue Type: Bug > Components: Client - CLI >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > Original Estimate: 72h > Remaining Estimate: 72h > > As the change of storage, Kylin 4.x needs a new tool to clean up intermediate > data and temporary files generated during cube building. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4644) New tool to clean up intermediate files for Kylin 4.0
wangrupeng created KYLIN-4644: - Summary: New tool to clean up intermediate files for Kylin 4.0 Key: KYLIN-4644 URL: https://issues.apache.org/jira/browse/KYLIN-4644 Project: Kylin Issue Type: Bug Components: Client - CLI Affects Versions: v4.0.0-beta Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta Due to the change of storage, Kylin 4.x needs a new tool to clean up intermediate data and temporary files generated during cube building. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/15/20, 3:44 AM: - Now we can debug Tomcat without a Hadoop environment by following these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local kylin.storage.url=/Users/rupeng.wang/Kyligence/Developments/kylin/kylin-parquet/examples/test_case_data/sample_local kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.engine.spark-conf.spark.sql.shuffle.partitions=1 kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for the query engine !image-2020-07-08-17-41-35-954.png|width=561,height=354! * start debugging Tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! 
was (Author: wangrupeng): Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local kylin.storage.url=/Users/rupeng.wang/Kyligence/Developments/kylin/kylin-parquet/examples/test_case_data/sample_local kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir{code} {code:java} kylin.engine.spark-conf.spark.sql.shuffle.partitions=1 kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine !image-2020-07-08-17-41-35-954.png|width=561,height=354! * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files and Not dependent on remote HDP sandbox, but it's a little bit > complex. 
The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/15/20, 3:38 AM: - Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local kylin.storage.url=/Users/rupeng.wang/Kyligence/Developments/kylin/kylin-parquet/examples/test_case_data/sample_local kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine !image-2020-07-08-17-41-35-954.png|width=561,height=354! * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! 
> Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files and Not dependent on remote HDP sandbox, but it's a little bit > complex. The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (KYLIN-4632) No such element exception:spark.driver.cores
[ https://issues.apache.org/jira/browse/KYLIN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4632: -- Comment: was deleted (was: [https://github.com/apache/kylin/pull/1310/]) > No such element exception:spark.driver.cores > > > Key: KYLIN-4632 > URL: https://issues.apache.org/jira/browse/KYLIN-4632 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > When submit a build job, it throws an exception. But the build job still can > run successfully. > 20/07/10 14:06:29 WARN SparkApplication: Error occurred when check resource. > Ignore it and try to submit this job. > java.util.NoSuchElementException: spark.driver.cores > at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) > at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.SparkConf.get(SparkConf.scala:245) > at org.apache.spark.utils.ResourceUtils$.checkResource(ResourceUtils.scala:71) > at org.apache.spark.utils.ResourceUtils.checkResource(ResourceUtils.scala) > at > org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:156) > at > org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:76) > at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4632) No such element exception:spark.driver.cores
[ https://issues.apache.org/jira/browse/KYLIN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155190#comment-17155190 ] wangrupeng commented on KYLIN-4632: --- [https://github.com/apache/kylin/pull/1310/] > No such element exception:spark.driver.cores > > > Key: KYLIN-4632 > URL: https://issues.apache.org/jira/browse/KYLIN-4632 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > When submit a build job, it throws an exception. But the build job still can > run successfully. > 20/07/10 14:06:29 WARN SparkApplication: Error occurred when check resource. > Ignore it and try to submit this job. > java.util.NoSuchElementException: spark.driver.cores > at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) > at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.SparkConf.get(SparkConf.scala:245) > at org.apache.spark.utils.ResourceUtils$.checkResource(ResourceUtils.scala:71) > at org.apache.spark.utils.ResourceUtils.checkResource(ResourceUtils.scala) > at > org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:156) > at > org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:76) > at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4632) No such element exception:spark.driver.cores
[ https://issues.apache.org/jira/browse/KYLIN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4632: -- Sprint: Sprint 53 > No such element exception:spark.driver.cores > > > Key: KYLIN-4632 > URL: https://issues.apache.org/jira/browse/KYLIN-4632 > Project: Kylin > Issue Type: Bug > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > When submit a build job, it throws an exception. But the build job still can > run successfully. > 20/07/10 14:06:29 WARN SparkApplication: Error occurred when check resource. > Ignore it and try to submit this job. > java.util.NoSuchElementException: spark.driver.cores > at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) > at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.SparkConf.get(SparkConf.scala:245) > at org.apache.spark.utils.ResourceUtils$.checkResource(ResourceUtils.scala:71) > at org.apache.spark.utils.ResourceUtils.checkResource(ResourceUtils.scala) > at > org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:156) > at > org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:76) > at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4632) No such element exception:spark.driver.cores
wangrupeng created KYLIN-4632: - Summary: No such element exception:spark.driver.cores Key: KYLIN-4632 URL: https://issues.apache.org/jira/browse/KYLIN-4632 Project: Kylin Issue Type: Bug Components: Spark Engine Affects Versions: v4.0.0-beta Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta When submitting a build job, it throws an exception, but the build job can still run successfully. 20/07/10 14:06:29 WARN SparkApplication: Error occurred when check resource. Ignore it and try to submit this job. java.util.NoSuchElementException: spark.driver.cores at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.SparkConf.get(SparkConf.scala:245) at org.apache.spark.utils.ResourceUtils$.checkResource(ResourceUtils.scala:71) at org.apache.spark.utils.ResourceUtils.checkResource(ResourceUtils.scala) at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:156) at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:76) at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
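The stack trace above comes from Spark's SparkConf.get(key), which throws NoSuchElementException when the key was never set. The usual defensive pattern is to read with a default instead. The class below is a self-contained, stdlib-only sketch of that pattern (it mimics the two SparkConf.get overloads rather than depending on the Spark jar):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Sketch of the failure mode and its fix. get(key) mirrors SparkConf.get(key),
// which throws when "spark.driver.cores" is absent (as in the warning above);
// get(key, default) mirrors the overload that never throws.
public class ConfLookup {
    private final Map<String, String> settings = new HashMap<>();

    public void set(String key, String value) { settings.put(key, value); }

    // Mirrors SparkConf.get(String key): throws if the key is missing.
    public String get(String key) {
        String v = settings.get(key);
        if (v == null) throw new NoSuchElementException(key);
        return v;
    }

    // Mirrors SparkConf.get(String key, String defaultValue): never throws.
    public String get(String key, String defaultValue) {
        return settings.getOrDefault(key, defaultValue);
    }

    public static void main(String[] args) {
        ConfLookup conf = new ConfLookup();
        // "spark.driver.cores" was never set, as in the reported job;
        // reading with a default avoids the NoSuchElementException.
        System.out.println(conf.get("spark.driver.cores", "1"));
    }
}
```

This matches why the job still succeeds: the resource check merely logs the exception and continues, so switching the check to the defaulting overload would silence the warning without changing behavior.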
[jira] [Created] (KYLIN-4631) Set the default build engine type to spark for Kylin on Parquet
wangrupeng created KYLIN-4631: - Summary: Set the default build engine type to spark for Kylin on Parquet Key: KYLIN-4631 URL: https://issues.apache.org/jira/browse/KYLIN-4631 Project: Kylin Issue Type: Improvement Components: Metadata, Spark Engine Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta Now, the default engine type is still MapReduce when generating the sample model and cube, so we have to edit the cube and set the engine type to Spark manually. -- This message was sent by Atlassian Jira (v8.3.4#803005)
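The improvement above amounts to choosing the build engine from configuration rather than hard-coding MapReduce when generating the sample cube. A minimal sketch of that selection logic follows; the property name "kylin.engine.default" and the enum below are illustrative assumptions, not Kylin's real metadata constants:

```java
import java.util.Properties;

// Hedged sketch of KYLIN-4631's intent: derive the default build engine for
// generated sample cubes from a config property, falling back to Spark.
public class DefaultEngine {
    enum Engine { MAP_REDUCE, SPARK }

    static Engine defaultEngine(Properties conf) {
        // Hypothetical property; absent -> Spark, per the issue's goal.
        String name = conf.getProperty("kylin.engine.default", "spark");
        return "mr".equalsIgnoreCase(name) ? Engine.MAP_REDUCE : Engine.SPARK;
    }

    public static void main(String[] args) {
        // No property set: the generated sample cube would use Spark.
        System.out.println(defaultEngine(new Properties()));
    }
}
```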
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Sprint: Sprint 53 > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files and Not dependent on remote HDP sandbox, but it's a little bit > complex. The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: Currently, Kylin on Parquet already supports debuging source code with local csv files and Not dependent on remote HDP sandbox, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. was: Currently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. 
The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files and Not dependent on remote HDP sandbox, but it's a little bit > complex. 
The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/9/20, 2:11 AM: Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir # Env DEV|QA|PROD\LOCAL\UT # LOCAL means reading local data source when debug with tomcat without connect to sandboxkylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! 
> Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files, but it's a little bit complex. The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/9/20, 2:10 AM: Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir # Env DEV|QA|PROD\LOCAL\UT # LOCAL means reading local data source when debug with tomcat without connect to sandboxkylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now we can debug tomcat without hadoop environment by following the follow steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" This is used for query engine * start debug tomcat and we can use the models we already defined !screenshot-1.png|width=546,height=196! 
> Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files, but it's a little bit complex. The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: Currently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. was: dCurrently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. 
The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files, but it's a little bit complex. 
The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: Currently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. was: Currently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. 
The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=[file:///path/to/local/dir] kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files, but it's a little bit complex. 
The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most time we debug just want to build and query cube quickly and focus the > bug we want to resolve. But current way is complex to load csv tables, create > model and cube and it's hard to use kylin sample cube. So, I want to add a > csv source which using the model of kylin sample data directly when debug > tomcat started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: dCurrently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. was: Currently, Kylin on Parquet already supports debuging source code with local csv files, but it's a little bit complex. 
The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local {code:java} kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * Load csv data source by pressing button "Data Source->Load CSV File as Table" on "Model" page, and set the schema for your table. Then press "submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most time we debug just want to build and query cube quickly and focus the bug we want to resolve. But current way is complex to load csv tables, create model and cube and it's hard to use kylin sample cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png, screenshot-1.png > > > dCurrently, Kylin on Parquet already supports debuging source code with local > csv files, but it's a little bit complex. 
The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local > {code:java} > kylin.metadata.url=$LOCAL_META_DIR > kylin.env.zookeeper-is-local=true > kylin.env.hdfs-working-dir=file:///path/to/local/dir > kylin.engine.spark-conf.spark.master=local > kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir > kylin.env=UT{code} > * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png|width=574,height=363! > * load the CSV data source by pressing "Data Source->Load CSV File as > Table" on the "Model" page, set the schema for your table, then press > "Submit" to save. > !image-2020-07-08-17-42-09-603.png|width=577,height=259! > Most of the time when debugging we just want to build and query a cube quickly and > focus on the bug we want to resolve, but the current way of loading CSV tables and > creating the model and cube is complex, and it's hard to use the Kylin sample cube. > So I want to add a CSV source that uses the Kylin sample data model directly > when the debug Tomcat starts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
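The property overrides listed in the steps above can be sketched as a small helper. This is an illustrative sketch only, not part of Kylin; the directory paths ($LOCAL_META_DIR, /path/to/local/dir) are placeholders from the issue description, and the helper name is hypothetical.

```python
# Sketch: render the local-debug overrides for kylin.properties described
# in the steps above. Paths are placeholders -- substitute your own dirs.
LOCAL_META_DIR = "/path/to/local/meta"   # assumption: stands in for $LOCAL_META_DIR
LOCAL_WORK_DIR = "/path/to/local/dir"    # assumption: any writable local directory


def local_debug_properties(meta_dir: str, work_dir: str) -> str:
    """Return the kylin.properties overrides for debugging without Hadoop."""
    props = {
        "kylin.metadata.url": meta_dir,
        "kylin.env.zookeeper-is-local": "true",
        # "file://" + an absolute path yields the file:///... form used above
        "kylin.env.hdfs-working-dir": f"file://{work_dir}",
        "kylin.engine.spark-conf.spark.master": "local",
        "kylin.engine.spark-conf.spark.eventLog.dir": work_dir,
        "kylin.env": "UT",
    }
    return "\n".join(f"{k}={v}" for k, v in props.items())


print(local_debug_properties(LOCAL_META_DIR, LOCAL_WORK_DIR))
```

Append the rendered lines to $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties before starting DebugTomcat with "-Dspark.local=true".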
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/9/20, 1:59 AM: Now we can debug Tomcat without a Hadoop environment by following these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now we can debug Tomcat without a Hadoop environment by following these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! 
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/9/20, 1:58 AM: Now we can debug Tomcat without a Hadoop environment by following these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local * {code:java} kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL{code} * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now we can debug Tomcat without a Hadoop environment by following these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! 
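The updated comment swaps the generic local paths for the bundled parquet_test metadata under $KYLIN_SOURCE_DIR. A minimal sketch of expanding that placeholder into concrete property values, assuming only that KYLIN_SOURCE_DIR points at a Kylin source checkout (the helper name is hypothetical, not a Kylin API):

```python
import os

# Sketch: expand the $KYLIN_SOURCE_DIR placeholder from the comment above
# into concrete kylin.properties values for the no-Hadoop debug setup.
def parquet_test_properties(kylin_source_dir: str) -> dict:
    # parquet_test ships with the Kylin source tree per the comment above
    meta = os.path.join(kylin_source_dir, "examples/test_case_data/parquet_test")
    return {
        "kylin.metadata.url": meta,
        "kylin.env.zookeeper-is-local": "true",
        "kylin.env.hdfs-working-dir": "file://" + meta,
        "kylin.engine.spark-conf.spark.master": "local",
        "kylin.env": "LOCAL",
    }


# Example with a hypothetical checkout location:
props = parquet_test_properties("/home/dev/kylin")
print(props["kylin.env.hdfs-working-dir"])
```

Note that kylin.env is LOCAL here, not UT as in the original description; the two configurations target different metadata directories.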
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * load the CSV data source by pressing "Data Source->Load CSV File as Table" on the "Model" page, set the schema for your table, then press "Submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most of the time when debugging we just want to build and query a cube quickly and focus on the bug we want to resolve, but the current way of loading CSV tables and creating the model and cube is complex, and it's hard to use the Kylin sample cube. So I want to add a CSV source that uses the Kylin sample data model directly when the debug Tomcat starts. was: Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! 
* load the CSV data source by pressing "Data Source->Load CSV File as Table" on the "Model" page, set the schema for your table, then press "Submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most of the time when debugging we just want to build and query a cube easily, but the current way of loading CSV tables and creating the model and cube is complex. So I want to add a CSV source that uses the Kylin sample data model directly when the debug Tomcat starts.
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/8/20, 9:59 AM: Now we can debug Tomcat without a Hadoop environment by following these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! 
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png|width=574,height=363! * load the CSV data source by pressing "Data Source->Load CSV File as Table" on the "Model" page, set the schema for your table, then press "Submit" to save. !image-2020-07-08-17-42-09-603.png|width=577,height=259! Most of the time when debugging we just want to build and query a cube easily, but the current way of loading CSV tables and creating the model and cube is complex. So I want to add a CSV source that uses the Kylin sample data model directly when the debug Tomcat starts. was: Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png! 
* load the CSV data source by pressing "Data Source->Load CSV File as Table" on the "Model" page, set the schema for your table, then press "Submit" to save. !image-2020-07-08-17-42-09-603.png! Most of the time when debugging we just want to build and query a cube easily, but the current way of loading CSV tables and creating the model and cube is complex. So I want to add a CSV source that uses the Kylin sample data model directly when the debug Tomcat starts.
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/8/20, 9:55 AM: Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=546,height=196! was (Author: wangrupeng): Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=1341,height=481! 
[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng edited comment on KYLIN-4625 at 7/8/20, 9:55 AM: Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png|width=1341,height=481! was (Author: wangrupeng): Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png! 
[jira] [Commented] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453 ] wangrupeng commented on KYLIN-4625: --- Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=LOCAL ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" (this is used for the query engine) * start the debug Tomcat and use the models we already defined !screenshot-1.png!
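As a quick sanity check before starting DebugTomcat, the no-Hadoop configuration described in these comments can be verified mechanically. A minimal sketch, assuming a plain key=value properties snippet; this checker is illustrative and not part of Kylin:

```python
# Sketch: check that a kylin.properties snippet matches the no-Hadoop debug
# setup described above (local Spark master, LOCAL env, file:// working dir).
REQUIRED = {
    "kylin.engine.spark-conf.spark.master": "local",
    "kylin.env": "LOCAL",
}


def is_local_debug_config(text: str) -> bool:
    """Parse key=value lines and verify the local-debug invariants."""
    props = dict(
        line.split("=", 1)
        for line in text.splitlines()
        if "=" in line and not line.startswith("#")
    )
    if any(props.get(k) != v for k, v in REQUIRED.items()):
        return False
    # the working dir must be a local file:// URI, not HDFS
    return props.get("kylin.env.hdfs-working-dir", "").startswith("file://")


sample = """kylin.metadata.url=/tmp/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file:///tmp/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.env=LOCAL"""
print(is_local_debug_config(sample))
```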
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Attachment: screenshot-1.png
[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
[ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4625: -- Description: Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir kylin.env=UT ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png! * load the CSV data source by pressing "Data Source->Load CSV File as Table" on the "Model" page, set the schema for your table, then press "Submit" to save. !image-2020-07-08-17-42-09-603.png! Most of the time when debugging we just want to build and query a cube easily, but the current way of loading CSV tables and creating the model and cube is complex. So I want to add a CSV source that uses the Kylin sample data model directly when the debug Tomcat starts. was: Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it's a little bit complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png! * load the CSV data source by pressing "Data Source->Load CSV File as Table" on the "Model" page, and set the schema for your table. 
Then press "submit" to save. !image-2020-07-08-17-42-09-603.png! Most time we debug just want to build and query cube easy. But current way is complex to load csv tables and create model and cube. So, I want to add a csv source which using the model of kylin sample data directly when debug tomcat started. > Debug the code of Kylin on Parquet without hadoop environment > - > > Key: KYLIN-4625 > URL: https://issues.apache.org/jira/browse/KYLIN-4625 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Attachments: image-2020-07-08-17-41-35-954.png, > image-2020-07-08-17-42-09-603.png > > > Currently, Kylin on Parquet already supports debuging source code with local > csv files, but it's a little bit complex. The steps are as follows: > * edit the properties of > $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local >```log >kylin.metadata.url=$LOCAL_META_DIR >kylin.env.zookeeper-is-local=true >kylin.env.hdfs-working-dir=file:///path/to/local/dir >kylin.engine.spark-conf.spark.master=local >kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir >kylin.env=UT >``` > * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option > "-Dspark.local=true" > !image-2020-07-08-17-41-35-954.png! > * Load csv data source by pressing button "Data Source->Load CSV File as > Table" on "Model" page, and set the schema for your table. Then press > "submit" to save. > !image-2020-07-08-17-42-09-603.png! > Most time we debug just want to build and query cube easy. But current way is > complex to load csv tables and create model and cube. So, I want to add a csv > source which using the model of kylin sample data directly when debug tomcat > started. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment
wangrupeng created KYLIN-4625: - Summary: Debug the code of Kylin on Parquet without hadoop environment Key: KYLIN-4625 URL: https://issues.apache.org/jira/browse/KYLIN-4625 Project: Kylin Issue Type: Improvement Components: Spark Engine Reporter: wangrupeng Assignee: wangrupeng Attachments: image-2020-07-08-17-41-35-954.png, image-2020-07-08-17-42-09-603.png Currently, Kylin on Parquet already supports debugging source code with local CSV files, but it is a little complex. The steps are as follows: * edit the properties of $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local values ```log kylin.metadata.url=$LOCAL_META_DIR kylin.env.zookeeper-is-local=true kylin.env.hdfs-working-dir=file:///path/to/local/dir kylin.engine.spark-conf.spark.master=local kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir ``` * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option "-Dspark.local=true" !image-2020-07-08-17-41-35-954.png! * load the CSV data source by pressing the "Data Source->Load CSV File as Table" button on the "Model" page, set the schema for your table, then press "submit" to save. !image-2020-07-08-17-42-09-603.png! Most of the time we debug only to build and query a cube, but the current way, loading CSV tables and creating a model and a cube, is cumbersome. So I want to add a CSV source that uses the model of the Kylin sample data directly when the debug Tomcat starts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4621) Avoid annoying log message when build cube and query
[ https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153331#comment-17153331 ] wangrupeng commented on KYLIN-4621: --- * add a new log4j properties configuration file called kylin-parquet-log4j.properties * set the default log level of Spark to "WARN" > Avoid annoying log message when build cube and query > > > Key: KYLIN-4621 > URL: https://issues.apache.org/jira/browse/KYLIN-4621 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > # Build > There will be about 40 thousands rows log messages of one build task, > most of them are unnecessary. > # Query > The first time of query, kylin will init one spark context and print all > jars and classes loaded. This can be print in kylin.out not kylin.log. -- This message was sent by Atlassian Jira (v8.3.4#803005)
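The comment above only names the new file; a kylin-parquet-log4j.properties along these lines would keep Spark quiet below WARN (a sketch using standard log4j 1.x property syntax; Kylin's actual appender layout may differ):

```log
# kylin-parquet-log4j.properties (sketch; exact layout is an assumption)
log4j.rootLogger=INFO,stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} : %m%n
# Spark is chatty during build and query; keep it at WARN by default
log4j.logger.org.apache.spark=WARN
```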
[jira] [Issue Comment Deleted] (KYLIN-4621) Avoid annoying log message when build cube and query
[ https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4621: -- Comment: was deleted (was: https://github.com/apache/kylin/pull/1310) > Avoid annoying log message when build cube and query > > > Key: KYLIN-4621 > URL: https://issues.apache.org/jira/browse/KYLIN-4621 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > # Build > There will be about 40 thousands rows log messages of one build task, > most of them are unnecessary. > # Query > The first time of query, kylin will init one spark context and print all > jars and classes loaded. This can be print in kylin.out not kylin.log. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4621) Avoid annoying log message when build cube and query
[ https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153328#comment-17153328 ] wangrupeng commented on KYLIN-4621: --- https://github.com/apache/kylin/pull/1310 > Avoid annoying log message when build cube and query > > > Key: KYLIN-4621 > URL: https://issues.apache.org/jira/browse/KYLIN-4621 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > # Build > There will be about 40 thousands rows log messages of one build task, > most of them are unnecessary. > # Query > The first time of query, kylin will init one spark context and print all > jars and classes loaded. This can be print in kylin.out not kylin.log. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4621) Avoid annoying log message when build cube and query
[ https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4621: -- Sprint: Sprint 53 > Avoid annoying log message when build cube and query > > > Key: KYLIN-4621 > URL: https://issues.apache.org/jira/browse/KYLIN-4621 > Project: Kylin > Issue Type: Improvement > Components: Spark Engine >Affects Versions: v4.0.0-beta >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > # Build > There will be about 40 thousands rows log messages of one build task, > most of them are unnecessary. > # Query > The first time of query, kylin will init one spark context and print all > jars and classes loaded. This can be print in kylin.out not kylin.log. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4621) Avoid annoying log message when build cube and query
wangrupeng created KYLIN-4621: - Summary: Avoid annoying log message when build cube and query Key: KYLIN-4621 URL: https://issues.apache.org/jira/browse/KYLIN-4621 Project: Kylin Issue Type: Improvement Components: Spark Engine Affects Versions: v4.0.0-beta Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta # Build There will be about 40 thousand lines of log messages for one build task, most of them unnecessary. # Query On the first query, Kylin will init one Spark context and print all the jars and classes loaded. This could be printed to kylin.out instead of kylin.log. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KYLIN-4516) Support System Cube
[ https://issues.apache.org/jira/browse/KYLIN-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng resolved KYLIN-4516. --- Resolution: Fixed > Support System Cube > --- > > Key: KYLIN-4516 > URL: https://issues.apache.org/jira/browse/KYLIN-4516 > Project: Kylin > Issue Type: Sub-task >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4526) Enhance get the hive table rows
[ https://issues.apache.org/jira/browse/KYLIN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4526: -- Sprint: Sprint 53 > Enhance get the hive table rows > --- > > Key: KYLIN-4526 > URL: https://issues.apache.org/jira/browse/KYLIN-4526 > Project: Kylin > Issue Type: Task >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > > In kylin-4315, we get the rows of the hive table from metadata, but when we > turn off hive's statistics feature(`hive.stats.autogather=false`), we can't > get the correct rows of hive table from metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4527) Beautify the drop-down list of the cube on query page
[ https://issues.apache.org/jira/browse/KYLIN-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4527: -- Sprint: Sprint 53 > Beautify the drop-down list of the cube on query page > - > > Key: KYLIN-4527 > URL: https://issues.apache.org/jira/browse/KYLIN-4527 > Project: Kylin > Issue Type: Improvement >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Major > Attachments: image-2020-05-27-12-05-49-425.png, > image-2020-05-27-12-22-19-097.png > > > The drop-down list of cube is very compact, which is not convenient to select > cube > Before: > !image-2020-05-27-12-05-49-425.png|width=424,height=212! > After: > !image-2020-05-27-12-22-19-097.png|width=429,height=249! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4405) Internal exception when trying to build cube whose model has null PartitionDesc
[ https://issues.apache.org/jira/browse/KYLIN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136609#comment-17136609 ] wangrupeng commented on KYLIN-4405: --- Sorry, it seems this problem has already been resolved in the master branch. > Internal exception when trying to build cube whose modal has null > PartitionDesc > > > Key: KYLIN-4405 > URL: https://issues.apache.org/jira/browse/KYLIN-4405 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.6.2 >Reporter: Chester Liu >Assignee: Chester Liu >Priority: Minor > Fix For: v3.1.0, v3.0.2, v2.6.6 > > > We are using 2.6.2 in our production environment and came upon this > exception. We build our model and cube using the REST api, which allows null > partitionDesc in a kylin model. However, when we try to build the cube > related to the model, this exception occurs: > > {{org.apache.kylin.rest.exception.InternalErrorException}} > > {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:398)org.apache.kylin.rest.controller.CubeController.rebuild(CubeController.java:354)}} > > {{org.apache.kylin.rest.controller.CubeController.build(CubeController.java:343)}} > {{sun.reflect.GeneratedMethodAccessor233.invoke(Unknown > Source)sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}} > {{java.lang.reflect.Method.invoke(Method.java:497)}} > {{...}} > {{Caused by: java.lang.NullPointerException}} > > {{org.apache.kylin.cube.CubeManager$SegmentAssist.appendSegment(CubeManager.java:695)}} > {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:638)}} > {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:630)}} > > {{org.apache.kylin.rest.service.JobService.submitJobInternal(JobService.java:233)}} > {{org.apache.kylin.rest.service.JobService.submitJob(JobService.java:202)}} > > {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:394)}} > Our current
solution is using a not-null partitionDesc with empty > partitionDateColumn. But I think ultimately a null partitionDesc makes more > sense to me when our data source is not partitioned in the first place. > I searched the codebase for null-checking of partitionDesc and found several > of them. So I think an extra null-check should be added. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4405) Internal exception when trying to build cube whose model has null PartitionDesc
[ https://issues.apache.org/jira/browse/KYLIN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136554#comment-17136554 ] wangrupeng commented on KYLIN-4405: --- I could not reproduce this problem. I sent a build request using the REST API with a null partitionDesc in the model, but the cube build job finished successfully. > Internal exception when trying to build cube whose modal has null > PartitionDesc > > > Key: KYLIN-4405 > URL: https://issues.apache.org/jira/browse/KYLIN-4405 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.6.2 >Reporter: Chester Liu >Assignee: Chester Liu >Priority: Minor > Fix For: v3.1.0, v3.0.2, v2.6.6 > > > We are using 2.6.2 in our production environment and came upon this > exception. We build our model and cube using the REST api, which allows null > partitionDesc in a kylin model. However, when we try to build the cube > related to the model, this exception occurs: > > {{org.apache.kylin.rest.exception.InternalErrorException}} > > {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:398)org.apache.kylin.rest.controller.CubeController.rebuild(CubeController.java:354)}} > > {{org.apache.kylin.rest.controller.CubeController.build(CubeController.java:343)}} > {{sun.reflect.GeneratedMethodAccessor233.invoke(Unknown > Source)sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}} > {{java.lang.reflect.Method.invoke(Method.java:497)}} > {{...}} > {{Caused by: java.lang.NullPointerException}} > > {{org.apache.kylin.cube.CubeManager$SegmentAssist.appendSegment(CubeManager.java:695)}} > {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:638)}} > {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:630)}} > > {{org.apache.kylin.rest.service.JobService.submitJobInternal(JobService.java:233)}} > {{org.apache.kylin.rest.service.JobService.submitJob(JobService.java:202)}} > >
{{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:394)}} > Our current solution is using a not-null partitionDesc with empty > partitionDateColumn. But I think ultimately a null partitionDesc makes more > sense to me when our data source is not partitioned in the first place. > I searched the codebase for null-checking of partitionDesc and found several > of them. So I think an extra null-check should be added. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
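The extra null check the reporter suggests can be sketched as follows (a Python mock of the appendSegment flow, not Kylin's actual Java API; the class and field names are simplified assumptions). The point is to treat a missing partitionDesc as an unpartitioned model and fail soft into a full-range segment instead of letting a NullPointerException surface deep inside CubeManager:

```python
FULL_RANGE = (0, 2**63 - 1)  # stand-in for a non-partitioned full build

class PartitionDesc:
    """Minimal stand-in for Kylin's PartitionDesc (names are assumptions)."""
    def __init__(self, partition_date_column=None):
        self.partition_date_column = partition_date_column

def append_segment(partition_desc, start, end):
    """Return the (start, end) range for a new segment.

    Guard: if the model has no partition description (or no partition
    date column), build one full segment rather than dereferencing null.
    """
    if partition_desc is None or partition_desc.partition_date_column is None:
        return FULL_RANGE
    return (start, end)
```

A caller that previously crashed on a null partitionDesc now simply gets a full-build segment range.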
[jira] [Updated] (KYLIN-4563) Support for specifying cuboids when building segments
[ https://issues.apache.org/jira/browse/KYLIN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4563: -- Sprint: Sprint 53 > Support for specifying cuboids when building segments > - > > Key: KYLIN-4563 > URL: https://issues.apache.org/jira/browse/KYLIN-4563 > Project: Kylin > Issue Type: Improvement >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > > Currently, when building a segment, all cuboids of cube will be built. > Especially after removing or adding some cuboids with cube planner, there's > no need to rebuild all segments, we can only remove or add the cuboids data > we need. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4563) Support for specifying cuboids when building segments
wangrupeng created KYLIN-4563: - Summary: Support for specifying cuboids when building segments Key: KYLIN-4563 URL: https://issues.apache.org/jira/browse/KYLIN-4563 Project: Kylin Issue Type: Improvement Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta Currently, when building a segment, all cuboids of the cube will be built. Especially after removing or adding some cuboids with the cube planner, there is no need to rebuild all segments; we can simply remove or add only the cuboid data we need. -- This message was sent by Atlassian Jira (v8.3.4#803005)
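The improvement described above boils down to a set difference between the cuboids already built in a segment and the newly recommended list. A minimal sketch (cuboid IDs as plain integers; this is illustrative, not Kylin's actual job API):

```python
def cuboid_delta(built, recommended):
    """Compare a segment's built cuboids with the recommended set and
    return what still needs building and what can be dropped."""
    to_build = set(recommended) - set(built)
    to_drop = set(built) - set(recommended)
    return to_build, to_drop
```

Only `to_build` needs a new build job; `to_drop` is just cleanup of existing cuboid data, so the segment as a whole is never rebuilt.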
[jira] [Updated] (KYLIN-4224) Create flat table with spark sql
[ https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4224: -- Labels: (was: doc) > Create flat table wich spark sql > > > Key: KYLIN-4224 > URL: https://issues.apache.org/jira/browse/KYLIN-4224 > Project: Kylin > Issue Type: Sub-task >Reporter: weibin0516 >Assignee: weibin0516 >Priority: Major > Fix For: v3.1.0 > > > Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741. > Currently using hive to create flat table, hive can't read spark datasource > data, we need to support the creation of flat table with spark sql, because > it can read hive and spark datasource data at the same time to create flat > table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4224) Create flat table with spark sql
[ https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4224: -- Labels: doc (was: ) > Create flat table wich spark sql > > > Key: KYLIN-4224 > URL: https://issues.apache.org/jira/browse/KYLIN-4224 > Project: Kylin > Issue Type: Sub-task >Reporter: weibin0516 >Assignee: weibin0516 >Priority: Major > Labels: doc > Fix For: v3.1.0 > > > Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741. > Currently using hive to create flat table, hive can't read spark datasource > data, we need to support the creation of flat table with spark sql, because > it can read hive and spark datasource data at the same time to create flat > table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (KYLIN-4518) Pruning cuboids with genetic algorithm
[ https://issues.apache.org/jira/browse/KYLIN-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng closed KYLIN-4518. - Resolution: Resolved > Pruning cuboids with genetic algorithm > -- > > Key: KYLIN-4518 > URL: https://issues.apache.org/jira/browse/KYLIN-4518 > Project: Kylin > Issue Type: Sub-task >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (KYLIN-4517) Pruning cuboids with greedy algorithm
[ https://issues.apache.org/jira/browse/KYLIN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng closed KYLIN-4517. - Resolution: Resolved > Pruning cuboids with greedy algorithm > -- > > Key: KYLIN-4517 > URL: https://issues.apache.org/jira/browse/KYLIN-4517 > Project: Kylin > Issue Type: Sub-task >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4498) CubePlanner for Kylin on Parquet
[ https://issues.apache.org/jira/browse/KYLIN-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127465#comment-17127465 ] wangrupeng commented on KYLIN-4498: --- Cube Planner Proposal
Cube Planner checks the costs and benefits of each dimension combination and selects cost-effective dimension combination sets to improve cube build efficiency and query performance. Cube Planner has two phases. It was designed and contributed by eBay; see more about its principles here (https://tech.ebayinc.com/engineering/cube-planner-build-an-apache-kylin-olap-cube-efficiently-and-intelligently/). In my opinion, to let Cube Planner support Kylin on Parquet, we need to make some changes to the current Spark engine for building cubes. My suggestion is as follows. The front-end interaction remains the same as before.
Phase 1 (building the cube for the first time):
1. Add a new step that calculates the row count of each cuboid with Spark before the cube building step (Kylin on Parquet currently has two steps for cube building).
2. During the cube building step, recommend the cuboid list with the greedy algorithm or the genetic algorithm before building the cube. The code of these two algorithms can be reused.
Phase 2 (after the cube has been used for a while):
1. Use the System Cube, which can now be used normally, to collect query metrics (including cuboid scan rows and scan bytes).
2. Add a new Spark job that optimizes and rebuilds the cube with the information collected by the System Cube.
3. The steps of the new optimize job: a. use the query metrics to recommend cuboids; b. rebuild the old segment by removing unneeded cuboids and adding needed ones (the Kylin on Parquet build engine can add only the cuboids, now also called "layouts", that we need, without rebuilding all cuboids of the segment); c. update the metadata. > CubePlaner for Kylin on Parquet > --- > > Key: KYLIN-4498 > URL: https://issues.apache.org/jira/browse/KYLIN-4498 > Project: Kylin > Issue Type: New Feature >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > > CubePlanner still doesn't support Kylin on Parquet yet. We need this to be > more resource efficient. -- This message was sent by Atlassian Jira (v8.3.4#803005)
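The greedy recommendation step from phase 1 above can be sketched as follows (a simplified benefit-per-unit-cost model with cuboids as frozensets of dimensions; an illustrative sketch, not Kylin's actual CuboidRecommendAlgorithm implementation):

```python
def scan_cost(cuboid, selected, rows):
    # A query on `cuboid` is answered by the cheapest selected ancestor:
    # a selected cuboid whose dimension set contains the query's dimensions.
    return min(rows[s] for s in selected if cuboid <= s)

def greedy_select(rows, budget):
    """Greedily pick up to `budget` cuboids that most reduce total scan cost.

    rows: {frozenset(dimensions): estimated row count}. The base (full)
    cuboid is always kept so every query has at least one ancestor.
    """
    base = max(rows, key=len)          # full cuboid: contains every dimension
    selected = {base}
    while len(selected) < budget:
        candidates = set(rows) - selected
        if not candidates:
            break

        def benefit(c):
            # Total cost reduction over all cuboids that c can answer.
            return sum(max(0, scan_cost(q, selected, rows) - rows[c])
                       for q in rows if q <= c)

        best = max(candidates, key=benefit)
        if benefit(best) <= 0:
            break                      # no remaining cuboid pays for itself
        selected.add(best)
    return selected
```

Each round adds the cuboid whose materialization saves the most scanned rows across all queries it can serve, which is the classic greedy view-selection idea the comment refers to.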
[jira] [Assigned] (KYLIN-4544) Kylin on Parquet LEFT JOIN Query failed
[ https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng reassigned KYLIN-4544: - Assignee: wangrupeng > Kylin on Parquet LEFT JOIN Query failed > --- > > Key: KYLIN-4544 > URL: https://issues.apache.org/jira/browse/KYLIN-4544 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: bright liao >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > > select t.n,t1.n from ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t > LEFT JOIN ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200526') as t1 > on 1=1 > This sql execute failed while kylin-3.0 runs with no problem. > Error message: > Error while applying rule OLAPJoinRule, args > [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1, > 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from > ( select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t LEFT JOIN ( select count(*) as n from > dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit > 5" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4544) Kylin on Parquet LEFT JOIN Query failed
[ https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4544: -- Description: select t.n,t1.n from ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') as t LEFT JOIN ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1 This SQL fails to execute, while it runs with no problem on Kylin 3.0. Error message: Error while applying rule OLAPJoinRule, args [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1, 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') as t LEFT JOIN ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit 5" was: select t.n,t1.n from ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') as t LEFT JOIN ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1 The query fails, but it works on version 3.0. The error message is as follows: Error while applying rule OLAPJoinRule, args [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1, 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') as t LEFT JOIN ( select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit 5" > Kylin on Parquet LEFT JOIN Query failed > --- > > Key: KYLIN-4544 > URL: https://issues.apache.org/jira/browse/KYLIN-4544 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: bright liao >Priority: Major > Fix For: v4.0.0-beta > > > select t.n,t1.n from ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t > LEFT JOIN ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200526') as t1 > on 1=1 > This sql execute failed while kylin-3.0 runs with no problem. > Error message: > Error while applying rule OLAPJoinRule, args > [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1, > 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from > ( select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t LEFT JOIN ( select count(*) as n from > dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit > 5" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4544) Kylin on Parquet LEFT JOIN Query failed
[ https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4544: -- Summary: Kylin on Parquet LEFT JOIN Query failed (was: Kylin on Parquet is not compatible with LEFT JOIN) > Kylin on Parquet LEFT JOIN Query failed > --- > > Key: KYLIN-4544 > URL: https://issues.apache.org/jira/browse/KYLIN-4544 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: bright liao >Priority: Major > Fix For: v4.0.0-beta > > > select t.n,t1.n from ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t > LEFT JOIN ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200526') as t1 > on 1=1 > The query fails, but it works on version 3.0. > The error message is as follows: > Error while applying rule OLAPJoinRule, args > [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1, > 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from > ( select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t LEFT JOIN ( select count(*) as n from > dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit > 5" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4544) Kylin on Parquet is not compatible with LEFT JOIN
[ https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4544: -- Affects Version/s: (was: v4.0.0-beta) > Kylin on Parquet is not compatible with LEFT JOIN > > > Key: KYLIN-4544 > URL: https://issues.apache.org/jira/browse/KYLIN-4544 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: bright liao >Priority: Major > Fix For: v4.0.0-beta > > > select t.n,t1.n from ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t > LEFT JOIN ( > select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200526') as t1 > on 1=1 > The query fails, but it works on version 3.0. > The error message is as follows: > Error while applying rule OLAPJoinRule, args > [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1, > 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from > ( select count(*) as n from dm.dm_bi_device_info_day where > deal_date='20200527') as t LEFT JOIN ( select count(*) as n from > dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit > 5" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4498) CubePlanner for Kylin on Parquet
[ https://issues.apache.org/jira/browse/KYLIN-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4498: -- Description: CubePlanner still doesn't support Kylin on Parquet yet. We need this to be more resource efficient. (was: Kylin on Parquet still doesn't support CubePlanner yet. We need this to be more resource efficient.) > CubePlaner for Kylin on Parquet > --- > > Key: KYLIN-4498 > URL: https://issues.apache.org/jira/browse/KYLIN-4498 > Project: Kylin > Issue Type: New Feature >Reporter: wangrupeng >Assignee: wangrupeng >Priority: Minor > Fix For: v4.0.0-beta > > > CubePlanner still doesn't support Kylin on Parquet yet. We need this to be > more resource efficient. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4458) FilePruner prune shards
[ https://issues.apache.org/jira/browse/KYLIN-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng updated KYLIN-4458: -- Sprint: Sprint 51 (was: Sprint 52) > FilePruner prune shards > --- > > Key: KYLIN-4458 > URL: https://issues.apache.org/jira/browse/KYLIN-4458 > Project: Kylin > Issue Type: Improvement > Components: Storage - Parquet >Reporter: xuekaiqi >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > Original Estimate: 72h > Remaining Estimate: 72h > > To enable pruning by shard columns, web front end needs to add "shard by > column" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KYLIN-4458) FilePruner prune shards
[ https://issues.apache.org/jira/browse/KYLIN-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng resolved KYLIN-4458. --- Resolution: Fixed > FilePruner prune shards > --- > > Key: KYLIN-4458 > URL: https://issues.apache.org/jira/browse/KYLIN-4458 > Project: Kylin > Issue Type: Improvement > Components: Storage - Parquet >Reporter: xuekaiqi >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > Original Estimate: 72h > Remaining Estimate: 72h > > To enable pruning by shard columns, web front end needs to add "shard by > column" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KYLIN-4450) Add the feature that adjusting spark driver memory adaptively
[ https://issues.apache.org/jira/browse/KYLIN-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangrupeng resolved KYLIN-4450. --- Resolution: Fixed > Add the feature that adjusting spark driver memory adaptively > - > > Key: KYLIN-4450 > URL: https://issues.apache.org/jira/browse/KYLIN-4450 > Project: Kylin > Issue Type: Improvement > Components: Storage - Parquet >Reporter: xuekaiqi >Assignee: wangrupeng >Priority: Major > Fix For: v4.0.0-beta > > Original Estimate: 16h > Remaining Estimate: 16h > > For now the cubing job can adaptively adjust the following spark properties > to use resources rationally, but the driver memory of the spark job > submitted to the cluster hasn't been handled yet. > > {code:java} > spark.executor.memory > spark.executor.cores > spark.executor.memoryOverhead > spark.executor.instances > spark.sql.shuffle.partitions > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
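"Adaptively adjusting driver memory" can be sketched as stepping the memory up with the size of the build, for example with the cuboid count (the thresholds and step sizes below are illustrative assumptions, not the values Kylin actually uses):

```python
def driver_memory_mb(cuboid_count, base_mb=1024, step_mb=1024,
                     thresholds=(2, 20, 100)):
    """Pick spark driver memory from the number of cuboids to build.

    One extra step of memory is granted for each threshold the cuboid
    count exceeds, so small jobs stay cheap and large jobs don't OOM.
    """
    extra_steps = sum(1 for t in thresholds if cuboid_count > t)
    return base_mb + extra_steps * step_mb
```

The result would then be written into spark.driver.memory before the build job is submitted to the cluster.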
[jira] [Created] (KYLIN-4519) Analyse user query history and offer suggestions about removing unused cuboids
wangrupeng created KYLIN-4519: - Summary: Analyse user query history and offer suggestions about removing unused cuboids Key: KYLIN-4519 URL: https://issues.apache.org/jira/browse/KYLIN-4519 Project: Kylin Issue Type: Sub-task Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4517) Pruning cuboids with greedy algorithm
wangrupeng created KYLIN-4517: - Summary: Pruning cuboids with greedy algorithm Key: KYLIN-4517 URL: https://issues.apache.org/jira/browse/KYLIN-4517 Project: Kylin Issue Type: Sub-task Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4518) Pruning cuboids with genetic algorithm
wangrupeng created KYLIN-4518: - Summary: Pruning cuboids with genetic algorithm Key: KYLIN-4518 URL: https://issues.apache.org/jira/browse/KYLIN-4518 Project: Kylin Issue Type: Sub-task Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4516) Support System Cube
wangrupeng created KYLIN-4516: - Summary: Support System Cube Key: KYLIN-4516 URL: https://issues.apache.org/jira/browse/KYLIN-4516 Project: Kylin Issue Type: Sub-task Reporter: wangrupeng Assignee: wangrupeng Fix For: v4.0.0-beta -- This message was sent by Atlassian Jira (v8.3.4#803005)