[jira] [Created] (KYLIN-4782) Verify whether the query hits the correct cuboid in IT

2020-10-08 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4782:
-

 Summary: Verify whether the query hits the correct cuboid in IT
 Key: KYLIN-4782
 URL: https://issues.apache.org/jira/browse/KYLIN-4782
 Project: Kylin
  Issue Type: Improvement
  Components: Integration
Affects Versions: v4.0.0-alpha
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1

2020-09-28 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng edited comment on KYLIN-4776 at 9/28/20, 6:53 AM:
-

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|Not passed|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|Yes|No need|No|
|KYLIN-4656|No Need|No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 


was (Author: wangrupeng):
 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|Not passed|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No Need|No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1

2020-09-27 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng edited comment on KYLIN-4776 at 9/28/20, 3:08 AM:
-

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|Not passed|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No Need|No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 


was (Author: wangrupeng):
 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|Not passed|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No |No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1

2020-09-27 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng edited comment on KYLIN-4776 at 9/28/20, 2:47 AM:
-

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|Not passed|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No |No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 


was (Author: wangrupeng):
 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|No|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No |No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1

2020-09-27 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng edited comment on KYLIN-4776 at 9/28/20, 1:55 AM:
-

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|No|
|KYLIN-4709|No need|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No |No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 


was (Author: wangrupeng):
 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|No|
|KYLIN-4709|Yes|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No |No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1

2020-09-27 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng edited comment on KYLIN-4776 at 9/28/20, 1:53 AM:
-

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|Yes|Yes|No|
|KYLIN-4709|Yes|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No |No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No need|No need|No|

 

 


was (Author: wangrupeng):
 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|No|Yes|No|
|KYLIN-4709|No|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No|No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Comment Edited] (KYLIN-4776) Release Kylin v3.1.1

2020-09-27 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng edited comment on KYLIN-4776 at 9/27/20, 12:59 PM:
--

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|No|Yes|No|
|KYLIN-4709|No|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No|No need|No|
|KYLIN-4648|Yes|No need|No|
|KYLIN-4581|No|No need|No|

 

 


was (Author: wangrupeng):
 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|No|Yes|No|
|KYLIN-4709|No|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No|No need|No|
|KYLIN-4648|No|No need|No|
|KYLIN-4581|No|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Commented] (KYLIN-4776) Release Kylin v3.1.1

2020-09-27 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202823#comment-17202823
 ] 

wangrupeng commented on KYLIN-4776:
---

 

 
||Issue ID||Verified?||Documentation updated?||Others||
|KYLIN-4712|No|Yes|No|
|KYLIN-4709|No|No need|No|
|KYLIN-4688|Yes|No need|No|
|KYLIN-4657|No|No need|No|
|KYLIN-4656|No|No need|No|
|KYLIN-4648|No|No need|No|
|KYLIN-4581|No|No need|No|

 

 

> Release Kylin v3.1.1
> 
>
> Key: KYLIN-4776
> URL: https://issues.apache.org/jira/browse/KYLIN-4776
> Project: Kylin
>  Issue Type: Test
>  Components: Release
>Affects Versions: v3.1.0
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Release Plan for Kylin v3.1.1 
>  
> ||Information||Value||
> |Release Manager|Xiaoxiang Yu|
> |Voting Date|2020/10/15|
> h3. Issue List
> https://issues.apache.org/jira/projects/KYLIN/versions/12348354
> h3. Issue Verification Assignee
> ||Assignee ||Issue||Count||
> |Zhichao Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> tianhui5 OR assignee = xxyu )|9|
> |Yaqian Zhang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> gxcheng  )|13|
> |Rupeng Wang|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> itzhangqiang or assignee = zhangyaqian or assignee = zhangzc or assignee =
> julianpan )|10|
> |Xiaoxiang Yu|project = 12316121 AND fixVersion = 12348354 and (assignee = 
> xiaoge )|14|





[jira] [Updated] (KYLIN-4725) NSparkCubingStep returns error state when pausing a build job

2020-09-27 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4725:
--
Fix Version/s: (was: v3.1.1)
   v4.0.0-alpha

> NSparkCubingStep returns error state when pausing a build job
> -
>
> Key: KYLIN-4725
> URL: https://issues.apache.org/jira/browse/KYLIN-4725
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v4.0.0-alpha
>Reporter: Zhichao  Zhang
>Assignee: Yaqian Zhang
>Priority: Major
> Fix For: v4.0.0-alpha
>
>
> When pausing a build job, NSparkCubingStep returns the
> ExecuteResult.State.ERROR state; it should be ExecuteResult.State.STOPPED.
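The fix described above amounts to mapping a pause to a non-error terminal state. A minimal sketch, assuming a boolean pause flag; the helper and its arguments are illustrative, only the state names come from the report:

```python
from enum import Enum

class State(Enum):
    # Mirrors the ExecuteResult.State values named in the report.
    SUCCEED = "SUCCEED"
    ERROR = "ERROR"
    STOPPED = "STOPPED"

def result_state(paused: bool, failed: bool) -> State:
    # A paused build is not a failure: report STOPPED so the job can be
    # resumed later, and reserve ERROR for genuine build failures.
    if paused:
        return State.STOPPED
    return State.ERROR if failed else State.SUCCEED
```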





[jira] [Created] (KYLIN-4765) Set spark.sql.shuffle.partitions to 1 for local debugging

2020-09-20 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4765:
-

 Summary: Set spark.sql.shuffle.partitions to 1 for local debugging
 Key: KYLIN-4765
 URL: https://issues.apache.org/jira/browse/KYLIN-4765
 Project: Kylin
  Issue Type: Improvement
Reporter: wangrupeng
Assignee: wangrupeng


Currently, spark.sql.shuffle.partitions is set automatically in cluster mode,
but local debugging falls back to the default value of 200, which makes the
build slow and inefficient.
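The improvement can be sketched as choosing the partition count from the Spark master; `choose_shuffle_partitions` is a hypothetical helper for illustration, not Kylin's actual code:

```python
def choose_shuffle_partitions(spark_master: str, cluster_default: int = 200) -> int:
    """Pick a value for spark.sql.shuffle.partitions.

    A local debug run has a single JVM, so shuffling into the default 200
    partitions only adds task-scheduling overhead; one partition keeps the
    local build fast.
    """
    if spark_master.startswith("local"):
        return 1
    return cluster_default

# e.g. spark.conf.set("spark.sql.shuffle.partitions",
#                     str(choose_shuffle_partitions("local[*]")))
```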

 





[jira] [Assigned] (KYLIN-4760) Optimize TopN measure

2020-09-15 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng reassigned KYLIN-4760:
-

  Component/s: Measure - TopN
Fix Version/s: v4.0.0-beta
Affects Version/s: v4.0.0-alpha
 Assignee: wangrupeng
  Description: Now, each time the buffer of the TopN update function
inserts one row, the buffer is re-sorted, which slows down the build.
  Summary: Optimize TopN measure  (was: Optimize TopN)

> Optimize TopN measure
> -
>
> Key: KYLIN-4760
> URL: https://issues.apache.org/jira/browse/KYLIN-4760
> Project: Kylin
>  Issue Type: Improvement
>  Components: Measure - TopN
>Affects Versions: v4.0.0-alpha
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> Now, each time the buffer of the TopN update function inserts one row, the
> buffer is re-sorted, which slows down the build.





[jira] [Created] (KYLIN-4760) Optimize TopN

2020-09-15 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4760:
-

 Summary: Optimize TopN
 Key: KYLIN-4760
 URL: https://issues.apache.org/jira/browse/KYLIN-4760
 Project: Kylin
  Issue Type: Improvement
Reporter: wangrupeng








[jira] [Resolved] (KYLIN-4459) Continuous print warning log-DFSInputStream has been closed already

2020-09-11 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng resolved KYLIN-4459.
---
Fix Version/s: v4.0.0-alpha
   Resolution: Fixed

> Continuous print warning log-DFSInputStream has been closed already
> ---
>
> Key: KYLIN-4459
> URL: https://issues.apache.org/jira/browse/KYLIN-4459
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: xuekaiqi
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-alpha
>
>
> When starting Kylin in Tomcat debug mode, we can see these logs.
> {code:java}
> 2020-03-12 10:17:06,082 ERROR [pool-12-thread-1] curator.CuratorScheduler:205 
> : Node(127.0.0.1) job server state conflict. Is ZK leader: true; Is active 
> job server: false
> 2020-03-12 10:17:06,830 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:06,830 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:06,830 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:08,717 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:08,717 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:08,717 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:12,936 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:12,936 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:12,936 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:14,152 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:14,152 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:14,152 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:17,603 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:17,603 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:17,603 WARN [Curator-LeaderSelector-0] hdfs.DFSClient:669 : 
> DFSInputStream has been closed already
> 2020-03-12 10:17:17,603 INFO [Curator-LeaderSelector-0] 
> threadpool.DefaultScheduler:166 : Finishing resume all running jobs.
> 2020-03-12 10:17:17,603 INFO [Curator-LeaderSelector-0] 
> threadpool.DefaultScheduler:170 : Fetching jobs every 30 seconds
> 2020-03-12 10:17:17,603 INFO [Curator-LeaderSelector-0] 
> threadpool.DefaultScheduler:180 : Creating fetcher pool instance:2094578449
> {code}
> Upgrading the Hadoop version to 2.7.2 can fix it, but this needs more testing
> in case of unpredictable problems.
> [https://community.cloudera.com/t5/Support-Questions/DFSInputStream-has-been-closed-already/td-p/125487]





[jira] [Created] (KYLIN-4748) Optimize metadata for local debugging

2020-09-04 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4748:
-

 Summary: Optimize metadata for local debugging
 Key: KYLIN-4748
 URL: https://issues.apache.org/jira/browse/KYLIN-4748
 Project: Kylin
  Issue Type: Improvement
Reporter: wangrupeng
Assignee: wangrupeng


* Add count distinct and percentile measures
 * Add a new column KYLIN_SALES.ITEM_ID for count distinct
 * Set SELLER_ID as the shard-by column
 * Add the cube configuration
*kylin.storage.columnar.shard-countdistinct-rowcount=1000* for the file pruner
by shard
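The shard-by setup above enables file pruning at query time. A minimal sketch, assuming a simple hash-modulo sharding scheme and `part-N.parquet` file naming; both are illustrative assumptions, not Kylin's actual layout:

```python
import zlib

def shard_of(value: str, num_shards: int) -> int:
    # Route a shard-by column value to one shard via a stable hash.
    return zlib.crc32(value.encode("utf-8")) % num_shards

def files_to_scan(filter_value: str, num_shards: int) -> list:
    # With an equality filter on the shard-by column (e.g. SELLER_ID),
    # only the single matching shard file needs to be read instead of
    # all num_shards files.
    return ["part-%05d.parquet" % shard_of(filter_value, num_shards)]
```

With SELLER_ID as the shard-by column, a filter like SELLER_ID = 'S123' would then touch one Parquet file rather than every shard of the cuboid.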





[jira] [Updated] (KYLIN-4742) NullPointerException when auto-merging segments if discarded jobs exist

2020-09-03 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4742:
--
  Component/s: Tools, Build and Test
Fix Version/s: v4.0.0-alpha

> NullPointerException when auto-merging segments if discarded jobs exist
> ---
>
> Key: KYLIN-4742
> URL: https://issues.apache.org/jira/browse/KYLIN-4742
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-alpha
>
> Attachments: image-2020-09-03-14-05-56-127.png
>
>
> This happens because the merge job does not set the segment name, so an NPE
> is thrown when the job gets the segment name
> !image-2020-09-03-14-05-56-127.png|width=712,height=147!





[jira] [Created] (KYLIN-4742) NullPointerException when auto-merging segments if discarded jobs exist

2020-09-03 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4742:
-

 Summary: NullPointerException when auto-merging segments if
discarded jobs exist
 Key: KYLIN-4742
 URL: https://issues.apache.org/jira/browse/KYLIN-4742
 Project: Kylin
  Issue Type: Bug
Reporter: wangrupeng
Assignee: wangrupeng
 Attachments: image-2020-09-03-14-05-56-127.png

This happens because the merge job does not set the segment name, so an NPE is
thrown when the job gets the segment name

!image-2020-09-03-14-05-56-127.png|width=712,height=147!





[jira] [Updated] (KYLIN-4729) The Hive table will be overwritten when adding a CSV table with the same name

2020-08-28 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4729:
--
Fix Version/s: v4.0.0-alpha
Affects Version/s: v4.0.0-alpha

> The Hive table will be overwritten when adding a CSV table with the same name
> ---
>
> Key: KYLIN-4729
> URL: https://issues.apache.org/jira/browse/KYLIN-4729
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v4.0.0-alpha
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-alpha
>
>






[jira] [Updated] (KYLIN-4729) The Hive table will be overwritten when adding a CSV table with the same name

2020-08-28 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4729:
--
Summary: The Hive table will be overwritten when adding a CSV table with the
same name  (was: The hive)

> The Hive table will be overwritten when adding a CSV table with the same name
> ---
>
> Key: KYLIN-4729
> URL: https://issues.apache.org/jira/browse/KYLIN-4729
> Project: Kylin
>  Issue Type: Bug
>Reporter: wangrupeng
>Priority: Major
>






[jira] [Assigned] (KYLIN-4729) The Hive table will be overwritten when adding a CSV table with the same name

2020-08-28 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng reassigned KYLIN-4729:
-

Assignee: wangrupeng

> The Hive table will be overwritten when adding a CSV table with the same name
> ---
>
> Key: KYLIN-4729
> URL: https://issues.apache.org/jira/browse/KYLIN-4729
> Project: Kylin
>  Issue Type: Bug
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
>






[jira] [Created] (KYLIN-4729) The hive

2020-08-28 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4729:
-

 Summary: The hive
 Key: KYLIN-4729
 URL: https://issues.apache.org/jira/browse/KYLIN-4729
 Project: Kylin
  Issue Type: Bug
Reporter: wangrupeng








[jira] [Created] (KYLIN-4723) Set the configurations about shard by to cube level

2020-08-27 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4723:
-

 Summary: Set the configurations about shard by to cube level
 Key: KYLIN-4723
 URL: https://issues.apache.org/jira/browse/KYLIN-4723
 Project: Kylin
  Issue Type: Improvement
  Components: Tools, Build and Test
Affects Versions: v4.0.0-alpha
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-alpha


Now the shard-by related configurations, such as
kylin.storage.columnar.shard-rowcount, are global level. Since they are
important for query efficiency, it is better to set them at cube level.





[jira] [Updated] (KYLIN-4722) Add more statistics to the query results

2020-08-26 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4722:
--
Description: 
Now, the query result contains scanned rows and scanned bytes. There are some
other statistics that can be added, such as the number of scanned files and
Spark scan time. It would be useful to add the number of scanned Parquet files
when querying, especially when a shard-by column is configured, which decreases
the number of scanned Parquet files and improves query efficiency.

Read more about the shard-by column at the link below.

[https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]

  was:
Now, the query result contains scanned rows and scanned bytes. It would be
useful to add the number of scanned Parquet files when querying, especially
when a shard-by column is configured, which decreases the number of scanned
Parquet files and improves query efficiency.

Read more about the shard-by column at the link below.

[https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]


> Add more statistics to the query results
> 
>
> Key: KYLIN-4722
> URL: https://issues.apache.org/jira/browse/KYLIN-4722
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Affects Versions: v4.0.0-alpha
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-alpha
>
>
> Now, the query result contains scanned rows and scanned bytes. There are some
> other statistics that can be added, such as the number of scanned files and
> Spark scan time. It would be useful to add the number of scanned Parquet files
> when querying, especially when a shard-by column is configured, which
> decreases the number of scanned Parquet files and improves query efficiency.
> Read more about the shard-by column at the link below.
> [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]





[jira] [Updated] (KYLIN-4722) Add more statistics to the query results

2020-08-26 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4722:
--
Description: 
Now, the query result contains scanned rows and scanned bytes. There are some
other statistics that can be added, such as the number of scanned files and
Spark scan time.

It would be useful to add the number of scanned Parquet files when querying,
especially when a shard-by column is configured, which decreases the number of
scanned Parquet files and improves query efficiency.

Read more about the shard-by column at the link below.

[https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]

  was:
Now, the query result contains scanned rows and scanned bytes. There are some
other statistics that can be added, such as the number of scanned files and
Spark scan time. It would be useful to add the number of scanned Parquet files
when querying, especially when a shard-by column is configured, which decreases
the number of scanned Parquet files and improves query efficiency.

Read more about the shard-by column at the link below.

[https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]


> Add more statistics to the query results
> 
>
> Key: KYLIN-4722
> URL: https://issues.apache.org/jira/browse/KYLIN-4722
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Affects Versions: v4.0.0-alpha
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-alpha
>
>
> Now, the query result contains scanned rows and scanned bytes. There are some
> other statistics that can be added, such as the number of scanned files and
> Spark scan time.
> It would be useful to add the number of scanned Parquet files when querying,
> especially when a shard-by column is configured, which decreases the number of
> scanned Parquet files and improves query efficiency.
> Read more about the shard-by column at the link below.
> [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]





[jira] [Updated] (KYLIN-4722) Add more statistics to the query results

2020-08-26 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4722:
--
Summary: Add more statistics to the query results  (was: Add the number of
files scanned when querying)

> Add more statistics to the query results
> 
>
> Key: KYLIN-4722
> URL: https://issues.apache.org/jira/browse/KYLIN-4722
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Affects Versions: v4.0.0-alpha
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-alpha
>
>
> Now, the query result contains scanned rows and scanned bytes. It would be
> useful to add the number of scanned Parquet files when querying, especially
> when a shard-by column is configured, which decreases the number of scanned
> Parquet files and improves query efficiency.
> Read more about the shard-by column at the link below.
> [https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]





[jira] [Created] (KYLIN-4722) Add the number of files scanned when querying

2020-08-26 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4722:
-

 Summary: Add the number of files scanned when querying
 Key: KYLIN-4722
 URL: https://issues.apache.org/jira/browse/KYLIN-4722
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: v4.0.0-alpha
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-alpha


Now the query result contains scanned rows and scanned bytes. It would be useful 
to also add the number of Parquet files scanned when querying, especially when a 
shard by column is configured, which decreases the number of scanned Parquet 
files and improves query efficiency.

Read more about the shard by column at the link below:

[https://cwiki.apache.org/confluence/display/KYLIN/Improving+query+effeciency+by+set+shard+by+column]
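The pruning effect described above can be illustrated with a toy sketch (this is not Kylin code; the partitioning function and names are made up for illustration): rows are hash-partitioned on the shard column at build time, so an equality filter on that column is routed to a single shard file instead of scanning every file.

```python
# Toy model of shard-by pruning. Each shard is one "Parquet file".
NUM_SHARDS = 10

def shard_of(value, num_shards=NUM_SHARDS):
    # Stable toy partitioner; Kylin's real hash function differs.
    return value % num_shards

def build_shards(rows):
    # One "file" (list of rows) per shard id.
    shards = {i: [] for i in range(NUM_SHARDS)}
    for row in rows:
        shards[shard_of(row["user_id"])].append(row)
    return shards

def query_by_user(shards, user_id):
    # With shard pruning, only the single matching shard file is read;
    # without it, all NUM_SHARDS files would have to be scanned.
    hits = [r for r in shards[shard_of(user_id)] if r["user_id"] == user_id]
    files_scanned = 1
    return hits, files_scanned

rows = [{"user_id": u, "amount": u * 2} for u in range(100)]
shards = build_shards(rows)
hits, files_scanned = query_by_user(shards, 42)
print(hits, files_scanned)  # one matching row; 1 file scanned instead of 10
```

This is exactly the statistic the issue proposes to surface: reporting `files_scanned` makes the benefit of a shard by column visible in the query result.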





[jira] [Created] (KYLIN-4721) The default source type should be CSV, not Hive, in local debug mode

2020-08-26 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4721:
-

 Summary: The default source type should be CSV, not Hive, in 
local debug mode
 Key: KYLIN-4721
 URL: https://issues.apache.org/jira/browse/KYLIN-4721
 Project: Kylin
  Issue Type: Bug
  Components: Metadata
Affects Versions: v4.0.0-alpha
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-alpha


When debugging Kylin 4.0 in local Tomcat mode, Kylin uses the metadata located 
in $KYLIN_SOURCE/examples/test_case_data/sample_local, where the source type of 
the tables is Hive.

The build task then remains pending because it cannot connect to the remote 
Hadoop cluster.





[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build

2020-08-24 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4715:
--
Description: 
[http://kylin.apache.org/docs/howto/howto_optimize_build.html]

The number of cuboids should be N*(N-1)/2 with (N-2) dimensions. 

!image-2020-08-25-11-13-55-160.png|width=660,height=337!

!image-2020-08-25-11-14-14-556.png!

  was:
[http://kylin.apache.org/docs/howto/howto_optimize_build.html]

The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 

!image-2020-08-25-11-13-55-160.png|width=660,height=337!

!image-2020-08-25-11-14-14-556.png!


> Wrong function with kylin document about how to optimize cube build
> ---
>
> Key: KYLIN-4715
> URL: https://issues.apache.org/jira/browse/KYLIN-4715
> Project: Kylin
>  Issue Type: Bug
>  Components: Documentation
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v3.1.1
>
> Attachments: image-2020-08-25-11-13-55-160.png, 
> image-2020-08-25-11-14-14-556.png
>
>
> [http://kylin.apache.org/docs/howto/howto_optimize_build.html]
> The number of cuboids should be N*(N-1)/2 with (N-2) dimensions. 
> !image-2020-08-25-11-13-55-160.png|width=660,height=337!
> !image-2020-08-25-11-14-14-556.png!





[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build

2020-08-24 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4715:
--
Description: 
[http://kylin.apache.org/docs/howto/howto_optimize_build.html]

The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 

!image-2020-08-25-11-13-55-160.png|width=660,height=337!

!image-2020-08-25-11-14-14-556.png!

  was:
[http://kylin.apache.org/docs/howto/howto_optimize_build.html]

The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 

!image-2020-08-25-11-07-32-579.png|width=591,height=334!

!image-2020-08-25-11-09-33-205.png|width=298,height=132!


> Wrong function with kylin document about how to optimize cube build
> ---
>
> Key: KYLIN-4715
> URL: https://issues.apache.org/jira/browse/KYLIN-4715
> Project: Kylin
>  Issue Type: Bug
>  Components: Documentation
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v3.1.1
>
> Attachments: image-2020-08-25-11-13-55-160.png, 
> image-2020-08-25-11-14-14-556.png
>
>
> [http://kylin.apache.org/docs/howto/howto_optimize_build.html]
> The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 
> !image-2020-08-25-11-13-55-160.png|width=660,height=337!
> !image-2020-08-25-11-14-14-556.png!





[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build

2020-08-24 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4715:
--
Attachment: image-2020-08-25-11-14-14-556.png

> Wrong function with kylin document about how to optimize cube build
> ---
>
> Key: KYLIN-4715
> URL: https://issues.apache.org/jira/browse/KYLIN-4715
> Project: Kylin
>  Issue Type: Bug
>  Components: Documentation
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v3.1.1
>
> Attachments: image-2020-08-25-11-13-55-160.png, 
> image-2020-08-25-11-14-14-556.png
>
>
> [http://kylin.apache.org/docs/howto/howto_optimize_build.html]
> The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 
> !image-2020-08-25-11-07-32-579.png|width=591,height=334!
> !image-2020-08-25-11-09-33-205.png|width=298,height=132!





[jira] [Updated] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build

2020-08-24 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4715:
--
Attachment: image-2020-08-25-11-13-55-160.png

> Wrong function with kylin document about how to optimize cube build
> ---
>
> Key: KYLIN-4715
> URL: https://issues.apache.org/jira/browse/KYLIN-4715
> Project: Kylin
>  Issue Type: Bug
>  Components: Documentation
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v3.1.1
>
> Attachments: image-2020-08-25-11-13-55-160.png
>
>
> [http://kylin.apache.org/docs/howto/howto_optimize_build.html]
> The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 
> !image-2020-08-25-11-07-32-579.png|width=591,height=334!
> !image-2020-08-25-11-09-33-205.png|width=298,height=132!





[jira] [Created] (KYLIN-4715) Wrong function with kylin document about how to optimize cube build

2020-08-24 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4715:
-

 Summary: Wrong function with kylin document about how to optimize 
cube build
 Key: KYLIN-4715
 URL: https://issues.apache.org/jira/browse/KYLIN-4715
 Project: Kylin
  Issue Type: Bug
  Components: Documentation
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v3.1.1
 Attachments: image-2020-08-25-11-13-55-160.png

[http://kylin.apache.org/docs/howto/howto_optimize_build.html]

The number of cuboids should be N*(N-2)/2 when with the (N-2)  dimensions. 

!image-2020-08-25-11-07-32-579.png|width=591,height=334!

!image-2020-08-25-11-09-33-205.png|width=298,height=132!





[jira] [Assigned] (KYLIN-4700) Wrong engine type for realtime streaming

2020-08-14 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng reassigned KYLIN-4700:
-

   Attachment: image-2020-08-14-20-34-56-168.png
  Component/s: Website
Fix Version/s: v3.1.1
Affects Version/s: v3.1.0
 Assignee: wangrupeng
  Description: 
As of now, real-time streaming only supports MapReduce for building, but there is 
an error: the Flink engine can be selected when creating a real-time streaming 
cube.

!image-2020-08-14-20-34-56-168.png|width=499,height=263!

> Wrong engine type for realtime streaming 
> -
>
> Key: KYLIN-4700
> URL: https://issues.apache.org/jira/browse/KYLIN-4700
> Project: Kylin
>  Issue Type: Bug
>  Components: Website
>Affects Versions: v3.1.0
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v3.1.1
>
> Attachments: image-2020-08-14-20-34-56-168.png
>
>
> As of now, real-time streaming only supports MapReduce for building, but there 
> is an error: the Flink engine can be selected when creating a real-time 
> streaming cube.
> !image-2020-08-14-20-34-56-168.png|width=499,height=263!





[jira] [Created] (KYLIN-4700) Wrong engine type for realtime streaming

2020-08-14 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4700:
-

 Summary: Wrong engine type for realtime streaming 
 Key: KYLIN-4700
 URL: https://issues.apache.org/jira/browse/KYLIN-4700
 Project: Kylin
  Issue Type: Bug
Reporter: wangrupeng








[jira] [Commented] (KYLIN-4626) add set kylin home sh

2020-08-12 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176046#comment-17176046
 ] 

wangrupeng commented on KYLIN-4626:
---

That's great!

> add set kylin home sh
> -
>
> Key: KYLIN-4626
> URL: https://issues.apache.org/jira/browse/KYLIN-4626
> Project: Kylin
>  Issue Type: Improvement
>Reporter: chuxiao
>Assignee: chuxiao
>Priority: Major
>
> KYLIN_HOME is important: almost every script depends on it. But setting it as a 
> global environment variable is not ideal, for example when multiple instances 
> are installed. Adding set-kylin-home.sh lets each Kylin instance set its own 
> environment variables.
> This is mainly for deploying multiple Kylin services on a single server.
> Our operations standards also require each service's environment variables to 
> live in the service's own file, to avoid conflicts.
> Other environment variables can go into setenv.sh, but KYLIN_HOME is needed 
> before setenv.sh is loaded, so by default it is taken from the system 
> environment variable and can be overridden as needed.
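The KYLIN_HOME resolution order described in the comment above can be sketched as follows (a hypothetical illustration; the function name and parameters are made up, not part of the proposed set-kylin-home.sh):

```python
import os

def resolve_kylin_home(instance_value=None, env=None):
    # instance_value: a per-instance override, as a set-kylin-home.sh would set.
    # env: environment mapping; defaults to the real process environment.
    env = os.environ if env is None else env
    if instance_value:                 # per-instance file takes precedence
        return instance_value
    return env.get("KYLIN_HOME")       # fall back to the system environment
```

The key point of the comment is the ordering: the per-instance value wins, and the system environment variable is only the default.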





[jira] [Comment Edited] (KYLIN-4690) BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with Spark

2020-08-10 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174300#comment-17174300
 ] 

wangrupeng edited comment on KYLIN-4690 at 8/10/20, 12:57 PM:
--

Could you provide more information about your environment, like the Hadoop 
version, Spark version, etc.? I tested it on my CDH 5.7 in cluster mode and it 
works fine.


was (Author: wangrupeng):
Could you provide more information about your environment? I test it  in my 
CDH5.7 with cluster mode and it works fine.

> BUILD CUBE  - job fail on spark clusters mode - #7 Step Name: Build Cube with 
> Spark
> ---
>
> Key: KYLIN-4690
> URL: https://issues.apache.org/jira/browse/KYLIN-4690
> Project: Kylin
>  Issue Type: Bug
>  Components: Spark Engine
>Affects Versions: v3.1.0
>Reporter: James
>Priority: Critical
>
> BUILD CUBE  - job fail on spark clusters mode - #7 Step Name: Build Cube with 
> Spark
>  Executor:
> export 
> HADOOP_CONF_DIR=/app/kylin/apache-kylin-3.1.0-bin-hbase1x/kylin_hadoop_conf_dir
>  && /usr/hdp/current/spark2-client/bin/spark-submit --class 
> org.apache.kylin.common.util.SparkEntry --name "Build Cube with 
> Spark:CBE_DEV[2020010200_2020010300]" --conf spark.executor.cores=5  
> --conf spark.hadoop.yarn.timeline-service.enabled=false  --conf 
> spark.hadoop.mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec
>   --conf spark.executor.memoryOverhead=1024  --conf 
> spark.executor.extraJavaOptions=-Dhdp.version=2.6.4.149-3  --conf 
> spark.master=yarn  --conf 
> spark.hadoop.mapreduce.output.fileoutputformat.compress=true  --conf 
> spark.executor.instances=5  --conf 
> spark.kryo.register=org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
>   --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.4.149-3  --conf 
> spark.executor.memory=4G  --conf spark.yarn.queue=sgz1-criskapp-haas_dev  
> --conf spark.submit.deployMode=cluster  --conf 
> spark.dynamicAllocation.minExecutors=0  --conf spark.network.timeout=600  
> --conf spark.hadoop.dfs.replication=2  --conf 
> spark.yarn.executor.memoryOverhead=1024  --conf 
> spark.dynamicAllocation.executorIdleTimeout=300  --conf 
> spark.history.fs.logDirectory=hdfs:///kylin/spark-history  --conf 
> spark.driver.memory=5G  --conf 
> spark.driver.extraJavaOptions=-Dhdp.version=2.6.4.149-3  --conf 
> spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec  --conf 
> spark.eventLog.enabled=true  --conf spark.shuffle.service.enabled=true  
> --conf spark.eventLog.dir=hdfs:///kylin/spark-history  --conf 
> spark.dynamicAllocation.maxExecutors=15  --conf 
> spark.dynamicAllocation.enabled=true --jars 
> /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar 
> /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar -className 
> org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable 
> kylin310.kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d 
> -output 
> hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/cuboid/
>  -input 
> hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d
>  -segmentId 02f32a29-1d51-0cb0-37ba-825333d38c8d -metaUrl 
> crr_kylin_dev240@hdfs,path=hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/metadata
>  -cubename CBE_DEV
>  
> Step Name:
> #7 Step Name: Build Cube with Spark:CBE_DEV[2020010200_2020010300]
>  
> Error:
> 20/08/08 09:23:54 ERROR ApplicationMaster: User class threw exception: 
> java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Error while 
> instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
> java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Error while 
> instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
>     at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>     at 
> org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646)
> Caused by: java.lang.IllegalArgumentException: Error while instantiating 
> 

[jira] [Commented] (KYLIN-4690) BUILD CUBE - job fail on spark clusters mode - #7 Step Name: Build Cube with Spark

2020-08-10 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174300#comment-17174300
 ] 

wangrupeng commented on KYLIN-4690:
---

Could you provide more information about your environment? I tested it on my 
CDH 5.7 in cluster mode and it works fine.

> BUILD CUBE  - job fail on spark clusters mode - #7 Step Name: Build Cube with 
> Spark
> ---
>
> Key: KYLIN-4690
> URL: https://issues.apache.org/jira/browse/KYLIN-4690
> Project: Kylin
>  Issue Type: Bug
>  Components: Spark Engine
>Affects Versions: v3.1.0
>Reporter: James
>Priority: Critical
>
> BUILD CUBE  - job fail on spark clusters mode - #7 Step Name: Build Cube with 
> Spark
>  Executor:
> export 
> HADOOP_CONF_DIR=/app/kylin/apache-kylin-3.1.0-bin-hbase1x/kylin_hadoop_conf_dir
>  && /usr/hdp/current/spark2-client/bin/spark-submit --class 
> org.apache.kylin.common.util.SparkEntry --name "Build Cube with 
> Spark:CBE_DEV[2020010200_2020010300]" --conf spark.executor.cores=5  
> --conf spark.hadoop.yarn.timeline-service.enabled=false  --conf 
> spark.hadoop.mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec
>   --conf spark.executor.memoryOverhead=1024  --conf 
> spark.executor.extraJavaOptions=-Dhdp.version=2.6.4.149-3  --conf 
> spark.master=yarn  --conf 
> spark.hadoop.mapreduce.output.fileoutputformat.compress=true  --conf 
> spark.executor.instances=5  --conf 
> spark.kryo.register=org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
>   --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.4.149-3  --conf 
> spark.executor.memory=4G  --conf spark.yarn.queue=sgz1-criskapp-haas_dev  
> --conf spark.submit.deployMode=cluster  --conf 
> spark.dynamicAllocation.minExecutors=0  --conf spark.network.timeout=600  
> --conf spark.hadoop.dfs.replication=2  --conf 
> spark.yarn.executor.memoryOverhead=1024  --conf 
> spark.dynamicAllocation.executorIdleTimeout=300  --conf 
> spark.history.fs.logDirectory=hdfs:///kylin/spark-history  --conf 
> spark.driver.memory=5G  --conf 
> spark.driver.extraJavaOptions=-Dhdp.version=2.6.4.149-3  --conf 
> spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec  --conf 
> spark.eventLog.enabled=true  --conf spark.shuffle.service.enabled=true  
> --conf spark.eventLog.dir=hdfs:///kylin/spark-history  --conf 
> spark.dynamicAllocation.maxExecutors=15  --conf 
> spark.dynamicAllocation.enabled=true --jars 
> /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar 
> /app/kylin/apache-kylin-3.1.0-bin-hbase1x/lib/kylin-job-3.1.0.jar -className 
> org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable 
> kylin310.kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d 
> -output 
> hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/cuboid/
>  -input 
> hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/kylin_intermediate_cbe_dev_02f32a29_1d51_0cb0_37ba_825333d38c8d
>  -segmentId 02f32a29-1d51-0cb0-37ba-825333d38c8d -metaUrl 
> crr_kylin_dev240@hdfs,path=hdfs:///dev/kylin310/kylin-0f5b105d-4794-e7ce-b329-fd7a83cb1aa2/CBE_DEV/metadata
>  -cubename CBE_DEV
>  
> Step Name:
> #7 Step Name: Build Cube with Spark:CBE_DEV[2020010200_2020010300]
>  
> Error:
> 20/08/08 09:23:54 ERROR ApplicationMaster: User class threw exception: 
> java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Error while 
> instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
> java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Error while 
> instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
>     at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>     at 
> org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646)
> Caused by: java.lang.IllegalArgumentException: Error while instantiating 
> 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
>     at 
> org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1075)
>     at 
> 

[jira] [Commented] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary

2020-08-10 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174298#comment-17174298
 ] 

wangrupeng commented on KYLIN-4688:
---

Regarding the third problem you mentioned: it is not cleaning up a tmp file; it 
deletes the partition file if it already exists before creating the partition file.

!image-2020-08-10-20-49-46-354.png|width=610,height=256!
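The pattern described in the comment above can be sketched as follows (an assumed illustration based on the comment, not Kylin's actual code; the function name is made up):

```python
import os

def create_partition_file(path, data):
    # Delete a stale partition file from a previous run, if any,
    # before creating the new one -- rather than cleaning up afterwards.
    if os.path.exists(path):
        os.remove(path)
    with open(path, "w") as f:
        f.write(data)
```

Re-running the create step therefore overwrites the stale file instead of leaving duplicates behind.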

> Too many tmp files in HDFS tmp dictionary
> -
>
> Key: KYLIN-4688
> URL: https://issues.apache.org/jira/browse/KYLIN-4688
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: all
>Reporter: QiangZhang
>Priority: Major
> Attachments: image-2020-08-06-18-07-00-377.png, 
> image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, 
> image-2020-08-06-19-34-22-515.png, image-2020-08-10-19-38-28-913.png, 
> image-2020-08-10-20-06-01-526.png, image-2020-08-10-20-14-53-854.png, 
> image-2020-08-10-20-18-10-513.png, image-2020-08-10-20-20-07-899.png, 
> image-2020-08-10-20-21-44-137.png, image-2020-08-10-20-49-46-354.png
>
>
> Too many tmp files in the HDFS tmp directory, and Kylin doesn't clean them up 
> automatically.
> !image-2020-08-06-18-07-00-377.png!
> 2. When I debug, I found: !image-2020-08-06-18-29-28-503.png!





[jira] [Updated] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary

2020-08-10 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4688:
--
Attachment: image-2020-08-10-20-49-46-354.png

> Too many tmp files in HDFS tmp dictionary
> -
>
> Key: KYLIN-4688
> URL: https://issues.apache.org/jira/browse/KYLIN-4688
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: all
>Reporter: QiangZhang
>Priority: Major
> Attachments: image-2020-08-06-18-07-00-377.png, 
> image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, 
> image-2020-08-06-19-34-22-515.png, image-2020-08-10-19-38-28-913.png, 
> image-2020-08-10-20-06-01-526.png, image-2020-08-10-20-14-53-854.png, 
> image-2020-08-10-20-18-10-513.png, image-2020-08-10-20-20-07-899.png, 
> image-2020-08-10-20-21-44-137.png, image-2020-08-10-20-49-46-354.png
>
>
> Too many tmp files in the HDFS tmp directory, and Kylin doesn't clean them up 
> automatically.
> !image-2020-08-06-18-07-00-377.png!
> 2. When I debug, I found: !image-2020-08-06-18-29-28-503.png!





[jira] [Commented] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary

2020-08-10 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174259#comment-17174259
 ] 

wangrupeng commented on KYLIN-4688:
---

Where is the default value of "hbase.fs.tmp.dir" set? From the Kylin source code, 
it will be set to "/tmp" when "conf.get('hbase.fs.tmp.dir')" is blank.

> Too many tmp files in HDFS tmp dictionary
> -
>
> Key: KYLIN-4688
> URL: https://issues.apache.org/jira/browse/KYLIN-4688
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: all
>Reporter: QiangZhang
>Priority: Major
> Attachments: image-2020-08-06-18-07-00-377.png, 
> image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, 
> image-2020-08-06-19-34-22-515.png
>
>
> Too many tmp files in the HDFS tmp directory, and Kylin doesn't clean them up 
> automatically.
> !image-2020-08-06-18-07-00-377.png!
> 2. When I debug, I found: !image-2020-08-06-18-29-28-503.png!





[jira] [Commented] (KYLIN-4688) Too many tmp files in HDFS tmp dictionary

2020-08-10 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174256#comment-17174256
 ] 

wangrupeng commented on KYLIN-4688:
---

As far as I know, HDFS won't delete files under /tmp/ by itself. I agree with you 
that Kylin should clean up tmp files at the end of the task.
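A minimal sketch of the cleanup suggested above (assumed behavior, not Kylin's actual code; the function name and file prefix are illustrative): at the end of a task, remove the temporary files the job wrote under its tmp directory.

```python
import os

def cleanup_job_tmp(tmp_dir, prefix="kylin_job_"):
    # Remove only this job's leftovers, identified by a name prefix,
    # and report what was deleted.
    removed = []
    for name in sorted(os.listdir(tmp_dir)):
        if name.startswith(prefix):
            os.remove(os.path.join(tmp_dir, name))
            removed.append(name)
    return removed
```

Running this as the final step of a build task would keep /tmp from accumulating stale files across runs.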

> Too many tmp files in HDFS tmp dictionary
> -
>
> Key: KYLIN-4688
> URL: https://issues.apache.org/jira/browse/KYLIN-4688
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: all
>Reporter: QiangZhang
>Priority: Major
> Attachments: image-2020-08-06-18-07-00-377.png, 
> image-2020-08-06-18-29-28-503.png, image-2020-08-06-18-58-00-355.png, 
> image-2020-08-06-19-34-22-515.png
>
>
> Too many tmp files in the HDFS tmp directory, and Kylin doesn't clean them up 
> automatically.
> !image-2020-08-06-18-07-00-377.png!
> 2. When I debug, I found: !image-2020-08-06-18-29-28-503.png!





[jira] [Created] (KYLIN-4680) Avoid annoying log messages of unit test and integration test

2020-08-04 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4680:
-

 Summary: Avoid annoying log messages of unit test and integration 
test
 Key: KYLIN-4680
 URL: https://issues.apache.org/jira/browse/KYLIN-4680
 Project: Kylin
  Issue Type: Improvement
  Components: Integration, Tools, Build and Test
Affects Versions: v4.0.0-beta
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta


When running a unit test case, it outputs too many annoying log messages.





[jira] [Resolved] (KYLIN-4644) New tool to clean up  intermediate files for Kylin 4.0

2020-07-26 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng resolved KYLIN-4644.
---
Resolution: Fixed

> New tool to clean up  intermediate files for Kylin 4.0
> --
>
> Key: KYLIN-4644
> URL: https://issues.apache.org/jira/browse/KYLIN-4644
> Project: Kylin
>  Issue Type: Improvement
>  Components: Client - CLI
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> With the change of storage, Kylin 4.x needs a new tool to clean up intermediate 
> data and temporary files generated during cube building.





[jira] [Commented] (KYLIN-4644) New tool to clean up  intermediate files for Kylin 4.0

2020-07-17 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159843#comment-17159843
 ] 

wangrupeng commented on KYLIN-4644:
---

[https://github.com/apache/kylin/pull/1323]

> New tool to clean up  intermediate files for Kylin 4.0
> --
>
> Key: KYLIN-4644
> URL: https://issues.apache.org/jira/browse/KYLIN-4644
> Project: Kylin
>  Issue Type: Improvement
>  Components: Client - CLI
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> With the change of storage, Kylin 4.x needs a new tool to clean up intermediate 
> data and temporary files generated during cube building.





[jira] [Updated] (KYLIN-4644) New tool to clean up  intermediate files for Kylin 4.0

2020-07-17 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4644:
--
Issue Type: Improvement  (was: Bug)

> New tool to clean up  intermediate files for Kylin 4.0
> --
>
> Key: KYLIN-4644
> URL: https://issues.apache.org/jira/browse/KYLIN-4644
> Project: Kylin
>  Issue Type: Improvement
>  Components: Client - CLI
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> With the change of storage, Kylin 4.x needs a new tool to clean up intermediate 
> data and temporary files generated during cube building.





[jira] [Commented] (KYLIN-4644) New tool to clean up  intermediate files for Kylin 4.0

2020-07-16 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159600#comment-17159600
 ] 

wangrupeng commented on KYLIN-4644:
---

What will be deleted:
 * {{temp job files}}
            hdfs:///kylin/${metadata_url}/${project}/job_tmp
 * {{unused segment cuboid files}}
            hdfs:///kylin/${metadata_url}/${project}/${cube_name}/${non_used_segment}

 Usage:
 # Check which resources can be cleaned up; this will not remove anything:
{code:java}
export KYLIN_HOME=/path/to/kylin_home
${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false{code}
 # Pick one or two resources and verify that they are no longer referenced; then add the "--delete true" option to start the cleanup:
{code:java}
${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true
{code}
 
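The dry-run-then-delete workflow above can be sketched as follows (an illustrative model of the two-phase pattern, not the tool's real classes; the path predicate and names are examples):

```python
def storage_cleanup(candidate_paths, delete=False):
    # Phase 1 (delete=False): report candidates, touch nothing.
    # Phase 2 (delete=True): actually remove them.
    to_remove = [p for p in candidate_paths if "/job_tmp" in p]
    if not delete:
        return to_remove, []           # dry run: nothing deleted
    return to_remove, list(to_remove)  # second element: what was deleted

paths = [
    "hdfs:///kylin/meta/proj/job_tmp/a",
    "hdfs:///kylin/meta/proj/cube1/segment_in_use",
]
```

Separating the listing phase from the deletion phase is what makes it safe to inspect the candidates before committing to `--delete true`.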

> New tool to clean up  intermediate files for Kylin 4.0
> --
>
> Key: KYLIN-4644
> URL: https://issues.apache.org/jira/browse/KYLIN-4644
> Project: Kylin
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> With the change of storage, Kylin 4.x needs a new tool to clean up intermediate 
> data and temporary files generated during cube building.





[jira] [Created] (KYLIN-4644) New tool to clean up  intermediate files for Kylin 4.0

2020-07-16 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4644:
-

 Summary: New tool to clean up  intermediate files for Kylin 4.0
 Key: KYLIN-4644
 URL: https://issues.apache.org/jira/browse/KYLIN-4644
 Project: Kylin
  Issue Type: Bug
  Components: Client - CLI
Affects Versions: v4.0.0-beta
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta


With the change of storage, Kylin 4.x needs a new tool to clean up intermediate 
data and temporary files generated during cube building.





[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-14 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/15/20, 3:44 AM:
-

Now we can debug Tomcat without a Hadoop environment by following these 
steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local
kylin.storage.url=/Users/rupeng.wang/Kyligence/Developments/kylin/kylin-parquet/examples/test_case_data/sample_local
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.engine.spark-conf.spark.sql.shuffle.partitions=1
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA and add the VM option 
"-Dspark.local=true" (this is used by the query engine)

             !image-2020-07-08-17-41-35-954.png|width=561,height=354!
 * start the debug Tomcat and use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now we can debug tomcat without hadoop environment by following the follow 
steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local
 
kylin.storage.url=/Users/rupeng.wang/Kyligence/Developments/kylin/kylin-parquet/examples/test_case_data/sample_local
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local
 kylin.engine.spark-conf.spark.master=local 
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir{code}
{code:java}
kylin.engine.spark-conf.spark.sql.shuffle.partitions=1 kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine

             !image-2020-07-08-17-41-35-954.png|width=561,height=354!
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debugging source code with local 
> csv files without depending on a remote HDP sandbox, but it's a little bit 
> complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
> {code:java}
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=file:///path/to/local/dir
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT{code}
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load the csv data source by pressing the button "Data Source->Load CSV File 
> as Table" on the "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most of the time when debugging we just want to build and query a cube quickly 
> and focus on the bug we want to resolve. But the current way of loading csv 
> tables and creating a model and cube is complex, and it's hard to use the Kylin 
> sample cube. So, I want to add a csv source that uses the model of the Kylin 
> sample data directly when the debug tomcat is started.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-14 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/15/20, 3:38 AM:
-

Now we can debug tomcat without a hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local
kylin.storage.url=/Users/rupeng.wang/Kyligence/Developments/kylin/kylin-parquet/examples/test_case_data/sample_local
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/sample_local
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine

            !image-2020-07-08-17-41-35-954.png|width=561,height=354!
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now we can debug tomcat without a hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!






[jira] [Issue Comment Deleted] (KYLIN-4632) No such element exception:spark.driver.cores

2020-07-10 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4632:
--
Comment: was deleted

(was: [https://github.com/apache/kylin/pull/1310/])

> No such element exception:spark.driver.cores
> 
>
> Key: KYLIN-4632
> URL: https://issues.apache.org/jira/browse/KYLIN-4632
> Project: Kylin
>  Issue Type: Bug
>  Components: Spark Engine
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> When submitting a build job, it throws an exception, but the build job can 
> still run successfully.
> 20/07/10 14:06:29 WARN SparkApplication: Error occurred when check resource. 
> Ignore it and try to submit this job.
> java.util.NoSuchElementException: spark.driver.cores
> at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245)
> at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.SparkConf.get(SparkConf.scala:245)
> at org.apache.spark.utils.ResourceUtils$.checkResource(ResourceUtils.scala:71)
> at org.apache.spark.utils.ResourceUtils.checkResource(ResourceUtils.scala)
> at 
> org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:156)
> at 
> org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:76)
> at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)





[jira] [Commented] (KYLIN-4632) No such element exception:spark.driver.cores

2020-07-10 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155190#comment-17155190
 ] 

wangrupeng commented on KYLIN-4632:
---

[https://github.com/apache/kylin/pull/1310/]






[jira] [Updated] (KYLIN-4632) No such element exception:spark.driver.cores

2020-07-10 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4632:
--
Sprint: Sprint 53






[jira] [Created] (KYLIN-4632) No such element exception:spark.driver.cores

2020-07-10 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4632:
-

 Summary: No such element exception:spark.driver.cores
 Key: KYLIN-4632
 URL: https://issues.apache.org/jira/browse/KYLIN-4632
 Project: Kylin
  Issue Type: Bug
  Components: Spark Engine
Affects Versions: v4.0.0-beta
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta


When submitting a build job, it throws an exception, but the build job can 
still run successfully.
20/07/10 14:06:29 WARN SparkApplication: Error occurred when check resource. 
Ignore it and try to submit this job.
java.util.NoSuchElementException: spark.driver.cores
at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245)
at org.apache.spark.SparkConf$$anonfun$get$1.apply(SparkConf.scala:245)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.SparkConf.get(SparkConf.scala:245)
at org.apache.spark.utils.ResourceUtils$.checkResource(ResourceUtils.scala:71)
at org.apache.spark.utils.ResourceUtils.checkResource(ResourceUtils.scala)
at 
org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:156)
at 
org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:76)
at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
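The NoSuchElementException above is the typical result of calling SparkConf.get(key) for a key that was never set. Reading the key with an explicit default avoids the throw. A minimal sketch of that defensive-lookup pattern follows, using a plain Map as a stand-in for SparkConf (an illustration of the pattern, not the actual fix applied in Kylin):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class DriverCoresLookup {
    // Strict lookup: throws NoSuchElementException when the key is unset,
    // mirroring SparkConf.get(String key) called without a default.
    static String getStrict(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null) {
            throw new NoSuchElementException(key);
        }
        return value;
    }

    // Defensive lookup: falls back to a default instead of throwing,
    // like SparkConf.get(String key, String defaultValue).
    static String getOrDefault(Map<String, String> conf, String key, String def) {
        return conf.getOrDefault(key, def);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.executor.cores", "2");
        // "spark.driver.cores" was never configured, so default it to "1"
        // instead of letting the resource check throw.
        System.out.println(getOrDefault(conf, "spark.driver.cores", "1"));
    }
}
```

With the strict variant the resource check would fail exactly as in the log above; with the defaulted variant the check can proceed and the job submits without the warning.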





[jira] [Created] (KYLIN-4631) Set the default build engine type to spark for Kylin on Parquet

2020-07-10 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4631:
-

 Summary: Set the default build engine type to spark for Kylin on 
Parquet
 Key: KYLIN-4631
 URL: https://issues.apache.org/jira/browse/KYLIN-4631
 Project: Kylin
  Issue Type: Improvement
  Components: Metadata, Spark Engine
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta


Now, the default engine type is still MapReduce when generating the sample model 
and cube. We have to edit the cube and set the engine type to Spark manually.





[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-09 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Sprint: Sprint 53






[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging source code with local 
csv files without depending on a remote HDP sandbox, but it's a little bit 
complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load the csv data source by pressing the button "Data Source->Load CSV File as 
Table" on the "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging we just want to build and query a cube quickly and 
focus on the bug we want to resolve. But the current way of loading csv tables and 
creating a model and cube is complex, and it's hard to use the Kylin sample cube. 
So, I want to add a csv source that uses the model of the Kylin sample data 
directly when the debug tomcat is started.

  was:
Currently, Kylin on Parquet already supports debugging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load the csv data source by pressing the button "Data Source->Load CSV File as 
Table" on the "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging we just want to build and query a cube quickly and 
focus on the bug we want to resolve. But the current way of loading csv tables and 
creating a model and cube is complex, and it's hard to use the Kylin sample cube. 
So, I want to add a csv source that uses the model of the Kylin sample data 
directly when the debug tomcat is started.







[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/9/20, 2:11 AM:


Now we can debug tomcat without a hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now we can debug tomcat without a hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
# Env DEV|QA|PROD|LOCAL|UT
# LOCAL means reading the local data source when debugging with tomcat without connecting to the sandbox
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!






[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/9/20, 2:10 AM:


Now we can debug tomcat without a hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
# Env DEV|QA|PROD|LOCAL|UT
# LOCAL means reading the local data source when debugging with tomcat without connecting to the sandbox
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now we can debug tomcat without a hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!






[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load the csv data source by pressing the button "Data Source->Load CSV File as 
Table" on the "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging we just want to build and query a cube quickly and 
focus on the bug we want to resolve. But the current way of loading csv tables and 
creating a model and cube is complex, and it's hard to use the Kylin sample cube. 
So, I want to add a csv source that uses the model of the Kylin sample data 
directly when the debug tomcat is started.

  was:
Currently, Kylin on Parquet already supports debugging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load the csv data source by pressing the button "Data Source->Load CSV File as 
Table" on the "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging we just want to build and query a cube quickly and 
focus on the bug we want to resolve. But the current way of loading csv tables and 
creating a model and cube is complex, and it's hard to use the Kylin sample cube. 
So, I want to add a csv source that uses the model of the Kylin sample data 
directly when the debug tomcat is started.


> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
> {code:java}
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=file:///path/to/local/dir
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT{code}
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube quickly and focus the 
> bug we want to resolve. But current way is complex to load csv tables, create 
> model and cube and it's hard to use kylin sample cube. So, I want to add a 
> csv source which using the model of kylin sample data directly when debug 
> tomcat started.
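
The local-debug setup quoted above boils down to a handful of property overrides plus one JVM flag. A minimal shell sketch of that preparation step (every path below is a placeholder to adapt to your own checkout; run it from the Kylin source directory):

```shell
# Sketch: append the local-mode overrides described above to the sandbox
# kylin.properties. All /path/to/... values are placeholders.
SANDBOX=examples/test_case_data/sandbox
mkdir -p "$SANDBOX"
cat >> "$SANDBOX/kylin.properties" <<'EOF'
kylin.metadata.url=/path/to/local/meta
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file:///path/to/local/dir
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=UT
EOF
# Then start org.apache.kylin.rest.DebugTomcat from the IDE
# with the extra VM option: -Dspark.local=true
```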



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging the source code with local 
CSV files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging, we just want to build and query a cube quickly and
focus on the bug we want to resolve. But the current way is complex: we have to load
CSV tables and create the model and cube by hand, and it's hard to use the Kylin
sample cube. So I want to add a CSV source that uses the model of the Kylin sample
data directly when the debug Tomcat starts.
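
One detail worth noting in the properties above: `kylin.env.hdfs-working-dir` takes a URI, so a local directory is written with the `file://` scheme in front of its absolute path, which is why three slashes appear. A tiny illustrative sketch (the helper name is made up for this example):

```shell
# Build the file:// URI form of an absolute local path, as used by
# kylin.env.hdfs-working-dir above. to_file_uri is a hypothetical helper.
to_file_uri() { printf 'file://%s\n' "$1"; }
to_file_uri /path/to/local/dir   # -> file:///path/to/local/dir
```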

  was:
Currently, Kylin on Parquet already supports debuging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most time we debug just want to build and query cube quickly and focus the bug 
we want to resolve. But current way is complex to load csv tables, create model 
and cube and it's hard to use kylin sample cube. So, I want to add a csv source 
which using the model of kylin sample data directly when debug tomcat started.


> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
> {code:java}
> kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=file:///path/to/local/dir
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT{code}
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube quickly and focus the 
> bug we want to resolve. But current way is complex to load csv tables, create 
> model and cube and it's hard to use kylin sample cube. So, I want to add a 
> csv source which using the model of kylin sample data directly when debug 
> tomcat started.





[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging the source code with local 
CSV files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging, we just want to build and query a cube quickly and
focus on the bug we want to resolve. But the current way is complex: we have to load
CSV tables and create the model and cube by hand, and it's hard to use the Kylin
sample cube. So I want to add a CSV source that uses the model of the Kylin sample
data directly when the debug Tomcat starts.

  was:
Currently, Kylin on Parquet already supports debuging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local

{code:java}
kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=file:///path/to/local/dir
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most time we debug just want to build and query cube quickly and focus the bug 
we want to resolve. But current way is complex to load csv tables, create model 
and cube and it's hard to use kylin sample cube. So, I want to add a csv source 
which using the model of kylin sample data directly when debug tomcat started.


> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> dCurrently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
> {code:java}
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=file:///path/to/local/dir
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT{code}
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube quickly and focus the 
> bug we want to resolve. But current way is complex to load csv tables, create 
> model and cube and it's hard to use kylin sample cube. So, I want to add a 
> csv source which using the model of kylin sample data directly when debug 
> tomcat started.





[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/9/20, 1:59 AM:


Now we can debug Tomcat without a Hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for the query engine.
 * start the debug Tomcat, and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!
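
The steps in this comment can be condensed into a sketch that generates the LOCAL-mode properties pointing at the bundled parquet_test metadata. Assumptions: `KYLIN_SOURCE_DIR` must be set to your own checkout, the output filename `kylin.local.properties` is arbitrary, and the event-log path stays a placeholder:

```shell
# Sketch of the LOCAL-mode configuration described in the comment above.
KYLIN_SOURCE_DIR=${KYLIN_SOURCE_DIR:-$PWD}   # assumed: path to your Kylin checkout
META="$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test"
cat > kylin.local.properties <<EOF
kylin.metadata.url=$META
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$META
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL
EOF
# Sanity check: the environment is set to LOCAL.
grep '^kylin.env=' kylin.local.properties
```

As before, DebugTomcat is then started from the IDE with `-Dspark.local=true` so the query engine runs Spark in local mode.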


was (Author: wangrupeng):
Now we can debug tomcat without hadoop environment by following the follow 
steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> dCurrently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
> {code:java}
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=file:///path/to/local/dir
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT{code}
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube quickly and focus the 
> bug we want to resolve. But current way is complex to load csv tables, create 
> model and cube and it's hard to use kylin sample cube. So, I want to add a 
> csv source which using the model of kylin sample data directly when debug 
> tomcat started.





[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/9/20, 1:58 AM:


Now we can debug Tomcat without a Hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 * 
{code:java}
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL{code}

 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now we can debug tomcat without hadoop environment by following the follow 
steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=[file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test|file://%24kylin_source_dir/examples/test_case_data/parquet_test]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>  ```log
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT
>  ```
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube quickly and focus the 
> bug we want to resolve. But current way is complex to load csv tables, create 
> model and cube and it's hard to use kylin sample cube. So, I want to add a 
> csv source which using the model of kylin sample data directly when debug 
> tomcat started.





[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging the source code with local 
CSV files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging, we just want to build and query a cube quickly and
focus on the bug we want to resolve. But the current way is complex: we have to load
CSV tables and create the model and cube by hand, and it's hard to use the Kylin
sample cube. So I want to add a CSV source that uses the model of the Kylin sample
data directly when the debug Tomcat starts.

  was:
Currently, Kylin on Parquet already supports debuging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most time we debug just want to build and query cube easy. But current way is 
complex to load csv tables and create model and cube. So, I want to add a csv 
source which using the model of kylin sample data directly when debug tomcat 
started.


> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>  ```log
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT
>  ```
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube quickly and focus the 
> bug we want to resolve. But current way is complex to load csv tables, create 
> model and cube and it's hard to use kylin sample cube. So, I want to add a 
> csv source which using the model of kylin sample data directly when debug 
> tomcat started.





[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/8/20, 9:59 AM:


Now we can debug Tomcat without a Hadoop environment by following these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=[file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test|file://%24kylin_source_dir/examples/test_case_data/parquet_test]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now if you want to debug tomcat without hadoop environment, you can follow the 
follow steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=[file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test|file://%24kylin_source_dir/examples/test_case_data/parquet_test]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>  ```log
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT
>  ```
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube easy. But current way is 
> complex to load csv tables and create model and cube. So, I want to add a csv 
> source which using the model of kylin sample data directly when debug tomcat 
> started.





[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging the source code with local 
CSV files, but it's a little bit complex. The steps are as follows:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$LOCAL_META_DIR
 kylin.env.zookeeper-is-local=true
 kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=UT
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 !image-2020-07-08-17-41-35-954.png|width=574,height=363!
 * Load csv data source by pressing button "Data Source->Load CSV File as 
Table" on "Model" page, and set the schema for your table. Then press "submit" 
to save.
 !image-2020-07-08-17-42-09-603.png|width=577,height=259!

Most of the time when debugging, we just want to build and query a cube easily. But
the current way of loading CSV tables and creating the model and cube is complex.
So I want to add a CSV source that uses the model of the Kylin sample data directly
when the debug Tomcat starts.

  was:
Currently, Kylin on Parquet already supports debuging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
* edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
   ```log
   kylin.metadata.url=$LOCAL_META_DIR
   kylin.env.zookeeper-is-local=true
   kylin.env.hdfs-working-dir=file:///path/to/local/dir
   kylin.engine.spark-conf.spark.master=local
   kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
   kylin.env=UT
   ```
* debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true" 
!image-2020-07-08-17-41-35-954.png! 
* Load csv data source by pressing button "Data Source->Load CSV File as Table" 
on "Model" page, and set the schema for your table. Then press "submit" to save.
 !image-2020-07-08-17-42-09-603.png! 

Most time we debug just want to build and query cube easy. But current way is 
complex to load csv tables and create model and cube. So, I want to add a csv 
source  which using the model of kylin sample data directly when debug tomcat 
started.


> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
>  * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>  ```log
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=[file:///path/to/local/dir]
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT
>  ```
>  * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true"
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most time we debug just want to build and query cube easy. But current way is 
> complex to load csv tables and create model and cube. So, I want to add a csv 
> source which using the model of kylin sample data directly when debug tomcat 
> started.





[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/8/20, 9:55 AM:


Now if you want to debug Tomcat without a Hadoop environment, you can follow these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=[file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test|file://%24kylin_source_dir/examples/test_case_data/parquet_test]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=546,height=196!


was (Author: wangrupeng):
Now if you want to debug tomcat without hadoop environment, you can follow the 
follow steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=[file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test|file://%24kylin_source_dir/examples/test_case_data/parquet_test]
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL
 ```
 * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
 This is used for query engine
 * start debug tomcat and we can use the models we already defined
 !screenshot-1.png|width=1341,height=481!

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debuging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
> * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>```log
>kylin.metadata.url=$LOCAL_META_DIR
>kylin.env.zookeeper-is-local=true
>kylin.env.hdfs-working-dir=file:///path/to/local/dir
>kylin.engine.spark-conf.spark.master=local
>kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>kylin.env=UT
>```
> * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true" 
> !image-2020-07-08-17-41-35-954.png! 
> * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png! 
> Most time we debug just want to build and query cube easy. But current way is 
> complex to load csv tables and create model and cube. So, I want to add a csv 
> source  which using the model of kylin sample data directly when debug tomcat 
> started.





[jira] [Comment Edited] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng edited comment on KYLIN-4625 at 7/8/20, 9:55 AM:


Now, if you want to debug Tomcat without a Hadoop environment, follow these steps:
 * edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
 ```log
 kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.env.zookeeper-is-local=true
 
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
 kylin.engine.spark-conf.spark.master=local
 kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
 kylin.env=LOCAL
 ```
 * debug org.apache.kylin.rest.DebugTomcat in IDEA and add the VM option 
"-Dspark.local=true" (this makes the query engine run Spark locally)
 * start DebugTomcat in debug mode; the models already defined can then be used
 !screenshot-1.png|width=1341,height=481!


was (Author: wangrupeng):
Now if you want to debug tomcat without hadoop environment, you can follow the 
follow steps:
* edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
```log
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL
```
* debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true"
  This is used for query engine
* start debug tomcat and we can use the models we already defined
 !screenshot-1.png! 

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debugging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
> * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>```log
>kylin.metadata.url=$LOCAL_META_DIR
>kylin.env.zookeeper-is-local=true
>kylin.env.hdfs-working-dir=file:///path/to/local/dir
>kylin.engine.spark-conf.spark.master=local
>kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>kylin.env=UT
>```
> * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true" 
> !image-2020-07-08-17-41-35-954.png! 
> * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png! 
> Most of the time we just want to build and query a cube easily while 
> debugging, but the current way of loading csv tables and creating a model 
> and cube is cumbersome. So I want to add a csv source that uses the Kylin 
> sample data model directly when the debug tomcat starts.





[jira] [Commented] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153453#comment-17153453
 ] 

wangrupeng commented on KYLIN-4625:
---

Now, if you want to debug Tomcat without a Hadoop environment, follow these steps:
* edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
```log
kylin.metadata.url=$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.env.zookeeper-is-local=true
kylin.env.hdfs-working-dir=file://$KYLIN_SOURCE_DIR/examples/test_case_data/parquet_test
kylin.engine.spark-conf.spark.master=local
kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
kylin.env=LOCAL
```
* debug org.apache.kylin.rest.DebugTomcat in IDEA and add the VM option 
"-Dspark.local=true" (this makes the query engine run Spark locally)
* start DebugTomcat in debug mode; the models already defined can then be used
 !screenshot-1.png! 

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debugging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
> * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>```log
>kylin.metadata.url=$LOCAL_META_DIR
>kylin.env.zookeeper-is-local=true
>kylin.env.hdfs-working-dir=file:///path/to/local/dir
>kylin.engine.spark-conf.spark.master=local
>kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>kylin.env=UT
>```
> * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true" 
> !image-2020-07-08-17-41-35-954.png! 
> * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png! 
> Most of the time we just want to build and query a cube easily while 
> debugging, but the current way of loading csv tables and creating a model 
> and cube is cumbersome. So I want to add a csv source that uses the Kylin 
> sample data model directly when the debug tomcat starts.





[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Attachment: screenshot-1.png

> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debugging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
> * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>```log
>kylin.metadata.url=$LOCAL_META_DIR
>kylin.env.zookeeper-is-local=true
>kylin.env.hdfs-working-dir=file:///path/to/local/dir
>kylin.engine.spark-conf.spark.master=local
>kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>kylin.env=UT
>```
> * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true" 
> !image-2020-07-08-17-41-35-954.png! 
> * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png! 
> Most of the time we just want to build and query a cube easily while 
> debugging, but the current way of loading csv tables and creating a model 
> and cube is cumbersome. So I want to add a csv source that uses the Kylin 
> sample data model directly when the debug tomcat starts.





[jira] [Updated] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4625:
--
Description: 
Currently, Kylin on Parquet already supports debugging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
* edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
   ```log
   kylin.metadata.url=$LOCAL_META_DIR
   kylin.env.zookeeper-is-local=true
   kylin.env.hdfs-working-dir=file:///path/to/local/dir
   kylin.engine.spark-conf.spark.master=local
   kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
   kylin.env=UT
   ```
* debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true" 
!image-2020-07-08-17-41-35-954.png! 
* Load csv data source by pressing button "Data Source->Load CSV File as Table" 
on "Model" page, and set the schema for your table. Then press "submit" to save.
 !image-2020-07-08-17-42-09-603.png! 

Most of the time we just want to build and query a cube easily while 
debugging, but the current way of loading csv tables and creating a model and 
cube is cumbersome. So I want to add a csv source that uses the Kylin sample 
data model directly when the debug tomcat starts.

  was:
Currently, Kylin on Parquet already supports debuging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
* edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
   ```log
   kylin.metadata.url=$LOCAL_META_DIR
   kylin.env.zookeeper-is-local=true
   kylin.env.hdfs-working-dir=file:///path/to/local/dir
   kylin.engine.spark-conf.spark.master=local
   kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
   ```
* debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true" 
!image-2020-07-08-17-41-35-954.png! 
* Load csv data source by pressing button "Data Source->Load CSV File as Table" 
on "Model" page, and set the schema for your table. Then press "submit" to save.
 !image-2020-07-08-17-42-09-603.png! 

Most time we debug just want to build and query cube easy. But current way is 
complex to load csv tables and create model and cube. So, I want to add a csv 
source  which using the model of kylin sample data directly when debug tomcat 
started.


> Debug the code of Kylin on Parquet without hadoop environment
> -
>
> Key: KYLIN-4625
> URL: https://issues.apache.org/jira/browse/KYLIN-4625
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png
>
>
> Currently, Kylin on Parquet already supports debugging source code with local 
> csv files, but it's a little bit complex. The steps are as follows:
> * edit the properties of 
> $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
>```log
>kylin.metadata.url=$LOCAL_META_DIR
>kylin.env.zookeeper-is-local=true
>kylin.env.hdfs-working-dir=file:///path/to/local/dir
>kylin.engine.spark-conf.spark.master=local
>kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>kylin.env=UT
>```
> * debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
> "-Dspark.local=true" 
> !image-2020-07-08-17-41-35-954.png! 
> * Load csv data source by pressing button "Data Source->Load CSV File as 
> Table" on "Model" page, and set the schema for your table. Then press 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png! 
> Most of the time we just want to build and query a cube easily while 
> debugging, but the current way of loading csv tables and creating a model 
> and cube is cumbersome. So I want to add a csv source that uses the Kylin 
> sample data model directly when the debug tomcat starts.





[jira] [Created] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

2020-07-08 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4625:
-

 Summary: Debug the code of Kylin on Parquet without hadoop 
environment
 Key: KYLIN-4625
 URL: https://issues.apache.org/jira/browse/KYLIN-4625
 Project: Kylin
  Issue Type: Improvement
  Components: Spark Engine
Reporter: wangrupeng
Assignee: wangrupeng
 Attachments: image-2020-07-08-17-41-35-954.png, 
image-2020-07-08-17-42-09-603.png

Currently, Kylin on Parquet already supports debugging source code with local 
csv files, but it's a little bit complex. The steps are as follows:
* edit the properties of 
$KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties to local
   ```log
   kylin.metadata.url=$LOCAL_META_DIR
   kylin.env.zookeeper-is-local=true
   kylin.env.hdfs-working-dir=file:///path/to/local/dir
   kylin.engine.spark-conf.spark.master=local
   kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
   ```
* debug org.apache.kylin.rest.DebugTomcat with IDEA && add VM option 
"-Dspark.local=true" 
!image-2020-07-08-17-41-35-954.png! 
* Load csv data source by pressing button "Data Source->Load CSV File as Table" 
on "Model" page, and set the schema for your table. Then press "submit" to save.
 !image-2020-07-08-17-42-09-603.png! 

Most of the time we just want to build and query a cube easily while 
debugging, but the current way of loading csv tables and creating a model and 
cube is cumbersome. So I want to add a csv source that uses the Kylin sample 
data model directly when the debug tomcat starts.





[jira] [Commented] (KYLIN-4621) Avoid annoying log message when build cube and query

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153331#comment-17153331
 ] 

wangrupeng commented on KYLIN-4621:
---

* add a new log4j properties configuration file called 
kylin-parquet-log4j.properties
* set the default log level of Spark to "WARN"
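A minimal sketch of what such a file might contain. The property names follow standard log4j 1.x conventions; the file path and appender layout here are assumptions, and the actual kylin-parquet-log4j.properties shipped with Kylin may differ:

```properties
# Root logger: Kylin's own messages at INFO, sent to the server log file
log4j.rootLogger=INFO,file
log4j.appender.file=org.apache.log4j.RollingFileAppender
# Hypothetical path; the real deployment resolves this differently
log4j.appender.file.File=${catalina.home}/../logs/kylin.log
log4j.appender.file.MaxFileSize=268435456
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2} : %m%n
# Quiet Spark down: only warnings and errors reach the log
log4j.logger.org.apache.spark=WARN
```

With the Spark logger capped at WARN, the tens of thousands of INFO-level scheduler and executor messages from a build no longer flood kylin.log.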

> Avoid annoying log message when build cube and query
> 
>
> Key: KYLIN-4621
> URL: https://issues.apache.org/jira/browse/KYLIN-4621
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> # Build
> A single build task produces about 40 thousand rows of log messages, 
> most of them unnecessary.
> # Query 
> On the first query, Kylin initializes a Spark context and prints all 
> loaded jars and classes. This should be printed to kylin.out, not kylin.log. 





[jira] [Issue Comment Deleted] (KYLIN-4621) Avoid annoying log message when build cube and query

2020-07-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4621:
--
Comment: was deleted

(was: https://github.com/apache/kylin/pull/1310)

> Avoid annoying log message when build cube and query
> 
>
> Key: KYLIN-4621
> URL: https://issues.apache.org/jira/browse/KYLIN-4621
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> # Build
> A single build task produces about 40 thousand rows of log messages, 
> most of them unnecessary.
> # Query 
> On the first query, Kylin initializes a Spark context and prints all 
> loaded jars and classes. This should be printed to kylin.out, not kylin.log. 





[jira] [Commented] (KYLIN-4621) Avoid annoying log message when build cube and query

2020-07-08 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153328#comment-17153328
 ] 

wangrupeng commented on KYLIN-4621:
---

https://github.com/apache/kylin/pull/1310

> Avoid annoying log message when build cube and query
> 
>
> Key: KYLIN-4621
> URL: https://issues.apache.org/jira/browse/KYLIN-4621
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> # Build
> A single build task produces about 40 thousand rows of log messages, 
> most of them unnecessary.
> # Query 
> On the first query, Kylin initializes a Spark context and prints all 
> loaded jars and classes. This should be printed to kylin.out, not kylin.log. 





[jira] [Updated] (KYLIN-4621) Avoid annoying log message when build cube and query

2020-07-06 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4621:
--
Sprint: Sprint 53

> Avoid annoying log message when build cube and query
> 
>
> Key: KYLIN-4621
> URL: https://issues.apache.org/jira/browse/KYLIN-4621
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Affects Versions: v4.0.0-beta
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> # Build
> A single build task produces about 40 thousand rows of log messages, 
> most of them unnecessary.
> # Query 
> On the first query, Kylin initializes a Spark context and prints all 
> loaded jars and classes. This should be printed to kylin.out, not kylin.log. 





[jira] [Created] (KYLIN-4621) Avoid annoying log message when build cube and query

2020-07-06 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4621:
-

 Summary: Avoid annoying log message when build cube and query
 Key: KYLIN-4621
 URL: https://issues.apache.org/jira/browse/KYLIN-4621
 Project: Kylin
  Issue Type: Improvement
  Components: Spark Engine
Affects Versions: v4.0.0-beta
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta


# Build
A single build task produces about 40 thousand rows of log messages; most 
of them are unnecessary.
# Query 
On the first query, Kylin initializes a Spark context and prints all loaded 
jars and classes. This should be printed to kylin.out, not kylin.log. 





[jira] [Resolved] (KYLIN-4516) Support System Cube

2020-06-19 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng resolved KYLIN-4516.
---
Resolution: Fixed

> Support System Cube
> ---
>
> Key: KYLIN-4516
> URL: https://issues.apache.org/jira/browse/KYLIN-4516
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>






[jira] [Updated] (KYLIN-4526) Enhance get the hive table rows

2020-06-19 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4526:
--
Sprint: Sprint 53

> Enhance get the hive table rows
> ---
>
> Key: KYLIN-4526
> URL: https://issues.apache.org/jira/browse/KYLIN-4526
> Project: Kylin
>  Issue Type: Task
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
>
> In KYLIN-4315 we get the row count of the hive table from metadata, but when 
> we turn off hive's statistics feature (`hive.stats.autogather=false`), we 
> can't get the correct row count of the hive table from metadata.





[jira] [Updated] (KYLIN-4527) Beautify the drop-down list of the cube on query page

2020-06-19 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4527:
--
Sprint: Sprint 53

> Beautify the drop-down list of the cube on query page
> -
>
> Key: KYLIN-4527
> URL: https://issues.apache.org/jira/browse/KYLIN-4527
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: image-2020-05-27-12-05-49-425.png, 
> image-2020-05-27-12-22-19-097.png
>
>
> The drop-down list of cubes is very compact, which makes it inconvenient to 
> select a cube.
> Before:
> !image-2020-05-27-12-05-49-425.png|width=424,height=212!
> After: 
> !image-2020-05-27-12-22-19-097.png|width=429,height=249!
>  





[jira] [Commented] (KYLIN-4405) Internal exception when trying to build cube whose modal has null PartitionDesc

2020-06-16 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136609#comment-17136609
 ] 

wangrupeng commented on KYLIN-4405:
---

Sorry, it seems this problem has already been resolved in the master branch.

> Internal exception when trying to build cube whose modal has null 
> PartitionDesc 
> 
>
> Key: KYLIN-4405
> URL: https://issues.apache.org/jira/browse/KYLIN-4405
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.6.2
>Reporter: Chester Liu
>Assignee: Chester Liu
>Priority: Minor
> Fix For: v3.1.0, v3.0.2, v2.6.6
>
>
> We are using 2.6.2 in our production environment and came upon this 
> exception. We build our model and cube using the REST api, which allows null 
> partitionDesc in a kylin model. However, when we try to build the cube 
> related to the model, this exception occurs:
>  
> {{org.apache.kylin.rest.exception.InternalErrorException}}
>  
> {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:398)org.apache.kylin.rest.controller.CubeController.rebuild(CubeController.java:354)}}
>  
> {{org.apache.kylin.rest.controller.CubeController.build(CubeController.java:343)}}
>  {{sun.reflect.GeneratedMethodAccessor233.invoke(Unknown 
> Source)sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
>  {{java.lang.reflect.Method.invoke(Method.java:497)}}
>  {{...}}
>  {{Caused by: java.lang.NullPointerException}}
>  
> {{org.apache.kylin.cube.CubeManager$SegmentAssist.appendSegment(CubeManager.java:695)}}
>  {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:638)}}
>  {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:630)}}
>  
> {{org.apache.kylin.rest.service.JobService.submitJobInternal(JobService.java:233)}}
>  {{org.apache.kylin.rest.service.JobService.submitJob(JobService.java:202)}}
>  
> {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:394)}}
> Our current workaround is a non-null partitionDesc with an empty 
> partitionDateColumn, but ultimately a null partitionDesc makes more sense 
> when our data source is not partitioned in the first place.
> I searched the codebase for null checks on partitionDesc and found several 
> of them, so I think an extra null check should be added.
>   





[jira] [Commented] (KYLIN-4405) Internal exception when trying to build cube whose modal has null PartitionDesc

2020-06-16 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136554#comment-17136554
 ] 

wangrupeng commented on KYLIN-4405:
---

I could not reproduce this problem. I sent a build request through the REST 
API with the model's partitionDesc set to null, but the cube build job 
finished successfully.

> Internal exception when trying to build cube whose modal has null 
> PartitionDesc 
> 
>
> Key: KYLIN-4405
> URL: https://issues.apache.org/jira/browse/KYLIN-4405
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.6.2
>Reporter: Chester Liu
>Assignee: Chester Liu
>Priority: Minor
> Fix For: v3.1.0, v3.0.2, v2.6.6
>
>
> We are using 2.6.2 in our production environment and came upon this 
> exception. We build our model and cube using the REST api, which allows null 
> partitionDesc in a kylin model. However, when we try to build the cube 
> related to the model, this exception occurs:
>  
> {{org.apache.kylin.rest.exception.InternalErrorException}}
>  
> {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:398)org.apache.kylin.rest.controller.CubeController.rebuild(CubeController.java:354)}}
>  
> {{org.apache.kylin.rest.controller.CubeController.build(CubeController.java:343)}}
>  {{sun.reflect.GeneratedMethodAccessor233.invoke(Unknown 
> Source)sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
>  {{java.lang.reflect.Method.invoke(Method.java:497)}}
>  {{...}}
>  {{Caused by: java.lang.NullPointerException}}
>  
> {{org.apache.kylin.cube.CubeManager$SegmentAssist.appendSegment(CubeManager.java:695)}}
>  {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:638)}}
>  {{org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:630)}}
>  
> {{org.apache.kylin.rest.service.JobService.submitJobInternal(JobService.java:233)}}
>  {{org.apache.kylin.rest.service.JobService.submitJob(JobService.java:202)}}
>  
> {{org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:394)}}
> Our current workaround is a non-null partitionDesc with an empty 
> partitionDateColumn, but ultimately a null partitionDesc makes more sense 
> when our data source is not partitioned in the first place.
> I searched the codebase for null checks on partitionDesc and found several 
> of them, so I think an extra null check should be added.
>   





[jira] [Updated] (KYLIN-4563) Support for specifying cuboids when building segments

2020-06-12 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4563:
--
Sprint: Sprint 53

> Support for specifying cuboids when building segments
> -
>
> Key: KYLIN-4563
> URL: https://issues.apache.org/jira/browse/KYLIN-4563
> Project: Kylin
>  Issue Type: Improvement
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>
> Currently, when building a segment, all cuboids of the cube are built. 
> After removing or adding some cuboids with the cube planner, there is no 
> need to rebuild whole segments; we can remove or add only the cuboid data 
> we need. 





[jira] [Created] (KYLIN-4563) Support for specifying cuboids when building segments

2020-06-12 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4563:
-

 Summary: Support for specifying cuboids when building segments
 Key: KYLIN-4563
 URL: https://issues.apache.org/jira/browse/KYLIN-4563
 Project: Kylin
  Issue Type: Improvement
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta


Currently, when building a segment, all cuboids of the cube are built. 
After removing or adding some cuboids with the cube planner, there is no need 
to rebuild whole segments; we can remove or add only the cuboid data we 
need. 






[jira] [Updated] (KYLIN-4224) Create flat table wich spark sql

2020-06-12 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4224:
--
Labels:   (was: doc)

> Create flat table wich spark sql
> 
>
> Key: KYLIN-4224
> URL: https://issues.apache.org/jira/browse/KYLIN-4224
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
>
> Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741.
> Currently hive is used to create the flat table, but hive can't read spark 
> datasource data. We need to support creating the flat table with spark sql, 
> because it can read both hive and spark datasource data when creating the 
> flat table.





[jira] [Updated] (KYLIN-4224) Create flat table wich spark sql

2020-06-12 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4224:
--
Labels: doc  (was: )

> Create flat table wich spark sql
> 
>
> Key: KYLIN-4224
> URL: https://issues.apache.org/jira/browse/KYLIN-4224
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>  Labels: doc
> Fix For: v3.1.0
>
>
> Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741.
> Currently hive is used to create the flat table, but hive can't read spark 
> datasource data. We need to support creating the flat table with spark sql, 
> because it can read both hive and spark datasource data when creating the 
> flat table.





[jira] [Closed] (KYLIN-4518) Pruning cuboids with genetic algorithm

2020-06-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng closed KYLIN-4518.
-
Resolution: Resolved

> Pruning cuboids with genetic algorithm
> --
>
> Key: KYLIN-4518
> URL: https://issues.apache.org/jira/browse/KYLIN-4518
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>






[jira] [Closed] (KYLIN-4517) Pruning cuboids with greedy algorithm

2020-06-08 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng closed KYLIN-4517.
-
Resolution: Resolved

> Pruning cuboids  with greedy algorithm
> --
>
> Key: KYLIN-4517
> URL: https://issues.apache.org/jira/browse/KYLIN-4517
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>






[jira] [Commented] (KYLIN-4498) CubePlaner for Kylin on Parquet

2020-06-06 Thread wangrupeng (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127465#comment-17127465
 ] 

wangrupeng commented on KYLIN-4498:
---

Cube Planner Proposal

Cube Planner checks the cost and benefit of each dimension combination and 
selects cost-effective combination sets to improve cube build efficiency and 
query performance. Cube Planner has two phases.
It was designed and contributed by eBay; see more about its principle here: 
https://tech.ebayinc.com/engineering/cube-planner-build-an-apache-kylin-olap-cube-efficiently-and-intelligently/

In my opinion, to let Cube Planner support Kylin on Parquet, we need to make 
some changes to the current Spark cube build engine. My suggestion is as 
follows; the front-end interaction remains the same as before.
Phase 1 (building the cube for the first time):
1. Add a new step that calculates the row count of each cuboid with Spark 
before the cube building step (Kylin on Parquet currently has two build steps).
2. During the cube building step, recommend a cuboid list with the greedy or 
genetic algorithm before building the cube. The code of these two algorithms 
can be reused.

Phase 2 (cube has been in use for a while):
1. Use the System Cube, which now works normally, to collect query metrics 
(including rows and bytes scanned per cuboid).
2. Add a new Spark job that optimizes and rebuilds the cube with the 
information collected by the System Cube.
3. The steps of the new optimize job:
a. Use the query metrics to recommend cuboids.
b. Rebuild old segments by removing unneeded cuboids and adding needed ones; 
the Kylin on Parquet build engine can add only the cuboids (now also called 
"layouts") we need without rebuilding all cuboids of a segment.
c. Update the metadata.

> CubePlaner for Kylin on Parquet
> ---
>
> Key: KYLIN-4498
> URL: https://issues.apache.org/jira/browse/KYLIN-4498
> Project: Kylin
>  Issue Type: New Feature
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>
> CubePlanner still doesn't support Kylin on Parquet  yet.  We need this to be 
> more resource efficient.





[jira] [Assigned] (KYLIN-4544) Kylin on Parquet LEFT JOIN Query failed

2020-06-03 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng reassigned KYLIN-4544:
-

Assignee: wangrupeng

> Kylin on Parquet LEFT JOIN Query failed
> ---
>
> Key: KYLIN-4544
> URL: https://issues.apache.org/jira/browse/KYLIN-4544
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: bright liao
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> select t.n,t1.n from (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t
> LEFT JOIN (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200526') as t1
> on 1=1
> This SQL fails to execute, while Kylin 3.0 runs it with no problem.
> Error message:
> Error while applying rule OLAPJoinRule, args 
> [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1,
>  1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from 
> ( select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t LEFT JOIN ( select count(*) as n from 
> dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit 
> 5"





[jira] [Updated] (KYLIN-4544) Kylin on Parquet LEFT JOIN Query failed

2020-06-03 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4544:
--
Description: 
select t.n,t1.n from (
select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') 
as t
LEFT JOIN (
select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200526') 
as t1
on 1=1

This SQL fails to execute, while Kylin 3.0 runs it with no problem.

Error message:

Error while applying rule OLAPJoinRule, args 
[rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1,
 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from ( 
select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') 
as t LEFT JOIN ( select count(*) as n from dm.dm_bi_device_info_day where 
deal_date='20200526') as t1 on 1=1) limit 5"

  was:
select t.n,t1.n from (
select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') 
as t
LEFT JOIN (
select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200526') 
as t1
on 1=1

The query fails, but it works in version 3.0.

The error message is as follows:

Error while applying rule OLAPJoinRule, args 
[rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1,
 1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from ( 
select count(*) as n from dm.dm_bi_device_info_day where deal_date='20200527') 
as t LEFT JOIN ( select count(*) as n from dm.dm_bi_device_info_day where 
deal_date='20200526') as t1 on 1=1) limit 5"


> Kylin on Parquet LEFT JOIN Query failed
> ---
>
> Key: KYLIN-4544
> URL: https://issues.apache.org/jira/browse/KYLIN-4544
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: bright liao
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> select t.n,t1.n from (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t
> LEFT JOIN (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200526') as t1
> on 1=1
> This SQL fails to execute, while Kylin 3.0 runs it with no problem.
> Error message:
> Error while applying rule OLAPJoinRule, args 
> [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1,
>  1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from 
> ( select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t LEFT JOIN ( select count(*) as n from 
> dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit 
> 5"





[jira] [Updated] (KYLIN-4544) Kylin on Parquet LEFT JOIN Query failed

2020-06-03 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4544:
--
Summary: Kylin on Parquet LEFT JOIN Query failed  (was: Kylin on 
Parquet is incompatible with LEFT JOIN)

> Kylin on Parquet LEFT JOIN Query failed
> ---
>
> Key: KYLIN-4544
> URL: https://issues.apache.org/jira/browse/KYLIN-4544
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: bright liao
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> select t.n,t1.n from (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t
> LEFT JOIN (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200526') as t1
> on 1=1
> The query fails, but it works in version 3.0.
> The error message is as follows:
> Error while applying rule OLAPJoinRule, args 
> [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1,
>  1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from 
> ( select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t LEFT JOIN ( select count(*) as n from 
> dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit 
> 5"





[jira] [Updated] (KYLIN-4544) Kylin on Parquet is incompatible with LEFT JOIN

2020-06-03 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4544:
--
Affects Version/s: (was: v4.0.0-beta)

> Kylin on Parquet is incompatible with LEFT JOIN
> -----------------------------------------------
>
> Key: KYLIN-4544
> URL: https://issues.apache.org/jira/browse/KYLIN-4544
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: bright liao
>Priority: Major
> Fix For: v4.0.0-beta
>
>
> select t.n,t1.n from (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t
> LEFT JOIN (
> select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200526') as t1
> on 1=1
> The query fails, but it works in version 3.0.
> The error message is as follows:
> Error while applying rule OLAPJoinRule, args 
> [rel#234029:LogicalJoin.NONE.[](left=rel#234022:Subset#3.NONE.[],right=rel#234028:Subset#6.NONE.[],condition==(1,
>  1),joinType=left)] while executing SQL: "select * from (select t.n,t1.n from 
> ( select count(*) as n from dm.dm_bi_device_info_day where 
> deal_date='20200527') as t LEFT JOIN ( select count(*) as n from 
> dm.dm_bi_device_info_day where deal_date='20200526') as t1 on 1=1) limit 
> 5"





[jira] [Updated] (KYLIN-4498) CubePlaner for Kylin on Parquet

2020-06-01 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4498:
--
Description: CubePlanner still doesn't support Kylin on Parquet  yet.  We 
need this to be more resource efficient.  (was: Kylin on Parquet still doesn't 
support CubePlanner yet.  We need this to be more resource efficient.)

> CubePlaner for Kylin on Parquet
> ---
>
> Key: KYLIN-4498
> URL: https://issues.apache.org/jira/browse/KYLIN-4498
> Project: Kylin
>  Issue Type: New Feature
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
> Fix For: v4.0.0-beta
>
>
> CubePlanner still doesn't support Kylin on Parquet  yet.  We need this to be 
> more resource efficient.





[jira] [Updated] (KYLIN-4458) FilePruner prune shards

2020-05-31 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng updated KYLIN-4458:
--
Sprint: Sprint 51  (was: Sprint 52)

> FilePruner prune shards
> ---
>
> Key: KYLIN-4458
> URL: https://issues.apache.org/jira/browse/KYLIN-4458
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: xuekaiqi
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> To enable pruning by shard columns, the web front end needs to add a "shard 
> by column" option





[jira] [Resolved] (KYLIN-4458) FilePruner prune shards

2020-05-31 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng resolved KYLIN-4458.
---
Resolution: Fixed

> FilePruner prune shards
> ---
>
> Key: KYLIN-4458
> URL: https://issues.apache.org/jira/browse/KYLIN-4458
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: xuekaiqi
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> To enable pruning by shard columns, the web front end needs to add a "shard 
> by column" option





[jira] [Resolved] (KYLIN-4450) Add the feature that adjusting spark driver memory adaptively

2020-05-31 Thread wangrupeng (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangrupeng resolved KYLIN-4450.
---
Resolution: Fixed

> Add the feature that adjusting spark driver memory adaptively
> -
>
> Key: KYLIN-4450
> URL: https://issues.apache.org/jira/browse/KYLIN-4450
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: xuekaiqi
>Assignee: wangrupeng
>Priority: Major
> Fix For: v4.0.0-beta
>
>   Original Estimate: 16h
>  Remaining Estimate: 16h
>
> For now the cubing job can adaptively adjust the following Spark properties 
> to use resources rationally, but adjusting the driver memory of the Spark 
> job submitted to the cluster hasn't been done yet.
>  
> {code:java}
> spark.executor.memory
> spark.executor.cores
> spark.executor.memoryOverhead
> spark.executor.instances
> spark.sql.shuffle.partitions
> {code}
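The adaptive adjustment described above could, for driver memory, look something like the sketch below. This is a hypothetical illustration, not Kylin's actual implementation: the function name, tier thresholds, and memory sizes are my assumptions. The idea is to scale `spark.driver.memory` with the size of the build (here approximated by the cuboid count) through fixed tiers.

```python
def adaptive_driver_memory_mb(cuboid_count,
                              tiers=(2, 20, 100),
                              sizes_mb=(1024, 2048, 4096, 6144)):
    """Return a driver memory size in MB: the size paired with the first
    tier threshold the cuboid count does not exceed, else the largest size."""
    for threshold, size in zip(tiers, sizes_mb):
        if cuboid_count <= threshold:
            return size
    return sizes_mb[-1]  # very large builds get the top tier
```

For instance, a tiny build with 1 cuboid would get 1024 MB, a 50-cuboid build 4096 MB, and anything beyond the last threshold the top tier of 6144 MB.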





[jira] [Created] (KYLIN-4519) Analyse user query history and offer suggestion about the removing of unused cuboids

2020-05-22 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4519:
-

 Summary: Analyse user query history and offer suggestion about the 
removing of unused cuboids
 Key: KYLIN-4519
 URL: https://issues.apache.org/jira/browse/KYLIN-4519
 Project: Kylin
  Issue Type: Sub-task
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta








[jira] [Created] (KYLIN-4517) Pruning cuboids with greedy algorithm

2020-05-22 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4517:
-

 Summary: Pruning cuboids  with greedy algorithm
 Key: KYLIN-4517
 URL: https://issues.apache.org/jira/browse/KYLIN-4517
 Project: Kylin
  Issue Type: Sub-task
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta








[jira] [Created] (KYLIN-4518) Pruning cuboids with genetic algorithm

2020-05-22 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4518:
-

 Summary: Pruning cuboids with genetic algorithm
 Key: KYLIN-4518
 URL: https://issues.apache.org/jira/browse/KYLIN-4518
 Project: Kylin
  Issue Type: Sub-task
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta








[jira] [Created] (KYLIN-4516) Support System Cube

2020-05-22 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4516:
-

 Summary: Support System Cube
 Key: KYLIN-4516
 URL: https://issues.apache.org/jira/browse/KYLIN-4516
 Project: Kylin
  Issue Type: Sub-task
Reporter: wangrupeng
Assignee: wangrupeng
 Fix For: v4.0.0-beta







