[jira] [Created] (KYLIN-2427) Auto adjust join order to make query executable

2017-02-06 Thread Kaige Liu (JIRA)
 Kaige Liu created KYLIN-2427:
-

 Summary: Auto adjust join order to make query executable
 Key: KYLIN-2427
 URL: https://issues.apache.org/jira/browse/KYLIN-2427
 Project: Kylin
  Issue Type: Bug
Reporter:  Kaige Liu


KYLIN-2406 reports an issue: the order of joins affects the result of a 
query. For example, the query below leads to "No model found".
The query below triggers an NPE:

{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
inner join supplier on ps_suppkey = s_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}

The query below, however, is OK. The only difference is the order of 
"inner join tmp3" and "inner join supplier":

{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join supplier on ps_suppkey = s_suppkey
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}
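
The reordering that the summary asks for can be illustrated with a small sketch. This is not Kylin's implementation: the class and field names are hypothetical, and it assumes (as in the two queries above) that every inner join's ON condition refers only to the FROM table and to the joined table's own columns, so that swapping the joins does not change the result. Under that assumption, plain tables are moved ahead of subqueries such as tmp3:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the inner joins that follow the FROM table.
final class JoinReorder {
    static final class Join {
        final String target;     // table name or subquery alias
        final boolean subquery;  // true for aliases such as tmp3

        Join(String target, boolean subquery) {
            this.target = target;
            this.subquery = subquery;
        }
    }

    // Returns a copy with plain tables first and subqueries last.
    // List.sort is stable, so the relative order inside each group is kept.
    static List<Join> tablesFirst(List<Join> joins) {
        List<Join> ordered = new ArrayList<>(joins);
        ordered.sort((a, b) -> Boolean.compare(a.subquery, b.subquery));
        return ordered;
    }

    public static void main(String[] args) {
        List<Join> failing = Arrays.asList(
                new Join("tmp3", true),
                new Join("supplier", false));
        for (Join j : tablesFirst(failing)) {
            System.out.println("inner join " + j.target);
        }
    }
}
```

For the failing query, the input order (tmp3, supplier) becomes (supplier, tmp3), which is the join order of the query that works.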




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KYLIN-2427) Auto adjust join order to make query executable

2017-02-06 Thread Kaige Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaige Liu updated KYLIN-2427:
-
Description: 
KYLIN-2406 reports an issue: the order of joins affects the result of a 
query. For example, the query below leads to "No model found".
The query below triggers an NPE:

{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
inner join supplier on ps_suppkey = s_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}

The query below, however, is OK. The only difference is the order of 
"inner join tmp3" and "inner join supplier":

{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join supplier on ps_suppkey = s_suppkey
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}

  was:
KYLIN-2406 reports an issue: The order of joins will affect the result of 
query. For example, below query leads to "No model found"
Below query triggers NPE

{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
inner join supplier on ps_suppkey = s_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}

While below query is OK. Only difference being the order of "inner join tmp3" 
and "inner join supplier"

{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join supplier on ps_suppkey = s_suppkey
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}

But below query is OK.
{code}
with tmp3 as (
select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
from v_lineitem
inner join supplier on l_suppkey = s_suppkey
inner join nation on s_nationkey = n_nationkey
inner join part on l_partkey = p_partkey
where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
and n_name = 'CANADA'
and p_name like 'forest%'
group by l_partkey, l_suppkey
)

select
s_name,
s_address
from
v_partsupp
inner join supplier on ps_suppkey = s_suppkey
inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
where
ps_availqty > sum_quantity
group by
s_name, s_address
order by
s_name
{code}


> Auto adjust join order to make query executable
> ---
>
> Key: KYLIN-2427
> URL: https://issues.apache.org/jira/browse/KYLIN-2427
> Project: Kylin
>  Issue Type: Bug
>Reporter:  Kaige Liu
>
> KYLIN-2406 reports an issue: the order of joins affects the result of a 
> query. For example, the query below leads to "No model found".
> The query below triggers an NPE:
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' an

[jira] [Assigned] (KYLIN-2427) Auto adjust join order to make query executable

2017-02-06 Thread Kaige Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kaige Liu reassigned KYLIN-2427:
-

Assignee:  Kaige Liu

> Auto adjust join order to make query executable
> ---
>
> Key: KYLIN-2427
> URL: https://issues.apache.org/jira/browse/KYLIN-2427
> Project: Kylin
>  Issue Type: Bug
>Reporter:  Kaige Liu
>Assignee:  Kaige Liu
>
> KYLIN-2406 reports an issue: the order of joins affects the result of a 
> query. For example, the query below leads to "No model found".
> The query below triggers an NPE:
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> inner join supplier on ps_suppkey = s_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}
> The query below, however, is OK. The only difference is the order of 
> "inner join tmp3" and "inner join supplier":
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join supplier on ps_suppkey = s_suppkey
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}





[jira] [Updated] (KYLIN-2406) TPC-H query 20, can triggers NPE

2017-02-06 Thread Kaige Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaige Liu updated KYLIN-2406:
-
Attachment: KYLIN-2406-fix-NPE.patch

For now, this patch only prevents the NPE and gives an error hint. A better 
solution will be provided later in KYLIN-2427.

> TPC-H query 20, can triggers NPE
> 
>
> Key: KYLIN-2406
> URL: https://issues.apache.org/jira/browse/KYLIN-2406
> Project: Kylin
>  Issue Type: Bug
>Reporter: liyang
>Assignee:  Kaige Liu
> Attachments: KYLIN-2406-fix-NPE.patch
>
>
> The query below triggers an NPE:
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> inner join supplier on ps_suppkey = s_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}
> The query below, however, is OK. The only difference is the order of 
> "inner join tmp3" and "inner join supplier":
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join supplier on ps_suppkey = s_suppkey
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}





[jira] [Updated] (KYLIN-2406) TPC-H query 20, can triggers NPE

2017-02-06 Thread Kaige Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaige Liu updated KYLIN-2406:
-
Attachment: KYLIN-2406-fix-NPE.patch






[jira] [Updated] (KYLIN-2406) TPC-H query 20, can triggers NPE

2017-02-06 Thread Kaige Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaige Liu updated KYLIN-2406:
-
Attachment: (was: KYLIN-2406-fix-NPE.patch)






[jira] [Resolved] (KYLIN-2406) TPC-H query 20, prevent NPE and give error hint

2017-02-06 Thread Dong Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Li resolved KYLIN-2406.

   Resolution: Fixed
Fix Version/s: v2.0.0

Merged to master branch. Thanks Kaige!

> TPC-H query 20, prevent NPE and give error hint
> ---
>
> Key: KYLIN-2406
> URL: https://issues.apache.org/jira/browse/KYLIN-2406
> Project: Kylin
>  Issue Type: Bug
>Reporter: liyang
>Assignee:  Kaige Liu
> Fix For: v2.0.0
>
> Attachments: KYLIN-2406-fix-NPE.patch
>
>
> The query below triggers an NPE:
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> inner join supplier on ps_suppkey = s_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}
> The query below, however, is OK. The only difference is the order of 
> "inner join tmp3" and "inner join supplier":
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join supplier on ps_suppkey = s_suppkey
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}





[jira] [Updated] (KYLIN-2406) TPC-H query 20, prevent NPE and give error hint

2017-02-06 Thread Dong Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Li updated KYLIN-2406:
---
Request participants:   (was: )
 Summary: TPC-H query 20, prevent NPE and give error hint  
(was: TPC-H query 20, can triggers NPE)






[jira] [Commented] (KYLIN-2424) Optimize the integration test's performance

2017-02-06 Thread hongbin ma (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853970#comment-15853970
 ] 

hongbin ma commented on KYLIN-2424:
---

[~Shaofengshi] great work! I can close KYLIN-2015 safely now
[~yimingliu] It looks like an abbreviation for "true". If so, it could be 
confusing. Why not just use "true"?

> Optimize the integration test's performance
> ---
>
> Key: KYLIN-2424
> URL: https://issues.apache.org/jira/browse/KYLIN-2424
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
> Fix For: v2.0.0
>
>
> Kylin's integration tests are slow, especially ITCombinationTest. Most of 
> the time is spent in H2 executing the test queries. In a recent integration 
> test run, this test case took 90 minutes to finish.
> After checking H2's documentation, I think the main problem is the absence of 
> indexes on the tables, which are very important for a relational database's 
> query performance. So when Kylin creates the tables in H2, it should create 
> indexes on the columns that will be used in the queries, such as the pk/fk 
> columns and the filtering columns.
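
A minimal sketch of what the proposal amounts to, assuming the H2 tables are created by issuing SQL statements. The helper, table and column names below are hypothetical and are not taken from Kylin's test code; the generated statement follows H2's CREATE INDEX IF NOT EXISTS syntax:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that builds index DDL for columns used in joins and filters.
final class H2IndexDdl {
    // One single-column index per call; the index name is derived from table and column.
    static String createIndexSql(String table, String column) {
        String indexName = "IDX_" + table + "_" + column;
        return "CREATE INDEX IF NOT EXISTS " + indexName
                + " ON " + table + "(" + column + ")";
    }

    // One statement per column, in the order given.
    static List<String> createIndexSql(String table, List<String> columns) {
        List<String> statements = new ArrayList<>();
        for (String column : columns) {
            statements.add(createIndexSql(table, column));
        }
        return statements;
    }

    public static void main(String[] args) {
        // FACT_TABLE, SELLER_ID and CAL_DT are made-up names for illustration only.
        for (String sql : createIndexSql("FACT_TABLE", Arrays.asList("SELLER_ID", "CAL_DT"))) {
            System.out.println(sql);
        }
    }
}
```

Each statement would be executed once, right after the corresponding table is created and before the test queries run.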





[jira] [Commented] (KYLIN-2015) replace h2 with alternatives like sqllite or mysql

2017-02-06 Thread hongbin ma (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853972#comment-15853972
 ] 

hongbin ma commented on KYLIN-2015:
---

The performance issue of H2 is solved in KYLIN-2424 without replacing H2

> replace h2 with alternatives like sqllite or mysql
> --
>
> Key: KYLIN-2015
> URL: https://issues.apache.org/jira/browse/KYLIN-2015
> Project: Kylin
>  Issue Type: Improvement
>Reporter: hongbin ma
>Assignee: hongbin ma
>
> In IT we compare Kylin's results with H2's results to ensure query correctness.
> However, H2 supports only part of the SQL syntax. For example, it cannot 
> support functions like timestampadd, or (DATE'2013-01-02' + interval '3' 
> day). What's more, subqueries are observed to be very slow on H2.





[jira] [Comment Edited] (KYLIN-2424) Optimize the integration test's performance

2017-02-06 Thread hongbin ma (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853970#comment-15853970
 ] 

hongbin ma edited comment on KYLIN-2424 at 2/6/17 1:20 PM:
---

[~Shaofengshi] great work! 
[~yimingliu] It looks like an abbreviation for "true". If so, it could be 
confusing. Why not just use "true"?


was (Author: mahongbin):
[~Shaofengshi] great work! I can close KYLIN-2015 safely now
[~yimingliu] looks like it's abbreviation for "true". If so it could be 
confusing, why not just use "true"?






[jira] [Assigned] (KYLIN-2407) TPC-H query 20, why this query returns no result?

2017-02-06 Thread Kaige Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kaige Liu reassigned KYLIN-2407:
-

Assignee:  Kaige Liu

> TPC-H query 20, why this query returns no result?
> -
>
> Key: KYLIN-2407
> URL: https://issues.apache.org/jira/browse/KYLIN-2407
> Project: Kylin
>  Issue Type: Bug
>Reporter: liyang
>Assignee:  Kaige Liu
>
> The query below returns no result.
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> ),
> tmp5 as (
> select
> ps_suppkey
> from
> v_partsupp inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = 
> l_suppkey
> where
> ps_availqty > sum_quantity
> )
> select
> s_name,
> s_address
> from
> supplier
> where
> s_suppkey IN (select ps_suppkey from tmp5)
> order by s_name
> {code}
> Another, similar query returns the correct result:
> {code}
> with tmp3 as (
> select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
> from v_lineitem
> inner join supplier on l_suppkey = s_suppkey
> inner join nation on s_nationkey = n_nationkey
> inner join part on l_partkey = p_partkey
> where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
> and n_name = 'CANADA'
> and p_name like 'forest%'
> group by l_partkey, l_suppkey
> )
> select
> s_name,
> s_address
> from
> v_partsupp
> inner join supplier on ps_suppkey = s_suppkey
> inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> where
> ps_availqty > sum_quantity
> group by
> s_name, s_address
> order by
> s_name
> {code}
> Maybe something is wrong with the "where ... IN ..." clause?





[jira] [Created] (KYLIN-2428) Cleanup unnecessary shaded libraries for job/coprocessor/jdbc/server

2017-02-06 Thread Billy Liu (JIRA)
Billy Liu created KYLIN-2428:


 Summary: Cleanup unnecessary shaded libraries for 
job/coprocessor/jdbc/server
 Key: KYLIN-2428
 URL: https://issues.apache.org/jira/browse/KYLIN-2428
 Project: Kylin
  Issue Type: Improvement
  Components: General
Affects Versions: v1.6.0
Reporter: Billy Liu
Assignee: Billy Liu


Kylin releases three libraries (kylin-coprocessor, kylin-jdbc, kylin-job) and 
one web application (server). 
Currently, all of the libraries shade some third-party libraries into their 
packages, for example guava, curator, commons and kryo in kylin-job. Duplicate 
libraries on the runtime classpath may cause class loading conflicts and 
waste computing resources. We should use the Hadoop-provided libraries at 
runtime instead of the shaded ones.





[jira] [Updated] (KYLIN-2428) Cleanup unnecessary shaded libraries for job/coprocessor/jdbc/server

2017-02-06 Thread Billy Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billy Liu updated KYLIN-2428:
-
Attachment: KYLIN-2428.patch

> Cleanup unnecessary shaded libraries for job/coprocessor/jdbc/server
> 
>
> Key: KYLIN-2428
> URL: https://issues.apache.org/jira/browse/KYLIN-2428
> Project: Kylin
>  Issue Type: Improvement
>  Components: General
>Affects Versions: v1.6.0
>Reporter: Billy Liu
>Assignee: Billy Liu
> Attachments: KYLIN-2428.patch
>
>
> Kylin releases three libraries (kylin-coprocessor, kylin-jdbc, kylin-job) and 
> one web application (server). 
> Currently, all of the libraries shade some third-party libraries into their 
> packages, for example guava, curator, commons and kryo in kylin-job. 
> Duplicate libraries on the runtime classpath may cause class loading 
> conflicts and waste computing resources. We should use the Hadoop-provided 
> libraries at runtime instead of the shaded ones.





[jira] [Commented] (KYLIN-2428) Cleanup unnecessary shaded libraries for job/coprocessor/jdbc/server

2017-02-06 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854147#comment-15854147
 ] 

Billy Liu commented on KYLIN-2428:
--

Compared with Kylin 1.6:
kylin-coprocessor is reduced from 2.4M to 1.9M by removing extendedset and 
commons-lang3, although it also introduces tdunning.math.
kylin-jdbc increased from 7.6M to 8.5M because of the Avatica 1.9 upgrade.
kylin-job is reduced from 9.5M to 5.8M by removing esotericsoftware, guava, 
jsch, extendedset, jsr305, commons-cli, commons-io, commons-lang, curator and 
objenesis.
The web application is reduced from 94 libraries to 80.








[jira] [Commented] (KYLIN-2428) Cleanup unnecessary shaded libraries for job/coprocessor/jdbc/server

2017-02-06 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854211#comment-15854211
 ] 

Billy Liu commented on KYLIN-2428:
--

By refining the JDBC shade settings, the package is reduced from 7.6M to 5.4M, 
with no duplicated shaded libraries any more.






[jira] [Created] (KYLIN-2429) Variable initialized should be declared volatile in SparkCubingByLayer#execute()

2017-02-06 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-2429:
-

 Summary: Variable initialized should be declared volatile in 
SparkCubingByLayer#execute()
 Key: KYLIN-2429
 URL: https://issues.apache.org/jira/browse/KYLIN-2429
 Project: Kylin
  Issue Type: Bug
Reporter: Ted Yu


{code}
final JavaPairRDD encodedBaseRDD = 
intermediateTable.javaRDD().mapToPair(new PairFunction() {
transient boolean initialized = false;
...
public Tuple2 call(Row row) throws Exception {
if (initialized == false) {
synchronized (SparkCubingByLayer.class) {
if (initialized == false) {
{code}
For double checked locking to work, initialized needs to be volatile.
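
A self-contained sketch of the pattern with the flag declared volatile. The class below is a hypothetical stand-in, not the code in SparkCubingByLayer, and the counter merely stands in for the real initialization work:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the anonymous PairFunction in the snippet above.
final class DoubleCheckedInit {
    // volatile: a thread that reads true here without taking the lock is
    // guaranteed to also see the writes made before the flag was set.
    private volatile boolean initialized = false;

    // Counts how often the initialization body runs.
    private final AtomicInteger initRuns = new AtomicInteger();

    void ensureInitialized() {
        if (!initialized) {                          // first check, no lock
            synchronized (DoubleCheckedInit.class) {
                if (!initialized) {                  // second check, under the lock
                    initRuns.incrementAndGet();      // placeholder for the real set-up
                    initialized = true;              // set the flag only after set-up
                }
            }
        }
    }

    int initRuns() {
        return initRuns.get();
    }

    public static void main(String[] args) throws InterruptedException {
        DoubleCheckedInit target = new DoubleCheckedInit();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(target::ensureInitialized);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("initialization ran " + target.initRuns() + " time(s)");
    }
}
```

The lock and the second check make the set-up run once; the volatile modifier is what makes the unlocked first check safe. In the snippet from the issue the field is also transient, and Java allows both modifiers on the same field.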





[jira] [Created] (KYLIN-2430) Unnecessary exception catching in BulkLoadJob

2017-02-06 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-2430:
-

 Summary: Unnecessary exception catching in BulkLoadJob
 Key: KYLIN-2430
 URL: https://issues.apache.org/jira/browse/KYLIN-2430
 Project: Kylin
  Issue Type: Bug
  Components: Storage - HBase
Affects Versions: v1.6.0
Reporter: kangkaisen
Assignee: kangkaisen


FsShell.run already catches all exceptions, so we should check the exit code 
instead of catching exceptions.
The current code can potentially result in an infinite loop in 
{{LoadIncrementalHFiles}} if the user runs HBase 0.98.13 and does not set 
{{hbase.bulkload.retries.number}}.
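
The idea of checking an exit code rather than relying on a catch block can be sketched as below. This is only an illustration: the Command interface is a hypothetical stand-in for a shell-style call that reports failure through its return value, and the arguments in main are made up. It is not the BulkLoadJob code or the proposed patch.

```java
final class ExitCodeCheckSketch {
    /** Hypothetical stand-in for a command that signals failure via its return value. */
    interface Command {
        int run(String[] cmdArgs);
    }

    /**
     * Runs the command and converts a non-zero exit code into an exception,
     * so the caller stops instead of continuing with the next step.
     */
    static void runOrFail(Command command, String... cmdArgs) {
        int exitCode = command.run(cmdArgs);
        if (exitCode != 0) {
            throw new IllegalStateException(
                "Command [" + String.join(" ", cmdArgs) + "] failed with exit code " + exitCode);
        }
    }

    public static void main(String[] unused) {
        Command succeeding = cmdArgs -> 0;
        Command failing = cmdArgs -> 1;

        // Illustrative arguments only.
        runOrFail(succeeding, "-chmod", "-R", "777", "/hypothetical/path");
        try {
            runOrFail(failing, "-chmod", "-R", "777", "/hypothetical/path");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A try/catch around a call that never throws gives no signal on failure, which is why the return value is the only reliable indicator in this situation.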





[jira] [Resolved] (KYLIN-2329) Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result

2017-02-06 Thread liyang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang resolved KYLIN-2329.
---
   Resolution: Fixed
Fix Version/s: v2.0.0

> Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result
> -
>
> Key: KYLIN-2329
> URL: https://issues.apache.org/jira/browse/KYLIN-2329
> Project: Kylin
>  Issue Type: Bug
>Reporter: liyang
>Assignee:  Kaige Liu
> Fix For: v2.0.0
>
>
> A TPC-H query returns incorrect result:
> {code}
> select
> sum(l_saleprice) as revenue
> from
> v_lineitem
> where
> l_shipdate >= '1993-01-01'
> and l_shipdate < '1994-01-01'
> and l_discount between 0.06 - 0.01 and 0.06 + 0.01
> and l_quantity < 25;
> {code}
> The result becomes correct if the condition is changed to the following:
> {code}
> and l_discount between 0.05 and 0.07
> {code}





[jira] [Commented] (KYLIN-2329) Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result

2017-02-06 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855458#comment-15855458
 ] 

liyang commented on KYLIN-2329:
---

Cannot reproduce on latest master. Perhaps, the recent upgrade of calcite 
solved this.
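
The root cause is not identified in this thread. As one possible illustration only (an assumption, not a confirmed diagnosis), the sketch below shows how bounds computed with binary floating point can differ from the literal bounds 0.05 and 0.07, while exact decimal arithmetic does not. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

final class BetweenBoundsSketch {
    // BETWEEN semantics on doubles: inclusive on both ends.
    static boolean between(double v, double lo, double hi) {
        return v >= lo && v <= hi;
    }

    // The same check using exact decimal arithmetic.
    static boolean between(BigDecimal v, BigDecimal lo, BigDecimal hi) {
        return v.compareTo(lo) >= 0 && v.compareTo(hi) <= 0;
    }

    public static void main(String[] unused) {
        double lo = 0.06 - 0.01;
        double hi = 0.06 + 0.01;
        // With doubles, the computed upper bound falls slightly below 0.07.
        System.out.println("double bounds: " + lo + " .. " + hi);
        System.out.println("0.07 within double bounds: " + between(0.07, lo, hi));

        BigDecimal base = new BigDecimal("0.06");
        BigDecimal delta = new BigDecimal("0.01");
        System.out.println("0.07 within decimal bounds: "
            + between(new BigDecimal("0.07"), base.subtract(delta), base.add(delta)));
    }
}
```

If a query engine evaluated the bound expressions as doubles at some stage, rows with l_discount equal to 0.07 could be excluded, which would change the sum.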






[jira] [Created] (KYLIN-2431) StorageCleanupJob will remove intermediate tables created by other kylin instances

2017-02-06 Thread Dong Li (JIRA)
Dong Li created KYLIN-2431:
--

 Summary: StorageCleanupJob will remove intermediate tables created by 
other kylin instances
 Key: KYLIN-2431
 URL: https://issues.apache.org/jira/browse/KYLIN-2431
 Project: Kylin
  Issue Type: Improvement
Reporter: Dong Li
Assignee: Dong Li
Priority: Minor


If the QA and PROD instances share the same Hive database, running 
StorageCleanupJob on QA will remove intermediate tables created by PROD, which 
might cause Kylin jobs on PROD to fail.

A solution is to add the metastore name to the Hive table prefix, then filter 
table names by metastore name during the cleanup job.
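
The proposed filtering could look roughly like the sketch below. All names here are assumptions made for illustration: the base prefix string, the way the metastore name is embedded, and the class and method names are hypothetical and do not describe the actual StorageCleanupJob.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class CleanupFilterSketch {
    // Assumed base prefix for intermediate tables (illustrative only).
    static final String BASE_PREFIX = "kylin_intermediate_";

    // Builds the per-instance prefix by embedding the metastore name.
    static String prefixFor(String metastoreName) {
        return BASE_PREFIX + metastoreName + "_";
    }

    // Keeps only the tables that carry this instance's prefix.
    static List<String> tablesToDrop(List<String> allTables, String metastoreName) {
        String prefix = prefixFor(metastoreName);
        return allTables.stream()
            .filter(name -> name.startsWith(prefix))
            .collect(Collectors.toList());
    }

    public static void main(String[] unused) {
        List<String> tables = Arrays.asList(
            "kylin_intermediate_qa_cube_a",
            "kylin_intermediate_prod_cube_a",
            "some_other_table");
        System.out.println(tablesToDrop(tables, "qa"));
    }
}
```

One caveat with plain prefix matching: if one metastore name is itself a prefix of another (for example "qa" and "qa_2"), the filter would still match the other instance's tables, so the naming scheme would need an unambiguous delimiter or fixed-length identifier.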





[jira] [Commented] (KYLIN-2428) Cleanup unnecessary shaded libraries for job/coprocessor/jdbc/server

2017-02-06 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855491#comment-15855491
 ] 

Billy Liu commented on KYLIN-2428:
--

[INFO] Apache Kylin ... SUCCESS [  3.045 s]
[INFO] Apache Kylin - Calcite Overrides ... SUCCESS [  2.161 s]
[INFO] Apache Kylin - Core Common . SUCCESS [  5.515 s]
[INFO] Apache Kylin - Core Metadata ... SUCCESS [ 25.081 s]
[INFO] Apache Kylin - Core Dictionary . SUCCESS [01:14 min]
[INFO] Apache Kylin - Core Cube ... SUCCESS [01:44 min]
[INFO] Apache Kylin - Core Job  SUCCESS [01:58 min]
[INFO] Apache Kylin - Core Storage  SUCCESS [ 34.999 s]
[INFO] Apache Kylin - MapReduce Engine  SUCCESS [ 20.623 s]
[INFO] Apache Kylin - HBase Storage ... SUCCESS [ 12.564 s]
[INFO] Apache Kylin - Spark Engine  SUCCESS [ 13.664 s]
[INFO] Apache Kylin - Hive Source . SUCCESS [  4.314 s]
[INFO] Apache Kylin - Kafka Source  SUCCESS [  3.250 s]
[INFO] Apache Kylin - Query ... SUCCESS [  2.485 s]
[INFO] Apache Kylin - Tool  SUCCESS [  4.199 s]
[INFO] Apache Kylin - REST Server Base  SUCCESS [  6.802 s]
[INFO] Apache Kylin - REST Server . SUCCESS [ 37.196 s]
[INFO] Apache Kylin - JDBC Driver . SUCCESS [  5.646 s]
[INFO] Apache Kylin - Assembly  SUCCESS [  5.158 s]
[INFO] Apache Kylin - Integration Test  SUCCESS [  01:08 h]
[INFO] Apache Kylin - Tomcat Extension  SUCCESS [  2.810 s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 01:16 h
[INFO] Finished at: 2017-02-07T15:27:23+00:00
[INFO] Final Memory: 75M/935M



