[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.4.patch

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>
> Environment: Hive on Spark engine
> Steps to reproduce:
> {code}
> create table a1(KEHHAO string, START_DT string) partitioned by (END_DT string);
> create table a2(KEHHAO string, START_DT string) partitioned by (END_DT string);
> alter table a1 add partition(END_DT='20161020');
> alter table a1 add partition(END_DT='20161021');
> insert into table a1 partition(END_DT='20161020') values('2000721360','20161001');
> SELECT T1.KEHHAO,COUNT(1) FROM ( 
> SELECT KEHHAO FROM a1 T 
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND T.END_DT-1 
> UNION ALL 
> SELECT KEHHAO FROM a2 T
> WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND T.END_DT-1 
> ) T1 
> GROUP BY T1.KEHHAO 
> HAVING COUNT(1)>1; 
> +-------------+------+--+
> |  t1.kehhao  | _c1  |
> +-------------+------+--+
> | 2000721360  | 2    |
> +-------------+------+--+
> {code}
> The query should return no records.
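
To make the expected result concrete, here is a minimal sketch that replays the query in plain Python rather than Hive. It assumes the BETWEEN predicate is evaluated numerically (an assumption about Hive's implicit casts for `T.END_DT-1`), and the helper names `branch` and `query` are hypothetical. It compares the case where each UNION ALL branch reads its own table with the case where the scan of a2 is replaced by the scan of a1.

```python
from collections import Counter

# Rows are (KEHHAO, START_DT, END_DT); END_DT is the partition column.
a1 = [("2000721360", "20161001", "20161020")]
a2 = []  # a2 has no partitions and no rows


def branch(rows):
    """One UNION ALL branch: filter on KEHHAO and the BETWEEN predicate."""
    return [
        kehhao
        for kehhao, start_dt, end_dt in rows
        if kehhao == "2000721360"
        # Assumption: the comparison is numeric, as END_DT-1 suggests.
        and int(start_dt) <= 20161018 <= int(end_dt) - 1
    ]


def query(left, right):
    """UNION ALL of both branches, GROUP BY KEHHAO, HAVING COUNT(1) > 1."""
    counts = Counter(branch(left) + branch(right))
    return [(key, count) for key, count in counts.items() if count > 1]


expected = query(a1, a2)  # each branch scans its own table
observed = query(a1, a1)  # the a2 branch wrongly reuses the a1 scan

print(expected)
print(observed)
```

With these inputs `expected` is empty, while `observed` holds the single row `('2000721360', 2)`, which matches the wrong output shown in the description.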



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: (was: HIVE-15239.4.patch)

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.4.patch

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: (was: HIVE-15239.4.patch)

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.4.patch

Thanks [~xuefuz] for the review. Updated the patch to v4 to address your comment.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch, HIVE-15239.4.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: (was: HIVE-15239.3.patch)

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-12-01 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.3.patch

Try again

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-30 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.3.patch

[~xuefuz] I see your point. I updated the patch to address your comment. I also
moved the compare logic into the respective classes. I will create an RB for
the v3 patch.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch, 
> HIVE-15239.3.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-24 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.2.patch

Fixed the NPE and addressed Xuefu's comments.
I also added the example from the description as a qtest, which should run only
against Spark.
[~csun], it seems you added {{spark.only.query.files}}, but are those files
picked up by our test framework? I noticed that TestSparkCliDriver only includes
{{spark.query.files}} and TestMiniSparkOnYarnCliDriver only includes
{{miniSparkOnYarn.query.files}}.

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch, HIVE-15239.2.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-23 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Status: Patch Available  (was: Open)

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.0, 1.2.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-23 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15239:
--
Attachment: HIVE-15239.1.patch

Thanks [~wenli] for reporting the issue.
The problem is that we can't really tell whether two TS (TableScan operators)
are equivalent just because they have the same schema, alias, etc. So the patch
checks the path-to-partition info of the MapWorks before checking operators.
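
The idea can be sketched as follows. This is a simplified illustration in Python, not the code in the patch: `TableScan`, `MapWork`, `operators_equal`, `works_equivalent`, and the warehouse path are hypothetical stand-ins, and the real Hive classes carry far more state.

```python
from dataclasses import dataclass


@dataclass
class TableScan:
    """Hypothetical stand-in for a TableScan operator."""
    alias: str
    schema: tuple


@dataclass
class MapWork:
    """Hypothetical stand-in for a MapWork: its inputs plus its root scan."""
    path_to_partition: dict  # input path -> partition spec
    scan: TableScan


def operators_equal(x, y):
    """Operator-only comparison: schema, alias, etc."""
    return x.scan == y.scan


def works_equivalent(x, y):
    """Compare the inputs first, and only then compare the operators."""
    if x.path_to_partition != y.path_to_partition:
        return False
    return operators_equal(x, y)


# Both subqueries use alias T and the same schema, but read different tables.
schema = ("kehhao", "start_dt", "end_dt")
work_a1 = MapWork(
    {"/warehouse/a1/end_dt=20161020": "END_DT=20161020"},  # hypothetical path
    TableScan("t", schema),
)
work_a2 = MapWork({}, TableScan("t", schema))  # a2 has no partitions

print(operators_equal(work_a1, work_a2))
print(works_equivalent(work_a1, work_a2))
```

Here the operator-only comparison treats the two works as equal, while the comparison that looks at the inputs first keeps them apart, so they would not be combined.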

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15239.1.patch
>
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-17 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-15239:
---
Description: 
env: hive on spark engine
reproduce step:
{code}
create table a1(KEHHAO string, START_DT string) partitioned by (END_DT string);
create table a2(KEHHAO string, START_DT string) partitioned by (END_DT string);

alter table a1 add partition(END_DT='20161020');
alter table a1 add partition(END_DT='20161021');

insert into table a1 partition(END_DT='20161020') 
values('2000721360','20161001');


SELECT T1.KEHHAO,COUNT(1) FROM ( 
SELECT KEHHAO FROM a1 T 
WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND T.END_DT-1 
UNION ALL 
SELECT KEHHAO FROM a2 T
WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND T.END_DT-1 
) T1 
GROUP BY T1.KEHHAO 
HAVING COUNT(1)>1; 

+-------------+------+--+
|  t1.kehhao  | _c1  |
+-------------+------+--+
| 2000721360  | 2    |
+-------------+------+--+
{code}

the result should be none record

  was:
env: hive on spark engine
reproduce step:

create table a1(KEHHAO string, START_DT string) partitioned by (END_DT string);
create table a2(KEHHAO string, START_DT string) partitioned by (END_DT string);

alter table a1 add partition(END_DT='20161020');
alter table a1 add partition(END_DT='20161021');

insert into table a1 partition(END_DT='20161020') 
values('2000721360','20161001');


SELECT T1.KEHHAO,COUNT(1) FROM ( 
SELECT KEHHAO FROM a1 T 
WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND T.END_DT-1 
UNION ALL 
SELECT KEHHAO FROM a2 T
WHERE T.KEHHAO = '2000721360' AND '20161018' BETWEEN T.START_DT AND T.END_DT-1 
) T1 
GROUP BY T1.KEHHAO 
HAVING COUNT(1)>1; 

+-------------+------+--+
|  t1.kehhao  | _c1  |
+-------------+------+--+
| 2000721360  | 2    |
+-------------+------+--+


the result should be none record


> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>


[jira] [Updated] (HIVE-15239) hive on spark combine equivalentwork get wrong result because of tablescan operation compare

2016-11-17 Thread wangwenli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwenli updated HIVE-15239:
-
Summary: hive on spark combine equivalentwork get wrong result because of  
tablescan operation compare  (was: hive on spark combine equivalentwork get 
wrong result because of  tablescan operaton compare)

> hive on spark combine equivalentwork get wrong result because of  tablescan 
> operation compare
> -
>
> Key: HIVE-15239
> URL: https://issues.apache.org/jira/browse/HIVE-15239
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.0, 2.1.0
>Reporter: wangwenli
>