[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-26 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-11097:
-
Attachment: HIVE-11097.5.patch

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch, HIVE-11097.2.patch, 
> HIVE-11097.3.patch, HIVE-11097.4.patch, HIVE-11097.5.patch
>
>
> Suppose we run the following SQL:
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3, but the query returns no rows.
> I found the following in HiveInputFormat.pushProjectionsAndFilters:
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
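> It uses startsWith to associate aliases with the split path, so the split 
> read for tm matches two aliases in this case (the path of test_orc_src is a 
> string prefix of the path of test_orc_src2).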
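
The prefix collision described above can be reproduced outside Hive. The following is a minimal sketch, not the HIVE-11097 patch: the class name, the method names, and the warehouse paths are all hypothetical and chosen only for illustration. It contrasts the plain startsWith check with a variant that additionally requires the key to end on a path-component boundary.

```java
class PrefixMatchDemo {
    // Mirrors the quoted check: a plain string-prefix comparison.
    static boolean prefixMatch(String splitPath, String key) {
        return splitPath.startsWith(key);
    }

    // Hypothetical alternative: the key must also end at a '/' boundary
    // (or be the whole path), so "…/test_orc_src" no longer matches
    // a split located under "…/test_orc_src2".
    static boolean componentMatch(String splitPath, String key) {
        if (!splitPath.startsWith(key)) {
            return false;
        }
        return splitPath.length() == key.length()
            || key.endsWith("/")
            || splitPath.charAt(key.length()) == '/';
    }

    public static void main(String[] args) {
        // Assumed layout: one directory per table, data files inside it.
        String src = "/warehouse/test.db/test_orc_src";
        String src2 = "/warehouse/test.db/test_orc_src2";
        String split = src2 + "/000000_0";

        System.out.println(prefixMatch(split, src));     // true (wrong table)
        System.out.println(prefixMatch(split, src2));    // true
        System.out.println(componentMatch(split, src));  // false
        System.out.println(componentMatch(split, src2)); // true
    }
}
```

With the plain prefix check, the split for test_orc_src2 is attributed to both tables, which is consistent with the behaviour reported in the issue.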



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-20 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110204#comment-15110204
 ] 

Wan Chang commented on HIVE-11097:
--

[~prasanth_j] The symlink_text_input_format.q test failure is related to the patch. I 
have fixed it and added some comments to describe the scenario. 


[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-20 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-11097:
-
Attachment: HIVE-11097.4.patch

Update patch


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-19 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106468#comment-15106468
 ] 

Wan Chang commented on HIVE-11097:
--

Hi [~prasanth_j], I use Hive 0.13.1, and the bug occurs with some complex SQL. 
However, I could not reproduce it on the master branch, so I don't know whether 
it has already been fixed. 


[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-18 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-11097:
-
Attachment: HIVE-11097.3.patch


[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-18 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-11097:
-
Attachment: HIVE-11097.2.patch

Update patch


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-18 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104953#comment-15104953
 ] 

Wan Chang commented on HIVE-11097:
--

[~prasanth_j] Thanks for the information. I will update the patch soon.


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-07-03 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613563#comment-14613563
 ] 

Wan Chang commented on HIVE-11097:
--

Hi [~ashutoshc], could you please help review this patch?


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-06-24 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600719#comment-14600719
 ] 

Wan Chang commented on HIVE-11097:
--

Hi [~jvs], could you please help review this?


[jira] [Updated] (HIVE-1903) Can't join HBase tables if one's name is the beginning of the other

2015-06-24 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-1903:

Attachment: (was: HIVE-11097.1.patch)

> Can't join HBase tables if one's name is the beginning of the other
> ---
>
> Key: HIVE-1903
> URL: https://issues.apache.org/jira/browse/HIVE-1903
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Jean-Daniel Cryans
>Assignee: John Sichi
> Fix For: 0.7.0
>
> Attachments: HIVE-1903.1.patch
>
>
> I tried joining two tables, let's call them "table" and "table_a", but I'm 
> seeing an array of errors such as this:
> {noformat}
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>   at java.util.ArrayList.get(ArrayList.java:322)
>   at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:118)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:231)
> {noformat}
> The reason is that HiveInputFormat.pushProjectionsAndFilters matches the 
> aliases with startsWith, so in my case the mappers for "table_a" were getting 
> the columns from "table" as well as their own (and since it had fewer columns, 
> it was trying to get one too far in the array).
> I don't know if just changing it to "equals" will fix it; my guess is it 
> won't, since it may break RCFiles.
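
One way to avoid prefix matching without comparing only the full strings is to look up the split path and then each of its parent directories by exact equality. The sketch below illustrates that idea under stated assumptions: the map keys are directory paths, the split path names a file somewhere beneath one of them, and the class name, method name, paths, and alias names are hypothetical. It is not taken from the HIVE-1903 or HIVE-11097 patches.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ParentWalkDemo {
    // Returns the aliases registered for the split path itself or for its
    // nearest ancestor directory, using exact key equality at every step.
    static List<String> aliasesFor(String splitPath,
                                   Map<String, List<String>> pathToAliases) {
        String p = splitPath;
        while (!p.isEmpty()) {
            List<String> aliases = pathToAliases.get(p);
            if (aliases != null) {
                return aliases;
            }
            int slash = p.lastIndexOf('/');
            if (slash < 0) {
                break;
            }
            p = p.substring(0, slash); // step up to the parent directory
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Map<String, List<String>> pathToAliases = new HashMap<>();
        pathToAliases.put("/warehouse/table", Arrays.asList("t"));
        pathToAliases.put("/warehouse/table_a", Arrays.asList("a"));

        System.out.println(aliasesFor("/warehouse/table_a/part-00000", pathToAliases)); // [a]
        System.out.println(aliasesFor("/warehouse/table/part-00000", pathToAliases));   // [t]
    }
}
```

Comparing the whole split path to a key with equals alone would find nothing in this layout, because the split names a file below the directory; walking up the parents keeps the comparison exact while still resolving the enclosing directory.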





[jira] [Updated] (HIVE-1903) Can't join HBase tables if one's name is the beginning of the other

2015-06-24 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-1903:

Attachment: HIVE-11097.1.patch


[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-06-24 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HIVE-11097:
-
Attachment: HIVE-11097.1.patch

Attach patch file