[jira] [Resolved] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov resolved HIVE-15433.
---------------------------------
    Resolution: Invalid

Resolved as invalid due to HIVE-16392.

> setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in
> hive configuration
> -------------------------------------------------------------------------
>
>                 Key: HIVE-15433
>                 URL: https://issues.apache.org/jira/browse/HIVE-15433
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 1.0.0, 1.2.0, 2.0.0
>            Reporter: Alina Abramova
>            Assignee: Vlad Gudikov
>             Fix For: 1.2.0, 3.0.0
>
>         Attachments: HIVE-15433-branch-1.2.patch, HIVE-15433.1.patch
>
>
> Setting hive.warehouse.subdir.inherit.perms in HIVE won't have any effect. It
> will always take the default value from HiveConf unless you define it in
> hive-site.xml.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
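The precedence problem the issue describes can be illustrated with a toy model (plain Python, not HiveConf code; all names here are hypothetical): a session-level `SET` should win over both the hive-site.xml value and the compiled-in default, but the reported behavior is that the default wins whenever hive-site.xml is silent.

```python
# Toy model of configuration precedence (hypothetical; not actual HiveConf code).
# Expected lookup order: session SET > hive-site.xml > compiled-in default.
DEFAULTS = {"hive.warehouse.subdir.inherit.perms": "true"}

def lookup(key, site_xml, session_overrides):
    """Correct precedence: a session-level SET wins over site config and defaults."""
    if key in session_overrides:
        return session_overrides[key]
    if key in site_xml:
        return site_xml[key]
    return DEFAULTS[key]

def buggy_lookup(key, site_xml, session_overrides):
    """Behavior reported in HIVE-15433: the session override is only consulted
    when the key is also present in hive-site.xml, so SET alone has no effect."""
    if key in site_xml:
        return session_overrides.get(key, site_xml[key])
    return DEFAULTS[key]

key = "hive.warehouse.subdir.inherit.perms"
session = {key: "false"}  # user ran: SET hive.warehouse.subdir.inherit.perms=false;
print(lookup(key, {}, session))        # false -- override respected
print(buggy_lookup(key, {}, session))  # true  -- default wins, SET ignored
```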
[jira] [Updated] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-15433:
--------------------------------
    Fix Version/s: 3.0.0
[jira] [Updated] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-15433:
--------------------------------
    Status: In Progress  (was: Patch Available)
[jira] [Assigned] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov reassigned HIVE-15433:
-----------------------------------
    Assignee: Vlad Gudikov  (was: Alina Abramova)
[jira] [Comment Edited] (HIVE-11309) Replace PidDailyRollingFileAppender with equivalent log4j2 implementation
[ https://issues.apache.org/jira/browse/HIVE-11309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145356#comment-16145356 ]

Vlad Gudikov edited comment on HIVE-11309 at 8/29/17 2:12 PM:
--------------------------------------------------------------
I think it doesn't work correctly. I am adding the pid to the file pattern:
{code}
appender.DRFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%d{-MM-dd-HH-mm}.%pid
{code}
When it's time to roll, it just takes the whole hive.log and renames it as described above. hive.log contains logs from different services, so the logs created on roll do not contain information about the processes they were created for.

was (Author: allgoodok):
I think it doesn't work correctly. I am adding pid to file pattern.
{code}
appender.DRFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%d{-MM-dd-HH-mm}.%pid
{code}
When it's time to roll it just takes whole hive.log and renames it as described above. hive.log contains logs from different services so the logs create on roll are not containing information about processes they've been created for.

> Replace PidDailyRollingFileAppender with equivalent log4j2 implementation
> -------------------------------------------------------------------------
>
>                 Key: HIVE-11309
>                 URL: https://issues.apache.org/jira/browse/HIVE-11309
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-11309.patch
>
>
> PidDailyRollingFileAppender appends pid@hostname information to the file name
> output. A similar thing can be achieved by adding a custom file pattern
> converter in log4j2.
[jira] [Comment Edited] (HIVE-11309) Replace PidDailyRollingFileAppender with equivalent log4j2 implementation
[ https://issues.apache.org/jira/browse/HIVE-11309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145356#comment-16145356 ]

Vlad Gudikov edited comment on HIVE-11309 at 8/29/17 2:12 PM:
--------------------------------------------------------------
I think it doesn't work correctly. I am adding pid to file pattern.
{code}
appender.DRFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%d{-MM-dd-HH-mm}.%pid
{code}
When it's time to roll it just takes whole hive.log and renames it as described above. hive.log contains logs from different services so the logs create on roll are not containing information about processes they've been created for.

was (Author: allgoodok):
I think it doesn't work correctly. I am adding pid to file pattern.
appender.DRFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%d{-MM-dd-HH-mm}.%pid
When it's time to roll it just takes whole hive.log and renames it as described above. hive.log contains logs from different services so the logs create on roll are not containing information about processes they've been created for.
[jira] [Commented] (HIVE-11309) Replace PidDailyRollingFileAppender with equivalent log4j2 implementation
[ https://issues.apache.org/jira/browse/HIVE-11309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145356#comment-16145356 ]

Vlad Gudikov commented on HIVE-11309:
-------------------------------------
I think it doesn't work correctly. I am adding pid to file pattern.
{code}
appender.DRFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%d{-MM-dd-HH-mm}.%pid
{code}
When it's time to roll it just takes whole hive.log and renames it as described above. hive.log contains logs from different services so the logs create on roll are not containing information about processes they've been created for.
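For reference, a log4j2 properties sketch of the kind of per-process rolling appender under discussion might look like the following. This is an illustrative assumption, not a tested Hive configuration: the appender name and paths are hypothetical, the date pattern includes a year here purely for illustration, and `%pid` is the custom pattern converter HIVE-11309 proposes, not a stock log4j2 converter.

```properties
# Hypothetical sketch of a log4j2 rolling appender whose rolled files carry a pid
# via a custom %pid pattern converter (as proposed in HIVE-11309); not a tested config.
appender.DRFA.type = RollingRandomAccessFile
appender.DRFA.name = DRFA
appender.DRFA.fileName = ${sys:hive.log.dir}/${sys:hive.log.file}
# Note the limitation described in the comment above: if every service writes to
# the same hive.log, a roll just renames that one shared file, so the pid in the
# rolled name cannot separate log lines that were interleaved while writing.
appender.DRFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%d{yyyy-MM-dd-HH-mm}.%pid
appender.DRFA.layout.type = PatternLayout
appender.DRFA.layout.pattern = %d{ISO8601} %-5p [%t]: %c{2} - %m%n
appender.DRFA.policies.type = Policies
appender.DRFA.policies.time.type = TimeBasedTriggeringPolicy
appender.DRFA.policies.time.interval = 1
```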
[jira] [Commented] (HIVE-17346) TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning] is failing every time
[ https://issues.apache.org/jira/browse/HIVE-17346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131009#comment-16131009 ]

Vlad Gudikov commented on HIVE-17346:
-------------------------------------
Yeah, it was an intended change, missed this one. Thanks!

> TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning] is failing
> every time
> ------------------------------------------------------------------------
>
>                 Key: HIVE-17346
>                 URL: https://issues.apache.org/jira/browse/HIVE-17346
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Test
>    Affects Versions: 3.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>         Attachments: HIVE-17346.patch
>
>
> The TestMiniSparkOnYarnCliDriver.testCliDriver -
> spark_dynamic_partition_pruning test is failing with this diff:
> {code}
> Client Execution succeeded but contained differences (error code = 1) after executing spark_dynamic_partition_pruning.q
> 714c714
> <   filterExpr: ((date = '2008-04-08') and abs(((- UDFToLong(concat(UDFToString(day(ds)), '0'))) + 10)) is not null) (type: boolean)
> ---
> >   filterExpr: ((date = '2008-04-08') and ds is not null) (type: boolean)
> 717c717
> <   predicate: ((date = '2008-04-08') and abs(((- UDFToLong(concat(UDFToString(day(ds)), '0'))) + 10)) is not null) (type: boolean)
> ---
> >   predicate: ((date = '2008-04-08') and ds is not null) (type: boolean)
> 749c749
> <   filterExpr: abs(((- UDFToLong(concat(UDFToString(day(ds)), '0'))) + 10)) is not null (type: boolean)
> ---
> >   filterExpr: ds is not null (type: boolean)
> 751,752c751,753
> <   Filter Operator
> <     predicate: abs(((- UDFToLong(concat(UDFToString(day(ds)), '0'))) + 10)) is not null (type: boolean)
> ---
> >   Select Operator
> >     expressions: ds (type: string)
> >     outputColumnNames: _col0
> 754,756c755,758
> <   Select Operator
> <     expressions: ds (type: string)
> <     outputColumnNames: _col0
> ---
> >   Reduce Output Operator
> >     key expressions: abs(((- UDFToLong(concat(UDFToString(day(_col0)), '0'))) + 10)) (type: bigint)
> >     sort order: +
> >     Map-reduce partition columns: abs(((- UDFToLong(concat(UDFToString(day(_col0)), '0'))) + 10)) (type: bigint)
> 758,762d759
> <   Reduce Output Operator
> <     key expressions: abs(((- UDFToLong(concat(UDFToString(day(_col0)), '0'))) + 10)) (type: bigint)
> <     sort order: +
> <     Map-reduce partition columns: abs(((- UDFToLong(concat(UDFToString(day(_col0)), '0'))) + 10)) (type: bigint)
> <     Statistics: Num rows: 2000 Data size: 21248 Basic stats: COMPLETE Column stats: NONE
> 767c764
> <
> Output was too long and had to be truncated...
> {code}
> I think it is caused by:
> HIVE-17148 - Incorrect result for Hive join query with COALESCE in WHERE condition
> [~allgoodok]: Am I right? Is it an intended change and only the golden file
> regeneration is needed?
> Thanks,
> Peter
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-17148:
--------------------------------
    Attachment: HIVE-17148.3.patch

> Incorrect result for Hive join query with COALESCE in WHERE condition
> ---------------------------------------------------------------------
>
>                 Key: HIVE-17148
>                 URL: https://issues.apache.org/jira/browse/HIVE-17148
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 2.1.1
>            Reporter: Vlad Gudikov
>            Assignee: Vlad Gudikov
>         Attachments: HIVE-17148.1.patch, HIVE-17148.2.patch, HIVE-17148.3.patch, HIVE-17148.patch
>
>
> The issue exists in Hive-2.1. In Hive-1.2 the query works fine with CBO enabled.
> STEPS TO REPRODUCE:
> {code}
> Step 1: Create a table ct1
> create table ct1 (a1 string, b1 string);
> Step 2: Create a table ct2
> create table ct2 (a2 string);
> Step 3: Insert the following data into table ct1
> insert into table ct1 (a1) values ('1');
> Step 4: Insert the following data into table ct2
> insert into table ct2 (a2) values ('1');
> Step 5: Execute the following query
> select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2;
> {code}
> ACTUAL RESULT:
> {code}
> The query returns nothing.
> {code}
> EXPECTED RESULT:
> {code}
> 1    NULL    1
> {code}
> The issue seems to be caused by an incorrect query plan. In the plan we can see:
> predicate:(a1 is not null and b1 is not null)
> which does not look correct. As a result, it filters out all the rows where
> any column mentioned in the COALESCE has a null value.
> Please find the query plan below:
> {code}
> Plan optimized by CBO.
>
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
>
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Map 1
>         File Output Operator [FS_10]
>           Map Join Operator [MAPJOIN_15] (rows=1 width=4)
>             Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"]
>           <-Map 2 [BROADCAST_EDGE]
>             BROADCAST [RS_7]
>               PartitionCols:_col0
>               Select Operator [SEL_5] (rows=1 width=1)
>                 Output:["_col0"]
>                 Filter Operator [FIL_14] (rows=1 width=1)
>                   predicate:a2 is not null
>                   TableScan [TS_3] (rows=1 width=1)
>                     default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"]
>           <-Select Operator [SEL_2] (rows=1 width=4)
>               Output:["_col0","_col1"]
>               Filter Operator [FIL_13] (rows=1 width=4)
>                 predicate:(a1 is not null and b1 is not null)
>                 TableScan [TS_0] (rows=1 width=4)
>                   default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"]
> {code}
> This happens only if the join is an inner join; otherwise HiveJoinAddNotNullRule,
> which creates this problem, is skipped.
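The effect of the wrongly inferred predicate can be reproduced outside Hive with a small relational simulation (plain Python, not Hive code; the helper names are my own): pre-filtering the left side with `a1 IS NOT NULL AND b1 IS NOT NULL` drops the expected row, while requiring only that the join key `COALESCE(a1, b1)` itself be non-null keeps it.

```python
# Plain-Python simulation (not Hive code) of the predicate-inference bug.
ct1 = [("1", None)]   # rows of (a1, b1), as inserted in the reproduction steps
ct2 = [("1",)]        # rows of (a2,)

def coalesce(*args):
    """SQL COALESCE: first non-null argument, else None."""
    return next((a for a in args if a is not None), None)

def join(left, right):
    """Inner join on COALESCE(a1, b1) = a2."""
    return [l + r for l in left for r in right if coalesce(l[0], l[1]) == r[0]]

# Wrong inference (what the plan's FIL_13 shows): every COALESCE argument non-null.
wrong = join([r for r in ct1 if r[0] is not None and r[1] is not None], ct2)

# Correct inference: only the join key itself must be non-null.
correct = join([r for r in ct1 if coalesce(r[0], r[1]) is not None], ct2)

print(wrong)    # [] -- the row is filtered out before the join ever runs
print(correct)  # [('1', None, '1')] -- matches the expected result 1  NULL  1
```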
[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119531#comment-16119531 ]

Vlad Gudikov commented on HIVE-17148:
-------------------------------------
Uploaded new patch with fixed tests.
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-17148:
--------------------------------
    Attachment: HIVE-17148.2.patch
[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119445#comment-16119445 ]

Vlad Gudikov commented on HIVE-17148:
-------------------------------------
[~ashutoshc] Today I will upload another patch with the related tests fixed.
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-17148:
--------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-17148:
--------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112242#comment-16112242 ]

Vlad Gudikov commented on HIVE-17148:
-------------------------------------
Added patch with test case.
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vlad Gudikov updated HIVE-17148:
--------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Attachment: HIVE-17148.1.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104919#comment-16104919 ] Vlad Gudikov edited comment on HIVE-17148 at 7/28/17 1:08 PM: -- ROOT CAUSE: The problem is in the predicates created by HiveJoinAddNotNullRule. This rule creates a not-null predicate for every field that takes part in the join filter, regardless of whether the field is used directly or only as an argument of a function. SOLUTION: Create the predicate over the expressions that take part in the filter, not just over the fields. The point is to check that the left part and the right part of the filter as a whole are not null. E.g., given two tables *test1(a1 int, a2 int)* and *test2(b1)*, executing *select * from test1 t1 inner join test2 t2 on (COALESCE(a1,a2)=b1);* produces two predicates for the filter operator: b1 is not null --- right part; a1 is not null and a2 is not null --- left part. Applying the left-part predicate results in data loss, because it excludes rows with null fields. COALESCE is a good example of this case, since the main purpose of COALESCE is to pick non-null values. To fix the data loss we need to check that the COALESCE result as a whole is not null, since we cannot join on nulls. With the fix the two parts look like: b1 is not null --- right part (fields are still checked for null); COALESCE(a1,a2) is not null --- left part (the whole expression is checked for null). In the next patch I am going to update the related failed tests with the fixed stage plans.
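The fix described in this comment, emitting one IS NOT NULL conjunct per whole join-key expression instead of one per referenced column, can be sketched as follows. This is plain Python over a toy expression representation; the tuple-based expression encoding and the function name are hypothetical stand-ins, not Hive's actual Calcite rule API:

```python
# Hypothetical expression tree: a join key is either a bare column name (str)
# or a function call over columns, encoded as ("FUNC", arg1, arg2, ...).
def not_null_predicate(join_key):
    """Build the IS NOT NULL conjunct for one join-key expression.

    Buggy behavior: one conjunct per referenced column, which over-filters.
    Fixed behavior (sketched here): a single conjunct over the whole expression.
    """
    if isinstance(join_key, tuple):            # function call over columns
        fn, *args = join_key
        expr = "%s(%s)" % (fn, ", ".join(args))
    else:                                      # bare column reference
        expr = join_key
    return "%s is not null" % expr

# Left and right sides of COALESCE(a1, a2) = b1:
print(not_null_predicate(("COALESCE", "a1", "a2")))  # COALESCE(a1, a2) is not null
print(not_null_predicate("b1"))                      # b1 is not null
```

For a bare column the output is unchanged from the old behavior; only composite join keys get the new whole-expression check, which is why plans without functions in the join condition are unaffected.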
[jira] [Comment Edited] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104919#comment-16104919 ] Vlad Gudikov edited comment on HIVE-17148 at 7/28/17 1:07 PM
[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104919#comment-16104919 ] Vlad Gudikov commented on HIVE-17148: - ROOT CAUSE: The problem is in the predicates created by HiveJoinAddNotNullRule. This rule creates a not-null predicate for every field that takes part in the join filter, regardless of whether the field is used directly or only as an argument of a function. SOLUTION: Create the predicate over the expressions that take part in the filter, not just over the fields. The point is to check that the left part and the right part of the filter as a whole are not null. E.g., given two tables test1(a1 int, a2 int) and test2(b1), executing *select * from test1 t1 inner join test2 t2 on (COALESCE(a1,a2)=b1);* produces two predicates for the filter operator: b1 is not null --- right part; a1 is not null and a2 is not null --- left part. Applying the left-part predicate results in data loss, because it excludes rows with null fields. COALESCE is a good example of this case, since the main purpose of COALESCE is to pick non-null values. To fix the data loss we need to check that the COALESCE result as a whole is not null, since we cannot join on nulls. With the fix the two parts look like: b1 is not null --- right part (fields are still checked for null); COALESCE(a1,a2) is not null --- left part (the whole expression is checked for null). In the next patch I am going to update the related failed tests with the fixed stage plans. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104534#comment-16104534 ] Vlad Gudikov commented on HIVE-17148: - The related test failures are due to changes in the plan: we now create not-null conjunctions not only for the fields in the filter but for the expressions in the filter as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Attachment: HIVE-17148.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov reassigned HIVE-17148: --- Assignee: Vlad Gudikov -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Description: The issue exists in Hive-2.1. In Hive-1.2 the same query works fine with CBO enabled. STEPS TO REPRODUCE: {code} Step 1: Create a table ct1 create table ct1 (a1 string, b1 string); Step 2: Create a table ct2 create table ct2 (a2 string); Step 3: Insert the following data into table ct1 insert into table ct1 (a1) values ('1'); Step 4: Insert the following data into table ct2 insert into table ct2 (a2) values ('1'); Step 5: Execute the following query select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; {code} ACTUAL RESULT: {code} The query returns nothing. {code} EXPECTED RESULT: {code} 1 NULL 1 {code} The issue appears to be caused by an incorrect query plan. In the plan we can see: predicate:(a1 is not null and b1 is not null) which is not correct: it filters out every row in which any column mentioned in the COALESCE has a null value. Please find the query plan below: {code} Plan optimized by CBO. Vertex dependency in root stage Map 1 <- Map 2 (BROADCAST_EDGE) Stage-0 Fetch Operator limit:-1 Stage-1 Map 1 File Output Operator [FS_10] Map Join Operator [MAPJOIN_15] (rows=1 width=4) Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] <-Map 2 [BROADCAST_EDGE] BROADCAST [RS_7] PartitionCols:_col0 Select Operator [SEL_5] (rows=1 width=1) Output:["_col0"] Filter Operator [FIL_14] (rows=1 width=1) predicate:a2 is not null TableScan [TS_3] (rows=1 width=1) default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] <-Select Operator [SEL_2] (rows=1 width=4) Output:["_col0","_col1"] Filter Operator [FIL_13] (rows=1 width=4) predicate:{color:red}(a1 is not null and b1 is not null){color} TableScan [TS_0] (rows=1 width=4) default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] {code} This happens only if the join is an inner join; otherwise HiveJoinAddNotNullRule, which creates this problem, is skipped.
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Description: The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo enabled: STEPS TO REPRODUCE: {code} Step 1: Create a table ct1 create table ct1 (a1 string,b1 string); Step 2: Create a table ct2 create table ct2 (a2 string); Step 3 : Insert the following data into table ct1 insert into table ct1 (a1) values ('1'); Step 4 : Insert the following data into table ct2 insert into table ct2 (a2) values ('1'); Step 5 : Execute the following query select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; {code} ACTUAL RESULT: {code} The query returns nothing; {code} EXPECTED RESULT: {code} 1 NULL1 {code} The issue seems to be caused by an incorrect query plan. In the plan we can see: predicate:(a1 is not null and b1 is not null) which does not look correct. As a result, it filters out all the rows if any column mentioned in the COALESCE has a null value. Please find the query plan below: {code} Plan optimized by CBO. 
Vertex dependency in root stage Map 1 <- Map 2 (BROADCAST_EDGE) Stage-0 Fetch Operator limit:-1 Stage-1 Map 1 File Output Operator [FS_10] Map Join Operator [MAPJOIN_15] (rows=1 width=4) Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] <-Map 2 [BROADCAST_EDGE] BROADCAST [RS_7] PartitionCols:_col0 Select Operator [SEL_5] (rows=1 width=1) Output:["_col0"] Filter Operator [FIL_14] (rows=1 width=1) predicate:a2 is not null TableScan [TS_3] (rows=1 width=1) default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] <-Select Operator [SEL_2] (rows=1 width=4) Output:["_col0","_col1"] Filter Operator [FIL_13] (rows=1 width=4) predicate:(a1 is not null and b1 is not null) TableScan [TS_0] (rows=1 width=4) default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] {code} This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, which creates this problem, is skipped. was: The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo enabled: STEPS TO REPRODUCE: {code} Step 1: Create a table ct1 create table ct1 (a1 string,b1 string); Step 2: Create a table ct2 create table ct2 (a2 string); Step 3 : Insert the following data into table ct1 insert into table ct1 (a1) values ('1'); Step 4 : Insert the following data into table ct2 insert into table ct2 (a2) values ('1'); Step 5 : Execute the following query select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; {code} ACTUAL RESULT: {code} The query returns nothing; {code} EXPECTED RESULT: {code} 1 NULL1 {code} The issue seems to be caused by an incorrect query plan. In the plan we can see: predicate:(a1 is not null and b1 is not null) which does not look correct. As a result, it filters out all the rows if any column mentioned in the COALESCE has a null value. Please find the query plan below: {code} Plan optimized by CBO. 
Vertex dependency in root stage Map 1 <- Map 2 (BROADCAST_EDGE) Stage-0 Fetch Operator limit:-1 Stage-1 Map 1 File Output Operator [FS_10] Map Join Operator [MAPJOIN_15] (rows=1 width=4) Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] <-Map 2 [BROADCAST_EDGE] BROADCAST [RS_7] PartitionCols:_col0 Select Operator [SEL_5] (rows=1 width=1) Output:["_col0"] Filter Operator [FIL_14] (rows=1 width=1) predicate:a2 is not null TableScan [TS_3] (rows=1 width=1) default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] <-Select Operator [SEL_2] (rows=1 width=4) Output:["_col0","_col1"] Filter Operator [FIL_13] (rows=1 width=4) predicate:(a1 is not null and b1 is not null) TableScan [TS_0] (rows=1 width=4) default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] {code} This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, which creates this problem, is skipped. > Incorrect result for Hive join query with COALESCE in WHERE condition > - > > Key: HIVE-17148 > URL: https://issues.apache.org/jira/browse/HIVE-17148 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.1 >Reporter: Vlad Gudikov > > The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo > enabled: > STEPS TO REPRODUCE: > {code} > S
[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096357#comment-16096357 ] Vlad Gudikov commented on HIVE-17148: - While optimizing the query, HiveJoinAddNotRule adds a check that the values used in the join filter are not null. The problem is that COALESCE is designed to work with null values, so tuples containing nulls are incorrectly omitted. > Incorrect result for Hive join query with COALESCE in WHERE condition > - > > Key: HIVE-17148 > URL: https://issues.apache.org/jira/browse/HIVE-17148 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.1 >Reporter: Vlad Gudikov > > The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo > enabled: > STEPS TO REPRODUCE: > {code} > Step 1: Create a table ct1 > create table ct1 (a1 string,b1 string); > Step 2: Create a table ct2 > create table ct2 (a2 string); > Step 3 : Insert the following data into table ct1 > insert into table ct1 (a1) values ('1'); > Step 4 : Insert the following data into table ct2 > insert into table ct2 (a2) values ('1'); > Step 5 : Execute the following query > select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; > {code} > ACTUAL RESULT: > {code} > The query returns nothing; > {code} > EXPECTED RESULT: > {code} > 1 NULL1 > {code} > The issue seems to be caused by an incorrect query plan. In the plan we can > see: > predicate:(a1 is not null and b1 is not null) > which does not look correct. As a result, it filters out all the rows if > any column mentioned in the COALESCE has a null value. > Please find the query plan below: > {code} > Plan optimized by CBO. 
> Vertex dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 > File Output Operator [FS_10] > Map Join Operator [MAPJOIN_15] (rows=1 width=4) > > Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] > <-Map 2 [BROADCAST_EDGE] > BROADCAST [RS_7] > PartitionCols:_col0 > Select Operator [SEL_5] (rows=1 width=1) > Output:["_col0"] > Filter Operator [FIL_14] (rows=1 width=1) > predicate:a2 is not null > TableScan [TS_3] (rows=1 width=1) > default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] > <-Select Operator [SEL_2] (rows=1 width=4) > Output:["_col0","_col1"] > Filter Operator [FIL_13] (rows=1 width=4) > predicate:(a1 is not null and b1 is not null) > TableScan [TS_0] (rows=1 width=4) > default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] > {code} > This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, which > creates this problem, is skipped. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
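The comment above can be illustrated outside Hive. The following sketch (plain Python, not Hive internals; the table contents are taken from the reproduction steps in the report) shows why a NOT NULL filter on each COALESCE argument is stronger than what the join condition actually requires:

```python
# Illustrative sketch: simulate the reported join to show that the inferred
# filter "a1 IS NOT NULL AND b1 IS NOT NULL" is unsound for a join key of
# COALESCE(a1, b1). Tables mirror the reproduction steps.

ct1 = [("1", None)]   # from: insert into table ct1 (a1) values ('1')
ct2 = [("1",)]        # from: insert into table ct2 (a2) values ('1')

def coalesce(*args):
    """Return the first non-null argument, like SQL COALESCE."""
    return next((a for a in args if a is not None), None)

# Correct semantics: join on COALESCE(a1, b1) = a2 directly.
correct = [(a1, b1, a2) for (a1, b1) in ct1 for (a2,) in ct2
           if coalesce(a1, b1) == a2]

# What the buggy plan does: first drop rows where a1 OR b1 is null,
# then join. The only row of ct1 has b1 = NULL, so it never reaches the join.
filtered_ct1 = [(a1, b1) for (a1, b1) in ct1
                if a1 is not None and b1 is not None]
buggy = [(a1, b1, a2) for (a1, b1) in filtered_ct1 for (a2,) in ct2
         if coalesce(a1, b1) == a2]

print(correct)  # [('1', None, '1')] -- the expected "1 NULL 1" row
print(buggy)    # [] -- matches the reported empty result
```

A sound rewrite may only assert that COALESCE(a1, b1) as a whole is not null, not that each argument is.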
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Environment: (was: {color:red}colored text{color}) > Incorrect result for Hive join query with COALESCE in WHERE condition > - > > Key: HIVE-17148 > URL: https://issues.apache.org/jira/browse/HIVE-17148 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.1 >Reporter: Vlad Gudikov > > The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo > enabled: > STEPS TO REPRODUCE: > {code} > Step 1: Create a table ct1 > create table ct1 (a1 string,b1 string); > Step 2: Create a table ct2 > create table ct2 (a2 string); > Step 3 : Insert the following data into table ct1 > insert into table ct1 (a1) values ('1'); > Step 4 : Insert the following data into table ct2 > insert into table ct2 (a2) values ('1'); > Step 5 : Execute the following query > select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; > {code} > ACTUAL RESULT: > {code} > The query returns nothing; > {code} > EXPECTED RESULT: > {code} > 1 NULL1 > {code} > The issue seems to be caused by an incorrect query plan. In the plan we can > see: > predicate:(a1 is not null and b1 is not null) > which does not look correct. As a result, it filters out all the rows if > any column mentioned in the COALESCE has a null value. > Please find the query plan below: > {code} > Plan optimized by CBO. 
> Vertex dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 > File Output Operator [FS_10] > Map Join Operator [MAPJOIN_15] (rows=1 width=4) > > Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] > <-Map 2 [BROADCAST_EDGE] > BROADCAST [RS_7] > PartitionCols:_col0 > Select Operator [SEL_5] (rows=1 width=1) > Output:["_col0"] > Filter Operator [FIL_14] (rows=1 width=1) > predicate:a2 is not null > TableScan [TS_3] (rows=1 width=1) > default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] > <-Select Operator [SEL_2] (rows=1 width=4) > Output:["_col0","_col1"] > Filter Operator [FIL_13] (rows=1 width=4) > predicate:(a1 is not null and b1 is not null) > TableScan [TS_0] (rows=1 width=4) > default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] > {code} > This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, which > creates this problem, is skipped. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition
[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17148: Description: The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo enabled: STEPS TO REPRODUCE: {code} Step 1: Create a table ct1 create table ct1 (a1 string,b1 string); Step 2: Create a table ct2 create table ct2 (a2 string); Step 3 : Insert the following data into table ct1 insert into table ct1 (a1) values ('1'); Step 4 : Insert the following data into table ct2 insert into table ct2 (a2) values ('1'); Step 5 : Execute the following query select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; {code} ACTUAL RESULT: {code} The query returns nothing; {code} EXPECTED RESULT: {code} 1 NULL1 {code} The issue seems to be caused by an incorrect query plan. In the plan we can see: predicate:(a1 is not null and b1 is not null) which does not look correct. As a result, it filters out all the rows if any column mentioned in the COALESCE has a null value. Please find the query plan below: {code} Plan optimized by CBO. 
Vertex dependency in root stage Map 1 <- Map 2 (BROADCAST_EDGE) Stage-0 Fetch Operator limit:-1 Stage-1 Map 1 File Output Operator [FS_10] Map Join Operator [MAPJOIN_15] (rows=1 width=4) Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] <-Map 2 [BROADCAST_EDGE] BROADCAST [RS_7] PartitionCols:_col0 Select Operator [SEL_5] (rows=1 width=1) Output:["_col0"] Filter Operator [FIL_14] (rows=1 width=1) predicate:a2 is not null TableScan [TS_3] (rows=1 width=1) default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] <-Select Operator [SEL_2] (rows=1 width=4) Output:["_col0","_col1"] Filter Operator [FIL_13] (rows=1 width=4) predicate:(a1 is not null and b1 is not null) TableScan [TS_0] (rows=1 width=4) default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] {code} This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, which creates this problem, is skipped. was: The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo enabled: STEPS TO REPRODUCE: {code} Step 1: Create a table ct1 create table ct1 (a1 string,b1 string); Step 2: Create a table ct2 create table ct2 (a2 string); Step 3 : Insert the following data into table ct1 insert into table ct1 (a1) values ('1'); Step 4 : Insert the following data into table ct2 insert into table ct2 (a2) values ('1'); Step 5 : Execute the following query select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2; {code} ACTUAL RESULT: {code} The query returns nothing; {code} EXPECTED RESULT: {code} 1 NULL1 {code} The issue seems to be caused by an incorrect query plan. In the plan we can see: predicate:(a1 is not null and b1 is not null) which does not look correct. As a result, it filters out all the rows if any column mentioned in the COALESCE has a null value. Please find the query plan below: {code} Plan optimized by CBO. 
Vertex dependency in root stage Map 1 <- Map 2 (BROADCAST_EDGE) Stage-0 Fetch Operator limit:-1 Stage-1 Map 1 File Output Operator [FS_10] Map Join Operator [MAPJOIN_15] (rows=1 width=4) Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"] <-Map 2 [BROADCAST_EDGE] BROADCAST [RS_7] PartitionCols:_col0 Select Operator [SEL_5] (rows=1 width=1) Output:["_col0"] Filter Operator [FIL_14] (rows=1 width=1) predicate:a2 is not null TableScan [TS_3] (rows=1 width=1) default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"] <-Select Operator [SEL_2] (rows=1 width=4) Output:["_col0","_col1"] Filter Operator [FIL_13] (rows=1 width=4) predicate:(a1 is not null and b1 is not null) TableScan [TS_0] (rows=1 width=4) default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"] {code} This happens only if the join is an inner join; otherwise HiveJoinAddNotRule, which creates this problem, is skipped. > Incorrect result for Hive join query with COALESCE in WHERE condition > - > > Key: HIVE-17148 > URL: https://issues.apache.org/jira/browse/HIVE-17148 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.1 >Reporter: Vlad Gudikov > > The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo > enabled: > STEPS TO REPRO
[jira] [Updated] (HIVE-16775) Fix HiveFilterAggregateTransposeRule when filter is always false
[ https://issues.apache.org/jira/browse/HIVE-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16775: Description: query4.q,query74.q {code} [7e490527-156a-48c7-aa87-8c80093cdfa8 main] ql.Driver: FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$QBVisitor.visit(ASTConverter.java:457) at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:110) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:393) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:115) {code} was: query4.q,query74.q {code} [7e490527-156a-48c7-aa87-8c80093cdfa8 main] ql.Driver: FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$QBVisitor.visit(ASTConverter.java:457) at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:110) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:393) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:115) {code} > Fix HiveFilterAggregateTransposeRule when filter is always false > > > Key: HIVE-16775 > URL: https://issues.apache.org/jira/browse/HIVE-16775 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 3.0.0 > > Attachments: HIVE-16775.01.patch, HIVE-16775.02.patch, > HIVE-16775.03.patch > > > query4.q,query74.q > {code} > [7e490527-156a-48c7-aa87-8c80093cdfa8 main] ql.Driver: FAILED: > NullPointerException null > java.lang.NullPointerException > at > 
org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$QBVisitor.visit(ASTConverter.java:457) > at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) > at > org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:110) > at > org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:393) > at > org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:115) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085664#comment-16085664 ] Vlad Gudikov commented on HIVE-16983: - Maybe it is reasonable to leave version 2.8.1 as is. I hadn't noticed that the master branch already has version 2.8.1 of Joda-Time; my fault, I thought it was still 2.5. > getFileStatus on accessible s3a://[bucket-name]/folder: throws > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon > S3; Status Code: 403; Error Code: 403 Forbidden; > - > > Key: HIVE-16983 > URL: https://issues.apache.org/jira/browse/HIVE-16983 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 > Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to > S3 using s3a:// protocol >Reporter: Alex Baretto >Assignee: Vlad Gudikov > Fix For: 2.1.1 > > Attachments: HIVE-16983-branch-2.1.patch > > > I've followed various published documentation on integrating Apache Hive > 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` > and > `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and > `hive/conf/hive-site.xml`. > I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` > to work properly (it returns s3 ls of that bucket). So I know my creds, > bucket access, and overall Hadoop setup is valid. > hdfs dfs -ls s3a://[bucket-name]/ > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files > ...etc. > hdfs dfs -ls s3a://[bucket-name]/files > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files/my-csv.csv > However, when I attempt to access the same s3 resources from hive, e.g. run > any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION > 's3a://[bucket-name]/files/'`, it fails. 
> for example: > >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, > >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED > >BY ',' LOCATION 's3a://[bucket-name]/files/'; > I keep getting this error: > >FAILED: Execution Error, return code 1 from > >org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: > >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus > >on s3a://[bucket-name]/files: > >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: > >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: > >C9CF3F9C50EF08D1), S3 Extended Request ID: > >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=) > This makes no sense. I have access to the bucket as one can see in the hdfs > test. And I've added the proper creds to hive-site.xml. > Anyone have any idea what's missing from this equation? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081846#comment-16081846 ] Vlad Gudikov commented on HIVE-16983: - Well, I've tested this one with s3a:// using Impala and Hive (storing the keys in core-site.xml). I also tested it using hadoop commands, passing the keys directly to the command. > getFileStatus on accessible s3a://[bucket-name]/folder: throws > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon > S3; Status Code: 403; Error Code: 403 Forbidden; > - > > Key: HIVE-16983 > URL: https://issues.apache.org/jira/browse/HIVE-16983 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 > Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to > S3 using s3a:// protocol >Reporter: Alex Baretto >Assignee: Vlad Gudikov > Fix For: 2.1.1 > > Attachments: HIVE-16983-branch-2.1.patch > > > I've followed various published documentation on integrating Apache Hive > 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` > and > `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and > `hive/conf/hive-site.xml`. > I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` > to work properly (it returns s3 ls of that bucket). So I know my creds, > bucket access, and overall Hadoop setup is valid. > hdfs dfs -ls s3a://[bucket-name]/ > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files > ...etc. > hdfs dfs -ls s3a://[bucket-name]/files > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files/my-csv.csv > However, when I attempt to access the same s3 resources from hive, e.g. run > any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION > 's3a://[bucket-name]/files/'`, it fails. 
> for example: > >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, > >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED > >BY ',' LOCATION 's3a://[bucket-name]/files/'; > I keep getting this error: > >FAILED: Execution Error, return code 1 from > >org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: > >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus > >on s3a://[bucket-name]/files: > >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: > >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: > >C9CF3F9C50EF08D1), S3 Extended Request ID: > >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=) > This makes no sense. I have access to the bucket as one can see in the hdfs > test. And I've added the proper creds to hive-site.xml. > Anyone have any idea what's missing from this equation? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
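A 403 like the one quoted above usually means the S3A connector resolved no (or the wrong) credentials for Hive's session, even though the `hdfs dfs` CLI found them. The sketch below is a simplified, hypothetical illustration of that kind of layered lookup; it is plain Python, not Hadoop's actual S3A credential provider chain, and the dictionaries stand in for configuration sources such as hive-site.xml and core-site.xml:

```python
# Simplified illustration of layered credential lookup (hypothetical sources;
# NOT Hadoop's actual S3A credential provider chain).

def resolve_s3a_credentials(*sources):
    """Scan configuration sources in order; return the first complete key pair."""
    for conf in sources:
        access = conf.get("fs.s3a.access.key")
        secret = conf.get("fs.s3a.secret.key")
        if access and secret:
            return access, secret
    # No source supplied both keys: requests go out without valid
    # authentication, and S3 answers with HTTP 403 Forbidden.
    raise PermissionError("403 Forbidden: no S3 credentials resolved")

hive_site = {}                                 # keys missing in Hive's conf...
core_site = {"fs.s3a.access.key": "AKIA-EXAMPLE",
             "fs.s3a.secret.key": "example-secret"}  # ...but present here

print(resolve_s3a_credentials(hive_site, core_site))
```

The point of the sketch: if only the CLI's configuration sources carry the keys, the CLI succeeds while Hive, reading a different effective configuration, fails with 403.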
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Status: Patch Available (was: Open) > getFileStatus on accessible s3a://[bucket-name]/folder: throws > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon > S3; Status Code: 403; Error Code: 403 Forbidden; > - > > Key: HIVE-16983 > URL: https://issues.apache.org/jira/browse/HIVE-16983 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 > Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to > S3 using s3a:// protocol >Reporter: Alex Baretto >Assignee: Vlad Gudikov > Fix For: 2.1.1 > > Attachments: HIVE-16983-branch-2.1.patch > > > I've followed various published documentation on integrating Apache Hive > 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` > and > `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and > `hive/conf/hive-site.xml`. > I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` > to work properly (it returns s3 ls of that bucket). So I know my creds, > bucket access, and overall Hadoop setup is valid. > hdfs dfs -ls s3a://[bucket-name]/ > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files > ...etc. > hdfs dfs -ls s3a://[bucket-name]/files > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files/my-csv.csv > However, when I attempt to access the same s3 resources from hive, e.g. run > any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION > 's3a://[bucket-name]/files/'`, it fails. > for example: > >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, > >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED > >BY ',' LOCATION 's3a://[bucket-name]/files/'; > I keep getting this error: > >FAILED: Execution Error, return code 1 from > >org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:Got exception: > >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus > >on s3a://[bucket-name]/files: > >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: > >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: > >C9CF3F9C50EF08D1), S3 Extended Request ID: > >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=) > This makes no sense. I have access to the bucket as one can see in the hdfs > test. And I've added the proper creds to hive-site.xml. > Anyone have any idea what's missing from this equation? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Status: Open (was: Patch Available) > getFileStatus on accessible s3a://[bucket-name]/folder: throws > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon > S3; Status Code: 403; Error Code: 403 Forbidden; > - > > Key: HIVE-16983 > URL: https://issues.apache.org/jira/browse/HIVE-16983 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 > Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to > S3 using s3a:// protocol >Reporter: Alex Baretto >Assignee: Vlad Gudikov > Fix For: 2.1.1 > > Attachments: HIVE-16983-branch-2.1.patch > > > I've followed various published documentation on integrating Apache Hive > 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` > and > `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and > `hive/conf/hive-site.xml`. > I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` > to work properly (it returns s3 ls of that bucket). So I know my creds, > bucket access, and overall Hadoop setup is valid. > hdfs dfs -ls s3a://[bucket-name]/ > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files > ...etc. > hdfs dfs -ls s3a://[bucket-name]/files > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files/my-csv.csv > However, when I attempt to access the same s3 resources from hive, e.g. run > any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION > 's3a://[bucket-name]/files/'`, it fails. > for example: > >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, > >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED > >BY ',' LOCATION 's3a://[bucket-name]/files/'; > I keep getting this error: > >FAILED: Execution Error, return code 1 from > >org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:Got exception: > >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus > >on s3a://[bucket-name]/files: > >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: > >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: > >C9CF3F9C50EF08D1), S3 Extended Request ID: > >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=) > This makes no sense. I have access to the bucket as one can see in the hdfs > test. And I've added the proper creds to hive-site.xml. > Anyone have any idea what's missing from this equation? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Attachment: (was: HIVE-16983-brach-2.1.patch) > getFileStatus on accessible s3a://[bucket-name]/folder: throws > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon > S3; Status Code: 403; Error Code: 403 Forbidden; > - > > Key: HIVE-16983 > URL: https://issues.apache.org/jira/browse/HIVE-16983 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 > Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to > S3 using s3a:// protocol >Reporter: Alex Baretto >Assignee: Vlad Gudikov > Fix For: 2.1.1 > > > I've followed various published documentation on integrating Apache Hive > 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` > and > `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and > `hive/conf/hive-site.xml`. > I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` > to work properly (it returns s3 ls of that bucket). So I know my creds, > bucket access, and overall Hadoop setup is valid. > hdfs dfs -ls s3a://[bucket-name]/ > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files > ...etc. > hdfs dfs -ls s3a://[bucket-name]/files > > drwxrwxrwx - hdfs hdfs 0 2017-06-27 22:43 > s3a://[bucket-name]/files/my-csv.csv > However, when I attempt to access the same s3 resources from hive, e.g. run > any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION > 's3a://[bucket-name]/files/'`, it fails. > for example: > >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, > >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED > >BY ',' LOCATION 's3a://[bucket-name]/files/'; > I keep getting this error: > >FAILED: Execution Error, return code 1 from > >org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:Got exception: > >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus > >on s3a://[bucket-name]/files: > >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: > >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: > >C9CF3F9C50EF08D1), S3 Extended Request ID: > >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=) > This makes no sense. I have access to the bucket as one can see in the hdfs > test. And I've added the proper creds to hive-site.xml. > Anyone have any idea what's missing from this equation? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
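For reference, the s3a credential entries the reporter mentions for `core-site.xml` / `hive-site.xml` generally look like the following (values are placeholders; the property names are the standard Hadoop S3A keys):

```xml
<!-- core-site.xml / hive-site.xml: S3A credentials as described above -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```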
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Attachment: HIVE-16983-branch-2.1.patch
[jira] [Issue Comment Deleted] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Comment: was deleted (was: Updating joda-time version to 2.9.9)
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Status: Patch Available (was: Open) The solution is to update the joda-time version from 2.5 to 2.9.9.
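For context, a plausible reading of the fix above (this reasoning is inferred; the ticket itself only states the version change): the AWS SDK uses joda-time to format request-signing timestamps, and joda-time releases before 2.8.1 can format the GMT offset incorrectly on newer JDKs (Java 8u60+), so S3 sees a skewed Date header and rejects otherwise-valid credentials with 403 Forbidden. In a Maven build, pinning the dependency would look roughly like:

```xml
<!-- pom.xml: pin joda-time so AWS request-signing timestamps are
     formatted correctly on newer JDKs (sketch; the actual patch may differ) -->
<dependency>
  <groupId>joda-time</groupId>
  <artifactId>joda-time</artifactId>
  <version>2.9.9</version>
</dependency>
```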
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Attachment: HIVE-16983-brach-2.1.patch
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error C
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16983: Fix Version/s: 2.1.1 Target Version/s: 2.1.1 Status: Patch Available (was: Open) Updating joda-time version to 2.9.9
[jira] [Assigned] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error
[ https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov reassigned HIVE-16983: Assignee: Vlad Gudikov
[jira] [Commented] (HIVE-17014) Password File Encryption for HiveServer2 Client
[ https://issues.apache.org/jira/browse/HIVE-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076563#comment-16076563 ] Vlad Gudikov commented on HIVE-17014: - Attached a document with possible ways to implement this feature. As [~lmccay] commented in [HIVE-17014]: "we may want to consider the use of the CredentialProvider API that will be committed soon. See [HADOOP-10607]. This isn't mutually exclusive with the password file approach as there are plans to fallback to existing password files in certain components. However, the abstraction of the API is best realized through the new Configuration.getPassword(String name) method. This will allow you to ask for a configuration item that you know is a password and it will check for an aliased credential based on the name through the CredentialProvider API. If the name is not resolved into a credential from a provider then it falls back to the config file." I would be happy to discuss this approach with other members. > Password File Encryption for HiveServer2 Client > --- > > Key: HIVE-17014 > URL: https://issues.apache.org/jira/browse/HIVE-17014 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vlad Gudikov >Assignee: Vlad Gudikov > Fix For: 2.1.2 > > Attachments: PasswordFileEncryption.docx.pdf > > > The main point of this issue is to encrypt the password file that is used for > beeline connections with the -w key. Any ideas or proposals would be great. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
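The fallback order [~lmccay] describes (aliased credential from a provider first, then the plain config entry) can be sketched as follows. This is a hypothetical stand-in class for illustration only, not Hadoop's actual org.apache.hadoop.conf.Configuration; the real method is Configuration.getPassword(String) from HADOOP-10607, which consults the providers listed in hadoop.security.credential.provider.path before falling back to the configuration file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in modeling the resolution order described above:
// credential provider alias first, plain config entry as the fallback.
public class PasswordLookup {
    private final Map<String, char[]> providerAliases = new HashMap<>();
    private final Map<String, String> configEntries = new HashMap<>();

    public void addProviderAlias(String name, char[] secret) {
        providerAliases.put(name, secret);
    }

    public void set(String name, String value) {
        configEntries.put(name, value);
    }

    // Mirrors the Configuration.getPassword(String) contract: return the
    // aliased credential when a provider resolves the name, otherwise fall
    // back to the clear-text config value; null when neither is present.
    public char[] getPassword(String name) {
        char[] fromProvider = providerAliases.get(name);
        if (fromProvider != null) {
            return fromProvider;
        }
        String fromConfig = configEntries.get(name);
        return fromConfig == null ? null : fromConfig.toCharArray();
    }
}
```

The benefit of this shape is that callers ask for a password by name and never know (or care) whether it came from an encrypted keystore or a legacy password file, which is exactly why it composes with the -w password-file approach rather than replacing it.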
[jira] [Updated] (HIVE-17014) Password File Encryption for HiveServer2 Client
[ https://issues.apache.org/jira/browse/HIVE-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17014: Attachment: PasswordFileEncryption.docx.pdf This document describes possible ways of implementing the password file encryption feature.
[jira] [Updated] (HIVE-17014) Password File Encryption for HiveServer2 Client
[ https://issues.apache.org/jira/browse/HIVE-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-17014: Summary: Password File Encryption for HiveServer2 Client (was: Password File Encryption)
[jira] [Commented] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16021278#comment-16021278 ] Vlad Gudikov commented on HIVE-16674: - Are these failures related to the fix? > Hive metastore JVM dumps core > - > > Key: HIVE-16674 > URL: https://issues.apache.org/jira/browse/HIVE-16674 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 > Environment: Hive-1.2.1 > Kerberos enabled cluster >Reporter: Vlad Gudikov >Assignee: Vlad Gudikov >Priority: Blocker > Fix For: 1.2.1, 2.3.0 > > Attachments: HIVE-16674.1.patch, HIVE-16674.patch > > > While running a Hive query on 24 partitions of an external table with a > large number of partitions (4K+), I get this error: > {code} > - org.apache.thrift.transport.TSaslTransport$SaslParticipant.wrap(byte[], > int, int) @bci=27, line=568 (Compiled frame) > - org.apache.thrift.transport.TSaslTransport.flush() @bci=52, line=492 > (Compiled frame) > - org.apache.thrift.transport.TSaslServerTransport.flush() @bci=1, line=41 > (Compiled frame) > - org.apache.thrift.ProcessFunction.process(int, > org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol, > java.lang.Object) @bci=236, line=55 (Compiled frame) > - > org.apache.thrift.TBaseProcessor.process(org.apache.thrift.protocol.TProtocol, > org.apache.thrift.protocol.TProtocol) @bci=126, line=39 (Compiled frame) > - > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run() > @bci=15, line=690 (Compiled frame) > - > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run() > @bci=1, line=685 (Compiled frame) > - > java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, > java.security.AccessControlContext) @bci=0 (Compiled frame) > - javax.security.auth.Subject.doAs(javax.security.auth.Subject, > java.security.PrivilegedExceptionAction) @bci=42, line=422 (Compiled frame) > - > 
org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) > @bci=14, line=1595 (Compiled frame) > - > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(org.apache.thrift.protocol.TProtocol, > org.apache.thrift.protocol.TProtocol) @bci=273, line=685 (Compiled frame) > - org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run() @bci=151, > line=285 (Interpreted frame) > - > java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) > @bci=95, line=1142 (Interpreted frame) > - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16674: Attachment: HIVE-16674.1.patch
[jira] [Updated] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16674: Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov updated HIVE-16674: Attachment: HIVE-16674.patch

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Work started] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-16674 started by Vlad Gudikov.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Gudikov reassigned HIVE-16674: Assignee: Vlad Gudikov

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015781#comment-16015781 ] Vlad Gudikov edited comment on HIVE-16674 at 5/18/17 2:00 PM:
--------------------------------------------------------------

Most of the RPC calls in the MetaStore carry fairly small payloads, but in this case a single get_partitions call returns more than 256 MB of data. The reason is that the response includes the full column-level metadata, column comments included, and that metadata is duplicated for every partition. Here is the code that fetches the column comments; do we actually need them when listing partitions?

{code}
// Get FieldSchema stuff if any.
if (!colss.isEmpty()) {
  // We are skipping the CDS table here, as it seems to be totally useless.
  queryText = "select \"CD_ID\", \"COMMENT\", \"COLUMN_NAME\", \"TYPE_NAME\""
      + " from \"COLUMNS_V2\" where \"CD_ID\" in (" + colIds + ") and \"INTEGER_IDX\" >= 0"
      + " order by \"CD_ID\" asc, \"INTEGER_IDX\" asc";
  loopJoinOrderedResult(colss, queryText, 0, new ApplyFunc<List<FieldSchema>>() {
    @Override
    public void apply(List<FieldSchema> t, Object[] fields) {
      t.add(new FieldSchema((String)fields[2], (String)fields[3], (String)fields[1]));
    }});
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
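As a back-of-envelope illustration of the comment above: when every partition in the response repeats the table's full column schema, the payload grows multiplicatively with the partition and column counts. In the sketch below, only the 4K partition count comes from the report; the column count and per-field serialized size are made-up assumptions, and the class is not Hive code.

```java
// Hypothetical estimate, not Hive code: partitions is from the bug report,
// columns and bytesPerField are assumed values for illustration only.
class PartitionPayloadEstimate {

    /** Bytes transferred when each partition repeats the full column schema. */
    static long payloadBytes(int partitions, int columns, int bytesPerField) {
        return (long) partitions * columns * bytesPerField;
    }

    public static void main(String[] args) {
        // 4000 partitions x 200 columns x ~350 serialized bytes per FieldSchema
        // (name + type + comment) comes to roughly 267 MB -- already past the
        // 256 MB figure the comment mentions, before any other partition metadata.
        long bytes = payloadBytes(4000, 200, 350);
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```

Dropping the duplicated comments only shrinks the bytesPerField factor; the multiplicative structure is why wide tables with thousands of partitions cross the limit so easily.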
[jira] [Comment Edited] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015781#comment-16015781 ] Vlad Gudikov edited comment on HIVE-16674 at 5/18/17 1:57 PM:
--------------------------------------------------------------

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16674) Hive metastore JVM dumps core
[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015781#comment-16015781 ] Vlad Gudikov commented on HIVE-16674:
--------------------------------------

-- This message was sent by Atlassian JIRA (v6.3.15#6346)