[jira] [Updated] (HIVE-16797) Enhance HiveFilterSetOpTransposeRule to remove union branches

2017-06-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16797:
---
Attachment: HIVE-16797.04.patch

> Enhance HiveFilterSetOpTransposeRule to remove union branches
> -
>
> Key: HIVE-16797
> URL: https://issues.apache.org/jira/browse/HIVE-16797
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16797.01.patch, HIVE-16797.02.patch, 
> HIVE-16797.03.patch, HIVE-16797.04.patch
>
>
> In query4.q, we can see that it creates a CTE with a union all of 3 branches. 
> It then does a 3-way self-join of the CTE with predicates. The predicates 
> actually allow only one of the branches in the CTE to participate in the 
> join. Thus, in some cases, e.g.,
> {code}
>/- filter(false) -TS0 
> union all  - filter(false) -TS1
>\-TS2
> {code}
> we can cut the TS0 and TS1 branches; the union then reduces to just TS2.
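> For illustration, a simplified, hypothetical query of this shape (tables s0, 
> s1, s2 are assumptions, not from query4.q) produces that plan once the 
> predicate is pushed below the union:
> {code}
> with t as (
>   select 'a' as src, key from s0
>   union all
>   select 'b' as src, key from s1
>   union all
>   select 'c' as src, key from s2)
> select * from t where src = 'c';
> -- pushing src = 'c' into each branch folds 'a' = 'c' and 'b' = 'c' to false,
> -- giving filter(false) over s0 and s1, so only the s2 branch survives
> {code}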



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16797) Enhance HiveFilterSetOpTransposeRule to remove union branches

2017-06-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16797:
---
Status: Open  (was: Patch Available)

> Enhance HiveFilterSetOpTransposeRule to remove union branches
> -
>
> Key: HIVE-16797
> URL: https://issues.apache.org/jira/browse/HIVE-16797
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16797.01.patch, HIVE-16797.02.patch, 
> HIVE-16797.03.patch, HIVE-16797.04.patch
>
>
> In query4.q, we can see that it creates a CTE with a union all of 3 branches. 
> It then does a 3-way self-join of the CTE with predicates. The predicates 
> actually allow only one of the branches in the CTE to participate in the 
> join. Thus, in some cases, e.g.,
> {code}
>/- filter(false) -TS0 
> union all  - filter(false) -TS1
>\-TS2
> {code}
> we can cut the TS0 and TS1 branches; the union then reduces to just TS2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16797) Enhance HiveFilterSetOpTransposeRule to remove union branches

2017-06-18 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16797:
---
Status: Patch Available  (was: Open)

> Enhance HiveFilterSetOpTransposeRule to remove union branches
> -
>
> Key: HIVE-16797
> URL: https://issues.apache.org/jira/browse/HIVE-16797
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16797.01.patch, HIVE-16797.02.patch, 
> HIVE-16797.03.patch, HIVE-16797.04.patch
>
>
> In query4.q, we can see that it creates a CTE with a union all of 3 branches. 
> It then does a 3-way self-join of the CTE with predicates. The predicates 
> actually allow only one of the branches in the CTE to participate in the 
> join. Thus, in some cases, e.g.,
> {code}
>/- filter(false) -TS0 
> union all  - filter(false) -TS1
>\-TS2
> {code}
> we can cut the TS0 and TS1 branches; the union then reduces to just TS2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16797) Enhance HiveFilterSetOpTransposeRule to remove union branches

2017-06-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053117#comment-16053117
 ] 

Hive QA commented on HIVE-16797:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873401/HIVE-16797.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10820 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=103)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5672/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5672/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5672/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873401 - PreCommit-HIVE-Build

> Enhance HiveFilterSetOpTransposeRule to remove union branches
> -
>
> Key: HIVE-16797
> URL: https://issues.apache.org/jira/browse/HIVE-16797
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16797.01.patch, HIVE-16797.02.patch, 
> HIVE-16797.03.patch, HIVE-16797.04.patch
>
>
> In query4.q, we can see that it creates a CTE with a union all of 3 branches. 
> It then does a 3-way self-join of the CTE with predicates. The predicates 
> actually allow only one of the branches in the CTE to participate in the 
> join. Thus, in some cases, e.g.,
> {code}
>/- filter(false) -TS0 
> union all  - filter(false) -TS1
>\-TS2
> {code}
> we can cut the TS0 and TS1 branches; the union then reduces to just TS2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16177) non Acid to acid conversion doesn't handle _copy_N files

2017-06-18 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052903#comment-16052903
 ] 

Eugene Koifman edited comment on HIVE-16177 at 6/19/17 1:49 AM:


The file list is sorted to make sure there is consistent ordering for both read 
and compact.
Compaction needs to process the whole list of files (for a bucket) and assign 
ROW_IDs consistently.
For read, OrcRawRecordReader just has a split from some file, so I need to 
make sure to order the files the same way, so that the "offset" for the 
current file is computed the same way as for compaction.

Since Hive doesn't restrict the layout of files in a table very well, sorting 
is the most general way to do this.
For example, say we realize that some "feature" places bucket files in 
subdirectories; sorting the whole list of "original" files makes this work 
with any directory layout.

The same goes for when we allow non-bucketed tables: files can be anywhere 
and they need to be "numbered" consistently.  Sorting seems like the simplest 
way to do this.

Putting a Comparator in AcidUtils makes sense.

"totalSize" is probably because I run the tests on Mac.  Stats often differ on 
Mac.
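
As an illustration of the idea (a minimal sketch with hypothetical helpers, 
not the actual AcidUtils code): sort the "original" files by full path, then 
give each file a starting offset from the cumulative row counts of the files 
sorted before it, so a reader holding a single split numbers rows the same 
way compaction does.
{code}
import java.util.Comparator;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;

public class OriginalFileOrder {
  // order by the full path string; stable regardless of directory layout,
  // e.g. bucket files placed in subdirectories
  public static final Comparator<FileStatus> BY_PATH =
      Comparator.comparing(f -> f.getPath().toUri().getPath());

  // hypothetical helper: rowCounts[i] is the row count of sorted.get(i);
  // the "offset" of a file is the total rows in the files sorted before it
  public static long offsetOf(List<FileStatus> sorted, FileStatus target,
                              long[] rowCounts) {
    long offset = 0;
    for (int i = 0; i < sorted.size(); i++) {
      if (sorted.get(i).getPath().equals(target.getPath())) {
        return offset;
      }
      offset += rowCounts[i];
    }
    throw new IllegalArgumentException("file not in list");
  }
}
{code}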



was (Author: ekoifman):
The file list is sorted to make sure there is consistent ordering for both read 
and compact.
Compaction needs to process the whole list of files (for a bucket) and assign 
ROW_IDs consistently.
For read, OrcRawRecordReader just has a split from some file, so I need to 
make sure to order the files the same way, so that the "offset" for the 
current file is computed the same way as for compaction.

Since Hive doesn't restrict the layout of files in a table very well, sorting 
is the most general way to do this.
For example, say we realize that some "feature" places bucket files in 
subdirectories; sorting the whole list of "original" files makes this work 
with any directory layout.

Putting a Comparator in AcidUtils makes sense.

"totalSize" is probably because I run the tests on Mac.  Stats often differ on 
Mac.


> non Acid to acid conversion doesn't handle _copy_N files
> 
>
> Key: HIVE-16177
> URL: https://issues.apache.org/jira/browse/HIVE-16177
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-16177.01.patch, HIVE-16177.02.patch, 
> HIVE-16177.04.patch, HIVE-16177.07.patch, HIVE-16177.08.patch, 
> HIVE-16177.09.patch, HIVE-16177.10.patch, HIVE-16177.11.patch, 
> HIVE-16177.14.patch, HIVE-16177.15.patch
>
>
> {noformat}
> create table T(a int, b int) clustered by (a)  into 2 buckets stored as orc 
> TBLPROPERTIES('transactional'='false')
> insert into T(a,b) values(1,2)
> insert into T(a,b) values(1,3)
> alter table T SET TBLPROPERTIES ('transactional'='true')
> {noformat}
> We should now have bucket files 01_0 and 01_0_copy_1, but 
> OrcRawRecordMerger.OriginalReaderPair.next() doesn't know that there can be 
> copy_N files, and it numbers rows in each bucket from 0, thus generating 
> duplicate IDs.
> {noformat}
> select ROW__ID, INPUT__FILE__NAME, a, b from T
> {noformat}
> produces 
> {noformat}
> {"transactionid":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0,1,2
> {"transactionid\":0,"bucketid":1,"rowid":0},file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands.../warehouse/nonacidorctbl/01_0_copy_1,1,3
> {noformat}
> [~owen.omalley], do you have any thoughts on a good way to handle this?
> The attached patch has a few changes to make Acid even recognize copy_N, but 
> this is just a prerequisite.  The new UT demonstrates the issue.
> Furthermore,
> {noformat}
> alter table T compact 'major'
> select ROW__ID, INPUT__FILE__NAME, a, b from T order by b
> {noformat}
> produces 
> {noformat}
> {"transactionid":0,"bucketid":1,"rowid":0}
> file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommandswarehouse/nonacidorctbl/base_-9223372036854775808/bucket_1
> 1   2
> {noformat}
> HIVE-16177.04.patch has TestTxnCommands.testNonAcidToAcidConversion0() 
> demonstrating this.
> This is because the compactor doesn't handle copy_N files either (it skips 
> them).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]

2017-06-18 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053446#comment-16053446
 ] 

liyunzhang_intel commented on HIVE-11297:
-

[~csun]: When I print the operator tree of multi_column_single_source.q while 
debugging in 
[SplitOpTreeForDPP|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java#L75], 
for the query
{code}
set hive.execution.engine=spark; 
set hive.auto.convert.join.noconditionaltask.size=20; 
set hive.spark.dynamic.partition.pruning=true;
select count(*) from srcpart join srcpart_date_hour on (srcpart.ds = 
srcpart_date_hour.ds and srcpart.hr = srcpart_date_hour.hr) where 
srcpart_date_hour.`date` = '2008-04-08' and srcpart_date_hour.hour = 11;
{code}

the physical plan is
{code}
TS[1]-FIL[17]-RS[4]-JOIN[5]-GBY[8]-RS[9]-GBY[10]-FS[12]
             -SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
             -SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{code}
{noformat}RS[4], SEL[18], and SEL[21] are children of FIL[17]{noformat}
bq. I think in the original code the parent node of all branches is a filter 
op, but now it is changed
I don't think so; I think the filter op is still {noformat}FIL[17]{noformat}. 
The difference from before is in how we split. Previously, we split the above 
tree into three trees:
{noformat}
tree1: TS[1]-FIL[17]-RS[4]-JOIN[5]-GBY[8]-RS[9]-GBY[10]-FS[12]
tree2: TS[1]-FIL[17]-SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
tree3: TS[1]-FIL[17]-SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{noformat}

Now we split the above tree into two trees:
{noformat}
tree1: TS[1]-FIL[17]-RS[4]-JOIN[5]-GBY[8]-RS[9]-GBY[10]-FS[12]
tree2: TS[1]-FIL[17]-SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
                    -SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{noformat}
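
To make this concrete, here is a minimal, self-contained sketch of the 
grouping (a stand-in Op class; not the actual SplitOpTreeForDPP code): 
children of the shared filter op whose subtree ends in a pruning sink go into 
one combined pruning tree, while the rest stay in the query tree.
{code}
import java.util.ArrayList;
import java.util.List;

class Op {  // stand-in for Hive's Operator<?>
  final String name;
  final List<Op> children = new ArrayList<>();
  Op(String name) { this.name = name; }

  boolean leadsToPruningSink() {
    if (name.startsWith("SPARKPRUNINGSINK")) return true;
    for (Op c : children) {
      if (c.leadsToPruningSink()) return true;
    }
    return false;
  }
}

class SplitSketch {
  // split the children of the filter op into [query branches, pruning branches];
  // e.g. RS[4] stays in tree1, SEL[18] and SEL[21] are combined into tree2
  static List<List<Op>> split(Op filterOp) {
    List<Op> query = new ArrayList<>();
    List<Op> pruning = new ArrayList<>();
    for (Op child : filterOp.children) {
      (child.leadsToPruningSink() ? pruning : query).add(child);
    }
    List<List<Op>> out = new ArrayList<>();
    out.add(query);
    out.add(pruning);
    return out;
  }
}
{code}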

> Combine op trees for partition info generating tasks [Spark branch]
> ---
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chao Sun
>Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, 
> HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch
>
>
> Currently, for dynamic partition pruning in Spark, if a small table generates 
> partition info for more than one partition column, multiple operator trees 
> are created, which all start from the same table scan op but have different 
> spark partition pruning sinks.
> As an optimization, we can combine these op trees so we don't have to do the 
> table scan multiple times.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16905) Add zookeeper ACL for hiveserver2

2017-06-18 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16905:

Description: 
Adding a zookeeper ACL for hiveserver2 is necessary so Hive can protect the 
hiveserver2 znode from being deleted by accident.



  was:Add zookeeper ACL for hiveserver2


> Add zookeeper ACL for hiveserver2
> -
>
> Key: HIVE-16905
> URL: https://issues.apache.org/jira/browse/HIVE-16905
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
> Attachments: HIVE-16905.1.patch
>
>
> Adding a zookeeper ACL for hiveserver2 is necessary so Hive can protect the 
> hiveserver2 znode from being deleted by accident.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16905) Add zookeeper ACL for hiveserver2

2017-06-18 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16905:

Description: 
Adding a zookeeper ACL for hiveserver2 is necessary so Hive can protect the 
hiveserver2 znode from being deleted by accident.

--
Case:
When I made beeline connections through Hive HA with zookeeper, I suddenly 
found that beeline could not connect to hiveserver2. The cause of the problem 
was that someone had deleted /hiveserver2 by mistake, so the beeline 
connection failed and could not read the configs from zookeeper.

-
Currently the ACL of /hiveserver2 is set to world:anyone:cdrwa, which means 
anyone can easily delete /hiveserver2 and its znodes at any time. This is 
unsafe, so it is necessary to protect the znode /hiveserver2.
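
A minimal sketch of the intended protection (plain ZooKeeper API; the 
sasl:hive identity is an assumption, not the actual patch):
{code}
import java.util.Arrays;
import java.util.List;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class Hs2ZnodeAcl {
  // replace the default world:anyone:cdrwa with: full control for the
  // (assumed) authenticated hive identity, read-only for everyone else
  public static void protect(ZooKeeper zk, String path) throws Exception {
    List<ACL> acls = Arrays.asList(
        new ACL(ZooDefs.Perms.ALL, new Id("sasl", "hive")),
        new ACL(ZooDefs.Perms.READ, new Id("world", "anyone")));
    zk.setACL(path, acls, -1);  // -1 matches any ACL version
  }
}
{code}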



  was:
Adding a zookeeper ACL for hiveserver2 is necessary so Hive can protect the 
hiveserver2 znode from being deleted by accident.




> Add zookeeper ACL for hiveserver2
> -
>
> Key: HIVE-16905
> URL: https://issues.apache.org/jira/browse/HIVE-16905
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
> Attachments: HIVE-16905.1.patch
>
>
> Adding a zookeeper ACL for hiveserver2 is necessary so Hive can protect the 
> hiveserver2 znode from being deleted by accident.
> --
> Case:
> When I made beeline connections through Hive HA with zookeeper, I suddenly 
> found that beeline could not connect to hiveserver2. The cause of the problem 
> was that someone had deleted /hiveserver2 by mistake, so the beeline 
> connection failed and could not read the configs from zookeeper.
> -
> Currently the ACL of /hiveserver2 is set to world:anyone:cdrwa, which means 
> anyone can easily delete /hiveserver2 and its znodes at any time. This is 
> unsafe, so it is necessary to protect the znode /hiveserver2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]

2017-06-18 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-11297:

Attachment: HIVE-11297.6.patch

[~csun]: HIVE-11297.6 fixes all review comments except renaming filterOp. I 
explained that point in more detail above; if there is any misunderstanding, 
please tell me.

> Combine op trees for partition info generating tasks [Spark branch]
> ---
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chao Sun
>Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, 
> HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch, HIVE-11297.6.patch
>
>
> Currently, for dynamic partition pruning in Spark, if a small table generates 
> partition info for more than one partition column, multiple operator trees 
> are created, which all start from the same table scan op but have different 
> spark partition pruning sinks.
> As an optimization, we can combine these op trees so we don't have to do the 
> table scan multiple times.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-06-18 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053485#comment-16053485
 ] 

anishek commented on HIVE-16896:


We might need separate jiras to handle bootstrap repl load vs. incremental 
repl load, since the wrapper or lazy top-level task has to work differently 
in the two scenarios.

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
>
> We want to avoid creating too many tasks in memory in the analysis phase 
> while loading data. Currently we load all the files in the bootstrap dump 
> location as {{FileStatus[]}} and then iterate over it to load objects; we 
> should rather move to
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive)
> {code}
> which would internally batch and return values.
> Additionally, since we can't hand off partial tasks from the analysis phase 
> to the execution phase, we are going to move the whole repl load 
> functionality to the execution phase so we can better control the 
> creation/execution of tasks (not related to Hive {{Task}}; we may get rid of 
> ReplCopyTask).
> An additional consideration at the end of this jira is whether we want to 
> specifically do a multi-threaded load of the bootstrap dump.
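> A small sketch of the batched iteration (standard Hadoop FileSystem API; 
> dumpRoot, conf, and loadObject are hypothetical placeholders):
> {code}
> FileSystem fs = dumpRoot.getFileSystem(conf);
> RemoteIterator<LocatedFileStatus> files = fs.listFiles(dumpRoot, true);
> while (files.hasNext()) {
>   LocatedFileStatus f = files.next();  // fetched in batches internally
>   loadObject(f);                       // hypothetical per-file load step
> }
> {code}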



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]

2017-06-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053510#comment-16053510
 ] 

Hive QA commented on HIVE-11297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873432/HIVE-11297.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10831 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=157)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5673/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5673/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5673/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873432 - PreCommit-HIVE-Build

> Combine op trees for partition info generating tasks [Spark branch]
> ---
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chao Sun
>Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, 
> HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch, HIVE-11297.6.patch
>
>
> Currently, for dynamic partition pruning in Spark, if a small table generates 
> partition info for more than one partition column, multiple operator trees 
> are created, which all start from the same table scan op but have different 
> spark partition pruning sinks.
> As an optimization, we can combine these op trees so we don't have to do the 
> table scan multiple times.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)