[jira] [Commented] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]

2015-06-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585436#comment-14585436
 ] 

Rui Li commented on HIVE-10989:
---

Hi [~xuefuz], these flags should only be set for the MapWork that handles the 
big table, i.e. in this case the skewed data. Previously, we set the flags for 
all the MapWorks, including those for the small tables. This was copied from 
MR, where there's only one MapWork for the big table, and small tables are 
processed in MapredLocalWork. So the 3rd part brings our implementation in 
line with the MR version.

Also, some performance data in case you want to know: I tested joining the 
skewed data with 6 mappers (configured) vs. 2 mappers (default), and the times 
were 31s vs. 43s. The improvement should be more pronounced on bigger data.
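For context, the way MR turns these two flags into a mapper count can be 
approximated as follows. This is an illustrative sketch, not Hive's actual 
code; the function name and the exact rounding are assumptions:

```python
def skew_join_num_mappers(total_input_bytes, map_tasks, min_split_bytes):
    """Approximate how hive.skewjoin.mapjoin.map.tasks and
    hive.skewjoin.mapjoin.min.split bound the mapper count for the
    map join that handles the skewed keys (illustrative sketch only)."""
    # Target split size if the input were divided evenly across map_tasks...
    target_split = total_input_bytes // map_tasks or 1
    # ...but never below the configured minimum split size.
    split_size = max(target_split, min_split_bytes)
    # Mapper count is the input size divided by the split size, rounded up.
    return -(-total_input_bytes // split_size)

# With 600 MB of skewed data and 6 requested tasks, a 32 MB minimum split
# allows 6 mappers; raising the minimum split to 200 MB caps parallelism at 3.
print(skew_join_num_mappers(600 * 2**20, 6, 32 * 2**20))   # 6
print(skew_join_num_mappers(600 * 2**20, 6, 200 * 2**20))  # 3
```

The point of the patch is that Spark's RDD creation has to honor the same 
derived split size, otherwise the configured task count never takes effect.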

> HoS can't control number of map tasks for runtime skew join [Spark Branch]
> --
>
> Key: HIVE-10989
> URL: https://issues.apache.org/jira/browse/HIVE-10989
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-10989.1-spark.patch
>
>
> Flags {{hive.skewjoin.mapjoin.map.tasks}} and 
> {{hive.skewjoin.mapjoin.min.split}} are used to control the number of map 
> tasks for the map join of runtime skew join. They work well for MR but have 
> no effect for Spark.
> This makes runtime skew join less useful, i.e. we just end up with slow 
> mappers instead of slow reducers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]

2015-06-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585420#comment-14585420
 ] 

Xuefu Zhang commented on HIVE-10989:


[~lirui], Thanks for working on this. Changes look good except that I don't 
quite understand the 3rd part of the change. Could you please explain? Thanks.



[jira] [Commented] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]

2015-06-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585418#comment-14585418
 ] 

Rui Li commented on HIVE-10989:
---

Failed tests are not related.



[jira] [Commented] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]

2015-06-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585411#comment-14585411
 ] 

Hive QA commented on HIVE-10989:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739538/HIVE-10989.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7567 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/878/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/878/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12739538 - PreCommit-HIVE-SPARK-Build



[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-14 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585400#comment-14585400
 ] 

Vikram Dixit K commented on HIVE-10841:
---

+1 for 0.14 branch.

> [WHERE col is not null] does not work sometimes for queries with many JOIN 
> statements
> -
>
> Key: HIVE-10841
> URL: https://issues.apache.org/jira/browse/HIVE-10841
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0
>Reporter: Alexander Pivovarov
>Assignee: Laljo John Pullokkaran
> Fix For: 1.2.1
>
> Attachments: HIVE-10841.03.patch, HIVE-10841.1.patch, 
> HIVE-10841.2.patch, HIVE-10841.patch
>
>
> The result from the following SELECT query is 3 rows but it should be 1 row.
> I checked it in MySQL - it returned 1 row.
> To reproduce the issue in Hive
> 1. prepare tables
> {code}
> drop table if exists L;
> drop table if exists LA;
> drop table if exists FR;
> drop table if exists A;
> drop table if exists PI;
> drop table if exists acct;
> create table L as select 4436 id;
> create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
> create table FR as select 4436 loan_id;
> create table A as select 4748 id;
> create table PI as select 4415 id;
> create table acct as select 4748 aid, 10 acc_n, 122 brn;
> insert into table acct values(4748, null, null);
> insert into table acct values(4748, null, null);
> {code}
> 2. run SELECT query
> {code}
> select
>   acct.ACC_N,
>   acct.brn
> FROM L
> JOIN LA ON L.id = LA.loan_id
> JOIN FR ON L.id = FR.loan_id
> JOIN A ON LA.aid = A.id
> JOIN PI ON PI.id = LA.pi_id
> JOIN acct ON A.id = acct.aid
> WHERE
>   L.id = 4436
>   and acct.brn is not null;
> {code}
> the result is 3 rows
> {code}
> 10    122
> NULL  NULL
> NULL  NULL
> {code}
> but it should be 1 row
> {code}
> 10    122
> {code}
> 2.1 "explain select ..." output for hive-1.3.0 MR
> {code}
> STAGE DEPENDENCIES:
>   Stage-12 is a root stage
>   Stage-9 depends on stages: Stage-12
>   Stage-0 depends on stages: Stage-9
> STAGE PLANS:
>   Stage: Stage-12
> Map Reduce Local Work
>   Alias -> Map Local Tables:
> a 
>   Fetch Operator
> limit: -1
> acct 
>   Fetch Operator
> limit: -1
> fr 
>   Fetch Operator
> limit: -1
> l 
>   Fetch Operator
> limit: -1
> pi 
>   Fetch Operator
> limit: -1
>   Alias -> Map Local Operator Tree:
> a 
>   TableScan
> alias: a
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: id is not null (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> acct 
>   TableScan
> alias: acct
> Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: aid is not null (type: boolean)
>   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 _col5 (type: int)
>   1 id (type: int)
>   2 aid (type: int)
> fr 
>   TableScan
> alias: fr
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (loan_id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> l 
>   TableScan
> alias: l
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
> stats: NONE
> Filter Operator
>   predicate: (id = 4436) (type: boolean)
>   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
> Column stats: NONE
>   HashTable Sink Operator
> keys:
>   0 4436 (type: int)
>   1 4436 (type: int)
>   2 4436 (type: int)
> pi 
>   TableScan
> alias: pi
> Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
>

[jira] [Updated] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]

2015-06-14 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10989:
--
Attachment: HIVE-10989.1-spark.patch

The flags were properly set in the MapWork. We just need to create the RDD 
accordingly.



[jira] [Updated] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]

2015-06-14 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10989:
--
Summary: HoS can't control number of map tasks for runtime skew join [Spark 
Branch]  (was: Spark can't control number of map tasks for runtime skew join 
[Spark Branch])



[jira] [Commented] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585367#comment-14585367
 ] 

Gopal V commented on HIVE-10940:


With more logging, it becomes slightly clearer:

{code}
2015-06-14 19:00:40,473 INFO [TezChild] io.HiveInputFormat: push down initiated 
with  filterText = (l_orderkey = 121201) filterExpr = 
GenericUDFOPEqual(Column[l_orderkey], Const bigint 121201) 
serializedFilterObj = null serializedFilterExpr = 
AQEAamF2YS51dGlsLkFycmF5TGlz9AECAQFvcmcuYXBhY2hlLmhhZG9vcC5oaXZlLnFsLnBsYW4uRXhwck5vZGVDb2x1bW5EZXPjAQFsX29yZGVya2X5AAABbGluZWl0Ze0BAm9yZy5hcGFjaGUuaGFkb29wLmhpdmUuc2VyZGUyLnR5cGVpbmZvLlByaW1pdGl2ZVR5cGVJbmbvAQFiaWdpbvQBA29yZy5hcGFjaGUuaGFkb29wLmhpdmUucWwucGxhbi5FeHByTm9kZUNvbnN0YW50RGVz4wEBAgcJgpztgwkBBG9yZy5hcGFjaGUuaGFkb29wLmhpdmUucWwudWRmLmdlbmVyaWMuR2VuZXJpY1VERk9QRXF1YewBAAABgj0BRVFVQcwBBW9yZy5hcGFjaGUuaGFkb29wLmlvLkJvb2xlYW5Xcml0YWJs5QEAAAECAQFib29sZWHu
 filterObject = null
{code}

> HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
> call
> -
>
> Key: HIVE-10940
> URL: https://issues.apache.org/jira/browse/HIVE-10940
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.2.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10940.patch
>
>
> {code}
> String filterText = filterExpr.getExprString();
> String filterExprSerialized = Utilities.serializeExpression(filterExpr);
> {code}
> The serializeExpression call initializes Kryo and produces a new packed 
> object for every split, on the call path HiveInputFormat::getRecordReader -> 
> pushProjectionAndFilters -> pushFilters.
> Kryo is very slow to do this for a large filter clause.
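The direction of the fix implied by the description — serialize the filter 
once and reuse it across getRecordReader calls — can be sketched in Python. 
The real code is Java with Kryo; {{serialize_expression}} below is a 
hypothetical stand-in keyed by the filter's stable string form:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def serialize_expression(expr_string: str) -> bytes:
    """Stand-in for an expensive per-call serializer (the real one
    spins up Kryo); caching by the expression string means each
    distinct filter is serialized only once."""
    return expr_string.encode("utf-8")

# Thousands of splits share the same filter; only the first call pays.
filters = ["(l_orderkey = 121201)"] * 10_000
blobs = [serialize_expression(f) for f in filters]
print(serialize_expression.cache_info().misses)  # 1
```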





[jira] [Commented] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585333#comment-14585333
 ] 

Gopal V commented on HIVE-10940:


That was a Kryo mess-up; the patch works exactly as expected on trunk.



[jira] [Resolved] (HIVE-11003) LLAP: Fix LLAP startup issues due to heap rounding errors

2015-06-14 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V resolved HIVE-11003.

Resolution: Fixed

commit 5830125526d5f93a5df3a6112afa25f5477d2853
Author: Gopal V 
Date:   Sun Jun 14 16:58:09 2015 -0700

HIVE-11003: LLAP: Fix LLAP startup issues due to heap rounding errors 
(gopalv)

> LLAP: Fix LLAP startup issues due to heap rounding errors
> -
>
> Key: HIVE-11003
> URL: https://issues.apache.org/jira/browse/HIVE-11003
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: llap
>
> Attachments: HIVE-11003.1.patch
>
>
> Heap sizes can be off by about a megabyte when summing different pool sizes 
> together.
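The rounding hazard can be reproduced with a toy calculation. The fractions 
and heap size here are made up; the point is only that rounding each pool to a 
whole megabyte independently lets the sum drift from the heap it was derived 
from:

```python
MB = 1 << 20

def naive_pools(heap_bytes, fractions):
    # Round each pool to a whole MB independently; the rounded sizes
    # can sum to more (or less) than the heap they came from.
    return [round(heap_bytes * f / MB) * MB for f in fractions]

def safe_pools(heap_bytes, fractions):
    # Floor all but the last pool and give the remainder to the last,
    # so the sizes always sum exactly to the heap.
    sizes = [int(heap_bytes * f) // MB * MB for f in fractions[:-1]]
    sizes.append(heap_bytes - sum(sizes))
    return sizes

heap = 101 * MB
fracs = [1/3, 1/3, 1/3]
print(sum(naive_pools(heap, fracs)) - heap)  # 1048576: a whole MB over
print(sum(safe_pools(heap, fracs)) - heap)   # 0
```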





[jira] [Updated] (HIVE-11003) LLAP: Fix LLAP startup issues due to heap rounding errors

2015-06-14 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11003:
---
Attachment: HIVE-11003.1.patch



[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-06-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585307#comment-14585307
 ] 

Hive QA commented on HIVE-7018:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739520/HIVE-7018.4.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9007 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4267/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4267/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4267/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12739520 - PreCommit-HIVE-TRUNK-Build

> Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but 
> not others
> -
>
> Key: HIVE-7018
> URL: https://issues.apache.org/jira/browse/HIVE-7018
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Yongzhi Chen
> Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch, HIVE-7018.3.patch, 
> HIVE-7018.4.patch
>
>
> It appears that at least postgres and oracle do not have the LINK_TARGET_ID 
> column while mysql does.





[jira] [Commented] (HIVE-10746) Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 produces 1-byte FileSplits from TextInputFormat

2015-06-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585306#comment-14585306
 ] 

Gopal V commented on HIVE-10746:


[~gss2002]: I'm talking to the MRv2 folks about changing the defaults to 
something saner than 1 byte.

Until that issue is resolved, I'll keep this open as a critical issue.

>  Hive 1.2.0 w/ Tez 0.5.3/Tez 0.6.0 produces 1-byte FileSplits from 
> TextInputFormat
> --
>
> Key: HIVE-10746
> URL: https://issues.apache.org/jira/browse/HIVE-10746
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 0.14.0, 0.14.1, 1.2.0, 1.1.0, 1.1.1
>Reporter: Greg Senia
>Assignee: Gopal V
>Priority: Critical
> Attachments: slow_query_output.zip
>
>
> The following query: "SELECT appl_user_id, arsn_cd, COUNT(*) as RecordCount 
> FROM adw.crc_arsn GROUP BY appl_user_id,arsn_cd ORDER BY appl_user_id;" runs 
> consistently fast in Spark and MapReduce on Hive 1.2.0. When run with Tez as 
> the execution engine, the same query consistently takes over 300-500 
> seconds, which seems extremely long. This is a basic external table, 
> tab-delimited, with a single file in a folder. In Hive 0.13 this query runs 
> fast with Tez. I tested Hive 0.14, 0.14.1/1.0.0, and now Hive 1.2.0, and 
> something is clearly going awry with Hive on Tez for single-file or 
> small-file tables. I can attach further logs if someone needs them for 
> deeper analysis.
> HDFS Output:
> hadoop fs -ls /example_dw/crc/arsn
> Found 2 items
> -rwxr-x---   6 loaduser hadoopusers  0 2015-05-17 20:03 
> /example_dw/crc/arsn/_SUCCESS
> -rwxr-x---   6 loaduser hadoopusers3883880 2015-05-17 20:03 
> /example_dw/crc/arsn/part-m-0
> Hive Table Describe:
> hive> describe formatted crc_arsn;
> OK
> # col_name  data_type   comment 
>  
> arsn_cd string  
> clmlvl_cd   string  
> arclss_cd   string  
> arclssg_cd  string  
> arsn_prcsr_rmk_ind  string  
> arsn_mbr_rspns_ind  string  
> savtyp_cd   string  
> arsn_eff_dt string  
> arsn_exp_dt string  
> arsn_pstd_dts   string  
> arsn_lstupd_dts string  
> arsn_updrsn_txt string  
> appl_user_idstring  
> arsntyp_cd  string  
> pre_d_indicator string  
> arsn_display_txtstring  
> arstat_cd   string  
> arsn_tracking_nostring  
> arsn_cstspcfc_ind   string  
> arsn_mstr_rcrd_ind  string  
> state_specific_ind  string  
> region_specific_in  string  
> arsn_dpndnt_cd  string  
> unit_adjustment_in  string  
> arsn_mbr_only_ind   string  
> arsn_qrmb_ind   string  
>  
> # Detailed Table Information 
> Database:   adw  
> Owner:  loadu...@exa.example.com   
> CreateTime: Mon Apr 28 13:28:05 EDT 2014 
> LastAccessTime: UNKNOWN  
> Protect Mode:   None 
> Retention:  0
> Location:   hdfs://xhadnnm1p.example.com:8020/example_dw/crc/arsn 
>
> Table Type: EXTERNAL_TABLE   
> Table Parameters:
> EXTERNALTRUE
> transient_lastDdlTime   1398706085  
>  
> # Storage Information
> SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat:org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat:   
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Com

[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-06-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585284#comment-14585284
 ] 

Hive QA commented on HIVE-7018:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739520/HIVE-7018.4.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 27 tests executed
*Failed tests:*
{noformat}
Test failed: mysql/upgrade-1.2.0-to-2.0.0.mysql.sql
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/53/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/53/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-53/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Test failed: mysql/upgrade-1.2.0-to-2.0.0.mysql.sql
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12739520 - PreCommit-HIVE-METASTORE-Test



[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-06-14 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585278#comment-14585278
 ] 

Yongzhi Chen commented on HIVE-7018:


Hi Chaoyu, I attached HIVE-7018.4.patch for 2.0.0. Thanks.



[jira] [Updated] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-06-14 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-7018:
---
Attachment: HIVE-7018.4.patch

Patch 4 for master (2.0.0)



[jira] [Commented] (HIVE-11001) HS2 http cookie mode does not honor doAs url parameter

2015-06-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585276#comment-14585276
 ] 

Hive QA commented on HIVE-11001:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739458/HIVE-11001.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9008 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4266/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4266/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4266/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12739458 - PreCommit-HIVE-TRUNK-Build

> HS2 http cookie mode does not honor doAs url parameter
> --
>
> Key: HIVE-11001
> URL: https://issues.apache.org/jira/browse/HIVE-11001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 1.2.1
>
> Attachments: HIVE-11001.1.patch
>
>
> When HiveServer2 HTTP mode is used with cookie authentication enabled ( 
> hive.server2.thrift.http.cookie.auth.enabled=true), the doAs url parameter 
> does not get captured, and the authenticated user gets used instead of the 
> doAs user.
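The intended behavior can be sketched as follows. This is a simplified Python 
illustration of the semantics only, not HiveServer2's actual servlet code 
(which is Java); {{effective_user}} and its parameters are hypothetical names:

```python
from urllib.parse import urlsplit, parse_qs

def effective_user(request_url, cookie_user=None):
    """Pick the proxy (doAs) user if present in the query string,
    regardless of whether authentication came from a cookie or from
    full credentials. Simplified sketch of the intended behavior."""
    query = parse_qs(urlsplit(request_url).query)
    do_as = query.get("doAs", [None])[0]
    # The fix: consult doAs even on the cookie fast path, instead of
    # returning cookie_user before the query string is ever parsed.
    return do_as or cookie_user

url = "http://hs2:10001/cliservice?doAs=analyst"
print(effective_user(url, cookie_user="hive"))  # analyst
print(effective_user("http://hs2:10001/cliservice", cookie_user="hive"))  # hive
```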





[jira] [Commented] (HIVE-7313) Allow in-memory/ssd session-level temp-tables

2015-06-14 Thread Damien Carol (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585264#comment-14585264
 ] 

Damien Carol commented on HIVE-7313:


[~leftylev] The "default" setting means "Hive doesn't change the HDFS storage 
policy".
I think an association table would be more useful, like this:
||Value of hive.exec.temporary.table.storage||Constant used 
(HdfsConstants.xxx)||Storage policy||Description||
|default| | | |
|mem|MEMORY_STORAGE_POLICY_NAME|Lazy_Persist|For writing blocks with a single 
replica in memory. The replica is first written to RAM_DISK and then lazily 
persisted to DISK.|
I will work on that this week.

> Allow in-memory/ssd session-level temp-tables
> -
>
> Key: HIVE-7313
> URL: https://issues.apache.org/jira/browse/HIVE-7313
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: InMemory, Performance
> Fix For: 1.1.0
>
> Attachments: HIVE-7313.1.patch, HIVE-7313.2.patch
>
>
> With HDFS storage policies implementation, temporary tables can be written 
> with different storage/reliability policies. 
> In-session temporary tables can be targetted at both SSD and memory storage 
> policies, with fallbacks onto the disk and the associated reliability 
> trade-offs.





[jira] [Commented] (HIVE-11001) HS2 http cookie mode does not honor doAs url parameter

2015-06-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585219#comment-14585219
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11001:
--

Nice catch!  LGTM, +1



[jira] [Commented] (HIVE-11001) HS2 http cookie mode does not honor doAs url parameter

2015-06-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585214#comment-14585214
 ] 

Thejas M Nair commented on HIVE-11001:
--

[~hsubramaniyan] Can you please review this patch?


> HS2 http cookie mode does not honor doAs url parameter
> --
>
> Key: HIVE-11001
> URL: https://issues.apache.org/jira/browse/HIVE-11001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 1.2.1
>
> Attachments: HIVE-11001.1.patch
>
>
> When HiveServer2 HTTP mode is used with cookie authentication enabled ( 
> hive.server2.thrift.http.cookie.auth.enabled=true), the doAs url parameter 
> does not get captured, and the authenticated user gets used instead of the 
> doAs user.





[jira] [Commented] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585108#comment-14585108
 ] 

Hive QA commented on HIVE-10999:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739493/HIVE-10999.1-spark.patch

{color:red}ERROR:{color} -1 due to 604 failed/errored test(s), 7420 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_uber_reduce
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliD

[jira] [Updated] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10999:
---
Attachment: HIVE-10999.1-spark.patch

> Upgrade Spark dependency to 1.4 [Spark Branch]
> --
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-10999.1-spark.patch
>
>
> Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to 
> 1.4.0.





[jira] [Assigned] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]

2015-06-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-10999:
--

Assignee: Xuefu Zhang

> Upgrade Spark dependency to 1.4 [Spark Branch]
> --
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-10999.1-spark.patch
>
>
> Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to 
> 1.4.0.





[jira] [Commented] (HIVE-9248) Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is Hash mode

2015-06-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585012#comment-14585012
 ] 

Hive QA commented on HIVE-9248:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739477/HIVE-9248.06.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9008 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4265/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4265/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4265/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12739477 - PreCommit-HIVE-TRUNK-Build

> Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is 
> Hash mode
> ---
>
> Key: HIVE-9248
> URL: https://issues.apache.org/jira/browse/HIVE-9248
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-9248.01.patch, HIVE-9248.02.patch, 
> HIVE-9248.03.patch, HIVE-9248.04.patch, HIVE-9248.05.patch, HIVE-9248.06.patch
>
>
> Under Tez and Vectorization, ReduceWork is not getting vectorized unless its 
> GROUP BY operator is MergePartial.  Add valid cases where GROUP BY is Hash 
> (and presumably there are downstream reducers that will do MergePartial).





[jira] [Commented] (HIVE-10622) Hive doc error: 'from' is a keyword, when use it as a column name throw error.

2015-06-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584989#comment-14584989
 ] 

Lefty Leverenz commented on HIVE-10622:
---

Changed "from" column name to "came_from" -- [~alangates], would you please 
review this?

* [DML -- Insert Values -- Examples | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Examples]
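As a side note for the wiki example, the reserved word could also be kept as a column name by escaping it with backticks; a sketch assuming quoted identifiers are enabled (hive.support.quoted.identifiers=column, the default since Hive 0.13):

{code:sql}
-- Sketch: escape the reserved word instead of renaming the column.
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, `from` STRING)
  PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS
  STORED AS ORC;
{code}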

> Hive doc error: 'from' is a keyword, when use it as a column name throw error.
> --
>
> Key: HIVE-10622
> URL: https://issues.apache.org/jira/browse/HIVE-10622
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.1
>Reporter: Anne Yu
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML: using 
> "from" as a column name in CREATE TABLE throws an error.
> {code}
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> Error: Error while compiling statement: FAILED: ParseException line 1:57 
> cannot recognize input near 'from' 'STRING' ')' in column specification 
> (state=42000,code=4)
> {code}





[jira] [Updated] (HIVE-9248) Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is Hash mode

2015-06-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9248:
---
Attachment: HIVE-9248.06.patch

> Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is 
> Hash mode
> ---
>
> Key: HIVE-9248
> URL: https://issues.apache.org/jira/browse/HIVE-9248
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, Vectorization
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-9248.01.patch, HIVE-9248.02.patch, 
> HIVE-9248.03.patch, HIVE-9248.04.patch, HIVE-9248.05.patch, HIVE-9248.06.patch
>
>
> Under Tez and Vectorization, ReduceWork is not getting vectorized unless its 
> GROUP BY operator is MergePartial.  Add valid cases where GROUP BY is Hash 
> (and presumably there are downstream reducers that will do MergePartial).


