[jira] [Commented] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-09-03 Thread Venki Korukanti (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121017#comment-14121017
 ] 

Venki Korukanti commented on HIVE-6245:
---

[~thejas] Sorry about that. In all the Hive QA runs where the test failed, the 
second connection failed to be established. There isn't any info in hive.log to 
narrow it down. I will try to repro locally and update (HIVE-7942 has been 
created for tracking).

> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Chaoyu Tang
>Assignee: Venki Korukanti
> Fix For: 0.14.0
>
> Attachments: HIVE-6245.2.patch.txt, HIVE-6245.3.patch.txt, 
> HIVE-6245.4.patch, HIVE-6245.5.patch, HIVE-6245.patch
>
>
> The case with the following settings is valid but does not work correctly in 
> the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs= true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or 
> a JDBC application) and create DBs/Tables with that user's ownership.
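For illustration, a minimal JDBC sketch of the expected behavior (the host, port, database, and the user "alice" are made-up placeholders, not anything from the issue): a table created through an HS2 connection authenticated as alice should end up owned by alice, not by the user running HiveServer2.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ImpersonationCheck {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // Host, port, and the user "alice" are placeholders for illustration only.
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://hs2-host:10000/default", "alice", "");
         Statement stmt = conn.createStatement()) {
      stmt.execute("create table alice_tbl (id int)");
      // With doAs=true and setugi=true, the new table (and its warehouse
      // directory in HDFS) should be owned by alice rather than the HS2 user;
      // DESCRIBE FORMATTED alice_tbl shows the recorded owner.
    }
  }
}
{code}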



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25329: HIVE-7932: It may cause NP exception when add accessed columns to ReadEntity

2014-09-03 Thread Xiaomeng Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25329/
---

Review request for hive, Brock Noland, Prasad Mujumdar, and Szehon Ho.


Repository: hive-git


Description
---

When I execute a query with a view join, the view's type is table, but 
tableToColumnAccessMap does not store the view's name, so it throws a null 
pointer exception.
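As a rough illustration of the guard this is about (the method and types below are stand-ins, not the actual SemanticAnalyzer code; only the tableToColumnAccessMap name comes from the description above):

{code}
import java.util.List;
import java.util.Map;

public class AccessedColumnsGuard {
  // Stand-in for the fix idea: skip entities (e.g. views) that have no entry
  // in tableToColumnAccessMap instead of dereferencing a null column list.
  public static void addAccessedColumns(Map<String, List<String>> tableToColumnAccessMap,
                                        String entityName,
                                        List<String> target) {
    List<String> cols = tableToColumnAccessMap.get(entityName);
    if (cols == null) {
      return; // no tracked columns for this entity; avoid the NullPointerException
    }
    target.addAll(cols);
  }
}
{code}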


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 

Diff: https://reviews.apache.org/r/25329/diff/


Testing
---


Thanks,

Xiaomeng Huang



[jira] [Commented] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121008#comment-14121008
 ] 

Hive QA commented on HIVE-5760:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666390/HIVE-5760.93.patch

{color:green}SUCCESS:{color} +1 6166 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/630/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/630/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-630/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666390

> Add vectorized support for CHAR/VARCHAR data types
> --
>
> Key: HIVE-5760
> URL: https://issues.apache.org/jira/browse/HIVE-5760
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Matt McCline
> Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
> HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
> HIVE-5760.91.patch, HIVE-5760.92.patch, HIVE-5760.93.patch
>
>
> Add support to allow queries referencing VARCHAR columns and expression 
> results to run efficiently in vectorized mode. This should re-use the code 
> for the STRING type to the extent possible and beneficial. Include unit tests 
> and end-to-end tests. Consider re-using or extending existing end-to-end 
> tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-09-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120989#comment-14120989
 ] 

Thejas M Nair commented on HIVE-6245:
-

[~venki387] The test in TestHS2ImpersonationWithRemoteMS seems to be failing 
frequently. Would you be able to take a look at it?


> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Chaoyu Tang
>Assignee: Venki Korukanti
> Fix For: 0.14.0
>
> Attachments: HIVE-6245.2.patch.txt, HIVE-6245.3.patch.txt, 
> HIVE-6245.4.patch, HIVE-6245.5.patch, HIVE-6245.patch
>
>
> The case with the following settings is valid but does not work correctly in 
> the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs= true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or 
> a JDBC application) and create DBs/Tables with that user's ownership.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7972) hiveserver2 specific configuration file is not getting used

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120962#comment-14120962
 ] 

Hive QA commented on HIVE-7972:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666378/HIVE-7972.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6142 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/629/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/629/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-629/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666378

> hiveserver2 specific configuration file is not getting used
> ---
>
> Key: HIVE-7972
> URL: https://issues.apache.org/jira/browse/HIVE-7972
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7972.1.patch
>
>
> The HiveServer2-specific configuration file introduced in HIVE-7342 is not 
> getting used when hiveserver2 is started from the command line. This is because 
> the HiveConf used in HiveServer2.init is actually created before the 
> HiveServer2 constructor is called.
> The tests use an embedded HiveServer2, which does not go through 
> HiveServer2.main and therefore don't show this issue.
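A sketch of the general fix direction (not the actual patch; the startup code below is a simplified assumption): the HS2-specific resource has to be loaded into the HiveConf that is actually passed to init().

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hive.service.server.HiveServer2;

public class StartHS2Sketch {
  public static void main(String[] args) {
    // Simplified: add the HS2-specific overrides to the HiveConf that will
    // actually be used, before HiveServer2 is initialized with it.
    HiveConf conf = new HiveConf();
    conf.addResource("hiveserver2-site.xml");
    HiveServer2 server = new HiveServer2();
    server.init(conf);
    server.start();
  }
}
{code}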



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.

2014-09-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120958#comment-14120958
 ] 

Lefty Leverenz commented on HIVE-7100:
--

Thanks [~dbsalti], the javadoc comments look good.  I hope you'll get a code 
review soon.

> Users of hive should be able to specify skipTrash when dropping tables.
> ---
>
> Key: HIVE-7100
> URL: https://issues.apache.org/jira/browse/HIVE-7100
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Ravi Prakash
>Assignee: Jayesh
> Attachments: HIVE-7100.1.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, 
> HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.patch
>
>
> Users of our clusters are often running up against their quota limits because 
> of Hive tables. When they drop tables, they have to then manually delete the 
> files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
> should enable users to skipTrash directly when dropping tables.
> We should also be able to provide this functionality without polluting SQL 
> syntax.
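At the filesystem level the distinction looks roughly like the sketch below (the paths and the purge flag are placeholders; this is not the Hive patch itself): either move the table directory to the user's trash via the Hadoop trash API, or delete it outright.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class DropTableDataSketch {
  // Placeholder sketch: skipping the trash means deleting the table directory
  // directly, which frees quota immediately but is not recoverable.
  public static void dropTableDir(Configuration conf, Path tableDir, boolean purge)
      throws Exception {
    FileSystem fs = tableDir.getFileSystem(conf);
    if (purge) {
      fs.delete(tableDir, true);                          // permanent delete
    } else {
      Trash.moveToAppropriateTrash(fs, tableDir, conf);   // recoverable until trash expiry
    }
  }
}
{code}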



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25178: Add DROP TABLE PURGE

2014-09-03 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25178/#review52279
---


Javadoc changes look good.  Just one optional quibble.


metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java


You could add an explanation for UnsupportedOperationException but it seems 
self-explanatory (so I'm not opening an issue).


- Lefty Leverenz


On Sept. 2, 2014, 11:41 p.m., david seraf wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25178/
> ---
> 
> (Updated Sept. 2, 2014, 11:41 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add PURGE option to DROP TABLE command to skip saving table data to the trash
> 
> 
> Diffs
> -
> 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java
>  be7134f 
>   
> hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/tool/TestTempletonUtils.java
>  af952f2 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java
>  da51a55 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 9489949 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
> a94a7a3 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreFsImpl.java 
> cff0718 
>   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
> cbdba30 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreFS.java 
> a141793 
>   metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 613b709 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e387b8f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  4cf98d8 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> f31a409 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 32db0c7 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java ba30e1f 
>   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 406aae9 
>   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveRemote.java 1a5ba87 
>   ql/src/test/queries/clientpositive/drop_table_purge.q PRE-CREATION 
>   ql/src/test/results/clientpositive/drop_table_purge.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/25178/diff/
> 
> 
> Testing
> ---
> 
> Added a code test and a QL test.  Tests passed in CI, but other, unrelated 
> tests failed.
> 
> 
> Thanks,
> 
> david seraf
> 
>



[jira] [Commented] (HIVE-7956) When inserting into a bucketed table, all data goes to a single bucket [Spark Branch]

2014-09-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120926#comment-14120926
 ] 

Rui Li commented on HIVE-7956:
--

[~xuefuz], I copied the extra fields ({{hashCode}}, {{distKeyLength}}) for 
HiveKey. Now all the reducers get some data, but I ran into another issue: most 
of the records go to just one reducer, and records with the same key can go to 
different reducers. I'll investigate this further.
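For reference, one way to express that idea (a sketch only, not the Spark-branch code) is a Spark partitioner that relies solely on the hash code carried by the key, so rows with equal keys always map to the same reducer:

{code}
import org.apache.spark.Partitioner;

public class HiveKeyPartitioner extends Partitioner {
  private final int numPartitions;

  public HiveKeyPartitioner(int numPartitions) {
    this.numPartitions = numPartitions;
  }

  @Override
  public int numPartitions() {
    return numPartitions;
  }

  @Override
  public int getPartition(Object key) {
    // Use the hash code the key carries (e.g. the one HiveKey was given);
    // mask off the sign bit before taking the modulus.
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}
{code}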

> When inserting into a bucketed table, all data goes to a single bucket [Spark 
> Branch]
> -
>
> Key: HIVE-7956
> URL: https://issues.apache.org/jira/browse/HIVE-7956
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>
> I created a bucketed table:
> {code}
> create table testBucket(x int,y string) clustered by(x) into 10 buckets;
> {code}
> Then I run a query like:
> {code}
> set hive.enforce.bucketing = true;
> insert overwrite table testBucket select intCol,stringCol from src;
> {code}
> Here {{src}} is a simple textfile-based table containing 4000 records 
> (not bucketed). The query launches 10 reduce tasks but all the data goes to 
> only one of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7956) When inserting into a bucketed table, all data goes to a single bucket [Spark Branch]

2014-09-03 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120922#comment-14120922
 ] 

Xuefu Zhang commented on HIVE-7956:
---

Hi [~lirui], if your suspicion is confirmed by your follow-up investigation, 
maybe this is an opportunity for us to use HiveKey instead of BytesWritable, 
since join seemingly also relies on this. You can create a separate JIRA for 
that.

> When inserting into a bucketed table, all data goes to a single bucket [Spark 
> Branch]
> -
>
> Key: HIVE-7956
> URL: https://issues.apache.org/jira/browse/HIVE-7956
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>
> I created a bucketed table:
> {code}
> create table testBucket(x int,y string) clustered by(x) into 10 buckets;
> {code}
> Then I run a query like:
> {code}
> set hive.enforce.bucketing = true;
> insert overwrite table testBucket select intCol,stringCol from src;
> {code}
> Here {{src}} is a simple textfile-based table containing 4000 records 
> (not bucketed). The query launches 10 reduce tasks but all the data goes to 
> only one of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7971) Support alter table change/replace/add columns for existing partitions

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120921#comment-14120921
 ] 

Hive QA commented on HIVE-7971:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666373/HIVE-7971.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6143 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/628/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/628/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-628/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666373

> Support alter table change/replace/add columns for existing partitions
> --
>
> Key: HIVE-7971
> URL: https://issues.apache.org/jira/browse/HIVE-7971
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-7971.1.patch
>
>
> ALTER TABLE CHANGE COLUMN is allowed for tables, but not for partitions. Same 
> for add/replace columns.
> Allowing this for partitions can be useful in some cases. For example, one 
> user has tables with Hive 0.12 Decimal columns, which do not specify 
> precision/scale. To be able to properly read the decimal values from the 
> existing partitions, the column types in the partitions need to be changed to 
> decimal types with precision/scale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.99.patch

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
> HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
> HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.
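As a toy illustration of why this is cheap (this is not the VectorGroupByOperator code), aggregating a batch known to hold values for a single key is just a tight loop over the column vector, with no per-row hash lookups:

{code}
public class SingleKeyBatchAggregation {
  /** Sum one long column of a batch whose rows all share the same group key. */
  public static long sumSingleKeyBatch(long[] columnVector, int batchSize) {
    long sum = 0;
    for (int i = 0; i < batchSize; i++) {
      sum += columnVector[i];
    }
    return sum; // flushed to the aggregation buffer when the key changes
  }
}
{code}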



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Patch Available  (was: In Progress)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
> HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
> HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5871) Use multiple-characters as field delimiter

2014-09-03 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120883#comment-14120883
 ] 

Brock Noland commented on HIVE-5871:


+1

> Use multiple-characters as field delimiter
> --
>
> Key: HIVE-5871
> URL: https://issues.apache.org/jira/browse/HIVE-5871
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.12.0
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-5871.2.patch, HIVE-5871.3.patch, HIVE-5871.4.patch, 
> HIVE-5871.5.patch, HIVE-5871.6.patch, HIVE-5871.patch
>
>
> By default, Hive only allows users to use a single character as the field 
> delimiter. Although there's RegexSerDe to specify a multiple-character 
> delimiter, it can be daunting to use, especially for amateurs.
> In the patch, I add a new SerDe named MultiDelimitSerDe. With 
> MultiDelimitSerDe, users can specify a multiple-character field delimiter 
> when creating tables, in a way most similar to typical table creation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7580) Support dynamic partitioning [Spark Branch]

2014-09-03 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120882#comment-14120882
 ] 

Brock Noland commented on HIVE-7580:


[~chinnalalam] I think this is missing the test properties update?

> Support dynamic partitioning [Spark Branch]
> ---
>
> Key: HIVE-7580
> URL: https://issues.apache.org/jira/browse/HIVE-7580
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Chinna Rao Lalam
>  Labels: Spark-M1
> Attachments: HIVE-7580.patch
>
>
> My understanding is that we don't need to do anything special for this. 
> However, this needs to be verified and tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25176: HIVE-7870: Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-09-03 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25176/#review52271
---


Thank you for the update Na! I have a couple minor comments and then we can 
commit.


ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java


Is there any reason not to init this to a new HashMap...?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java


Can we create this by default?


- Brock Noland


On Aug. 29, 2014, 8:59 p.m., Na Yang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25176/
> ---
> 
> (Updated Aug. 29, 2014, 8:59 p.m.)
> 
> 
> Review request for hive, Brock Noland, Szehon Ho, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7870
> https://issues.apache.org/jira/browse/HIVE-7870
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-7870: Insert overwrite table query does not generate correct task plan 
> [Spark Branch]
> 
> The cause of this problem is that during spark/tez task generation, the union 
> file sink operator is cloned into two new filesink operators. The 
> linkedFileSinkDesc info for those new filesink operators is missing. In 
> addition, the two new filesink operators also need to be linked together.
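Purely as an illustration of what "linked together" means here (the FileSinkDescLike type and its fields below are hypothetical stand-ins, not Hive's FileSinkDesc API): both clones should share the linked-file-sink info and reference each other.

{code}
import java.util.ArrayList;
import java.util.List;

public class LinkClonedFileSinks {
  // Hypothetical stand-in for the descriptor carrying linked-file-sink info.
  static class FileSinkDescLike {
    List<FileSinkDescLike> linkedFileSinkDesc = new ArrayList<>();
    boolean linkedFileSink;
  }

  // After cloning the union file sink into two descriptors, link them so each
  // one knows about the other and is marked as a linked file sink.
  public static void link(FileSinkDescLike a, FileSinkDescLike b) {
    List<FileSinkDescLike> linked = new ArrayList<>();
    linked.add(a);
    linked.add(b);
    a.linkedFileSinkDesc = linked;
    b.linkedFileSinkDesc = linked;
    a.linkedFileSink = true;
    b.linkedFileSink = true;
  }
}
{code}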
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 88ef4f8 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 9c808d4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 5ddc16d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> 379a39c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 76fc290 
>   ql/src/test/queries/clientpositive/union_remove_1.q c87b3fe 
>   ql/src/test/queries/clientpositive/union_remove_10.q 6701952 
>   ql/src/test/queries/clientpositive/union_remove_11.q 4b2fa42 
>   ql/src/test/queries/clientpositive/union_remove_12.q 69d0d0a 
>   ql/src/test/queries/clientpositive/union_remove_13.q 7605f0e 
>   ql/src/test/queries/clientpositive/union_remove_14.q a4fdfc8 
>   ql/src/test/queries/clientpositive/union_remove_15.q e3c937b 
>   ql/src/test/queries/clientpositive/union_remove_16.q 537078b 
>   ql/src/test/queries/clientpositive/union_remove_17.q d70f3d3 
>   ql/src/test/queries/clientpositive/union_remove_18.q 6352bc3 
>   ql/src/test/queries/clientpositive/union_remove_19.q 8c45953 
>   ql/src/test/queries/clientpositive/union_remove_2.q 83cd288 
>   ql/src/test/queries/clientpositive/union_remove_20.q f80f7c1 
>   ql/src/test/queries/clientpositive/union_remove_21.q 8963c25 
>   ql/src/test/queries/clientpositive/union_remove_22.q b0c1ccd 
>   ql/src/test/queries/clientpositive/union_remove_23.q a1b989a 
>   ql/src/test/queries/clientpositive/union_remove_24.q ec561e0 
>   ql/src/test/queries/clientpositive/union_remove_25.q 76c1ff5 
>   ql/src/test/queries/clientpositive/union_remove_3.q 9617f73 
>   ql/src/test/queries/clientpositive/union_remove_4.q cae323b 
>   ql/src/test/queries/clientpositive/union_remove_5.q 5df84e1 
>   ql/src/test/queries/clientpositive/union_remove_6.q bfce26d 
>   ql/src/test/queries/clientpositive/union_remove_7.q 3a95674 
>   ql/src/test/queries/clientpositive/union_remove_8.q a83a43e 
>   ql/src/test/queries/clientpositive/union_remove_9.q e71f6dd 
>   ql/src/test/results/clientpositive/spark/union10.q.out 20c681e 
>   ql/src/test/results/clientpositive/spark/union18.q.out 3f37a0a 
>   ql/src/test/results/clientpositive/spark/union19.q.out 6922fcd 
>   ql/src/test/results/clientpositive/spark/union28.q.out 8bd5218 
>   ql/src/test/results/clientpositive/spark/union29.q.out b9546ef 
>   ql/src/test/results/clientpositive/spark/union3.q.out 3ae6536 
>   ql/src/test/results/clientpositive/spark/union30.q.out 12717a1 
>   ql/src/test/results/clientpositive/spark/union33.q.out b89757f 
>   ql/src/test/results/clientpositive/spark/union4.q.out 6341cd9 
>   ql/src/test/results/clientpositive/spark/union6.q.out 263d9f4 
>   ql/src/test/results/clientpositive/spark/union_remove_10.q.out 927a15d 
>   ql/src/test/results/clientpositive/spark/union_remove_11.q.out 96651e1 
>   ql/src/test/results/clientpositive/spark/union_remove_16.q.out 0954ae4 
>   ql/src/test/results/clientpositive/spark/union_remove_4.q.out cc46dda 
>   ql/src/test/results/clientpositive/spark/union_remove_5.q.out f6cdeb3 
>   ql/src/test/results/clientpositive/spark/union_remove_9.q.out 1f0260c 
> 
> Diff: https://reviews.apache.org/r/25176/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Na Yang
> 
>



[jira] [Commented] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120881#comment-14120881
 ] 

Hive QA commented on HIVE-6847:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666336/HIVE-6847.10.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6138 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.org.apache.hive.service.TestHS2ImpersonationWithRemoteMS
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/627/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/627/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-627/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666336

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.10.patch, 
> HIVE-6847.2.patch, HIVE-6847.3.patch, HIVE-6847.4.patch, HIVE-6847.5.patch, 
> HIVE-6847.6.patch, HIVE-6847.7.patch, HIVE-6847.8.patch, HIVE-6847.9.patch
>
>
> Currently, the Hive server creates the scratch directory and changes its 
> permission to 777; however, this is not great with respect to security. We need 
> to create user-specific scratch directories instead. Also refer to the 1st 
> iteration of the patch in HIVE-6782 for the approach.
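A minimal sketch of the user-specific approach using the Hadoop filesystem API (the directory layout and the 0700 mode are assumptions, not the committed patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class UserScratchDir {
  // Create <scratchRoot>/<userName> with owner-only access instead of one
  // world-writable scratch directory shared by all users.
  public static Path ensureUserScratchDir(Configuration conf, Path scratchRoot, String userName)
      throws Exception {
    FileSystem fs = scratchRoot.getFileSystem(conf);
    Path userDir = new Path(scratchRoot, userName);
    FsPermission perm = new FsPermission((short) 0700);
    if (!fs.exists(userDir)) {
      fs.mkdirs(userDir, perm);
      fs.setPermission(userDir, perm); // enforce the mode regardless of umask
    }
    return userDir;
  }
}
{code}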



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7976) Merge tez branch into trunk (tez 0.5.0)

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7976:
-
Status: Patch Available  (was: Open)

> Merge tez branch into trunk (tez 0.5.0)
> ---
>
> Key: HIVE-7976
> URL: https://issues.apache.org/jira/browse/HIVE-7976
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gopal V
> Attachments: HIVE-7976.1.patch
>
>
> Tez 0.5.0 release is available now. 
> (https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-api/0.5.0/)
> In Tez 0.5.0 a lot of APIs have changed, and we were doing dev against these 
> APIs in the tez branch, until they've become stable and available.
> [~gopalv] has been driving a lot of the API changes necessary, but [~sseth], 
> [~rajesh.balamohan], [~vikram.dixit] and myself have chimed in as well.
> One new feature (dynamic partition pruning, HIVE-7826) has also been parked 
> on this branch because of dependencies to APIs first released in 0.5.0.
> This ticket is to merge the tez branch back to trunk. I'll post patches for 
> review and for the unit tests to run, but once the required +1s are there the 
> goal is to merge to keep the history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7976) Merge tez branch into trunk (tez 0.5.0)

2014-09-03 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120872#comment-14120872
 ] 

Gunther Hagleitner commented on HIVE-7976:
--

Review board: https://reviews.apache.org/r/25323/

> Merge tez branch into trunk (tez 0.5.0)
> ---
>
> Key: HIVE-7976
> URL: https://issues.apache.org/jira/browse/HIVE-7976
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gopal V
> Attachments: HIVE-7976.1.patch
>
>
> Tez 0.5.0 release is available now. 
> (https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-api/0.5.0/)
> In Tez 0.5.0 a lot of APIs have changed, and we were doing dev against these 
> APIs in the tez branch, until they've become stable and available.
> [~gopalv] has been driving a lot of the API changes necessary, but [~sseth], 
> [~rajesh.balamohan], [~vikram.dixit] and myself have chimed in as well.
> One new feature (dynamic partition pruning, HIVE-7826) has also been parked 
> on this branch because of dependencies to APIs first released in 0.5.0.
> This ticket is to merge the tez branch back to trunk. I'll post patches for 
> review and for the unit tests to run, but once the required +1s are there the 
> goal is to merge to keep the history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7976) Merge tez branch into trunk (tez 0.5.0)

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7976:
-
Attachment: HIVE-7976.1.patch

.1 is the combined delta of tez branch + trunk

> Merge tez branch into trunk (tez 0.5.0)
> ---
>
> Key: HIVE-7976
> URL: https://issues.apache.org/jira/browse/HIVE-7976
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gopal V
> Attachments: HIVE-7976.1.patch
>
>
> Tez 0.5.0 release is available now. 
> (https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-api/0.5.0/)
> In Tez 0.5.0 a lot of APIs have changed, and we were doing dev against these 
> APIs in the tez branch, until they've become stable and available.
> [~gopalv] has been driving a lot of the API changes necessary, but [~sseth], 
> [~rajesh.balamohan], [~vikram.dixit] and myself have chimed in as well.
> One new feature (dynamic partition pruning, HIVE-7826) has also been parked 
> on this branch because of dependencies to APIs first released in 0.5.0.
> This ticket is to merge the tez branch back to trunk. I'll post patches for 
> review and for the unit tests to run, but once the required +1s are there the 
> goal is to merge to keep the history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6988) Hive changes for tez-0.5.x compatibility

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6988:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

All changes committed to branch.

> Hive changes for tez-0.5.x compatibility
> 
>
> Key: HIVE-6988
> URL: https://issues.apache.org/jira/browse/HIVE-6988
> Project: Hive
>  Issue Type: Task
>Reporter: Gopal V
> Attachments: HIVE-6988.1.patch, HIVE-6988.2.patch, HIVE-6988.3.patch, 
> HIVE-6988.4.patch, HIVE-6988.5.patch, HIVE-6988.5.patch, HIVE-6988.6.patch
>
>
> Umbrella JIRA to track all hive changes needed for tez-0.5.x compatibility.
> tez-0.4.x -> tez.0.5.x is going to break backwards compat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6988) Hive changes for tez-0.5.x compatibility

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6988:
-
Assignee: Gopal V

> Hive changes for tez-0.5.x compatibility
> 
>
> Key: HIVE-6988
> URL: https://issues.apache.org/jira/browse/HIVE-6988
> Project: Hive
>  Issue Type: Task
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-6988.1.patch, HIVE-6988.2.patch, HIVE-6988.3.patch, 
> HIVE-6988.4.patch, HIVE-6988.5.patch, HIVE-6988.5.patch, HIVE-6988.6.patch
>
>
> Umbrella JIRA to track all hive changes needed for tez-0.5.x compatibility.
> tez-0.4.x -> tez.0.5.x is going to break backwards compat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7976) Merge tez branch into trunk (tez 0.5.0)

2014-09-03 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120858#comment-14120858
 ] 

Gunther Hagleitner commented on HIVE-7976:
--

HIVE-6988 has the API compat changes (umbrella jira)

> Merge tez branch into trunk (tez 0.5.0)
> ---
>
> Key: HIVE-7976
> URL: https://issues.apache.org/jira/browse/HIVE-7976
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gopal V
>
> Tez 0.5.0 release is available now. 
> (https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-api/0.5.0/)
> In Tez 0.5.0 a lot of APIs have changed, and we were doing dev against these 
> APIs in the tez branch, until they've become stable and available.
> [~gopalv] has been driving a lot of the API changes necessary, but [~sseth], 
> [~rajesh.balamohan], [~vikram.dixit] and myself have chimed in as well.
> One new feature (dynamic partition pruning, HIVE-7826) has also been parked 
> on this branch because of dependencies to APIs first released in 0.5.0.
> This ticket is to merge the tez branch back to trunk. I'll post patches for 
> review and for the unit tests to run, but once the required +1s are there the 
> goal is to merge to keep the history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7976) Merge tez branch into trunk (tez 0.5.0)

2014-09-03 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120859#comment-14120859
 ] 

Gunther Hagleitner commented on HIVE-7976:
--

HIVE-7826 is the jira for dynamic partition pruning, which relies on a new Tez 
event API.

> Merge tez branch into trunk (tez 0.5.0)
> ---
>
> Key: HIVE-7976
> URL: https://issues.apache.org/jira/browse/HIVE-7976
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gopal V
>
> Tez 0.5.0 release is available now. 
> (https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-api/0.5.0/)
> In Tez 0.5.0 a lot of APIs have changed, and we were doing dev against these 
> APIs in the tez branch, until they've become stable and available.
> [~gopalv] has been driving a lot of the API changes necessary, but [~sseth], 
> [~rajesh.balamohan], [~vikram.dixit] and myself have chimed in as well.
> One new feature (dynamic partition pruning, HIVE-7826) has also been parked 
> on this branch because of dependencies to APIs first released in 0.5.0.
> This ticket is to merge the tez branch back to trunk. I'll post patches for 
> review and for the unit tests to run, but once the required +1s are there the 
> goal is to merge to keep the history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7976) Merge tez branch into trunk (tez 0.5.0)

2014-09-03 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-7976:


 Summary: Merge tez branch into trunk (tez 0.5.0)
 Key: HIVE-7976
 URL: https://issues.apache.org/jira/browse/HIVE-7976
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gopal V


Tez 0.5.0 release is available now. 
(https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-api/0.5.0/)

In Tez 0.5.0 a lot of APIs have changed, and we were doing dev against these 
APIs in the tez branch until they became stable and available.

[~gopalv] has been driving a lot of the API changes necessary, but [~sseth], 
[~rajesh.balamohan], [~vikram.dixit] and myself have chimed in as well.

One new feature (dynamic partition pruning, HIVE-7826) has also been parked on 
this branch because of dependencies on APIs first released in 0.5.0.

This ticket is to merge the tez branch back to trunk. I'll post patches for 
review and for the unit tests to run, but once the required +1s are there, the 
goal is to do the merge so as to keep the history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7508) Kerberos support for streaming

2014-09-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120854#comment-14120854
 ] 

Roshan Naik commented on HIVE-7508:
---

[~leftylev] or [~ashutoshc] can you grant me write permission on that wiki?

> Kerberos support for streaming
> --
>
> Key: HIVE-7508
> URL: https://issues.apache.org/jira/browse/HIVE-7508
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: Streaming, TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7508.patch
>
>
> Add kerberos support for streaming to secure Hive cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7956) When inserting into a bucketed table, all data goes to a single bucket [Spark Branch]

2014-09-03 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120853#comment-14120853
 ] 

Brock Noland commented on HIVE-7956:


Gotcha

> When inserting into a bucketed table, all data goes to a single bucket [Spark 
> Branch]
> -
>
> Key: HIVE-7956
> URL: https://issues.apache.org/jira/browse/HIVE-7956
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>
> I created a bucketed table:
> {code}
> create table testBucket(x int,y string) clustered by(x) into 10 buckets;
> {code}
> Then I run a query like:
> {code}
> set hive.enforce.bucketing = true;
> insert overwrite table testBucket select intCol,stringCol from src;
> {code}
> Here {{src}} is a simple textfile-based table containing 4000 records 
> (not bucketed). The query launches 10 reduce tasks but all the data goes to 
> only one of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7947) Add message at the end of each testcase with timestamp in Webhcat system tests

2014-09-03 Thread Jagruti Varia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120845#comment-14120845
 ] 

Jagruti Varia commented on HIVE-7947:
-

Thanks all for reviewing and merging this change.

> Add message at the end of each testcase with timestamp in Webhcat system tests
> --
>
> Key: HIVE-7947
> URL: https://issues.apache.org/jira/browse/HIVE-7947
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests, WebHCat
>Reporter: Jagruti Varia
>Assignee: Jagruti Varia
>Priority: Trivial
> Fix For: 0.14.0
>
> Attachments: HIVE-7947.1.patch
>
>
> Currently, the WebHCat e2e test suite only prints a message when starting a 
> test run:
> {noformat}
> Beginning test  at 1406716992
> {noformat}
> It should also print an ending message with a timestamp, similar to this:
> {noformat}
> Ending test  at 1406717992
> {noformat}
> This change will make log collection easier for failed test cases. 
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7947) Add message at the end of each testcase with timestamp in Webhcat system tests

2014-09-03 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7947:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks for the contribution Jagruti!


> Add message at the end of each testcase with timestamp in Webhcat system tests
> --
>
> Key: HIVE-7947
> URL: https://issues.apache.org/jira/browse/HIVE-7947
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests, WebHCat
>Reporter: Jagruti Varia
>Assignee: Jagruti Varia
>Priority: Trivial
> Fix For: 0.14.0
>
> Attachments: HIVE-7947.1.patch
>
>
> Currently, the WebHCat e2e test suite only prints a message when starting a 
> test run:
> {noformat}
> Beginning test  at 1406716992
> {noformat}
> It should also print an ending message with a timestamp, similar to this:
> {noformat}
> Ending test  at 1406717992
> {noformat}
> This change will make log collection easier for failed test cases. 
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7975) HS2 memory optimization: Internalizing instance fields of Thrift-generated metastore API classes

2014-09-03 Thread Wilbur Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilbur Yang updated HIVE-7975:
--
Status: Patch Available  (was: Open)

> HS2 memory optimization: Internalizing instance fields of Thrift-generated 
> metastore API classes
> 
>
> Key: HIVE-7975
> URL: https://issues.apache.org/jira/browse/HIVE-7975
> Project: Hive
>  Issue Type: Improvement
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
> Fix For: 0.14.0
>
> Attachments: HIVE-7975-doc.pdf, HIVE-7975.1.patch
>
>
> We should internalize the String-based instance fields of the metastore API 
> classes FieldSchema, Partition, SerDeInfo, and StorageDescriptor in order to 
> save memory. In a test environment with data consisting of about 1800 
> partitions, the proposed changes are able to save about 24% of old generation 
> memory during a complex query. See details in the attached document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7975) HS2 memory optimization: Internalizing instance fields of Thrift-generated metastore API classes

2014-09-03 Thread Wilbur Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilbur Yang updated HIVE-7975:
--
Attachment: HIVE-7975.1.patch
HIVE-7975-doc.pdf

> HS2 memory optimization: Internalizing instance fields of Thrift-generated 
> metastore API classes
> 
>
> Key: HIVE-7975
> URL: https://issues.apache.org/jira/browse/HIVE-7975
> Project: Hive
>  Issue Type: Improvement
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
> Fix For: 0.14.0
>
> Attachments: HIVE-7975-doc.pdf, HIVE-7975.1.patch
>
>
> We should internalize the String-based instance fields of the metastore API 
> classes FieldSchema, Partition, SerDeInfo, and StorageDescriptor in order to 
> save memory. In a test environment with data consisting of about 1800 
> partitions, the proposed changes are able to save about 24% of old generation 
> memory during a complex query. See details in the attached document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7975) HS2 memory optimization: Internalizing instance fields of Thrift-generated metastore API classes

2014-09-03 Thread Wilbur Yang (JIRA)
Wilbur Yang created HIVE-7975:
-

 Summary: HS2 memory optimization: Internalizing instance fields of 
Thrift-generated metastore API classes
 Key: HIVE-7975
 URL: https://issues.apache.org/jira/browse/HIVE-7975
 Project: Hive
  Issue Type: Improvement
Reporter: Wilbur Yang
Assignee: Wilbur Yang
 Fix For: 0.14.0


We should internalize the String-based instance fields of the metastore API 
classes FieldSchema, Partition, SerDeInfo, and StorageDescriptor in order to 
save memory. In a test environment with data consisting of about 1800 
partitions, the proposed changes are able to save about 24% of old generation 
memory during a complex query. See details in the attached document.
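The general idea, as a hedged sketch (the class below is illustrative; the actual patch modifies the Thrift-generated metastore classes named above, and its interning mechanism may differ): route String setters through a weak interner so the many identical location/column/serde strings across partitions share one object.

{code}
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

public class FieldSchemaLike {
  private static final Interner<String> STRING_INTERNER = Interners.newWeakInterner();

  private String name;
  private String type;

  public void setName(String name) {
    this.name = name == null ? null : STRING_INTERNER.intern(name);
  }

  public void setType(String type) {
    this.type = type == null ? null : STRING_INTERNER.intern(type);
  }
}
{code}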



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Status: Patch Available  (was: In Progress)

> Add vectorized support for CHAR/VARCHAR data types
> --
>
> Key: HIVE-5760
> URL: https://issues.apache.org/jira/browse/HIVE-5760
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Matt McCline
> Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
> HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
> HIVE-5760.91.patch, HIVE-5760.92.patch, HIVE-5760.93.patch
>
>
> Add support to allow queries referencing VARCHAR columns and expression 
> results to run efficiently in vectorized mode. This should re-use the code 
> for the STRING type to the extent possible and beneficial. Include unit tests 
> and end-to-end tests. Consider re-using or extending existing end-to-end 
> tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Attachment: HIVE-5760.93.patch

> Add vectorized support for CHAR/VARCHAR data types
> --
>
> Key: HIVE-5760
> URL: https://issues.apache.org/jira/browse/HIVE-5760
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Matt McCline
> Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
> HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
> HIVE-5760.91.patch, HIVE-5760.92.patch, HIVE-5760.93.patch
>
>
> Add support to allow queries referencing VARCHAR columns and expression 
> results to run efficiently in vectorized mode. This should re-use the code 
> for the STRING type to the extent possible and beneficial. Include unit tests 
> and end-to-end tests. Consider re-using or extending existing end-to-end 
> tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7956) When inserting into a bucketed table, all data goes to a single bucket [Spark Branch]

2014-09-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120807#comment-14120807
 ] 

Rui Li commented on HIVE-7956:
--

Yes [~brocknoland], with {{set hive.enforce.bucketing = true;}}, Spark also 
launches the proper number of reducers (according to the # of buckets). But all 
mappers write shuffle data for just one reducer, so all other buckets are empty. 
Therefore I think there's something wrong with the partitioning. I can try 
cloning the extra hashcode in {{HiveKey}}.

> When inserting into a bucketed table, all data goes to a single bucket [Spark 
> Branch]
> -
>
> Key: HIVE-7956
> URL: https://issues.apache.org/jira/browse/HIVE-7956
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>
> I created a bucketed table:
> {code}
> create table testBucket(x int,y string) clustered by(x) into 10 buckets;
> {code}
> Then I run a query like:
> {code}
> set hive.enforce.bucketing = true;
> insert overwrite table testBucket select intCol,stringCol from src;
> {code}
> Here {{src}} is a simple textfile-based table containing 4000 records 
> (not bucketed). The query launches 10 reduce tasks but all the data goes to 
> only one of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Status: In Progress  (was: Patch Available)

> Add vectorized support for CHAR/VARCHAR data types
> --
>
> Key: HIVE-5760
> URL: https://issues.apache.org/jira/browse/HIVE-5760
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Matt McCline
> Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
> HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
> HIVE-5760.91.patch, HIVE-5760.92.patch
>
>
> Add support to allow queries referencing VARCHAR columns and expression 
> results to run efficiently in vectorized mode. This should re-use the code 
> for the STRING type to the extent possible and beneficial. Include unit tests 
> and end-to-end tests. Consider re-using or extending existing end-to-end 
> tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7974) Notification Event Listener and Replication Task introduction

2014-09-03 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-7974:
--

 Summary: Notification Event Listener and Replication Task 
introduction
 Key: HIVE-7974
 URL: https://issues.apache.org/jira/browse/HIVE-7974
 Project: Hive
  Issue Type: Sub-task
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


We need to create a new hive module (say hive-repl?) to subsume the 
NotificationListener from HCatalog, and we need to introduce the notion of a 
ReplicationTask, which allows a tool using it to figure out what to do with a 
generated notification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7208) move SearchArgument interface into serde package

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120803#comment-14120803
 ] 

Hive QA commented on HIVE-7208:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666321/HIVE-7208.03.patch

{color:green}SUCCESS:{color} +1 6142 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/625/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/625/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-625/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666321

> move SearchArgument interface into serde package
> 
>
> Key: HIVE-7208
> URL: https://issues.apache.org/jira/browse/HIVE-7208
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-7208.01.patch, HIVE-7208.02.patch, 
> HIVE-7208.03.patch, HIVE-7208.patch
>
>
> For usage in alternative input formats/serdes, it might be useful to move 
> SearchArgument class to a place that is not in ql (because it's hard to 
> depend on ql).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7973) Hive Replication Support

2014-09-03 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-7973:
--

 Summary: Hive Replication Support
 Key: HIVE-7973
 URL: https://issues.apache.org/jira/browse/HIVE-7973
 Project: Hive
  Issue Type: Bug
  Components: Import/Export
Reporter: Sushanth Sowmyan


A need for replication is a common one in many database management systems, and 
it's important for hive to evolve support for such a tool as part of its 
ecosystem. Hive already supports EXPORT and IMPORT commands, which can be 
used to dump out tables, distcp them to another cluster, and import/create 
from that. If we had a mechanism by which exports and imports could be 
automated, it would establish the base on which replication can be developed.

One place where this kind of automation can be developed is with the aid of the 
HiveMetaStoreEventHandler mechanisms: generate notifications when certain 
changes are committed to the metastore, and then translate those notifications 
into export actions, distcp actions, and import actions on the other cluster.

Part of that already exists with the Notification system that is part of 
hcatalog-server-extensions. Initially, this was developed to be able to trigger 
a JMS notification, which an Oozie workflow can use to start off actions 
keyed on the finishing of a job that used HCatalog to write to a table. While 
this currently lives under hcatalog, the primary reason for its existence has a 
scope well past hcatalog alone, and can be used as-is without the use of 
HCatalog IF/OF. This can be extended, with the help of a library which does 
that aforementioned translation. I also think that these sections should live 
in a core hive module, rather than being tucked away inside hcatalog.

Once we have rudimentary support for table & partition replication, we can then 
move on to further requirements of replication, such as metadata replications 
(such as replication of changes to roles/etc), and/or optimize away the 
requirement to distcp and use webhdfs instead, etc.

This Story tracks all the bits that go into development of such a system - I'll 
create multiple smaller tasks inside this as we go on.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7208) move SearchArgument interface into serde package

2014-09-03 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120773#comment-14120773
 ] 

Daniel Dai commented on HIVE-7208:
--

Pig needs to change accordingly. But since the code (which will be in Pig 
0.14.0) is not released yet, it should be fine for us; we will pull in 
hive-serde anyway.

> move SearchArgument interface into serde package
> 
>
> Key: HIVE-7208
> URL: https://issues.apache.org/jira/browse/HIVE-7208
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-7208.01.patch, HIVE-7208.02.patch, 
> HIVE-7208.03.patch, HIVE-7208.patch
>
>
> For usage in alternative input formats/serdes, it might be useful to move 
> SearchArgument class to a place that is not in ql (because it's hard to 
> depend on ql).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7811) Compactions need to update table/partition stats

2014-09-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120766#comment-14120766
 ] 

Eugene Koifman commented on HIVE-7811:
--

This failure is not related.  I verified locally that {code}mvn clean test 
-Phadoop-2  -Dtest=TestCliDriver -Dtest.output.overwrite=true 
-Dqfile_regex=schemeAuthority{code} passes.

> Compactions need to update table/partition stats
> 
>
> Key: HIVE-7811
> URL: https://issues.apache.org/jira/browse/HIVE-7811
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7811.3.patch, HIVE-7811.4.patch, HIVE-7811.5.patch, 
> HIVE-7811.6.patch
>
>
> Compactions should trigger stats recalculation for columns which already have 
> stats.
> https://reviews.apache.org/r/25201/
> Major compactions will cause the Compactor to see which columns already have 
> stats and run the analyze command for those columns.  If compacting a 
> partition, then stats for that partition will be computed.  If the table is 
> not partitioned, then stats for the whole table will be computed.
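
As a rough illustration of the analyze step described above (a hedged sketch, 
not the actual compactor code; the method and parameter names are made up for 
the example), the statement issued per table or partition could be assembled 
like this:

{code}
import java.util.List;

// Hypothetical helper: build the ANALYZE statement for the columns that
// already have statistics. partSpec is null for unpartitioned tables,
// e.g. "ds='2008-04-08'" otherwise.
static String buildAnalyzeCommand(String dbName, String tableName,
    String partSpec, List<String> columnsWithStats) {
  StringBuilder sb = new StringBuilder("ANALYZE TABLE ")
      .append(dbName).append('.').append(tableName);
  if (partSpec != null) {
    sb.append(" PARTITION (").append(partSpec).append(")");
  }
  sb.append(" COMPUTE STATISTICS FOR COLUMNS ");
  sb.append(String.join(", ", columnsWithStats));
  return sb.toString();
}
{code}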



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7972) hiveserver2 specific configuration file is not getting used

2014-09-03 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120754#comment-14120754
 ] 

Jason Dere commented on HIVE-7972:
--

+1

> hiveserver2 specific configuration file is not getting used
> ---
>
> Key: HIVE-7972
> URL: https://issues.apache.org/jira/browse/HIVE-7972
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7972.1.patch
>
>
> The HiveServer2-specific configuration file introduced in HIVE-7342 is not 
> getting used when hiveserver2 is started from the command line. This is 
> because the HiveConf used in HiveServer2.init is actually created before the 
> HiveServer2 constructor is called.
> The tests use embedded HiveServer2, which does not go through 
> HiveServer2.main and therefore does not show this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7811) Compactions need to update table/partition stats

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120733#comment-14120733
 ] 

Hive QA commented on HIVE-7811:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665991/HIVE-7811.6.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6143 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/624/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/624/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-624/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665991

> Compactions need to update table/partition stats
> 
>
> Key: HIVE-7811
> URL: https://issues.apache.org/jira/browse/HIVE-7811
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7811.3.patch, HIVE-7811.4.patch, HIVE-7811.5.patch, 
> HIVE-7811.6.patch
>
>
> Compactions should trigger stats recalculation for columns which already have 
> stats.
> https://reviews.apache.org/r/25201/
> Major compactions will cause the Compactor to see which columns already have 
> stats and run the analyze command for those columns.  If compacting a 
> partition, then stats for that partition will be computed.  If the table is 
> not partitioned, then stats for the whole table will be computed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7972) hiveserver2 specific configuration file is not getting used

2014-09-03 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7972:

Attachment: HIVE-7972.1.patch

Hive metastore is already doing the same thing by updating HiveConf in 
HiveMetastore.main.

Tested this manually. MiniHS2 and embedded HS2 do not use this code path, so 
this can't be reproduced using MiniHS2.
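
A minimal sketch of that ordering fix (the launcher class name and the exact 
shape are assumptions, not the actual patch): create the HiveConf inside 
main(), after process-level setup, and hand that same instance to init() so 
hiveserver2-site.xml overrides are not lost to a conf object built earlier.

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hive.service.server.HiveServer2;

public class HiveServer2Launcher {
  public static void main(String[] args) {
    // Create the conf late so it picks up hiveserver2-site.xml, and pass this
    // exact instance to init() instead of one created during class loading.
    HiveConf hiveConf = new HiveConf();
    HiveServer2 server = new HiveServer2();
    server.init(hiveConf);
    server.start();
  }
}
{code}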



> hiveserver2 specific configuration file is not getting used
> ---
>
> Key: HIVE-7972
> URL: https://issues.apache.org/jira/browse/HIVE-7972
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
> Attachments: HIVE-7972.1.patch
>
>
> The HiveServer2-specific configuration file introduced in HIVE-7342 is not 
> getting used when hiveserver2 is started from the command line. This is 
> because the HiveConf used in HiveServer2.init is actually created before the 
> HiveServer2 constructor is called.
> The tests use embedded HiveServer2, which does not go through 
> HiveServer2.main and therefore does not show this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7972) hiveserver2 specific configuration file is not getting used

2014-09-03 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7972:

Assignee: Thejas M Nair
  Status: Patch Available  (was: Open)

> hiveserver2 specific configuration file is not getting used
> ---
>
> Key: HIVE-7972
> URL: https://issues.apache.org/jira/browse/HIVE-7972
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7972.1.patch
>
>
> The HiveServer2-specific configuration file introduced in HIVE-7342 is not 
> getting used when hiveserver2 is started from the command line. This is 
> because the HiveConf used in HiveServer2.init is actually created before the 
> HiveServer2 constructor is called.
> The tests use embedded HiveServer2, which does not go through 
> HiveServer2.main and therefore does not show this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7972) hiveserver2 specific configuration file is not getting used

2014-09-03 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-7972:
---

 Summary: hiveserver2 specific configuration file is not getting 
used
 Key: HIVE-7972
 URL: https://issues.apache.org/jira/browse/HIVE-7972
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair


The HiveServer2-specific configuration file introduced in HIVE-7342 is not 
getting used when hiveserver2 is started from the command line. This is because 
the HiveConf used in HiveServer2.init is actually created before the 
HiveServer2 constructor is called.
The tests use embedded HiveServer2, which does not go through HiveServer2.main 
and therefore does not show this issue.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7969) Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7969:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to cbo branch.

> Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer
> --
>
> Key: HIVE-7969
> URL: https://issues.apache.org/jira/browse/HIVE-7969
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-7969.patch
>
>
> After patch series of OPTIQ-391 , OPTIQ-392 , OPTIQ-395 , OPTIQ-396 its now 
> possible to use Optiq's native FieldTrimmer. So, lets use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25320: HIVE-7971: Support alter table change/replace/add columns for existing partitions

2014-09-03 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25320/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-7971
https://issues.apache.org/jira/browse/HIVE-7971


Repository: hive-git


Description
---

Allow change/replace/add column to work on partitions


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e42bbdd 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 05cde3e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 25cd3a5 
  ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 8517319 
  ql/src/test/queries/clientpositive/alter_partition_change_col.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter_partition_change_col.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/25320/diff/


Testing
---

New qfile test added


Thanks,

Jason Dere



Re: Review Request 25313: Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25313/#review52263
---

Ship it!


Ship It!

- John Pullokkaran


On Sept. 3, 2014, 9:35 p.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25313/
> ---
> 
> (Updated Sept. 3, 2014, 9:35 p.m.)
> 
> 
> Review request for hive and Harish Butani.
> 
> 
> Bugs: HIVE-7969
> https://issues.apache.org/jira/browse/HIVE-7969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
> e9b258e 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/TraitsUtil.java 
> e8069ee 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveAggregateRel.java
>  1588cdf 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveJoinRel.java
>  6a3410b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveProjectRel.java
>  8cbf2f1 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveSortRel.java
>  1c42a29 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveUnionRel.java
>  b81f3c8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HiveRelFieldTrimmer.java
>  c28f974 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 8fe8b3c 
> 
> Diff: https://reviews.apache.org/r/25313/diff/
> 
> 
> Testing
> ---
> 
> cbo_correctness.q passes
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>



[jira] [Updated] (HIVE-7971) Support alter table change/replace/add columns for existing partitions

2014-09-03 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7971:
-
Status: Patch Available  (was: Open)

> Support alter table change/replace/add columns for existing partitions
> --
>
> Key: HIVE-7971
> URL: https://issues.apache.org/jira/browse/HIVE-7971
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-7971.1.patch
>
>
> ALTER TABLE CHANGE COLUMN is allowed for tables, but not for partitions. Same 
> for add/replace columns.
> Allowing this for partitions can be useful in some cases. For example, one 
> user has tables with Hive 0.12 Decimal columns, which do not specify 
> precision/scale. To be able to properly read the decimal values from the 
> existing partitions, the column types in the partitions need to be changed to 
> decimal types with precision/scale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7971) Support alter table change/replace/add columns for existing partitions

2014-09-03 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7971:
-
Attachment: HIVE-7971.1.patch

> Support alter table change/replace/add columns for existing partitions
> --
>
> Key: HIVE-7971
> URL: https://issues.apache.org/jira/browse/HIVE-7971
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-7971.1.patch
>
>
> ALTER TABLE CHANGE COLUMN is allowed for tables, but not for partitions. Same 
> for add/replace columns.
> Allowing this for partitions can be useful in some cases. For example, one 
> user has tables with Hive 0.12 Decimal columns, which do not specify 
> precision/scale. To be able to properly read the decimal values from the 
> existing partitions, the column types in the partitions need to be changed to 
> decimal types with precision/scale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.98.patch

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
> HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
> HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.
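
As a very rough illustration of that reduce-side mode (a sketch under the 
assumption that the whole batch carries a single grouping key; this is not the 
patch, and null handling is omitted), the per-batch aggregation reduces to a 
tight loop with no per-row hash-table lookup:

{code}
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

// Hypothetical method: every row in the batch belongs to the same key, so the
// sum is accumulated in a straight scan over the value column.
static long sumOneKeyBatch(VectorizedRowBatch batch, int valueColumn, long runningSum) {
  LongColumnVector values = (LongColumnVector) batch.cols[valueColumn];
  for (int i = 0; i < batch.size; i++) {
    int row = batch.selectedInUse ? batch.selected[i] : i;  // honor row selection
    runningSum += values.vector[row];                       // nulls ignored for brevity
  }
  return runningSum;
}
{code}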



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7971) Support alter table change/replace/add columns for existing partitions

2014-09-03 Thread Jason Dere (JIRA)
Jason Dere created HIVE-7971:


 Summary: Support alter table change/replace/add columns for 
existing partitions
 Key: HIVE-7971
 URL: https://issues.apache.org/jira/browse/HIVE-7971
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere


ALTER TABLE CHANGE COLUMN is allowed for tables, but not for partitions. Same 
for add/replace columns.
Allowing this for partitions can be useful in some cases. For example, one user 
has tables with Hive 0.12 Decimal columns, which do not specify 
precision/scale. To be able to properly read the decimal values from the 
existing partitions, the column types in the partitions need to be changed to 
decimal types with precision/scale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: In Progress  (was: Patch Available)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
> HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
> HIVE-7405.96.patch, HIVE-7405.97.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7952) Investigate query failures (1)

2014-09-03 Thread Suhas Satish (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120706#comment-14120706
 ] 

Suhas Satish commented on HIVE-7952:


auto_sortmerge_join_1 and auto_sortmerge_join_13 are covered under the existing 
JIRA on map join; the stack trace from the test failure is listed here - 
https://issues.apache.org/jira/browse/HIVE-7613

> Investigate query failures (1)
> --
>
> Key: HIVE-7952
> URL: https://issues.apache.org/jira/browse/HIVE-7952
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
>
> I ran all q-file tests and the following failed with an exception:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-SPARK-ALL-TESTS-Build/lastCompletedBuild/testReport/
> we don't necessarily want to run all these tests as part of the spark tests, 
> but we should understand why they failed with an exception. This JIRA is to 
> look into these failures and document them with one of:
> * New JIRA
> * Covered under existing JIRA
> * More investigation required
> Tests:
> {noformat}
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_13
>2.5 sec 2
>  org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_tez_fsstat   
> 1.6 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dynpart_sort_opt_vectorization
>5.3 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_14
>6.3 sec 2
>  org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_using
> 0.34 sec2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_create_func1
>0.96 sec2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample_islocalmode_hook
>   11 sec  2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_set_show_current_role
>   1.4 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_owner_actions_db
>0.42 sec2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
> 5.5 sec 2
>  org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lock21.8 sec 
> 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_1_sql_std
>   2.7 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_19_part_external_location
>3.9 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_empty_partition
> 0.67 sec2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_role_grant1
> 3.6 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_role_grant2
> 2.6 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_show_grant
>  3.5 sec 2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_smb_mapjoin_14
>   2.6 sec 2
>  org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_query1  
> 0.93 sec2
>  org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_query4  
> 0.26 sec2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_1
> 10 sec  2
>  
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_7
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]

2014-09-03 Thread Suhas Satish (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120697#comment-14120697
 ] 

Suhas Satish commented on HIVE-7613:


As a part of this work, we should also enable auto_sortmerge_join_1.q, which 
currently fails with:

{code:title=auto_sortmerge_join_1.stackTrace|borderStyle=solid}
2014-09-03 16:12:59,607 ERROR [main]: spark.SparkClient 
(SparkClient.java:execute(166)) - Error executing Spark Plan
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 
1, localhost): java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {"key":"0","value":"val_0","ds":"2008-04-08"}

org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:151)

org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)

org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)

org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:99)

scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
scala.collection.Iterator$class.foreach(Iterator.scala:727)
scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)

org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)

org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1177)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1166)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1165)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1165)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at scala.Option.foreach(Option.scala:236)
at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1383)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

{code}

> Research optimization of auto convert join to map join [Spark branch]
> -
>
> Key: HIVE-7613
> URL: https://issues.apache.org/jira/browse/HIVE-7613
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Szehon Ho
>Priority: Minor
> Attachments: HIve on Spark Map join background.docx
>
>
> ConvertJoinMapJoin is an optimization that replaces a common join (aka 
> shuffle join) with a map join (aka broadcast or fragment replicate join) when 
> possible. We need to research how to make it workable with Hive on Spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3653) Failure in a counter poller run should not be considered as a job failure

2014-09-03 Thread Darren Yin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120694#comment-14120694
 ] 

Darren Yin commented on HIVE-3653:
--

The jobtracker RPC server is dropping your connection -- you could consider 
tracking rpc latency stats, and/or upping the number of threads in the RPC 
server, or possibly upping the max idle time for handler threads before they 
get killed/reclaimed by the rpc server. There may also be other options.
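
For example (illustrative only -- the right knobs depend on the Hadoop version 
in use), the classic MRv1 property for the number of JobTracker RPC handler 
threads can be raised in mapred-site.xml or programmatically:

{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative only: more RPC handler threads can reduce dropped/reset client
// connections under load; 40 is an arbitrary example value.
static Configuration withMoreJobTrackerHandlers(Configuration base) {
  Configuration conf = new Configuration(base);
  conf.setInt("mapred.job.tracker.handler.count", 40);
  return conf;
}
{code}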

> Failure in a counter poller run should not be considered as a job failure
> -
>
> Key: HIVE-3653
> URL: https://issues.apache.org/jira/browse/HIVE-3653
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 0.7.1
>Reporter: Harsh J
>Assignee: Shreepadma Venugopalan
>
> A client had a simple transient failure in polling the JT for job status 
> (which it does every HIVECOUNTERSPULLINTERVAL for each currently running job).
> {code}
> java.io.IOException: Call to HOST/IP:PORT failed on local exception: 
> java.io.IOException: Connection reset by peer 
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1110) 
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226) 
> at org.apache.hadoop.mapred.$Proxy10.getJobStatus(Unknown Source) 
> at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:1053) 
> at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:1065) 
> at org.apache.hadoop.hive.ql.exec.ExecDriver.progress(ExecDriver.java:351) 
> at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:686) 
> at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123) 
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:131) 
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) 
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063) 
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748) 
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:209) 
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:286) 
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:310) 
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:317) 
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:490) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  
> at java.lang.reflect.Method.invoke(Method.java:597) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:197) 
> {code}
> This led Hive to think the running job itself had failed, and it failed the 
> query run, although the running job progressed to completion in the 
> background.
> We should not let transient IOExceptions in counter polling cause query 
> termination, and should instead just retry.
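
A minimal sketch of that retry (illustrative names and back-off, not the 
actual ExecDriver change): tolerate transient RPC failures while polling and 
only give up after a bounded number of consecutive failures.

{code}
import java.io.IOException;

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

// Hypothetical helper: retry the status poll a few times before declaring the
// job lost, instead of failing the whole query on the first IOException.
static RunningJob pollJobWithRetry(JobClient jc, JobID jobId, int maxAttempts)
    throws IOException, InterruptedException {
  IOException lastFailure = new IOException("Giving up polling job " + jobId);
  for (int attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return jc.getJob(jobId);        // transient "connection reset" surfaces here
    } catch (IOException e) {
      lastFailure = e;
      Thread.sleep(1000L * attempt);  // simple linear back-off between attempts
    }
  }
  throw lastFailure;                  // genuinely unreachable JT: fail as before
}
{code}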



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7184) TestHadoop20SAuthBridge no longer compiles after HADOOP-10448

2014-09-03 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120686#comment-14120686
 ] 

Brock Noland commented on HIVE-7184:


Note that HADOOP-10817 requires changes in this patch.

{noformat}
@@ -130,7 +130,7 @@ private void configureSuperUserIPAddresses(Configuration 
conf,
 }
 builder.append("127.0.1.1,");
 builder.append(InetAddress.getLocalHost().getCanonicalHostName());
-
conf.setStrings(DefaultImpersonationProvider.getProxySuperuserIpConfKey(superUserShortName),
+
conf.setStrings(DefaultImpersonationProvider.getTestProvider().getProxySuperuserIpConfKey(superUserShortName),
 builder.toString());
   }
 
@@ -294,7 +294,7 @@ public String run() throws Exception {
   private void setGroupsInConf(String[] groupNames, String proxyUserName)
   throws IOException {
conf.set(
-  
DefaultImpersonationProvider.getProxySuperuserGroupConfKey(proxyUserName),
+  
DefaultImpersonationProvider.getTestProvider().getProxySuperuserGroupConfKey(proxyUserName),
   StringUtils.join(",", Arrays.asList(groupNames)));
 configureSuperUserIPAddresses(conf, proxyUserName);
 ProxyUsers.refreshSuperUserGroupsConfiguration(conf);
{noformat}

> TestHadoop20SAuthBridge no longer compiles after HADOOP-10448
> -
>
> Key: HIVE-7184
> URL: https://issues.apache.org/jira/browse/HIVE-7184
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Jason Dere
> Attachments: HIVE-7184.1.patch, HIVE-7184.2.patch
>
>
> HADOOP-10448 moves a couple of methods which were being used by the 
> TestHadoop20SAuthBridge test. If/when Hive build uses Hadoop 2.5 as a 
> dependency, this will cause compilation errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25179: HIVE-7905: CBO: more cost model changes

2014-09-03 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25179/
---

(Updated Sept. 3, 2014, 11:19 p.m.)


Review request for hive, Gunther Hagleitner, John Pullokkaran, and Mostafa 
Mokhtar.


Changes
---

I have added the following changes:
- check ranges against row counts; this depends on HIVE-7915, which has not yet 
been merged into the CBO branch
- apply NDV scaling, where applicable
- log the details of the PK-FK relationship


Repository: hive-git


Description
---

CBO: more cost model changes


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveDefaultRelMetadataProvider.java
 2c08772 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdRowCount.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdSelectivity.java
 df70de2 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdUniqueKeys.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 8fe8b3c 

Diff: https://reviews.apache.org/r/25179/diff/


Testing
---

existing tests.


Thanks,

Harish Butani



[jira] [Commented] (HIVE-7943) hive.security.authorization.createtable.owner.grants is ineffective with Default Authorization

2014-09-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120649#comment-14120649
 ] 

Thejas M Nair commented on HIVE-7943:
-

But if case 2 were to be supported, we would need to add support for setting 
privileges for the new owner.


> hive.security.authorization.createtable.owner.grants is ineffective with 
> Default Authorization
> --
>
> Key: HIVE-7943
> URL: https://issues.apache.org/jira/browse/HIVE-7943
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Ashu Pachauri
> Attachments: HIVE-7943.1.patch
>
>
> HIVE-6250 separates owner privileges from user privileges. However, Default 
> Authorization does not adapt to the change and table owners do not inherit 
> permissions from the config.
> Steps to Reproduce:
> set hive.security.authorization.enabled=true;
> set hive.security.authorization.createtable.owner.grants=ALL;
> create table temp_table(id int, value string);
> drop table temp_table;
> The above set of operations throws the following error:
> 
> Authorization failed:No privilege 'Drop' found for outputs { 
> database:default, table:temp_table}. Use SHOW GRANT to get more details.
> 14/09/02 17:49:38 ERROR ql.Driver: Authorization failed:No privilege 'Drop' 
> found for outputs { database:default, table:temp_table}. Use SHOW GRANT to 
> get more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7943) hive.security.authorization.createtable.owner.grants is ineffective with Default Authorization

2014-09-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120647#comment-14120647
 ] 

Thejas M Nair commented on HIVE-7943:
-

Case 2 is not relevant right now, because there is no syntax support for 
altering table owner.


> hive.security.authorization.createtable.owner.grants is ineffective with 
> Default Authorization
> --
>
> Key: HIVE-7943
> URL: https://issues.apache.org/jira/browse/HIVE-7943
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Ashu Pachauri
> Attachments: HIVE-7943.1.patch
>
>
> HIVE-6250 separates owner privileges from user privileges. However, Default 
> Authorization does not adapt to the change and table owners do not inherit 
> permissions from the config.
> Steps to Reproduce:
> set hive.security.authorization.enabled=true;
> set hive.security.authorization.createtable.owner.grants=ALL;
> create table temp_table(id int, value string);
> drop table temp_table;
> The above set of operations throws the following error:
> 
> Authorization failed:No privilege 'Drop' found for outputs { 
> database:default, table:temp_table}. Use SHOW GRANT to get more details.
> 14/09/02 17:49:38 ERROR ql.Driver: Authorization failed:No privilege 'Drop' 
> found for outputs { database:default, table:temp_table}. Use SHOW GRANT to 
> get more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7943) hive.security.authorization.createtable.owner.grants is ineffective with Default Authorization

2014-09-03 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120633#comment-14120633
 ] 

Ashu Pachauri commented on HIVE-7943:
-

Okay, I understand the rationale behind the separation. But I am confused 
between the two cases:

1. Owner grants are tightly bound to the user who creates the table.
2. Owner grants are tightly bound only to the table (in metadata) but apply 
only to the current owner.

If case 1 is true, we can just append owner privileges to user privs at table 
creation time.
If case 2 is true, we need some place to store owner privileges in the metadata 
at table creation time and merge them with current user privileges (if he is 
the owner) at the time of authorization.

> hive.security.authorization.createtable.owner.grants is ineffective with 
> Default Authorization
> --
>
> Key: HIVE-7943
> URL: https://issues.apache.org/jira/browse/HIVE-7943
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Ashu Pachauri
> Attachments: HIVE-7943.1.patch
>
>
> HIVE-6250 separates owner privileges from user privileges. However, Default 
> Authorization does not adapt to the change and table owners do not inherit 
> permissions from the config.
> Steps to Reproduce:
> set hive.security.authorization.enabled=true;
> set hive.security.authorization.createtable.owner.grants=ALL;
> create table temp_table(id int, value string);
> drop table temp_table;
> The above set of operations throws the following error:
> 
> Authorization failed:No privilege 'Drop' found for outputs { 
> database:default, table:temp_table}. Use SHOW GRANT to get more details.
> 14/09/02 17:49:38 ERROR ql.Driver: Authorization failed:No privilege 'Drop' 
> found for outputs { database:default, table:temp_table}. Use SHOW GRANT to 
> get more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120629#comment-14120629
 ] 

Hive QA commented on HIVE-5760:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666294/HIVE-5760.92.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6166 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testIfConditionalExprs
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testVectorExpressionDescriptor
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/623/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/623/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-623/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666294

> Add vectorized support for CHAR/VARCHAR data types
> --
>
> Key: HIVE-5760
> URL: https://issues.apache.org/jira/browse/HIVE-5760
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Matt McCline
> Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
> HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
> HIVE-5760.91.patch, HIVE-5760.92.patch
>
>
> Add support to allow queries referencing VARCHAR columns and expression 
> results to run efficiently in vectorized mode. This should re-use the code 
> for the STRING type to the extent possible and beneficial. Include unit tests 
> and end-to-end tests. Consider re-using or extending existing end-to-end 
> tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7942) TestHS2ImpersonationWithRemoteMS is flaky

2014-09-03 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120622#comment-14120622
 ] 

Vaibhav Gumashta commented on HIVE-7942:


[~vkorukanti] I think the appropriate change would be to do a 
{code}
conf.setBoolean("dfs.permissions.enabled", false); 
{code}
before starting miniHS2. We open JDBC connections as different users and the 
default value for hive.server2.enable.doAs is true. 

> TestHS2ImpersonationWithRemoteMS is flaky
> -
>
> Key: HIVE-7942
> URL: https://issues.apache.org/jira/browse/HIVE-7942
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Venki Korukanti
>
> I think we need to add a sleep in TestHS2ImpersonationWithRemoteMS.setup() 
> after MiniHS2.start()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7970) Operator::removeChild may remove the wrong child

2014-09-03 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-7970:
---
Description: 
Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
    int childIndex = childOperators.indexOf(child);
    assert childIndex != -1;
    if (childOperators.size() == 1) {
      setChildOperators(null);
    } else {
      childOperators.remove(childIndex);
    }
  }
{code}

However, it's often the case that assertions aren't enabled, and if the operator
has only one child, it may remove the wrong child.

There are lots of other methods in this class that use assertions, such as 
{{replaceChild}}, {{removeParent}}, {{replaceParent}}, etc. They have similar 
problems. I propose we change these assertions to something else, maybe 
{{Preconditions}} in Guava.

  was:
Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
int childIndex = childOperators.indexOf(child);
assert childIndex != -1;
if (childOperators.size() == 1) {
  setChildOperators(null);
} else {
  childOperators.remove(childIndex);
}

{code}

However, most of the time assertion isn't turned on, and if the operator
only have one child, it may remove the wrong child.

There are lots of other method in this class that uses assertion, such as 
{{replaceChild}},
{{removeParent}}, {{replaceParent}}. They have similar problems.
I propose we change assertions to something else, maybe {{Preconditions}} in 
Guava.


> Operator::removeChild may remove the wrong child
> 
>
> Key: HIVE-7970
> URL: https://issues.apache.org/jira/browse/HIVE-7970
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao
>
> Currently, {{Operator::removeChild}} does the following:
> {code}
>   public void removeChild(Operator child) {
>     int childIndex = childOperators.indexOf(child);
>     assert childIndex != -1;
>     if (childOperators.size() == 1) {
>       setChildOperators(null);
>     } else {
>       childOperators.remove(childIndex);
>     }
>   }
> {code}
> However, it's often the case that assertions aren't enabled, and if the 
> operator has only one child, it may remove the wrong child.
> There are lots of other methods in this class that use assertions, such as 
> {{replaceChild}}, {{removeParent}}, {{replaceParent}}, etc. They have similar 
> problems. I propose we change these assertions to something else, maybe 
> {{Preconditions}} in Guava.
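
A minimal sketch of that proposal, assuming Guava's {{Preconditions}} inside 
the same Operator class (so childOperators and setChildOperators refer to the 
existing members); this is not the committed fix. The membership check now 
fails fast even when JVM assertions are disabled, instead of silently nulling 
out the only child when the argument is not actually a child.

{code}
import com.google.common.base.Preconditions;

// Sketch only: same logic as the method quoted above, but the check cannot be
// compiled away by running without -ea.
public void removeChild(Operator child) {
  int childIndex = childOperators.indexOf(child);
  Preconditions.checkArgument(childIndex != -1,
      "%s is not a child of %s", child, this);
  if (childOperators.size() == 1) {
    setChildOperators(null);
  } else {
    childOperators.remove(childIndex);
  }
}
{code}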



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7970) Operator::removeChild may remove the wrong child

2014-09-03 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-7970:
---
Description: 
Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
    int childIndex = childOperators.indexOf(child);
    assert childIndex != -1;
    if (childOperators.size() == 1) {
      setChildOperators(null);
    } else {
      childOperators.remove(childIndex);
    }
  }
{code}

However, most of the time assertions aren't turned on, and if the operator
has only one child, it may remove the wrong child.

There are lots of other methods in this class that use assertions, such as 
{{replaceChild}}, {{removeParent}}, {{replaceParent}}. They have similar 
problems. I propose we change the assertions to something else, maybe 
{{Preconditions}} in Guava.

  was:
Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
int childIndex = childOperators.indexOf(child);
assert childIndex != -1;
if (childOperators.size() == 1) {
  setChildOperators(null);
} else {
  childOperators.remove(childIndex);
}

{code}

However, most of the time assertion isn't turned on, and if the operator
only have one child, it may remove the wrong child.
I think we need to change the assertion to something else.


> Operator::removeChild may remove the wrong child
> 
>
> Key: HIVE-7970
> URL: https://issues.apache.org/jira/browse/HIVE-7970
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao
>
> Currently, {{Operator::removeChild}} does the following:
> {code}
>   public void removeChild(Operator child) {
>     int childIndex = childOperators.indexOf(child);
>     assert childIndex != -1;
>     if (childOperators.size() == 1) {
>       setChildOperators(null);
>     } else {
>       childOperators.remove(childIndex);
>     }
>   }
> {code}
> However, most of the time assertions aren't turned on, and if the operator
> has only one child, it may remove the wrong child.
> There are lots of other methods in this class that use assertions, such as 
> {{replaceChild}}, {{removeParent}}, {{replaceParent}}. They have similar 
> problems. I propose we change the assertions to something else, maybe 
> {{Preconditions}} in Guava.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-7809) Fix ObjectRegistry to work with Tez 0.5

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-7809.
--
Resolution: Fixed

Committed to branch. Thanks [~sseth]!

> Fix ObjectRegistry to work with Tez 0.5
> ---
>
> Key: HIVE-7809
> URL: https://issues.apache.org/jira/browse/HIVE-7809
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-7809.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7958) SparkWork generated by SparkCompiler may require multiple Spark jobs to run

2014-09-03 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-7958:
---
Description: 
A SparkWork instance currently may contain disjointed work graphs. For 
instance, union_remove_1.q may generate a plan like this:
{code}
Reduce2 <- Map 1
Reduce4 <- Map 3
{code}
The SparkPlan instance generated from this work graph contains two result RDDs. 
When such a plan is executed, we call .foreach() on the two RDDs sequentially, 
which results in two Spark jobs, one after the other.

While this works functionally, the performance will not be great as the Spark 
jobs are run sequentially rather than concurrently.

Another side effect of this is that the corresponding SparkPlan instance is 
over-complicated.

There are two potential approaches:

1. Let SparkCompiler generate a work that can be executed in ONE Spark job 
only. In the above example, two Spark tasks should be generated.

2. Let SparkPlanGenerate generate multiple Spark plans and then let SparkClient 
execute them concurrently.

Approach #1 seems more reasonable and naturally fits our architecture. Also, 
Hive's task execution framework already takes care of the task concurrency.

  was:
A SparkWork instance currently may contain disjointed work graphs. For 
instance, union_remove_1.q may generated a plan like this:
{code}
Reduce2 -> Map 1
Reduce4 <- Map 3
{code}
The SparkPlan instance generated from this work graph contains two result RDDs. 
When such plan is executed, we call .foreach() on the two RDDs sequentially, 
which results two Spark jobs, one after the other.

While this works functionally, the performance will not be great as the Spark 
jobs are run sequentially rather than concurrently.

Another side effect of this is that the corresponding SparkPlan instance is 
over-complicated.

The are two potential approaches:

1. Let SparkCompiler generate a work that can be executed in ONE Spark job 
only. In above example, two Spark task should be generated.

2. Let SparkPlanGenerate generate multiple Spark plans and then SparkClient 
executes them concurrently.

Approach #1 seems more reasonable and naturally fit to our architecture. Also, 
Hive's task execution framework already takes care of the task concurrency.


> SparkWork generated by SparkCompiler may require multiple Spark jobs to run
> ---
>
> Key: HIVE-7958
> URL: https://issues.apache.org/jira/browse/HIVE-7958
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Xuefu Zhang
>Priority: Critical
>  Labels: Spark-M1
>
> A SparkWork instance currently may contain disjointed work graphs. For 
> instance, union_remove_1.q may generate a plan like this:
> {code}
> Reduce2 <- Map 1
> Reduce4 <- Map 3
> {code}
> The SparkPlan instance generated from this work graph contains two result 
> RDDs. When such a plan is executed, we call .foreach() on the two RDDs 
> sequentially, which results in two Spark jobs, one after the other.
> While this works functionally, the performance will not be great as the Spark 
> jobs are run sequentially rather than concurrently.
> Another side effect of this is that the corresponding SparkPlan instance is 
> over-complicated.
> There are two potential approaches:
> 1. Let SparkCompiler generate a work that can be executed in ONE Spark job 
> only. In the above example, two Spark tasks should be generated.
> 2. Let SparkPlanGenerate generate multiple Spark plans and then let 
> SparkClient execute them concurrently.
> Approach #1 seems more reasonable and naturally fits our architecture. Also, 
> Hive's task execution framework already takes care of the task concurrency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7970) Operator::removeChild may remove the wrong child

2014-09-03 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-7970:
---
Description: 
Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
    int childIndex = childOperators.indexOf(child);
    assert childIndex != -1;
    if (childOperators.size() == 1) {
      setChildOperators(null);
    } else {
      childOperators.remove(childIndex);
    }
  }
{code}

However, most of the time assertions aren't turned on, and if the operator
has only one child, it may remove the wrong child.
I think we need to change the assertion to something else.

  was:
Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
int childIndex = childOperators.indexOf(child);
assert childIndex != -1;
if (childOperators.size() == 1) {
  setChildOperators(null);
} else {
  childOperators.remove(childIndex);
}
{code}

However, most of the time assertion isn't turned on, and if the operator
only have one child, it may remove the wrong child.
I think we need to change the assertion to something else.


> Operator::removeChild may remove the wrong child
> 
>
> Key: HIVE-7970
> URL: https://issues.apache.org/jira/browse/HIVE-7970
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao
>
> Currently, {{Operator::removeChild}} does the following:
> {code}
>   public void removeChild(Operator child) {
>     int childIndex = childOperators.indexOf(child);
>     assert childIndex != -1;
>     if (childOperators.size() == 1) {
>       setChildOperators(null);
>     } else {
>       childOperators.remove(childIndex);
>     }
>   }
> {code}
> However, most of the time assertions aren't turned on, and if the operator
> has only one child, it may remove the wrong child.
> I think we need to change the assertion to something else.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7970) Operator::removeChild may remove the wrong child

2014-09-03 Thread Chao (JIRA)
Chao created HIVE-7970:
--

 Summary: Operator::removeChild may remove the wrong child
 Key: HIVE-7970
 URL: https://issues.apache.org/jira/browse/HIVE-7970
 Project: Hive
  Issue Type: Bug
Reporter: Chao


Currently, {{Operator::removeChild}} does the following:

{code}
  public void removeChild(Operator child) {
    int childIndex = childOperators.indexOf(child);
    assert childIndex != -1;
    if (childOperators.size() == 1) {
      setChildOperators(null);
    } else {
      childOperators.remove(childIndex);
    }
  }
{code}

However, most of the time assertions aren't turned on, and if the operator
has only one child, it may remove the wrong child.
I think we need to change the assertion to something else.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7544) Changes related to TEZ-1288 (FastTezSerialization)

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7544:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks [~rajesh.balamohan] and [~gopalv]!

> Changes related to TEZ-1288 (FastTezSerialization)
> --
>
> Key: HIVE-7544
> URL: https://issues.apache.org/jira/browse/HIVE-7544
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-7544.1.patch, HIVE-7544.tez-branch.2.patch
>
>
> Add ability to make use of TezBytesWritableSerialization.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-09-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6847:
---
Attachment: HIVE-6847.10.patch

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.10.patch, 
> HIVE-6847.2.patch, HIVE-6847.3.patch, HIVE-6847.4.patch, HIVE-6847.5.patch, 
> HIVE-6847.6.patch, HIVE-6847.7.patch, HIVE-6847.8.patch, HIVE-6847.9.patch
>
>
> Currently, the hive server creates the scratch directory and changes its 
> permission to 777; however, this is not great with respect to security. We 
> need to create user-specific scratch directories instead. Also refer to the 
> 1st iteration of the patch on HIVE-6782 for the approach.
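
A hedged sketch of the per-user layout described above (the path layout and 
method name are assumptions, not the actual HiveServer2 code):

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Hypothetical helper: give each user a private scratch directory under the
// root scratch dir instead of one world-writable (777) directory.
static Path ensureUserScratchDir(Configuration conf, Path rootScratchDir,
    String userName) throws IOException {
  Path userDir = new Path(rootScratchDir, userName);
  FileSystem fs = userDir.getFileSystem(conf);
  if (!fs.exists(userDir)) {
    fs.mkdirs(userDir, new FsPermission((short) 0700));  // owner-only access
  }
  return userDir;
}
{code}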



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-09-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6847:
---
Status: Open  (was: Patch Available)

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.10.patch, 
> HIVE-6847.2.patch, HIVE-6847.3.patch, HIVE-6847.4.patch, HIVE-6847.5.patch, 
> HIVE-6847.6.patch, HIVE-6847.7.patch, HIVE-6847.8.patch, HIVE-6847.9.patch
>
>
> Currently, the hive server creates the scratch directory and changes its 
> permission to 777; however, this is not great with respect to security. We 
> need to create user-specific scratch directories instead. Also refer to the 
> 1st iteration of the patch on HIVE-6782 for the approach.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-09-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6847:
---
Status: Patch Available  (was: Open)

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.10.patch, 
> HIVE-6847.2.patch, HIVE-6847.3.patch, HIVE-6847.4.patch, HIVE-6847.5.patch, 
> HIVE-6847.6.patch, HIVE-6847.7.patch, HIVE-6847.8.patch, HIVE-6847.9.patch
>
>
> Currently, the hive server creates the scratch directory and changes its 
> permission to 777; however, this is not great with respect to security. We 
> need to create user-specific scratch directories instead. Also refer to the 
> 1st iteration of the patch on HIVE-6782 for the approach.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7907:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks [~gopalv]!

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch, HIVE-7907.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-09-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7907:
-
Attachment: HIVE-7907.2.patch

.2 also changes the version of tez to 0.5.0. It thus applies and compiles in 
the branch.

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch, HIVE-7907.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7223) Support generic PartitionSpecs in Metastore partition-functions

2014-09-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120519#comment-14120519
 ] 

Hive QA commented on HIVE-7223:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12666290/HIVE-7223.5.patch

{color:green}SUCCESS:{color} +1 6145 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/622/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/622/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-622/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12666290

> Support generic PartitionSpecs in Metastore partition-functions
> ---
>
> Key: HIVE-7223
> URL: https://issues.apache.org/jira/browse/HIVE-7223
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog, Metastore
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-7223.1.patch, HIVE-7223.2.patch, HIVE-7223.3.patch, 
> HIVE-7223.4.patch, HIVE-7223.5.patch
>
>
> Currently, the functions in the HiveMetaStore API that handle multiple 
> partitions do so using List<Partition>. E.g. 
> {code}
> public List<Partition> listPartitions(String db_name, String tbl_name, short 
> max_parts);
> public List<Partition> listPartitionsByFilter(String db_name, String 
> tbl_name, String filter, short max_parts);
> public int add_partitions(List<Partition> new_parts);
> {code}
> Partition objects are fairly heavyweight, since each Partition carries its 
> own copy of a StorageDescriptor, partition-values, etc. Tables with tens of 
> thousands of partitions take so long to have their partitions listed that the 
> client times out with default hive.metastore.client.socket.timeout. There is 
> the additional expense of serializing and deserializing metadata for large 
> sets of partitions, w.r.t time and heap-space. Reducing the thrift traffic 
> should help in this regard.
> In a date-partitioned table, all sub-partitions for a particular date are 
> *likely* (but not expected) to have:
> # The same base directory (e.g. {{/feeds/search/20140601/}})
> # Similar directory structure (e.g. {{/feeds/search/20140601/[US,UK,IN]}})
> # The same SerDe/StorageHandler/IOFormat classes
> # Sorting/Bucketing/SkewInfo settings
> In this “most likely” scenario (henceforth termed “normal”), it’s possible to 
> represent the partition-list (for a date) in a more condensed form: a list of 
> LighterPartition instances, all sharing a common StorageDescriptor whose 
> location points to the root directory. 
> We can go one better for the {{add_partitions()}} case: When adding all 
> partitions for a given date, the “normal” case affords us the ability to 
> specify the top-level date-directory, where sub-partitions can be inferred 
> from the HDFS directory-path.
> These extensions are hard to introduce at the metastore-level, since 
> partition-functions explicitly specify {{List<Partition>}} arguments. I 
> wonder if a {{PartitionSpec}} interface might help:
> {code}
> public PartitionSpec listPartitions(db_name, tbl_name, max_parts) throws ... 
> ; 
> public int add_partitions( PartitionSpec new_parts ) throws … ;
> {code}
> where the PartitionSpec looks like:
> {code}
> public interface PartitionSpec {
> public List<Partition> getPartitions();
> public List<String> getPartNames();
> public Iterator<Partition> getPartitionIter();
> public Iterator<String> getPartNameIter();
> }
> {code}
> For addPartitions(), an {{HDFSDirBasedPartitionSpec}} class could implement 
> {{PartitionSpec}}, store a top-level directory, and return Partition 
> instances from sub-directory names, while storing a single StorageDescriptor 
> for all of them.
> Similarly, list_partitions() could return a List<PartitionSpec>, where each 
> PartitionSpec corresponds to a set of partitions that can share a 
> StorageDescriptor.
> By exposing iterator semantics, neither the client nor the metastore need 
> instantiate all partitions at once. That should help with memory requirements.
> In case no smart grouping is possible, we could just fall back on a 
> {{DefaultPartitionSpec}} which composes {{List<Partition>}}, and is no worse 
> than status quo.
> PartitionSpec abstracts away how a set of partitions may be represented. A 
> tighter representation allows us to communicate metadata for a larger number 
> of Partitions, with less Thrift traffic.
> Given that Thrift doesn’t su
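To make the fallback case concrete, below is a minimal sketch of the 
{{DefaultPartitionSpec}} idea, assuming the {{PartitionSpec}} interface exactly as 
proposed in the description above; the name-derivation logic is hypothetical and 
this is not the committed API:
{code}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.hadoop.hive.metastore.api.Partition;

// Illustrative fallback: no smart grouping, simply wraps the full partition list.
public class DefaultPartitionSpec implements PartitionSpec {
  private final List<Partition> partitions;

  public DefaultPartitionSpec(List<Partition> partitions) {
    this.partitions = partitions;
  }

  public List<Partition> getPartitions() {
    return partitions;
  }

  public List<String> getPartNames() {
    // Hypothetical naming: real code would build names from the table's partition keys.
    List<String> names = new ArrayList<String>();
    for (Partition p : partitions) {
      names.add(p.getValues().toString());
    }
    return names;
  }

  public Iterator<Partition> getPartitionIter() {
    return partitions.iterator();
  }

  public Iterator<String> getPartNameIter() {
    return getPartNames().iterator();
  }
}
{code}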

[jira] [Updated] (HIVE-7969) Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7969:
---
Attachment: HIVE-7969.patch

Note that this requires the latest Optiq snapshot.

> Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer
> --
>
> Key: HIVE-7969
> URL: https://issues.apache.org/jira/browse/HIVE-7969
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-7969.patch
>
>
> After the patch series of OPTIQ-391, OPTIQ-392, OPTIQ-395, and OPTIQ-396, it's 
> now possible to use Optiq's native FieldTrimmer. So, let's use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7969) Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7969:
---
Status: Patch Available  (was: Open)

> Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer
> --
>
> Key: HIVE-7969
> URL: https://issues.apache.org/jira/browse/HIVE-7969
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-7969.patch
>
>
> After the patch series of OPTIQ-391, OPTIQ-392, OPTIQ-395, and OPTIQ-396, it's 
> now possible to use Optiq's native FieldTrimmer. So, let's use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25313: Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25313/
---

Review request for hive and Harish Butani.


Bugs: HIVE-7969
https://issues.apache.org/jira/browse/HIVE-7969


Repository: hive-git


Description
---

Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
e9b258e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/TraitsUtil.java e8069ee 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveAggregateRel.java
 1588cdf 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveJoinRel.java
 6a3410b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveProjectRel.java
 8cbf2f1 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveSortRel.java
 1c42a29 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveUnionRel.java
 b81f3c8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HiveRelFieldTrimmer.java
 c28f974 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 8fe8b3c 

Diff: https://reviews.apache.org/r/25313/diff/


Testing
---

cbo_correctness.q passes


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-7208) move SearchArgument interface into serde package

2014-09-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120499#comment-14120499
 ] 

Sergey Shelukhin commented on HIVE-7208:


[~daijy] can you check if this will break something in Pig?
Note that the package stayed the same for all the classes; only the module changed.
The only exception is that SearchArgument.FACTORY changed to 
SearchArgumentFactory, because the interface and the implementation go to different 
modules, so the factory cannot remain a part of the interface.
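As a rough sketch of what caller code might look like after the change (the column 
names are made up, and the builder method signatures are assumed to stay as they are 
today), construction would go through the standalone factory instead of 
{{SearchArgument.FACTORY}}:
{code}
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;

public class SargUsageSketch {
  // Before the move: SearchArgument.FACTORY.newBuilder()...
  // After the move: the standalone SearchArgumentFactory is used instead.
  public static SearchArgument example() {
    return SearchArgumentFactory.newBuilder()
        .startAnd()
          .equals("id", 42)          // hypothetical predicate: id = 42
          .isNull("deleted_flag")    // hypothetical predicate: deleted_flag IS NULL
        .end()
        .build();
  }
}
{code}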

> move SearchArgument interface into serde package
> 
>
> Key: HIVE-7208
> URL: https://issues.apache.org/jira/browse/HIVE-7208
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-7208.01.patch, HIVE-7208.02.patch, 
> HIVE-7208.03.patch, HIVE-7208.patch
>
>
> For usage in alternative input formats/serdes, it might be useful to move 
> SearchArgument class to a place that is not in ql (because it's hard to 
> depend on ql).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7969) Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7969:
---
Description: After the patch series of OPTIQ-391, OPTIQ-392, OPTIQ-395, and 
OPTIQ-396, it's now possible to use Optiq's native FieldTrimmer. So, let's use it. 
 (was: After patch series of OPTIQ-391 OPTIQ-392 OPTIQ-395 OPTIQ-396 its now 
possible to use Optiq's native FieldTrimmer. So, lets use it.)

> Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer
> --
>
> Key: HIVE-7969
> URL: https://issues.apache.org/jira/browse/HIVE-7969
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>
> After the patch series of OPTIQ-391, OPTIQ-392, OPTIQ-395, and OPTIQ-396, it's 
> now possible to use Optiq's native FieldTrimmer. So, let's use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH

2014-09-03 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6948:
---
   Resolution: Duplicate
Fix Version/s: (was: 0.14.0)
   Status: Resolved  (was: Patch Available)

> HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
> --
>
> Key: HIVE-6948
> URL: https://issues.apache.org/jira/browse/HIVE-6948
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0
>Reporter: Peng Zhang
> Attachments: HIVE-6948.patch, HIVE-6948.patch
>
>
> HiveServer2 ignores HIVE_AUX_JARS_PATH.
> This will cause aux jars not to be distributed to the YARN cluster, and jobs 
> will fail without their dependent jars.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7969) Use Optiq's native FieldTrimmer instead of HiveRelFieldTrimmer

2014-09-03 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-7969:
--

 Summary: Use Optiq's native FieldTrimmer instead of 
HiveRelFieldTrimmer
 Key: HIVE-7969
 URL: https://issues.apache.org/jira/browse/HIVE-7969
 Project: Hive
  Issue Type: Sub-task
  Components: CBO, Logical Optimizer
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


After the patch series of OPTIQ-391, OPTIQ-392, OPTIQ-395, and OPTIQ-396, it's now 
possible to use Optiq's native FieldTrimmer. So, let's use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7826) Dynamic partition pruning on Tez

2014-09-03 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120495#comment-14120495
 ] 

Gunther Hagleitner commented on HIVE-7826:
--

Thanks [~damien.carol]. Your last comment definitely made my day :-)

> Dynamic partition pruning on Tez
> 
>
> Key: HIVE-7826
> URL: https://issues.apache.org/jira/browse/HIVE-7826
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>  Labels: TODOC14, tez
> Attachments: HIVE-7826.1.patch, HIVE-7826.2.patch, HIVE-7826.3.patch, 
> HIVE-7826.4.patch, HIVE-7826.5.patch, HIVE-7826.6.patch, HIVE-7826.7.patch
>
>
> It's natural in a star schema to map one or more dimensions to partition 
> columns. Time or location are likely candidates. 
> It can also be useful to compute the partitions one would like to scan via 
> a subquery (where p in select ... from ...).
> The resulting joins in hive require a full table scan of the large table 
> though, because partition pruning takes place before the corresponding values 
> are known.
> On Tez it's relatively straightforward to send the values needed to prune to 
> the application master - where splits are generated and tasks are submitted. 
> Using these values we can strip out any unneeded partitions dynamically, 
> while the query is running.
> The approach is straightforward:
> - Insert synthetic conditions for each join representing "x in (keys of other 
> side in join)"
> - These conditions will be pushed as far down as possible
> - If the condition hits a table scan and the column involved is a partition 
> column:
>- Set up an Operator to send key events to the AM
> - else:
>- Remove the synthetic predicate
> Add these properties:
> ||Property||Default Value||
> |{{hive.tez.dynamic.partition.pruning}}|true|
> |{{hive.tez.dynamic.partition.pruning.max.event.size}}|1*1024*1024L|
> |{{hive.tez.dynamic.parition.pruning.max.data.size}}|100*1024*1024L|
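For reference, a hedged example of enabling the feature programmatically, using the 
property names exactly as listed in the table above (spelling included) and the 
listed defaults; in practice these would normally be set in hive-site.xml or with 
{{set}}:
{code}
import org.apache.hadoop.hive.conf.HiveConf;

public class PruningConfSketch {
  public static HiveConf withDynamicPartitionPruning() {
    HiveConf conf = new HiveConf();
    // Property names copied verbatim from the table above; values are the listed defaults.
    conf.setBoolean("hive.tez.dynamic.partition.pruning", true);
    conf.setLong("hive.tez.dynamic.partition.pruning.max.event.size", 1L * 1024 * 1024);
    conf.setLong("hive.tez.dynamic.parition.pruning.max.data.size", 100L * 1024 * 1024);
    return conf;
  }
}
{code}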



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH

2014-09-03 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120490#comment-14120490
 ] 

Brock Noland commented on HIVE-6948:


This is a duplicate of HIVE-6820.

> HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
> --
>
> Key: HIVE-6948
> URL: https://issues.apache.org/jira/browse/HIVE-6948
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0
>Reporter: Peng Zhang
> Fix For: 0.14.0
>
> Attachments: HIVE-6948.patch, HIVE-6948.patch
>
>
> HiveServer2 ignores HIVE_AUX_JARS_PATH.
> This will cause aux jars not to be distributed to the YARN cluster, and jobs 
> will fail without their dependent jars.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7956) When inserting into a bucketed table, all data goes to a single bucket [Spark Branch]

2014-09-03 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120484#comment-14120484
 ] 

Brock Noland commented on HIVE-7956:


I thought that MR does this by setting the number of reducers equal to the 
number of buckets.

> When inserting into a bucketed table, all data goes to a single bucket [Spark 
> Branch]
> -
>
> Key: HIVE-7956
> URL: https://issues.apache.org/jira/browse/HIVE-7956
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>
> I created a bucketed table:
> {code}
> create table testBucket(x int,y string) clustered by(x) into 10 buckets;
> {code}
> Then I run a query like:
> {code}
> set hive.enforce.bucketing = true;
> insert overwrite table testBucket select intCol,stringCol from src;
> {code}
> Here {{src}} is a simple textfile-based table containing 4000 records 
> (not bucketed). The query launches 10 reduce tasks but all the data goes to 
> only one of them.
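For context, a conceptual sketch of how rows are expected to spread across buckets 
(this is not Hive's actual ObjectInspector-based hashing, just the hash-modulo 
idea): with 10 buckets, distinct values of {{x}} should map to several reducers 
rather than a single one.
{code}
// Conceptual only: Hive assigns a row to a bucket roughly as
// (hash(clustered-by columns) & Integer.MAX_VALUE) % numBuckets.
public class BucketSpreadSketch {
  public static int bucketFor(int x, int numBuckets) {
    int hash = Integer.valueOf(x).hashCode();  // for an int column the hash is effectively the value
    return (hash & Integer.MAX_VALUE) % numBuckets;
  }

  public static void main(String[] args) {
    // Distinct x values should land in different buckets, not all in bucket 0.
    for (int x = 0; x < 5; x++) {
      System.out.println("x=" + x + " -> bucket " + bucketFor(x, 10));
    }
  }
}
{code}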



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7208) move SearchArgument interface into serde package

2014-09-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-7208:
---
Attachment: HIVE-7208.03.patch

File deletion has not been rebased properly

> move SearchArgument interface into serde package
> 
>
> Key: HIVE-7208
> URL: https://issues.apache.org/jira/browse/HIVE-7208
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-7208.01.patch, HIVE-7208.02.patch, 
> HIVE-7208.03.patch, HIVE-7208.patch
>
>
> For usage in alternative input formats/serdes, it might be useful to move 
> SearchArgument class to a place that is not in ql (because it's hard to 
> depend on ql).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7811) Compactions need to update table/partition stats

2014-09-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7811:
-
Status: Patch Available  (was: Open)

> Compactions need to update table/partition stats
> 
>
> Key: HIVE-7811
> URL: https://issues.apache.org/jira/browse/HIVE-7811
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7811.3.patch, HIVE-7811.4.patch, HIVE-7811.5.patch, 
> HIVE-7811.6.patch
>
>
> Compactions should trigger stats recalculation for columns which already have 
> stats.
> https://reviews.apache.org/r/25201/
> Major compactions will cause the Compactor to see which columns already have 
> stats and run the analyze command for those columns.  If compacting a partition, 
> then stats for that partition will be computed.  If the table is not partitioned, 
> then stats for the whole table will be computed.
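A hypothetical helper showing the shape of the statement the Compactor would issue 
for the columns that already have stats; the helper and its names are illustrative, 
not part of the attached patch:
{code}
import java.util.List;

public class AnalyzeStatementSketch {
  // Builds e.g. "ANALYZE TABLE t PARTITION (ds='2014-09-03') COMPUTE STATISTICS FOR COLUMNS c1,c2"
  public static String analyzeColumns(String table, String partitionSpec,
      List<String> columnsWithStats) {
    StringBuilder sb = new StringBuilder("ANALYZE TABLE ").append(table);
    if (partitionSpec != null) {
      sb.append(" PARTITION (").append(partitionSpec).append(")");
    }
    sb.append(" COMPUTE STATISTICS FOR COLUMNS ");
    sb.append(String.join(",", columnsWithStats));
    return sb.toString();
  }
}
{code}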



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7968) Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2

2014-09-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7968:
---
Affects Version/s: 0.14.0

> Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2
> --
>
> Key: HIVE-7968
> URL: https://issues.apache.org/jira/browse/HIVE-7968
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
>
> MiniHS2 uses MiniMr. It makes no sense to have two test cases for the same setup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7968) Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2

2014-09-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7968:
---
Description: MiniHS2 uses MiniMr. It makes no sense to have two test cases for 
the same setup when JDBC is the client API for HS2.  (was: MiniHS2 uses MiniMr. 
Makes no sense to have two test cases for same setup.)

> Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2
> --
>
> Key: HIVE-7968
> URL: https://issues.apache.org/jira/browse/HIVE-7968
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
>
> MiniHS2 uses MiniMr. It makes no sense to have two test cases for the same setup 
> when JDBC is the client API for HS2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7580) Support dynamic partitioning [Spark Branch]

2014-09-03 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-7580:
---
Attachment: HIVE-7580.patch

The patch contains the test cases that passed.

> Support dynamic partitioning [Spark Branch]
> ---
>
> Key: HIVE-7580
> URL: https://issues.apache.org/jira/browse/HIVE-7580
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Chinna Rao Lalam
>  Labels: Spark-M1
> Attachments: HIVE-7580.patch
>
>
> My understanding is that we don't need to do anything special for this. 
> However, this needs to be verified and tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7968) Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2

2014-09-03 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-7968:
--

 Summary: Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2
 Key: HIVE-7968
 URL: https://issues.apache.org/jira/browse/HIVE-7968
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


MiniHS2 uses MiniMr. It makes no sense to have two test cases for the same setup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7968) Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2

2014-09-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7968:
---
Fix Version/s: 0.14.0

> Merge tests in TestJdbcWithMiniMr with TestJdbcWithMiniHS2
> --
>
> Key: HIVE-7968
> URL: https://issues.apache.org/jira/browse/HIVE-7968
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
>
> MiniHS2 uses MiniMr. It makes no sense to have two test cases for the same setup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7508) Kerberos support for streaming

2014-09-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120448#comment-14120448
 ] 

Roshan Naik commented on HIVE-7508:
---

[~leftylev], yes, thanks for bringing it up. I will work with [~alangates] on 
updating that.

> Kerberos support for streaming
> --
>
> Key: HIVE-7508
> URL: https://issues.apache.org/jira/browse/HIVE-7508
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: Streaming, TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7508.patch
>
>
> Add kerberos support for streaming to secure Hive cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7943) hive.security.authorization.createtable.owner.grants is ineffective with Default Authorization

2014-09-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120444#comment-14120444
 ] 

Thejas M Nair commented on HIVE-7943:
-

You can try tracing through the calls made from Hive.createTable to 
CreateTableAutomaticGrant.getUserGrants, where it adds the grants to the table 
object.
 

> hive.security.authorization.createtable.owner.grants is ineffective with 
> Default Authorization
> --
>
> Key: HIVE-7943
> URL: https://issues.apache.org/jira/browse/HIVE-7943
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Ashu Pachauri
> Attachments: HIVE-7943.1.patch
>
>
> HIVE-6250 separates owner privileges from user privileges. However, Default 
> Authorization does not adapt to the change and table owners do not inherit 
> permissions from the config.
> Steps to Reproduce:
> set hive.security.authorization.enabled=true;
> set hive.security.authorization.createtable.owner.grants=ALL;
> create table temp_table(id int, value string);
> drop table temp_table;
> The above set of operations throws the following error:
> 
> Authorization failed:No privilege 'Drop' found for outputs { 
> database:default, table:temp_table}. Use SHOW GRANT to get more details.
> 14/09/02 17:49:38 ERROR ql.Driver: Authorization failed:No privilege 'Drop' 
> found for outputs { database:default, table:temp_table}. Use SHOW GRANT to 
> get more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7943) hive.security.authorization.createtable.owner.grants is ineffective with Default Authorization

2014-09-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120440#comment-14120440
 ] 

Thejas M Nair commented on HIVE-7943:
-

The description of the configuration also mentions the purpose - "the 
privileges automatically granted to the owner whenever a table gets created." 
This is also the case with the user grants configuration.
The purpose hasn't been changed intentionally.

The reason for separating user grants and owner grants was so that the owner 
user is set correctly, when the owner is changed within a session (for ease of 
testing).


> hive.security.authorization.createtable.owner.grants is ineffective with 
> Default Authorization
> --
>
> Key: HIVE-7943
> URL: https://issues.apache.org/jira/browse/HIVE-7943
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Ashu Pachauri
> Attachments: HIVE-7943.1.patch
>
>
> HIVE-6250 separates owner privileges from user privileges. However, Default 
> Authorization does not adapt to the change and table owners do not inherit 
> permissions from the config.
> Steps to Reproduce:
> set hive.security.authorization.enabled=true;
> set hive.security.authorization.createtable.owner.grants=ALL;
> create table temp_table(id int, value string);
> drop table temp_table;
> The above set of operations throws the following error:
> 
> Authorization failed:No privilege 'Drop' found for outputs { 
> database:default, table:temp_table}. Use SHOW GRANT to get more details.
> 14/09/02 17:49:38 ERROR ql.Driver: Authorization failed:No privilege 'Drop' 
> found for outputs { database:default, table:temp_table}. Use SHOW GRANT to 
> get more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7580) Support dynamic partitioning [Spark Branch]

2014-09-03 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120438#comment-14120438
 ] 

Chinna Rao Lalam commented on HIVE-7580:


I verified the tests below and all of them passed except load_dyn_part1.q and 
load_dyn_part8.q.

{noformat}
load_dyn_part1.q,
load_dyn_part2.q,
load_dyn_part3.q,
load_dyn_part4.q,
load_dyn_part5.q,
load_dyn_part6.q,
load_dyn_part7.q,
load_dyn_part8.q,
load_dyn_part9.q,
load_dyn_part10.q,
load_dyn_part11.q,
load_dyn_part12.q,
load_dyn_part13.q,
load_dyn_part14.q,
load_dyn_part15.q
{noformat}

To enable the tests for dynamic partitions, I considered the tests below (referred 
from the Tez list):

{noformat}
load_dyn_part1.q,
load_dyn_part2.q,
load_dyn_part3.q,
dynpart_sort_optimization.q,
dynpart_sort_opt_vectorization.q
{noformat}

The tests below are failing:
{noformat}
load_dyn_part1.q,
load_dyn_part8.q,
dynpart_sort_optimization.q,
dynpart_sort_opt_vectorization.q 
{noformat}

There are issues with these 4 test cases. I will add these tests to those JIRAs 
and work on them there.

{quote}
load_dyn_part1.q and load_dyn_part8.q both contain multi-inserts. They need to be 
tested after HIVE-7503 is fixed.
{quote}

{quote}
dynpart_sort_opt_vectorization.q is related to vectorization. It needs to be 
tested after HIVE-7794 is fixed.
{quote}

{quote}
dynpart_sort_optimization.q is hitting the same exception as HIVE-7843.
{quote}

> Support dynamic partitioning [Spark Branch]
> ---
>
> Key: HIVE-7580
> URL: https://issues.apache.org/jira/browse/HIVE-7580
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Chinna Rao Lalam
>  Labels: Spark-M1
>
> My understanding is that we don't need to do anything special for this. 
> However, this needs to be verified and tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7944) current update stats for columns of a partition of a table is not correct

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7944:
---
Component/s: Statistics

> current update stats for columns of a partition of a table is not correct
> -
>
> Key: HIVE-7944
> URL: https://issues.apache.org/jira/browse/HIVE-7944
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.14.0
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
> Fix For: 0.14.0
>
> Attachments: HIVE-7944.1.patch, HIVE-7944.2.patch
>
>
> We previously worked towards faster updating of stats for columns of a partition 
> of a table in 
> https://issues.apache.org/jira/browse/HIVE-7736
> and
> https://issues.apache.org/jira/browse/HIVE-7876
> Although there is some improvement, it is only correct in the first run; 
> duplicate column stats appear later. Thanks to [~ekoifman]'s comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7944) current update stats for columns of a partition of a table is not correct

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7944:
---
Affects Version/s: 0.14.0

> current update stats for columns of a partition of a table is not correct
> -
>
> Key: HIVE-7944
> URL: https://issues.apache.org/jira/browse/HIVE-7944
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.14.0
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
> Fix For: 0.14.0
>
> Attachments: HIVE-7944.1.patch, HIVE-7944.2.patch
>
>
> We previously worked towards faster updating of stats for columns of a partition 
> of a table in 
> https://issues.apache.org/jira/browse/HIVE-7736
> and
> https://issues.apache.org/jira/browse/HIVE-7876
> Although there is some improvement, it is only correct in the first run; 
> duplicate column stats appear later. Thanks to [~ekoifman]'s comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7944) current update stats for columns of a partition of a table is not correct

2014-09-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7944:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Pengcheng!

> current update stats for columns of a partition of a table is not correct
> -
>
> Key: HIVE-7944
> URL: https://issues.apache.org/jira/browse/HIVE-7944
> Project: Hive
>  Issue Type: Bug
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
> Fix For: 0.14.0
>
> Attachments: HIVE-7944.1.patch, HIVE-7944.2.patch
>
>
> We previously worked towards faster updating of stats for columns of a partition 
> of a table in 
> https://issues.apache.org/jira/browse/HIVE-7736
> and
> https://issues.apache.org/jira/browse/HIVE-7876
> Although there is some improvement, it is only correct in the first run; 
> duplicate column stats appear later. Thanks to [~ekoifman]'s comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >