[jira] [Created] (HIVE-4637) Fix VectorUDAFSum.txt to honor the expected vector column type

2013-05-30 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-4637:
--

 Summary: Fix VectorUDAFSum.txt to honor the expected vector column 
type
 Key: HIVE-4637
 URL: https://issues.apache.org/jira/browse/HIVE-4637
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor


"I think, its a bug in code generation for VectorUDAFSumDouble.
The template VectorUDAFSum.txt, assumes LongColumnVector for input rather than 
having it  replaced by code generation."
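
For illustration, a minimal self-contained sketch of the intended substitution. The class and member names below are hypothetical stand-ins, not the actual contents of VectorUDAFSum.txt or of Hive's vector classes:

{code}
// Hypothetical stand-ins for Hive's column vector types, for illustration only.
class LongColumnVectorSketch   { long[] vector; }
class DoubleColumnVectorSketch { double[] vector; }

// What code generation should emit for the double variant: the input column
// type is substituted per specialization instead of being hardcoded to the
// long vector type in the template.
class VectorUDAFSumDoubleSketch {
  private double sum;

  void iterate(DoubleColumnVectorSketch inputColumn, int batchSize) {
    for (int i = 0; i < batchSize; i++) {
      sum += inputColumn.vector[i];
    }
  }

  double getResult() {
    return sum;
  }
}
{code}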


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3271) Privilege can be granted by any user(not owner) to any user(even to the same user)

2013-05-30 Thread Unnikrishnan V T (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671119#comment-13671119
 ] 

Unnikrishnan V T commented on HIVE-3271:


It happens regardless of whether the user is a superuser or not.

> Privilege can be granted by any user(not owner) to any user(even to the same 
> user)
> --
>
> Key: HIVE-3271
> URL: https://issues.apache.org/jira/browse/HIVE-3271
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.8.1
>Reporter: Unnikrishnan V T
>Priority: Critical
> Attachments: Screenshot.png
>
>
> I have created two users, 'unni' and 'sachin'. User unni created a 
> table 'test3' such that user sachin cannot view that table. But user sachin is 
> able to grant all permissions on the table test3.
> I have set 
> 1)hive.security.authorization.enabled=true(in hive-site.xml)
> 2)dfs.permissions=true(in hdfs-site.xml)
> 3)dfs.permissions.supergroup=supergroup(in hdfs-site.xml)
> User sachin and user unni are in supergroup group.
> The user sachin is even able to revoke all permissions from the owner of the 
> table, user unni.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671117#comment-13671117
 ] 

Navis commented on HIVE-4633:
-

No, it's not. It's in MapOperator in Hive. Some Hadoop versions remove the scheme 
from the path while making splits. HIVE-4619 is meant to address that.
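
A rough illustration of the mismatch (a sketch using Hadoop's org.apache.hadoop.fs.Path, not the actual MapOperator code): the work list is keyed by the fully qualified path, while the split path can arrive without a scheme, so a plain string lookup misses.

{code}
import java.net.URI;
import org.apache.hadoop.fs.Path;

public class SchemeMismatchSketch {
  public static void main(String[] args) {
    // Path as registered in the configuration (fully qualified).
    Path configured = new Path("hdfs://qa14:9000/user/hive/warehouse/data_type");
    // Path as it can arrive from an input split on some Hadoop versions (scheme stripped).
    Path fromSplit = new Path("/user/hive/warehouse/data_type");

    URI a = configured.toUri();
    URI b = fromSplit.toUri();
    System.out.println("configured scheme: " + a.getScheme()); // hdfs
    System.out.println("split scheme:      " + b.getScheme()); // null
    // A lookup keyed on the full URI string therefore misses:
    System.out.println("string match: " + a.toString().equals(b.toString())); // false
  }
}
{code}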

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR jobs 
> fail. When I look into the logs, the exception below is thrown in hive.log:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4616) Simple reconnection support for jdbc2

2013-05-30 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4616:
--

Attachment: HIVE-4616.D10953.2.patch

navis updated the revision "HIVE-4616 [jira] Simple reconnection support for 
jdbc2".

  changed to try reconnecting on the next invocation, which seems more natural

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D10953

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D10953?vs=34035&id=34143#toc

AFFECTED FILES
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java

To: JIRA, navis
Cc: prasadm


> Simple reconnection support for jdbc2
> -
>
> Key: HIVE-4616
> URL: https://issues.apache.org/jira/browse/HIVE-4616
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4616.D10953.1.patch, HIVE-4616.D10953.2.patch
>
>
> jdbc:hive2://localhost:1/db2;autoReconnect=true
> Simple reconnection on TransportException. If HiveServer2 has not been 
> shut down, the session can be reused.
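
A usage sketch only, assuming the autoReconnect URL parameter behaves as described above; the port, database name, and credentials are placeholders:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AutoReconnectSketch {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // autoReconnect=true asks the driver to re-open the transport on the next
    // invocation after a TransportException, per the behavior proposed here.
    String url = "jdbc:hive2://localhost:10000/db2;autoReconnect=true";
    try (Connection conn = DriverManager.getConnection(url, "user", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{code}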

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread rohithsharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671114#comment-13671114
 ] 

rohithsharma commented on HIVE-4633:


Is this fix from Hadoop? The current patch in Hive just makes execution continue. 
I feel handling it in Hadoop would be better.

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR jobs 
> fail. When I look into the logs, the exception below is thrown in hive.log:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4616) Simple reconnection support for jdbc2

2013-05-30 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1367#comment-1367
 ] 

Phabricator commented on HIVE-4616:
---

navis has commented on the revision "HIVE-4616 [jira] Simple reconnection 
support for jdbc2".

INLINE COMMENTS
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java:206 ok.

REVISION DETAIL
  https://reviews.facebook.net/D10953

To: JIRA, navis
Cc: prasadm


> Simple reconnection support for jdbc2
> -
>
> Key: HIVE-4616
> URL: https://issues.apache.org/jira/browse/HIVE-4616
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4616.D10953.1.patch
>
>
> jdbc:hive2://localhost:1/db2;autoReconnect=true
> Simple reconnection on TransportException. If HiveServer2 has not been 
> shut down, the session can be reused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4619) Hive 0.11.0 is not working with pre-cdh3u6 and hadoop-0.23

2013-05-30 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4619:


Description: 
Path URIs in the input splits are missing the scheme (it's fixed in cdh3u6 and hadoop 
1.0)

{noformat}
2013-05-28 14:34:28,857 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding 
alias data_type to work list for file 
hdfs://qa14:9000/user/hive/warehouse/data_type
2013-05-28 14:34:28,858 ERROR org.apache.hadoop.hive.ql.exec.MapOperator: 
Configuration does not have any alias for path: 
/user/hive/warehouse/data_type/00_0
2013-05-28 14:34:28,875 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-05-28 14:34:28,877 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:387)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path 
are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and 
input path are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:516)
... 23 more
2013-05-28 14:34:28,881 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
for the task
{noformat}

  was:
path uris in input split are missing scheme (it's fixed in cdh3u6)

{noformat}
2013-05-28 14:34:28,857 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding 
alias data_type to work list for file 
hdfs://qa14:9000/user/hive/warehouse/data_type
2013-05-28 14:34:28,858 ERROR org.apache.hadoop.hive.ql.exec.MapOperator: 
Configuration does not have any alias for path: 
/user/hive/warehouse/data_type/00_0
2013-05-28 14:34:28,875 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-05-28 14:34:28,877 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:387)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:3

[jira] [Commented] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671106#comment-13671106
 ] 

Navis commented on HIVE-4633:
-

I thought this was happening only on pre-cdh3u6 versions of Hadoop. I should 
update the issue description on HIVE-4619.

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR jobs 
> fail. When I look into the logs, the exception below is thrown in hive.log:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread rohithsharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rohithsharma resolved HIVE-4633.


Resolution: Duplicate

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR jobs 
> fail. When I look into the logs, the exception below is thrown in hive.log:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671105#comment-13671105
 ] 

Navis commented on HIVE-4620:
-

The return value of TaskRunner#getTaskID() is not a task ID but a task runner ID, 
which can be a little confusing. Could you address that? Thanks.
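
Not the actual patch, just a generic sketch of the kind of disambiguation being discussed: give each runner its own identifier and derive a per-runner scratch subdirectory from it, so parallel tasks stop sharing one temp directory. The helper name is hypothetical.

{code}
import java.util.concurrent.atomic.AtomicLong;

public class ScratchDirSketch {
  // One counter per process; each runner gets a distinct suffix.
  private static final AtomicLong RUNNER_ID = new AtomicLong(0);

  private final long taskRunnerId = RUNNER_ID.incrementAndGet();

  // Hypothetical helper: derive a per-runner scratch directory from the shared one.
  public String scratchDirFor(String sharedScratchDir) {
    return sharedScratchDir + "/_runner_" + taskRunnerId;
  }

  public static void main(String[] args) {
    ScratchDirSketch a = new ScratchDirSketch();
    ScratchDirSketch b = new ScratchDirSketch();
    System.out.println(a.scratchDirFor("/tmp/hive-scratch")); // .../_runner_1
    System.out.println(b.scratchDirFor("/tmp/hive-scratch")); // .../_runner_2
  }
}
{code}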

> MR temp directory conflicts in case of parallel execution mode
> --
>
> Key: HIVE-4620
> URL: https://issues.apache.org/jira/browse/HIVE-4620
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Fix For: 0.12.0
>
> Attachments: HIVE-4620-1.patch
>
>
> In parallel query execution mode, all the parallel running tasks end up 
> sharing the same temp/scratch directory. This can lead to file conflicts 
> and temp files being deleted before job completion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread rohithsharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671101#comment-13671101
 ] 

rohithsharma commented on HIVE-4633:


It is working fine with the patch :-) I am marking this defect as a duplicate.

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23 version. All queries that spawn MR 
> jobs got failed. When I look into logs, below exception is thrown in hive.log
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

2013-05-30 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4636:
--

Attachment: HIVE-4636.D11013.1.patch

navis requested code review of "HIVE-4636 [jira] Failing on 
TestSemanticAnalysis.testAddReplaceCols in trunk".

Reviewers: JIRA

HIVE-4636 Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

Seems to be a regression from HIVE-4475.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11013

AFFECTED FILES
  hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestSemanticAnalysis.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/26289/

To: JIRA, navis


> Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
> ---
>
> Key: HIVE-4636
> URL: https://issues.apache.org/jira/browse/HIVE-4636
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 0.12.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-4636.D11013.1.patch
>
>
> Seems to be a regression from HIVE-4475.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

2013-05-30 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4636:


Status: Patch Available  (was: Open)

> Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
> ---
>
> Key: HIVE-4636
> URL: https://issues.apache.org/jira/browse/HIVE-4636
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 0.12.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
>
> Seems to be a regression from HIVE-4475.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

2013-05-30 Thread Navis (JIRA)
Navis created HIVE-4636:
---

 Summary: Failing on TestSemanticAnalysis.testAddReplaceCols in 
trunk
 Key: HIVE-4636
 URL: https://issues.apache.org/jira/browse/HIVE-4636
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 0.12.0
Reporter: Navis
Assignee: Navis
Priority: Trivial


Seems to be a regression from HIVE-4475.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4618) show create table creating unusable DDL when field delimiter is \001

2013-05-30 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671073#comment-13671073
 ] 

Shreepadma Venugopalan commented on HIVE-4618:
--

LGTM. +1 (non-binding).

> show create table creating unusable DDL when field delimiter is \001
> 
>
> Key: HIVE-4618
> URL: https://issues.apache.org/jira/browse/HIVE-4618
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
> Environment: CDH4.2
> Hive 0.10
>Reporter: Johndee Burks
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4618.D11007.1.patch
>
>
> When including a "fields terminated by" clause in the create statement, if the 
> delimiter is specified as \001, Hive turns this into \u0001, which is 
> correct. However, it then gives you a DDL that does not work because the 
> parser changes the \u0001 into u0001. 
> Example: 
> hive> create table j1 (a string) row format delimited fields terminated by 
> '\001';
> hive> show create table j1;
> CREATE  TABLE j1(
>   a string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\u0001'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1369664999')
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim \u0001
>   serialization.format\u0001
> hive> drop table j1;
> hive> CREATE  TABLE j1(
> >   a string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\u0001'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.mapred.TextInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> > LOCATION
> >   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> > TBLPROPERTIES (
> >   'transient_lastDdlTime'='1369664999');
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim u0001
>   serialization.formatu0001

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671072#comment-13671072
 ] 

Navis commented on HIVE-4633:
-

Could you try patch in HIVE-4619?

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR jobs 
> fail. When I look into the logs, the exception below is thrown in hive.log:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: HIVE-4435: Column stats: Distinct value estimator should use hash functions that are pairwise independent

2013-05-30 Thread Shreepadma Venugopalan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10841/
---

(Updated May 31, 2013, 2:24 a.m.)


Review request for hive, Ashutosh Chauhan and Navis Ryu.


Changes
---

Added reviewers.


Description
---

Fixes the FM estimator to use hash functions that are pairwise independent.


This addresses bug HIVE-4435.
https://issues.apache.org/jira/browse/HIVE-4435
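
Not the patch itself, just a generic sketch of a pairwise-independent hash family of the kind the fix calls for: h(x) = ((a*x + b) mod p) mod m with p prime and a, b drawn at random. Class and field names are illustrative, not those in NumDistinctValueEstimator.java.

{code}
import java.util.Random;

public class PairwiseHashSketch {
  private static final long P = 2147483647L; // prime (2^31 - 1), larger than the key range used
  private final long a;
  private final long b;
  private final int m;

  public PairwiseHashSketch(Random rnd, int m) {
    this.a = 1 + (long) (rnd.nextDouble() * (P - 1)); // uniform in 1..P-1
    this.b = (long) (rnd.nextDouble() * P);           // uniform in 0..P-1
    this.m = m;
  }

  // For distinct x and y, the pair (hash(x), hash(y)) is (close to) uniform
  // over the random choices of a and b, which is the pairwise-independence
  // property the FM estimator relies on.
  public int hash(long x) {
    long h = Math.floorMod(a * Math.floorMod(x, P) + b, P);
    return (int) (h % m);
  }

  public static void main(String[] args) {
    PairwiseHashSketch h = new PairwiseHashSketch(new Random(42), 1 << 16);
    System.out.println(h.hash(12345L));
  }
}
{code}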


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
 69e6f46 

Diff: https://reviews.apache.org/r/10841/diff/


Testing
---

The estimates are within the expected error after this fix. Tested on TPCH of 
varying sizes.


Thanks,

Shreepadma Venugopalan



[jira] [Updated] (HIVE-4618) show create table creating unusable DDL when field delimiter is \001

2013-05-30 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4618:
--

Attachment: HIVE-4618.D11007.1.patch

navis requested code review of "HIVE-4618 [jira] show create table creating 
unusable DDL when field delimiter is \001".

Reviewers: JIRA

HIVE-4618 show create table creating unusable DDL when field delimiter is \001

When including a "fields terminated by" clause in the create statement, if the 
delimiter is specified as \001, Hive turns this into \u0001, which is correct. 
However, it then gives you a DDL that does not work because the parser changes 
the \u0001 into u0001.

Example:

hive> create table j1 (a string) row format delimited fields terminated by 
'\001';

hive> show create table j1;
CREATE  TABLE j1(
  a string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\u0001'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
TBLPROPERTIES (
  'transient_lastDdlTime'='1369664999')

hive> desc formatted j1;
…shortened to save space
Storage Desc Params:
field.delim \u0001
serialization.format\u0001

hive> drop table j1;

hive> CREATE  TABLE j1(
>   a string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\u0001'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1369664999');

hive> desc formatted j1;
…shortened to save space
Storage Desc Params:
field.delim u0001
serialization.formatu0001
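
A rough illustration of the failure mode, not the actual fix in BaseSemanticAnalyzer: a parser that merely drops the backslash turns the escape \u0001 into the literal text u0001, whereas the escape needs to be decoded to the corresponding character. All names below are made up for the sketch.

{code}
public class DelimiterEscapeSketch {
  // Buggy pattern sketched: stripping backslashes turns the six-character
  // escape "\\u0001" into the literal text "u0001".
  static String stripBackslashes(String s) {
    return s.replace("\\", "");
  }

  // Intended pattern sketched: decode the escape into the actual character.
  static String decodeUnicodeEscape(String s) {
    if (s.startsWith("\\u") && s.length() == 6) {
      return String.valueOf((char) Integer.parseInt(s.substring(2), 16));
    }
    return s;
  }

  public static void main(String[] args) {
    String ddlDelimiter = "\\u0001"; // the six characters printed by SHOW CREATE TABLE
    System.out.println(stripBackslashes(ddlDelimiter));                    // u0001 (wrong)
    System.out.println((int) decodeUnicodeEscape(ddlDelimiter).charAt(0)); // 1 (correct)
  }
}
{code}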

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11007

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java
  ql/src/test/queries/clientpositive/unicode_notation.q
  ql/src/test/results/clientpositive/unicode_notation.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/26277/

To: JIRA, navis


> show create table creating unusable DDL when field delimiter is \001
> 
>
> Key: HIVE-4618
> URL: https://issues.apache.org/jira/browse/HIVE-4618
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
> Environment: CDH4.2
> Hive 0.10
>Reporter: Johndee Burks
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4618.D11007.1.patch
>
>
> When including a "fields terminated by" clause in the create statement, if the 
> delimiter is specified as \001, Hive turns this into \u0001, which is 
> correct. However, it then gives you a DDL that does not work because the 
> parser changes the \u0001 into u0001. 
> Example: 
> hive> create table j1 (a string) row format delimited fields terminated by 
> '\001';
> hive> show create table j1;
> CREATE  TABLE j1(
>   a string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\u0001'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1369664999')
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim \u0001
>   serialization.format\u0001
> hive> drop table j1;
> hive> CREATE  TABLE j1(
> >   a string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\u0001'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.mapred.TextInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> > LOCATION
> >   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> > TBLPROPERTIES (
> >   'transient_lastDdlTime'='1369664999');
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim u0001
>   serialization.formatu0001

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4618) show create table creating unusable DDL when field delimiter is \001

2013-05-30 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4618:


Status: Patch Available  (was: Open)

> show create table creating unusable DDL when field delimiter is \001
> 
>
> Key: HIVE-4618
> URL: https://issues.apache.org/jira/browse/HIVE-4618
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
> Environment: CDH4.2
> Hive 0.10
>Reporter: Johndee Burks
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4618.D11007.1.patch
>
>
> When including a "fields terminated by" clause in the create statement, if the 
> delimiter is specified as \001, Hive turns this into \u0001, which is 
> correct. However, it then gives you a DDL that does not work because the 
> parser changes the \u0001 into u0001. 
> Example: 
> hive> create table j1 (a string) row format delimited fields terminated by 
> '\001';
> hive> show create table j1;
> CREATE  TABLE j1(
>   a string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\u0001'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1369664999')
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim \u0001
>   serialization.format\u0001
> hive> drop table j1;
> hive> CREATE  TABLE j1(
> >   a string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\u0001'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.mapred.TextInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> > LOCATION
> >   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> > TBLPROPERTIES (
> >   'transient_lastDdlTime'='1369664999');
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim u0001
>   serialization.formatu0001

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671062#comment-13671062
 ] 

Navis commented on HIVE-4620:
-

One test is failing, but it seems unrelated to this change. I'll check on that first.

> MR temp directory conflicts in case of parallel execution mode
> --
>
> Key: HIVE-4620
> URL: https://issues.apache.org/jira/browse/HIVE-4620
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Fix For: 0.12.0
>
> Attachments: HIVE-4620-1.patch
>
>
> In parallel query execution mode, all the parallel running tasks end up 
> sharing the same temp/scratch directory. This can lead to file conflicts 
> and temp files being deleted before job completion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-4618) show create table creating unusable DDL when field delimiter is \001

2013-05-30 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-4618:
---

Assignee: Navis

> show create table creating unusable DDL when field delimiter is \001
> 
>
> Key: HIVE-4618
> URL: https://issues.apache.org/jira/browse/HIVE-4618
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
> Environment: CDH4.2
> Hive 0.10
>Reporter: Johndee Burks
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4618.D11007.1.patch
>
>
> When including a "fields terminated by" clause in the create statement, if the 
> delimiter is specified as \001, Hive turns this into \u0001, which is 
> correct. However, it then gives you a DDL that does not work because the 
> parser changes the \u0001 into u0001. 
> Example: 
> hive> create table j1 (a string) row format delimited fields terminated by 
> '\001';
> hive> show create table j1;
> CREATE  TABLE j1(
>   a string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\u0001'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1369664999')
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim \u0001
>   serialization.format\u0001
> hive> drop table j1;
> hive> CREATE  TABLE j1(
> >   a string)
> > ROW FORMAT DELIMITED
> >   FIELDS TERMINATED BY '\u0001'
> > STORED AS INPUTFORMAT
> >   'org.apache.hadoop.mapred.TextInputFormat'
> > OUTPUTFORMAT
> >   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> > LOCATION
> >   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
> > TBLPROPERTIES (
> >   'transient_lastDdlTime'='1369664999');
> hive> desc formatted j1;
> …shortened to save space
> Storage Desc Params:
>   field.delim u0001
>   serialization.formatu0001

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4616) Simple reconnection support for jdbc2

2013-05-30 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671000#comment-13671000
 ] 

Phabricator commented on HIVE-4616:
---

prasadm has commented on the revision "HIVE-4616 [jira] Simple reconnection 
support for jdbc2".

  LGTM. A minor nit.

  +1 (non-binding)

INLINE COMMENTS
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java:206 Nit: It might be 
helpful to add a message noting that the stack trace is from a reconnection attempt.

REVISION DETAIL
  https://reviews.facebook.net/D10953

To: JIRA, navis
Cc: prasadm


> Simple reconnection support for jdbc2
> -
>
> Key: HIVE-4616
> URL: https://issues.apache.org/jira/browse/HIVE-4616
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4616.D10953.1.patch
>
>
> jdbc:hive2://localhost:1/db2;autoReconnect=true
> Simple reconnection on TransportException. If HiveServer2 has not been 
> shut down, the session can be reused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4635) Invalid query parsing when handling order by on an aliased column

2013-05-30 Thread Hitesh Shah (JIRA)
Hitesh Shah created HIVE-4635:
-

 Summary: Invalid query parsing when handling order by on an 
aliased column
 Key: HIVE-4635
 URL: https://issues.apache.org/jira/browse/HIVE-4635
 Project: Hive
  Issue Type: Bug
Reporter: Hitesh Shah


Assuming simple table src1, src2:

create table src1 (key int, value string);
create table src2 (key int, value string);

Ordering by s2.key gives an error:

hive> SELECT s2.key, count(distinct s2.value) as cnt FROM src1 s1 join src2 s2 
on (s1.key = s2.key) GROUP BY s2.key ORDER BY s2.key;
FAILED: SemanticException [Error 10004]: Line 1:117 Invalid table alias or 
column reference 's2': (possible column names are: key, cnt)

Ordering by key allows the hive query to run. 

However, if I select both s1.key and s2.key:

hive> SELECT s1.key, s2.key, count(distinct s2.value) as cnt FROM src1 s1 join 
src2 s2 on (s1.key = s2.key) GROUP BY s2.key, s1.key ORDER BY s2.key; 
FAILED: SemanticException [Error 10004]: Line 1:133 Invalid table alias or 
column reference 's2': (possible column names are: key, cnt)

Ordering by key in the above scenario allows the job to run but there is no 
indication which column is actually being used to order the results. 


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2304) Support PreparedStatement.setObject

2013-05-30 Thread Robert Roland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670810#comment-13670810
 ] 

Robert Roland commented on HIVE-2304:
-

Can this get into a Hive release soon, please?


> Support PreparedStatement.setObject
> ---
>
> Key: HIVE-2304
> URL: https://issues.apache.org/jira/browse/HIVE-2304
> Project: Hive
>  Issue Type: Sub-task
>  Components: JDBC
>Affects Versions: 0.7.1
>Reporter: Ido Hadanny
>Assignee: Ido Hadanny
>Priority: Minor
> Attachments: HIVE-0.8-SetObject.2.patch.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> PreparedStatement.setObject is important for spring's jdbcTemplate support
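
For context, the kind of caller this unblocks (Spring's JdbcTemplate binds every parameter through setObject). This is only a usage sketch; the connection URL, credentials, and table are placeholders, and whether it works depends on a release that includes this patch:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SetObjectSketch {
  public static void main(String[] args) throws Exception {
    String url = "jdbc:hive2://localhost:10000/default"; // placeholder URL
    try (Connection conn = DriverManager.getConnection(url, "user", "");
         PreparedStatement ps =
             conn.prepareStatement("SELECT value FROM src WHERE key = ?")) {
      // JdbcTemplate and similar frameworks call setObject rather than the
      // typed setInt/setString methods, so the driver must accept it.
      ps.setObject(1, Integer.valueOf(42));
      try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
          System.out.println(rs.getString(1));
        }
      }
    }
  }
}
{code}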

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4525) Support timestamps earlier than 1970 and later than 2038

2013-05-30 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HIVE-4525:
-

Status: Patch Available  (was: Open)

> Support timestamps earlier than 1970 and later than 2038
> 
>
> Key: HIVE-4525
> URL: https://issues.apache.org/jira/browse/HIVE-4525
> Project: Hive
>  Issue Type: Bug
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: D10755.1.patch, D10755.2.patch
>
>
> TimestampWritable currently serializes timestamps using the lower 31 bits of 
> an int. This does not allow storing timestamps earlier than 1970 or later 
> than a certain point in 2038.
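
A quick check of the boundary as a plain-JDK sketch (no Hive classes): 2^31 - 1 seconds after the epoch falls in January 2038, which is where a 31-bit seconds field tops out, and nothing before 1970 is representable at all.

{code}
import java.time.Instant;

public class TimestampRangeSketch {
  public static void main(String[] args) {
    long maxSeconds = Integer.MAX_VALUE; // 2^31 - 1 = 2147483647
    // Prints 2038-01-19T03:14:07Z, the upper bound of the current encoding.
    System.out.println(Instant.ofEpochSecond(maxSeconds));
  }
}
{code}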

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4608) Vectorized UDFs for Timestamp in nanoseconds

2013-05-30 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670741#comment-13670741
 ] 

Eric Hanson commented on HIVE-4608:
---

Please see my comments on review board. 

> Vectorized UDFs for Timestamp in nanoseconds
> 
>
> Key: HIVE-4608
> URL: https://issues.apache.org/jira/browse/HIVE-4608
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
>  Labels: vectorization
> Attachments: 
> 0001-Vectorized-UDFs-for-timestamp-functions-which-accept.patch
>
>
> Vectorized UDFs for timestamp functions which accept long vectors
> VectorUDFYearLong   
> VectorUDFMonthLong
> VectorUDFWeekOfYearLong   
> VectorUDFDayOfMonthLong
> VectorUDFHourLong   
> VectorUDFMinuteLong
> VectorUDFSecondLong   
> VectorUDFUnixTimeStampLong 
> and tests for them against their non-vectorized implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: Vectorized Timestamp functions for long nanosecond based timestamps

2013-05-30 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11530/#review21203
---



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java


Sun Java style convention is to have a blank line before all comments. I 
don't personally mind but that's what I've been asked to do :-).



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java


Please use full sentences starting with caps and ending with period.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java


Can you confirm this expression will be constant-folded by the compiler? 
Otherwise this should be evaluated by hand in advance. 



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java


If you know in what cases ms can be negative, can you add that to the 
comment? That seems unusual. 

I think this is because if the time is negative (before the epoch) you could 
get negative nanos, so you want to convert before creating the timestamp. Is 
that right? Please elaborate a little in the comment.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java


Would it be possible to re-use a timestamp that belongs to the class here, 
rather than calling new? If so, please do that to speed this up. I think you 
can do what you need with setTime() and setNanos().

Eliminating new() in the inner loop of vector processing tends to speed 
things up a lot.
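
A sketch of that suggestion only (reusing one java.sql.Timestamp and one Calendar via setTime()/setNanos()/setTimeInMillis() instead of allocating per row); the field extraction and names are simplified stand-ins, not the patch code:

{code}
import java.sql.Timestamp;
import java.util.Calendar;

public class ReusedTimestampSketch {
  // Reused scratch objects; no new Timestamp()/Calendar.getInstance() in the loop.
  private final Timestamp scratch = new Timestamp(0);
  private final Calendar calendar = Calendar.getInstance();

  // epochNanos[] holds nanosecond timestamps (assumed non-negative here for simplicity).
  public void yearOf(long[] epochNanos, int[] out, int batchSize) {
    for (int i = 0; i < batchSize; i++) {
      scratch.setTime(epochNanos[i] / 1000000L);
      scratch.setNanos((int) (epochNanos[i] % 1000000000L));
      calendar.setTimeInMillis(scratch.getTime());
      out[i] = calendar.get(Calendar.YEAR);
    }
  }
}
{code}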



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFWeekOfYearLong.java


Please explain why constant 4 is correct here and what it means.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFYearLong.java


Nice work! Good, compact code to initialize year boundaries. No use of 
new() or calls to heavy calendar methods in the inner loop. Awesome! :-)



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFYearLong.java


can you use minYear and maxYear here instead of literals?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFYearLong.java


can you use minYear + 1 and maxYear instead of literals?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFYearLong.java


Given that this function is moderately heavy anyway (with the binary 
search) I think making it virtual will not slow things down much. 

But if it gets faster we should seriously consider creating a separate 
evaluate method for VectorUDFYearLong and make this a static function to avoid 
virtual method call overhead.




ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFYearLong.java


Did you consider calculating approximate year using something like

approxYear = yearBase + (int)(time / nanosPerYear) - 1; 

and then linearly search forward to find year boundary? I wonder if that 
would be faster than binary search.



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTimestampExpressions.java


Overall this is a good set of unit tests! It's quite comprehensive. Thanks.



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTimestampExpressions.java


Can you comment this function to explain how you are using long[] inputs? I 
think I understand but a comment would help.





ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTimestampExpressions.java


If inputs.length == 1 (which I think it often is in your tests), then i % 
1 is always 0, so you are always loading up the input vectors with all 0s. Is there 
a reason for this? If so, please explain, or if not, consider a wider range of 
inputs.



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTimestampExpressions.java


what does /*begin-macro*/ mean?



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTimestampExpressions.java


Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #388

2013-05-30 Thread Apache Jenkins Server
See 



[jira] [Commented] (HIVE-3271) Privilege can be granted by any user(not owner) to any user(even to the same user)

2013-05-30 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670610#comment-13670610
 ] 

Mikhail Antonov commented on HIVE-3271:
---

Just to make sure - does that happen for you when user "sachin" is _not_ in 
supergroup, or regardless of whether he is in supergroup or not?

> Privilege can be granted by any user(not owner) to any user(even to the same 
> user)
> --
>
> Key: HIVE-3271
> URL: https://issues.apache.org/jira/browse/HIVE-3271
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.8.1
>Reporter: Unnikrishnan V T
>Priority: Critical
> Attachments: Screenshot.png
>
>
> I have created two users, 'unni' and 'sachin'. User unni created a 
> table 'test3' such that user sachin cannot view that table. But user sachin is 
> able to grant all permissions on the table test3.
> I have set 
> 1)hive.security.authorization.enabled=true(in hive-site.xml)
> 2)dfs.permissions=true(in hdfs-site.xml)
> 3)dfs.permissions.supergroup=supergroup(in hdfs-site.xml)
> User sachin and user unni are in supergroup group.
> The user sachin is even able to revoke all permissions from the owner of the 
> table, user unni.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3271) Privilege can be granted by any user(not owner) to any user(even to the same user)

2013-05-30 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670612#comment-13670612
 ] 

Mikhail Antonov commented on HIVE-3271:
---

Curious if anyone has looked into this bug and decided on its severity? The bug is 
almost 1 year old.

> Privilege can be granted by any user(not owner) to any user(even to the same 
> user)
> --
>
> Key: HIVE-3271
> URL: https://issues.apache.org/jira/browse/HIVE-3271
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security
>Affects Versions: 0.8.1
>Reporter: Unnikrishnan V T
>Priority: Critical
> Attachments: Screenshot.png
>
>
> I have created two users, 'unni' and 'sachin'. User unni created a 
> table 'test3' such that user sachin cannot view that table. But user sachin is 
> able to grant all permissions on the table test3.
> I have set 
> 1)hive.security.authorization.enabled=true(in hive-site.xml)
> 2)dfs.permissions=true(in hdfs-site.xml)
> 3)dfs.permissions.supergroup=supergroup(in hdfs-site.xml)
> User sachin and user unni are in supergroup group.
> The user sachin is even able to revoke all permissions from the owner of the 
> table, user unni.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4436) hive.exec.parallel=true doesn't work on hadoop-2

2013-05-30 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670603#comment-13670603
 ] 

Owen O'Malley commented on HIVE-4436:
-

+1, looks good.

> hive.exec.parallel=true doesn't work on hadoop-2
> 
>
> Key: HIVE-4436
> URL: https://issues.apache.org/jira/browse/HIVE-4436
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.12.0
> Environment: Ubuntu LXC (hive-trunk), CDH 4 on Debian
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-4436.patch, HIVE-4436-test.tgz, parallel_sorted.q
>
>
> While running a hive query with multiple independent stages, 
> hive.exec.parallel is a valid optimization to use.
> The query tested has 3 MR jobs - the first job is the root dependency and the 
> 2 further jobs depend on the first one.
> When hive.exec.parallel is turned on, the job fails with the following 
> exception
> {code}
> java.io.IOException: java.lang.InterruptedException
>   at org.apache.hadoop.ipc.Client.call(Client.java:1214)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>   at $Proxy12.mkdirs(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>   at $Proxy12.mkdirs(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:447)
>   at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2165)
>   at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:544)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1916)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.createTmpDirs(ExecDriver.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:444)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
> Caused by: java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:921)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1208)
> {code}
> The query plan is as follows
> {code}
>   Stage-9 is a root stage
>   Stage-8 depends on stages: Stage-9
>   Stage-3 depends on stages: Stage-8
>   Stage-0 depends on stages: Stage-3
>   Stage-4 depends on stages: Stage-0
>   Stage-5 depends on stages: Stage-8
>   Stage-1 depends on stages: Stage-5
>   Stage-6 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-9
> Map Reduce Local Work
>   Stage: Stage-8
> Map Reduce
> Map Join Operator
>   Stage: Stage-3
> Map Reduce
>   Stage: Stage-0
> Move Operator
>   Stage: Stage-4
> Stats-Aggr Operator
>   Stage: Stage-5
> Map Reduce
>   Stage: Stage-1
> Move Operator
>   Stage: Stage-6
> Stats-Aggr Operator
> {code}
> -I cannot conclude that this is purely a hive issue, will file a bug on HDFS 
> if that does show up during triage.-
> *Triaged* - set hive.stats.autogather=false; removes the bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2013-05-30 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670583#comment-13670583
 ] 

Shreepadma Venugopalan commented on HIVE-4629:
--

@Carl: The proposed addition to TCLIService.thrift is the following new API and 
structs,

{noformat}
// GetLog()
// Fetch operation log from the server corresponding to
// a particular OperationHandle.

struct TGetLogReq {
  // Operation whose log is requested
  1: required TOperationHandle operationHandle
}

struct TGetLogResp {
  1: required TStatus status
  2: required string log
}

service TCLIService {
...
...
TGetLogResp GetLog(1:TGetLogReq req);
}
{noformat}
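
A client-side sketch, assuming the structs above are added to TCLIService.thrift: TGetLogReq, TGetLogResp, and Client.GetLog() would then be Thrift-generated and do not exist today, so the code below is hypothetical apart from the existing handle and status types.

{code}
import org.apache.hive.service.cli.thrift.TCLIService;
import org.apache.hive.service.cli.thrift.TOperationHandle;
import org.apache.hive.service.cli.thrift.TStatusCode;

public class GetLogSketch {
  // Fetch the operation log for a running query; GetLog/TGetLogReq/TGetLogResp
  // are the proposed, not-yet-generated API described above.
  static String fetchLog(TCLIService.Client client, TOperationHandle handle)
      throws Exception {
    TGetLogReq req = new TGetLogReq(handle);
    TGetLogResp resp = client.GetLog(req);
    if (resp.getStatus().getStatusCode() != TStatusCode.SUCCESS_STATUS) {
      throw new IllegalStateException("GetLog failed: " + resp.getStatus());
    }
    return resp.getLog();
  }
}
{code}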


> HS2 should support an API to retrieve query logs
> 
>
> Key: HIVE-4629
> URL: https://issues.apache.org/jira/browse/HIVE-4629
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4305) Use a single system for dependency resolution

2013-05-30 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670568#comment-13670568
 ] 

Carl Steinbach commented on HIVE-4305:
--

[~alangates]: last I heard you were planning to arrange a meeting between you, 
me, and Travis to go over the details of integrating HCatalog into Hive's 
existing build infrastructure. Is this still going to happen? I also agree that 
getting this resolved is a priority -- the amount of time required to build 
Hive doubled with the addition of HCatalog.


> Use a single system for dependency resolution
> -
>
> Key: HIVE-4305
> URL: https://issues.apache.org/jira/browse/HIVE-4305
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure, HCatalog
>Reporter: Travis Crawford
>Assignee: Carl Steinbach
> Attachments: HIVE-4305.1.wip.patch
>
>
> Both Hive and HCatalog use ant as their build tool. However, Hive uses ivy 
> for dependency resolution while HCatalog uses maven-ant-tasks. With the 
> project merge we should converge on a single tool for dependency resolution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2013-05-30 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670559#comment-13670559
 ] 

Carl Steinbach commented on HIVE-4629:
--

@Shreepadma: before making any code changes I think it would be a good idea to 
get feedback on the changes you plan to make to TCLIService and CLIService.

> HS2 should support an API to retrieve query logs
> 
>
> Key: HIVE-4629
> URL: https://issues.apache.org/jira/browse/HIVE-4629
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4436) hive.exec.parallel=true doesn't work on hadoop-2

2013-05-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4436:
--

Attachment: parallel_sorted.q

Wrote a test case, but the bug is not reproducible in TestCliDriver; it needs a
single-node hadoop cluster.

The file spray with the multi-insert happens within the single reducer, but the
sorting & bucketing of each of those files spawns off a separate job, ending up
with 4 potentially parallel tasks.

In TestCliDriver the tasks are executed in local mode on a single thread, so they
do not have the timing needed to trigger the race condition.
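
In other words, the parallel and local single-threaded paths differ only in how
the tasks are launched; a minimal, hypothetical illustration (not Hive's actual
TaskRunner code):

{code}
import java.util.List;

// Hypothetical illustration, not Hive's TaskRunner: the same tasks either run
// one after another (deterministic ordering, the race never fires) or on
// separate threads (interleaving possible, so the race condition can fire).
public class TaskLaunchSketch {
  static void run(List<Runnable> tasks, boolean parallel) throws InterruptedException {
    if (!parallel) {
      for (Runnable t : tasks) {
        t.run();                    // sequential, single-threaded execution
      }
      return;
    }
    Thread[] threads = new Thread[tasks.size()];
    for (int i = 0; i < tasks.size(); i++) {
      threads[i] = new Thread(tasks.get(i));
      threads[i].start();           // parallel: the scheduler decides the interleaving
    }
    for (Thread t : threads) {
      t.join();                     // wait for all parallel tasks to finish
    }
  }
}
{code}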

> hive.exec.parallel=true doesn't work on hadoop-2
> 
>
> Key: HIVE-4436
> URL: https://issues.apache.org/jira/browse/HIVE-4436
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.12.0
> Environment: Ubuntu LXC (hive-trunk), CDH 4 on Debian
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-4436.patch, HIVE-4436-test.tgz, parallel_sorted.q
>
>
> While running a hive query with multiple independent stages, 
> hive.exec.parallel is a valid optimization to use.
> The query tested has 3 MR jobs - the first job is the root dependency and the
> 2 further jobs depend on the first one.
> When hive.exec.parallel is turned on, the job fails with the following 
> exception
> {code}
> java.io.IOException: java.lang.InterruptedException
>   at org.apache.hadoop.ipc.Client.call(Client.java:1214)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>   at $Proxy12.mkdirs(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>   at $Proxy12.mkdirs(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:447)
>   at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2165)
>   at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:544)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1916)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.createTmpDirs(ExecDriver.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:444)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
> Caused by: java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:921)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1208)
> {code}
> The query plan is as follows
> {code}
>   Stage-9 is a root stage
>   Stage-8 depends on stages: Stage-9
>   Stage-3 depends on stages: Stage-8
>   Stage-0 depends on stages: Stage-3
>   Stage-4 depends on stages: Stage-0
>   Stage-5 depends on stages: Stage-8
>   Stage-1 depends on stages: Stage-5
>   Stage-6 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-9
> Map Reduce Local Work
>   Stage: Stage-8
> Map Reduce
> Map Join Operator
>   Stage: Stage-3
> Map Reduce
>   Stage: Stage-0
> Move Operator
>   Stage: Stage-4
> Stats-Aggr Operator
>   Stage: Stage-5
> Map Reduce
>   Stage: Stage-1
> Move Operator
>   Stage: Stage-6
> Stats-Aggr Operator
> {code}
> -I cannot conclude that this is purely a Hive issue; I will file a bug on HDFS
> if that does show up during triage.-
> *Triaged* - running "set hive.stats.autogather=false;" removes the bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hive-trunk-h0.21 - Build # 2124 - Failure

2013-05-30 Thread Apache Jenkins Server
Changes for Build #2094
[navis] HIVE-4209 Cache evaluation result of deterministic expression and reuse 
it (Navis via namit)


Changes for Build #2095

Changes for Build #2096

Changes for Build #2097
[cws] HIVE-4530. Enforce minimum ant version required in build script (Arup
Malakar via cws)

[omalley] Preparing RELEASE_NOTES for Hive 0.11.0rc2.


Changes for Build #2098
[omalley] Update release notes for 0.11.0rc2

[omalley] HIVE-4527 Fix eclipse project template (Carl Steinbach via omalley)

[omalley] HIVE-4505 Hive can't load transforms with remote scripts. (Prasad
Mujumdar and Gunther Hagleitner
via omalley)

[omalley] HIVE-4498 TestBeeLineWithArgs.testPositiveScriptFile fails (Thejas 
Nair via omalley)


Changes for Build #2099

Changes for Build #2100

Changes for Build #2101

Changes for Build #2102

Changes for Build #2103
[daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows


Changes for Build #2104
[daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness


Changes for Build #2105
[omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions 
(Gunther 
Hagleitner via omalley)

[omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther 
Hagleitner via
omalley)


Changes for Build #2106

Changes for Build #2107
[omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there 
are many 
partitions (Gopal V via omalley)


Changes for Build #2108

Changes for Build #2109

Changes for Build #2110

Changes for Build #2111
[omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Gunther
Hagleitner
via omalley)

[omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther 
Hagleitner via
omalley)


Changes for Build #2112

Changes for Build #2113
[gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates)


Changes for Build #2114
[gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table 
formatting (gates)


Changes for Build #2115

Changes for Build #2116
[navis] JDBC2: HiveDriver should not throw RuntimeException when passed an 
invalid URL (Richard Ding via Navis)


Changes for Build #2117

Changes for Build #2118

Changes for Build #2119

Changes for Build #2120

Changes for Build #2121
[navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to 
un-selected join keys in columnExprMap (Yin Huai via Navis)

[navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when 
mapjoin.mapreduce=true (Gunther Hagleitner via Navis)


Changes for Build #2122

Changes for Build #2123

Changes for Build #2124
[gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty 
Leverenz via gates)




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2124)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2124/ to 
view the results.

[jira] [Commented] (HIVE-4543) Broken link in HCat 0.5 doc (Reader and Writer Interfaces)

2013-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670533#comment-13670533
 ] 

Hudson commented on HIVE-4543:
--

Integrated in Hive-trunk-h0.21 #2124 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2124/])
HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty 
Leverenz via gates) (Revision 1487654)

 Result = FAILURE
gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1487654
Files : 
* /hive/trunk/hcatalog/src/docs/src/documentation/content/xdocs/readerwriter.xml


> Broken link in HCat 0.5 doc (Reader and Writer Interfaces)
> --
>
> Key: HIVE-4543
> URL: https://issues.apache.org/jira/browse/HIVE-4543
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Lefty Leverenz
>Assignee: Lefty Leverenz
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4543.1.patch, HIVE-4543.2.patch, HIVE-4543_3.patch, 
> readerwriter.html, readerwriter.pdf, readerwriter.xml
>
>
> Due to HCatalog's move from the incubator to Hive, a link to 
> TestReaderWriter.java is broken at the end of the "Reader and Writer 
> Interfaces" doc for HCat 0.5 
> ([here|http://hive.apache.org/docs/hcat_r0.5.0/readerwriter.html#Complete+Example+Program]).
>   This should be fixed in the html and pdf files.
> Thanks to Himanshu Bari for pointing this out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (HIVE-4629) HS2 should support an API to retrieve query logs

2013-05-30 Thread Shreepadma Venugopalan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-4629 started by Shreepadma Venugopalan.

> HS2 should support an API to retrieve query logs
> 
>
> Key: HIVE-4629
> URL: https://issues.apache.org/jira/browse/HIVE-4629
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4628) HS2 sessionmanager should synchronize the call to insert/remove session objects from session hash map

2013-05-30 Thread Shreepadma Venugopalan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shreepadma Venugopalan resolved HIVE-4628.
--

Resolution: Not A Problem

> HS2 sessionmanager should synchronize the call to insert/remove session 
> objects from session hash map
> -
>
> Key: HIVE-4628
> URL: https://issues.apache.org/jira/browse/HIVE-4628
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.11.0
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>Priority: Critical
>
> HS2 SessionManager maintains a hashmap of active HS2 sessions. However,
> inserts and deletes to this hashmap are not synchronized. As a consequence,
> a racing thread could overwrite a valid session object in the hashmap and
> we could end up losing a session!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4628) HS2 sessionmanager should synchronize the call to insert/remove session objects from session hash map

2013-05-30 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670516#comment-13670516
 ] 

Shreepadma Venugopalan commented on HIVE-4628:
--

Good catch, Thejas. Looks like this is not an issue anymore. I've set the
appropriate status.

> HS2 sessionmanager should synchronize the call to insert/remove session 
> objects from session hash map
> -
>
> Key: HIVE-4628
> URL: https://issues.apache.org/jira/browse/HIVE-4628
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.11.0
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>Priority: Critical
>
> HS2 SessionManager maintains a hashmap of active HS2 sessions. However,
> inserts and deletes to this hashmap are not synchronized. As a consequence,
> a racing thread could overwrite a valid session object in the hashmap and
> we could end up losing a session!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4628) HS2 sessionmanager should synchronize the call to insert/remove session objects from session hash map

2013-05-30 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670468#comment-13670468
 ] 

Thejas M Nair commented on HIVE-4628:
-

Are you talking about handleToSession in SessionManager? There is a
"synchronized(sessionMapLock)" block around access to it.

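For reference, a minimal sketch of that pattern with generic types (illustrative
only, not the actual SessionManager code):

{code}
import java.util.HashMap;
import java.util.Map;

// Minimal illustration of guarding a handle-to-session map with an explicit
// lock object, in the style of the synchronized(sessionMapLock) block
// mentioned above. Hypothetical sketch, not the actual SessionManager code.
public class SessionMapSketch<H, S> {
  private final Object sessionMapLock = new Object();
  private final Map<H, S> handleToSession = new HashMap<H, S>();

  public void addSession(H handle, S session) {
    synchronized (sessionMapLock) {
      handleToSession.put(handle, session);
    }
  }

  public S removeSession(H handle) {
    synchronized (sessionMapLock) {
      return handleToSession.remove(handle);
    }
  }
}
{code}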

> HS2 sessionmanager should synchronize the call to insert/remove session 
> objects from session hash map
> -
>
> Key: HIVE-4628
> URL: https://issues.apache.org/jira/browse/HIVE-4628
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.11.0
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>Priority: Critical
>
> HS2 SessionManager maintains a hashmap of active HS2 sessions. However,
> inserts and deletes to this hashmap are not synchronized. As a consequence,
> a racing thread could overwrite a valid session object in the hashmap and
> we could end up losing a session!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4634) Add hasNull flag to ORC index

2013-05-30 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-4634:
---

 Summary: Add hasNull flag to ORC index
 Key: HIVE-4634
 URL: https://issues.apache.org/jira/browse/HIVE-4634
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Reporter: Owen O'Malley


It would help predicate pushdown if the index recorded whether each 10k-row
group had null values in that column.
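
A rough illustration of how such a flag could feed predicate evaluation
(hypothetical helper, not ORC's actual reader API):

{code}
// Hypothetical sketch, not ORC's actual reader API: with a per-row-group
// (10k rows) hasNull flag in the index, a "col IS NULL" predicate can skip
// any row group whose flag says it contains no nulls at all.
public class HasNullPushdownSketch {
  static boolean canSkipRowGroup(boolean rowGroupHasNull, boolean predicateIsColIsNull) {
    // A row group with no nulls cannot produce a match for "col IS NULL".
    return predicateIsColIsNull && !rowGroupHasNull;
  }
}
{code}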

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4631) Hive doesn't flush errors of multi-stage queries to STDERR until end of query

2013-05-30 Thread Brad Ruderman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Ruderman updated HIVE-4631:


Priority: Major  (was: Minor)

> Hive doesn't flush errors of multi-stage queries to STDERR until end of query
> -
>
> Key: HIVE-4631
> URL: https://issues.apache.org/jira/browse/HIVE-4631
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0
>Reporter: Brad Ruderman
>
> When running a multi-stage query from the CLI, say for example the query is:
> SELECT  COUNT(*) FROM ( 
> SELECT user from users
> where datetime = 05-10-2013
> UNION ALL
> SELECT user from users
> where datetime = 05-10-2013 
> ) a
> and the command to execute it:
> hive -e "" 
> If one of the jobs fails, for example the 1st of 3, it should immediately
> write that to standard error and flush it. That way, if you are scripting
> a hive query, you can catch the exception and terminate the other jobs rather
> than having to wait for them to run before handling the exception from the first
> job.
> See here for more details:
> http://stackoverflow.com/questions/16825066/hive-flush-errors-of-multi-stage-jobs-to-stderr-in-python
> Thanks!
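
The scripting pattern being described looks roughly like the sketch below
(assumptions: a hive binary on PATH and an illustrative "FAILED" marker on
stderr); it only helps if each stage's error is written and flushed as soon as
it happens, which is what this issue asks for:

{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Rough sketch of the scripting pattern described above: run the CLI, stream
// stderr as it arrives, and stop on the first failed stage. Assumes a "hive"
// binary on PATH; the "FAILED" marker is illustrative, not a guaranteed format.
public class HiveErrorWatcher {
  public static void main(String[] args) throws IOException {
    String query = "SELECT COUNT(*) FROM some_table";   // placeholder query
    Process p = new ProcessBuilder("hive", "-e", query).start();
    try (BufferedReader err =
             new BufferedReader(new InputStreamReader(p.getErrorStream()))) {
      String line;
      while ((line = err.readLine()) != null) {
        System.err.println(line);                 // relay errors as they arrive
        if (line.contains("FAILED")) {
          p.destroy();                            // stop instead of waiting for later jobs
          break;
        }
      }
    }
  }
}
{code}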

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4603) VectorSelectOperator projections change the index of columns for subsequent operators.

2013-05-30 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-4603.
-

   Resolution: Fixed
Fix Version/s: vectorization-branch

I just committed this to the vectorization branch. Thanks, Jitendra!

> VectorSelectOperator projections change the index of columns for subsequent 
> operators.
> --
>
> Key: HIVE-4603
> URL: https://issues.apache.org/jira/browse/HIVE-4603
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Fix For: vectorization-branch
>
> Attachments: HIVE-4603.1.patch, HIVE-4603.2.patch
>
>
> VectorSelectOperator projections change the index of columns for subsequent 
> operators.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4452) Add support for COUNT(*) in vector aggregates

2013-05-30 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4452:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this to the vectorization branch. Thanks, Remus!

> Add support for COUNT(*) in vector aggregates
> -
>
> Key: HIVE-4452
> URL: https://issues.apache.org/jira/browse/HIVE-4452
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
> Fix For: vectorization-branch
>
> Attachments: HIVE-4452.0.patch.txt
>
>
> COUNT(*) must count NULL values
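
To spell out the semantic difference that makes this necessary (plain arrays
stand in for a vectorized batch here; this is an illustration, not Hive's
vectorized aggregate code):

{code}
// Illustration only, not Hive's vectorized aggregate code: COUNT(*) counts
// every row in the batch, while COUNT(col) skips rows whose value is NULL.
public class CountSemanticsSketch {
  static long[] count(boolean[] isNull, int batchSize) {
    long countStar = 0;   // COUNT(*)
    long countCol = 0;    // COUNT(col)
    for (int i = 0; i < batchSize; i++) {
      countStar++;        // NULL rows are still counted by COUNT(*)
      if (!isNull[i]) {
        countCol++;       // COUNT(col) only counts non-NULL values
      }
    }
    return new long[] { countStar, countCol };
  }
}
{code}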

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4602) Enable running all hive e2e tests under vectorization

2013-05-30 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4602:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this to the vectorization branch. Thanks, Tony!

> Enable running all hive e2e tests under vectorization
> -
>
> Key: HIVE-4602
> URL: https://issues.apache.org/jira/browse/HIVE-4602
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Tony Murphy
>Assignee: Tony Murphy
> Fix For: vectorization-branch
>
> Attachments: HIVE-4602.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4595) Support strings in GROUP BY keys

2013-05-30 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4595:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this to the vectorization branch. Thanks, Remus!

> Support strings in GROUP BY keys
> 
>
> Key: HIVE-4595
> URL: https://issues.apache.org/jira/browse/HIVE-4595
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
> Fix For: vectorization-branch
>
> Attachments: HIVE-4595.patch.2.txt, HIVE-4595.patch.txt
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4511) Vectorized reader support for Byte Boolean and Timestamp.

2013-05-30 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4511:


   Resolution: Fixed
Fix Version/s: vectorization-branch
   Status: Resolved  (was: Patch Available)

I just committed this to the vectorization branch. Thanks, Sarvesh!

> Vectorized reader support for Byte Boolean and Timestamp.
> -
>
> Key: HIVE-4511
> URL: https://issues.apache.org/jira/browse/HIVE-4511
> Project: Hive
>  Issue Type: Sub-task
>  Components: File Formats
>Reporter: Jitendra Nath Pandey
>Assignee: Sarvesh Sakalanaga
> Fix For: vectorization-branch
>
> Attachments: Hive-4511.0.patch, Hive-4511.1.patch
>
>
> Byte, boolean and timestamp support should be added to vectorized orc reader.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread rohithsharma (JIRA)
rohithsharma created HIVE-4633:
--

 Summary: MR Jobs execution failed.
 Key: HIVE-4633
 URL: https://issues.apache.org/jira/browse/HIVE-4633
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hive-0.11.0 + Hadoop-0.23 
Reporter: rohithsharma
Priority: Critical


I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR jobs
fail. When I look into the logs, the exception below is thrown in hive.log:

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path 
are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
{noformat}




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4633) MR Jobs execution failed.

2013-05-30 Thread rohithsharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670338#comment-13670338
 ] 

rohithsharma commented on HIVE-4633:


The complete exception trace is below:

{noformat}
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:402)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:154)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:149)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path 
are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and 
input path are inconsistent
at org.apache.hadoop.hive.q
{noformat}

> MR Jobs execution failed.
> -
>
> Key: HIVE-4633
> URL: https://issues.apache.org/jira/browse/HIVE-4633
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive-0.11.0 + Hadoop-0.23 
>Reporter: rohithsharma
>Priority: Critical
>
> I am running Hive-0.11.0 + Hadoop-0.23. All queries that spawn MR
> jobs fail. When I look into the logs, the exception below is thrown in hive.log:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:522)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4543) Broken link in HCat 0.5 doc (Reader and Writer Interfaces)

2013-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670244#comment-13670244
 ] 

Hudson commented on HIVE-4543:
--

Integrated in Hive-trunk-hadoop2 #217 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/217/])
HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty 
Leverenz via gates) (Revision 1487654)

 Result = FAILURE
gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1487654
Files : 
* /hive/trunk/hcatalog/src/docs/src/documentation/content/xdocs/readerwriter.xml


> Broken link in HCat 0.5 doc (Reader and Writer Interfaces)
> --
>
> Key: HIVE-4543
> URL: https://issues.apache.org/jira/browse/HIVE-4543
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Lefty Leverenz
>Assignee: Lefty Leverenz
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4543.1.patch, HIVE-4543.2.patch, HIVE-4543_3.patch, 
> readerwriter.html, readerwriter.pdf, readerwriter.xml
>
>
> Due to HCatalog's move from the incubator to Hive, a link to 
> TestReaderWriter.java is broken at the end of the "Reader and Writer 
> Interfaces" doc for HCat 0.5 
> ([here|http://hive.apache.org/docs/hcat_r0.5.0/readerwriter.html#Complete+Example+Program]).
>   This should be fixed in the html and pdf files.
> Thanks to Himanshu Bari for pointing this out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670160#comment-13670160
 ] 

Navis commented on HIVE-4620:
-

Looks good to me. Running tests.

> MR temp directory conflicts in case of parallel execution mode
> --
>
> Key: HIVE-4620
> URL: https://issues.apache.org/jira/browse/HIVE-4620
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Fix For: 0.12.0
>
> Attachments: HIVE-4620-1.patch
>
>
> In parallel query execution mode, all the parallel running tasks end up
> sharing the same temp/scratch directory. This could lead to file conflicts
> and temp files getting deleted before job completion.
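
One way to avoid that kind of collision, sketched with hypothetical names
(illustrative only, this is not the actual HIVE-4620 patch), is to derive a
unique scratch subdirectory per task:

{code}
import java.util.UUID;

import org.apache.hadoop.fs.Path;

// Hypothetical sketch, not the actual HIVE-4620 patch: derive a per-task
// scratch directory under the query's scratch directory so parallel tasks
// never write into (or clean up) the same temp location.
public class ScratchDirSketch {
  static Path taskScratchDir(Path queryScratchDir, String taskId) {
    // e.g. <queryScratchDir>/<taskId>_<uuid>
    return new Path(queryScratchDir, taskId + "_" + UUID.randomUUID());
  }
}
{code}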

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4436) hive.exec.parallel=true doesn't work on hadoop-2

2013-05-30 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670154#comment-13670154
 ] 

Gopal V commented on HIVE-4436:
---

[~navis] will add a clientpositive testcase and will update the reviewboard 
entry - https://reviews.apache.org/r/10993/

> hive.exec.parallel=true doesn't work on hadoop-2
> 
>
> Key: HIVE-4436
> URL: https://issues.apache.org/jira/browse/HIVE-4436
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.12.0
> Environment: Ubuntu LXC (hive-trunk), CDH 4 on Debian
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-4436.patch, HIVE-4436-test.tgz
>
>
> While running a hive query with multiple independent stages, 
> hive.exec.parallel is a valid optimization to use.
> The query tested has 3 MR jobs - the first job is the root dependency and the
> 2 further jobs depend on the first one.
> When hive.exec.parallel is turned on, the job fails with the following 
> exception
> {code}
> java.io.IOException: java.lang.InterruptedException
>   at org.apache.hadoop.ipc.Client.call(Client.java:1214)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>   at $Proxy12.mkdirs(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>   at $Proxy12.mkdirs(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:447)
>   at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2165)
>   at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:544)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1916)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.createTmpDirs(ExecDriver.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:444)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
> Caused by: java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:921)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1208)
> {code}
> The query plan is as follows
> {code}
>   Stage-9 is a root stage
>   Stage-8 depends on stages: Stage-9
>   Stage-3 depends on stages: Stage-8
>   Stage-0 depends on stages: Stage-3
>   Stage-4 depends on stages: Stage-0
>   Stage-5 depends on stages: Stage-8
>   Stage-1 depends on stages: Stage-5
>   Stage-6 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-9
> Map Reduce Local Work
>   Stage: Stage-8
> Map Reduce
> Map Join Operator
>   Stage: Stage-3
> Map Reduce
>   Stage: Stage-0
> Move Operator
>   Stage: Stage-4
> Stats-Aggr Operator
>   Stage: Stage-5
> Map Reduce
>   Stage: Stage-1
> Move Operator
>   Stage: Stage-6
> Stats-Aggr Operator
> {code}
> -I cannot conclude that this is purely a Hive issue; I will file a bug on HDFS
> if that does show up during triage.-
> *Triaged* - running "set hive.stats.autogather=false;" removes the bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4436) hive.exec.parallel=true doesn't work on hadoop-2

2013-05-30 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670145#comment-13670145
 ] 

Navis commented on HIVE-4436:
-

[~gopalv] Could you include a test case and make a phabricator or review board 
entry? Thanks.

> hive.exec.parallel=true doesn't work on hadoop-2
> 
>
> Key: HIVE-4436
> URL: https://issues.apache.org/jira/browse/HIVE-4436
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.12.0
> Environment: Ubuntu LXC (hive-trunk), CDH 4 on Debian
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-4436.patch, HIVE-4436-test.tgz
>
>
> While running a hive query with multiple independent stages, 
> hive.exec.parallel is a valid optimization to use.
> The query tested has 3 MR jobs - the first job is the root dependency and the
> 2 further jobs depend on the first one.
> When hive.exec.parallel is turned on, the job fails with the following 
> exception
> {code}
> java.io.IOException: java.lang.InterruptedException
>   at org.apache.hadoop.ipc.Client.call(Client.java:1214)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>   at $Proxy12.mkdirs(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>   at $Proxy12.mkdirs(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:447)
>   at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2165)
>   at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:544)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1916)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.createTmpDirs(ExecDriver.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:444)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
> Caused by: java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:921)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1208)
> {code}
> The query plan is as follows
> {code}
>   Stage-9 is a root stage
>   Stage-8 depends on stages: Stage-9
>   Stage-3 depends on stages: Stage-8
>   Stage-0 depends on stages: Stage-3
>   Stage-4 depends on stages: Stage-0
>   Stage-5 depends on stages: Stage-8
>   Stage-1 depends on stages: Stage-5
>   Stage-6 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-9
> Map Reduce Local Work
>   Stage: Stage-8
> Map Reduce
> Map Join Operator
>   Stage: Stage-3
> Map Reduce
>   Stage: Stage-0
> Move Operator
>   Stage: Stage-4
> Stats-Aggr Operator
>   Stage: Stage-5
> Map Reduce
>   Stage: Stage-1
> Move Operator
>   Stage: Stage-6
> Stats-Aggr Operator
> {code}
> -I cannot conclude that this is purely a Hive issue; I will file a bug on HDFS
> if that does show up during triage.-
> *Triaged* - running "set hive.stats.autogather=false;" removes the bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira