[jira] [Updated] (HIVE-4120) Implement decimal encoding for ORC

2013-04-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4120:
--

Attachment: HIVE-4120.D10047.2.patch

omalley updated the revision HIVE-4120 [jira] Implement decimal encoding for 
ORC.

  updated to deal with limited decimal lengths

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D10047

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D10047?vs=31431&id=31473#toc

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java
  ql/src/gen/protobuf/gen-java/org/apache/hadoop/hive/ql/io/orc/OrcProto.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/DecimalColumnStatistics.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestSerializationUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/SerializationUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
  ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestStringRedBlackTree.java
  ql/src/test/queries/clientpositive/decimal_4.q
  ql/src/test/results/clientpositive/decimal_4.q.out

To: JIRA, omalley


 Implement decimal encoding for ORC
 --

 Key: HIVE-4120
 URL: https://issues.apache.org/jira/browse/HIVE-4120
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4120.D10047.1.patch, HIVE-4120.D10047.2.patch, 
 HIVE-4120.D9207.1.patch


 Currently, ORC does not have an encoder for decimal.
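For context on what such an encoder typically involves (a sketch of one common approach, not taken from the attached patch): ORC-style writers store a decimal as its unscaled value plus a scale, with the unscaled value zigzag-mapped so that small negative numbers also get short base-128 varint encodings. A minimal sketch of the zigzag step, with hypothetical names:

```java
// Hypothetical sketch of the zigzag mapping used before varint encoding,
// so small negative unscaled values also serialize to few bytes.
public class ZigZag {
    // Map signed to unsigned: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    public static long encode(long value) {
        return (value << 1) ^ (value >> 63);
    }

    // Inverse mapping: recover the signed value.
    public static long decode(long encoded) {
        return (encoded >>> 1) ^ -(encoded & 1);
    }
}
```

The encoded value is then written as a base-128 varint, and the scale is stored alongside it.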

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1953) Hive should process comments in CliDriver

2013-04-09 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-1953:
-

Attachment: HIVE-1953.tests.update.patch

Hi Navis,

I had mentioned when I uploaded the patch that the unit tests needed to be
run with -Doverwrite=true before commit. I had not attached the results for
all the tests then; luckily, however, I have them, and I am attaching them as
well. This needs to go in too: the tests have some extraneous comments in the
pre-hook/post-hook output.

+ Vikram.








 Hive should process comments in CliDriver
 -

 Key: HIVE-1953
 URL: https://issues.apache.org/jira/browse/HIVE-1953
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-1953.1.patch, HIVE-1953.2.patch, HIVE-1953.3.patch, 
 HIVE-1953.4.patch, HIVE-1953.tests.update.patch


 If you put a comment before a set command, it will fail. 
 Like this:
 -- TestSerDe is a user defined serde where the default delimiter is Ctrl-B
 -- the user is overwriting it with ctrlC
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 Hive should process the comment in CliDriver, and ignore the comment right 
 away, instead of passing it to the downstream processors.
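A minimal sketch of the behavior the issue asks for (class and method names are hypothetical, not the actual CliDriver code): drop full-line "--" comments before the command reaches the downstream processors.

```java
// Hypothetical sketch: strip full-line "--" comments from a CLI command
// before handing it to downstream processors, as the issue requests.
public class CommentStripper {
    public static String strip(String command) {
        StringBuilder out = new StringBuilder();
        for (String line : command.split("\n")) {
            // Skip lines that are entirely a comment.
            if (!line.trim().startsWith("--")) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(line);
            }
        }
        return out.toString();
    }
}
```

With this in place, the commented "set" example from the description would reach the processor as just the set command.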



[jira] [Updated] (HIVE-1953) Hive should process comments in CliDriver

2013-04-09 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-1953:
-

Attachment: HIVE-1953.tests.update.patch

Hi [~navis]

I had mentioned in the review board request that the patch needs to be run with 
-Doverwrite=true for the tests affected by this change. Luckily, I had a full 
run of the tests, but I hadn't uploaded the results out of concern that the 
tests are a moving target and that the suite could instead be rerun with 
-Doverwrite=true before commit. I am attaching the outputs of these tests here 
for commit as well, since they apply cleanly against trunk.

Thanks
Vikram.

 Hive should process comments in CliDriver
 -

 Key: HIVE-1953
 URL: https://issues.apache.org/jira/browse/HIVE-1953
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-1953.1.patch, HIVE-1953.2.patch, HIVE-1953.3.patch, 
 HIVE-1953.4.patch, HIVE-1953.tests.update.patch, HIVE-1953.tests.update.patch


 If you put a comment before a set command, it will fail. 
 Like this:
 -- TestSerDe is a user defined serde where the default delimiter is Ctrl-B
 -- the user is overwriting it with ctrlC
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 Hive should process the comment in CliDriver, and ignore the comment right 
 away, instead of passing it to the downstream processors.



[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-09 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626276#comment-13626276
 ] 

Phabricator commented on HIVE-3509:
---

njain has commented on the revision HIVE-3509 [jira] Exclusive locks are not 
acquired when using dynamic partitions.

INLINE COMMENTS
  ql/src/test/results/clientpositive/lock3.q.out:73 Something is wrong here? 
Shouldn't it be EXCLUSIVE -- this is an incomplete output.

REVISION DETAIL
  https://reviews.facebook.net/D10065

To: JIRA, MattMartin
Cc: njain


 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch


 If locking is enabled, the acquireReadWriteLocks() method in 
 org.apache.hadoop.hive.ql.Driver iterates through all of the input and output 
 entities of the query plan and attempts to acquire the appropriate locks.  In 
 general, it should acquire SHARED locks for all of the input entities and 
 exclusive locks for all of the output entities (see the Hive wiki page on 
 [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more 
 detailed information).
 When the query involves dynamic partitions, the situation is a little more 
 subtle.  As the Hive wiki notes (see previous link):
 {quote}
 in some cases, the list of objects may not be known - for eg. in case of 
 dynamic partitions, the list of partitions being modified is not known at 
 compile time - so, the list is generated conservatively. Since the number of 
 partitions may not be known, an exclusive lock is taken on the table, or the 
 prefix that is known.
 {quote}
 After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the 
 observed behavior is no longer consistent with the behavior described above.  
 [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have 
 altered the logic so that SHARED locks are acquired instead of EXCLUSIVE 
 locks whenever the query involves dynamic partitions.
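The intended behavior described above can be sketched as a lock-mode decision (enum and method names are hypothetical, not the actual Driver code): inputs take SHARED locks, and outputs take EXCLUSIVE locks, including dynamic-partition writes, where the exclusive lock must cover the known table/prefix.

```java
// Hypothetical sketch of the lock-mode decision the issue describes.
public class LockModes {
    public enum Mode { SHARED, EXCLUSIVE }

    public static Mode forEntity(boolean isOutput, boolean dynamicPartition) {
        if (!isOutput) {
            return Mode.SHARED;      // inputs only need shared locks
        }
        if (dynamicPartition) {
            // The partition list is unknown at compile time, so the
            // exclusive lock must cover the table or the known prefix.
            return Mode.EXCLUSIVE;
        }
        return Mode.EXCLUSIVE;       // ordinary outputs are locked exclusively too
    }
}
```

The bug is that, after HIVE-1781, the dynamic-partition branch effectively returns SHARED instead.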



[jira] [Commented] (HIVE-1953) Hive should process comments in CliDriver

2013-04-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626279#comment-13626279
 ] 

Navis commented on HIVE-1953:
-

Regretfully, I forgot that the test runs via Driver. I am running all the tests 
changed in these two days.

 Hive should process comments in CliDriver
 -

 Key: HIVE-1953
 URL: https://issues.apache.org/jira/browse/HIVE-1953
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-1953.1.patch, HIVE-1953.2.patch, HIVE-1953.3.patch, 
 HIVE-1953.4.patch, HIVE-1953.tests.update.patch, HIVE-1953.tests.update.patch


 If you put a comment before a set command, it will fail. 
 Like this:
 -- TestSerDe is a user defined serde where the default delimiter is Ctrl-B
 -- the user is overwriting it with ctrlC
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 Hive should process the comment in CliDriver, and ignore the comment right 
 away, instead of passing it to the downstream processors.



[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626285#comment-13626285
 ] 

Namit Jain commented on HIVE-3509:
--

comments

 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch


 If locking is enabled, the acquireReadWriteLocks() method in 
 org.apache.hadoop.hive.ql.Driver iterates through all of the input and output 
 entities of the query plan and attempts to acquire the appropriate locks.  In 
 general, it should acquire SHARED locks for all of the input entities and 
 exclusive locks for all of the output entities (see the Hive wiki page on 
 [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more 
 detailed information).
 When the query involves dynamic partitions, the situation is a little more 
 subtle.  As the Hive wiki notes (see previous link):
 {quote}
 in some cases, the list of objects may not be known - for eg. in case of 
 dynamic partitions, the list of partitions being modified is not known at 
 compile time - so, the list is generated conservatively. Since the number of 
 partitions may not be known, an exclusive lock is taken on the table, or the 
 prefix that is known.
 {quote}
 After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the 
 observed behavior is no longer consistent with the behavior described above.  
 [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have 
 altered the logic so that SHARED locks are acquired instead of EXCLUSIVE 
 locks whenever the query involves dynamic partitions.



[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-09 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626288#comment-13626288
 ] 

Phabricator commented on HIVE-3509:
---

njain has commented on the revision HIVE-3509 [jira] Exclusive locks are not 
acquired when using dynamic partitions.

INLINE COMMENTS
  ql/src/test/results/clientpositive/lock4.q.out:73 same as lock3
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java:718 This also needs to be 
fixed

REVISION DETAIL
  https://reviews.facebook.net/D10065

To: JIRA, MattMartin
Cc: njain


 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch


 If locking is enabled, the acquireReadWriteLocks() method in 
 org.apache.hadoop.hive.ql.Driver iterates through all of the input and output 
 entities of the query plan and attempts to acquire the appropriate locks.  In 
 general, it should acquire SHARED locks for all of the input entities and 
 exclusive locks for all of the output entities (see the Hive wiki page on 
 [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more 
 detailed information).
 When the query involves dynamic partitions, the situation is a little more 
 subtle.  As the Hive wiki notes (see previous link):
 {quote}
 in some cases, the list of objects may not be known - for eg. in case of 
 dynamic partitions, the list of partitions being modified is not known at 
 compile time - so, the list is generated conservatively. Since the number of 
 partitions may not be known, an exclusive lock is taken on the table, or the 
 prefix that is known.
 {quote}
 After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the 
 observed behavior is no longer consistent with the behavior described above.  
 [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have 
 altered the logic so that SHARED locks are acquired instead of EXCLUSIVE 
 locks whenever the query involves dynamic partitions.



[jira] [Assigned] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-4310:


Assignee: Namit Jain

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain





[jira] [Commented] (HIVE-1953) Hive should process comments in CliDriver

2013-04-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626295#comment-13626295
 ] 

Navis commented on HIVE-1953:
-

Committed. I should have been more careful.

 Hive should process comments in CliDriver
 -

 Key: HIVE-1953
 URL: https://issues.apache.org/jira/browse/HIVE-1953
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-1953.1.patch, HIVE-1953.2.patch, HIVE-1953.3.patch, 
 HIVE-1953.4.patch, HIVE-1953.tests.update.patch, HIVE-1953.tests.update.patch


 If you put a comment before a set command, it will fail. 
 Like this:
 -- TestSerDe is a user defined serde where the default delimiter is Ctrl-B
 -- the user is overwriting it with ctrlC
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 Hive should process the comment in CliDriver, and ignore the comment right 
 away, instead of passing it to the downstream processors.



[jira] [Commented] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626296#comment-13626296
 ] 

Namit Jain commented on HIVE-4310:
--

https://reviews.facebook.net/D10071

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain





Re: StackOverflowError when adding jars from multiple threads concurrently

2013-04-09 Thread Wangwenli
Hi all,

I created an issue for this problem and provided a patch there; I hope someone 
can review it. Thanks.
https://issues.apache.org/jira/browse/HIVE-4317

Regards
Wenli
-Original Message-
From: Wangwenli [mailto:wangwe...@huawei.com] 
Sent: April 8, 2013, 20:28
To: u...@hive.apache.org; dev@hive.apache.org
Subject: StackOverflowError when adding jars from multiple threads concurrently

Hi All,
Recently we found that when multiple JDBC connections add jars concurrently, 
HiveServer throws a StackOverflowError while serializing the MapredWork to 
HDFS. The related issue HIVE-2666 is similar, but I think it missed the 
concurrent scenario.

I find it is because the classloader is changed, which leads to an infinite 
loop. Has anyone met this issue? Any suggestions?


Regards
Wenli
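One mitigation direction, sketched here with hypothetical names (and using plain Java serialization as a stand-in for Hive's XMLEncoder path): pin the context classloader and serialize under a lock, so a concurrent "add jar" cannot swap the loader mid-serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: serialize the work object under a lock with the
// context classloader pinned, so concurrent "add jar" calls cannot swap
// the loader while serialization is in flight.
public class SafeSerialize {
    private static final Object LOCK = new Object();

    public static byte[] serialize(Serializable work) {
        synchronized (LOCK) {
            ClassLoader pinned = Thread.currentThread().getContextClassLoader();
            try {
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                    oos.writeObject(work);
                }
                return bos.toByteArray();
            } catch (IOException e) {
                throw new RuntimeException(e);
            } finally {
                // Restore the loader that was in effect when we started.
                Thread.currentThread().setContextClassLoader(pinned);
            }
        }
    }
}
```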


[jira] [Created] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-09 Thread Gopal V (JIRA)
Gopal V created HIVE-4318:
-

 Summary: OperatorHooks hit performance even when not used
 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V


Operator Hooks inserted into Operator.java cause a performance hit even when it 
is not being used.

For a count(1) query tested with & without the operator hook calls:

{code:title=with}
2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
84.07 sec
Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
OK
28800991
Time taken: 40.407 seconds, Fetched: 1 row(s)
{code}

{code:title=without}
2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
68.48 sec
...
Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
OK
28800991
Time taken: 35.907 seconds, Fetched: 1 row(s)
{code}

The effect is multiplied by the number of operators in the pipeline that have to 
forward the row - the more operators there are, the slower the query.
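One common fix, sketched here with hypothetical names (not the actual Operator.java change), is to guard the per-row hook dispatch behind a cheap emptiness check so that unused hooks cost only a single branch per row:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: skip per-row hook dispatch entirely when no hooks
// are registered, so the common (hook-free) case pays only one branch.
public class HookGuard {
    public static long runRows(int rows, List<Runnable> hooks) {
        long processed = 0;
        boolean hasHooks = !hooks.isEmpty();   // hoisted, checked once per row
        for (int i = 0; i < rows; i++) {
            if (hasHooks) {
                for (Runnable h : hooks) {
                    h.run();                   // only paid when hooks exist
                }
            }
            processed++;                       // stands in for processOp(row, tag)
        }
        return processed;
    }
}
```

This keeps the hook feature available without the unconditional per-row object allocation and call overhead the benchmark shows.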



[jira] [Commented] (HIVE-4317) StackOverflowError when add jar concurrently

2013-04-09 Thread wangwenli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626357#comment-13626357
 ] 

wangwenli commented on HIVE-4317:
-

This issue is similar to HIVE-2706 and HIVE-2666; please reference them.

 StackOverflowError when add jar concurrently 
 -

 Key: HIVE-4317
 URL: https://issues.apache.org/jira/browse/HIVE-4317
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.9.0, 0.10.0
Reporter: wangwenli
Priority: Minor

 scenario: multiple thread add jar and do select operation by jdbc 
 concurrently , when hiveserver serializeMapRedWork sometimes, it will throw 
 StackOverflowError from XMLEncoder.



[jira] [Updated] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4318:
--

Description: 
Operator Hooks inserted into Operator.java cause a performance hit even when it 
is not being used.

For a count(1) query tested with & without the operator hook calls:

{code:title=with}
2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
84.07 sec
Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
OK
28800991
Time taken: 40.407 seconds, Fetched: 1 row(s)
{code}

{code:title=without}
2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
68.48 sec
...
Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
OK
28800991
Time taken: 35.907 seconds, Fetched: 1 row(s)
{code}

The effect is multiplied by the number of operators in the pipeline that have to 
forward the row - the more operators there are, the slower the query.

The modification made to test this was 

{code:title=Operator.java}
--- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
+++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
@@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
HiveException {
   return;
 }
 OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
tag);
-preProcessCounter();
-enterOperatorHooks(opHookContext);
+//preProcessCounter();
+//enterOperatorHooks(opHookContext);
 processOp(row, tag);
-exitOperatorHooks(opHookContext);
-postProcessCounter();
+//exitOperatorHooks(opHookContext);
+//postProcessCounter();
   }
{code}

  was:
Operator Hooks inserted into Operator.java cause a performance hit even when it 
is not being used.

For a count(1) query tested with & without the operator hook calls:

{code:title=with}
2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
84.07 sec
Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
OK
28800991
Time taken: 40.407 seconds, Fetched: 1 row(s)
{code}

{code:title=without}
2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
68.48 sec
...
Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
OK
28800991
Time taken: 35.907 seconds, Fetched: 1 row(s)
{code}

The effect is multiplied by the number of operators in the pipeline that have to 
forward the row - the more operators there are, the slower the query.


 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V

 Operator Hooks inserted into Operator.java cause a performance hit even when 
 it is not being used.
 For a count(1) query tested with & without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row - the more operators there are, the slower the query.
 The modification made to test this was 
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
 HiveException {
return;
  }
  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
 tag);
 -preProcessCounter();
 -enterOperatorHooks(opHookContext);
 +//preProcessCounter();
 +//enterOperatorHooks(opHookContext);
  processOp(row, tag);
 -exitOperatorHooks(opHookContext);
 -postProcessCounter();
 +//exitOperatorHooks(opHookContext);
 +//postProcessCounter();
}
 {code}



[jira] [Updated] (HIVE-4317) StackOverflowError when add jar concurrently

2013-04-09 Thread wangwenli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwenli updated HIVE-4317:


Attachment: (was: hive-4317.patch)

 StackOverflowError when add jar concurrently 
 -

 Key: HIVE-4317
 URL: https://issues.apache.org/jira/browse/HIVE-4317
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.9.0, 0.10.0
Reporter: wangwenli
Priority: Minor

 scenario: multiple thread add jar and do select operation by jdbc 
 concurrently , when hiveserver serializeMapRedWork sometimes, it will throw 
 StackOverflowError from XMLEncoder.



[jira] [Assigned] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-4318:
-

Assignee: Gunther Hagleitner

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner

 Operator Hooks inserted into Operator.java cause a performance hit even when 
 it is not being used.
 For a count(1) query tested with & without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row - the more operators there are, the slower the query.
 The modification made to test this was 
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
 HiveException {
return;
  }
  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
 tag);
 -preProcessCounter();
 -enterOperatorHooks(opHookContext);
 +//preProcessCounter();
 +//enterOperatorHooks(opHookContext);
  processOp(row, tag);
 -exitOperatorHooks(opHookContext);
 -postProcessCounter();
 +//exitOperatorHooks(opHookContext);
 +//postProcessCounter();
}
 {code}



[jira] [Updated] (HIVE-4317) StackOverflowError when add jar concurrently

2013-04-09 Thread wangwenli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwenli updated HIVE-4317:


Status: Patch Available  (was: Open)

 StackOverflowError when add jar concurrently 
 -

 Key: HIVE-4317
 URL: https://issues.apache.org/jira/browse/HIVE-4317
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0, 0.9.0
Reporter: wangwenli
Priority: Minor

 scenario: multiple thread add jar and do select operation by jdbc 
 concurrently , when hiveserver serializeMapRedWork sometimes, it will throw 
 StackOverflowError from XMLEncoder.



[jira] [Updated] (HIVE-4317) StackOverflowError when add jar concurrently

2013-04-09 Thread wangwenli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwenli updated HIVE-4317:


Attachment: hive-4317.1.patch

 StackOverflowError when add jar concurrently 
 -

 Key: HIVE-4317
 URL: https://issues.apache.org/jira/browse/HIVE-4317
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.9.0, 0.10.0
Reporter: wangwenli
Priority: Minor
 Attachments: hive-4317.1.patch


 scenario: multiple thread add jar and do select operation by jdbc 
 concurrently , when hiveserver serializeMapRedWork sometimes, it will throw 
 StackOverflowError from XMLEncoder.



[jira] [Commented] (HIVE-3994) Hive metastore is not working on PostgreSQL 9.2 (most likely on anything 9.0+)

2013-04-09 Thread Andy Jefferson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626362#comment-13626362
 ] 

Andy Jefferson commented on HIVE-3994:
--

Obviously DataNucleus 3.x supports this new PostgreSQL syntax, but sadly you 
keep on using an ancient version.

 Hive metastore is not working on PostgreSQL 9.2 (most likely on anything 9.0+)
 --

 Key: HIVE-3994
 URL: https://issues.apache.org/jira/browse/HIVE-3994
 Project: Hive
  Issue Type: Improvement
Reporter: Jarek Jarcec Cecho

 I'm getting the following exception when running the metastore on PostgreSQL 9.2:
 {code}
 Caused by: javax.jdo.JDODataStoreException: Error executing JDOQL query 
 SELECT THIS.TBL_NAME AS NUCORDER0 FROM TBLS THIS LEFT OUTER JOIN 
 DBS THIS_DATABASE_NAME ON THIS.DB_ID = THIS_DATABASE_NAME.DB_ID 
 WHERE THIS_DATABASE_NAME.NAME = ? AND (LOWER(THIS.TBL_NAME) LIKE ? 
 ESCAPE '\\' ) ORDER BY NUCORDER0  : ERROR: invalid escape string
   Hint: Escape string must be empty or one character..
 NestedThrowables:
 org.postgresql.util.PSQLException: ERROR: invalid escape string
   Hint: Escape string must be empty or one character.
 at 
 org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:313)
 at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:252)
 at 
 org.apache.hadoop.hive.metastore.ObjectStore.getTables(ObjectStore.java:759)
 ... 28 more
 Caused by: org.postgresql.util.PSQLException: ERROR: invalid escape string
   Hint: Escape string must be empty or one character.
 at 
 org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2096)
 at 
 org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1829)
 at 
 org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
 at 
 org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:510)
 at 
 org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:386)
 at 
 org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:271)
 at 
 org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
 at 
 org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
 at 
 org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:457)
 at 
 org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:123)
 at 
 org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
 at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
 at 
 org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
 at org.datanucleus.store.query.Query.executeWithArray(Query.java:1499)
 at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
 ... 29 more
 {code}
 I've googled a bit about that and found a lot of similar issues in different 
 projects, thus I'm assuming that this might be some backward compatibility 
 issue on the PostgreSQL side.
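 For context (an assumption on my part, not stated in this thread): PostgreSQL 
 9.1 switched the default of standard_conforming_strings to on, so the literal 
 '\\' in the generated SQL is no longer read as one backslash but as two 
 characters, which violates PostgreSQL's one-character ESCAPE rule. A trivial 
 sketch of the two readings:

```java
// Minimal illustration of how the same SQL literal '\\' is interpreted
// under the old and new string-literal rules (assumed behavior; see the
// standard_conforming_strings setting in the PostgreSQL docs).
public class EscapeStringDemo {
    // Number of characters the server sees in the SQL literal '\\'
    public static int escapeLength(boolean standardConformingStrings) {
        // Old behavior: backslash escapes inside the literal, so '\\'
        // collapses to a single backslash. New behavior: backslashes are
        // ordinary characters, so '\\' is two characters.
        return standardConformingStrings ? 2 : 1;
    }

    public static void main(String[] args) {
        System.out.println(escapeLength(false)); // pre-9.1 default: 1, accepted
        System.out.println(escapeLength(true));  // 9.1+ default: 2, rejected
    }
}
```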

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4316) bug introduced by HIVE-3464

2013-04-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626387#comment-13626387
 ] 

Namit Jain commented on HIVE-4316:
--

Thanks Navis

 bug introduced by HIVE-3464
 ---

 Key: HIVE-4316
 URL: https://issues.apache.org/jira/browse/HIVE-4316
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain

   // for outer joins, it should not exceed 16 aliases (short type)
 
   if (!node.getNoOuterJoin() || !target.getNoOuterJoin()) {
 if (node.getRightAliases().length + 
 target.getRightAliases().length + 1 > 16) {
   LOG.info(ErrorMsg.JOINNODE_OUTERJOIN_MORETHAN_16);
   continue;
 }
   }
 It is checking RightAliases() twice
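 The 16-alias cap in the snippet above comes from the short type used as a 
 per-alias bitmask; a trivial sketch of that arithmetic (my reading of the 
 comment, not code from Hive):

```java
public class JoinAliasLimit {
    // A Java short has 16 bits, so a bitmask with one bit per join alias
    // can track at most 16 aliases; the guard above enforces exactly that.
    public static int maxOuterJoinAliases() {
        return Short.SIZE; // 16
    }

    public static void main(String[] args) {
        System.out.println(maxOuterJoinAliases()); // prints 16
    }
}
```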

--


[jira] [Resolved] (HIVE-4316) bug introduced by HIVE-3464

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-4316.
--

Resolution: Not A Problem

 bug introduced by HIVE-3464
 ---

 Key: HIVE-4316
 URL: https://issues.apache.org/jira/browse/HIVE-4316
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain

   // for outer joins, it should not exceed 16 aliases (short type)
 
   if (!node.getNoOuterJoin() || !target.getNoOuterJoin()) {
 if (node.getRightAliases().length + 
 target.getRightAliases().length + 1 > 16) {
   LOG.info(ErrorMsg.JOINNODE_OUTERJOIN_MORETHAN_16);
   continue;
 }
   }
 It is checking RightAliases() twice

--


[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626434#comment-13626434
 ] 

Hudson commented on HIVE-2340:
--

Integrated in Hive-trunk-hadoop2 #147 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/147/])
HIVE-2340 : optimize orderby followed by a groupby (Navis via Ashutosh 
Chauhan) (Revision 1465721)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465721
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
* /hive/trunk/ql/src/test/queries/clientpositive/auto_join26.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby_distinct_samekey.q
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate.q
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
* /hive/trunk/ql/src/test/results/clientpositive/groupby2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby2_map_skew.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_cube1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_rollup1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/infer_bucket_sort.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ppd2.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out
* /hive/trunk/ql/src/test/results/compiler/plan/join1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join3.q.xml


 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Fix For: 0.11.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.12.patch, 
 HIVE-2340.13.patch, HIVE-2340.14.patch, 
 HIVE-2340.14.rebased_and_schema_clone.patch, HIVE-2340.15.patch, 
 HIVE-2340.1.patch.txt, HIVE-2340.D1209.10.patch, HIVE-2340.D1209.11.patch, 
 HIVE-2340.D1209.12.patch, HIVE-2340.D1209.13.patch, HIVE-2340.D1209.14.patch, 
 HIVE-2340.D1209.15.patch, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, 
 HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch, testclidriver.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--


[jira] [Updated] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4310:
-

Attachment: hive.4310.1.patch-nohcat

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4310.1.patch, hive.4310.1.patch-nohcat




--


[jira] [Updated] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4310:
-

Attachment: hive.4310.1.patch

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4310.1.patch, hive.4310.1.patch-nohcat




--


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Carter Shanklin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626636#comment-13626636
 ] 

Carter Shanklin commented on HIVE-4271:
---

Gunther, one thing to consider is Teradata interoperability: they support up 
to 38 rather than 36, and claim to do this with 16 bytes. See 
http://developer.teradata.com/tools/articles/how-many-digits-in-a-decimal

I believe SQL Server is also 38, but I'm not sure. If we can get 38, that would 
be ideal from a compatibility point of view. If there is a big performance hit 
due to encoding or some other reason, that's a good reason to go with 36 
rather than 38, since there are probably not too many apps using 37 or 38 
digits. There are bound to be some out there somewhere, though.

Last thought: starting with 18 is fine, since it is future-proofed from a DDL 
point of view, but there is good upside to being able to make stronger 
compatibility statements.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.
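 The packing arithmetic behind the proposed 36-digit limit can be sketched as 
 follows (an illustration of the claim above, not the eventual Hive 
 implementation):

```java
public class DecimalPacking {
    // Digits representable when packing groups of 9 decimal digits into
    // 4-byte words across the given number of bytes.
    public static int maxDigits(int totalBytes) {
        return (totalBytes / 4) * 9;
    }

    public static void main(String[] args) {
        // 9 digits fit in 4 bytes because 10^9 - 1 < 2^32 - 1.
        System.out.println(999_999_999L <= (1L << 32) - 1); // true
        // Two longs = 16 bytes, i.e. four 9-digit groups = 36 digits.
        System.out.println(maxDigits(16)); // 36
    }
}
```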

--


lots of tests failing

2013-04-09 Thread Namit Jain
It seems that the comments have been removed from the output files, and so a 
lot of tests are failing.
I have not debugged, but https://issues.apache.org/jira/browse/HIVE-1953 looks 
like the culprit.

Navis, is that so? Are you updating the log files?


[jira] [Commented] (HIVE-1953) Hive should process comments in CliDriver

2013-04-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626714#comment-13626714
 ] 

Namit Jain commented on HIVE-1953:
--

[~navis], are a lot of tests failing due to this ?

 Hive should process comments in CliDriver
 -

 Key: HIVE-1953
 URL: https://issues.apache.org/jira/browse/HIVE-1953
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-1953.1.patch, HIVE-1953.2.patch, HIVE-1953.3.patch, 
 HIVE-1953.4.patch, HIVE-1953.tests.update.patch, HIVE-1953.tests.update.patch


 If you put a comment before a set command, it will fail.
 Like this:
 -- TestSerDe is a user defined serde where the default delimiter is Ctrl-B
 -- the user is overwriting it with ctrlC
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 Hive should process the comment in CliDriver, and ignore the comment right 
 away, instead of passing it to the downstream processors.

--


[jira] [Commented] (HIVE-4211) Common column and partition column are defined the same type and union them, it will hints Schema of both sides of union should match.

2013-04-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626722#comment-13626722
 ] 

Namit Jain commented on HIVE-4211:
--

Seems like an old bug - don't remember the exact reason.
Can you fix it?

 Common column and partition column are defined the same type and union them, 
 it will hints Schema of both sides of union should match. 
 ---

 Key: HIVE-4211
 URL: https://issues.apache.org/jira/browse/HIVE-4211
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.9.0, 0.11.0
Reporter: Daisy.Yuan
  Labels: patch
 Attachments: PartitionColumnTypInfo.patch


 create table UnionBoolA (id boolean, no boolean) row format delimited fields 
 terminated by ' ';
 load data local inpath '/opt/files/unionboola.txt' into table UnionBoolA;
 create table UnionPartionBool (id int) partitioned by (no boolean) row format 
 delimited fields terminated by ' ';
 load data local inpath '/opt/files/unionpartint.txt' into table 
 UnionPartionBool partition(no=true);
 unionboola.txt:
 true true
 false true
 true true
 false true
 unionpartint.txt:
 111
 444
 1122
 44
 when I execute
 select * from( select no from UnionBoolA union all select no from 
 UnionPartionBool) unionResult, it is failed. The exception info is as 
 follows:
 FAILED: Error in semantic analysis: 1:66 Schema of both sides of union should 
 match: Column no is of type boolean on first table and type string on second 
 table. Error encountered near token 'UnionPartionBool'
 org.apache.hadoop.hive.ql.parse.SemanticException: 1:66 Schema of both sides 
 of union should match: Column no is of type boolean on first table and type 
 string on second table. Error encountered near token 'UnionPartionBool'
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:6295)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6733)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6748)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7556)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:621)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:525)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1153)
 at 
 org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:226)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:535)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:532)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:532)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 So I executed explain select no from UnionPartionBool to see the partition 
 column, and found that the partition column type is string.
 All partition column types are changed to TypeInfoFactory.stringTypeInfo in 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(), where it is 
 marked as a TODO. Modifying it to 
 TypeInfoFactory.getPrimitiveTypeInfo(part_col.getType()) fixes this bug, as 
 you can see in the patch.

--


[jira] [Updated] (HIVE-4199) ORC writer doesn't handle non-UTF8 encoded Text properly

2013-04-09 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4199:


Status: Open  (was: Patch Available)

 ORC writer doesn't handle non-UTF8 encoded Text properly
 

 Key: HIVE-4199
 URL: https://issues.apache.org/jira/browse/HIVE-4199
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Attachments: HIVE-4199.HIVE-4199.HIVE-4199.D9501.1.patch, 
 HIVE-4199.HIVE-4199.HIVE-4199.D9501.2.patch, 
 HIVE-4199.HIVE-4199.HIVE-4199.D9501.3.patch, 
 HIVE-4199.HIVE-4199.HIVE-4199.D9501.4.patch


 StringTreeWriter currently converts fields stored as Text objects into 
 Strings. This can lose information (see 
 http://en.wikipedia.org/wiki/Replacement_character#Replacement_character), 
 and is also unnecessary since the dictionary stores Text objects.
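 The information loss can be reproduced directly: decoding bytes that are not 
 valid UTF-8 substitutes U+FFFD, so a Text-to-String-to-bytes round trip is 
 not byte-identical. A small sketch of the effect (plain Java, not the ORC 
 writer code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplacementCharDemo {
    // True if decoding the bytes as UTF-8 and re-encoding reproduces them.
    public static boolean roundTrips(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return Arrays.equals(raw, s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // 0xC3 opens a two-byte UTF-8 sequence, but 0x28 ('(') is not a
        // valid continuation byte, so the decoder emits U+FFFD instead.
        byte[] invalid = {(byte) 0xC3, (byte) 0x28};
        System.out.println(roundTrips(invalid)); // false: bytes were lost
        System.out.println(roundTrips("ok".getBytes(StandardCharsets.UTF_8))); // true
    }
}
```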

--


[jira] [Updated] (HIVE-4105) Hive MapJoinOperator unnecessarily deserializes values for all join-keys

2013-04-09 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HIVE-4105:
--

Assignee: Vinod Kumar Vavilapalli
  Status: Patch Available  (was: Open)

 Hive MapJoinOperator unnecessarily deserializes values for all join-keys
 

 Key: HIVE-4105
 URL: https://issues.apache.org/jira/browse/HIVE-4105
 Project: Hive
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Attachments: HIVE-4105-20130301.1.txt, HIVE-4105-20130301.txt, 
 HIVE-4105.patch


 We can avoid this for inner joins. Hive does an explicit value 
 de-serialization up front, even for rows which won't emit any output. In 
 these cases, we can make do with key de-serialization alone.
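 The idea reads as: probe the small-table hash map with just the deserialized 
 key, and deserialize the row's values only when the key matches, which is 
 safe for inner joins because non-matching rows emit nothing. A toy sketch 
 (the "key|v1,v2" row format and helper names are made up for illustration; 
 this is not Hive's actual code path):

```java
import java.util.HashMap;
import java.util.Map;

public class LazyValueMapJoin {
    static int valueDeserializations = 0;

    // Hypothetical "serialized" row: "key|v1,v2".
    static String deserializeKey(String row) {
        return row.substring(0, row.indexOf('|'));
    }

    static String[] deserializeValues(String row) {
        valueDeserializations++; // the cost we want to avoid paying up front
        return row.substring(row.indexOf('|') + 1).split(",");
    }

    // Inner-join probe: deserialize the key first, values only on a match.
    public static int probe(Map<String, String> smallTable, String[] bigRows) {
        int matches = 0;
        for (String row : bigRows) {
            String key = deserializeKey(row);
            if (smallTable.containsKey(key)) {
                deserializeValues(row); // only rows that can emit output pay
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, String> small = new HashMap<>();
        small.put("k1", "x");
        int matches = probe(small, new String[]{"k1|a,b", "k2|c,d", "k3|e,f"});
        System.out.println(matches);               // 1
        System.out.println(valueDeserializations); // 1, not 3
    }
}
```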

--


[jira] [Created] (HIVE-4319) Revert changes checked-in as part of HIVE-1953

2013-04-09 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-4319:


 Summary: Revert changes checked-in as part of HIVE-1953
 Key: HIVE-4319
 URL: https://issues.apache.org/jira/browse/HIVE-4319
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical
 Fix For: 0.11.0
 Attachments: HIVE-4319.1.patch

In the patch supplied on HIVE-1953, I missed running tests for hadoop 2.0, 
TestBeeLineDriver and MiniMRCliDriver. I am providing a revert patch here so 
that I can run all these tests and provide an uber patch later, but I don't 
want to hold up the community because of this change.

--


[jira] [Updated] (HIVE-4319) Revert changes checked-in as part of HIVE-1953

2013-04-09 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4319:
-

Attachment: HIVE-4319.1.patch

 Revert changes checked-in as part of HIVE-1953
 --

 Key: HIVE-4319
 URL: https://issues.apache.org/jira/browse/HIVE-4319
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical
 Fix For: 0.11.0

 Attachments: HIVE-4319.1.patch


 In the patch supplied on HIVE-1953, I missed running tests for hadoop 
 2.0, TestBeeLineDriver and MiniMRCliDriver. I am providing a revert patch 
 here so that I can run all these tests and provide an uber patch later, but 
 I don't want to hold up the community because of this change.

--


[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626937#comment-13626937
 ] 

Hudson commented on HIVE-2340:
--

Integrated in Hive-trunk-h0.21 #2053 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2053/])
HIVE-2340 : optimize orderby followed by a groupby (Navis via Ashutosh 
Chauhan) (Revision 1465721)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465721
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
* /hive/trunk/ql/src/test/queries/clientpositive/auto_join26.q
* /hive/trunk/ql/src/test/queries/clientpositive/groupby_distinct_samekey.q
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate.q
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
* /hive/trunk/ql/src/test/results/clientpositive/groupby2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby2_map_skew.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_cube1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_rollup1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/infer_bucket_sort.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ppd2.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out
* /hive/trunk/ql/src/test/results/compiler/plan/join1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join3.q.xml


 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Fix For: 0.11.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.12.patch, 
 HIVE-2340.13.patch, HIVE-2340.14.patch, 
 HIVE-2340.14.rebased_and_schema_clone.patch, HIVE-2340.15.patch, 
 HIVE-2340.1.patch.txt, HIVE-2340.D1209.10.patch, HIVE-2340.D1209.11.patch, 
 HIVE-2340.D1209.12.patch, HIVE-2340.D1209.13.patch, HIVE-2340.D1209.14.patch, 
 HIVE-2340.D1209.15.patch, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, 
 HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch, testclidriver.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--


[jira] [Commented] (HIVE-4314) Result of mapjoin_test_outer.q is not deterministic

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626938#comment-13626938
 ] 

Hudson commented on HIVE-4314:
--

Integrated in Hive-trunk-h0.21 #2053 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2053/])
HIVE-4314 Result of mapjoin_test_outer.q is not deterministic
(Navis via namit) (Revision 1465892)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465892
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/mapjoin_test_outer.q
* /hive/trunk/ql/src/test/results/clientpositive/mapjoin_test_outer.q.out


 Result of mapjoin_test_outer.q is not deterministic
 ---

 Key: HIVE-4314
 URL: https://issues.apache.org/jira/browse/HIVE-4314
 Project: Hive
  Issue Type: Wish
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-4314.D10053.1.patch


 Shows different result between hadoop-20 and hadoop-23

--


[jira] [Commented] (HIVE-3308) Mixing avro and snappy gives null values

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626939#comment-13626939
 ] 

Hudson commented on HIVE-3308:
--

Integrated in Hive-trunk-h0.21 #2053 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2053/])
HIVE-3308 Mixing avro and snappy gives null values (Bennie Schut via Navis) 
(Revision 1465849)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465849
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/avro_compression_enabled.q
* /hive/trunk/ql/src/test/results/clientpositive/avro_compression_enabled.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java


 Mixing avro and snappy gives null values
 

 Key: HIVE-3308
 URL: https://issues.apache.org/jira/browse/HIVE-3308
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Bennie Schut
Assignee: Bennie Schut
 Fix For: 0.11.0

 Attachments: HIVE-3308.patch1.txt, HIVE-3308.patch2.txt


 By default, Hive uses LazySimpleSerDe for output.
 When I enable compression and run select count(*) from avrotable, the output 
 is a file with the .avro extension, but it will then display null values, 
 since the file is in reality not an avro file but a file created by 
 LazySimpleSerDe using compression, so it should be a .snappy file.
 This causes any job (the exception is select * from avrotable, since that is 
 not truly a job) to show null values.
 If you use any serde other than avro, you can temporarily fix this by setting 
 set hive.output.file.extension=.snappy and it will correctly work again, but 
 this won't work on avro since it overwrites the hive.output.file.extension 
 during initialization.
 When you dump the query result into a table with create table bla as, you 
 can rename the .avro file into .snappy and the select from bla will also 
 magically work again.
 Input and output serdes don't always match, so when I use avro as an input 
 format it should not set the hive.output.file.extension.
 Once it's set, all queries will use it and fail, making the connection 
 useless to reuse.

--


[jira] [Commented] (HIVE-4311) DOS line endings in auto_join26.q

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626940#comment-13626940
 ] 

Hudson commented on HIVE-4311:
--

Integrated in Hive-trunk-h0.21 #2053 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2053/])
HIVE-4311 : DOS line endings in auto_join26.q (Gunther Hagleitner via 
Ashutosh Chauhan) (Revision 1465820)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465820
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/auto_join26.q


 DOS line endings in auto_join26.q
 -

 Key: HIVE-4311
 URL: https://issues.apache.org/jira/browse/HIVE-4311
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4311.patch


 Seems like auto_join26.q got changed to DOS line endings by HIVE-2340.

--


[jira] [Commented] (HIVE-4319) Revert changes checked-in as part of HIVE-1953

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626951#comment-13626951
 ] 

Ashutosh Chauhan commented on HIVE-4319:


+1

 Revert changes checked-in as part of HIVE-1953
 --

 Key: HIVE-4319
 URL: https://issues.apache.org/jira/browse/HIVE-4319
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical
 Fix For: 0.11.0

 Attachments: HIVE-4319.1.patch


 In the patch supplied on HIVE-1953, I missed running tests for hadoop 
 2.0, TestBeeLineDriver and MiniMRCliDriver. I am providing a revert patch 
 here so that I can run all these tests and provide an uber patch later, but 
 I don't want to hold up the community because of this change.

--


[jira] [Commented] (HIVE-4248) Implement a memory manager for ORC

2013-04-09 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626957#comment-13626957
 ] 

Owen O'Malley commented on HIVE-4248:
-

Kevin,
  After thinking about it a bit more, how about if I ask the writers to 
re-check their memory relative to their allocation when the pool has shrunk by 
more than 10% from the last time they checked? I ran a quick experiment where 
I had a pool of 1GB and an increasing set of 250MB writers. By only doing the 
check when the pool has changed by more than 10%, as 1000 writers were added 
it cut the number of checks down from 1000 to 49. Does that sound reasonable?
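The throttling idea above can be illustrated with a simplified simulation. This is a hypothetical sketch, not the HIVE-4248 patch: it assumes the pool is divided evenly among writers and counts how many re-checks the "more than 10% shrinkage" rule triggers as writers are added (the exact count depends on the model, so it need not match the 49 quoted above).

```java
// Sketch (illustrative names, not Hive's API): writers re-check their memory
// only when their share of the pool has shrunk by more than 10% since the
// last re-check, instead of on every change.
public class MemoryManagerSketch {

  /** Simulates adding n writers to a fixed pool and returns how many
   *  re-checks the 10% rule triggers (versus n - 1 with a check per add). */
  public static int simulate(int n) {
    double lastCheckedShare = 1.0;  // per-writer share at the last re-check
    int checks = 0;
    for (int writers = 2; writers <= n; writers++) {
      double share = 1.0 / writers;           // pool divided evenly
      if (share < lastCheckedShare * 0.9) {   // shrunk by more than 10%
        checks++;                             // writers re-check allocations
        lastCheckedShare = share;
      }
    }
    return checks;
  }

  public static void main(String[] args) {
    System.out.println("re-checks for 1000 writers: " + simulate(1000));
  }
}
```

Because the share shrinks geometrically between re-checks, the number of checks grows with the logarithm of the writer count rather than linearly.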

 Implement a memory manager for ORC
 --

 Key: HIVE-4248
 URL: https://issues.apache.org/jira/browse/HIVE-4248
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4248.D9993.1.patch, HIVE-4248.D9993.2.patch


 With the large default stripe size (256MB) and dynamic partitions, it is 
 quite easy for users to run out of memory when writing ORC files. We probably 
 need a solution that keeps track of the total number of concurrent ORC 
 writers and divides the available heap space between them. 



[jira] [Updated] (HIVE-4306) PTFDeserializer should reconstruct OIs based on InputOI passed to PTFOperator

2013-04-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4306:
--

Attachment: HIVE-4306.D10017.2.patch

hbutani updated the revision HIVE-4306 [jira] PTFDeserializer should 
reconstruct OIs based on InputOI passed to PTFOperator.

- update comments
- Merge branch 'ptf' into HIVE-4306
- don't filter virtual cols; fix npath

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D10017

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D10017?vs=31371&id=31533#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDeserializer.java
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java
  ql/src/test/queries/clientpositive/auto_join26.q
  ql/src/test/results/clientpositive/ptf.q.out
  testutils/hadoop

To: JIRA, ashutoshc, hbutani


 PTFDeserializer should reconstruct OIs based on InputOI passed to PTFOperator
 -

 Key: HIVE-4306
 URL: https://issues.apache.org/jira/browse/HIVE-4306
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Prajakta Kalmegh
 Attachments: HIVE-4306.2.patch.txt, HIVE-4306.D10017.1.patch, 
 HIVE-4306.D10017.2.patch


 Currently PTFDesc holds onto shape information that is used by the 
 PTFDeserializer to reconstruct OIs during runtime. This could interfere with 
 changes made to OIs during Optimization. 



[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-04-09 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4189:


Attachment: HIVE-4189.2.patch.txt

 ORC fails with String column that ends in lots of nulls
 ---

 Key: HIVE-4189
 URL: https://issues.apache.org/jira/browse/HIVE-4189
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4189.1.patch.txt, HIVE-4189.2.patch.txt


 When ORC attempts to write out a string column that ends in enough nulls to 
 span an index stride, StringTreeWriter's writeStripe method will get an 
 exception from TreeWriter's writeStripe method:
 "Column has wrong number of index entries found: x expected: y"
 This is caused by rowIndexValueCount having multiple entries equal to the 
 number of non-null rows in the column, combined with the fact that 
 StringTreeWriter has special logic for constructing its index.



[jira] [Commented] (HIVE-1953) Hive should process comments in CliDriver

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627032#comment-13627032
 ] 

Hudson commented on HIVE-1953:
--

Integrated in Hive-trunk-hadoop2 #148 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/148/])
Missing test results from HIVE-1953 (Vikram Dixit K via Navis) (Revision 
1465903)
HIVE-1953 Hive should process comments in CliDriver (Vikram Dixit K via Navis) 
(Revision 1465890)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465903
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/alter_table_serde.q
* /hive/trunk/ql/src/test/results/clientpositive/add_part_exist.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/add_partition_no_whitelist.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/add_partition_with_whitelist.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alias_casted_column.q.out
* /hive/trunk/ql/src/test/results/clientpositive/allcolref_in_udf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter5.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/alter_numbuckets_partitioned_table.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/alter_numbuckets_partitioned_table2.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/alter_partition_clusterby_sortby.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_partition_coltype.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/alter_partition_protect_mode.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/alter_partition_with_whitelist.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_rename_partition.q.out
* 
/hive/trunk/ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_table_serde.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_table_serde2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ambiguous_col.q.out
* /hive/trunk/ql/src/test/results/clientpositive/archive_multi.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join14_hadoop20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_join25.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/autogen_colalias.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_change_schema.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_compression_enabled.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_evolved_schemas.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_joins.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_nullable_fields.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_sanity_test.q.out
* /hive/trunk/ql/src/test/results/clientpositive/avro_schema_error_message.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ba_table1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ba_table2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ba_table3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ba_table_udfs.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ba_table_union.q.out
* /hive/trunk/ql/src/test/results/clientpositive/binary_output_format.q.out
* /hive/trunk/ql/src/test/results/clientpositive/binary_table_bincolserde.q.out
* /hive/trunk/ql/src/test/results/clientpositive/binary_table_colserde.q.out
* 

[jira] [Commented] (HIVE-4314) Result of mapjoin_test_outer.q is not deterministic

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627033#comment-13627033
 ] 

Hudson commented on HIVE-4314:
--

Integrated in Hive-trunk-hadoop2 #148 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/148/])
HIVE-4314 Result of mapjoin_test_outer.q is not deterministic
(Navis via namit) (Revision 1465892)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465892
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/mapjoin_test_outer.q
* /hive/trunk/ql/src/test/results/clientpositive/mapjoin_test_outer.q.out


 Result of mapjoin_test_outer.q is not deterministic
 ---

 Key: HIVE-4314
 URL: https://issues.apache.org/jira/browse/HIVE-4314
 Project: Hive
  Issue Type: Wish
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-4314.D10053.1.patch


 Shows different results between hadoop-20 and hadoop-23



[jira] [Commented] (HIVE-3308) Mixing avro and snappy gives null values

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627034#comment-13627034
 ] 

Hudson commented on HIVE-3308:
--

Integrated in Hive-trunk-hadoop2 #148 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/148/])
HIVE-3308 Mixing avro and snappy gives null values (Bennie Schut via Navis) 
(Revision 1465849)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465849
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/avro_compression_enabled.q
* /hive/trunk/ql/src/test/results/clientpositive/avro_compression_enabled.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java


 Mixing avro and snappy gives null values
 

 Key: HIVE-3308
 URL: https://issues.apache.org/jira/browse/HIVE-3308
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Bennie Schut
Assignee: Bennie Schut
 Fix For: 0.11.0

 Attachments: HIVE-3308.patch1.txt, HIVE-3308.patch2.txt


 By default Hive uses LazySimpleSerDe for output.
 When I now enable compression and select count(*) from avrotable, the output 
 is a file with the .avro extension, but it then displays null values, since 
 the file is in reality not an avro file but a file created by 
 LazySimpleSerDe using compression, so it should be a .snappy file.
 This causes any job (except select * from avrotable, as that is not truly a 
 job) to show null values.
 If you use any serde other than avro you can temporarily fix this by setting 
 hive.output.file.extension=.snappy and it will correctly work again, but 
 this won't work on avro since it overwrites hive.output.file.extension 
 during initialization.
 When you dump the query result into a table with create table bla as you 
 can rename the .avro file to .snappy and the select from bla will also 
 magically work again.
 Input and output serdes don't always match, so when I use avro as an input 
 format it should not set hive.output.file.extension.
 Once it's set, all queries will use it and fail, making the connection 
 useless to reuse.



[jira] [Commented] (HIVE-4311) DOS line endings in auto_join26.q

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627035#comment-13627035
 ] 

Hudson commented on HIVE-4311:
--

Integrated in Hive-trunk-hadoop2 #148 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/148/])
HIVE-4311 : DOS line endings in auto_join26.q (Gunther Hagleitner via 
Ashutosh Chauhan) (Revision 1465820)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465820
Files : 
* /hive/trunk/ql/src/test/queries/clientpositive/auto_join26.q


 DOS line endings in auto_join26.q
 -

 Key: HIVE-4311
 URL: https://issues.apache.org/jira/browse/HIVE-4311
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4311.patch


 Seems like auto_join26.q got changed to DOS line endings by HIVE-2340.



[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627063#comment-13627063
 ] 

Gunther Hagleitner commented on HIVE-4318:
--

Nice find! Thanks! I can see how that will be problematic in the inner loop. 
Just glancing at it:

- the only place where we use operator hooks seems to be the home-grown 
profiler. Neat to have, but not a good reason to slow down execution.
- this seems to add the following overhead to the inner loop:
  - allocation of context per row
  - iteration through empty list of hooks (utils always sets empty list)
  - two additional virtual calls

Is anyone really using, or planning on using, this? The easiest fix would be 
to just remove it, which seems the right thing to do. It doesn't seem useful 
to have a feature that adds a lot of overhead in the inner loop. There are 
ways to cut down the overhead when there are no hooks, but I'd like to know 
if there's opposition to removing it first.
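One way to cut the no-hook overhead mentioned above is to guard the per-row work behind an emptiness check, so the context allocation and hook iteration vanish from the fast path. This is a hypothetical sketch with illustrative names, not the actual Operator.java fix:

```java
import java.util.Collections;
import java.util.List;

// Sketch of a guarded hot path: hooks cost nothing per row unless installed.
public class OperatorSketch {
  // empty by default, so normal queries take the fast path below
  private List<Runnable> operatorHooks = Collections.emptyList();
  long rowsProcessed = 0;

  public void setHooks(List<Runnable> hooks) {
    this.operatorHooks = hooks;
  }

  public void process(Object row, int tag) {
    if (operatorHooks.isEmpty()) {
      processOp(row, tag);  // fast path: no context allocation, no hook calls
      return;
    }
    for (Runnable h : operatorHooks) h.run();  // "enter" hooks
    processOp(row, tag);
    for (Runnable h : operatorHooks) h.run();  // "exit" hooks
  }

  protected void processOp(Object row, int tag) {
    rowsProcessed++;  // stands in for the operator's real per-row work
  }
}
```

This removes the per-row allocation and the iteration over an always-empty list, though the branch itself and the virtual call to processOp remain.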



 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner

 Operator hooks inserted into Operator.java cause a performance hit even when 
 they are not being used.
 For a count(1) query, tested with & without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row - the more operators there are, the slower the query.
 The modification made to test this was:
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
 HiveException {
return;
  }
  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
 tag);
 -preProcessCounter();
 -enterOperatorHooks(opHookContext);
 +//preProcessCounter();
 +//enterOperatorHooks(opHookContext);
  processOp(row, tag);
 -exitOperatorHooks(opHookContext);
 -postProcessCounter();
 +//exitOperatorHooks(opHookContext);
 +//postProcessCounter();
}
 {code}



[jira] [Created] (HIVE-4320) Consider extending max limit for precision to 38

2013-04-09 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-4320:


 Summary: Consider extending max limit for precision to 38
 Key: HIVE-4320
 URL: https://issues.apache.org/jira/browse/HIVE-4320
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner


Max precision of 38 still fits in 128 bits. It changes the way you do math on 
these numbers, though. Need to see if there will be perf implications, but 
there's a strong case for supporting 38 (instead of 36) to comply with other 
DBs (Oracle, SQL Server, Teradata).
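The arithmetic behind "38 fits in 128" can be checked directly: the largest 38-digit unscaled value, 10^38 - 1, needs only 127 bits, so it fits in two longs with the sign bit to spare, while 39 digits would overflow 128 bits.

```java
import java.math.BigInteger;

// Verifies that a 38-digit unscaled decimal value fits in 128 bits
// and that 39 digits would not.
public class DecimalWidthCheck {
  public static void main(String[] args) {
    BigInteger max38 = BigInteger.TEN.pow(38).subtract(BigInteger.ONE);
    BigInteger max39 = BigInteger.TEN.pow(39).subtract(BigInteger.ONE);
    System.out.println("bits for 38 digits: " + max38.bitLength()); // 127
    System.out.println("bits for 39 digits: " + max39.bitLength()); // 130
  }
}
```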





[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627096#comment-13627096
 ] 

Ashutosh Chauhan commented on HIVE-4318:


I agree with Gunther's analysis and suggestion to remove them altogether. If 
the only reason to introduce operator-level hooks was to measure performance, 
then we have failed in that objective by actually making the system slower, 
not faster. Profiling should be done using JVM toolkits like YourKit, 
jvisualvm, etc., not by modifying code.

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner

 Operator hooks inserted into Operator.java cause a performance hit even when 
 they are not being used.
 For a count(1) query, tested with & without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row - the more operators there are, the slower the query.
 The modification made to test this was:
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
 HiveException {
return;
  }
  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
 tag);
 -preProcessCounter();
 -enterOperatorHooks(opHookContext);
 +//preProcessCounter();
 +//enterOperatorHooks(opHookContext);
  processOp(row, tag);
 -exitOperatorHooks(opHookContext);
 -postProcessCounter();
 +//exitOperatorHooks(opHookContext);
 +//postProcessCounter();
}
 {code}



[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627100#comment-13627100
 ] 

Gunther Hagleitner commented on HIVE-4271:
--

Thanks Carter and Eric for the feedback. I've opened HIVE-4320 in response to 
your comments. I'd like to think through how we're going to do math on these 
numbers, but you make a great point.

I'd still like to move forward with this jira and do the rest in HIVE-4320.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.
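The packing arithmetic in the description can be verified: 9 decimal digits fit in an unsigned 4-byte word because 10^9 < 2^32, so the 16 bytes of two longs hold four such words, i.e. 4 × 9 = 36 digits.

```java
import java.math.BigInteger;

// Checks the arithmetic behind the proposed 36-digit limit:
// 9 decimal digits per 4-byte word, 4 words in two longs.
public class PackingCheck {
  public static void main(String[] args) {
    BigInteger nineDigits = BigInteger.TEN.pow(9);         // 10^9
    BigInteger fourBytes  = BigInteger.ONE.shiftLeft(32);  // 2^32
    // 10^9 = 1,000,000,000 < 4,294,967,296 = 2^32
    System.out.println(nineDigits.compareTo(fourBytes) < 0); // true
    // two longs = 16 bytes = four 4-byte words = 4 * 9 digits
    System.out.println(16 / 4 * 9); // 36
  }
}
```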



[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627112#comment-13627112
 ] 

Ashutosh Chauhan commented on HIVE-4271:


The exact precision, and how to implement it in a performant way, we can 
discuss on a separate jira. Let's use this jira to limit the max precision.
+1 to the latest patch.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.



[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2013-04-09 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627148#comment-13627148
 ] 

Arup Malakar commented on HIVE-3620:


Error log I see in the server is:
{code}
2013-04-09 19:47:41,955 ERROR thrift.ProcessFunction 
(ProcessFunction.java:process(41)) - Internal error processing get_database
java.lang.OutOfMemoryError: Java heap space
    at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
    at java.lang.StringBuilder.<init>(StringBuilder.java:80)
    at oracle.net.ns.Packet.<init>(Packet.java:513)
    at oracle.net.ns.Packet.<init>(Packet.java:142)
    at oracle.net.ns.NSProtocol.connect(NSProtocol.java:279)
    at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1042)
    at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:301)
    at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:531)
    at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:221)
    at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
    at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:503)
    at yjava.database.jdbc.oracle.KeyDbDriverWrapper.connect(KeyDbDriverWrapper.java:81)
    at java.sql.DriverManager.getConnection(DriverManager.java:582)
    at java.sql.DriverManager.getConnection(DriverManager.java:185)
    at org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:75)
    at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1148)
    at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
    at org.datanucleus.store.rdbms.ConnectionProviderPriorityList.getConnection(ConnectionProviderPriorityList.java:57)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:363)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:322)
    at org.datanucleus.store.connection.ConnectionManagerImpl.enlistResource(ConnectionManagerImpl.java:388)
    at org.datanucleus.store.connection.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:253)
    at org.datanucleus.store.connection.AbstractConnectionFactory.getConnection(AbstractConnectionFactory.java:60)
    at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:338)
    at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:307)
    at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:582)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1692)
    at org.datanucleus.store.query.Query.executeWithArray(Query.java:1527)
    at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
    at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:405)
    at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:424)
{code}

Show tables takes time too:
{code}
hive> show tables;
OK
load_test_table_2_0
test
Time taken: 285.705 seconds

Log in server:

2013-04-09 19:53:52,783 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(434)) - 5: get_database: default
2013-04-09 19:54:09,143 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:newRawStore(391)) - 5: Opening raw store with implemenation 
class:org.apache.hadoop.hive.metastore.ObjectStore
2013-04-09 19:57:44,812 INFO  metastore.ObjectStore 
(ObjectStore.java:initialize(222)) - ObjectStore, initialize called
2013-04-09 19:57:44,816 INFO  metastore.ObjectStore 
(ObjectStore.java:setConf(205)) - Initialized ObjectStore
2013-04-09 19:57:51,700 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(434)) - 6: get_database: default
2013-04-09 19:57:51,706 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:newRawStore(391)) - 6: Opening raw store with implemenation 
class:org.apache.hadoop.hive.metastore.ObjectStore
2013-04-09 19:57:51,712 INFO  metastore.ObjectStore 
(ObjectStore.java:initialize(222)) - ObjectStore, initialize called
2013-04-09 19:57:51,714 INFO  metastore.ObjectStore 
(ObjectStore.java:setConf(205)) - Initialized ObjectStore
2013-04-09 19:57:52,048 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(434)) - 6: get_tables: db=default pat=.*
2013-04-09 19:57:52,262 ERROR DataNucleus.Transaction 
(Log4JLogger.java:error(115)) - Operation rollback failed on resource: 
org.datanucleus.store.rdbms.ConnectionFactoryImpl$EmulatedXAResource@18d3a2f, 
error code UNKNOWN and transaction: [DataNucleus Transaction, ID=Xid=�, 
enlisted 

[jira] [Updated] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4318:
-

Attachment: HIVE-4318.1.patch

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch


 Operator hooks inserted into Operator.java cause a performance hit even when 
 they are not being used.
 For a count(1) query, tested with & without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row - the more operators there are, the slower the query.
 The modification made to test this was:
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
 HiveException {
return;
  }
  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
 tag);
 -preProcessCounter();
 -enterOperatorHooks(opHookContext);
 +//preProcessCounter();
 +//enterOperatorHooks(opHookContext);
  processOp(row, tag);
 -exitOperatorHooks(opHookContext);
 -postProcessCounter();
 +//exitOperatorHooks(opHookContext);
 +//postProcessCounter();
}
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4199) ORC writer doesn't handle non-UTF8 encoded Text properly

2013-04-09 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627200#comment-13627200
 ] 

Phabricator commented on HIVE-4199:
---

sxyuan has commented on the revision HIVE-4199 [jira] ORC writer doesn't 
handle non-UTF8 encoded Text properly.

  Inline comments.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringRedBlackTree.java:44 The 
reason why I kept the add(String) method is that it can avoid doing two copies 
when the original data is actually a String. If the dictionary only takes Text 
objects, the writer will have to convert the String to a new Text object, and 
then set(Text) will copy the bytes over to the dictionary's internal Text 
object.
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java:717 I've looked 
into adding statistics for non-UTF8 strings, but I discovered that the stats 
are serialized to Protobuf objects which require strings to be UTF8 encoded. Do 
you have any suggestions?

REVISION DETAIL
  https://reviews.facebook.net/D9501

To: kevinwilfong, sxyuan
Cc: JIRA, omalley


 ORC writer doesn't handle non-UTF8 encoded Text properly
 

 Key: HIVE-4199
 URL: https://issues.apache.org/jira/browse/HIVE-4199
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Attachments: HIVE-4199.HIVE-4199.HIVE-4199.D9501.1.patch, 
 HIVE-4199.HIVE-4199.HIVE-4199.D9501.2.patch, 
 HIVE-4199.HIVE-4199.HIVE-4199.D9501.3.patch, 
 HIVE-4199.HIVE-4199.HIVE-4199.D9501.4.patch


 StringTreeWriter currently converts fields stored as Text objects into 
 Strings. This can lose information (see 
 http://en.wikipedia.org/wiki/Replacement_character#Replacement_character), 
 and is also unnecessary since the dictionary stores Text objects.
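
 The information loss via the replacement character can be demonstrated in a
 few lines (a Python sketch for illustration; the byte values are arbitrary
 non-UTF-8 data, not taken from the patch):

 {code:title=replacement-character loss}
 # Bytes that are not valid UTF-8 (e.g. Latin-1 encoded "café").
 raw = b"caf\xe9"

 # Decoding with replacement substitutes U+FFFD for the bad byte...
 decoded = raw.decode("utf-8", errors="replace")
 assert decoded == "caf\ufffd"

 # ...so re-encoding cannot recover the original bytes: the round trip
 # is lossy, which is why converting Text to String is unsafe here.
 assert decoded.encode("utf-8") != raw
 {code}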

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4321) Add Compile/Execute support to Hive Server

2013-04-09 Thread Sarah Parra (JIRA)
Sarah Parra created HIVE-4321:
-

 Summary: Add Compile/Execute support to Hive Server
 Key: HIVE-4321
 URL: https://issues.apache.org/jira/browse/HIVE-4321
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Thrift API
Reporter: Sarah Parra


Adds support for query compilation in Hive Server 2 and adds Thrift support for 
compile/execute APIs.

This enables scenarios that need to compile a query before it is executed, 
e.g. an ODBC driver that implements SQLPrepare/SQLExecute. This is commonly 
used by a client that needs metadata for the query before it is executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4321) Add Compile/Execute support to Hive Server

2013-04-09 Thread Sarah Parra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarah Parra updated HIVE-4321:
--

Attachment: CompileExecute.patch

 Add Compile/Execute support to Hive Server
 --

 Key: HIVE-4321
 URL: https://issues.apache.org/jira/browse/HIVE-4321
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Thrift API
Reporter: Sarah Parra
 Attachments: CompileExecute.patch


 Adds support for query compilation in Hive Server 2 and adds Thrift support 
 for compile/execute APIs.
 This enables scenarios that need to compile a query before it is executed, 
 e.g. an ODBC driver that implements SQLPrepare/SQLExecute. This is commonly 
 used for a client that needs metadata for the query before it is executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4321) Add Compile/Execute support to Hive Server

2013-04-09 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627252#comment-13627252
 ] 

Shreepadma Venugopalan commented on HIVE-4321:
--

[~sarahparra]: Can you post a review request on phabricator or review board? 
Please remove the files that are auto generated by the thrift compiler in the 
review request. Thanks.

 Add Compile/Execute support to Hive Server
 --

 Key: HIVE-4321
 URL: https://issues.apache.org/jira/browse/HIVE-4321
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Thrift API
Reporter: Sarah Parra
 Attachments: CompileExecute.patch


 Adds support for query compilation in Hive Server 2 and adds Thrift support 
 for compile/execute APIs.
 This enables scenarios that need to compile a query before it is executed, 
 e.g. an ODBC driver that implements SQLPrepare/SQLExecute. This is commonly 
 used for a client that needs metadata for the query before it is executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-09 Thread Samuel Yuan (JIRA)
Samuel Yuan created HIVE-4322:
-

 Summary: SkewedInfo in Metastore Thrift API cannot be deserialized 
in Python
 Key: HIVE-4322
 URL: https://issues.apache.org/jira/browse/HIVE-4322
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor


The Thrift-generated Python code that deserializes Thrift objects fails 
whenever a complex type is used as a map key, because by default mutable Python 
objects such as lists do not have a hash function. See 
https://issues.apache.org/jira/browse/THRIFT-162 for related discussion.
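
The failure mode is easy to reproduce (a minimal Python sketch, independent of 
the generated Thrift code):

def demo():
    # Python lists are mutable and deliberately unhashable, so a map
    # (dict) keyed by a Thrift list cannot be built during deserialization.
    try:
        {["a", "b"]: 1}
        raise AssertionError("unreachable: list keys are never hashable")
    except TypeError:
        pass  # TypeError: unhashable type: 'list'

    # Tuples are immutable and hashable, which is why mapping Thrift
    # lists to tuples (if it were possible) would avoid the problem.
    assert {("a", "b"): 1}[("a", "b")] == 1

demo()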

The SkewedInfo struct contains a map which uses a list as a key, breaking the 
Python Thrift interface. It is not possible to specify the mapping from Thrift 
types to Python types; otherwise we could map Thrift lists to Python tuples. 
Instead, the proposed workaround wraps the list inside a new struct. This alone 
does not accomplish anything, but allows Python clients to define a hash 
function for the struct class, e.g.:

def f(object):
    return hash(tuple(object.skewedValueList))

SkewedValueList.__hash__ = f

In practice a more efficient hash might be defined that does not involve 
copying the list. The advantage of wrapping the list inside a struct is that 
the client does not have to define the hash on the list itself, which would 
change the behaviour of lists everywhere else in the code.
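
One possible shape for such a copy-free hash (a sketch only; SkewedValueList 
here is a stand-in for the generated Thrift class, and only its 
skewedValueList attribute is assumed):

class SkewedValueList:
    """Stand-in for the Thrift-generated wrapper struct."""
    def __init__(self, skewedValueList):
        self.skewedValueList = skewedValueList

def skewed_value_list_hash(self):
    # Fold element hashes together instead of building a tuple copy
    # with hash(tuple(...)).
    h = 0
    for value in self.skewedValueList:
        h = (h * 31 + hash(value)) & 0xFFFFFFFF
    return h

# A real client would also define __eq__ so that equal-valued structs
# compare equal as dict keys; only __hash__ is shown here.
SkewedValueList.__hash__ = skewed_value_list_hash

# Equal lists produce equal hashes.
a = SkewedValueList(["x", "y"])
b = SkewedValueList(["x", "y"])
assert hash(a) == hash(b)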

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-09 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627275#comment-13627275
 ] 

Gang Tim Liu commented on HIVE-4322:


[~sxyuan] Good write-up. Thank you for working on it.

 SkewedInfo in Metastore Thrift API cannot be deserialized in Python
 ---

 Key: HIVE-4322
 URL: https://issues.apache.org/jira/browse/HIVE-4322
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor

 The Thrift-generated Python code that deserializes Thrift objects fails 
 whenever a complex type is used as a map key, because by default mutable 
 Python objects such as lists do not have a hash function. See 
 https://issues.apache.org/jira/browse/THRIFT-162 for related discussion.
 The SkewedInfo struct contains a map which uses a list as a key, breaking the 
 Python Thrift interface. It is not possible to specify the mapping from 
 Thrift types to Python types, or otherwise we could map Thrift lists to 
 Python tuples. Instead, the proposed workaround wraps the list inside a new 
 struct. This alone does not accomplish anything, but allows Python clients to 
 define a hash function for the struct class, e.g.:
 def f(object):
     return hash(tuple(object.skewedValueList))
 SkewedValueList.__hash__ = f
 In practice a more efficient hash might be defined that does not involve 
 copying the list. The advantage of wrapping the list inside a struct is that 
 the client does not have to define the hash on the list itself, which would 
 change the behaviour of lists everywhere else in the code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2013-04-09 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627283#comment-13627283
 ] 

Thiruvel Thirumoolan commented on HIVE-3620:


I have had this problem in the past (in my case 0.2 million partitions, while 
stress testing dynamic partitions). The metastore crashed badly, though maybe 
mine was an extreme case. My workaround was to drop one hierarchy of 
partitions at a time: there were many partition keys, and I used to drop the 
topmost one instead of dropping the table.

Maybe it's worthwhile to visit HIVE-3214 and see if there is anything we could 
do at the DataNucleus end.

 Drop table using hive CLI throws error when the total number of partition in 
 the table is around 50K.
 -

 Key: HIVE-3620
 URL: https://issues.apache.org/jira/browse/HIVE-3620
 Project: Hive
  Issue Type: Bug
Reporter: Arup Malakar

 hive drop table load_test_table_2_0; 
  
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
   
   
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask 
 The DB used is Oracle and hive had only one table:
 select COUNT(*) from PARTITIONS;
 54839
 I can try and play around with the parameter 
 hive.metastore.client.socket.timeout if that is what is being used. But it is 
 200 seconds as of now, and 200 seconds for a drop table call seems high 
 already.
 Thanks,
 Arup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: lots of tests failing

2013-04-09 Thread Navis류승우
Yes, it's all my fault. I'd forgotten that the Driver affects test results.

A revert of HIVE-1953 has been created (HIVE-4319) and is ready to be committed.

Again, sorry for all the trouble to the community (especially to
Vikram). I should think twice before doing things.

2013/4/10 Namit Jain nj...@fb.com:
 It seems that the comments have been removed from the output files, and so a 
 lot of tests are failing.
 I have not debugged, but https://issues.apache.org/jira/browse/HIVE-1953 
 looks like the culprit.

 Navis, is that so? Are you updating the log files?


[jira] [Resolved] (HIVE-4319) Revert changes checked-in as part of HIVE-1953

2013-04-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4319.


Resolution: Fixed

Committed to trunk. Thanks, Vikram, for the quick turnaround to keep trunk 
builds green!

 Revert changes checked-in as part of HIVE-1953
 --

 Key: HIVE-4319
 URL: https://issues.apache.org/jira/browse/HIVE-4319
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical
 Fix For: 0.11.0

 Attachments: HIVE-4319.1.patch


 In the patch supplied on HIVE-1953, I missed running tests for Hadoop 2.0, 
 TestBeeLineDriver, and MiniMRCliDriver. I am providing a revert patch here so 
 that I can run all these tests and provide an uber patch later; I don't want 
 to hold up the community because of this change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #115

2013-04-09 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/115/

--
[...truncated 41948 lines...]
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2013-04-09 17:37:00,435 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-04-09_17-36-57_635_1950736896251787967/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201304091737_1576487496.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, 
num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-04-09_17-37-01_910_2013011078774809933/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-04-09_17-37-01_910_2013011078774809933/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201304091737_941468010.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable

[jira] [Updated] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4271:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Gunther!

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/filformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatype are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.
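
 The arithmetic behind the proposed limit can be verified quickly (a Python 
 check for illustration, not code from the patch):

 {code:title=digit-packing check}
 # 9 decimal digits fit in 4 bytes: the largest 9-digit value is below 2^32.
 assert 10**9 - 1 < 2**32

 # An 8-byte long therefore holds two 9-digit groups (18 digits),
 # and two longs hold 36 digits, the proposed precision limit.
 digits_per_long = 2 * 9
 assert digits_per_long * 2 == 36

 # A 36-digit value also fits in the 128 bits of two longs as a
 # plain binary integer.
 assert 10**36 - 1 < 2**128
 {code}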

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: lots of tests failing

2013-04-09 Thread Ashutosh Chauhan
No worries, Navis! It happens to all of us. I have reverted the patch and the
build should be green now.
The only thing I will suggest keeping in mind is to run all tests before
committing.

Ashutosh


On Tue, Apr 9, 2013 at 5:08 PM, Navis류승우 navis@nexr.com wrote:

 Yes, it's all my fault. I'd forgotten that the Driver affects test results.

 A revert of HIVE-1953 has been created (HIVE-4319) and is ready to be committed.

 Again, sorry for all the trouble to the community (especially to
 Vikram). I should think twice before doing things.

 2013/4/10 Namit Jain nj...@fb.com:
  It seems that the comments have been removed from the output files, and
 so a lot of tests are failing.
 I have not debugged, but
 https://issues.apache.org/jira/browse/HIVE-1953 looks like the culprit.
 
  Navis, is that so? Are you updating the log files?



[jira] [Commented] (HIVE-4107) Update Hive 0.10.0 RELEASE_NOTES.txt

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627327#comment-13627327
 ] 

Ashutosh Chauhan commented on HIVE-4107:


+1

 Update Hive 0.10.0 RELEASE_NOTES.txt
 

 Key: HIVE-4107
 URL: https://issues.apache.org/jira/browse/HIVE-4107
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.10.0
Reporter: Lefty Leverenz
Assignee: Thejas M Nair
 Fix For: 0.11.0

 Attachments: HIVE-4107.1.patch


 Hive release 0.10.0 includes a RELEASE_NOTES.txt file left over from release 
 0.8.1 (branch-0.8-r2).
 It needs to be updated to match the JIRA change log here:  
 [https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12320745styleName=TextprojectId=12310843].
 Thanks to Eric Chu for drawing attention to this problem on 
 u...@hive.apache.org.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4107) Update Hive 0.10.0 RELEASE_NOTES.txt

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627328#comment-13627328
 ] 

Ashutosh Chauhan commented on HIVE-4107:


Committed to trunk. Thanks, Thejas!

 Update Hive 0.10.0 RELEASE_NOTES.txt
 

 Key: HIVE-4107
 URL: https://issues.apache.org/jira/browse/HIVE-4107
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.10.0
Reporter: Lefty Leverenz
Assignee: Thejas M Nair
 Fix For: 0.11.0

 Attachments: HIVE-4107.1.patch


 Hive release 0.10.0 includes a RELEASE_NOTES.txt file left over from release 
 0.8.1 (branch-0.8-r2).
 It needs to be updated to match the JIRA change log here:  
 [https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12320745styleName=TextprojectId=12310843].
 Thanks to Eric Chu for drawing attention to this problem on 
 u...@hive.apache.org.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Subscribe to List

2013-04-09 Thread karthik tunga
Hi Manikanda,

Please follow the instructions at
http://hive.apache.org/mailing_lists.html#Developers to subscribe.

Cheers,
Karthik


On 9 April 2013 20:43, Manikanda Prabhu gmkprabhu1...@gmail.com wrote:

 Hi ,

 Kindly include me as part of this group.

 My personal email id is gmkprabhu1...@gmail.com

 Regards,
 Mani



[jira] [Commented] (HIVE-4306) PTFDeserializer should reconstruct OIs based on InputOI passed to PTFOperator

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627347#comment-13627347
 ] 

Ashutosh Chauhan commented on HIVE-4306:


ptf_npath.q still fails. The query succeeds but fails in the diff with the 
.q.out file. I didn't look into whether the previous results or the new ones 
were correct. Harish, can you take a look, and if the new test results are 
good, can you update the .q.out? Also, the patch contains unnecessary diffs 
for auto_join26.q and the testutils/hadoop files.

 PTFDeserializer should reconstruct OIs based on InputOI passed to PTFOperator
 -

 Key: HIVE-4306
 URL: https://issues.apache.org/jira/browse/HIVE-4306
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Prajakta Kalmegh
 Attachments: HIVE-4306.2.patch.txt, HIVE-4306.D10017.1.patch, 
 HIVE-4306.D10017.2.patch


 Currently PTFDesc holds onto shape information that is used by the 
 PTFDeserializer to reconstruct OIs during runtime. This could interfere with 
 changes made to OIs during Optimization. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4323) sqlline dependency is not required

2013-04-09 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-4323:
---

 Summary: sqlline dependency is not required
 Key: HIVE-4323
 URL: https://issues.apache.org/jira/browse/HIVE-4323
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair


Since sqlline has been incorporated as beeline, the jar dependency is not 
required.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4323) sqlline dependency is not required

2013-04-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4323:


Attachment: HIVE-4323.1.patch

HIVE-4323.1.patch removes the dependency from ivy and from the 
eclipse-templates file.

It will be useful to have this patch committed to the 0.11 branch as well, so 
that we don't ship an unnecessary jar.


 sqlline dependency is not required
 --

 Key: HIVE-4323
 URL: https://issues.apache.org/jira/browse/HIVE-4323
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4323.1.patch


 Since sqlline has been incorporated as beeline, the jar dependency is not 
 required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4323) sqlline dependency is not required

2013-04-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627371#comment-13627371
 ] 

Gunther Hagleitner commented on HIVE-4323:
--

Looks good to me! (That jar caused me some grief today because of the sf.net 
outage.)

 sqlline dependency is not required
 --

 Key: HIVE-4323
 URL: https://issues.apache.org/jira/browse/HIVE-4323
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4323.1.patch


 Since sqlline has been incorporated as beeline, the jar dependency is not 
 required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3861) Upgrade hbase dependency to 0.94.2

2013-04-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627384#comment-13627384
 ] 

Gunther Hagleitner commented on HIVE-3861:
--

Rebased and updated as suggested by [~sushanth]. Thanks for the feedback!

 Upgrade hbase dependency to 0.94.2
 --

 Key: HIVE-3861
 URL: https://issues.apache.org/jira/browse/HIVE-3861
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-3861.2.patch, HIVE-3861.patch


 Hive tests fail to run against hbase v0.94.2. Proposing to upgrade the 
 dependency and change the test setup to properly work with the newer version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3861) Upgrade hbase dependency to 0.94.2

2013-04-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-3861:
-

Status: Patch Available  (was: Open)

 Upgrade hbase dependency to 0.94.2
 --

 Key: HIVE-3861
 URL: https://issues.apache.org/jira/browse/HIVE-3861
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-3861.2.patch, HIVE-3861.patch


 Hive tests fail to run against hbase v0.94.2. Proposing to upgrade the 
 dependency and change the test setup to properly work with the newer version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3509:
--

Attachment: HIVE-3509.D10065.2.patch

MattMartin updated the revision HIVE-3509 [jira] Exclusive locks are not 
acquired when using dynamic partitions.

- Updated Driver.java to make sure that dynamic partitions are handled 
properly when one or more parts of the dynamic partition are specified (e.g. 
insert overwrite table tstsrcpart partition (ds ='2008-04-08', hr) select key, 
value, hr where ds = '2008-04-08';).  The earlier check-in on this 
branch/issue only addressed queries that used dynamic partitions without 
specifying any of the partitions (e.g. insert overwrite table tstsrcpart 
partition (ds, hr) select key, value, ds, hr where ds = '2008-04-08';).
- Fixed a lint error.

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D10065

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D10065?vs=31467id=31563#toc

AFFECTED FILES
  data/conf/hive-site.xml
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java
  ql/src/test/org/apache/hadoop/hive/ql/hooks/PreExecutePrinter.java
  ql/src/test/queries/clientnegative/lockneg1.q
  ql/src/test/queries/clientnegative/lockneg2.q
  ql/src/test/queries/clientnegative/lockneg3.q
  ql/src/test/queries/clientnegative/lockneg4.q
  ql/src/test/queries/clientnegative/lockneg5.q
  ql/src/test/queries/clientpositive/lock1.q
  ql/src/test/queries/clientpositive/lock2.q
  ql/src/test/queries/clientpositive/lock3.q
  ql/src/test/queries/clientpositive/lock4.q
  ql/src/test/results/clientnegative/lockneg1.q.out
  ql/src/test/results/clientnegative/lockneg2.q.out
  ql/src/test/results/clientnegative/lockneg3.q.out
  ql/src/test/results/clientnegative/lockneg4.q.out
  ql/src/test/results/clientnegative/lockneg5.q.out
  ql/src/test/results/clientpositive/lock1.q.out
  ql/src/test/results/clientpositive/lock2.q.out
  ql/src/test/results/clientpositive/lock3.q.out
  ql/src/test/results/clientpositive/lock4.q.out

To: JIRA, MattMartin
Cc: njain


 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, 
 HIVE-3509.D10065.2.patch


 If locking is enabled, the acquireReadWriteLocks() method in 
 org.apache.hadoop.hive.ql.Driver iterates through all of the input and output 
 entities of the query plan and attempts to acquire the appropriate locks.  In 
 general, it should acquire SHARED locks for all of the input entities and 
 exclusive locks for all of the output entities (see the Hive wiki page on 
 [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more 
 detailed information).
 When the query involves dynamic partitions, the situation is a little more 
 subtle.  As the Hive wiki notes (see previous link):
 {quote}
 in some cases, the list of objects may not be known - for eg. in case of 
 dynamic partitions, the list of partitions being modified is not known at 
 compile time - so, the list is generated conservatively. Since the number of 
 partitions may not be known, an exclusive lock is taken on the table, or the 
 prefix that is known.
 {quote}
 After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the 
 observed behavior is no longer consistent with the behavior described above.  
 [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have 
 altered the logic so that SHARED locks are acquired instead of EXCLUSIVE 
 locks whenever the query involves dynamic partitions.
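The conservative lock-list generation the wiki quote describes (an exclusive lock on the table or the known partition prefix when some partition columns are dynamic) can be sketched as below. This is a minimal illustration only; `exclusiveLockTarget` and the `table/col=val` path format are assumptions, not Hive's actual locking API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LockPrefix {
    /**
     * Returns the object to lock exclusively for a dynamic-partition write:
     * the table name plus any leading, statically specified partition specs.
     * A null value marks a dynamic partition column, where the known prefix ends.
     */
    public static String exclusiveLockTarget(String table, Map<String, String> partSpec) {
        StringBuilder target = new StringBuilder(table);
        for (Map.Entry<String, String> e : partSpec.entrySet()) {
            if (e.getValue() == null) break;   // first dynamic column: stop here
            target.append('/').append(e.getKey()).append('=').append(e.getValue());
        }
        return target.toString();
    }
}
```

For a query like `insert overwrite table tstsrcpart partition (ds='2008-04-08', hr) ...`, the exclusive lock would cover `tstsrcpart/ds=2008-04-08`, since `hr` is dynamic; with `partition (ds, hr)` it would cover the whole table.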

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4120) Implement decimal encoding for ORC

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627393#comment-13627393
 ] 

Ashutosh Chauhan commented on HIVE-4120:


[~owen.omalley] Latest patch from jira is not applying cleanly.

 Implement decimal encoding for ORC
 --

 Key: HIVE-4120
 URL: https://issues.apache.org/jira/browse/HIVE-4120
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4120.D10047.1.patch, HIVE-4120.D10047.2.patch, 
 HIVE-4120.D9207.1.patch


 Currently, ORC does not have an encoder for decimal.



[jira] [Created] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

2013-04-09 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-4324:
---

 Summary: ORC Turn off dictionary encoding when number of distinct 
keys is greater than threshold
 Key: HIVE-4324
 URL: https://issues.apache.org/jira/browse/HIVE-4324
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


Add a configurable threshold so that if the number of distinct values in a 
string column is greater than that fraction of non-null values, dictionary 
encoding is turned off.
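The thresholding rule described above reduces to a single predicate over the column's statistics. A hedged sketch; `useDictionary` and its parameter names are illustrative assumptions, not ORC's actual API:

```java
public class DictionaryThreshold {
    /**
     * Returns true if dictionary encoding should be kept for a string column:
     * the fraction of distinct values among non-null values must not exceed
     * the configured threshold (a fraction in [0, 1]).
     */
    public static boolean useDictionary(long distinctValues, long nonNullValues,
                                        double threshold) {
        if (nonNullValues == 0) {
            return true;  // nothing written yet; keep the default encoding
        }
        return (double) distinctValues / nonNullValues <= threshold;
    }
}
```

With a threshold of 0.5, a column where 10 of 100 non-null values are distinct keeps dictionary encoding, while one with 90 distinct values falls back to direct encoding.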



Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #342

2013-04-09 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/342/



[jira] [Commented] (HIVE-4268) Beeline should support the -f option

2013-04-09 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627400#comment-13627400
 ] 

Prasad Mujumdar commented on HIVE-4268:
---

[~robw] Thanks for the patch. Looks fine overall. I left a minor comment on the 
review board. 

 Beeline should support the -f option
 

 Key: HIVE-4268
 URL: https://issues.apache.org/jira/browse/HIVE-4268
 Project: Hive
  Issue Type: Improvement
  Components: CLI, HiveServer2
Affects Versions: 0.10.0
Reporter: Rob Weltman
Assignee: Rob Weltman
  Labels: HiveServer2
 Fix For: 0.11.0

 Attachments: HIVE-4268.1.patch.txt, HIVE-4268.2.patch.txt


 Beeline should support the -f option (pass in a script to execute) for 
 compatibility with the Hive CLI.



Re: lots of tests failing

2013-04-09 Thread Namit Jain
No issues - thanks.


On 4/10/13 6:16 AM, Ashutosh Chauhan hashut...@apache.org wrote:

No worries, Navis! It happens to all of us. I have reverted the patch
and the build should be green now.
The only thing I'd suggest keeping in mind is to run all tests before
committing.

Ashutosh


On Tue, Apr 9, 2013 at 5:08 PM, Navis류승우 navis@nexr.com wrote:

 Yes, it's all my fault. I'd forgotten that the Driver affects test
results.

 A revert of HIVE-1953 has been created (HIVE-4319) and is ready to be
 committed.

 Again, sorry for all the trouble to the community (especially to
 Vikram). I should think twice before doing things.

 2013/4/10 Namit Jain nj...@fb.com:
  It seems that the comments have been removed from the output files,
and
 so a lot of tests are failing.
  I have not debugged it, but
 https://issues.apache.org/jira/browse/HIVE-1953 looks like the culprit.
 
  Navis, is that so ? Are you updating the log files ?




[jira] [Created] (HIVE-4325) Merge HCat NOTICE file with Hive NOTICE file

2013-04-09 Thread Alan Gates (JIRA)
Alan Gates created HIVE-4325:


 Summary: Merge HCat NOTICE file with Hive NOTICE file
 Key: HIVE-4325
 URL: https://issues.apache.org/jira/browse/HIVE-4325
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0


There are a few items in the HCat NOTICE.txt that are not in Hive's NOTICE.  We 
need to merge these.



[jira] [Updated] (HIVE-4325) Merge HCat NOTICE file with Hive NOTICE file

2013-04-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4325:
-

Attachment: HIVE-4325.patch

 Merge HCat NOTICE file with Hive NOTICE file
 

 Key: HIVE-4325
 URL: https://issues.apache.org/jira/browse/HIVE-4325
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4325.patch


 There are a few items in the HCat NOTICE.txt that are not in Hive's NOTICE.  
 We need to merge these.



[jira] [Updated] (HIVE-4325) Merge HCat NOTICE file with Hive NOTICE file

2013-04-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4325:
-

Status: Patch Available  (was: Open)

 Merge HCat NOTICE file with Hive NOTICE file
 

 Key: HIVE-4325
 URL: https://issues.apache.org/jira/browse/HIVE-4325
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4325.patch


 There are a few items in the HCat NOTICE.txt that are not in Hive's NOTICE.  
 We need to merge these.



Re: Branch for HIVE-4160

2013-04-09 Thread Ashutosh Chauhan
Sounds good. I will create a branch soon.

Thanks,
Ashutosh


On Mon, Apr 8, 2013 at 7:31 PM, Namit Jain nj...@fb.com wrote:

 Sounds good to me


 On 4/9/13 12:04 AM, Jitendra Pandey jiten...@hortonworks.com wrote:

 I agree that we shouldn't wait too long before merging the branch.
 We are targeting to have basic queries working within a month from now and
 will definitely propose to merge the branch back into trunk at that point.
 We will limit the scope of the work on the branch to just a few operators
 and primitive datatypes. Does that sound reasonable?
 
 regards
 jitendra
 
 On Wed, Apr 3, 2013 at 9:03 PM, Namit Jain nj...@fb.com wrote:
 
  There is no right answer, but I feel if you go this path a long way, it
  will be very difficult
  to merge back. Given that this is not new functionality, but an
  improvement to existing code
  (which will also evolve), it will become difficult to maintain/review a
  big diff in the future.
 
  I haven't thought much about it, but you can start by creating the
  high-level interfaces first and then go from there. For example: create
  interfaces for operators which take in an array of rows instead of a
  single row; initially the array size can always be 1. Then proceed from
  there.
 
  What makes you think, merging a branch 6 months/1 year from now will be
  easier than working on the
  current branch ?
 
  Having said that, both approaches can be made to work - but I think you
  are just delaying the
  merging work instead of taking the hit upfront.
 
  Thanks,
  -namit
 
 
 
  On 4/4/13 2:40 AM, Jitendra Pandey jiten...@hortonworks.com wrote:
 
 We did consider implementing these changes on the trunk. But, it
 would
  take several patches in various parts of the code before a simple end
 to
  end query can be executed on vectorized path. For example a patch for
  vectorized expressions  will be a significant amount of code, but will
 not
  be used in a query until a vectorized operator is implemented and the
  query
  plan is modified to use the vectorized path. Vectorization of even basic
  expressions becomes non-trivial because we need to optimize for various
  cases such as chains of expressions and non-null or repeating columns,
  and also handle nullable columns, short-circuit optimization, etc.
  Careful handling of these is important for performance gains.
  
   Committing those intermediate patches in trunk  without stabilizing
 them
  in a branch first might be a cause of concern.
  
A separate branch will let us make incremental changes to the system
 so
  that each patch addresses a single feature or functionality and is
 small
  enough to review.
 We will make sure that the branch is frequently updated with the
  changes
  in the trunk to avoid conflicts at the time of the merge.
Also, we plan to propose merger of the branch as soon as a basic end
 to
  end query begins to work and is sufficiently tested, instead of waiting
  for
  all operators to get vectorized. Initially our target is to make select
  and
  filter operators work with vectorized expressions for primitive types.
  
 We will have a single global configuration flag that can be used to
  turn
  off the entire vectorization code path and we will specifically test to
  make sure that when this flag is off there is no regression on the
 current
  system. When vectorization is turned on, we will have a validation
 step to
  make sure the given query is supported on the vectorization path
 otherwise
  it will fall back to current code path.
  
  Although we intend to follow a commit-then-review policy on the branch
  for speed of development, each patch will have an associated jira and
  will be available for review and feedback.
  
  thanks
  jitendra
  
  On Tue, Apr 2, 2013 at 8:37 PM, Namit Jain nj...@fb.com wrote:
  
   It will be difficult to merge back the branch.
   Can you stage your changes incrementally ?
  
   I mean, start with making the operators vectorized - it can be a for
   loop to start with? I think it will be very difficult to merge it back if we
   diverge on this.
   I would recommend starting with simple interfaces for operators and
 then
   plugging them
   in slowly instead of a new branch, unless this approach is extremely
   difficult.
  
  
   Thanks,
   -namit
  
   On 4/3/13 1:52 AM, Jitendra Pandey jiten...@hortonworks.com
 wrote:
  
   Hi Folks,
    I want to propose the creation of a separate branch for
 HIVE-4160
   work. This is a significant amount of work, and support for very
 basic
   functionality will need big chunks of code. It will also take some
  time to
   stabilize and test. A separate dev branch will allow us to do this
 work
   incrementally and collaboratively. We have already uploaded a design
   document on the jira for comments/feedback.
   
   thanks
   jitendra
   
   
   --
   http://hortonworks.com/download/
  
  
  
  
  --
  http://hortonworks.com/download/
 
 
 
 
 
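The batched ("vectorized") expression evaluation discussed in this thread, with fast paths for no-null and repeating columns, can be sketched as follows. This is a hedged illustration of the design being debated; the `LongColumnVector` shape and method names are assumptions, not Hive's eventual implementation.

```java
public class VectorSketch {
    /** A column batch: one array of values plus null/repeat metadata. */
    public static class LongColumnVector {
        public long[] vector;
        public boolean[] isNull;
        public boolean noNulls = true;      // true => isNull[] can be ignored
        public boolean isRepeating = false; // true => vector[0] applies to all rows
        public LongColumnVector(int n) { vector = new long[n]; isNull = new boolean[n]; }
    }

    /** out[i] = a[i] + b[i] for i < n, propagating nulls. */
    public static void addLongColumns(LongColumnVector a, LongColumnVector b,
                                      LongColumnVector out, int n) {
        if (a.isRepeating && b.isRepeating && a.noNulls && b.noNulls) {
            // Both inputs repeat a single non-null value: one addition suffices.
            out.isRepeating = true;
            out.vector[0] = a.vector[0] + b.vector[0];
        } else if (a.noNulls && b.noNulls) {
            // Fast path: tight loop with no per-row null checks.
            for (int i = 0; i < n; i++) {
                out.vector[i] = (a.isRepeating ? a.vector[0] : a.vector[i])
                              + (b.isRepeating ? b.vector[0] : b.vector[i]);
            }
        } else {
            // Slow path: propagate nulls row by row.
            out.noNulls = false;
            for (int i = 0; i < n; i++) {
                boolean aNull = !a.noNulls && a.isNull[a.isRepeating ? 0 : i];
                boolean bNull = !b.noNulls && b.isNull[b.isRepeating ? 0 : i];
                out.isNull[i] = aNull || bNull;
                if (!out.isNull[i]) {
                    out.vector[i] = (a.isRepeating ? a.vector[0] : a.vector[i])
                                  + (b.isRepeating ? b.vector[0] : b.vector[i]);
                }
            }
        }
    }
}
```

This also illustrates Namit's suggestion: with a batch size of 1, the same interface degenerates to the current row-at-a-time path.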

[jira] [Updated] (HIVE-4326) Clean up remaining items in hive/hcatalog/historical/trunk

2013-04-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4326:
-

Attachment: HIVE-4326.patch

This patch cannot be applied and checked in as is, since it contains svn 
propedits and svn mvs.  But I'm posting it here so people can take a look.  If 
no one objects, I'll check it in.

 Clean up remaining items in hive/hcatalog/historical/trunk
 --

 Key: HIVE-4326
 URL: https://issues.apache.org/jira/browse/HIVE-4326
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4326.patch


 There are a few files remaining in HCat's historical trunk.  Most of them 
 need to be removed, a few need to be moved.



[jira] [Updated] (HIVE-4326) Clean up remaining items in hive/hcatalog/historical/trunk

2013-04-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4326:
-

Status: Patch Available  (was: Open)

Once this patch is applied, the only remaining file in 
hcatalog/historical/trunk is CHANGES.txt.  This should be moved to 
hive/hcatalog-historical/trunk/CHANGES.txt.  I think we should keep this file 
around for historical info.

 Clean up remaining items in hive/hcatalog/historical/trunk
 --

 Key: HIVE-4326
 URL: https://issues.apache.org/jira/browse/HIVE-4326
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4326.patch


 There are a few files remaining in HCat's historical trunk.  Most of them 
 need to be removed, a few need to be moved.



[jira] [Commented] (HIVE-4326) Clean up remaining items in hive/hcatalog/historical/trunk

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627438#comment-13627438
 ] 

Ashutosh Chauhan commented on HIVE-4326:


[~alangates] Mithun has some comments on HIVE-4265 on how the end state should 
look. I believe your proposal is in agreement with that. Also, can you 
resolve HIVE-4265 now that the remaining work is being done on this ticket?

 Clean up remaining items in hive/hcatalog/historical/trunk
 --

 Key: HIVE-4326
 URL: https://issues.apache.org/jira/browse/HIVE-4326
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4326.patch


 There are a few files remaining in HCat's historical trunk.  Most of them 
 need to be removed, a few need to be moved.



[jira] [Commented] (HIVE-4265) HCatalog branches need to move out of trunk/hcatalog/historical

2013-04-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627439#comment-13627439
 ] 

Alan Gates commented on HIVE-4265:
--

I've filed HIVE-4326 which covers all this.  Let me know if the changes look 
good.

 HCatalog branches need to move out of trunk/hcatalog/historical
 ---

 Key: HIVE-4265
 URL: https://issues.apache.org/jira/browse/HIVE-4265
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0


 When I moved the HCatalog code into Hive I put it all under trunk/hcatalog 
 (as we had discussed).  However this is causing checkouts and 'ant tar' both 
 to pick up the old hcatalog branches, which is not what we want.  We need to 
 move this branch code somewhere else.  Two options have been suggested:
 # Put it in branches, prefixing 'hcatalog-' to the branch name.  So 
 trunk/hcatalog/historical/branches/branch-0.5 would move to 
 branches/hcatalog-branch-0.5
 # Put it in a top level hcatalog-historical directory.
 Either is fine, but we need to do this soon to avoid plugging checkouts and 
 builds with dead code.



[jira] [Resolved] (HIVE-4265) HCatalog branches need to move out of trunk/hcatalog/historical

2013-04-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved HIVE-4265.
--

Resolution: Fixed

Resolving as Ashutosh has moved the branches and HIVE-4326 covers moving the 
remainders of hcatalog/historical/trunk.

 HCatalog branches need to move out of trunk/hcatalog/historical
 ---

 Key: HIVE-4265
 URL: https://issues.apache.org/jira/browse/HIVE-4265
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0


 When I moved the HCatalog code into Hive I put it all under trunk/hcatalog 
 (as we had discussed).  However this is causing checkouts and 'ant tar' both 
 to pick up the old hcatalog branches, which is not what we want.  We need to 
 move this branch code somewhere else.  Two options have been suggested:
 # Put it in branches, prefixing 'hcatalog-' to the branch name.  So 
 trunk/hcatalog/historical/branches/branch-0.5 would move to 
 branches/hcatalog-branch-0.5
 # Put it in a top level hcatalog-historical directory.
 Either is fine, but we need to do this soon to avoid plugging checkouts and 
 builds with dead code.



[jira] [Commented] (HIVE-4325) Merge HCat NOTICE file with Hive NOTICE file

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627446#comment-13627446
 ] 

Ashutosh Chauhan commented on HIVE-4325:


+1

 Merge HCat NOTICE file with Hive NOTICE file
 

 Key: HIVE-4325
 URL: https://issues.apache.org/jira/browse/HIVE-4325
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4325.patch


 There are a few items in the HCat NOTICE.txt that are not in Hive's NOTICE.  
 We need to merge these.



[jira] [Updated] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4310:
-

Attachment: hive.4310.2.patch-nohcat

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4310.1.patch, hive.4310.1.patch-nohcat, 
 hive.4310.2.patch-nohcat






[jira] [Assigned] (HIVE-4316) bug introduced by HIVE-3464

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-4316:


Assignee: Namit Jain

 bug introduced by HIVE-3464
 ---

 Key: HIVE-4316
 URL: https://issues.apache.org/jira/browse/HIVE-4316
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain

   // for outer joins, it should not exceed 16 aliases (short type)
   if (!node.getNoOuterJoin() || !target.getNoOuterJoin()) {
     if (node.getRightAliases().length +
         target.getRightAliases().length + 1 > 16) {
       LOG.info(ErrorMsg.JOINNODE_OUTERJOIN_MORETHAN_16);
       continue;
     }
   }
 It is checking getRightAliases() twice.



[jira] [Created] (HIVE-4327) NPE in constant folding with decimal

2013-04-09 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-4327:


 Summary: NPE in constant folding with decimal
 Key: HIVE-4327
 URL: https://issues.apache.org/jira/browse/HIVE-4327
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor


The query:

SELECT dec * cast('123456789012345678901234567890.1234567' as decimal) FROM 
DECIMAL_PRECISION LIMIT 1

fails with an NPE while constant folding. This only happens when the decimal is 
out of range of max precision.
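The failure mode described above can be sketched with plain BigDecimal: if an enforce-precision step returns null for a decimal that exceeds the maximum precision, a constant folder that uses the result without a null check throws a NullPointerException. `MAX_PRECISION` and the method names here are illustrative assumptions, not Hive's actual constants or API.

```java
import java.math.BigDecimal;

public class DecimalPrecisionSketch {
    static final int MAX_PRECISION = 38;  // illustrative limit

    /** Returns d if it fits within MAX_PRECISION digits, else null --
     *  mimicking an enforce-precision step that drops out-of-range decimals. */
    public static BigDecimal enforcePrecision(BigDecimal d) {
        return d.precision() <= MAX_PRECISION ? d : null;
    }

    /** A naive constant folder: multiplies two constants and enforces
     *  precision on the result, but never checks whether enforcement
     *  returned null -- so any follow-up call on the result can NPE. */
    public static BigDecimal foldMultiply(BigDecimal a, BigDecimal b) {
        return enforcePrecision(a.multiply(b));
    }
}
```

The 37-digit constant from the query fits on its own, but multiplying it by another wide decimal produces a ~74-digit product that gets dropped to null, and dereferencing that null during folding is the crash.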




[jira] [Updated] (HIVE-4327) NPE in constant folding with decimal

2013-04-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4327:
-

Attachment: HIVE-4327.1.q

 NPE in constant folding with decimal
 

 Key: HIVE-4327
 URL: https://issues.apache.org/jira/browse/HIVE-4327
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-4327.1.q


 The query:
 SELECT dec * cast('123456789012345678901234567890.1234567' as decimal) FROM 
 DECIMAL_PRECISION LIMIT 1
 fails with an NPE while constant folding. This only happens when the decimal 
 is out of range of max precision.



[jira] [Updated] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4310:
-

Attachment: hive.4310.3.patch-nohcat

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4310.1.patch, hive.4310.1.patch-nohcat, 
 hive.4310.2.patch-nohcat, hive.4310.3.patch-nohcat




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4327) NPE in constant folding with decimal

2013-04-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627470#comment-13627470
 ] 

Gunther Hagleitner commented on HIVE-4327:
--

Review: https://reviews.facebook.net/D10095

 NPE in constant folding with decimal
 

 Key: HIVE-4327
 URL: https://issues.apache.org/jira/browse/HIVE-4327
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-4327.1.q


 The query:
 SELECT dec * cast('123456789012345678901234567890.1234567' as decimal) FROM 
 DECIMAL_PRECISION LIMIT 1
 fails with an NPE while constant folding. This only happens when the decimal 
 is out of range of max precision.
