[jira] [Updated] (HIVE-4345) Pushing down query conditions to support on-the-fly filtering at file parsing

2013-04-12 Thread Yifeng Geng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifeng Geng updated HIVE-4345:
--

Attachment: hive-0.10.0.patch2

 Pushing down query conditions to support on-the-fly filtering at file parsing
 -

 Key: HIVE-4345
 URL: https://issues.apache.org/jira/browse/HIVE-4345
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yifeng Geng
  Labels: patch
 Fix For: 0.10.0


 Serialize predicate conditions in query plan to MapredWork class, so the 
 FileFormat class can use the conditions to do on-the-fly filtering on the 
 files. It can improve the performance a lot for processing certain binary 
 files.
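
The idea described above can be sketched roughly as follows. This is an illustration in Python, not the attached patch: the names `Predicate` and `read_records`, and the dictionary record layout, are hypothetical and do not correspond to Hive's MapredWork or FileFormat APIs. It only shows a reader applying pushed-down conditions while parsing, so non-matching records are never handed to later operators.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, float]

# Comparison operators a pushed-down condition may use in this sketch.
_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}


@dataclass
class Predicate:
    """Hypothetical stand-in for one condition carried in the query plan."""
    column: str
    op: str
    value: float

    def matches(self, record: Record) -> bool:
        return _OPS[self.op](record[self.column], self.value)


def read_records(raw: Iterable[Record], predicates: List[Predicate]) -> Iterator[Record]:
    """Yield only the records that satisfy every pushed-down predicate."""
    for record in raw:
        if all(p.matches(record) for p in predicates):
            yield record


rows = [
    {"lat": 10.0, "temp": 290.0},
    {"lat": 45.0, "temp": 275.0},
    {"lat": 60.0, "temp": 260.0},
]
pushed_down = [Predicate("lat", ">", 30.0), Predicate("temp", "<", 270.0)]
filtered = list(read_records(rows, pushed_down))
```

In a real file format the benefit would come from skipping the decoding of whole blocks or variables, which this sketch does not model.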

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4345) Pushing down query conditions to support on-the-fly filtering at file parsing

2013-04-12 Thread Yifeng Geng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifeng Geng updated HIVE-4345:
--

Attachment: (was: hive-0.10.0.patch2)

 Pushing down query conditions to support on-the-fly filtering at file parsing
 -

 Key: HIVE-4345
 URL: https://issues.apache.org/jira/browse/HIVE-4345
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yifeng Geng
  Labels: patch
 Fix For: 0.10.0


 Serialize predicate conditions in query plan to MapredWork class, so the 
 FileFormat class can use the conditions to do on-the-fly filtering on the 
 files. It can improve the performance a lot for processing certain binary 
 files.



[jira] [Updated] (HIVE-4345) Pushing down query conditions to support on-the-fly filtering at file parsing

2013-04-12 Thread Yifeng Geng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifeng Geng updated HIVE-4345:
--

Status: Patch Available  (was: Open)

 Pushing down query conditions to support on-the-fly filtering at file parsing
 -

 Key: HIVE-4345
 URL: https://issues.apache.org/jira/browse/HIVE-4345
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yifeng Geng
  Labels: patch
 Fix For: 0.10.0


 Serialize predicate conditions in query plan to MapredWork class, so the 
 FileFormat class can use the conditions to do on-the-fly filtering on the 
 files. It can improve the performance a lot for processing certain binary 
 files.



[jira] [Updated] (HIVE-4345) Pushing down query conditions to support on-the-fly filtering at file parsing

2013-04-12 Thread Yifeng Geng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifeng Geng updated HIVE-4345:
--

Attachment: HIVE-4345.patch

 Pushing down query conditions to support on-the-fly filtering at file parsing
 -

 Key: HIVE-4345
 URL: https://issues.apache.org/jira/browse/HIVE-4345
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yifeng Geng
  Labels: patch
 Fix For: 0.10.0

 Attachments: HIVE-4345.patch


 Serialize predicate conditions in query plan to MapredWork class, so the 
 FileFormat class can use the conditions to do on-the-fly filtering on the 
 files. It can improve the performance a lot for processing certain binary 
 files.



[jira] [Updated] (HIVE-4345) Pushing down query conditions to support on-the-fly filtering at file parsing

2013-04-12 Thread Yifeng Geng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifeng Geng updated HIVE-4345:
--

Description: Serialize predicate conditions in query plan to MapredWork 
class, so the FileFormat class can use the conditions to do on-the-fly 
filtering on the files. It can improve the performance a lot for processing 
certain binary files (NetCDF files, for example).  (was: Serialize predicate 
conditions in query plan to MapredWork class, so the FileFormat class can use 
the conditions to do on-the-fly filtering on the files. It can improve the 
performance a lot for processing certain binary files.)

 Pushing down query conditions to support on-the-fly filtering at file parsing
 -

 Key: HIVE-4345
 URL: https://issues.apache.org/jira/browse/HIVE-4345
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yifeng Geng
  Labels: patch
 Fix For: 0.10.0

 Attachments: HIVE-4345.patch


 Serialize predicate conditions in query plan to MapredWork class, so the 
 FileFormat class can use the conditions to do on-the-fly filtering on the 
 files. It can improve the performance a lot for processing certain binary 
 files (NetCDF files, for example).



[jira] [Updated] (HIVE-4345) Pushing down query conditions to support on-the-fly filtering at the file parsing

2013-04-12 Thread Yifeng Geng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifeng Geng updated HIVE-4345:
--

Summary: Pushing down query conditions to support on-the-fly filtering at 
the file parsing  (was: Pushing down query conditions to support on-the-fly 
filtering at file parsing)

 Pushing down query conditions to support on-the-fly filtering at the file 
 parsing
 -

 Key: HIVE-4345
 URL: https://issues.apache.org/jira/browse/HIVE-4345
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yifeng Geng
  Labels: patch
 Fix For: 0.10.0

 Attachments: HIVE-4345.patch


 Serialize predicate conditions in query plan to MapredWork class, so the 
 FileFormat class can use the conditions to do on-the-fly filtering on the 
 files. It can improve the performance a lot for processing certain binary 
 files (NetCDF files, for example).



[jira] [Updated] (HIVE-4294) Single sourced multi query cannot handle lateral view

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4294:
-

Attachment: hive.4294.3.patch

 Single sourced multi query cannot handle lateral view
 -

 Key: HIVE-4294
 URL: https://issues.apache.org/jira/browse/HIVE-4294
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: hive.4294.3.patch, HIVE-4294.D10161.1.patch, 
 HIVE-4294.D10161.2.patch


 For example,
 {noformat}
 hive> explain from src 
  select key, C lateral view explode(array(key, value)) A as C;
 FAILED: ParseException line 3:22 missing EOF at 'view' near 'lateral'
 {noformat}
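
For readers unfamiliar with the construct in the failing query, the following Python sketch models what `lateral view explode(array(key, value)) A as C` is meant to produce once the statement parses: each source row is paired with every element of its array. The function name `lateral_view_explode` and the sample rows are assumptions made for illustration, not Hive code.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def lateral_view_explode(
    rows: Iterable[Row],
    array_of: Callable[[Row], List[str]],
    alias: str = "C",
) -> Iterator[Row]:
    """Pair each input row with every element of its array, one output row per element."""
    for row in rows:
        for element in array_of(row):
            # The exploded element is exposed under the chosen column alias.
            yield {**row, alias: element}


src = [{"key": "1", "value": "10"}, {"key": "2", "value": "20"}]
exploded = lateral_view_explode(src, lambda r: [r["key"], r["value"]])
result = [(r["key"], r["C"]) for r in exploded]
```

A row whose array is empty contributes no output rows in this model.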



[jira] [Updated] (HIVE-4280) TestRetryingHMSHandler is failing on trunk.

2013-04-12 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4280:
-

Attachment: HIVE-4280-2.patch.txt
HIVE-4280-1.patch.txt

I uploaded two patches. HIVE-4280-2.patch.txt implements [~ashutoshc]'s 
suggestion. HIVE-4280-1.patch.txt also changes the other database names. Both 
pass the tests.

 TestRetryingHMSHandler is failing on trunk.
 ---

 Key: HIVE-4280
 URL: https://issues.apache.org/jira/browse/HIVE-4280
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Teddy Choi
 Attachments: HIVE-4280-1.patch.txt, HIVE-4280-2.patch.txt


 Newly added testcase TestRetryingHMSHandler fails on trunk. 
 https://builds.apache.org/job/Hive-trunk-h0.21/2040/



[jira] [Updated] (HIVE-4280) TestRetryingHMSHandler is failing on trunk.

2013-04-12 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4280:
-

Fix Version/s: 0.11.0
   Status: Patch Available  (was: Open)

 TestRetryingHMSHandler is failing on trunk.
 ---

 Key: HIVE-4280
 URL: https://issues.apache.org/jira/browse/HIVE-4280
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Teddy Choi
 Fix For: 0.11.0

 Attachments: HIVE-4280-1.patch.txt, HIVE-4280-2.patch.txt


 Newly added testcase TestRetryingHMSHandler fails on trunk. 
 https://builds.apache.org/job/Hive-trunk-h0.21/2040/



[jira] [Commented] (HIVE-4241) optimize hive.enforce.sorting and hive.enforce bucketing join

2013-04-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629901#comment-13629901
 ] 

Hudson commented on HIVE-4241:
--

Integrated in Hive-trunk-h0.21 #2058 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2058/])
HIVE-4241 optimize hive.enforce.sorting and hive.enforce bucketing join
(Namit Jain via Gang Tim Liu) (Revision 1467174)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1467174
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_1.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_2.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_3.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_4.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_5.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_6.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_7.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_8.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_8.q.out


 optimize hive.enforce.sorting and hive.enforce bucketing join
 -

 Key: HIVE-4241
 URL: https://issues.apache.org/jira/browse/HIVE-4241
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.11.0

 Attachments: hive.4241.1.patch, hive.4241.1.patch-nohcat, 
 hive.4241.2.patch-nohcat, hive.4241.3.patch, hive.4241.4.patch


 Consider the following scenario:
 T1: sorted and bucketed by key into 2 buckets
 T2: sorted and bucketed by key into 2 buckets
 T3: sorted and bucketed by key into 2 buckets
 set hive.enforce.sorting=true;
 set hive.enforce.bucketing=true;
 insert overwrite table T3
 select .. from T1 join T2 on T1.key = T2.key;
 Since T1, T2 and T3 are sorted/bucketed by the join key, and the above join is
 being performed as a sort-merge join, T3 should be bucketed/sorted without
 the need for an extra reducer.
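
The reasoning in the scenario above can be illustrated with a small Python sketch. It is not Hive code: `sort_merge_join` is a hypothetical helper, and the sample data stands for bucket 0 of T1 and T2 under an assumed `key % 2` bucketing. The point it shows is that a merge join over inputs sorted by key emits rows that are already sorted by key, so the output for a bucket could be written to the matching bucket of T3 without a further sort.

```python
from typing import List, Tuple

Pair = Tuple[int, str]


def sort_merge_join(left: List[Pair], right: List[Pair]) -> List[Tuple[int, str, str]]:
    """Inner-join two lists of (key, value) pairs that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Locate the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row of this key with every right row of this key.
            while i < len(left) and left[i][0] == left_key:
                for _, right_value in right[j:j_end]:
                    out.append((left_key, left[i][1], right_value))
                i += 1
            j = j_end
    return out


# Bucket 0 of each table: even keys only, sorted by key (assumed layout).
t1_bucket0 = [(0, "a"), (2, "b"), (2, "c"), (4, "d")]
t2_bucket0 = [(2, "x"), (4, "y"), (6, "z")]
joined = sort_merge_join(t1_bucket0, t2_bucket0)
joined_keys = [row[0] for row in joined]
```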



[jira] [Commented] (HIVE-4241) optimize hive.enforce.sorting and hive.enforce bucketing join

2013-04-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629900#comment-13629900
 ] 

Hudson commented on HIVE-4241:
--

Integrated in Hive-trunk-hadoop2 #153 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/153/])
HIVE-4241 optimize hive.enforce.sorting and hive.enforce bucketing join
(Namit Jain via Gang Tim Liu) (Revision 1467174)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1467174
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_1.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_2.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_3.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_4.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_5.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_6.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_7.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketsortoptimize_insert_8.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketsortoptimize_insert_8.q.out


 optimize hive.enforce.sorting and hive.enforce bucketing join
 -

 Key: HIVE-4241
 URL: https://issues.apache.org/jira/browse/HIVE-4241
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.11.0

 Attachments: hive.4241.1.patch, hive.4241.1.patch-nohcat, 
 hive.4241.2.patch-nohcat, hive.4241.3.patch, hive.4241.4.patch


 Consider the following scenario:
 T1: sorted and bucketed by key into 2 buckets
 T2: sorted and bucketed by key into 2 buckets
 T3: sorted and bucketed by key into 2 buckets
 set hive.enforce.sorting=true;
 set hive.enforce.bucketing=true;
 insert overwrite table T3
 select .. from T1 join T2 on T1.key = T2.key;
 Since T1, T2 and T3 are sorted/bucketed by the join key, and the above join is
 being performed as a sort-merge join, T3 should be bucketed/sorted without
 the need for an extra reducer.



[jira] [Updated] (HIVE-4294) Single sourced multi query cannot handle lateral view

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4294:
-

Attachment: hive.4294.4.patch

 Single sourced multi query cannot handle lateral view
 -

 Key: HIVE-4294
 URL: https://issues.apache.org/jira/browse/HIVE-4294
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: hive.4294.3.patch, hive.4294.4.patch, 
 HIVE-4294.D10161.1.patch, HIVE-4294.D10161.2.patch


 For example,
 {noformat}
 hive> explain from src 
  select key, C lateral view explode(array(key, value)) A as C;
 FAILED: ParseException line 3:22 missing EOF at 'view' near 'lateral'
 {noformat}



[jira] [Updated] (HIVE-4294) Single sourced multi query cannot handle lateral view

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4294:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Navis

 Single sourced multi query cannot handle lateral view
 -

 Key: HIVE-4294
 URL: https://issues.apache.org/jira/browse/HIVE-4294
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.11.0

 Attachments: hive.4294.3.patch, hive.4294.4.patch, 
 HIVE-4294.D10161.1.patch, HIVE-4294.D10161.2.patch


 For example,
 {noformat}
 hive> explain from src 
  select key, C lateral view explode(array(key, value)) A as C;
 FAILED: ParseException line 3:22 missing EOF at 'view' near 'lateral'
 {noformat}



[jira] [Updated] (HIVE-3891) physical optimizer changes for auto sort-merge join

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3891:
-

Attachment: hive.3891.10.patch

 physical optimizer changes for auto sort-merge join
 ---

 Key: HIVE-3891
 URL: https://issues.apache.org/jira/browse/HIVE-3891
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: auto_sortmerge_join_1.q, auto_sortmerge_join_1.q.out, 
 hive.3891.10.patch, hive.3891.1.patch, hive.3891.2.patch, hive.3891.3.patch, 
 hive.3891.4.patch, hive.3891.5.patch, hive.3891.6.patch, hive.3891.7.patch, 
 HIVE-3891_8.patch, hive.3891.9.patch






[jira] [Commented] (HIVE-4294) Single sourced multi query cannot handle lateral view

2013-04-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629933#comment-13629933
 ] 

Hudson commented on HIVE-4294:
--

Integrated in Hive-trunk-h0.21 #2059 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2059/])
HIVE-4294 Single sourced multi query cannot handle lateral view
(Navis via namit) (Revision 1467196)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1467196
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientpositive/multi_insert_lateral_view.q
* /hive/trunk/ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out


 Single sourced multi query cannot handle lateral view
 -

 Key: HIVE-4294
 URL: https://issues.apache.org/jira/browse/HIVE-4294
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.11.0

 Attachments: hive.4294.3.patch, hive.4294.4.patch, 
 HIVE-4294.D10161.1.patch, HIVE-4294.D10161.2.patch


 For example,
 {noformat}
 hive> explain from src 
  select key, C lateral view explode(array(key, value)) A as C;
 FAILED: ParseException line 3:22 missing EOF at 'view' near 'lateral'
 {noformat}



[jira] [Commented] (HIVE-4294) Single sourced multi query cannot handle lateral view

2013-04-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629936#comment-13629936
 ] 

Hudson commented on HIVE-4294:
--

Integrated in Hive-trunk-hadoop2 #154 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/154/])
HIVE-4294 Single sourced multi query cannot handle lateral view
(Navis via namit) (Revision 1467196)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1467196
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientpositive/multi_insert_lateral_view.q
* /hive/trunk/ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out


 Single sourced multi query cannot handle lateral view
 -

 Key: HIVE-4294
 URL: https://issues.apache.org/jira/browse/HIVE-4294
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.11.0

 Attachments: hive.4294.3.patch, hive.4294.4.patch, 
 HIVE-4294.D10161.1.patch, HIVE-4294.D10161.2.patch


 For example,
 {noformat}
 hive> explain from src 
  select key, C lateral view explode(array(key, value)) A as C;
 FAILED: ParseException line 3:22 missing EOF at 'view' near 'lateral'
 {noformat}



Hive-trunk-hadoop2 - Build # 154 - Still Failing

2013-04-12 Thread Apache Jenkins Server
Changes for Build #138
[namit] HIVE-4289 HCatalog build fails when behind a firewall
(Samuel Yuan via namit)

[namit] HIVE-4281 add hive.map.groupby.sorted.testmode
(Namit via Gang Tim Liu)

[hashutosh] Moving hcatalog site outside of trunk

[hashutosh] Moving hcatalog branches outside of trunk

[hashutosh] HIVE-4259 : SEL operator created with missing columnExprMap for 
unions (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4156 : need to add protobuf classes to hive-exec.jar (Owen 
Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-3464 : Merging join tree may reorder joins which could be 
invalid (Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4138 : ORC's union object inspector returns a type name that 
isn't parseable by TypeInfoUtils (Owen Omalley via Ashutosh Chauhan)

[cws] HIVE-4119. ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with 
NPE if the table is empty (Shreepadma Venugopalan via cws)

[hashutosh] HIVE-4252 : hiveserver2 string representation of complex types are 
inconsistent with cli (Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4179 : NonBlockingOpDeDup does not merge SEL operators 
correctly (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4269 : fix handling of binary type in hiveserver2, jdbc driver 
(Thejas Nair via Ashutosh Chauhan)

[namit] HIVE-4174 Round UDF converts BigInts to double
(Chen Chun via namit)

[namit] HIVE-4240 optimize hive.enforce.bucketing and hive.enforce sorting 
insert
(Gang Tim Liu via namit)

[navis] HIVE-4288 Add IntelliJ project files to .gitignore (Roshan Naik 
via Navis)

[namit] HIVE-4272 partition wise metadata does not work for text files

[hashutosh] HIVE-896 : Add LEAD/LAG/FIRST/LAST analytical windowing functions 
to Hive. (Harish Butani via Ashutosh Chauhan)

[namit] HIVE-4260 union_remove_12, union_remove_13 are failing on hadoop2
(Gunther Hagleitner via namit)

[hashutosh] HIVE-3951 : Allow Decimal type columns in Regex Serde (Mark Grover 
via Ashutosh Chauhan)

[namit] HIVE-4270 bug in hive.map.groupby.sorted in the presence of multiple 
input partitions
(Namit via Gang Tim Liu)

[hashutosh] HIVE-3850 : hour() function returns 12 hour clock value when using 
timestamp datatype (Anandha and Franklin via Ashutosh Chauhan)

[hashutosh] HIVE-4122 : Queries fail if timestamp data not in expected format 
(Prasad Mujumdar via Ashutosh Chauhan)

[hashutosh] HIVE-4170 : [REGRESSION] FsShell.close closes filesystem, removing 
temporary directories (Navis via Ashutosh Chauhan)

[gates] HIVE-4264 Moved hcatalog trunk code up to hive/trunk/hcatalog

[hashutosh] HIVE-4263 : Adjust build.xml package command to move all hcat jars 
and binaries into build (Alan Gates via Ashutosh Chauhan)

[namit] HIVE-4258 Log logical plan tree for debugging
(Navis via namit)

[navis] HIVE-2264 Hive server is SHUTTING DOWN when invalid queries being 
executed

[kevinwilfong] HIVE-4235. CREATE TABLE IF NOT EXISTS uses inefficient way to 
check if table exists. (Gang Tim Liu via kevinwilfong)

[gangtimliu] HIVE-4157: ORC runs out of heap when writing (Kevin Wilfong via 
Gang Tim Liu)

[gangtimliu] HIVE-4155: Expose ORC's FileDump as a service

[gangtimliu] HIVE-4159: RetryingHMSHandler doesn't retry in enough cases (Kevin 
Wilfong via Gang Tim Liu)

[namit] HIVE-4149 wrong results big outer joins with array of ints
(Navis via namit)

[namit] HIVE-3958 support partial scan for analyze command - RCFile
(Gang Tim Liu via namit)

[gates] Removing old branches to limit size of Hive downloads.

[gates] Removing tags directory as we no longer need them and they're in the 
history.

[gates] Moving HCatalog into Hive.

[gates] Test that perms work for hcatalog

[hashutosh] HIVE-4007 : Create abstract classes for serializer and deserializer 
(Namit Jain via Ashutosh Chauhan)

[hashutosh] HIVE-3381 : Result of outer join is not valid (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-3980 : Cleanup after 3403 (Namit Jain via Ashutosh Chauhan)

[hashutosh] HIVE-4042 : ignore mapjoin hint (Namit Jain via Ashutosh Chauhan)

[namit] HIVE-3348 semi-colon in comments in .q file does not work
(Nick Collins via namit)

[namit] HIVE-4212 sort merge join should work for outer joins for more than 8 
inputs
(Namit via Gang Tim Liu)

[namit] HIVE-4219 explain dependency does not capture the input table
(Namit via Gang Tim Liu)

[kevinwilfong] HIVE-4092. Store complete names of tables in column access 
analyzer (Samuel Yuan via kevinwilfong)

[namit] HIVE-4208 Clientpositive test parenthesis_star_by is non-deterministic
(Mark Grover via namit)

[cws] HIVE-4217. Fix show_create_table_*.q test failures (Carl Steinbach via 
cws)

[namit] HIVE-4206 Sort merge join does not work for outer joins for 7 inputs
(Namit via Gang Tim Liu)

[kevinwilfong] HIVE-4188. TestJdbcDriver2.testDescribeTable failing 
consistently. (Prasad Mujumdar via kevinwilfong)

[hashutosh] HIVE-3820 Consider creating a literal like D or BD for representing 
Decimal type constants (Gunther Hagleitner 

[jira] [Updated] (HIVE-3891) physical optimizer changes for auto sort-merge join

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3891:
-

Attachment: hive.3891.11.patch

 physical optimizer changes for auto sort-merge join
 ---

 Key: HIVE-3891
 URL: https://issues.apache.org/jira/browse/HIVE-3891
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: auto_sortmerge_join_1.q, auto_sortmerge_join_1.q.out, 
 hive.3891.10.patch, hive.3891.11.patch, hive.3891.1.patch, hive.3891.2.patch, 
 hive.3891.3.patch, hive.3891.4.patch, hive.3891.5.patch, hive.3891.6.patch, 
 hive.3891.7.patch, HIVE-3891_8.patch, hive.3891.9.patch






[jira] [Created] (HIVE-4346) when writing data into filesystem from queries, the output files could contain a line of column names

2013-04-12 Thread caofangkun (JIRA)
caofangkun created HIVE-4346:


 Summary: when writing data into filesystem from queries, the 
output files could contain a line of column names 
 Key: HIVE-4346
 URL: https://issues.apache.org/jira/browse/HIVE-4346
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: caofangkun
Priority: Minor


For example :
hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.contain.columnnames=true;
hive> insert overwrite local directory './test1' select * from src ;
hive> !cat './test1/00_0';
key^Avalue
1^A10
2^A20
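
The requested behaviour can be modelled with a short Python sketch. It is only an illustration under assumptions: `write_rows` is a hypothetical helper, the `include_header` flag stands in for the proposed `hive.output.contain.columnnames` setting, and the Ctrl-A byte (`\x01`) is the field delimiter shown as `^A` in the example output.

```python
import io
from typing import Iterable, Sequence, TextIO

FIELD_DELIMITER = "\x01"  # Ctrl-A, rendered as ^A in the example above


def write_rows(
    out: TextIO,
    column_names: Sequence[str],
    rows: Iterable[Sequence[object]],
    include_header: bool,
) -> None:
    """Write delimited rows, optionally preceded by one line of column names."""
    if include_header:
        out.write(FIELD_DELIMITER.join(column_names) + "\n")
    for row in rows:
        out.write(FIELD_DELIMITER.join(str(value) for value in row) + "\n")


buf = io.StringIO()
write_rows(buf, ["key", "value"], [(1, 10), (2, 20)], include_header=True)
output = buf.getvalue()
```

With `include_header=False` the same call writes only the two data lines, which matches the current behaviour the issue wants to make optional.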



[jira] [Assigned] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-4167:


Assignee: Namit Jain  (was: Vikram Dixit K)

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: HIVE-4167.patch


 If tables are bucketed but not sorted, Hive still generates an SMB join 
 operator. This causes queries to lose rows.
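
Why an unsorted bucket loses rows under a sort-merge join can be seen in a small Python sketch. This is not Hive's SMB join operator: `merge_join_keys` is a hypothetical single-pass merge over key lists, and the sample keys are made up. A merge join only ever advances forward, so any key that appears out of order on one side is skipped.

```python
from typing import List


def merge_join_keys(left: List[int], right: List[int]) -> List[int]:
    """Single-pass merge join on key lists; correct only if both inputs are sorted."""
    matches = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            matches.append(left[i])
            i += 1
            j += 1
    return matches


small = [3, 1, 2]  # bucketed but not sorted
big = [1, 2, 3]    # sorted

unsorted_result = merge_join_keys(small, big)
sorted_result = merge_join_keys(sorted(small), big)
```

Here the unsorted input matches only key 3, because the merge has already moved past keys 1 and 2 on the sorted side by the time it reaches them on the unsorted side.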



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629948#comment-13629948
 ] 

Namit Jain commented on HIVE-4167:
--

I was able to reproduce it:

CREATE TABLE bucket_small (key string, value string) partitioned by (ds string) 
CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
load data local inpath '../data/files/smallsrcsortbucket1outof4.txt' INTO TABLE 
bucket_small partition(ds='2008-04-08');
load data local inpath '../data/files/smallsrcsortbucket2outof4.txt' INTO TABLE 
bucket_small partition(ds='2008-04-08');

CREATE TABLE bucket_big (key string, value string) partitioned by (ds string) 
CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
load data local inpath '../data/files/srcsortbucket1outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket2outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket3outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket4outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-08');

load data local inpath '../data/files/srcsortbucket1outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket2outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket3outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket4outof4.txt' INTO TABLE 
bucket_big partition(ds='2008-04-09');

set hive.auto.convert.join=true;
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

-- Since size is being used to find the big table, the order of the tables in
-- the join does not matter
explain extended select count(*) FROM bucket_small a JOIN bucket_big b ON a.key 
= b.key;
select count(*) FROM bucket_small a JOIN bucket_big b ON a.key = b.key;

explain extended select count(*) FROM bucket_big a JOIN bucket_small b ON a.key 
= b.key;
select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key;
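
Why an SMB join on unsorted buckets loses rows can be shown with a small Java sketch. It is not Hive's SMB operator; the class, method names and sample keys are made up for illustration. A merge join advances through both inputs in a single forward pass, so it is only correct when both sides are sorted on the join key.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a textbook merge join, not Hive's implementation.
final class MergeJoinSketch {
    // Counts joined pairs in one forward pass; assumes both inputs are sorted.
    static int mergeJoinCount(List<String> left, List<String> right) {
        int i = 0, j = 0, count = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).compareTo(right.get(j));
            if (cmp < 0) {
                i++;
            } else if (cmp > 0) {
                j++;
            } else {
                // Consume the run of equal keys on both sides.
                String key = left.get(i);
                int li = i, rj = j;
                while (li < left.size() && left.get(li).equals(key)) li++;
                while (rj < right.size() && right.get(rj).equals(key)) rj++;
                count += (li - i) * (rj - j);
                i = li;
                j = rj;
            }
        }
        return count;
    }

    // Reference answer that does not depend on input order.
    static int nestedLoopCount(List<String> left, List<String> right) {
        int count = 0;
        for (String l : left) {
            for (String r : right) {
                if (l.equals(r)) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("a", "b", "c");
        List<String> unsorted = Arrays.asList("c", "a", "b");
        System.out.println(mergeJoinCount(sorted, sorted));
        System.out.println(mergeJoinCount(unsorted, sorted));
        System.out.println(nestedLoopCount(unsorted, sorted));
    }
}
```

With the sample keys, the merge join finds 3 matches when both sides are sorted but only 1 when the left side is unsorted, while the order-independent count stays at 3. That is the kind of row loss described in this issue.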

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629956#comment-13629956
 ] 

Vikram Dixit K commented on HIVE-4167:
--

Hi [~namit]

I have a rebased patch on trunk. I was trying to produce a test using the 
tables available in the unit tests. Can I use the test you have provided in 
this jira?

Thanks
Vikram.

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629957#comment-13629957
 ] 

Gunther Hagleitner commented on HIVE-4318:
--

[~kevinwilfong]: Here are the additional numbers. 

Summary: You were right about counters having a significant effect despite the 
flag, but OperatorHooks are definitely expensive too.

All tests were run on EC2, single node setup. I used ~3m rows, single table, 
stored in rc file. Query was count(*) with a simple, not very selective where 
clause. I ran each build 5 times and averaged the last 3 runs. 
There was little difference between the runs. hive.task.progress was off in all 
runs, and no actual operator hooks were installed.

I've also tested both removing counters and a fixed version of counters. The 
fixed version places the check for the flag at the right place to avoid 
unnecessary calls to System.currentTimeMillis(), as well as unnecessary 
counting of the rows, etc.
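
The idea behind the counter fix can be sketched as follows. This is a hedged illustration, not the patch itself: the class, its fields and the clockReads bookkeeping are hypothetical and exist only to show that, when the flag is checked first, the disabled path performs no clock reads and no row counting.

```java
// Illustrative only: checks the "counters enabled" flag before doing any
// timing or counting work, instead of after.
final class CounterGuardSketch {
    private final boolean countersEnabled;
    private long rows;
    private long elapsedMillis;
    private long clockReads; // demo bookkeeping: how often the clock was read

    CounterGuardSketch(boolean countersEnabled) {
        this.countersEnabled = countersEnabled;
    }

    private long now() {
        clockReads++;
        return System.currentTimeMillis();
    }

    void process(Runnable work) {
        if (!countersEnabled) {
            work.run(); // fast path: no System.currentTimeMillis(), no counting
            return;
        }
        long start = now();
        work.run();
        elapsedMillis += now() - start;
        rows++;
    }

    long rows() { return rows; }
    long clockReads() { return clockReads; }
    long elapsedMillis() { return elapsedMillis; }
}
```

The design point is simply that the check sits at the top of the per-row method, so the cost of a disabled feature is one branch per row.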

Numbers:

{noformat}
Current trunk: 44.5 seconds
Fix for counters, unchanged operator hooks: 33.5 seconds (Kevin, that's the run 
you asked for)
Fix for counters, removal of operator hooks: 29.3 seconds
Removal of both operator hooks and counters completely: 27.9 seconds
{noformat}
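
For readers who prefer relative numbers, the timings above can be converted into percentage reductions against current trunk. The helper below is a made-up convenience, and the inputs are exactly the four figures quoted above.

```java
// Illustrative arithmetic on the timings quoted above.
final class TimingSketch {
    // Percentage reduction in wall-clock time relative to the baseline.
    static double percentFaster(double baselineSeconds, double seconds) {
        return 100.0 * (baselineSeconds - seconds) / baselineSeconds;
    }

    public static void main(String[] args) {
        double trunk = 44.5;
        System.out.printf("counter fix only:         %.1f%%%n", percentFaster(trunk, 33.5));
        System.out.printf("counter fix, no hooks:    %.1f%%%n", percentFaster(trunk, 29.3));
        System.out.printf("no hooks and no counters: %.1f%%%n", percentFaster(trunk, 27.9));
    }
}
```

That works out to reductions of roughly 25%, 34% and 37% respectively.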

Proposal:

- Remove operator hooks and backport to 0.11 branch. That's a regression that 
was introduced between 0.10 and 0.11, I believe.
- Remove profiler for now and backport to 0.11 branch. Profiler doesn't work 
without operator hooks right now. I'll open a jira to re-introduce profiler in 
a way that doesn't add any code to the inner loop (maybe hidden behind static 
final var that is false, so compiler removes it).
- Counters: Change this patch to include my fix for counters and backport to 
0.11. This gives us a significant boost, but isn't a regression from the last 
version. I'll open a jira to dig deeper and see if we can get even closer to 
the result with the counters completely removed.
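
The "static final var that is false" idea mentioned for the profiler might look like the sketch below. All names are hypothetical. Because PROFILING is a compile-time constant, javac treats the guarded block as conditional compilation and typically emits no bytecode for it, so the inner loop carries no extra work.

```java
// Illustrative only: hides profiling code behind a compile-time constant.
final class ProfilerGuardSketch {
    // Flip to true and recompile to enable the profiling code path.
    static final boolean PROFILING = false;

    static long hookCalls = 0;

    static long processAll(long[] rows) {
        long sum = 0;
        for (long row : rows) {
            if (PROFILING) {
                hookCalls++; // dead code while PROFILING is false
            }
            sum += row;
        }
        return sum;
    }
}
```

The trade-off is that enabling the profiler requires a rebuild rather than a configuration change.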

How does that sound?

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch


 Operator Hooks inserted into Operator.java cause a performance hit even when 
 it is not being used.
 For a count(1) query tested with and without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row - the more operators there are, the slower the query.
 The modification made to test this was 
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
        return;
      }
      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
 -    preProcessCounter();
 -    enterOperatorHooks(opHookContext);
 +    //preProcessCounter();
 +    //enterOperatorHooks(opHookContext);
      processOp(row, tag);
 -    exitOperatorHooks(opHookContext);
 -    postProcessCounter();
 +    //exitOperatorHooks(opHookContext);
 +    //postProcessCounter();
    }
 {code}



[jira] [Updated] (HIVE-3996) Correctly enforce the memory limit on the multi-table map-join

2013-04-12 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-3996:
-

Status: Patch Available  (was: Open)

 Correctly enforce the memory limit on the multi-table map-join
 --

 Key: HIVE-3996
 URL: https://issues.apache.org/jira/browse/HIVE-3996
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-3996_2.patch, HIVE-3996_3.patch, HIVE-3996_4.patch, 
 HIVE-3996_5.patch, HIVE-3996_6.patch, HIVE-3996_7.patch, HIVE-3996_8.patch, 
 HIVE-3996_9.patch, HIVE-3996.patch


 Currently with HIVE-3784, the joins are converted to map-joins based on 
 checks of the table size against the config variable: 
 hive.auto.convert.join.noconditionaltask.size. 
 However, the current implementation will also merge multiple mapjoin 
 operators into a single task regardless of whether the sum of the table sizes 
 will exceed the configured value.
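
The check the description asks for can be sketched as below. This is not the logic of the attached patches; the class, method and byte values are hypothetical, and thresholdBytes stands in for hive.auto.convert.join.noconditionaltask.size. The point is that each small table fitting on its own is not enough: the running total must also stay within the limit before several map-joins are merged into one task.

```java
import java.util.List;

// Illustrative only: per-table and combined size checks against one limit.
final class MapJoinSizeCheckSketch {
    static boolean canMergeIntoOneTask(List<Long> smallTableSizes, long thresholdBytes) {
        long total = 0;
        for (long size : smallTableSizes) {
            if (size > thresholdBytes) {
                return false; // this table alone is too large
            }
            total += size;
            if (total > thresholdBytes) {
                return false; // tables fit individually but not together
            }
        }
        return true;
    }
}
```

For example, with a 10,000,000-byte limit, two tables of 6,000,000 bytes each pass the individual check but fail the combined one.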



[jira] [Commented] (HIVE-446) Implement TRUNCATE

2013-04-12 Thread caofangkun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629960#comment-13629960
 ] 

caofangkun commented on HIVE-446:
-

Hi all,
Would it be worthwhile to extend the syntax, for example
TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;
to allow removing data from an EXTERNAL table?

 Implement TRUNCATE
 --

 Key: HIVE-446
 URL: https://issues.apache.org/jira/browse/HIVE-446
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Prasad Chakka
Assignee: Navis
 Fix For: 0.11.0

 Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, 
 HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch


 truncate the data but leave the table and metadata intact.



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629965#comment-13629965
 ] 

Vikram Dixit K commented on HIVE-4167:
--

Hi Namit,

I have been able to reproduce this issue on my setup. However, I wasn't
sure how to reproduce it using the tables in the unit tests. I can
provide an updated patch with your test right away. I am still actively
working on this issue.

Thanks
Vikram.

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Updated] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4167:
-

Attachment: hive.4167.1.patch

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: hive.4167.1.patch, HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629966#comment-13629966
 ] 

Namit Jain commented on HIVE-4167:
--

I have a fix and a test case for this.
Can you take a look?



 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: hive.4167.1.patch, HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629967#comment-13629967
 ] 

Namit Jain commented on HIVE-4167:
--

https://reviews.facebook.net/D10209

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: hive.4167.1.patch, HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-3996) Correctly enforce the memory limit on the multi-table map-join

2013-04-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629969#comment-13629969
 ] 

Namit Jain commented on HIVE-3996:
--

+1

Running tests 

 Correctly enforce the memory limit on the multi-table map-join
 --

 Key: HIVE-3996
 URL: https://issues.apache.org/jira/browse/HIVE-3996
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-3996_2.patch, HIVE-3996_3.patch, HIVE-3996_4.patch, 
 HIVE-3996_5.patch, HIVE-3996_6.patch, HIVE-3996_7.patch, HIVE-3996_8.patch, 
 HIVE-3996_9.patch, HIVE-3996.patch


 Currently with HIVE-3784, the joins are converted to map-joins based on 
 checks of the table size against the config variable: 
 hive.auto.convert.join.noconditionaltask.size. 
 However, the current implementation will also merge multiple mapjoin 
 operators into a single task regardless of whether the sum of the table sizes 
 will exceed the configured value.



[jira] [Commented] (HIVE-4106) SMB joins fail in multi-way joins

2013-04-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629979#comment-13629979
 ] 

Namit Jain commented on HIVE-4106:
--

[~vikram.dixit], can you load the failing query ?

 SMB joins fail in multi-way joins
 -

 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Blocker
 Attachments: auto_sortmerge_join_12.q, HIVE-4106.patch


 I see array out of bounds exception in case of multi way smb joins. This is 
 related to changes that went in as part of HIVE-3403. This issue has been 
 discussed in HIVE-3891.



[jira] [Commented] (HIVE-4106) SMB joins fail in multi-way joins

2013-04-12 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629994#comment-13629994
 ] 

Vikram Dixit K commented on HIVE-4106:
--

[~namit]
{noformat}
select i_item_id
   ,i_item_desc
   ,i_current_price
 from item i
  join inventory inv on (inv.inv_item_sk = i.i_item_sk)
  join date_dim d on (d.d_date_sk = inv.inv_date_sk)
  join store_sales ss on (ss.ss_item_sk = i.i_item_sk)
 where i_current_price between 62 and 62+30.0
 and d_date between '2000-05-25' and '2000-07-27'
 and i_manufact_id in (129,270,821,423)
 and inv_quantity_on_hand between 100 and 500
 group by i_item_id,i_item_desc,i_current_price
 order by i_item_id
 limit 100;
{noformat}

This is the TPC-DS benchmark query (scale 1 but this does not matter) where 
store_sales and inventory are sorted as follows:

store_sales: sorted by ss_item_sk, partitioned on ss_sold_date, clustered by 
ss_item_sk
inventory: partitioned by (inv_date string) clustered by (inv_item_sk) sorted 
by (inv_item_sk)
item: non-partitioned, bucketed clustered by (i_item_sk) sorted by (i_item_sk)
date_dim: non-partitioned, non-bucketed.

 SMB joins fail in multi-way joins
 -

 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Blocker
 Attachments: auto_sortmerge_join_12.q, HIVE-4106.patch


 I see array out of bounds exception in case of multi way smb joins. This is 
 related to changes that went in as part of HIVE-3403. This issue has been 
 discussed in HIVE-3891.



[jira] [Updated] (HIVE-3996) Correctly enforce the memory limit on the multi-table map-join

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3996:
-

Attachment: hive.3996.9.patch-nohcat

 Correctly enforce the memory limit on the multi-table map-join
 --

 Key: HIVE-3996
 URL: https://issues.apache.org/jira/browse/HIVE-3996
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-3996_2.patch, HIVE-3996_3.patch, HIVE-3996_4.patch, 
 HIVE-3996_5.patch, HIVE-3996_6.patch, HIVE-3996_7.patch, HIVE-3996_8.patch, 
 HIVE-3996_9.patch, hive.3996.9.patch-nohcat, HIVE-3996.patch


 Currently with HIVE-3784, the joins are converted to map-joins based on 
 checks of the table size against the config variable: 
 hive.auto.convert.join.noconditionaltask.size. 
 However, the current implementation will also merge multiple mapjoin 
 operators into a single task regardless of whether the sum of the table sizes 
 will exceed the configured value.



[jira] [Commented] (HIVE-4106) SMB joins fail in multi-way joins

2013-04-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629996#comment-13629996
 ] 

Namit Jain commented on HIVE-4106:
--

This should not be a multi-way SMB join - this is due to HIVE-4167.
Can you apply the patch for HIVE-4167 and check whether this works?

 SMB joins fail in multi-way joins
 -

 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Blocker
 Attachments: auto_sortmerge_join_12.q, HIVE-4106.patch


 I see array out of bounds exception in case of multi way smb joins. This is 
 related to changes that went in as part of HIVE-3403. This issue has been 
 discussed in HIVE-3891.



[jira] [Updated] (HIVE-4275) Hive does not differentiate scheme and authority in file uris

2013-04-12 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4275:
-

Status: Patch Available  (was: Open)

 Hive does not differentiate scheme and authority in file uris
 -

 Key: HIVE-4275
 URL: https://issues.apache.org/jira/browse/HIVE-4275
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-4275.patch


 Consider the following set of queries:
 ALTER TABLE abc ADD PARTITION (x='0') LOCATION 'file:///foo';
 ALTER TABLE abc ADD PARTITION (x='1') LOCATION '/foo';
 select count(*) from abc;
 Even though there are different files under these directories, depending on 
 number of mappers, the count produces a value = num of mappers * num of files 
 in the 2 directories. This is incorrect.
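
The distinction the title refers to can be seen with plain java.net.URI. This sketch uses only the JDK, not Hadoop's Path or Hive's own path handling, and the class and method names are hypothetical. It shows that the two partition locations above share a path component but differ in scheme, so any comparison that looks at the path alone treats them as the same location.

```java
import java.net.URI;
import java.util.Objects;

// Illustrative only: compares locations by path alone versus by
// scheme + authority + path.
final class UriSchemeSketch {
    static boolean samePathOnly(String a, String b) {
        return URI.create(a).getPath().equals(URI.create(b).getPath());
    }

    static boolean sameLocation(String a, String b) {
        URI x = URI.create(a);
        URI y = URI.create(b);
        return Objects.equals(x.getScheme(), y.getScheme())
            && Objects.equals(x.getAuthority(), y.getAuthority())
            && Objects.equals(x.getPath(), y.getPath());
    }

    public static void main(String[] args) {
        System.out.println(samePathOnly("file:///foo", "/foo"));
        System.out.println(sameLocation("file:///foo", "/foo"));
    }
}
```

In a real cluster an unqualified location such as '/foo' would be resolved against the default file system, which the sketch does not attempt to model.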



[jira] [Updated] (HIVE-4275) Hive does not differentiate scheme and authority in file uris

2013-04-12 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4275:
-

Attachment: HIVE-4275.patch

Without the change, hive generates count as 4 (when it should be 2).

 Hive does not differentiate scheme and authority in file uris
 -

 Key: HIVE-4275
 URL: https://issues.apache.org/jira/browse/HIVE-4275
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.11.0

 Attachments: HIVE-4275.patch


 Consider the following set of queries:
 ALTER TABLE abc ADD PARTITION (x='0') LOCATION 'file:///foo';
 ALTER TABLE abc ADD PARTITION (x='1') LOCATION '/foo';
 select count(*) from abc;
 Even though there are different files under these directories, depending on 
 number of mappers, the count produces a value = num of mappers * num of files 
 in the 2 directories. This is incorrect.



[jira] [Updated] (HIVE-4278) HCat needs to get current Hive jars instead of pulling them from maven repo

2013-04-12 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4278:
---

Attachment: HIVE-4278.approach2.patch

Hi folks, I tried another approach that might hopefully be acceptable, and have 
uploaded a patch for that.

The premise:
a) We want to unblock HCat's ability to build from inside hive, picking the 
latest jars hive builds.
b) We want to do this without changing what the hive workflow looks like, 
including not changing publishing to the current ivy cache dir.
c) We don't want to perform invasive surgery to HCat switching it back to ivy 
either, till we can form a consensus as to which build tool we should 
standardize around.

Assumption:
a) Builtins can be removed ( HIVE-4304 and hive-dev mailing list discussion ) 
or, rather, at least, in the meanwhile, be disabled as a transitive dependency 
from other targets.
b) It is okay for hcat, during its build, to look at the jars hive has just 
built, and publish that to the local maven cache.

Future work:
a) Trying to unify version numbers between hcat and hive - I'm still unhappy 
about the number of files in which the string 0.12.0-SNAPSHOT occurs.

Is this compromise acceptable to all? Thoughts?

 HCat needs to get current Hive jars instead of pulling them from maven repo
 ---

 Key: HIVE-4278
 URL: https://issues.apache.org/jira/browse/HIVE-4278
 Project: Hive
  Issue Type: Sub-task
  Components: Build Infrastructure, HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Travis Crawford
Priority: Blocker
 Fix For: 0.11.0

 Attachments: HIVE-4278.approach2.patch, HIVE-4278.D9981.1.patch


 The HCatalog build is currently pulling Hive jars from the maven repo instead 
 of using the ones built as part of the current build.  Now that it is part of 
 Hive it should use the jars being built instead of pulling them from maven.



[jira] [Updated] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4167:
-

Attachment: hive.4167.2.patch

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: hive.4167.1.patch, hive.4167.2.patch, HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-4304) Remove unused builtins and pdk submodules

2013-04-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630154#comment-13630154
 ] 

Ashutosh Chauhan commented on HIVE-4304:


[~traviscrawford] Are you planning to upload a patch for this soon? I believe 
this will accelerate HIVE-4278

 Remove unused builtins and pdk submodules
 -

 Key: HIVE-4304
 URL: https://issues.apache.org/jira/browse/HIVE-4304
 Project: Hive
  Issue Type: Improvement
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-4304.1.patch


 Moving from email. The 
 [builtins|http://svn.apache.org/repos/asf/hive/trunk/builtins/] and 
 [pdk|http://svn.apache.org/repos/asf/hive/trunk/pdk/] submodules are not 
 believed to be in use and should be removed. The main benefits are 
 simplification and maintainability of the Hive code base.
 Forwarded conversation
 Subject: builtins submodule - is it still needed?
 
 From: Travis Crawford traviscrawf...@gmail.com
 Date: Thu, Apr 4, 2013 at 2:01 PM
 To: u...@hive.apache.org, dev@hive.apache.org
 Hey hive gurus -
 Is the builtins hive submodule in use? The submodule was added in
 HIVE-2523 as a location for builtin-UDFs, but it appears to not have
 taken off. Any objections to removing it?
 DETAILS
 For HIVE-4278 I'm making some build changes for the HCatalog
 integration. The builtins submodule causes issues because it delays
 building until the packaging phase - so HCatalog can't depend on
 builtins, which it does transitively.
 While investigating a path forward I discovered the builtins
 submodule contains very little code, and likely could either go away
 entirely or merge into ql, simplifying things both for users and
 developers.
 Thoughts? Can anyone with context help me understand builtins, both
 in general and around its non-standard build? For your trouble I'll
 either make the submodule go away/merge into another submodule, or
 update the docs with what we learn.
 Thanks!
 Travis
 --
 From: Ashutosh Chauhan ashutosh.chau...@gmail.com
 Date: Fri, Apr 5, 2013 at 3:10 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org u...@hive.apache.org
 I haven't used it myself so far, nor have I met anyone who uses it or plans
 to use it.
 Ashutosh
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford 
 traviscrawf...@gmail.com wrote:
 --
 From: Gunther Hagleitner ghagleit...@hortonworks.com
 Date: Fri, Apr 5, 2013 at 3:11 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org
 +1
 I would actually go a step further and propose to remove both PDK and
 builtins. I went through the code for both and here is what I found:
 Builtins:
 - BuiltInUtils.java: Empty file
 - UDAFUnionMap: Merges maps. Doesn't seem to be useful by itself, but was
 intended as a building block for PDK
 PDK:
 - some helper build.xml/test setup + teardown scripts
 - Classes/annotations to help run unit tests
 - rot13 as an example
 From what I can tell it's a fair assessment that it hasn't taken off, last
 commits to it seem to have happened more than 1.5 years ago.
 Thanks,
 Gunther.
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford 
 traviscrawf...@gmail.com wrote:
 --
 From: Owen O'Malley omal...@apache.org
 Date: Fri, Apr 5, 2013 at 4:45 PM
 To: u...@hive.apache.org
 +1 to removing them. 
 We have a Rot13 example in 
 ql/src/test/org/apache/hadoop/hive/ql/io/udf/Rot13{In,Out}putFormat.java 
 anyways. *smile*
 -- Owen



[jira] [Updated] (HIVE-4275) Hive does not differentiate scheme and authority in file uris

2013-04-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4275:
---

Fix Version/s: (was: 0.11.0)
   Status: Open  (was: Patch Available)

I get the following error after applying your patch and running the included test.
{noformat}
[junit] Failed query: schemeAuthority.q
[junit] mkdir: Incomplete HDFS URI, no host: hdfs:///tmp/test
{noformat}

 Hive does not differentiate scheme and authority in file uris
 -

 Key: HIVE-4275
 URL: https://issues.apache.org/jira/browse/HIVE-4275
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4275.patch


 Consider the following set of queries:
 ALTER TABLE abc ADD PARTITION (x='0') LOCATION 'file:///foo';
 ALTER TABLE abc ADD PARTITION (x='1') LOCATION '/foo';
 select count(*) from abc;
 Even though these directories contain different files, the count comes out as
 (number of mappers) * (number of files in the 2 directories), so the result
 depends on the number of mappers. This is incorrect.
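To illustrate why the two partition locations above should not be treated as the same place, here is a minimal Python sketch. It is not Hive's code: the helper name `qualify` and the default filesystem values (`hdfs`, `namenode:8020`) are assumptions made up for the example. It only shows that comparing the path component alone conflates the two locations, while comparing scheme, authority, and path keeps them apart.

```python
from urllib.parse import urlsplit

# Hypothetical default filesystem, used only for this illustration.
DEFAULT_SCHEME = "hdfs"
DEFAULT_AUTHORITY = "namenode:8020"


def qualify(location):
    """Return (scheme, authority, path), filling in defaults for bare paths."""
    parts = urlsplit(location)
    if parts.scheme:
        # The location names its own filesystem; keep it exactly as given.
        return (parts.scheme, parts.netloc, parts.path)
    # A bare path such as '/foo' lives on the default filesystem.
    return (DEFAULT_SCHEME, DEFAULT_AUTHORITY, parts.path)


local = qualify("file:///foo")
default_fs = qualify("/foo")

# Comparing only the path makes the two locations look identical ...
print(local[2] == default_fs[2])
# ... while comparing scheme and authority as well tells them apart.
print(local == default_fs)
```

The same parsing also shows that a URI such as `hdfs:///tmp/test` has an empty authority, which is what a "no host" complaint refers to.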



[jira] [Commented] (HIVE-446) Implement TRUNCATE

2013-04-12 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630258#comment-13630258
 ] 

Gang Tim Liu commented on HIVE-446:
---

An external table is used in contexts where the data is not fully managed. If 
there turns out to be a need to remove the data behind an external table, one 
can ask: why was it defined as an external table in the first place?

That said, the proposed syntax and semantics may not be consistent with the 
external table use case.

thanks

 Implement TRUNCATE
 --

 Key: HIVE-446
 URL: https://issues.apache.org/jira/browse/HIVE-446
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Prasad Chakka
Assignee: Navis
 Fix For: 0.11.0

 Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, 
 HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch


 truncate the data but leave the table and metadata intact.



[jira] [Created] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows because of illegal escape character

2013-04-12 Thread Shuaishuai Nie (JIRA)
Shuaishuai Nie created HIVE-4348:


 Summary: Unit test compile fail at hbase-handler project on 
Windows because of illegal escape character
 Key: HIVE-4348
 URL: https://issues.apache.org/jira/browse/HIVE-4348
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler, Testing Infrastructure, Windows
Affects Versions: 0.11.0
 Environment: Windows 8
Reporter: Shuaishuai Nie


The problem is that the automatically generated test case hardcodes the file 
path of the query file as a string literal containing \ rather than \\, so the 
backslash is read as an (illegal) escape character. The change should be made 
in TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm.
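As a rough illustration of the escaping issue (not the actual change to the .vm templates, which is not shown in this message), the Python sketch below turns a Windows path into a Java string literal by doubling each backslash. The function name `java_string_literal` and the sample path are made up for the example.

```python
def java_string_literal(path):
    """Render a file path as a Java string literal.

    Each backslash is doubled so that javac does not read sequences
    such as \\h or \\q as illegal escape characters.
    """
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


# Hypothetical Windows path to a query file.
windows_path = r"C:\hive\queries\q.q"

# Pasting the raw path into generated Java source would not compile;
# the escaped form is what the generated test case would need.
print(java_string_literal(windows_path))
```

Another option a template could take is to emit forward slashes, which Java file APIs generally accept on Windows, but whether that suits these tests is not something this thread establishes.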



[jira] [Commented] (HIVE-4161) create clean and small default set of tests for TestBeeLineDriver

2013-04-12 Thread Rob Weltman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630298#comment-13630298
 ] 

Rob Weltman commented on HIVE-4161:
---

See the new tests in the patch to HIVE-4268. That is probably a better place to 
put BeeLine tests than in ql.


 create clean and small default set of tests for TestBeeLineDriver
 -

 Key: HIVE-4161
 URL: https://issues.apache.org/jira/browse/HIVE-4161
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: HiveServer2
 Fix For: 0.11.0


 HiveServer2 (HIVE-2935) added TestBeeLineDriver along the lines of 
 TestCliDriver. It runs all the tests in TestCliDriver through the beeline 
 command line, which uses JDBC and HiveServer2.
 There are failures in many of the test cases after rebasing the patch 
 against the latest Hive code.
 The tests also almost double the time taken to run the Hive unit tests, because 
 TestCliDriver takes the bulk of the Hive unit test runtime.



Hive-trunk-h0.21 - Build # 2060 - Still Failing

2013-04-12 Thread Apache Jenkins Server
Changes for Build #2032
[namit] HIVE-4219 explain dependency does not capture the input table
(Namit via Gang Tim Liu)


Changes for Build #2033
[gates] Removing old branches to limit size of Hive downloads.

[gates] Removing tags directory as we no longer need them and they're in the 
history.

[gates] Moving HCatalog into Hive.

[gates] Test that perms work for hcatalog

[hashutosh] HIVE-4007 : Create abstract classes for serializer and deserializer 
(Namit Jain via Ashutosh Chauhan)

[hashutosh] HIVE-3381 : Result of outer join is not valid (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-3980 : Cleanup after 3403 (Namit Jain via Ashutosh Chauhan)

[hashutosh] HIVE-4042 : ignore mapjoin hint (Namit Jain via Ashutosh Chauhan)

[namit] HIVE-3348 semi-colon in comments in .q file does not work
(Nick Collins via namit)

[namit] HIVE-4212 sort merge join should work for outer joins for more than 8 
inputs
(Namit via Gang Tim Liu)


Changes for Build #2034
[namit] HIVE-3958 support partial scan for analyze command - RCFile
(Gang Tim Liu via namit)


Changes for Build #2035
[kevinwilfong] HIVE-4235. CREATE TABLE IF NOT EXISTS uses inefficient way to 
check if table exists. (Gang Tim Liu via kevinwilfong)

[gangtimliu] HIVE-4157: ORC runs out of heap when writing (Kevin Wilfong vi 
Gang Tim Liu)

[gangtimliu] HIVE-4155: Expose ORC's FileDump as a service

[gangtimliu] HIVE-4159:RetryingHMSHandler doesn't retry in enough cases (Kevin 
Wilfong vi Gang Tim Liu)

[namit] HIVE-4149 wrong results big outer joins with array of ints
(Navis via namit)


Changes for Build #2036
[gates] HIVE-4264 Moved hcatalog trunk code up to hive/trunk/hcatalog

[hashutosh] HIVE-4263 : Adjust build.xml package command to move all hcat jars 
and binaries into build (Alan Gates via Ashutosh Chauhan)

[namit] HIVE-4258 Log logical plan tree for debugging
(Navis via namit)

[navis] HIVE-2264 Hive server is SHUTTING DOWN when invalid queries beeing 
executed


Changes for Build #2037

Changes for Build #2038
[hashutosh] HIVE-4122 : Queries fail if timestamp data not in expected format 
(Prasad Mujumdar via Ashutosh Chauhan)

[hashutosh] HIVE-4170 : [REGRESSION] FsShell.close closes filesystem, removing 
temporary directories (Navis via Ashutosh Chauhan)


Changes for Build #2039
[hashutosh] HIVE-3850 : hour() function returns 12 hour clock value when using 
timestamp datatype (Anandha and Franklin via Ashutosh Chauhan)


Changes for Build #2040
[hashutosh] HIVE-3951 : Allow Decimal type columns in Regex Serde (Mark Grover 
via Ashutosh Chauhan)

[namit] HIVE-4270 bug in hive.map.groupby.sorted in the presence of multiple 
input partitions
(Namit via Gang Tim Liu)


Changes for Build #2041

Changes for Build #2042

Changes for Build #2043
[hashutosh] HIVE-4252 : hiveserver2 string representation of complex types are 
inconsistent with cli (Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4179 : NonBlockingOpDeDup does not merge SEL operators 
correctly (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4269 : fix handling of binary type in hiveserver2, jdbc driver 
(Thejas Nair via Ashutosh Chauhan)

[namit] HIVE-4174 Round UDF converts BigInts to double
(Chen Chun via namit)

[namit] HIVE-4240 optimize hive.enforce.bucketing and hive.enforce sorting 
insert
(Gang Tim Liu via namit)

[navis] HIVE-4288 Add IntelliJ project files files to .gitignore (Roshan Naik 
via Navis)


Changes for Build #2044
[namit] HIVE-4289 HCatalog build fails when behind a firewall
(Samuel Yuan via namit)

[namit] HIVE-4281 add hive.map.groupby.sorted.testmode
(Namit via Gang Tim Liu)

[hashutosh] Moving hcatalog site outside of trunk

[hashutosh] Moving hcatalog branches outside of trunk

[hashutosh] HIVE-4259 : SEL operator created with missing columnExprMap for 
unions (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4156 : need to add protobuf classes to hive-exec.jar (Owen 
Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-3464 : Merging join tree may reorder joins which could be 
invalid (Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4138 : ORC's union object inspector returns a type name that 
isn't parseable by TypeInfoUtils (Owen Omalley via Ashutosh Chauhan)

[cws] HIVE-4119. ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with 
NPE if the table is empty (Shreepadma Venugopalan via cws)


Changes for Build #2045

Changes for Build #2046
[hashutosh] HIVE-4067 : Followup to HIVE-701: reduce ambiguity in grammar 
(Samuel Yuan via Ashutosh Chauhan)


Changes for Build #2047

Changes for Build #2048
[gangtimliu] HIVE-4298: add tests for distincts for hive.map.groutp.sorted. 
(Namit via Gang Tim Liu)

[hashutosh] HIVE-4128 : Support avg(decimal) (Brock Noland via Ashutosh Chauhan)

[kevinwilfong] HIVE-4151. HiveProfiler NPE with ScriptOperator. (Pamela Vagata 
via kevinwilfong)


Changes for Build #2049
[hashutosh] HIVE-3985 : Update new UDAFs introduced for Windowing to work with 
new Decimal Type (Brock Noland via Ashutosh Chauhan)

Hive-trunk-hadoop2 - Build # 155 - Still Failing

2013-04-12 Thread Apache Jenkins Server
Changes for Build #138
[namit] HIVE-4289 HCatalog build fails when behind a firewall
(Samuel Yuan via namit)

[namit] HIVE-4281 add hive.map.groupby.sorted.testmode
(Namit via Gang Tim Liu)

[hashutosh] Moving hcatalog site outside of trunk

[hashutosh] Moving hcatalog branches outside of trunk

[hashutosh] HIVE-4259 : SEL operator created with missing columnExprMap for 
unions (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4156 : need to add protobuf classes to hive-exec.jar (Owen 
Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-3464 : Merging join tree may reorder joins which could be 
invalid (Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4138 : ORC's union object inspector returns a type name that 
isn't parseable by TypeInfoUtils (Owen Omalley via Ashutosh Chauhan)

[cws] HIVE-4119. ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with 
NPE if the table is empty (Shreepadma Venugopalan via cws)

[hashutosh] HIVE-4252 : hiveserver2 string representation of complex types are 
inconsistent with cli (Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4179 : NonBlockingOpDeDup does not merge SEL operators 
correctly (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4269 : fix handling of binary type in hiveserver2, jdbc driver 
(Thejas Nair via Ashutosh Chauhan)

[namit] HIVE-4174 Round UDF converts BigInts to double
(Chen Chun via namit)

[namit] HIVE-4240 optimize hive.enforce.bucketing and hive.enforce sorting 
insert
(Gang Tim Liu via namit)

[navis] HIVE-4288 Add IntelliJ project files files to .gitignore (Roshan Naik 
via Navis)

[namit] HIVE-4272 partition wise metadata does not work for text files

[hashutosh] HIVE-896 : Add LEAD/LAG/FIRST/LAST analytical windowing functions 
to Hive. (Harish Butani via Ashutosh Chauhan)

[namit] HIVE-4260 union_remove_12, union_remove_13 are failing on hadoop2
(Gunther Hagleitner via namit)

[hashutosh] HIVE-3951 : Allow Decimal type columns in Regex Serde (Mark Grover 
via Ashutosh Chauhan)

[namit] HIVE-4270 bug in hive.map.groupby.sorted in the presence of multiple 
input partitions
(Namit via Gang Tim Liu)

[hashutosh] HIVE-3850 : hour() function returns 12 hour clock value when using 
timestamp datatype (Anandha and Franklin via Ashutosh Chauhan)

[hashutosh] HIVE-4122 : Queries fail if timestamp data not in expected format 
(Prasad Mujumdar via Ashutosh Chauhan)

[hashutosh] HIVE-4170 : [REGRESSION] FsShell.close closes filesystem, removing 
temporary directories (Navis via Ashutosh Chauhan)

[gates] HIVE-4264 Moved hcatalog trunk code up to hive/trunk/hcatalog

[hashutosh] HIVE-4263 : Adjust build.xml package command to move all hcat jars 
and binaries into build (Alan Gates via Ashutosh Chauhan)

[namit] HIVE-4258 Log logical plan tree for debugging
(Navis via namit)

[navis] HIVE-2264 Hive server is SHUTTING DOWN when invalid queries beeing 
executed

[kevinwilfong] HIVE-4235. CREATE TABLE IF NOT EXISTS uses inefficient way to 
check if table exists. (Gang Tim Liu via kevinwilfong)

[gangtimliu] HIVE-4157: ORC runs out of heap when writing (Kevin Wilfong vi 
Gang Tim Liu)

[gangtimliu] HIVE-4155: Expose ORC's FileDump as a service

[gangtimliu] HIVE-4159:RetryingHMSHandler doesn't retry in enough cases (Kevin 
Wilfong vi Gang Tim Liu)

[namit] HIVE-4149 wrong results big outer joins with array of ints
(Navis via namit)

[namit] HIVE-3958 support partial scan for analyze command - RCFile
(Gang Tim Liu via namit)

[gates] Removing old branches to limit size of Hive downloads.

[gates] Removing tags directory as we no longer need them and they're in the 
history.

[gates] Moving HCatalog into Hive.

[gates] Test that perms work for hcatalog

[hashutosh] HIVE-4007 : Create abstract classes for serializer and deserializer 
(Namit Jain via Ashutosh Chauhan)

[hashutosh] HIVE-3381 : Result of outer join is not valid (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-3980 : Cleanup after 3403 (Namit Jain via Ashutosh Chauhan)

[hashutosh] HIVE-4042 : ignore mapjoin hint (Namit Jain via Ashutosh Chauhan)

[namit] HIVE-3348 semi-colon in comments in .q file does not work
(Nick Collins via namit)

[namit] HIVE-4212 sort merge join should work for outer joins for more than 8 
inputs
(Namit via Gang Tim Liu)

[namit] HIVE-4219 explain dependency does not capture the input table
(Namit via Gang Tim Liu)

[kevinwilfong] HIVE-4092. Store complete names of tables in column access 
analyzer (Samuel Yuan via kevinwilfong)

[namit] HIVE-4208 Clientpositive test parenthesis_star_by is non-deteministic
(Mark Grover via namit)

[cws] HIVE-4217. Fix show_create_table_*.q test failures (Carl Steinbach via 
cws)

[namit] HIVE-4206 Sort merge join does not work for outer joins for 7 inputs
(Namit via Gang Tim Liu)

[kevinwilfong] HIVE-4188. TestJdbcDriver2.testDescribeTable failing 
consistently. (Prasad Mujumdar via kevinwilfong)

[hashutosh] HIVE-3820 Consider creating a literal like D or BD for representing 
Decimal type constants (Gunther Hagleitner 

Preferred way to run unit tests

2013-04-12 Thread kulkarni.swar...@gmail.com
Hello,

I have been trying to run the unit tests for the last Hive release (0.10).
For me they have been taking in excess of 10 hrs to run (not to mention
the occasional failures with some of the flaky tests).

Currently I am just running "ant clean package test". Is there a better way
to run these? Also, is it possible for the build to ignore any test failures
and complete?

Thanks for any help.

-- 
Swarnim


[jira] [Commented] (HIVE-4304) Remove unused builtins and pdk submodules

2013-04-12 Thread Travis Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630350#comment-13630350
 ] 

Travis Crawford commented on HIVE-4304:
---

Just started tests at 
https://travis.ci.cloudbees.com/job/HIVE-4304_rm_builtins_pdk/ and will post if 
they pass.

 Remove unused builtins and pdk submodules
 -

 Key: HIVE-4304
 URL: https://issues.apache.org/jira/browse/HIVE-4304
 Project: Hive
  Issue Type: Improvement
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-4304.1.patch


 Moving from email. The 
 [builtins|http://svn.apache.org/repos/asf/hive/trunk/builtins/] and 
 [pdk|http://svn.apache.org/repos/asf/hive/trunk/pdk/] submodules are not 
 believed to be in use and should be removed. The main benefits are 
 simplification and maintainability of the Hive code base.
 Forwarded conversation
 Subject: builtins submodule - is it still needed?
 
 From: Travis Crawford traviscrawf...@gmail.com
 Date: Thu, Apr 4, 2013 at 2:01 PM
 To: u...@hive.apache.org, dev@hive.apache.org
 Hey hive gurus -
 Is the builtins hive submodule in use? The submodule was added in
 HIVE-2523 as a location for builtin-UDFs, but it appears to not have
 taken off. Any objections to removing it?
 DETAILS
 For HIVE-4278 I'm making some build changes for the HCatalog
 integration. The builtins submodule causes issues because it delays
 building until the packaging phase - so HCatalog can't depend on
 builtins, which it does transitively.
 While investigating a path forward I discovered the builtins
 submodule contains very little code, and likely could either go away
 entirely or merge into ql, simplifying things both for users and
 developers.
 Thoughts? Can anyone with context help me understand builtins, both
 in general and around its non-standard build? For your trouble I'll
 either make the submodule go away/merge into another submodule, or
 update the docs with what we learn.
 Thanks!
 Travis
 --
 From: Ashutosh Chauhan ashutosh.chau...@gmail.com
 Date: Fri, Apr 5, 2013 at 3:10 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org u...@hive.apache.org
 I haven't used it myself so far, nor have I met anyone who uses it or plans
 to use it.
 Ashutosh
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford
 traviscrawf...@gmail.com wrote:
 --
 From: Gunther Hagleitner ghagleit...@hortonworks.com
 Date: Fri, Apr 5, 2013 at 3:11 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org
 +1
 I would actually go a step further and propose to remove both PDK and
 builtins. I went through the code for both, and here is what I found:
 Builtins:
 - BuiltInUtils.java: Empty file
 - UDAFUnionMap: Merges maps. Doesn't seem to be useful by itself, but was
 intended as a building block for PDK
 PDK:
 - some helper build.xml/test setup + teardown scripts
 - Classes/annotations to help run unit tests
 - rot13 as an example
 From what I can tell, it's fair to say that it hasn't taken off: the last
 commits to it seem to have happened more than 1.5 years ago.
 Thanks,
 Gunther.
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford
 traviscrawf...@gmail.com wrote:
 --
 From: Owen O'Malley omal...@apache.org
 Date: Fri, Apr 5, 2013 at 4:45 PM
 To: u...@hive.apache.org
 +1 to removing them. 
 We have a Rot13 example in 
 ql/src/test/org/apache/hadoop/hive/ql/io/udf/Rot13{In,Out}putFormat.java 
 anyways. *smile*
 -- Owen



[jira] [Created] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters

2013-04-12 Thread Xi Fang (JIRA)
Xi Fang created HIVE-4349:
-

 Summary: Fix the Hive unit test failures when the Hive enlistment 
root path is longer than ~12 characters
 Key: HIVE-4349
 URL: https://issues.apache.org/jira/browse/HIVE-4349
 Project: Hive
  Issue Type: Bug
Reporter: Xi Fang
 Fix For: 0.11.0


If the Hive enlistment root path is longer than 12 characters, the test 
classpath “hadoop.testcp” exceeds 8K characters, so we are unable to run most 
of the Hive unit tests on Windows.



[jira] [Updated] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters

2013-04-12 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-4349:
--

Attachment: HIVE-4349.patch

 Fix the Hive unit test failures when the Hive enlistment root path is longer 
 than ~12 characters
 

 Key: HIVE-4349
 URL: https://issues.apache.org/jira/browse/HIVE-4349
 Project: Hive
  Issue Type: Bug
Reporter: Xi Fang
 Fix For: 0.11.0

 Attachments: HIVE-4349.patch


 If the Hive enlistment root path is longer than 12 characters, the test 
 classpath “hadoop.testcp” exceeds 8K characters, so we are unable to run most 
 of the Hive unit tests on Windows.



[jira] [Updated] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters

2013-04-12 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-4349:
--

Attachment: (was: HIVE-4349.patch)

 Fix the Hive unit test failures when the Hive enlistment root path is longer 
 than ~12 characters
 

 Key: HIVE-4349
 URL: https://issues.apache.org/jira/browse/HIVE-4349
 Project: Hive
  Issue Type: Bug
Reporter: Xi Fang
 Fix For: 0.11.0

 Attachments: HIVE-4349.1.patch


 If the Hive enlistment root path is longer than 12 characters, the test 
 classpath “hadoop.testcp” exceeds 8K characters, so we are unable to run most 
 of the Hive unit tests on Windows.



[jira] [Updated] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters

2013-04-12 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-4349:
--

Attachment: HIVE-4349.1.patch

 Fix the Hive unit test failures when the Hive enlistment root path is longer 
 than ~12 characters
 

 Key: HIVE-4349
 URL: https://issues.apache.org/jira/browse/HIVE-4349
 Project: Hive
  Issue Type: Bug
Reporter: Xi Fang
 Fix For: 0.11.0

 Attachments: HIVE-4349.1.patch


 If the Hive enlistment root path is longer than 12 characters, the test 
 classpath “hadoop.testcp” exceeds 8K characters, so we are unable to run most 
 of the Hive unit tests on Windows.



[jira] [Updated] (HIVE-4339) build fails after branch (hcatalog version not updated)

2013-04-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4339:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 build fails after branch (hcatalog version not updated)
 ---

 Key: HIVE-4339
 URL: https://issues.apache.org/jira/browse/HIVE-4339
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.12.0

 Attachments: HIVE-4339.1.patch






[jira] [Commented] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters

2013-04-12 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630405#comment-13630405
 ] 

Xi Fang commented on HIVE-4349:
---

This is the current solution:
1) Before setting up the class path environment variable, find the list of JARs 
in the “test-classpath” and copy all of them from their various folders into a 
single test jar folder. This is done in a task named shortenclasspath. As a 
result, all the required JARs are in a single folder.
2) Include the “test jar*” in the class path to reduce the class path size.
3) Set the environment variable.
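The steps above can be sketched roughly as follows. This is a Python illustration of the idea only, not the Ant task from the patch: the function name `shorten_classpath` and the folder names are hypothetical. It relies on the JVM's support for a `dir/*` wildcard entry on the class path, so one short entry can stand in for many long ones.

```python
import os
import shutil


def shorten_classpath(jar_paths, target_dir):
    """Copy every JAR into one folder and return a single wildcard entry.

    Note: JARs with the same file name in different source folders would
    overwrite each other here; a real build would need to handle that.
    """
    os.makedirs(target_dir, exist_ok=True)
    for jar in jar_paths:
        shutil.copy(jar, target_dir)
    # One entry such as 'build/testjars/*' replaces the long list of paths.
    return os.path.join(target_dir, "*")


def classpath_length(jar_paths):
    """Length of the class path if every JAR were listed individually."""
    return len(os.pathsep.join(jar_paths))
```

The length of the resulting entry depends only on the target folder's path, not on how many JARs there are or how deeply they were nested.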

 Fix the Hive unit test failures when the Hive enlistment root path is longer 
 than ~12 characters
 

 Key: HIVE-4349
 URL: https://issues.apache.org/jira/browse/HIVE-4349
 Project: Hive
  Issue Type: Bug
Reporter: Xi Fang
 Fix For: 0.11.0

 Attachments: HIVE-4349.1.patch


 If the Hive enlistment root path is longer than 12 characters, the test 
 classpath “hadoop.testcp” exceeds 8K characters, so we are unable to run most 
 of the Hive unit tests on Windows.



[jira] [Updated] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters

2013-04-12 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-4349:
--

Affects Version/s: 0.11.0
   Status: Patch Available  (was: Open)

 Fix the Hive unit test failures when the Hive enlistment root path is longer 
 than ~12 characters
 

 Key: HIVE-4349
 URL: https://issues.apache.org/jira/browse/HIVE-4349
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Xi Fang
 Fix For: 0.11.0

 Attachments: HIVE-4349.1.patch


 If the Hive enlistment root path is longer than 12 characters, the test 
 classpath “hadoop.testcp” exceeds 8K characters, so we are unable to run most 
 of the Hive unit tests on Windows.



[jira] [Updated] (HIVE-4347) Hcatalog build fail on Windows because javadoc command exceed length limit

2013-04-12 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-4347:
-

Attachment: HIVE-4347.patch

 Hcatalog build fail on Windows because javadoc command exceed length limit
 --

 Key: HIVE-4347
 URL: https://issues.apache.org/jira/browse/HIVE-4347
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure, HCatalog, Windows
Affects Versions: 0.11.0
 Environment: Windows 8
Reporter: Shuaishuai Nie
  Labels: build, patch
 Attachments: HIVE-4347.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 When building HCatalog on Windows 8, the build fails with:
 HIVE_DIR\hcatalog\build.xml:213: Javadoc failed: java.io.IOException: Cannot 
 run program JAVA_HOME\bin\javadoc.exe: CreateProcess error=206, The filename 
 or extension is too long



[jira] [Created] (HIVE-4350) support AS keyword for table alias

2013-04-12 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-4350:
---

 Summary: support AS keyword for table alias
 Key: HIVE-4350
 URL: https://issues.apache.org/jira/browse/HIVE-4350
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.10.0
Reporter: Thejas M Nair


The SQL standard supports an optional AS keyword when creating a table alias.

http://savage.net.au/SQL/sql-92.bnf.html#table reference

Hive gives an error when the optional keyword is used:
select * from tiny as t1;
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: ParseException line 1:19 mismatched input 'as' expecting EOF near 'tiny'





[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-12 Thread Mihir Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihir Kulkarni updated HIVE-4342:
-

Priority: Critical  (was: Major)

 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Priority: Critical
 Attachments: example.txt


 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attached file (example.txt) contains the schema and exact query which fails 
 on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Pamela Vagata (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630511#comment-13630511
 ] 

Pamela Vagata commented on HIVE-4318:
-

Thanks for running these separately :) I just looked in OperatorHookUtils.java 
which is where the opHooks list is being initialized - it looks like the 
opHooks list is always being initialized even if there are no OperatorHooks 
installed. My suspicion is that if we returned null instead of an empty list, 
the numbers would be different since a null check should be much cheaper. Would 
you mind modifying OperatorHookUtils.getOperatorHooks to return null instead of 
an empty list and then rerun the MBM with the code for the OperatorHooks left 
in and also commented out to see what the difference is?

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch


 Operator hooks inserted into Operator.java cause a performance hit even when 
 they are not being used.
 For a count(1) query tested with and without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that has 
 to forward the row - the more operators there are the, the slower the query.
 The modification made to test this was:
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
        return;
      }
      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
 -    preProcessCounter();
 -    enterOperatorHooks(opHookContext);
 +    //preProcessCounter();
 +    //enterOperatorHooks(opHookContext);
      processOp(row, tag);
 -    exitOperatorHooks(opHookContext);
 -    postProcessCounter();
 +    //exitOperatorHooks(opHookContext);
 +    //postProcessCounter();
    }
 {code}



[jira] [Commented] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-12 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630546#comment-13630546
 ] 

Gang Tim Liu commented on HIVE-4322:


+1 after test passes

 SkewedInfo in Metastore Thrift API cannot be deserialized in Python
 ---

 Key: HIVE-4322
 URL: https://issues.apache.org/jira/browse/HIVE-4322
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Attachments: HIVE-4322.HIVE-4322.HIVE-4322.HIVE-4322.D10203.1.patch


 The Thrift-generated Python code that deserializes Thrift objects fails 
 whenever a complex type is used as a map key, because by default mutable 
 Python objects such as lists do not have a hash function. See 
 https://issues.apache.org/jira/browse/THRIFT-162 for related discussion.
 The SkewedInfo struct contains a map which uses a list as a key, breaking the 
 Python Thrift interface. It is not possible to specify the mapping from 
 Thrift types to Python types; otherwise we could map Thrift lists to 
 Python tuples. Instead, the proposed workaround wraps the list inside a new 
 struct. This alone does not accomplish anything, but it allows Python clients 
 to define a hash function for the struct class, e.g.:
 def f(object):
     return hash(tuple(object.skewedValueList))
 SkewedValueList.__hash__ = f
 In practice a more efficient hash might be defined that does not involve 
 copying the list. The advantage of wrapping the list inside a struct is that 
 the client does not have to define the hash on the list itself, which would 
 change the behaviour of lists everywhere else in the code.
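The workaround can be tried without a Thrift installation. The sketch below uses a hand-written stand-in for the generated class; the class body, the sample values, and the helper name `hash_skewed_value_list` are assumptions for illustration, not the generated code. It relies on the fact that in Python 3 a class that defines `__eq__` without `__hash__` is unhashable, which mirrors the failure described above.

```python
class SkewedValueList:
    """Hand-written stand-in for a Thrift-generated struct wrapping a list."""

    def __init__(self, skewedValueList=None):
        self.skewedValueList = skewedValueList

    # Defining __eq__ without __hash__ makes instances unhashable in Python 3.
    def __eq__(self, other):
        return (isinstance(other, SkewedValueList)
                and self.skewedValueList == other.skewedValueList)


def hash_skewed_value_list(obj):
    # Copies the list into a tuple, which is hashable when its items are.
    return hash(tuple(obj.skewedValueList))


key = SkewedValueList(["a", "1"])
try:
    {key: "/some/location"}
    unhashable_before = False
except TypeError:
    unhashable_before = True

# Client-side patch: only the wrapper struct gains a hash, not list itself.
SkewedValueList.__hash__ = hash_skewed_value_list

locations = {key: "/some/location"}
lookup = locations[SkewedValueList(["a", "1"])]
print(unhashable_before, lookup)
```

Because the hash is attached to the wrapper class, plain Python lists elsewhere in the client remain unhashable, which is the behaviour the description wants to preserve.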



[jira] [Commented] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-12 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630549#comment-13630549
 ] 

Phabricator commented on HIVE-4322:
---

gangtimliu has commented on the revision HIVE-4322 [jira] SkewedInfo in 
Metastore Thrift API cannot be deserialized in Python.

  code lgtm.

  looking at hadoop 23 tests.

INLINE COMMENTS
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1127 
What's the difference in this line? Thanks.

REVISION DETAIL
  https://reviews.facebook.net/D10203

To: gangtimliu, sxyuan
Cc: kevinwilfong, JIRA




[jira] [Commented] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-12 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630553#comment-13630553
 ] 

Phabricator commented on HIVE-4322:
---

sxyuan has commented on the revision HIVE-4322 [jira] SkewedInfo in Metastore 
Thrift API cannot be deserialized in Python.

INLINE COMMENTS
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1127 
Just fixed the name of the function.

REVISION DETAIL
  https://reviews.facebook.net/D10203

To: gangtimliu, sxyuan
Cc: kevinwilfong, JIRA




[jira] [Created] (HIVE-4351) Thrift code generation fails due to hcatalog

2013-04-12 Thread Gang Tim Liu (JIRA)
Gang Tim Liu created HIVE-4351:
--

 Summary: Thrift code generation fails due to hcatalog
 Key: HIVE-4351
 URL: https://issues.apache.org/jira/browse/HIVE-4351
 Project: Hive
  Issue Type: Bug
  Components: Thrift API
Affects Versions: 0.11.0
Reporter: Gang Tim Liu
Assignee: Ashutosh Chauhan


Thrift code generation fails because the hcatalog project does not have a thriftif target:

ant thriftif -Dthrift.home=/usr/local
.
BUILD FAILED

Target thriftif does not exist in the project hcatalog. 






[jira] [Updated] (HIVE-4275) Hive does not differentiate scheme and authority in file uris

2013-04-12 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4275:
-

Attachment: HIVE-4275.2.patch

The test was actually for TestMinimrCliDriver. I had missed the changes in 
build-common.xml.

 Hive does not differentiate scheme and authority in file uris
 -

 Key: HIVE-4275
 URL: https://issues.apache.org/jira/browse/HIVE-4275
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4275.2.patch, HIVE-4275.patch


 Consider the following set of queries:
 ALTER TABLE abc ADD PARTITION (x='0') LOCATION 'file:///foo';
 ALTER TABLE abc ADD PARTITION (x='1') LOCATION '/foo';
 select count(*) from abc;
 Even though there are different files under these directories, the count 
 depends on the number of mappers: it produces a value = num of mappers * num 
 of files in the 2 directories. This is incorrect.
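The distinction the title refers to can be seen by splitting the two partition locations into their URI components. This is a minimal sketch using Python's standard `urllib.parse`, not Hive's own path handling. The `hdfs://nn1:8020` and `hdfs://nn2:8020` locations are made-up examples added to show the authority component, and `location_key` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def location_key(location: str) -> tuple:
    """Identify a location by (scheme, authority, path), not by path alone."""
    parts = urlsplit(location)
    return (parts.scheme, parts.netloc, parts.path)


a = location_key("file:///foo")
b = location_key("/foo")
# Hypothetical locations: same path, different authority.
c = location_key("hdfs://nn1:8020/foo")
d = location_key("hdfs://nn2:8020/foo")

# The paths are equal, so a path-only comparison conflates the locations.
same_path = a[2] == b[2] == c[2] == d[2]
# The full keys differ in scheme (a vs b) or authority (c vs d).
all_distinct = len({a, b, c, d}) == 4
print(same_path, all_distinct)
```

Any bookkeeping keyed on the path component alone would treat all four locations as one directory, which is the kind of conflation the issue title describes.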



[jira] [Updated] (HIVE-4275) Hive does not differentiate scheme and authority in file uris

2013-04-12 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4275:
-

Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-4275) Hive does not differentiate scheme and authority in file uris

2013-04-12 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630576#comment-13630576
 ] 

Vikram Dixit K commented on HIVE-4275:
--

Review board request.

https://reviews.apache.org/r/10429/



[jira] [Created] (HIVE-4352) Guava not getting included in build package

2013-04-12 Thread Mark Wagner (JIRA)
Mark Wagner created HIVE-4352:
-

 Summary: Guava not getting included in build package
 Key: HIVE-4352
 URL: https://issues.apache.org/jira/browse/HIVE-4352
 Project: Hive
  Issue Type: Bug
Reporter: Mark Wagner


Since HIVE-4148, Guava is not getting included in the appropriate packages. 
This manifests as a ClassNotFoundException.



[jira] [Assigned] (HIVE-4352) Guava not getting included in build package

2013-04-12 Thread Mark Wagner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Wagner reassigned HIVE-4352:
-

Assignee: Mark Wagner

 Guava not getting included in build package
 ---

 Key: HIVE-4352
 URL: https://issues.apache.org/jira/browse/HIVE-4352
 Project: Hive
  Issue Type: Bug
Reporter: Mark Wagner
Assignee: Mark Wagner



[jira] [Created] (HIVE-4353) Add AvroObjectInspectorGenerator API

2013-04-12 Thread Edward C. Skoviak (JIRA)
Edward C. Skoviak created HIVE-4353:
---

 Summary: Add AvroObjectInspectorGenerator API
 Key: HIVE-4353
 URL: https://issues.apache.org/jira/browse/HIVE-4353
 Project: Hive
  Issue Type: Improvement
Reporter: Edward C. Skoviak
Priority: Minor


Whilst working on a Hive project where I am auto-generating a create table hive 
command for clients, I became very aware of how helpful an API would be for the 
AvroObjectInspectorGenerator. This functionality would make it very easy for 
consumers to pull out column names and types.
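As an illustration of what such an API might return, the sketch below pulls column names and types out of an Avro record schema using only the standard `json` module. It does not use AvroObjectInspectorGenerator; the schema, the function name `columns_from_avro_schema`, and the decision to report Avro type names unchanged are all assumptions.

```python
import json


def columns_from_avro_schema(schema_json: str) -> list:
    """Return (name, type) pairs for the top-level fields of a record schema."""
    schema = json.loads(schema_json)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    columns = []
    for field in schema["fields"]:
        avro_type = field["type"]
        # Complex types (unions, arrays, nested records) are kept as JSON text.
        if not isinstance(avro_type, str):
            avro_type = json.dumps(avro_type)
        columns.append((field["name"], avro_type))
    return columns


# Hypothetical schema used only for this example.
example_schema = """
{"type": "record", "name": "Click", "fields": [
  {"name": "user_id", "type": "long"},
  {"name": "url", "type": "string"},
  {"name": "referrer", "type": ["null", "string"]}
]}
"""

columns = columns_from_avro_schema(example_schema)
for name, avro_type in columns:
    print(name, avro_type)
```

A client generating a create table statement would still need to translate each Avro type into the corresponding Hive type, which this sketch deliberately leaves out.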



[jira] [Updated] (HIVE-4353) Add AvroObjectInspectorGenerator API

2013-04-12 Thread Edward C. Skoviak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward C. Skoviak updated HIVE-4353:


Description: Whilst working on a Hive project where I am auto-generating a 
create table hive command for clients, I became very aware of how helpful an 
API would be for the AvroObjectInspectorGenerator. This functionality would 
make it very easy for consumers to pull out column names and types, especially 
in the scenario where you cannot use the auto-generate flag.  (was: Whilst 
working on a Hive project where I am auto-generating a create table hive 
command for clients, I became very aware of how helpful an API would be for the 
AvroObjectInspectorGenerator. This functionality would make it very easy for 
consumers to pull out column names and types.)



[jira] [Updated] (HIVE-4353) Add AvroObjectInspectorGenerator API

2013-04-12 Thread Edward C. Skoviak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward C. Skoviak updated HIVE-4353:


Description: Whilst working on a Hive project where I am auto-generating a 
create table hive command for clients, I became very aware of how helpful an 
API would be for the AvroObjectInspectorGenerator. This functionality would 
make it very easy for consumers to pull out column names and types, and would 
be especially helpful in the scenario where you cannot use the auto-generate 
flag.  (was: Whilst working on a Hive project where I am auto-generating a 
create table hive command for clients, I became very aware of how helpful an 
API would be for the AvroObjectInspectorGenerator. This functionality would 
make it very easy for consumers to pull out column names and types, especially 
in the scenario where you cannot use the auto-generate flag.)



[jira] [Commented] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-12 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630605#comment-13630605
 ] 

Phabricator commented on HIVE-4322:
---

gangtimliu has accepted the revision HIVE-4322 [jira] SkewedInfo in Metastore 
Thrift API cannot be deserialized in Python.

  thanks

REVISION DETAIL
  https://reviews.facebook.net/D10203

BRANCH
  svn

ARCANIST PROJECT
  hive

To: gangtimliu, sxyuan
Cc: kevinwilfong, JIRA




[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630607#comment-13630607
 ] 

Gunther Hagleitner commented on HIVE-4318:
--

[~pamelavagata]: I saw that too, and I am sure it would make the numbers 
slightly better. There's also the issue of allocating a new object for each 
invocation, which is probably even worse than the empty list. My point, though, 
is this: even if we get it down to where I fixed counters too, you would still 
pay a price for the feature. Running without counters is still faster than 
running with the fixed counters (see above). 

From this thread it seems that the profiler is a valuable feature for keeping 
tabs on performance in the dev cycle; operator hooks, on the other hand, are 
not that useful. Anything you add there has a tremendously bad effect on 
performance.

From that I concluded that we should change the profiler so that it neither 
relies on operator hooks nor contributes to run time in production. The best 
way, to me, is to remove it temporarily and handle it in a new JIRA (where we 
can discuss the how in more detail).

Does that make sense?


 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch




[jira] [Updated] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4318:
-

Attachment: HIVE-4318.2.patch

Here's the patch that goes with the proposal



[jira] [Commented] (HIVE-1708) make hive history file configurable

2013-04-12 Thread Nitin Pawar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630609#comment-13630609
 ] 

Nitin Pawar commented on HIVE-1708:
---

I added a new setting to hive-site.xml, made some changes in the CLI code, and 
tested it for making the Hive history file optional. 

I wanted to add one more property for the Hive history file path, but currently 
the path is set to .hivehistory inside each individual user's home directory. 
If I have to retain this behaviour, how do I express the default value in 
hive-site.xml? Since users will have different home directories on different 
Linux distributions, how do we default the path then? 

Can we change the file path to something like the log location, which resides 
inside /tmp? Is that an acceptable change? 
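One way to avoid hard-coding a per-user default in hive-site.xml is to leave the property unset and resolve the default at run time. The sketch below shows that fallback in Python for brevity (the CLI itself is Java). The function name `resolve_history_path` and the notion of an optional configured value are assumptions for illustration, not an existing Hive setting.

```python
import os
from typing import Optional


def resolve_history_path(configured: Optional[str] = None) -> str:
    """Use the configured path if one is set, else fall back to ~/.hivehistory."""
    if configured:
        return configured
    # expanduser("~") resolves the current user's home directory at run time,
    # so no single default has to be written into a shared config file.
    return os.path.join(os.path.expanduser("~"), ".hivehistory")


default_path = resolve_history_path()
custom_path = resolve_history_path("/tmp/hive/history")
print(default_path)
print(custom_path)
```

With this shape, an unset or empty property keeps today's per-user behaviour, while an explicit value such as a location under /tmp overrides it.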

 make hive history file configurable
 ---

 Key: HIVE-1708
 URL: https://issues.apache.org/jira/browse/HIVE-1708
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain

 Currently, it is derived from
 System.getProperty("user.home") + "/.hivehistory";



[jira] [Updated] (HIVE-3129) Create windows native scripts (CMD files) to run hive on windows without Cygwin

2013-04-12 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-3129:
--

Attachment: HIVE-3129.1.patch

The attached patch contains the Windows-native command scripts, which can run 
Hive on Windows without Cygwin. We are attaching this patch because we already 
have the Windows-specific command scripts. We will deal with the unification of 
scripts in a separate JIRA. Additionally, unit test scripts will follow.

 Create windows native scripts (CMD files)  to run hive on windows without 
 Cygwin
 

 Key: HIVE-3129
 URL: https://issues.apache.org/jira/browse/HIVE-3129
 Project: Hive
  Issue Type: Bug
  Components: CLI, Windows
Reporter: Kanna Karanam
  Labels: Windows
 Attachments: HIVE-3129.1.patch


 Create the cmd files equivalent to 
 a)Bin\hive
 b)Bin\hive-config.sh
 c)Bin\Init-hive-dfs.sh
 d)Bin\ext\cli.sh
 e)Bin\ext\debug.sh
 f)Bin\ext\help.sh
 g)Bin\ext\hiveserver.sh
 h)Bin\ext\jar.sh
 i)Bin\ext\hwi.sh
 j)Bin\ext\lineage.sh
 k)Bin\ext\metastore.sh
 l)Bin\ext\rcfilecat.sh



[jira] [Updated] (HIVE-3129) Create windows native scripts (CMD files) to run hive on windows without Cygwin

2013-04-12 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-3129:
--

Affects Version/s: 0.11.0
   Status: Patch Available  (was: Open)

 Create windows native scripts (CMD files)  to run hive on windows without 
 Cygwin
 

 Key: HIVE-3129
 URL: https://issues.apache.org/jira/browse/HIVE-3129
 Project: Hive
  Issue Type: Bug
  Components: CLI, Windows
Affects Versions: 0.11.0
Reporter: Kanna Karanam
  Labels: Windows
 Attachments: HIVE-3129.1.patch




Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #345

2013-04-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/345/

--
[...truncated 36424 lines...]
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2013-04-12_13-54-44_549_3988602030308142448/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201304121354_725414661.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/jenkins/hive_2013-04-12_13-54-48_381_7641344880379777334/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2013-04-12_13-54-48_381_7641344880379777334/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201304121354_1471248745.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201304121354_1742716093.txt
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201304121354_1105327587.txt
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: 

[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630637#comment-13630637
 ] 

Kevin Wilfong commented on HIVE-4318:
-

It's not clear to me that we can't cut down the cost added by operator hooks 
when there are no operator hooks present to the point where it does not 
significantly affect performance.

Pam, could you provide Gunther a patch which sets the list of operator hooks to
null rather than the empty list, and initializes the OperatorHookContext in the
calls to enterOperatorHooks and exitOperatorHooks only after checking that the
list is not null?  This should limit the impact of operator hooks to two method
calls and two null checks.  We could even put the check
this.operatorHooks == null around the method calls themselves, in case the Java
compiler isn't inlining them for some reason.
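
A minimal sketch of the null-list idea described above, not the actual patch.
All class and member names here (HookContextSketch, HookSketch, OperatorSketch,
contextsCreated) are hypothetical stand-ins and are not Hive's real classes; the
counter exists only to show that no context object is allocated when no hooks
are registered.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for OperatorHookContext. */
final class HookContextSketch {
    final Object row;
    final int tag;

    HookContextSketch(Object row, int tag) {
        this.row = row;
        this.tag = tag;
    }
}

/** Hypothetical hook interface; not Hive's real OperatorHook API. */
interface HookSketch {
    void enter(HookContextSketch ctx);
    void exit(HookContextSketch ctx);
}

class OperatorSketch {
    // null (rather than an empty list) means "no hooks registered".
    private List<HookSketch> hooks = null;

    int rowsProcessed = 0;    // stands in for the real per-row work
    int contextsCreated = 0;  // instrumentation for this sketch only

    void addHook(HookSketch hook) {
        if (hooks == null) {
            hooks = new ArrayList<>();
        }
        hooks.add(hook);
    }

    void process(Object row, int tag) {
        enterHooks(row, tag);
        rowsProcessed++;      // placeholder for processOp(row, tag)
        exitHooks(row, tag);
    }

    private void enterHooks(Object row, int tag) {
        if (hooks == null) {
            return;           // fast path: one null check, no allocation
        }
        HookContextSketch ctx = newContext(row, tag);
        for (HookSketch hook : hooks) {
            hook.enter(ctx);
        }
    }

    private void exitHooks(Object row, int tag) {
        if (hooks == null) {
            return;           // fast path: one null check, no allocation
        }
        HookContextSketch ctx = newContext(row, tag);
        for (HookSketch hook : hooks) {
            hook.exit(ctx);
        }
    }

    private HookContextSketch newContext(Object row, int tag) {
        contextsCreated++;
        return new HookContextSketch(row, tag);
    }
}
```

With no hooks registered, each call to process costs two method calls and two
null checks, which is the bound mentioned above. Whether that is cheap enough
in practice is exactly what the measurements in this thread are meant to show.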

If after that, they still introduce a substantial amount of overhead, there's 
not much more we can do, and I'd be ok with removing operator hooks. 

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch


 Operator Hooks inserted into Operator.java cause a performance hit even when
 they are not being used.
 For a count(1) query tested with and without the operator hook calls:
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have
 to forward the row: the more operators there are, the slower the query.
 The modification made to test this was:
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
        return;
      }
      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
 -    preProcessCounter();
 -    enterOperatorHooks(opHookContext);
 +    //preProcessCounter();
 +    //enterOperatorHooks(opHookContext);
      processOp(row, tag);
 -    exitOperatorHooks(opHookContext);
 -    postProcessCounter();
 +    //exitOperatorHooks(opHookContext);
 +    //postProcessCounter();
    }
 {code}



[jira] [Updated] (HIVE-4284) Implement class for vectorized row batch

2013-04-12 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4284:
--

Summary: Implement class for vectorized row batch  (was: Implement class 
for vectorized row group.)

 Implement class for vectorized row batch
 

 Key: HIVE-4284
 URL: https://issues.apache.org/jira/browse/HIVE-4284
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Eric Hanson

 Vectorized row group object will represent the row group that vectorized 
 operators will work on.



[jira] [Commented] (HIVE-4296) ant thriftif fails on hcatalog

2013-04-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630687#comment-13630687
 ] 

Ashutosh Chauhan commented on HIVE-4296:


Thanks, Travis, for the review. Committed to trunk. Thanks, Roshan!


 ant thriftif  fails on  hcatalog
 

 Key: HIVE-4296
 URL: https://issues.apache.org/jira/browse/HIVE-4296
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.10.0
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4296.patch






[jira] [Resolved] (HIVE-4351) Thrift code generation fails due to hcatalog

2013-04-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4351.


   Resolution: Duplicate
Fix Version/s: 0.12.0

Fixed via HIVE-4296

 Thrift code generation fails due to hcatalog
 

 Key: HIVE-4351
 URL: https://issues.apache.org/jira/browse/HIVE-4351
 Project: Hive
  Issue Type: Bug
  Components: Thrift API
Affects Versions: 0.11.0
Reporter: Gang Tim Liu
Assignee: Ashutosh Chauhan
 Fix For: 0.12.0


 Thrift code generation fails because hcatalog doesn't define the target thriftif:
 ant thriftif -Dthrift.home=/usr/local
 .
 BUILD FAILED
 
 Target thriftif does not exist in the project hcatalog. 



[jira] [Commented] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-12 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630689#comment-13630689
 ] 

Gang Tim Liu commented on HIVE-4322:


Committed. Thanks, Samuel Yuan.

 SkewedInfo in Metastore Thrift API cannot be deserialized in Python
 ---

 Key: HIVE-4322
 URL: https://issues.apache.org/jira/browse/HIVE-4322
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Attachments: HIVE-4322.HIVE-4322.HIVE-4322.HIVE-4322.D10203.1.patch


 The Thrift-generated Python code that deserializes Thrift objects fails 
 whenever a complex type is used as a map key, because by default mutable 
 Python objects such as lists do not have a hash function. See 
 https://issues.apache.org/jira/browse/THRIFT-162 for related discussion.
 The SkewedInfo struct contains a map which uses a list as a key, breaking the 
 Python Thrift interface. It is not possible to specify the mapping from 
 Thrift types to Python types, or otherwise we could map Thrift lists to 
 Python tuples. Instead, the proposed workaround wraps the list inside a new 
 struct. This alone does not accomplish anything, but allows Python clients to 
 define a hash function for the struct class, e.g.:
 def f(object):
     return hash(tuple(object.skewedValueList))
 SkewedValueList.__hash__ = f
 In practice a more efficient hash might be defined that does not involve 
 copying the list. The advantage of wrapping the list inside a struct is that 
 the client does not have to define the hash on the list itself, which would 
 change the behaviour of lists everywhere else in the code.



[jira] [Updated] (HIVE-4322) SkewedInfo in Metastore Thrift API cannot be deserialized in Python

2013-04-12 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-4322:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

 SkewedInfo in Metastore Thrift API cannot be deserialized in Python
 ---

 Key: HIVE-4322
 URL: https://issues.apache.org/jira/browse/HIVE-4322
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Fix For: 0.11.0

 Attachments: HIVE-4322.HIVE-4322.HIVE-4322.HIVE-4322.D10203.1.patch




[jira] [Updated] (HIVE-4296) ant thriftif fails on hcatalog

2013-04-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4296:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

 ant thriftif  fails on  hcatalog
 

 Key: HIVE-4296
 URL: https://issues.apache.org/jira/browse/HIVE-4296
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.10.0
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.12.0

 Attachments: HIVE-4296.patch






[jira] [Commented] (HIVE-4351) Thrift code generation fails due to hcatalog

2013-04-12 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630691#comment-13630691
 ] 

Gang Tim Liu commented on HIVE-4351:


Thanks very much, [~ashutoshc].

 Thrift code generation fails due to hcatalog
 

 Key: HIVE-4351
 URL: https://issues.apache.org/jira/browse/HIVE-4351
 Project: Hive
  Issue Type: Bug
  Components: Thrift API
Affects Versions: 0.11.0
Reporter: Gang Tim Liu
Assignee: Ashutosh Chauhan
 Fix For: 0.12.0




[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630703#comment-13630703
 ] 

Gunther Hagleitner commented on HIVE-4318:
--

What I was trying to say is that you'll end up in the exact same place that you
are with removing counters vs. fixing counters: two method calls, two null
checks. And from my testing there *is* still overhead (29.3 vs. 27.9). If you
think that's not a valid conclusion, I'll rerun the stuff, but otherwise we
should just skip that step.

Am I missing something?

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch




[jira] [Updated] (HIVE-4344) CREATE VIEW fails when redundant casts are rewritten

2013-04-12 Thread Samuel Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samuel Yuan updated HIVE-4344:
--

Description: 
e.g. create view v as select cast(key as string) from src;

The rewriter tries to replace both cast(key as string) and key as `src`.`key`, 
because cast(key as string) is a no-op.

There may be other cases like this one.

See HIVE-2439 for context.

  was:
e.g. create view v as select cast(key as string) from src;

The rewriter tries to replace both cast(key as string) and key as `src`.`key`, 
because cast(key as string) is a no-op.

There may be other cases like this one.


 CREATE VIEW fails when redundant casts are rewritten
 

 Key: HIVE-4344
 URL: https://issues.apache.org/jira/browse/HIVE-4344
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan



[jira] [Updated] (HIVE-4344) CREATE VIEW fails when redundant casts are rewritten

2013-04-12 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4344:
--

Attachment: HIVE-4344.HIVE-4344.HIVE-4344.HIVE-4344.D10221.1.patch

sxyuan requested code review of HIVE-4344 [jira] CREATE VIEW fails when 
redundant casts are rewritten.

Reviewers: kevinwilfong

See JIRA for a description of the problem. This change relaxes the constraints
on translations. Previously, if a new translation overlapped with an existing
one, one had to be a prefix or suffix of the other. This change also allows the
case where one is completely contained inside the other.
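
The relaxed rule can be sketched roughly as follows. This is an illustration
only, not the code in UnparseTranslator.java: SpanSketch and
TranslationRulesSketch are hypothetical names, a translation is modelled as an
inclusive range of token indices, and the sketch covers only whether two
overlapping translations are accepted, not how they are applied afterwards.

```java
/** Hypothetical inclusive token range covered by one translation. */
final class SpanSketch {
    final int start;
    final int stop;

    SpanSketch(int start, int stop) {
        this.start = start;
        this.stop = stop;
    }

    boolean overlaps(SpanSketch other) {
        return start <= other.stop && other.start <= stop;
    }

    /** True if other lies entirely inside this span (edges may coincide). */
    boolean contains(SpanSketch other) {
        return start <= other.start && other.stop <= stop;
    }

    /** True if this span is inside other and shares its first or last token. */
    boolean isPrefixOrSuffixOf(SpanSketch other) {
        return other.contains(this) && (start == other.start || stop == other.stop);
    }
}

final class TranslationRulesSketch {
    /** Old rule: overlapping spans must be prefix or suffix of one another. */
    static boolean allowedBefore(SpanSketch a, SpanSketch b) {
        if (!a.overlaps(b)) {
            return true;
        }
        return a.isPrefixOrSuffixOf(b) || b.isPrefixOrSuffixOf(a);
    }

    /** Relaxed rule: any full containment is accepted as well. */
    static boolean allowedAfter(SpanSketch a, SpanSketch b) {
        if (!a.overlaps(b)) {
            return true;
        }
        return a.contains(b) || b.contains(a);
    }

    public static void main(String[] args) {
        // Made-up token positions: cast(key as string) spans 2..7, key is token 4.
        SpanSketch cast = new SpanSketch(2, 7);
        SpanSketch key = new SpanSketch(4, 4);
        System.out.println("old rule accepts: " + allowedBefore(cast, key));
        System.out.println("new rule accepts: " + allowedAfter(cast, key));
    }
}
```

In the cast(key as string) case the inner key sits in the middle of the outer
expression, so it is neither a prefix nor a suffix and only the relaxed rule
accepts it. Spans that partially overlap without containment are still rejected
under both rules in this sketch.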

TEST PLAN
  Run tests.

REVISION DETAIL
  https://reviews.facebook.net/D10221

AFFECTED FILES
  ql/src/test/results/clientpositive/create_view_translate.q.out
  ql/src/test/queries/clientpositive/create_view_translate.q
  ql/src/java/org/apache/hadoop/hive/ql/parse/UnparseTranslator.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/24441/

To: kevinwilfong, sxyuan
Cc: sambavim, JIRA


 CREATE VIEW fails when redundant casts are rewritten
 

 Key: HIVE-4344
 URL: https://issues.apache.org/jira/browse/HIVE-4344
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
 Attachments: HIVE-4344.HIVE-4344.HIVE-4344.HIVE-4344.D10221.1.patch




[jira] [Updated] (HIVE-4344) CREATE VIEW fails when redundant casts are rewritten

2013-04-12 Thread Samuel Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samuel Yuan updated HIVE-4344:
--

Status: Patch Available  (was: Open)

 CREATE VIEW fails when redundant casts are rewritten
 

 Key: HIVE-4344
 URL: https://issues.apache.org/jira/browse/HIVE-4344
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
 Attachments: HIVE-4344.HIVE-4344.HIVE-4344.HIVE-4344.D10221.1.patch




[jira] [Updated] (HIVE-4320) Consider extending max limit for precision to 38

2013-04-12 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4320:
-

Attachment: HIVE-4320.1.patch

 Consider extending max limit for precision to 38
 

 Key: HIVE-4320
 URL: https://issues.apache.org/jira/browse/HIVE-4320
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4320.1.patch


 Max precision of 38 still fits in 128 bits. It changes the way you do math on
 these numbers though. Need to see if there will be perf implications, but
 there's a strong case to support 38 (instead of 36) to comply with other DBs
 (Oracle, SQL Server, Teradata).
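
 The size claim can be checked with a few lines of Java. This is a sketch
 under the assumption that the unscaled decimal value is held in a signed
 128-bit integer (127 magnitude bits plus a sign bit); the description does not
 say which representation the patch uses, and DecimalPrecisionSketch is a
 hypothetical name.

```java
import java.math.BigInteger;

final class DecimalPrecisionSketch {
    /** Bits needed for the largest value with the given number of decimal digits. */
    static int magnitudeBits(int digits) {
        // Largest value with `digits` decimal digits is 10^digits - 1.
        return BigInteger.TEN.pow(digits).subtract(BigInteger.ONE).bitLength();
    }

    /** True if every value with that many digits fits in 127 bits plus a sign bit. */
    static boolean fitsInSigned128(int digits) {
        return magnitudeBits(digits) <= 127;
    }

    public static void main(String[] args) {
        for (int digits : new int[] {36, 38, 39}) {
            System.out.println(digits + " digits need " + magnitudeBits(digits)
                    + " bits, fits in signed 128: " + fitsInSigned128(digits));
        }
    }
}
```

 Under that assumption 36 digits need 120 bits, 38 digits need 127 bits, and
 39 digits need 130 bits, so 38 is the largest precision that fits.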



[jira] [Commented] (HIVE-4320) Consider extending max limit for precision to 38

2013-04-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630754#comment-13630754
 ] 

Gunther Hagleitner commented on HIVE-4320:
--

Review: https://reviews.facebook.net/D10227

 Consider extending max limit for precision to 38
 

 Key: HIVE-4320
 URL: https://issues.apache.org/jira/browse/HIVE-4320
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4320.1.patch




[jira] [Updated] (HIVE-4315) enable doAs in unsecure mode for hive server2, when MR job runs locally

2013-04-12 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4315:


Attachment: HIVE-4315.1.patch

 enable doAs in unsecure mode for hive server2, when MR job runs locally
 ---

 Key: HIVE-4315
 URL: https://issues.apache.org/jira/browse/HIVE-4315
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.11.0

 Attachments: HIVE-4315.1.patch


 When an MR job is run locally by Hive (instead of on a Hadoop cluster), the MR
 job ends up running as the hiveserver user instead of the user submitting the
 query, even if the doAs configuration is enabled.



[jira] [Updated] (HIVE-4130) Bring the Lead/Lag UDFs interface in line with Lead/Lag UDAFs

2013-04-12 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4130:
--

Attachment: HIVE-4130.D10233.1.patch

hbutani requested code review of HIVE-4130 [jira] Bring the Lead/Lag UDFs 
interface in line with Lead/Lag UDAFs.

Reviewers: JIRA, ashutoshc

support default vals for Lead/Lag UDFs

support a default value arg
both amt and defaultValue args can be optional
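
The intended semantics might look roughly like the sketch below. It is not the
code in GenericUDFLeadLag.java: LeadLagSketch and its methods are hypothetical,
rows are modelled as a plain list, and the fallbacks used when an argument is
omitted (an offset of 1 and a null default value) are assumptions borrowed from
the usual SQL LEAD/LAG convention rather than anything stated in this review.

```java
import java.util.Arrays;
import java.util.List;

final class LeadLagSketch {
    /** Value amt rows before current, or defaultValue if that row does not exist. */
    static <T> T lag(List<T> rows, int current, Integer amt, T defaultValue) {
        int offset = (amt == null) ? 1 : amt;  // assumed fallback when amt is omitted
        return valueAt(rows, current - offset, defaultValue);
    }

    /** Value amt rows after current, or defaultValue if that row does not exist. */
    static <T> T lead(List<T> rows, int current, Integer amt, T defaultValue) {
        int offset = (amt == null) ? 1 : amt;  // assumed fallback when amt is omitted
        return valueAt(rows, current + offset, defaultValue);
    }

    /** Both optional arguments omitted. */
    static <T> T lag(List<T> rows, int current) {
        return lag(rows, current, null, null);
    }

    /** Both optional arguments omitted. */
    static <T> T lead(List<T> rows, int current) {
        return lead(rows, current, null, null);
    }

    private static <T> T valueAt(List<T> rows, int index, T defaultValue) {
        return (index >= 0 && index < rows.size()) ? rows.get(index) : defaultValue;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(10, 20, 30);
        System.out.println(lag(values, 1));          // previous row exists
        System.out.println(lag(values, 0, 1, -1));   // no previous row, default used
        System.out.println(lead(values, 0, 2, -1));  // two rows ahead exists
    }
}
```

The two-argument overloads correspond to calling the function with both amt
and defaultValue left out.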

TEST PLAN
  existing tests

REVISION DETAIL
  https://reviews.facebook.net/D10233

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java
  ql/src/test/queries/clientpositive/windowing_expressions.q
  ql/src/test/results/clientpositive/windowing_expressions.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/24459/

To: JIRA, ashutoshc, hbutani


 Bring the Lead/Lag UDFs interface in line with Lead/Lag UDAFs
 -

 Key: HIVE-4130
 URL: https://issues.apache.org/jira/browse/HIVE-4130
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4130.D10233.1.patch


 - support a default value arg
 - both amt and defaultValue args can be optional



[jira] [Updated] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Pamela Vagata (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pamela Vagata updated HIVE-4318:


Attachment: HIVE-4318.patch.pam.txt

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, 
 HIVE-4318.patch.pam.txt




[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630814#comment-13630814
 ] 

Kevin Wilfong commented on HIVE-4318:
-

I'm just really surprised that a couple of null checks increase the amount of
time by ~5%, especially given that we do maybe 4 null checks in the
FileSinkOperator's process method alone.

Of course, I can't argue with facts, so if you could try such a patch once it's 
available and post your results I'd really appreciate it.

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, 
 HIVE-4318.patch.pam.txt




[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Pamela Vagata (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630820#comment-13630820
 ] 

Pamela Vagata commented on HIVE-4318:
-

I agree - I've just posted the patch, it would be great if you could post some 
results with this one too :)



[jira] [Updated] (HIVE-4261) union_remove_10 is failing on hadoop2 with assertion (root task with non-empty set of parents)

2013-04-12 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4261:
-

Attachment: HIVE-4261.2.patch

 union_remove_10 is failing on hadoop2 with assertion (root task with 
 non-empty set of parents)
 --

 Key: HIVE-4261
 URL: https://issues.apache.org/jira/browse/HIVE-4261
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Critical
 Fix For: 0.11.0

 Attachments: HIVE-4261.1.patch, HIVE-4261.2.patch


 Output seems to indicate that the stage plan is broken.



[jira] [Commented] (HIVE-4261) union_remove_10 is failing on hadoop2 with assertion (root task with non-empty set of parents)

2013-04-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630821#comment-13630821
 ] 

Gunther Hagleitner commented on HIVE-4261:
--

Thank you [~navis]! New patch is on phabricator. Still running tests, will 
report back with results.




[jira] [Updated] (HIVE-4347) Hcatalog build fail on Windows because javadoc command exceed length limit

2013-04-12 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-4347:
-

Fix Version/s: 0.11.0
   Status: Patch Available  (was: Open)

 Hcatalog build fail on Windows because javadoc command exceed length limit
 --

 Key: HIVE-4347
 URL: https://issues.apache.org/jira/browse/HIVE-4347
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure, HCatalog, Windows
Affects Versions: 0.11.0
 Environment: Windows 8
Reporter: Shuaishuai Nie
  Labels: build, patch
 Fix For: 0.11.0

 Attachments: HIVE-4347.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 When building HCatalog on Windows 8, the build fails with:
 HIVE_DIR\hcatalog\build.xml:213: Javadoc failed: java.io.IOException: Cannot 
 run program JAVA_HOME\bin\javadoc.exe: CreateProcess error=206, The filename 
 or extension is too long
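
 Error 206 means the command line handed to CreateProcess was too long. One 
 general workaround, which may or may not be what the attached patch does, is 
 to move the long argument list into a file and pass it to javadoc with the 
 @argfile syntax that the javadoc tool accepts. The sketch below illustrates 
 this in Java; JavadocArgFile and shortCommand are hypothetical names, and 
 arguments containing spaces would need quoting that the sketch does not handle.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: shortens a javadoc invocation by using an argument file.
class JavadocArgFile {

    // Writes one argument per line to a temp file and returns a two-element
    // command: the javadoc executable followed by "@<path to argument file>".
    static List<String> shortCommand(String javadocExe, List<String> longArgs) {
        try {
            Path argFile = Files.createTempFile("javadoc-args", ".txt");
            Files.write(argFile, longArgs, StandardCharsets.UTF_8);
            return Arrays.asList(javadocExe, "@" + argFile.toString());
        } catch (IOException e) {
            // Rethrow unchecked so callers need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> cmd = shortCommand("javadoc",
                Arrays.asList("-d", "docs", "Example.java"));
        // The resulting command could be handed to ProcessBuilder.
        System.out.println(cmd);
    }
}
```

 The command stays short no matter how many source files or classpath entries 
 are listed, because they live in the file rather than on the command line.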



[jira] [Updated] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character

2013-04-12 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-4348:
-

Status: Patch Available  (was: Open)

 Unit test compile fail at hbase-handler project on Windows becuase of illegal 
 escape character
 --

 Key: HIVE-4348
 URL: https://issues.apache.org/jira/browse/HIVE-4348
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler, Testing Infrastructure, Windows
Affects Versions: 0.11.0
 Environment: Windows 8
Reporter: Shuaishuai Nie
 Attachments: HIVE-4348.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 The problem is that the automatically generated test case hardcodes the query 
 file's path string with a single \ where an escaped \\ is needed, so on 
 Windows the compiler sees an illegal escape character. The change should be 
 made in TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm.
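
 To illustrate the escaping problem: a Windows path pasted verbatim into 
 generated Java source puts sequences such as \h or \q inside a string 
 literal, which the compiler rejects. A minimal sketch of the escaping step 
 follows; PathLiteral and toJavaLiteral are hypothetical names, the example 
 path is made up, and the actual fix in the .vm templates may look different.

```java
// Hypothetical helper showing how a raw path can be turned into a valid
// Java string literal before it is written into generated test source.
class PathLiteral {

    // Doubles every backslash and escapes double quotes, then wraps the
    // result in quotes so it can be emitted directly into Java source code.
    static String toJavaLiteral(String rawPath) {
        String escaped = rawPath
                .replace("\\", "\\\\")  // \  becomes \\
                .replace("\"", "\\\""); // "  becomes \"
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // At runtime this path contains single backslashes.
        String raw = "C:\\hive\\queries\\q1.q";
        System.out.println(toJavaLiteral(raw));
    }
}
```

 An alternative that sidesteps escaping is to emit forward slashes in the 
 generated path, since Java file APIs on Windows generally accept them.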


