[jira] [Created] (HIVE-4475) Switch RCFile default to LazyBinaryColumnarSerDe
Gunther Hagleitner created HIVE-4475: Summary: Switch RCFile default to LazyBinaryColumnarSerDe Key: HIVE-4475 URL: https://issues.apache.org/jira/browse/HIVE-4475 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner For most workloads it seems LazyBinaryColumnarSerDe (binary) will perform better than ColumnarSerDe (text). Not sure why ColumnarSerDe is the default, but my guess is that it's for historical reasons. I suggest switching the default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4440) SMB Operator spills to disk like it's 1999
[ https://issues.apache.org/jira/browse/HIVE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4440: - Attachment: HIVE-4440.2.patch SMB Operator spills to disk like it's 1999 -- Key: HIVE-4440 URL: https://issues.apache.org/jira/browse/HIVE-4440 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4440.1.patch, HIVE-4440.2.patch I was recently looking into a performance issue with a query that used SMB join and was running really slowly. It turns out that the SMB join by default caches only 100 values per key before spilling to disk. That seems overly conservative to me. Changing the parameter resulted in a ~5x speedup - quite significant. The parameter is: hive.mapjoin.bucket.cache.size, which right now is only used by the SMB Operator as far as I can tell. The parameter was introduced originally (3 yrs ago) for the map join operator (looks like pre-SMB) and set to 100 to avoid OOM. That seems to have been in a different context, though, where you had to avoid running out of memory with the cached hash table in the same process, I think. Two things I'd like to propose: a) Rename it to what it does: hive.smbjoin.cache.rows b) Set it to something less restrictive: 1 If you string together a 5 table smb join with a map join and a map-side group by aggregation you might still run out of memory, but the renamed parameter should be easier to find and reduce. For most queries, I would think that 1 is still a reasonable number to cache (on the reduce side we use 25000 for shuffle joins).
[jira] [Commented] (HIVE-4440) SMB Operator spills to disk like it's 1999
[ https://issues.apache.org/jira/browse/HIVE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647395#comment-13647395 ] Gunther Hagleitner commented on HIVE-4440: -- Thanks :-) Patch .2 honors the old parameter unless it's at the default, in which case it uses the new one. I also put documentation around it. You bring up a good point, but are you sure it's necessary to support both in this case? It's just slightly ugly in the code and means we have to go back in later to remove it. My thinking is this: If you use the old parameter, it's probably because you needed to up it to get better performance - in that case the new default should most likely be OK for you. Do you think there are going to be cases where this falls flat? SMB Operator spills to disk like it's 1999 -- Key: HIVE-4440 URL: https://issues.apache.org/jira/browse/HIVE-4440 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4440.1.patch, HIVE-4440.2.patch I was recently looking into a performance issue with a query that used SMB join and was running really slowly. It turns out that the SMB join by default caches only 100 values per key before spilling to disk. That seems overly conservative to me. Changing the parameter resulted in a ~5x speedup - quite significant. The parameter is: hive.mapjoin.bucket.cache.size, which right now is only used by the SMB Operator as far as I can tell. The parameter was introduced originally (3 yrs ago) for the map join operator (looks like pre-SMB) and set to 100 to avoid OOM. That seems to have been in a different context, though, where you had to avoid running out of memory with the cached hash table in the same process, I think.
Two things I'd like to propose: a) Rename it to what it does: hive.smbjoin.cache.rows b) Set it to something less restrictive: 1 If you string together a 5 table smb join with a map join and a map-side group by aggregation you might still run out of memory, but the renamed parameter should be easier to find and reduce. For most queries, I would think that 1 is still a reasonable number to cache (on the reduce side we use 25000 for shuffle joins).
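The backward-compatibility behavior described in the comment above (honor the legacy hive.mapjoin.bucket.cache.size only when the user changed it from its default, otherwise use the new hive.smbjoin.cache.rows) can be sketched roughly as follows. The class and method names are illustrative, and the values passed in the usage below are hypothetical, not taken from the actual patch:

```java
public class SmbCacheSizeFallback {
    // Legacy default of hive.mapjoin.bucket.cache.size, per the discussion above.
    static final int OLD_DEFAULT = 100;

    // If the user explicitly changed the legacy parameter, keep honoring it;
    // otherwise fall through to the new hive.smbjoin.cache.rows value.
    static int resolveCacheRows(int oldParamValue, int newParamValue) {
        return (oldParamValue != OLD_DEFAULT) ? oldParamValue : newParamValue;
    }
}
```

A user who had raised the old parameter to, say, 500 keeps that value; a user who never touched it gets whatever the new parameter says.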
[jira] [Commented] (HIVE-335) External Tables should have the option to be marked Read Only
[ https://issues.apache.org/jira/browse/HIVE-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647542#comment-13647542 ] Michael Koehnlein commented on HIVE-335: This would be useful for me, too. We have data on HDFS that belongs to a system user account, and our normal users should be able to analyze it as an external table. As it is now, the users would need HDFS write permissions on the data directory if they want to create an external table for that directory themselves, although they really only need read permissions. Of course that's not a big obstacle, since we can just let the system user create the external table. It certainly would be nice to get pure read access via external tables, though. External Tables should have the option to be marked Read Only - Key: HIVE-335 URL: https://issues.apache.org/jira/browse/HIVE-335 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor Reporter: Richard Lee When creating an External Table, it'd be awesome to have the option of NOT allowing writes to it (disallow any INSERTs, or UPDATEs if hive ever allows them). Adding and Dropping Partitions should still be allowed. This will enable hive to play well with external data stores other than hdfs where data should be non-malleable. I'd recommend the following syntax, which applies ONLY to external tables: CREATE EXTERNAL [READONLY] TABLE ...
[jira] [Commented] (HIVE-4471) Build fails with hcatalog checkstyle error
[ https://issues.apache.org/jira/browse/HIVE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647646#comment-13647646 ] Ashutosh Chauhan commented on HIVE-4471: +1. [~traviscrawford] would you like to take a look? Build fails with hcatalog checkstyle error -- Key: HIVE-4471 URL: https://issues.apache.org/jira/browse/HIVE-4471 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4471.1.patch, HIVE-4471.2.patch This is the output: checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 412 files [checkstyle] /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/hcatalog/src/test/.gitignore:1: Missing a header - not enough lines in file. BUILD FAILED /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build.xml:296: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build.xml:298: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/hcatalog/build.xml:109: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/hcatalog/build-support/ant/checkstyle.xml:32: Got 1 errors and 0 warnings.
[jira] [Commented] (HIVE-4421) Improve memory usage by ORC dictionaries
[ https://issues.apache.org/jira/browse/HIVE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647660#comment-13647660 ] Phabricator commented on HIVE-4421: --- ashutoshc has accepted the revision HIVE-4421 [jira] Improve memory usage by ORC dictionaries. +1 will commit if tests pass. REVISION DETAIL https://reviews.facebook.net/D10545 BRANCH h-4421 ARCANIST PROJECT hive To: JIRA, ashutoshc, omalley Improve memory usage by ORC dictionaries Key: HIVE-4421 URL: https://issues.apache.org/jira/browse/HIVE-4421 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.11.0 Attachments: HIVE-4421.D10545.1.patch, HIVE-4421.D10545.2.patch, HIVE-4421.D10545.3.patch, HIVE-4421.D10545.4.patch Currently, for tables with many string columns, it is possible to significantly underestimate the memory used by the ORC dictionaries and cause the query to run out of memory in the task.
[jira] [Updated] (HIVE-4455) HCatalog build directories get included in tar file produced by ant tar
[ https://issues.apache.org/jira/browse/HIVE-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4455: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed trunk version as well. Thanks, Alan! HCatalog build directories get included in tar file produced by ant tar - Key: HIVE-4455 URL: https://issues.apache.org/jira/browse/HIVE-4455 Project: Hive Issue Type: Bug Components: Build Infrastructure, HCatalog Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.11.0 Attachments: buildbloat.patch, HIVE-4455.patch, HIVE-4455-trunk.patch The excludes in the tar target aren't properly excluding the build directories in HCatalog
[jira] [Updated] (HIVE-4461) hcatalog jars not getting published to maven repo
[ https://issues.apache.org/jira/browse/HIVE-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4461: --- Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) Marking this as resolved, as per Alan's comments. hcatalog jars not getting published to maven repo - Key: HIVE-4461 URL: https://issues.apache.org/jira/browse/HIVE-4461 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Ashutosh Chauhan Assignee: Alan Gates Fix For: 0.11.0 Attachments: HIVE-4461.patch
[jira] [Commented] (HIVE-4392) Illogical InvalidObjectException thrown when using multiple aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647690#comment-13647690 ] Ashutosh Chauhan commented on HIVE-4392: OK. Let's go ahead with this patch then. [~navis] Do you want to update the patch with these tests or shall I go ahead with testing it for commit? Illogical InvalidObjectException thrown when using multiple aggregate functions with star columns -- Key: HIVE-4392 URL: https://issues.apache.org/jira/browse/HIVE-4392 Project: Hive Issue Type: Bug Components: Query Processor Environment: Apache Hadoop 0.20.1 Apache Hive Trunk Reporter: caofangkun Assignee: Navis Priority: Minor Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch For Example: hive (default)> create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total MapReduce CPU Time Spent: 0 msec hive (default)> create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following two queries work: hive (default)> create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0 2013-04-22 11:15:00,681
Stage-1 map = 0%, reduce = 0% 2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0006 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive-scratchdir/hive_2013-04-22_11-14-54_632_6709035018023861094/-ext-10001 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 Table default.liza_1 stats:
[jira] [Resolved] (HIVE-4182) doAS does not work with HiveServer2 in non-kerberos mode with local job
[ https://issues.apache.org/jira/browse/HIVE-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-4182. Resolution: Fixed Fix Version/s: 0.11.0 Fixed via HIVE-4315 doAS does not work with HiveServer2 in non-kerberos mode with local job --- Key: HIVE-4182 URL: https://issues.apache.org/jira/browse/HIVE-4182 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Labels: HiveServer2 Fix For: 0.11.0 Attachments: HIVE-4182.1.patch When HiveServer2 is configured without kerberos security enabled, and the query gets launched as a local map-reduce job, the job runs as the user hive server is running as, instead of the user who submitted the query.
[jira] [Created] (HIVE-4476) HiveMetaStore caches the creation of a default db in a static way
Brock Noland created HIVE-4476: -- Summary: HiveMetaStore caches the creation of a default db in a static way Key: HIVE-4476 URL: https://issues.apache.org/jira/browse/HIVE-4476 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0, 0.11.0 Reporter: Brock Noland Priority: Minor Currently HiveMetaStore.HMSHandler has a static flag set to true if the JVM has ever created a default db: https://github.com/apache/hive/blob/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L176 However, when testing it's nice to be able to create multiple HiveMetastore instances in a single JVM. Perhaps we should add a flag hive.metastore.always.create.default.db or something similar.
[jira] [Commented] (HIVE-4476) HiveMetaStore caches the creation of a default db in a static way
[ https://issues.apache.org/jira/browse/HIVE-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647744#comment-13647744 ] Brock Noland commented on HIVE-4476: Perhaps the use of checkForDefaultDb in that class just needs to be modified. HiveMetaStore caches the creation of a default db in a static way - Key: HIVE-4476 URL: https://issues.apache.org/jira/browse/HIVE-4476 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0, 0.11.0 Reporter: Brock Noland Priority: Minor Currently HiveMetaStore.HMSHandler has a static flag set to true if the JVM has ever created a default db: https://github.com/apache/hive/blob/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L176 However, when testing it's nice to be able to create multiple HiveMetastore instances in a single JVM. Perhaps we should add a flag hive.metastore.always.create.default.db or something similar.
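The static-flag pattern the report describes, together with the proposed override, might look roughly like this. The names are simplified and the override flag is the hypothetical one suggested in the issue, not actual Hive code:

```java
public class HmsHandlerSketch {
    // Static: once any instance creates the default db, every later instance
    // in the same JVM skips creation -- the behavior the report objects to.
    private static boolean defaultDbCreated = false;

    // Hypothetical hive.metastore.always.create.default.db setting.
    private final boolean alwaysCreateDefaultDb;

    public HmsHandlerSketch(boolean alwaysCreateDefaultDb) {
        this.alwaysCreateDefaultDb = alwaysCreateDefaultDb;
    }

    public boolean shouldCreateDefaultDb() {
        if (alwaysCreateDefaultDb) {
            return true; // proposed: tests can force per-instance creation
        }
        if (defaultDbCreated) {
            return false; // current behavior: cached across all instances
        }
        defaultDbCreated = true;
        return true;
    }
}
```

With the flag off, only the first instance in the JVM creates the default db; with the flag on, each new instance would check again, which is what multi-metastore tests need.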
[jira] [Commented] (HIVE-4474) Column access not tracked properly for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647763#comment-13647763 ] Gang Tim Liu commented on HIVE-4474: Running test. Column access not tracked properly for partitioned tables - Key: HIVE-4474 URL: https://issues.apache.org/jira/browse/HIVE-4474 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Samuel Yuan Assignee: Samuel Yuan Attachments: HIVE-4474.1.patch.txt The columns recorded as being accessed are incorrect for partitioned tables. The index of an accessed column is a position in the list of non-partition columns, but a list of all columns is being used right now to do the lookup.
[jira] [Created] (HIVE-4477) remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic
Eric Hanson created HIVE-4477: - Summary: remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic Key: HIVE-4477 URL: https://issues.apache.org/jira/browse/HIVE-4477 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson The same test got ported to two different files.
[jira] [Updated] (HIVE-4448) Fix metastore warehouse incorrect location on Windows in unit tests
[ https://issues.apache.org/jira/browse/HIVE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4448: - Summary: Fix metastore warehouse incorrect location on Windows in unit tests (was: Fix metastore warehouse incorrect path on Windows in unit tests) Fix metastore warehouse incorrect location on Windows in unit tests --- Key: HIVE-4448 URL: https://issues.apache.org/jira/browse/HIVE-4448 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.11.0 Environment: Windows Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4448.1.patch Unit test cases which do not use QTestUtil pass an incompatible Windows path for METASTOREWAREHOUSE to HiveConf, which results in creating the /test/data/warehouse folder in the wrong location on Windows. This folder will not be deleted at the beginning of the unit test, and its content will cause unit tests to fail if the same test case is run repeatedly. The root cause of this problem is that for a path like pfile://C:\hive\build\ql/test/data/warehouse, the C:\hive\build\ part will be parsed as the authority of the path and removed from the path string. The patch will fix this problem and make the unit test results consistent between Windows and Linux.
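The authority-parsing problem described in the report can be reproduced with plain java.net.URI (using forward slashes here, since raw backslashes are not legal URI characters): the drive prefix after the scheme is consumed as the URI authority and dropped from the path.

```java
import java.net.URI;

public class WindowsPathAuthorityDemo {
    public static void main(String[] args) throws Exception {
        // With a Windows-style drive after the scheme, "C:" is parsed as the
        // URI authority component rather than as part of the path.
        URI uri = new URI("pfile://C:/hive/build/ql/test/data/warehouse");
        System.out.println("authority = " + uri.getAuthority()); // the drive prefix
        System.out.println("path      = " + uri.getPath());      // drive prefix is gone
    }
}
```

Hadoop's Path class does its own URI handling on top of this, but the demo shows the same drive-prefix-as-authority confusion the report attributes the bug to.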
[jira] [Updated] (HIVE-4477) remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic
[ https://issues.apache.org/jira/browse/HIVE-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4477: -- Attachment: HIVE-4477.1.patch remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic -- Key: HIVE-4477 URL: https://issues.apache.org/jira/browse/HIVE-4477 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4477.1.patch The same test got ported to two different files.
[jira] [Updated] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3959: --- Attachment: HIVE-3959.patch.11.txt Update Partition Statistics in Metastore Layer -- Key: HIVE-3959 URL: https://issues.apache.org/jira/browse/HIVE-3959 Project: Hive Issue Type: Improvement Components: Metastore, Statistics Reporter: Bhushan Mandhani Assignee: Gang Tim Liu Priority: Minor Attachments: HIVE-3959.patch.1, HIVE-3959.patch.11.txt, HIVE-3959.patch.2, HIVE-3959.patch.9.txt When partitions are created using queries (insert overwrite and insert into), the StatsTask updates all stats. However, when partitions are added directly through metadata-only operations (either the CLI or direct calls to the Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not. We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops.
Review Request: remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10906/ --- Review request for hive. Description --- remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic This addresses bug HIVE-4477. https://issues.apache.org/jira/browse/HIVE-4477 Diffs - ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorFilterOperator.java 3ad6c7f Diff: https://reviews.apache.org/r/10906/diff/ Testing --- Thanks, Eric Hanson
[jira] [Updated] (HIVE-4477) remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic
[ https://issues.apache.org/jira/browse/HIVE-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4477: -- Status: Patch Available (was: Open) remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic -- Key: HIVE-4477 URL: https://issues.apache.org/jira/browse/HIVE-4477 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4477.1.patch The same test got ported to two different files.
[jira] [Commented] (HIVE-4477) remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic
[ https://issues.apache.org/jira/browse/HIVE-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647840#comment-13647840 ] Eric Hanson commented on HIVE-4477: --- Code review available at https://reviews.apache.org/r/10906/ remove redundant copy of arithmetic filter unit test testColOpScalarNumericFilterNullAndRepeatingLogic -- Key: HIVE-4477 URL: https://issues.apache.org/jira/browse/HIVE-4477 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4477.1.patch The same test got ported to two different files.
[jira] [Created] (HIVE-4478) In ORC, add boolean noNulls flag to column stripe metadata
Eric Hanson created HIVE-4478: - Summary: In ORC, add boolean noNulls flag to column stripe metadata Key: HIVE-4478 URL: https://issues.apache.org/jira/browse/HIVE-4478 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Owen O'Malley Currently, the stripe metadata for ORC contains the min and max value for each column in the stripe. This will be used for stripe elimination. However, an additional bit of metadata, noNulls (true/false), is needed to help speed up vectorized query execution as much as 30%. The vectorized QE code has a Boolean flag for each column vector called noNulls. If this is true, all the null-checking logic is skipped. For simple filters and arithmetic expressions, this can save on the order of 30% of the time. Once this noNulls stripe metadata is available, the vectorized iterator for ORC can be updated to avoid all expense to load the isNull bitmap, and efficiently set the noNulls flag for each column vector.
[jira] [Updated] (HIVE-4478) In ORC, add boolean noNulls flag to column stripe metadata
[ https://issues.apache.org/jira/browse/HIVE-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4478: -- Description: Currently, the stripe metadata for ORC contains the min and max value for each column in the stripe. This will be used for stripe elimination. However, an additional bit of metadata for each column for each stripe, noNulls (true/false), is needed to help speed up vectorized query execution as much as 30%. The vectorized QE code has a Boolean flag for each column vector called noNulls. If this is true, all the null-checking logic is skipped for that column for a VectorizedRowBatch when an operation is performed on that column. For simple filters and arithmetic expressions, this can save on the order of 30% of the time. Once this noNulls stripe metadata is available, the vectorized iterator (reader) for ORC can be updated to avoid all expense to load the isNull bitmap, and efficiently set the noNulls flag for each column vector. was: Currently, the stripe metadata for ORC contains the min and max value for each column in the stripe. This will be used for stripe elimination. However, an additional bit of metadata, noNulls (true/false), is needed to help speed up vectorized query execution as much as 30%. The vectorized QE code has a Boolean flag for each column vector called noNulls. If this is true, all the null-checking logic is skipped. For simple filters and arithmetic expressions, this can save on the order of 30% of the time. Once this noNulls stripe metadata is available, the vectorized iterator for ORC can be updated to avoid all expense to load the isNull bitmap, and efficiently set the noNulls flag for each column vector. 
In ORC, add boolean noNulls flag to column stripe metadata -- Key: HIVE-4478 URL: https://issues.apache.org/jira/browse/HIVE-4478 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Owen O'Malley Currently, the stripe metadata for ORC contains the min and max value for each column in the stripe. This will be used for stripe elimination. However, an additional bit of metadata for each column for each stripe, noNulls (true/false), is needed to help speed up vectorized query execution as much as 30%. The vectorized QE code has a Boolean flag for each column vector called noNulls. If this is true, all the null-checking logic is skipped for that column for a VectorizedRowBatch when an operation is performed on that column. For simple filters and arithmetic expressions, this can save on the order of 30% of the time. Once this noNulls stripe metadata is available, the vectorized iterator (reader) for ORC can be updated to avoid all expense to load the isNull bitmap, and efficiently set the noNulls flag for each column vector.
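The saving described above comes from skipping the per-row null check in the inner loop. A simplified sketch of the noNulls fast path, modeled loosely on a long column vector (illustrative code, not actual Hive internals):

```java
public class NoNullsFastPath {
    // Sum a column vector over n rows. When noNulls is true, the per-row
    // null check is skipped entirely, which is where the reported ~30%
    // saving for simple filters and arithmetic comes from.
    static long sum(long[] values, boolean[] isNull, boolean noNulls, int n) {
        long total = 0;
        if (noNulls) {
            for (int i = 0; i < n; i++) {
                total += values[i]; // fast path: no null-checking logic
            }
        } else {
            for (int i = 0; i < n; i++) {
                if (!isNull[i]) {
                    total += values[i]; // slow path: branch per row
                }
            }
        }
        return total;
    }
}
```

A per-stripe noNulls flag in the file metadata would let the reader choose the fast path up front, without first materializing the isNull bitmap.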
[jira] [Commented] (HIVE-4376) Document ORC file format in Hive wiki
[ https://issues.apache.org/jira/browse/HIVE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647934#comment-13647934 ] Lefty Leverenz commented on HIVE-4376: -- Done. You can find the ORC wikidoc here: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC]. It's in the [Language Manual|https://cwiki.apache.org/confluence/display/Hive/LanguageManual] under a stub for File Formats. Information about other file formats would also be helpful. Document ORC file format in Hive wiki - Key: HIVE-4376 URL: https://issues.apache.org/jira/browse/HIVE-4376 Project: Hive Issue Type: Bug Components: Documentation, Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Lefty Leverenz Assignee: Lefty Leverenz Labels: wiki Add a wiki documenting the Optimized Row Columnar file format for Hive release 0.11 ([HIVE-3874|https://issues.apache.org/jira/browse/HIVE-3874]).
Need to track docs for future releases
Now that all the Hive docs are in the wiki, we can't commit new documentation to trunk or branch. But we don't want to add docs to the wiki prematurely, so there's an increased likelihood that we'll lose track of some doc requirements for future releases. Does anyone know of a good way to ensure that no doc gets left behind? One possibility is to use labels on JIRAs that need future documentation. When HIVE-# gets committed with a fix version of 0.12 and still needs docs, it would get a label such as doc-needed-v0.12, which can be used to find all the doc requirements at release time. That might be the simplest solution, although I see two problems: if the fix number gets changed, the label has to change too; and sometimes people enter a label that seems right to them but doesn't match exactly. Another possibility is to use JIRAs, either adding a child JIRA for each closed JIRA that still needs docs or using an umbrella JIRA for each upcoming release. An ideal solution would automatically spew out a list of JIRAs that need docs for a given release number, either on request or when the release happens. Is that technically possible? – Lefty
[jira] [Commented] (HIVE-4466) Fix continue.on.failure in unit tests to -well- continue on failure in unit tests
[ https://issues.apache.org/jira/browse/HIVE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647991#comment-13647991 ] Ashutosh Chauhan commented on HIVE-4466: +1 will commit if tests pass. Fix continue.on.failure in unit tests to -well- continue on failure in unit tests - Key: HIVE-4466 URL: https://issues.apache.org/jira/browse/HIVE-4466 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4466.1.patch continue.on.failure is no longer hooked up to anything in the build scripts. More importantly, the only choice right now is to continue through a module and then fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4479) Child expressions are not being evaluated hierarchically in a few templates.
Jitendra Nath Pandey created HIVE-4479: -- Summary: Child expressions are not being evaluated hierarchically in a few templates. Key: HIVE-4479 URL: https://issues.apache.org/jira/browse/HIVE-4479 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey FilterColumnCompareColumn.txt, FilterStringColumnCompareScalar.txt and ScalarArithmeticColumn.txt are not evaluating the child expressions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
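The bug described above can be illustrated with a hedged sketch of what "evaluating child expressions hierarchically" means in a vectorized expression tree: a parent must call evaluate() on its children before consuming their output, or the children's work is silently skipped. Class and method names here are illustrative, not Hive's actual API:

```java
// Sketch of hierarchical child-expression evaluation.
public class ChildEvalSketch {
    abstract static class VectorExpression {
        VectorExpression[] children = new VectorExpression[0];

        // Parents call this first so child outputs are populated.
        final void evaluateChildren(long[] batch) {
            for (VectorExpression c : children) {
                c.evaluate(batch);
            }
        }

        abstract void evaluate(long[] batch);
    }

    // Leaf: adds one to every value in the batch.
    static class AddOne extends VectorExpression {
        void evaluate(long[] batch) {
            for (int i = 0; i < batch.length; i++) batch[i] += 1;
        }
    }

    // Parent: doubles every value, but only after its child has run.
    static class TimesTwo extends VectorExpression {
        TimesTwo(VectorExpression child) {
            children = new VectorExpression[] { child };
        }
        void evaluate(long[] batch) {
            evaluateChildren(batch); // the call the broken templates omit
            for (int i = 0; i < batch.length; i++) batch[i] *= 2;
        }
    }
}
```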
[jira] [Updated] (HIVE-4479) Child expressions are not being evaluated hierarchically in a few templates.
[ https://issues.apache.org/jira/browse/HIVE-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4479: --- Attachment: HIVE-4479.1.patch Child expressions are not being evaluated hierarchically in a few templates. Key: HIVE-4479 URL: https://issues.apache.org/jira/browse/HIVE-4479 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4479.1.patch FilterColumnCompareColumn.txt, FilterStringColumnCompareScalar.txt and ScalarArithmeticColumn.txt are not evaluating the child expressions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4480) Implement partition support for vectorized query execution
Sarvesh Sakalanaga created HIVE-4480: Summary: Implement partition support for vectorized query execution Key: HIVE-4480 URL: https://issues.apache.org/jira/browse/HIVE-4480 Project: Hive Issue Type: Sub-task Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4480) Implement partition support for vectorized query execution
[ https://issues.apache.org/jira/browse/HIVE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga updated HIVE-4480: - Description: Add support for eager deserialization of row data using serde in the RecordReader layer. Also add support for partitions in this layer so that the vectorized batch is populated correctly. Implement partition support for vectorized query execution -- Key: HIVE-4480 URL: https://issues.apache.org/jira/browse/HIVE-4480 Project: Hive Issue Type: Sub-task Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Add support for eager deserialization of row data using serde in the RecordReader layer. Also add support for partitions in this layer so that the vectorized batch is populated correctly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
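As a minimal sketch of why partition support needs special handling in a vectorized reader: partition column values are constant for an entire split, so instead of deserializing them per row, the batch can mark the column as repeating and store a single value. The isRepeating flag mirrors Hive's column vectors; the rest of the names are illustrative assumptions:

```java
// Sketch of populating a partition column as a repeating vector.
public class PartitionBatchSketch {
    static class LongColumnVector {
        long[] vector = new long[1024];
        boolean isRepeating = false;
    }

    // Populate a partition column: one value stands in for every row.
    static void fillPartitionColumn(LongColumnVector col, long partitionValue) {
        col.isRepeating = true;
        col.vector[0] = partitionValue;
    }

    // Reading a value honors the repeating flag.
    static long valueAt(LongColumnVector col, int row) {
        return col.isRepeating ? col.vector[0] : col.vector[row];
    }
}
```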
[jira] [Resolved] (HIVE-4454) Support partitioned tables in vectorized query execution.
[ https://issues.apache.org/jira/browse/HIVE-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey resolved HIVE-4454. Resolution: Duplicate Duplicate of HIVE-4480. Support partitioned tables in vectorized query execution. - Key: HIVE-4454 URL: https://issues.apache.org/jira/browse/HIVE-4454 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Partitioned tables are very common use case. Vectorized code path should support that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4481) Vectorized row batch should be initialized with additional columns to hold intermediate output.
Jitendra Nath Pandey created HIVE-4481: -- Summary: Vectorized row batch should be initialized with additional columns to hold intermediate output. Key: HIVE-4481 URL: https://issues.apache.org/jira/browse/HIVE-4481 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Vectorized row batch should be initialized with additional columns to hold intermediate output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
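The idea can be sketched as sizing the row batch with extra scratch columns up front, so expression intermediates have a destination without mid-query allocation. The counts and layout below are illustrative assumptions, not Hive's actual VectorizedRowBatch constructor:

```java
// Sketch: allocate data columns plus scratch columns for intermediate output.
public class ScratchColumnSketch {
    static long[][] createBatchColumns(int dataColumns, int scratchColumns, int batchSize) {
        long[][] cols = new long[dataColumns + scratchColumns][];
        for (int i = 0; i < cols.length; i++) {
            cols[i] = new long[batchSize];
        }
        return cols;
    }
}
```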
[jira] [Updated] (HIVE-4481) Vectorized row batch should be initialized with additional columns to hold intermediate output.
[ https://issues.apache.org/jira/browse/HIVE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4481: --- Attachment: HIVE-4481.1.patch Vectorized row batch should be initialized with additional columns to hold intermediate output. --- Key: HIVE-4481 URL: https://issues.apache.org/jira/browse/HIVE-4481 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4481.1.patch Vectorized row batch should be initialized with additional columns to hold intermediate output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4479) Child expressions are not being evaluated hierarchically in a few templates.
[ https://issues.apache.org/jira/browse/HIVE-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13648034#comment-13648034 ] Jitendra Nath Pandey commented on HIVE-4479: Review board entry: https://reviews.apache.org/r/10908/ Child expressions are not being evaluated hierarchically in a few templates. Key: HIVE-4479 URL: https://issues.apache.org/jira/browse/HIVE-4479 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4479.1.patch FilterColumnCompareColumn.txt, FilterStringColumnCompareScalar.txt and ScalarArithmeticColumn.txt are not evaluating the child expressions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4482) Template file VectorUDAFAvg.txt missing from public branch; CodeGen.java fails
Eric Hanson created HIVE-4482: - Summary: Template file VectorUDAFAvg.txt missing from public branch; CodeGen.java fails Key: HIVE-4482 URL: https://issues.apache.org/jira/browse/HIVE-4482 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Remus Rusanu In vectorization branch, file ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFAvg.txt is missing. So CodeGen.java doesn't run to completion, because it references that file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4392: -- Attachment: HIVE-4392.D10431.5.patch navis updated the revision HIVE-4392 [jira] Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns. Added tests Reviewers: ashutoshc, JIRA REVISION DETAIL https://reviews.facebook.net/D10431 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D10431?vs=33177id=33285#toc AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/ctas_colname.q ql/src/test/results/clientpositive/ctas_colname.q.out To: JIRA, ashutoshc, navis Cc: hbutani Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns -- Key: HIVE-4392 URL: https://issues.apache.org/jira/browse/HIVE-4392 Project: Hive Issue Type: Bug Components: Query Processor Environment: Apache Hadoop 0.20.1 Apache Hive Trunk Reporter: caofangkun Assignee: Navis Priority: Minor Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch, HIVE-4392.D10431.5.patch For Example: hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = 
http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total MapReduce CPU Time Spent: 0 msec hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following two Queries work: hive (default) create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006
[jira] [Commented] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13648069#comment-13648069 ] Navis commented on HIVE-4392: - Added tests. Not changed aggregation columns. Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns -- Key: HIVE-4392 URL: https://issues.apache.org/jira/browse/HIVE-4392 Project: Hive Issue Type: Bug Components: Query Processor Environment: Apache Hadoop 0.20.1 Apache Hive Trunk Reporter: caofangkun Assignee: Navis Priority: Minor Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch, HIVE-4392.D10431.5.patch For Example: hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total 
MapReduce CPU Time Spent: 0 msec hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following two Queries work: hive (default) create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0 2013-04-22 11:15:00,681 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0006 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive-scratchdir/hive_2013-04-22_11-14-54_632_6709035018023861094/-ext-10001 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 Table default.liza_1 stats: [num_partitions: 0, num_files: 0, num_rows: 0, total_size: 0, raw_data_size: 0] MapReduce Jobs
[jira] [Updated] (HIVE-4462) Finish support for modulo (%) operator for vectorized arithmetic
[ https://issues.apache.org/jira/browse/HIVE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4462: -- Attachment: HIVE-4462.1.patch Finish support for modulo (%) operator for vectorized arithmetic Key: HIVE-4462 URL: https://issues.apache.org/jira/browse/HIVE-4462 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4462.1.patch Support for vectorized modulo (%) is missing in CodeGen.java for several situations, e.g. most ColArithmeticScalar situations. This is to add modulo operator for all necessary situations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
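As a simplified sketch of the kind of column-op-scalar kernel CodeGen.java emits from its templates: the real generated classes (e.g. LongColModuloLongScalar) also handle null bitmaps and selection vectors, which are omitted here:

```java
// Sketch of a generated column-modulo-scalar kernel.
public class ModuloSketch {
    static void longColModuloLongScalar(long[] in, long scalar, long[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = in[i] % scalar;
        }
    }
}
```

Note that Java's % follows the sign of the dividend, so the generated kernels inherit that semantics for negative inputs.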
Review Request: finish support for vectorized Modulo (%) operator
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10911/ --- Review request for hive. Description --- finish support for vectorized Modulo (%) operator This addresses bug HIVE-4462. https://issues.apache.org/jira/browse/HIVE-4462 Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/DoubleColModuloDoubleColumn.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/DoubleColModuloDoubleScalar.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/DoubleColModuloLongColumn.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/DoubleColModuloLongScalar.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/LongColModuloDoubleColumn.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/LongColModuloDoubleScalar.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java 9279101 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorScalarColArithmetic.java 7c8b9c3 Diff: https://reviews.apache.org/r/10911/diff/ Testing --- Thanks, Eric Hanson
[jira] [Commented] (HIVE-4462) Finish support for modulo (%) operator for vectorized arithmetic
[ https://issues.apache.org/jira/browse/HIVE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13648083#comment-13648083 ] Eric Hanson commented on HIVE-4462: --- Code review available at https://reviews.apache.org/r/10911/ Finish support for modulo (%) operator for vectorized arithmetic Key: HIVE-4462 URL: https://issues.apache.org/jira/browse/HIVE-4462 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4462.1.patch Support for vectorized modulo (%) is missing in CodeGen.java for several situations, e.g. most ColArithmeticScalar situations. This is to add modulo operator for all necessary situations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4462) Finish support for modulo (%) operator for vectorized arithmetic
[ https://issues.apache.org/jira/browse/HIVE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4462: -- Status: Patch Available (was: Open) Finish support for modulo (%) operator for vectorized arithmetic Key: HIVE-4462 URL: https://issues.apache.org/jira/browse/HIVE-4462 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4462.1.patch Support for vectorized modulo (%) is missing in CodeGen.java for several situations, e.g. most ColArithmeticScalar situations. This is to add modulo operator for all necessary situations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4480) Implement partition support for vectorized query execution
[ https://issues.apache.org/jira/browse/HIVE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga updated HIVE-4480: - Attachment: Hive-4480.1.patch Implement partition support for vectorized query execution -- Key: HIVE-4480 URL: https://issues.apache.org/jira/browse/HIVE-4480 Project: Hive Issue Type: Sub-task Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Attachments: Hive-4480.1.patch Add support for eager deserialization of row data using serde in the RecordReader layer. Also add support for partitions in this layer so that the vectorized batch is populated correctly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4480) Implement partition support for vectorized query execution
[ https://issues.apache.org/jira/browse/HIVE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13648105#comment-13648105 ] Sarvesh Sakalanaga commented on HIVE-4480: -- Patch uploaded Implement partition support for vectorized query execution -- Key: HIVE-4480 URL: https://issues.apache.org/jira/browse/HIVE-4480 Project: Hive Issue Type: Sub-task Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Attachments: Hive-4480.1.patch Add support for eager deserialization of row data using serde in the RecordReader layer. Also add support for partitions in this layer so that the vectorized batch is populated correctly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4483) Input format to read vector data from RC
Sarvesh Sakalanaga created HIVE-4483: Summary: Input format to read vector data from RC Key: HIVE-4483 URL: https://issues.apache.org/jira/browse/HIVE-4483 Project: Hive Issue Type: Sub-task Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4484) Current hive is slower than previous versions
Navis created HIVE-4484: --- Summary: Current hive is slower than previous versions Key: HIVE-4484 URL: https://issues.apache.org/jira/browse/HIVE-4484 Project: Hive Issue Type: Task Environment: ubuntu 10.10, 4G, i7-8core Reporter: Navis Comparing logs for various patches, I've found that query execution has become slower than before. For example (picked from unchanged tests): {noformat} ppr_pushdown.q 135~140 sec : 2012-03-27 ~ 2012-07-17 140~160 sec : ~ 2012-11-28 160~220 sec : ~ 2013-03-30 220~250 sec : ~ current (HIVE-4392) join_nulls.q 295~310 sec : 2012-03-27 ~ 2012-07-17 310~330 sec : ~ 2012-11-28 330~370 sec : ~ 2013-03-30 400~460 sec : ~ current (HIVE-4392) {noformat} This explains much of the recently prolonged test times. It might be from changes in the test framework, but it still needs investigation before adding more functionality to hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4480) Implement partition support for vectorized query execution
[ https://issues.apache.org/jira/browse/HIVE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga updated HIVE-4480: - Status: Patch Available (was: Open) Implement partition support for vectorized query execution -- Key: HIVE-4480 URL: https://issues.apache.org/jira/browse/HIVE-4480 Project: Hive Issue Type: Sub-task Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Attachments: Hive-4480.1.patch Add support for eager deserialization of row data using serde in the RecordReader layer. Also add support for partitions in this layer so that the vectorized batch is populated correctly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4485) beeline prints null as empty strings
Thejas M Nair created HIVE-4485: --- Summary: beeline prints null as empty strings Key: HIVE-4485 URL: https://issues.apache.org/jira/browse/HIVE-4485 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair beeline is printing nulls as empty strings. This is inconsistent with the hive cli and other databases, which print null as the string NULL. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
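The requested behavior amounts to a small rendering rule, sketched below under a hypothetical method name (not beeline's actual code): render SQL NULL as the literal string "NULL" rather than an empty string, matching the Hive CLI.

```java
// Sketch: null-aware cell rendering for a CLI result table.
public class NullRenderSketch {
    static String render(Object value) {
        return value == null ? "NULL" : value.toString();
    }
}
```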
[jira] [Updated] (HIVE-4485) beeline prints null as empty strings
[ https://issues.apache.org/jira/browse/HIVE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4485: Component/s: HiveServer2 beeline prints null as empty strings Key: HIVE-4485 URL: https://issues.apache.org/jira/browse/HIVE-4485 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair beeline is printing nulls as empty strings. This is inconsistent with the hive cli and other databases, which print null as the string NULL. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4377) Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
[ https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4377: -- Attachment: HIVE-4377.D10377.2.patch navis updated the revision HIVE-4377 [jira] Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340). Added more comments Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D10377 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D10377?vs=32445&id=33291#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out To: JIRA, navis Cc: njain Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340) -- Key: HIVE-4377 URL: https://issues.apache.org/jira/browse/HIVE-4377 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Gang Tim Liu Assignee: Navis Attachments: HIVE-4377.D10377.1.patch, HIVE-4377.D10377.2.patch thanks a lot for addressing optimization in HIVE-2340. Awesome! Since we are developing at a very fast pace, it would be really useful to think about maintainability and testing of the large codebase. Highlights which are applicable for D1209:
1. Javadoc for all public/private functions, except for setters/getters. For any complex function, clear examples (input/output) would really help.
2. Especially for query optimizations, it might be a good idea to have a simple working query at the top and the expected changes, e.g. the operator tree for that query at each step, or a detailed explanation at the top.
3. If possible, the test name (.q file) where the function is being invoked, or the query which would potentially test that scenario, if it is a query processor change.
4. Comments in each test (.q file) that include the jira number, what it is trying to test, and assumptions about each query.
5. Reduce the output for each test: whenever a query outputs more than 10 results, there should be a reason. Otherwise, each query result should be bounded by 10 rows.
thanks a lot -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira