[jira] [Updated] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2013-01-31 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3778:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Tim

> Add MapJoinDesc.isBucketMapJoin() as part of explain plan
> -
>
> Key: HIVE-3778
> URL: https://issues.apache.org/jira/browse/HIVE-3778
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3778.patch.10, HIVE-3778.patch.10, 
> HIVE-3778.patch.3, HIVE-3778.patch.6, HIVE-3778.patch.8, HIVE-3778.patch.9
>
>
> This is a follow-up to HIVE-3767:
> Add MapJoinDesc.isBucketMapJoin() as part of explain plan

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3403) user should not specify mapjoin to perform sort-merge bucketed join

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567412#comment-13567412
 ] 

Namit Jain commented on HIVE-3403:
--

To help in review, the class hierarchy is:

AbstractBucketJoinProc
 AbstractSMBJoinProc
   SortedMergeBucketMapjoinProc
   SortedMergeJoinProc
 BucketMapjoinOptProc


The context needed is:

BucketJoinOptProcCtx
 SortBucketJoinOptProcCtx

Most of the code in AbstractBucketJoinProc and AbstractSMBJoinProc is old code 
that has been moved.
BucketMapjoinOptProc is also old code, but there has been a little refactoring 
to break it up into the context.

As such, the only new code is SortedMergeJoinProc. Due to the refactoring, I am 
able to re-use a lot of code
between map-join and join processing.
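
For anyone following the review, the two hierarchies listed above could be pictured roughly as the Java skeleton below. This is only a sketch: the class names come from the comment, while the empty bodies and the comments about what each class holds are assumptions, not the code in the patch.

```java
// Hypothetical skeleton: only the class names are taken from the comment above.
// The bodies are left empty on purpose; what each class really contains is in the patch.
abstract class AbstractBucketJoinProc {
    // Assumed: checks shared by all bucket-based join conversions.
}

abstract class AbstractSMBJoinProc extends AbstractBucketJoinProc {
    // Assumed: logic shared by the two sort-merge bucket processors.
}

class SortedMergeBucketMapjoinProc extends AbstractSMBJoinProc { }

class SortedMergeJoinProc extends AbstractSMBJoinProc { }

class BucketMapjoinOptProc extends AbstractBucketJoinProc { }

// Context classes, mirroring the second hierarchy in the comment.
class BucketJoinOptProcCtx { }

class SortBucketJoinOptProcCtx extends BucketJoinOptProcCtx { }
```

Laid out this way, anything placed in AbstractSMBJoinProc is visible to both SortedMergeBucketMapjoinProc and SortedMergeJoinProc, which is consistent with the code re-use described above.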


> user should not specify mapjoin to perform sort-merge bucketed join
> ---
>
> Key: HIVE-3403
> URL: https://issues.apache.org/jira/browse/HIVE-3403
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3403.10.patch, hive.3403.11.patch, 
> hive.3403.12.patch, hive.3403.13.patch, hive.3403.14.patch, 
> hive.3403.15.patch, hive.3403.16.patch, hive.3403.17.patch, 
> hive.3403.18.patch, hive.3403.19.patch, hive.3403.1.patch, 
> hive.3403.21.patch, hive.3403.22.patch, hive.3403.2.patch, hive.3403.3.patch, 
> hive.3403.4.patch, hive.3403.5.patch, hive.3403.6.patch, hive.3403.7.patch, 
> hive.3403.8.patch, hive.3403.9.patch
>
>
> Currently, in order to perform a sort merge bucketed join, the user needs
> to set hive.optimize.bucketmapjoin.sortedmerge to true, and also specify the 
> mapjoin hint.
> The user should not specify any hints.



[jira] [Updated] (HIVE-3403) user should not specify mapjoin to perform sort-merge bucketed join

2013-01-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3403:
-

Attachment: hive.3403.22.patch

> user should not specify mapjoin to perform sort-merge bucketed join
> ---
>
> Key: HIVE-3403
> URL: https://issues.apache.org/jira/browse/HIVE-3403
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3403.10.patch, hive.3403.11.patch, 
> hive.3403.12.patch, hive.3403.13.patch, hive.3403.14.patch, 
> hive.3403.15.patch, hive.3403.16.patch, hive.3403.17.patch, 
> hive.3403.18.patch, hive.3403.19.patch, hive.3403.1.patch, 
> hive.3403.21.patch, hive.3403.22.patch, hive.3403.2.patch, hive.3403.3.patch, 
> hive.3403.4.patch, hive.3403.5.patch, hive.3403.6.patch, hive.3403.7.patch, 
> hive.3403.8.patch, hive.3403.9.patch
>
>
> Currently, in order to perform a sort merge bucketed join, the user needs
> to set hive.optimize.bucketmapjoin.sortedmerge to true, and also specify the 
> mapjoin hint.
> The user should not specify any hints.



[jira] [Updated] (HIVE-3403) user should not specify mapjoin to perform sort-merge bucketed join

2013-01-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3403:
-

Attachment: hive.3403.21.patch

> user should not specify mapjoin to perform sort-merge bucketed join
> ---
>
> Key: HIVE-3403
> URL: https://issues.apache.org/jira/browse/HIVE-3403
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3403.10.patch, hive.3403.11.patch, 
> hive.3403.12.patch, hive.3403.13.patch, hive.3403.14.patch, 
> hive.3403.15.patch, hive.3403.16.patch, hive.3403.17.patch, 
> hive.3403.18.patch, hive.3403.19.patch, hive.3403.1.patch, 
> hive.3403.21.patch, hive.3403.2.patch, hive.3403.3.patch, hive.3403.4.patch, 
> hive.3403.5.patch, hive.3403.6.patch, hive.3403.7.patch, hive.3403.8.patch, 
> hive.3403.9.patch
>
>
> Currently, in order to perform a sort merge bucketed join, the user needs
> to set hive.optimize.bucketmapjoin.sortedmerge to true, and also specify the 
> mapjoin hint.
> The user should not specify any hints.



[jira] [Updated] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2013-01-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3778:
-

Status: Open  (was: Patch Available)

comments

> Add MapJoinDesc.isBucketMapJoin() as part of explain plan
> -
>
> Key: HIVE-3778
> URL: https://issues.apache.org/jira/browse/HIVE-3778
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: HIVE-3778.patch.3, HIVE-3778.patch.6, HIVE-3778.patch.8, 
> HIVE-3778.patch.9
>
>
> This is a follow-up to HIVE-3767:
> Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567345#comment-13567345
 ] 

Namit Jain commented on HIVE-3778:
--

+1

Running tests

> Add MapJoinDesc.isBucketMapJoin() as part of explain plan
> -
>
> Key: HIVE-3778
> URL: https://issues.apache.org/jira/browse/HIVE-3778
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: HIVE-3778.patch.3, HIVE-3778.patch.6, HIVE-3778.patch.8
>
>
> This is a follow-up to HIVE-3767:
> Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3953) Reading of partitioned Avro data fails because of missing properties

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567337#comment-13567337
 ] 

Namit Jain commented on HIVE-3953:
--

Copying from HIVE-3833.
Can you provide me with a complete test case? I will take a look.

> Reading of partitioned Avro data fails because of missing properties
> 
>
> Key: HIVE-3953
> URL: https://issues.apache.org/jira/browse/HIVE-3953
> Project: Hive
>  Issue Type: Bug
>Reporter: Mark Wagner
>
> After HIVE-3833, reading partitioned Avro data fails due to missing 
> properties. The "avro.schema.(url|literal)" properties are not making it all 
> the way to the SerDe. Non-partitioned data can still be read.



[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567336#comment-13567336
 ] 

Namit Jain commented on HIVE-3833:
--

[~jakobhoman], this was definitely not intentional. Unfortunately, there was no 
test case, so I missed this.
Can you provide me with a complete test case? I will take a look.

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, 
> hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, 
> hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split 
> based on their serdes etc., and we don't allow changing the schema for 
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if 
> their partition schemas match exactly. The operator tree object inspectors 
> should be based on the partition schema. That would give greater flexibility 
> and also help in using the binary serde with RCFile.
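
The rule proposed in the description (partitions share a split only when their schemas match exactly) can be sketched as a simple grouping step. This is a minimal illustration, not Hive's implementation: the `Partition` model, the schema strings, and the `groupBySchema` helper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: groups partitions so that only partitions whose schemas
// match exactly may end up in the same input split.
class PartitionSplitGrouping {
    // Assumed model: a partition is just a name plus a schema string.
    static final class Partition {
        final String name;
        final String schema; // e.g. "id:int,name:string"

        Partition(String name, String schema) {
            this.name = name;
            this.schema = schema;
        }
    }

    // Returns partition names grouped by exact schema, in first-seen order.
    static Map<String, List<String>> groupBySchema(List<Partition> partitions) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Partition p : partitions) {
            groups.computeIfAbsent(p.schema, k -> new ArrayList<>()).add(p.name);
        }
        return groups;
    }
}
```

Each resulting group would then get object inspectors built from its own schema rather than from the table-level schema.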



[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566727#comment-13566727
 ] 

Namit Jain commented on HIVE-3874:
--

I took a stab at it. I am attaching it just in case - feel free to ignore it.
I was not able to get the protocol buffer file auto-generated from ant, so I 
manually generated it for the
purpose of this patch.

> Create a new Optimized Row Columnar file format for Hive
> 
>
> Key: HIVE-3874
> URL: https://issues.apache.org/jira/browse/HIVE-3874
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: hive.3874.2.patch, OrcFileIntro.pptx, orc.tgz
>
>
> There are several limitations of the current RC File format that I'd like to 
> address by creating a new format:
> * each column value is stored as a binary blob, which means:
> ** the entire column value must be read, decompressed, and deserialized
> ** the file format can't use smarter type-specific compression
> ** push down filters can't be evaluated
> * the start of each row group needs to be found by scanning
> * user metadata can only be added to the file when the file is created
> * the file doesn't store the number of rows per file or row group
> * there is no mechanism for seeking to a particular row number, which is 
> required for external indexes.
> * there is no mechanism for storing lightweight indexes within the file to 
> enable push-down filters to skip entire row groups.
> * the type of the rows isn't stored in the file
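
To make the point about indexes and push-down filters concrete, here is a minimal sketch of how per-row-group statistics could let a reader skip row groups. It assumes the index stores a min/max pair per row group for one numeric column; the class and method names are hypothetical and are not taken from the attached patch or from the proposed format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an in-file index: one min/max pair per row group.
class RowGroupIndexSketch {
    static final class RowGroupStats {
        final long min;
        final long max;

        RowGroupStats(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    // Returns the positions of row groups that may contain a row equal to target.
    // Row groups whose [min, max] range excludes target are skipped entirely.
    static List<Integer> groupsToRead(List<RowGroupStats> index, long target) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            RowGroupStats s = index.get(i);
            if (target >= s.min && target <= s.max) {
                result.add(i);
            }
        }
        return result;
    }
}
```

With row groups covering 0 to 9, 10 to 19, and 15 to 30, a filter on the value 17 would only need to read the second and third groups.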



[jira] [Updated] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-01-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3874:
-

Attachment: hive.3874.2.patch

> Create a new Optimized Row Columnar file format for Hive
> 
>
> Key: HIVE-3874
> URL: https://issues.apache.org/jira/browse/HIVE-3874
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: hive.3874.2.patch, OrcFileIntro.pptx, orc.tgz
>
>
> There are several limitations of the current RC File format that I'd like to 
> address by creating a new format:
> * each column value is stored as a binary blob, which means:
> ** the entire column value must be read, decompressed, and deserialized
> ** the file format can't use smarter type-specific compression
> ** push down filters can't be evaluated
> * the start of each row group needs to be found by scanning
> * user metadata can only be added to the file when the file is created
> * the file doesn't store the number of rows per file or row group
> * there is no mechanism for seeking to a particular row number, which is 
> required for external indexes.
> * there is no mechanism for storing lightweight indexes within the file to 
> enable push-down filters to skip entire row groups.
> * the type of the rows isn't stored in the file



[jira] [Updated] (HIVE-3403) user should not specify mapjoin to perform sort-merge bucketed join

2013-01-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3403:
-

Attachment: hive.3403.19.patch

> user should not specify mapjoin to perform sort-merge bucketed join
> ---
>
> Key: HIVE-3403
> URL: https://issues.apache.org/jira/browse/HIVE-3403
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3403.10.patch, hive.3403.11.patch, 
> hive.3403.12.patch, hive.3403.13.patch, hive.3403.14.patch, 
> hive.3403.15.patch, hive.3403.16.patch, hive.3403.17.patch, 
> hive.3403.18.patch, hive.3403.19.patch, hive.3403.1.patch, hive.3403.2.patch, 
> hive.3403.3.patch, hive.3403.4.patch, hive.3403.5.patch, hive.3403.6.patch, 
> hive.3403.7.patch, hive.3403.8.patch, hive.3403.9.patch
>
>
> Currently, in order to perform a sort merge bucketed join, the user needs
> to set hive.optimize.bucketmapjoin.sortedmerge to true, and also specify the 
> mapjoin hint.
> The user should not specify any hints.



[jira] [Commented] (HIVE-3785) Core hive changes for HiveServer2 implementation

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566400#comment-13566400
 ] 

Namit Jain commented on HIVE-3785:
--

cc [~mgrover], [~prasadm]

> Core hive changes for HiveServer2 implementation
> 
>
> Key: HIVE-3785
> URL: https://issues.apache.org/jira/browse/HIVE-3785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authentication, Build Infrastructure, Configuration, 
> Thrift API
>Affects Versions: 0.10.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Attachments: HS2-changed-files-only.patch
>
>
> The subtask to track changes in the core hive components for HiveServer2 
> implementation



[jira] [Commented] (HIVE-3785) Core hive changes for HiveServer2 implementation

2013-01-30 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566399#comment-13566399
 ] 

Namit Jain commented on HIVE-3785:
--

I am sorry for the delay on my part.
Can you refresh? I will definitely review it this time.

> Core hive changes for HiveServer2 implementation
> 
>
> Key: HIVE-3785
> URL: https://issues.apache.org/jira/browse/HIVE-3785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authentication, Build Infrastructure, Configuration, 
> Thrift API
>Affects Versions: 0.10.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Attachments: HS2-changed-files-only.patch
>
>
> The subtask to track changes in the core hive components for HiveServer2 
> implementation



[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566175#comment-13566175
 ] 

Namit Jain commented on HIVE-3874:
--

[~alangates], [~ashutoshc], what do you think?

> Create a new Optimized Row Columnar file format for Hive
> 
>
> Key: HIVE-3874
> URL: https://issues.apache.org/jira/browse/HIVE-3874
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: OrcFileIntro.pptx, orc.tgz
>
>
> There are several limitations of the current RC File format that I'd like to 
> address by creating a new format:
> * each column value is stored as a binary blob, which means:
> ** the entire column value must be read, decompressed, and deserialized
> ** the file format can't use smarter type-specific compression
> ** push down filters can't be evaluated
> * the start of each row group needs to be found by scanning
> * user metadata can only be added to the file when the file is created
> * the file doesn't store the number of rows per file or row group
> * there is no mechanism for seeking to a particular row number, which is 
> required for external indexes.
> * there is no mechanism for storing lightweight indexes within the file to 
> enable push-down filters to skip entire row groups.
> * the type of the rows isn't stored in the file



[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566174#comment-13566174
 ] 

Namit Jain commented on HIVE-3874:
--

[~owen.omalley], do you want to get the patch into a compilable state in contrib?
That way, we can work on getting it in, and continue development over there.


> Create a new Optimized Row Columnar file format for Hive
> 
>
> Key: HIVE-3874
> URL: https://issues.apache.org/jira/browse/HIVE-3874
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: OrcFileIntro.pptx, orc.tgz
>
>
> There are several limitations of the current RC File format that I'd like to 
> address by creating a new format:
> * each column value is stored as a binary blob, which means:
> ** the entire column value must be read, decompressed, and deserialized
> ** the file format can't use smarter type-specific compression
> ** push down filters can't be evaluated
> * the start of each row group needs to be found by scanning
> * user metadata can only be added to the file when the file is created
> * the file doesn't store the number of rows per file or row group
> * there is no mechanism for seeking to a particular row number, which is 
> required for external indexes.
> * there is no mechanism for storing lightweight indexes within the file to 
> enable push-down filters to skip entire row groups.
> * the type of the rows isn't stored in the file



[jira] [Updated] (HIVE-3403) user should not specify mapjoin to perform sort-merge bucketed join

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3403:
-

Attachment: hive.3403.18.patch

> user should not specify mapjoin to perform sort-merge bucketed join
> ---
>
> Key: HIVE-3403
> URL: https://issues.apache.org/jira/browse/HIVE-3403
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3403.10.patch, hive.3403.11.patch, 
> hive.3403.12.patch, hive.3403.13.patch, hive.3403.14.patch, 
> hive.3403.15.patch, hive.3403.16.patch, hive.3403.17.patch, 
> hive.3403.18.patch, hive.3403.1.patch, hive.3403.2.patch, hive.3403.3.patch, 
> hive.3403.4.patch, hive.3403.5.patch, hive.3403.6.patch, hive.3403.7.patch, 
> hive.3403.8.patch, hive.3403.9.patch
>
>
> Currently, in order to perform a sort merge bucketed join, the user needs
> to set hive.optimize.bucketmapjoin.sortedmerge to true, and also specify the 
> mapjoin hint.
> The user should not specify any hints.



[jira] [Updated] (HIVE-933) Infer bucketing/sorting properties

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-933:


   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Kevin

> Infer bucketing/sorting properties
> --
>
> Key: HIVE-933
> URL: https://issues.apache.org/jira/browse/HIVE-933
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-933.10.patch.txt, HIVE-933.11.patch.txt, 
> HIVE-933.12.patch.txt, HIVE-933.13.patch.txt, HIVE-933.1.patch.txt, 
> HIVE-933.2.patch.txt, HIVE-933.3.patch.txt, HIVE-933.4.patch.txt, 
> HIVE-933.5.patch.txt, HIVE-933.6.patch.txt, HIVE-933.7.patch.txt, 
> HIVE-933.8.patch.txt, HIVE-933.9.patch.txt
>
>
> This is a long-term plan, and may require major changes.
> From the query, we can figure out the sorting/bucketing properties, and 
> change the metadata of the destination at that time.
> However, this means that different partitions may have different metadata. 
> Currently, the query plan is the same for all the 
> partitions of the table - we can do the following:
> 1. In the first cut, have a simple approach where you take the union of all 
> metadata, and create the most defensive plan.
> 2. Enhance mapredWork() to include partition specific operator trees.



[jira] [Commented] (HIVE-933) Infer bucketing/sorting properties

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565642#comment-13565642
 ] 

Namit Jain commented on HIVE-933:
-

+1

Running tests

> Infer bucketing/sorting properties
> --
>
> Key: HIVE-933
> URL: https://issues.apache.org/jira/browse/HIVE-933
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Kevin Wilfong
> Attachments: HIVE-933.10.patch.txt, HIVE-933.11.patch.txt, 
> HIVE-933.12.patch.txt, HIVE-933.1.patch.txt, HIVE-933.2.patch.txt, 
> HIVE-933.3.patch.txt, HIVE-933.4.patch.txt, HIVE-933.5.patch.txt, 
> HIVE-933.6.patch.txt, HIVE-933.7.patch.txt, HIVE-933.8.patch.txt, 
> HIVE-933.9.patch.txt
>
>
> This is a long-term plan, and may require major changes.
> From the query, we can figure out the sorting/bucketing properties, and 
> change the metadata of the destination at that time.
> However, this means that different partitions may have different metadata. 
> Currently, the query plan is the same for all the 
> partitions of the table - we can do the following:
> 1. In the first cut, have a simple approach where you take the union of all 
> metadata, and create the most defensive plan.
> 2. Enhance mapredWork() to include partition specific operator trees.



[jira] [Commented] (HIVE-933) Infer bucketing/sorting properties

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565630#comment-13565630
 ] 

Namit Jain commented on HIVE-933:
-

[~kevinwilfong], is it ready for review, or are you still working on it?

> Infer bucketing/sorting properties
> --
>
> Key: HIVE-933
> URL: https://issues.apache.org/jira/browse/HIVE-933
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Kevin Wilfong
> Attachments: HIVE-933.10.patch.txt, HIVE-933.11.patch.txt, 
> HIVE-933.12.patch.txt, HIVE-933.1.patch.txt, HIVE-933.2.patch.txt, 
> HIVE-933.3.patch.txt, HIVE-933.4.patch.txt, HIVE-933.5.patch.txt, 
> HIVE-933.6.patch.txt, HIVE-933.7.patch.txt, HIVE-933.8.patch.txt, 
> HIVE-933.9.patch.txt
>
>
> This is a long-term plan, and may require major changes.
> From the query, we can figure out the sorting/bucketing properties, and 
> change the metadata of the destination at that time.
> However, this means that different partitions may have different metadata. 
> Currently, the query plan is the same for all the 
> partitions of the table - we can do the following:
> 1. In the first cut, have a simple approach where you take the union of all 
> metadata, and create the most defensive plan.
> 2. Enhance mapredWork() to include partition specific operator trees.



[jira] [Updated] (HIVE-3672) Support altering partition column type in Hive

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3672:
-

Fix Version/s: (was: 0.10.0)
Affects Version/s: (was: 0.10.0)
   Status: Open  (was: Patch Available)

comments

> Support altering partition column type in Hive
> --
>
> Key: HIVE-3672
> URL: https://issues.apache.org/jira/browse/HIVE-3672
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, SQL
>Reporter: Jingwei Lu
>Assignee: Jingwei Lu
>  Labels: features
> Attachments: HIVE-3672.1.patch.txt, HIVE-3672.2.patch.txt, 
> HIVE-3672.3.patch.txt, HIVE-3672.4.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Currently, Hive does not allow altering partition column types.  As we've 
> discouraged users from using non-string partition column types, this presents 
> a problem for users who want to change their partition columns to be strings: 
> they have to rename their table, create a new table, and copy all the data 
> over.
> To support this via the CLI, adding a command like ALTER TABLE  
> PARTITION COLUMN ( );



[jira] [Updated] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3778:
-

Status: Open  (was: Patch Available)

Can you refresh?

> Add MapJoinDesc.isBucketMapJoin() as part of explain plan
> -
>
> Key: HIVE-3778
> URL: https://issues.apache.org/jira/browse/HIVE-3778
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: HIVE-3778.patch.3, HIVE-3778.patch.6
>
>
> This is a follow-up to HIVE-3767:
> Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3950) Remove code for merging files via MR job

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565589#comment-13565589
 ] 

Namit Jain commented on HIVE-3950:
--

+1

> Remove code for merging files via MR job
> 
>
> Key: HIVE-3950
> URL: https://issues.apache.org/jira/browse/HIVE-3950
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: hive-3950_1.patch, hive-3950_2.patch, hive-3950.patch
>
>
> Hive can merge files either via an MR job or via a map-only job. Obviously, doing 
> it via a map-only job is more efficient, but there is an option of doing it via 
> an MR job as well because CombineFileInputFormat is available only in 
> hadoop-0.20 and later. Since we no longer support Hadoop versions earlier 
> than 0.20, all of that is now dead code and we should get rid of it.



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565531#comment-13565531
 ] 

Namit Jain commented on HIVE-3784:
--

Thanks a lot Ashutosh

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, 
> hive.3784.21.patch, hive.3784.22.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3933) StatsWork of QueryPlan can't be serialized correctly

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3933:
-

Status: Open  (was: Patch Available)

comments

> StatsWork of QueryPlan can't be serialized correctly
> 
>
> Key: HIVE-3933
> URL: https://issues.apache.org/jira/browse/HIVE-3933
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: yi
>Priority: Minor
> Attachments: HIVE-3933.patch, HIVE-3933.patch
>
>
> QueryPlan is serialized using java.beans.XMLEncoder, but StatsWork of 
> QueryPlan does not follow the Java bean conventions, so it can't be 
> serialized correctly.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:240)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1137)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:948)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
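
The serialization problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual StatsWork code: GoodWork, BadWork, aggKey and initAggKey are hypothetical names invented for illustration, and the missing setter is only one assumed way a class can break the bean conventions. java.beans.XMLEncoder persists only properties that have both a getter and a setter, so state without a setter is silently dropped and comes back as null after decoding.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class BeanRoundTrip {

    /** Hypothetical class that follows the bean conventions:
     *  public no-arg constructor plus a getter/setter pair. */
    public static class GoodWork {
        private String aggKey;
        public GoodWork() {}
        public String getAggKey() { return aggKey; }
        public void setAggKey(String aggKey) { this.aggKey = aggKey; }
    }

    /** Hypothetical class that has a getter but no matching setter,
     *  so XMLEncoder does not treat aggKey as a writable property. */
    public static class BadWork {
        private String aggKey;
        public BadWork() {}
        public String getAggKey() { return aggKey; }
        public void initAggKey(String aggKey) { this.aggKey = aggKey; }
    }

    /** Encodes an object to XML in memory and decodes it again. */
    public static Object roundTrip(Object original) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(buffer)) {
            encoder.writeObject(original);
        }
        try (XMLDecoder decoder =
                 new XMLDecoder(new ByteArrayInputStream(buffer.toByteArray()))) {
            return decoder.readObject();
        }
    }
}
```

Round-tripping a GoodWork instance keeps its aggKey, while a BadWork instance comes back with aggKey set to null. A null field of this kind is the sort of state that can later surface as a NullPointerException when the deserialized plan is executed.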



[jira] [Updated] (HIVE-3950) Remove code for merging files via MR job

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3950:
-

Status: Open  (was: Patch Available)

minor comments

> Remove code for merging files via MR job
> 
>
> Key: HIVE-3950
> URL: https://issues.apache.org/jira/browse/HIVE-3950
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: hive-3950_1.patch, hive-3950.patch
>
>
> Hive can merge files either via an MR job or via a map-only job. Doing it via 
> a map-only job is more efficient, but the MR option exists because 
> CombineFileInputFormat is available only in hadoop-0.20 and later. Since we 
> no longer support Hadoop versions earlier than 0.20, all of that is now dead 
> code and we should get rid of it.



[jira] [Updated] (HIVE-933) Infer bucketing/sorting properties

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-933:


Status: Open  (was: Patch Available)

minor comments

> Infer bucketing/sorting properties
> --
>
> Key: HIVE-933
> URL: https://issues.apache.org/jira/browse/HIVE-933
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Kevin Wilfong
> Attachments: HIVE-933.10.patch.txt, HIVE-933.11.patch.txt, 
> HIVE-933.1.patch.txt, HIVE-933.2.patch.txt, HIVE-933.3.patch.txt, 
> HIVE-933.4.patch.txt, HIVE-933.5.patch.txt, HIVE-933.6.patch.txt, 
> HIVE-933.7.patch.txt, HIVE-933.8.patch.txt, HIVE-933.9.patch.txt
>
>
> This is a long-term plan, and may require major changes.
> From the query, we can figure out the sorting/bucketing properties, and 
> change the metadata of the destination at that time.
> However, this means that different partitions may have different metadata. 
> Currently, the query plan is the same for all the 
> partitions of the table - we can do the following:
> 1. In the first cut, have a simple approach where you take the union of all 
> metadata, and create the most defensive plan.
> 2. Enhance mapredWork() to include partition specific operator trees.



[jira] [Updated] (HIVE-3873) lot of tests failing for hadoop 23

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3873:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Tim

> lot of tests failing for hadoop 23
> --
>
> Key: HIVE-3873
> URL: https://issues.apache.org/jira/browse/HIVE-3873
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Fix For: 0.11.0
>
> Attachments: HIVE-3873.patch
>
>
> The following tests are failing on hadoop 23:
> [junit] Failed query: archive_excludeHadoop20.q
> [junit] Failed query: archive_multi.q
> [junit] Failed query: index_bitmap.q
> [junit] Failed query: join_filters_overlap.q
> [junit] Failed query: join_nullsafe.q
> [junit] Failed query: list_bucket_dml_6.q
> [junit] Failed query: list_bucket_dml_7.q
> [junit] Failed query: list_bucket_dml_8.q
> [junit] Failed query: list_bucket_query_oneskew_3.q
> [junit] Failed query: parenthesis_star_by.q
> [junit] Failed query: recursive_dir.q
> Some of them may be log updates.



[jira] [Updated] (HIVE-3917) Support fast operation for analyze command

2013-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3917:
-

Status: Open  (was: Patch Available)

comments

> Support fast operation for analyze command
> --
>
> Key: HIVE-3917
> URL: https://issues.apache.org/jira/browse/HIVE-3917
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 0.11.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3917.patch.1
>
>
> hive supports analyze command to gather statistics from existing 
> tables/partition 
> https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables
> It collects:
> 1. Number of Rows
> 2. Number of files
> 3. Size in Bytes
> If the table/partition is big, the operation takes time since it opens 
> all files and scans all data.
> It would be nice to support a fast operation to gather statistics that 
> doesn't require opening all files:
> 1. Number of files
> 2. Size in Bytes
> Potential syntax is 
> ANALYZE TABLE tablename [PARTITION(partcol1[=val1], partcol2[=val2], ...)] 
> COMPUTE STATISTICS [noscan];
> In the future, all statistics without scan can be retrieved via this optional 
> parameter.
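
The idea behind the proposed noscan option can be illustrated outside Hive. The following is a rough sketch on the local filesystem using java.nio.file, not Hive's implementation (which would go through Hadoop's filesystem API); NoScanStats, fileCountAndBytes and demo are hypothetical names. File count and total size come from file metadata alone, so no file is opened. The number of rows is left out on purpose because it cannot be obtained without reading the data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.stream.Stream;

public class NoScanStats {

    /** Returns {number of regular files, total size in bytes} under dir,
     *  using only file attributes (no file contents are read). */
    public static long[] fileCountAndBytes(Path dir) {
        long files = 0;
        long bytes = 0;
        try (Stream<Path> paths = Files.walk(dir)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                BasicFileAttributes attrs =
                    Files.readAttributes(p, BasicFileAttributes.class);
                if (attrs.isRegularFile()) {
                    files++;
                    bytes += attrs.size();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {files, bytes};
    }

    /** Builds a throwaway directory with two files (3 and 5 bytes)
     *  and measures it. */
    public static long[] demo() {
        try {
            Path dir = Files.createTempDirectory("noscan-demo");
            Files.write(dir.resolve("part-0"), new byte[3]);
            Files.write(dir.resolve("part-1"), new byte[5]);
            return fileCountAndBytes(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost of this walk depends on the number of files rather than on the volume of data, which is the property that makes a noscan variant attractive for large tables and partitions.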



[jira] [Commented] (HIVE-3954) flag indicating statistics is stale

2013-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565170#comment-13565170
 ] 

Namit Jain commented on HIVE-3954:
--

Do you want to provide more details here ?

Will we have different flags for noscan stats from HIVE-3917 and other stats ?


> flag indicating statistics is stale
> ---
>
> Key: HIVE-3954
> URL: https://issues.apache.org/jira/browse/HIVE-3954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>
> We will introduce a flag to indicate whether statistics is stale.



[jira] [Commented] (HIVE-3873) lot of tests failing for hadoop 23

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565167#comment-13565167
 ] 

Namit Jain commented on HIVE-3873:
--

+1

Running tests

> lot of tests failing for hadoop 23
> --
>
> Key: HIVE-3873
> URL: https://issues.apache.org/jira/browse/HIVE-3873
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Attachments: HIVE-3873.patch
>
>
> The following tests are failing on hadoop 23:
> [junit] Failed query: archive_excludeHadoop20.q
> [junit] Failed query: archive_multi.q
> [junit] Failed query: index_bitmap.q
> [junit] Failed query: join_filters_overlap.q
> [junit] Failed query: join_nullsafe.q
> [junit] Failed query: list_bucket_dml_6.q
> [junit] Failed query: list_bucket_dml_7.q
> [junit] Failed query: list_bucket_dml_8.q
> [junit] Failed query: list_bucket_query_oneskew_3.q
> [junit] Failed query: parenthesis_star_by.q
> [junit] Failed query: recursive_dir.q
> Some of them may be log updates.



[jira] [Commented] (HIVE-3949) Some test failures in hadoop 23

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565169#comment-13565169
 ] 

Namit Jain commented on HIVE-3949:
--

[~gangtimliu], can you update this with the failed tests once HIVE-3873 is 
checked in ?

> Some test failures in hadoop 23
> ---
>
> Key: HIVE-3949
> URL: https://issues.apache.org/jira/browse/HIVE-3949
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>
> This is a follow-up to HIVE-3873.
> We have fixed some test failures in HIVE-3873 and a few other JIRA issues.
> We will use this JIRA to track the remaining failures: 
> https://builds.apache.org/job/Hive-trunk-hadoop2/



[jira] [Commented] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565159#comment-13565159
 ] 

Namit Jain commented on HIVE-3778:
--

comments

> Add MapJoinDesc.isBucketMapJoin() as part of explain plan
> -
>
> Key: HIVE-3778
> URL: https://issues.apache.org/jira/browse/HIVE-3778
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: HIVE-3778.patch.3, HIVE-3778.patch.6
>
>
> This is follow up of HIVE-3767:
> Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565154#comment-13565154
 ] 

Namit Jain commented on HIVE-3778:
--

[~gangtimliu], HIVE-3784 has been +1 already, so it should go in soon.
Can you refresh once HIVE-3784 is in ?

> Add MapJoinDesc.isBucketMapJoin() as part of explain plan
> -
>
> Key: HIVE-3778
> URL: https://issues.apache.org/jira/browse/HIVE-3778
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: HIVE-3778.patch.3, HIVE-3778.patch.6
>
>
> This is follow up of HIVE-3767:
> Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565050#comment-13565050
 ] 

Namit Jain commented on HIVE-3784:
--

Comments addressed in the latest patch.

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, 
> hive.3784.21.patch, hive.3784.22.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-28 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.22.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, 
> hive.3784.21.patch, hive.3784.22.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565018#comment-13565018
 ] 

Namit Jain commented on HIVE-3784:
--

The tests finished fine

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, 
> hive.3784.21.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, 
> hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, 
> hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-28 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564441#comment-13564441
 ] 

Namit Jain commented on HIVE-3784:
--

comments addressed. running tests

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, 
> hive.3784.21.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, 
> hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, 
> hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-28 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.21.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, 
> hive.3784.21.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, 
> hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, 
> hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job

2013-01-28 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3952:
-

Component/s: Query Processor

> merge map-job followed by map-reduce job
> 
>
> Key: HIVE-3952
> URL: https://issues.apache.org/jira/browse/HIVE-3952
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>
> Consider the query like:
> select count(*) FROM
> ( select idOne, idTwo, value FROM
>   bigTable
>   JOIN
>   smallTableOne on (bigTable.idOne = smallTableOne.idOne)
> ) firstjoin
> JOIN
> smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);
> where smallTableOne and smallTableTwo are smaller than 
> hive.auto.convert.join.noconditionaltask.size and
> hive.auto.convert.join.noconditionaltask is set to true.
> The joins are collapsed into mapjoins, and it leads to a map-only job
> (for the map-joins) followed by a map-reduce job (for the group by).
> Ideally, the map-only job should be merged with the following map-reduce job.



[jira] [Created] (HIVE-3952) merge map-job followed by map-reduce job

2013-01-28 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3952:


 Summary: merge map-job followed by map-reduce job
 Key: HIVE-3952
 URL: https://issues.apache.org/jira/browse/HIVE-3952
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain


Consider the query like:

select count(*) FROM
( select idOne, idTwo, value FROM
  bigTable
  JOIN
  smallTableOne on (bigTable.idOne = smallTableOne.idOne)
) firstjoin
JOIN
smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);


where smallTableOne and smallTableTwo are smaller than 
hive.auto.convert.join.noconditionaltask.size and
hive.auto.convert.join.noconditionaltask is set to true.

The joins are collapsed into mapjoins, and it leads to a map-only job
(for the map-joins) followed by a map-reduce job (for the group by).
Ideally, the map-only job should be merged with the following map-reduce job.



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-27 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563869#comment-13563869
 ] 

Namit Jain commented on HIVE-3784:
--

Patch submitted with comments addressed.

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-27 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.19.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-27 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.18.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?




[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-27 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.17.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-27 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563824#comment-13563824
 ] 

Namit Jain commented on HIVE-3784:
--

bq. Reading from code it feels like that its not possible to have a) union 
before mapjoin b) union after mapjoin c) common join after mapjoin.

Union before/after mapjoin hint will not be supported -- that is not completely 
true. Without a map-join hint, joins before/after a union
will automatically get converted to a map-join. Nothing that was supported 
before will stop working in the future; it will simply work without
the hint.

bq. common join after mapjoin

Same thing. The only small exception is sort-merge joins, which I added recently 
in HIVE-3633. This will also be made automatic soon.

I will address your other comments.




> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3912) table_access_keys_stats.q fails with hadoop 0.23

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3912:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Sushanth

> table_access_keys_stats.q fails with hadoop 0.23
> 
>
> Key: HIVE-3912
> URL: https://issues.apache.org/jira/browse/HIVE-3912
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
> Environment: Hadoop 0.23  (2.0.2-alpha)
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3912.D8049.1.patch
>
>
> CliDriver test table_access_keys_stats.q fails with hadoop 0.23 because a 
> different order of results from the join is produced under 0.23. The data 
> itself doesn't seem wrong, but the output does not match the golden output 
> file.



[jira] [Updated] (HIVE-3921) recursive_dir.q fails on 0.23

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3921:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Sushanth

> recursive_dir.q fails on 0.23
> -
>
> Key: HIVE-3921
> URL: https://issues.apache.org/jira/browse/HIVE-3921
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
> Environment: Hadoop 0.23 (2.0.2-alpha)
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
>  Labels: 0.23, tests
> Fix For: 0.11.0
>
> Attachments: HIVE-3921.D8055.1.patch
>
>
> This test fails in 0.23
>   - It insists that hive.mapred.supports.subdirectories must be true for 
> mapred.input.dir.recursive to be used. Currently, HiveConf sets that as 
> false. 
>   - HIVE-3643 mentions param and says that once HIVE-3276 is in, we 
> should switch the param, and this jira has been committed.
>   - Testing with just setting that parameter in the .q file yields a 
> mismatch with the golden file, but one that looks like it only requires updating 
> the .out file:
> [junit] diff -a 
> /Users/sush/dev/hive.git/build/ql/test/logs/clientpositive/recursive_dir.q.out
>  
> /Users/sush/dev/hive.git/ql/src/test/results/clientpositive/recursive_dir.q.out
> [junit] 59d58
> [junit] < PREHOOK: Input: default@fact_daily
> [junit] 64d62
> [junit] < POSTHOOK: Input: default@fact_daily



[jira] [Updated] (HIVE-3923) join_filters_overlap.q fails on 0.23

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3923:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Sushanth

> join_filters_overlap.q fails on 0.23
> 
>
> Key: HIVE-3923
> URL: https://issues.apache.org/jira/browse/HIVE-3923
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
> Environment: Hadoop 0.23
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3923.D8079.1.patch
>
>
> As with some of the other broken tests on 0.23, this is broken because the 
> order of results generated by the query on 0.23 is different from the order 
> in the golden output file. However, there appears to be nothing wrong with 
> the query itself.
> This can be fixed by adding an order-by clause and regenerating the golden 
> file.
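
The ordering problem described above can be illustrated with a minimal Java sketch. This is not Hive test-harness code; the class and method names are hypothetical. It only shows why a line-by-line comparison against a golden file breaks when the same rows come back in a different order, and why imposing a fixed order (which an order-by clause does for the query) makes the comparison stable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of Hive's test framework.
class GoldenOrderSketch {

    // Line-by-line comparison, as a plain diff against a golden file would do.
    static boolean sameLines(List<String> actual, List<String> golden) {
        return actual.equals(golden);
    }

    // Comparison after imposing one fixed order on both sides.
    static boolean sameRowsWhenSorted(List<String> actual, List<String> golden) {
        List<String> a = new ArrayList<>(actual);
        List<String> g = new ArrayList<>(golden);
        Collections.sort(a);
        Collections.sort(g);
        return a.equals(g);
    }

    public static void main(String[] args) {
        // Same rows, different order, standing in for the two Hadoop versions.
        List<String> golden = Arrays.asList("1\ta", "2\tb", "3\tc");
        List<String> reordered = Arrays.asList("3\tc", "1\ta", "2\tb");
        System.out.println("line-by-line match: " + sameLines(reordered, golden));
        System.out.println("match once ordered: " + sameRowsWhenSorted(reordered, golden));
    }
}
```

In the actual fix the ordering is applied in the query itself, so the golden file can still be compared with a plain diff.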



[jira] [Updated] (HIVE-3924) join_nullsafe.q fails on 0.23

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3924:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Sushanth

> join_nullsafe.q fails on 0.23
> -
>
> Key: HIVE-3924
> URL: https://issues.apache.org/jira/browse/HIVE-3924
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
> Environment: Hadoop 0.23
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3924.D8085.1.patch
>
>
> As with some of the other broken tests on 0.23, this is broken because the 
> order of results generated by the query on 0.23 is different from the order 
> in the golden output file. However, there appears to be nothing wrong with 
> the query itself. This can be fixed by adding an order-by clause and 
> regenerating the golden file.



[jira] [Updated] (HIVE-3628) Provide a way to use counters in Hive through UDF

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3628:
-

Status: Open  (was: Patch Available)

Not sure, is it ready for review ?

> Provide a way to use counters in Hive through UDF
> -
>
> Key: HIVE-3628
> URL: https://issues.apache.org/jira/browse/HIVE-3628
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Viji
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, 
> HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch
>
>
> Currently it is not possible to generate counters through UDF. We should 
> support this. 
> Pig currently allows this.



[jira] [Updated] (HIVE-3937) Hive Profiler

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3937:
-

Status: Open  (was: Patch Available)

comments on phabricator

> Hive Profiler
> -
>
> Key: HIVE-3937
> URL: https://issues.apache.org/jira/browse/HIVE-3937
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Attachments: HIVE-3937.1.patch.txt
>
>
> Adding a Hive Profiler implementation which tracks inclusive wall times and 
> call counts of the operators
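
To make "inclusive wall times and call counts" concrete, here is a minimal Java sketch. It is not the profiler from the attached patch; the class, its methods, and the injected clock are all assumptions made for illustration. "Inclusive" here means the time charged to an operator includes the time spent in operators it calls while it is still active.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch; not the profiler from the attached patch.
class OperatorProfilerSketch {

    static final class Stat {
        long calls;
        long inclusiveNanos;
    }

    private final LongSupplier clock;
    private final Map<String, Stat> stats = new LinkedHashMap<>();
    private final Deque<String> names = new ArrayDeque<>();
    private final Deque<Long> starts = new ArrayDeque<>();

    // The clock is injected so the sketch can be exercised deterministically.
    OperatorProfilerSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // Record the start time of an operator call.
    void enter(String operator) {
        names.push(operator);
        starts.push(clock.getAsLong());
    }

    // Close the most recent call; nested calls have already elapsed, so the
    // measured time is inclusive of them.
    void exit() {
        String operator = names.pop();
        long elapsed = clock.getAsLong() - starts.pop();
        Stat s = stats.computeIfAbsent(operator, k -> new Stat());
        s.calls++;
        s.inclusiveNanos += elapsed;
    }

    Stat statFor(String operator) {
        return stats.get(operator);
    }

    public static void main(String[] args) {
        OperatorProfilerSketch p = new OperatorProfilerSketch(System::nanoTime);
        p.enter("JOIN");
        p.enter("SCAN");
        p.exit();
        p.exit();
        System.out.println("JOIN calls: " + p.statFor("JOIN").calls);
        System.out.println("SCAN calls: " + p.statFor("SCAN").calls);
    }
}
```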



[jira] [Updated] (HIVE-3672) Support altering partition column type in Hive

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3672:
-

Fix Version/s: (was: 0.10.0)
Affects Version/s: (was: 0.10.0)
   Status: Open  (was: Patch Available)

comments on phabricator

> Support altering partition column type in Hive
> --
>
> Key: HIVE-3672
> URL: https://issues.apache.org/jira/browse/HIVE-3672
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, SQL
>Reporter: Jingwei Lu
>Assignee: Jingwei Lu
>  Labels: features
> Attachments: HIVE-3672.1.patch.txt, HIVE-3672.2.patch.txt, 
> HIVE-3672.3.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Currently, Hive does not allow altering partition column types.  As we've 
> discouraged users from using non-string partition column types, this presents 
> a problem for users who want to change their partition columns to be strings: 
> they have to rename their table, create a new table, and copy all the data 
> over.
> To support this via the CLI, adding a command like ALTER TABLE  
> PARTITION COLUMN ( );



[jira] [Commented] (HIVE-3948) avro_nullable_fields.q is failing in trunk

2013-01-26 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563530#comment-13563530
 ] 

Namit Jain commented on HIVE-3948:
--

Can you file a patch ?

> avro_nullable_fields.q is failing in trunk
> --
>
> Key: HIVE-3948
> URL: https://issues.apache.org/jira/browse/HIVE-3948
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>




[jira] [Updated] (HIVE-3527) Allow CREATE TABLE LIKE command to take TBLPROPERTIES

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3527:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Kevin

> Allow CREATE TABLE LIKE command to take TBLPROPERTIES
> -
>
> Key: HIVE-3527
> URL: https://issues.apache.org/jira/browse/HIVE-3527
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-3527.1.patch.txt, hive.3527.2.patch, 
> HIVE-3527.3.patch.txt, HIVE-3527.4.patch.txt, HIVE-3527.D5883.1.patch
>
>
> CREATE TABLE ... LIKE ... commands currently don't take TBLPROPERTIES.  I 
> think it would be a useful feature.



[jira] [Updated] (HIVE-3944) Make accept qfile argument for miniMR tests

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3944:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Navis

> Make accept qfile argument for miniMR tests
> ---
>
> Key: HIVE-3944
> URL: https://issues.apache.org/jira/browse/HIVE-3944
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Fix For: 0.11.0
>
> Attachments: HIVE-3944.D8175.1.patch
>
>
> Currently, the miniMR test runs all tests regardless of the qfile argument setting.



[jira] [Created] (HIVE-3948) avro_nullable_fields.q is failing in trunk

2013-01-26 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3948:


 Summary: avro_nullable_fields.q is failing in trunk
 Key: HIVE-3948
 URL: https://issues.apache.org/jira/browse/HIVE-3948
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain






[jira] [Commented] (HIVE-3527) Allow CREATE TABLE LIKE command to take TBLPROPERTIES

2013-01-26 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563403#comment-13563403
 ] 

Namit Jain commented on HIVE-3527:
--

Running tests

> Allow CREATE TABLE LIKE command to take TBLPROPERTIES
> -
>
> Key: HIVE-3527
> URL: https://issues.apache.org/jira/browse/HIVE-3527
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-3527.1.patch.txt, hive.3527.2.patch, 
> HIVE-3527.3.patch.txt, HIVE-3527.4.patch.txt, HIVE-3527.D5883.1.patch
>
>
> CREATE TABLE ... LIKE ... commands currently don't take TBLPROPERTIES.  I 
> think it would be a useful feature.



[jira] [Updated] (HIVE-3946) Make it possible to configure for each stage

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3946:
-

Status: Open  (was: Patch Available)

> Make it possible to configure for each stage
> 
>
> Key: HIVE-3946
> URL: https://issues.apache.org/jira/browse/HIVE-3946
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3946.D8181.1.patch
>
>
> Some MR related configurations like "mapred.reduce.tasks" or 
> "hive.exec.reducers.bytes.per.reducer" need to be configured for each stage.



[jira] [Updated] (HIVE-3873) lot of tests failing for hadoop 23

2013-01-26 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3873:
-

Status: Open  (was: Patch Available)

Tim, can you create a phabricator entry ?
Also, file a separate follow-up jira for the ones still remaining to be fixed.

> lot of tests failing for hadoop 23
> --
>
> Key: HIVE-3873
> URL: https://issues.apache.org/jira/browse/HIVE-3873
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Attachments: HIVE-3873.patch
>
>
> The following tests are failing on hadoop 23:
> [junit] Failed query: archive_excludeHadoop20.q
> [junit] Failed query: archive_multi.q
> [junit] Failed query: index_bitmap.q
> [junit] Failed query: join_filters_overlap.q
> [junit] Failed query: join_nullsafe.q
> [junit] Failed query: list_bucket_dml_6.q
> [junit] Failed query: list_bucket_dml_7.q
> [junit] Failed query: list_bucket_dml_8.q
> [junit] Failed query: list_bucket_query_oneskew_3.q
> [junit] Failed query: parenthesis_star_by.q
> [junit] Failed query: recursive_dir.q
> Some of them may be log updates.



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Status: Patch Available  (was: Open)

The tests passed

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Resolved] (HIVE-3825) Add Operator level Hooks

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-3825.
--

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed

Committed. Thanks Pamela

> Add Operator level Hooks
> 
>
> Key: HIVE-3825
> URL: https://issues.apache.org/jira/browse/HIVE-3825
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
> HIVE-3825.patch.10.txt, HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, 
> HIVE-3825.patch.6.txt, HIVE-3825.patch.7.txt, HIVE-3825.patch.8.txt, 
> HIVE-3825.patch.9.txt, HIVE-3825.txt
>
>




[jira] [Updated] (HIVE-3825) Add Operator level Hooks

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3825:
-

Attachment: HIVE-3825.patch.10.txt

> Add Operator level Hooks
> 
>
> Key: HIVE-3825
> URL: https://issues.apache.org/jira/browse/HIVE-3825
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
> HIVE-3825.patch.10.txt, HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, 
> HIVE-3825.patch.6.txt, HIVE-3825.patch.7.txt, HIVE-3825.patch.8.txt, 
> HIVE-3825.patch.9.txt, HIVE-3825.txt
>
>




[jira] [Updated] (HIVE-3943) Skewed query fails if hdfs path has special characters

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3943:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Tim

> Skewed query fails if hdfs path has special characters
> --
>
> Key: HIVE-3943
> URL: https://issues.apache.org/jira/browse/HIVE-3943
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Fix For: 0.11.0
>
> Attachments: HIVE-3943.patch
>
>
> If a partition name has a special character such as ':', the query fails with:
> FAILED: IllegalArgumentException Pathname /... from ... is not a valid DFS 
> filename.
> The root cause is that the skewed map in the partition metastore has an unescaped path.
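
As a rough illustration of the escaping this issue refers to, the Java sketch below percent-escapes characters such as ':' in a single path component. It is not Hive's escaping routine; the class name, the method, and the set of characters treated as unsafe are assumptions chosen for the example.

```java
// Hypothetical helper; Hive's own escaping routine may differ.
class PathEscapeSketch {

    // Illustrative set of characters to escape; an assumption, not Hive's list.
    private static final String UNSAFE = ":/=%#\\";

    // Replace each unsafe or control character with '%' plus two hex digits.
    static String escape(String component) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < component.length(); i++) {
            char c = component.charAt(i);
            if (c < 0x20 || UNSAFE.indexOf(c) >= 0) {
                sb.append('%').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A timestamp-like partition value containing ':'.
        System.out.println(escape("2013-01-25 10:30:00"));
    }
}
```

The point of the sketch is only that the same escaping has to be applied wherever the path is stored, so that a stored location and the directory on the file system agree.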



[jira] [Updated] (HIVE-3403) user should not specify mapjoin to perform sort-merge bucketed join

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3403:
-

Status: Open  (was: Patch Available)

waiting for HIVE-3784 first

> user should not specify mapjoin to perform sort-merge bucketed join
> ---
>
> Key: HIVE-3403
> URL: https://issues.apache.org/jira/browse/HIVE-3403
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3403.10.patch, hive.3403.11.patch, 
> hive.3403.12.patch, hive.3403.13.patch, hive.3403.14.patch, 
> hive.3403.15.patch, hive.3403.16.patch, hive.3403.17.patch, 
> hive.3403.1.patch, hive.3403.2.patch, hive.3403.3.patch, hive.3403.4.patch, 
> hive.3403.5.patch, hive.3403.6.patch, hive.3403.7.patch, hive.3403.8.patch, 
> hive.3403.9.patch
>
>
> Currently, in order to perform a sort merge bucketed join, the user needs
> to set hive.optimize.bucketmapjoin.sortedmerge to true, and also specify the 
> mapjoin hint.
> The user should not specify any hints.



[jira] [Commented] (HIVE-3944) Make accept qfile argument for miniMR tests

2013-01-25 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562681#comment-13562681
 ] 

Namit Jain commented on HIVE-3944:
--

+1

> Make accept qfile argument for miniMR tests
> ---
>
> Key: HIVE-3944
> URL: https://issues.apache.org/jira/browse/HIVE-3944
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3944.D8175.1.patch
>
>
> Currently, the miniMR test runs all tests regardless of the qfile argument setting.



[jira] [Updated] (HIVE-3933) StatsWork of QueryPlan can't be serialized correctly

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3933:
-

Status: Open  (was: Patch Available)

Please mark 'submit patch' after creating the phabricator entry

> StatsWork of QueryPlan can't be serialized correctly
> 
>
> Key: HIVE-3933
> URL: https://issues.apache.org/jira/browse/HIVE-3933
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: yi
>Priority: Minor
> Attachments: HIVE-3933.patch, HIVE-3933.patch
>
>
> QueryPlan is serialized using java.beans.XMLEncoder, but StatsWork of 
> QueryPlan does not follow JavaBean conventions, so it can't be serialized 
> correctly.
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:240)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1137)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:948)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
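
The following Java sketch shows, under stated assumptions, how java.beans.XMLEncoder can silently drop state from a class that does not follow JavaBean conventions. BeanWork and NonBeanWork are hypothetical stand-ins, not the Hive StatsWork or QueryPlan classes, and this is only one way such a failure could arise: a property with a getter but no setter is not written, so it comes back as null after decoding, which can later surface as a NullPointerException.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical stand-ins; these are not the Hive StatsWork or QueryPlan classes.
class XmlEncoderBeanSketch {

    // Follows JavaBean conventions: public no-arg constructor, getter and setter.
    public static class BeanWork {
        private String tableName;

        public BeanWork() { }

        public String getTableName() { return tableName; }

        public void setTableName(String tableName) { this.tableName = tableName; }
    }

    // Breaks the conventions: the value is set through a constructor and there
    // is no setter, so the encoder has no way to record it.
    public static class NonBeanWork {
        private String tableName;

        public NonBeanWork() { }

        public NonBeanWork(String tableName) { this.tableName = tableName; }

        public String getTableName() { return tableName; }
    }

    // Encode an object to XML in memory and decode it again.
    static Object roundTrip(Object o) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder enc = new XMLEncoder(out)) {
            enc.writeObject(o);
        }
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        // Pass the class loader explicitly so the decoder can find the classes.
        try (XMLDecoder dec = new XMLDecoder(in, null, null, o.getClass().getClassLoader())) {
            return dec.readObject();
        }
    }

    public static void main(String[] args) {
        BeanWork bean = new BeanWork();
        bean.setTableName("fact_daily");
        BeanWork beanCopy = (BeanWork) roundTrip(bean);
        NonBeanWork nonBeanCopy = (NonBeanWork) roundTrip(new NonBeanWork("fact_daily"));
        System.out.println("bean property after round trip: " + beanCopy.getTableName());
        System.out.println("non-bean property after round trip: " + nonBeanCopy.getTableName());
    }
}
```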



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.16.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3943) Skewed query fails if hdfs path has special characters

2013-01-25 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562653#comment-13562653
 ] 

Namit Jain commented on HIVE-3943:
--

+1

> Skewed query fails if hdfs path has special characters
> --
>
> Key: HIVE-3943
> URL: https://issues.apache.org/jira/browse/HIVE-3943
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3943.patch
>
>
> If a partition name has a special character such as ':', the query fails with:
> FAILED: IllegalArgumentException Pathname /... from ... is not a valid DFS 
> filename.
> The root cause is that the skewed map in the partition metastore has an unescaped path.
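
For illustration only, here is a minimal Java sketch of the kind of escaping the description above calls for: percent-encoding characters such as ':' before a partition name is used as a path component. The class name, the method name, and the exact set of characters treated as unsafe are assumptions made for this example; this is not the code from HIVE-3943.patch.

```java
// Hypothetical helper, not part of Hive: percent-encodes characters that
// are assumed to be unsafe inside a DFS path component.
final class PathEscapeSketch {
    private PathEscapeSketch() {}

    /** Returns the name with every unsafe character replaced by %XX (upper-case hex). */
    static String escapeComponent(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (isUnsafe(c)) {
                sb.append('%').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Assumed unsafe set: control characters plus a few punctuation marks,
    // including ':' (the character named in the report) and '%' itself.
    private static boolean isUnsafe(char c) {
        return c < 0x20 || c == 0x7F || "\"#%'*/:=?\\{[]^".indexOf(c) >= 0;
    }
}
```

With this sketch, a value such as 10:30 becomes 10%3A30, so the resulting path no longer contains a raw colon. Escaping '%' as well keeps the encoding reversible.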



[jira] [Updated] (HIVE-3825) Add Operator level Hooks

2013-01-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3825:
-

Attachment: HIVE-3825.patch.9.txt

> Add Operator level Hooks
> 
>
> Key: HIVE-3825
> URL: https://issues.apache.org/jira/browse/HIVE-3825
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
> HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, HIVE-3825.patch.6.txt, 
> HIVE-3825.patch.7.txt, HIVE-3825.patch.8.txt, HIVE-3825.patch.9.txt, 
> HIVE-3825.txt
>
>




[jira] [Commented] (HIVE-2273) IP data type

2013-01-25 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562551#comment-13562551
 ] 

Namit Jain commented on HIVE-2273:
--

[~franklinhu], are you still working on this ?

> IP data type
> 
>
> Key: HIVE-2273
> URL: https://issues.apache.org/jira/browse/HIVE-2273
> Project: Hive
>  Issue Type: New Feature
>Reporter: Franklin Hu
>Assignee: Franklin Hu
>Priority: Minor
> Attachments: hive-2273.1.patch, hive-2273.2.patch, hive-2273.3.patch
>
>
> new data type that supports both IPv4 and IPv6 in String and binary 
> serialized formats



[jira] [Updated] (HIVE-3933) StatsWork of QueryPlan can't be serialized correctly

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3933:
-

Status: Open  (was: Patch Available)

> StatsWork of QueryPlan can't be serialized correctly
> 
>
> Key: HIVE-3933
> URL: https://issues.apache.org/jira/browse/HIVE-3933
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: yi
>Priority: Minor
> Attachments: HIVE-3933.patch
>
>
> QueryPlan is serialized using java.beans.XMLEncoder, but StatsWork of 
> QueryPlan does not follow Java bean conventions, so it can't be serialized 
> correctly.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:240)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1137)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:948)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
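
As a hedged illustration of the Java bean point in the description above: java.beans.XMLEncoder, by default, persists only properties that have both a public getter and a public setter on a class with a public no-argument constructor. The sketch below uses a made-up WorkBean class (not Hive's StatsWork) with one bean-style property and one field that has no setter. It is only an assumption that the NullPointerException in the trace arises from this kind of lost state.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical example class, not part of Hive.
final class BeanRoundTripSketch {
    private BeanRoundTripSketch() {}

    /** A bean with one proper property (tableName) and one field without a setter (note). */
    public static class WorkBean {
        private String tableName;
        private String note;

        public WorkBean() {}

        public String getTableName() { return tableName; }
        public void setTableName(String tableName) { this.tableName = tableName; }

        public String getNote() { return note; }
        // Not a bean setter: the encoder does not treat "note" as a writable property.
        public void describe(String note) { this.note = note; }
    }

    /** Encodes the bean to XML in memory and decodes it again. */
    static WorkBean roundTrip(WorkBean in) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (XMLEncoder enc = new XMLEncoder(buf)) {
            enc.writeObject(in);
        }
        try (XMLDecoder dec = new XMLDecoder(new ByteArrayInputStream(buf.toByteArray()))) {
            return (WorkBean) dec.readObject();
        }
    }
}
```

After a round trip, tableName should survive while note should come back as null, because no setter exists to restore it. Code that later dereferences such a field would fail in the way the trace suggests.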



[jira] [Commented] (HIVE-3933) StatsWork of QueryPlan can't be serialized correctly

2013-01-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562486#comment-13562486
 ] 

Namit Jain commented on HIVE-3933:
--

The code changes look good.
Can you create a phabricator entry ?

1. There is no apache header for the test file.
2. There are a couple of warnings in the test file.

> StatsWork of QueryPlan can't be serialized correctly
> 
>
> Key: HIVE-3933
> URL: https://issues.apache.org/jira/browse/HIVE-3933
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: yi
>Priority: Minor
> Attachments: HIVE-3933.patch
>
>
> QueryPlan is serialized using java.beans.XMLEncoder, but StatsWork of 
> QueryPlan does not follow Java bean conventions, so it can't be serialized 
> correctly.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:240)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1137)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:948)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.15.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562443#comment-13562443
 ] 

Namit Jain commented on HIVE-3784:
--

Running tests after refreshing.

[~vinodkv], does it look OK for your usecase ?

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, 
> hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, 
> hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.14.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, 
> hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, 
> hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.13.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562374#comment-13562374
 ] 

Namit Jain commented on HIVE-3784:
--

Recently, in https://issues.apache.org/jira/browse/HIVE-3633, support was added 
for sub-query sort-merge joins, where joins could be performed across 
sub-queries, and each sub-query was transformed into a sort-merge join. 
This support is being removed; it will be added automatically as part of 
https://issues.apache.org/jira/browse/HIVE-3403


> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.12.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.11.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, 
> hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, 
> hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.10.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.10.patch, hive.3784.1.patch, 
> hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, 
> hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Commented] (HIVE-3672) Support altering partition column type in Hive

2013-01-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561538#comment-13561538
 ] 

Namit Jain commented on HIVE-3672:
--

[~jingweilu], is it ready for review ?
Please refresh, and mark 'Submit Patch' if it is ready for review.

> Support altering partition column type in Hive
> --
>
> Key: HIVE-3672
> URL: https://issues.apache.org/jira/browse/HIVE-3672
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, SQL
>Affects Versions: 0.10.0
>Reporter: Jingwei Lu
>Assignee: Jingwei Lu
> Attachments: HIVE-3672.1.patch.txt, HIVE-3672.2.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Currently, Hive does not allow altering partition column types.  As we've 
> discouraged users from using non-string partition column types, this presents 
> a problem for users who want to change their partition columns to be strings: 
> they have to rename their table, create a new table, and copy all the data 
> over.
> To support this via the CLI, adding a command like ALTER TABLE  
> PARTITION COLUMN ( );



[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561535#comment-13561535
 ] 

Namit Jain commented on HIVE-3833:
--

Yes, the tests passed for me for .23

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, 
> hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, 
> hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split 
> based on their serdes etc., and we don't allow changing the schema for 
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if 
> their partition schemas match exactly. The operator tree object inspectors 
> should be based on the partition schema. That would give greater flexibility 
> and also help in using a binary serde with RCFile.
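
The grouping rule proposed above (partitions share a split only when their schemas match exactly) can be sketched roughly as follows. This is an illustrative sketch under simplifying assumptions: a schema is modelled as a plain list of "name:type" strings, and the class and method names are invented for the example. It does not reflect how Hive's input formats or object inspectors are actually implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive.
final class SchemaGroupingSketch {
    private SchemaGroupingSketch() {}

    /**
     * Groups partition names by schema. Two partitions land in the same
     * group only if their column lists are equal element by element.
     */
    static Map<List<String>, List<String>> groupBySchema(Map<String, List<String>> partitionSchemas) {
        Map<List<String>, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : partitionSchemas.entrySet()) {
            groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return groups;
    }
}
```

Each resulting group corresponds to one candidate set of partitions that could be combined, and its key is the schema from which that group's object inspectors would be built.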



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561532#comment-13561532
 ] 

Namit Jain commented on HIVE-3784:
--

I got the above query working. The latest patch has the changes.
Will start cleaning up, and fixing the patch.

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.9.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.8.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, 
> hive.3784.8.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union  -> MapJoin
> MapJoin-> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think ?



[jira] [Updated] (HIVE-3927) Potential overflow with new RCFileCat column sizes options

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3927:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Kevin

> Potential overflow with new RCFileCat column sizes options
> --
>
> Key: HIVE-3927
> URL: https://issues.apache.org/jira/browse/HIVE-3927
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-3927.1.patch.txt
>
>
> The uncompressed/compressed sizes of columns may fit into ints for a single 
> block of an RC file, but the same does not hold when they are summed across 
> the file.  The array which aggregates this sum should be updated to be an 
> array of longs.
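
A small Java sketch of the overflow described above, assuming per-block column sizes arrive as int arrays. The class and method names are made up for this example and are not taken from RCFileCat or from HIVE-3927.1.patch.txt.

```java
// Hypothetical example, not part of Hive.
final class ColumnSizeSumSketch {
    private ColumnSizeSumSketch() {}

    /** Sums per-block sizes into an int[]; a total above Integer.MAX_VALUE silently wraps around. */
    static int[] sumAsInts(int[][] perBlockSizes, int numColumns) {
        int[] totals = new int[numColumns];
        for (int[] block : perBlockSizes) {
            for (int col = 0; col < numColumns; col++) {
                totals[col] += block[col];
            }
        }
        return totals;
    }

    /** Same aggregation with a long[] accumulator, which is wide enough for file-level totals. */
    static long[] sumAsLongs(int[][] perBlockSizes, int numColumns) {
        long[] totals = new long[numColumns];
        for (int[] block : perBlockSizes) {
            for (int col = 0; col < numColumns; col++) {
                totals[col] += block[col]; // each int is widened to long before the add
            }
        }
        return totals;
    }
}
```

For two blocks of 1,500,000,000 bytes each, the int accumulator wraps to a negative number, while the long accumulator holds 3,000,000,000.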



[jira] [Commented] (HIVE-3825) Add Operator level Hooks

2013-01-23 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561430#comment-13561430
 ] 

Namit Jain commented on HIVE-3825:
--

I have updated the patch file with the excludes in build.xml, etc.
You can take a look.

> Add Operator level Hooks
> 
>
> Key: HIVE-3825
> URL: https://issues.apache.org/jira/browse/HIVE-3825
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
> HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, HIVE-3825.patch.6.txt, 
> HIVE-3825.patch.7.txt, HIVE-3825.patch.8.txt, HIVE-3825.txt
>
>




[jira] [Updated] (HIVE-3825) Add Operator level Hooks

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3825:
-

Attachment: HIVE-3825.patch.8.txt

> Add Operator level Hooks
> 
>
> Key: HIVE-3825
> URL: https://issues.apache.org/jira/browse/HIVE-3825
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
> HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, HIVE-3825.patch.6.txt, 
> HIVE-3825.patch.7.txt, HIVE-3825.patch.8.txt, HIVE-3825.txt
>
>




[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3784:
-

Attachment: hive.3784.7.patch

> de-emphasize mapjoin hint
> -
>
> Key: HIVE-3784
> URL: https://issues.apache.org/jira/browse/HIVE-3784
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
> hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When the mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin hint currently is if they
> want the join to be converted to a bucketed mapjoin or a sort-merge bucketed
> mapjoin. Eventually, that should also go away, but that may take some time to
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union      -> MapJoin
> MapJoin    -> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think?



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: (was: hive.3833.23.patch)

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, 
> hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, 
> hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.
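
The grouping rule proposed above (partitions share a split only if their schemas match exactly) might be sketched as follows. This is an assumption-laden illustration, not Hive code: `PartitionInfo`, its fields, and `groupBySchema` are hypothetical, and a schema is modeled simply as an ordered list of "name:type" strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; these types are hypothetical stand-ins, not Hive classes.
final class SchemaGrouping {

    /** Hypothetical partition metadata: a name and an ordered list of "column:type" entries. */
    static final class PartitionInfo {
        final String name;
        final List<String> schema;

        PartitionInfo(String name, List<String> schema) {
            this.name = name;
            this.schema = schema;
        }
    }

    /**
     * Groups partition names by exact schema equality. Only partitions in the
     * same group would be eligible to share an input split under the proposal.
     */
    static Map<List<String>, List<String>> groupBySchema(List<PartitionInfo> partitions) {
        Map<List<String>, List<String>> groups = new LinkedHashMap<>();
        for (PartitionInfo p : partitions) {
            // List.equals/hashCode compare element by element, so this is an exact match.
            groups.computeIfAbsent(p.schema, k -> new ArrayList<>()).add(p.name);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<PartitionInfo> parts = Arrays.asList(
            new PartitionInfo("ds=1", Arrays.asList("key:int", "value:string")),
            new PartitionInfo("ds=2", Arrays.asList("key:int", "value:string")),
            new PartitionInfo("ds=3", Arrays.asList("key:bigint", "value:string")));
        // ds=1 and ds=2 land in one group; ds=3 gets its own.
        System.out.println(groupBySchema(parts).values());
    }
}
```

Each group would then initialize its object inspectors from the shared partition schema, which is the second half of the proposal.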



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: hive.3833.23.patch

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, 
> hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, 
> hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-23 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: hive.3833.23.patch

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, 
> hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, 
> hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: hive.3833.22.patch

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.2.patch, hive.3833.3.patch, 
> hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, 
> hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: (was: hive.3833.22.patch)

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, 
> hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, 
> hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: hive.3833.22.patch

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.2.patch, hive.3833.3.patch, 
> hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, 
> hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3833:
-

Attachment: hive.3833.21.patch

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, 
> hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, 
> hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Commented] (HIVE-3927) Potential overflow with new RCFileCat column sizes options

2013-01-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560376#comment-13560376
 ] 

Namit Jain commented on HIVE-3927:
--

+1

Running tests

> Potential overflow with new RCFileCat column sizes options
> --
>
> Key: HIVE-3927
> URL: https://issues.apache.org/jira/browse/HIVE-3927
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-3927.1.patch.txt
>
>
> The uncompressed/compressed sizes of columns may fit into ints for a single
> block of an RC file, but the same does not hold when they are summed across
> the file. The array which aggregates this sum should be updated to be an
> array of longs.



[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559745#comment-13559745
 ] 

Namit Jain commented on HIVE-3833:
--

The tests finished fine.

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, hive.3833.2.patch, 
> hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, 
> hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559572#comment-13559572
 ] 

Namit Jain commented on HIVE-3833:
--

bq. In case of identity converter, there is no conversion cost, but in case of 
non-identity this will be worse than current impl, since converter will examine 
every single column value, which wasn't the case earlier. However, it's not 
clear how expensive this would be?

The above is fairly difficult to address. In a follow-up, I can add a
serde-level property which indicates that the serde can handle different
datatypes (e.g., LazySimpleSerDe). If all the partitions of the table have
serdes with this property, then we can use identityConverter. This is kind of
hacky, and I am not sure it is useful, since it should not be a common case.
Usually, the partition schema should match the table schema.
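
A rough sketch of the follow-up idea above, under stated assumptions: `PartitionMeta`, the `serdeHandlesDifferentTypes` flag, and `canUseIdentityConverter` are hypothetical names, not existing Hive APIs. The schema-match branch is also an assumption, drawn from the remark that the partition schema usually matches the table schema.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; these types are hypothetical stand-ins, not Hive classes.
final class ConverterChoice {

    /** Hypothetical per-partition metadata: its schema and the proposed serde-level flag. */
    static final class PartitionMeta {
        final String schema;
        final boolean serdeHandlesDifferentTypes;

        PartitionMeta(String schema, boolean serdeHandlesDifferentTypes) {
            this.schema = schema;
            this.serdeHandlesDifferentTypes = serdeHandlesDifferentTypes;
        }
    }

    /**
     * Returns true when no per-value conversion would be needed: either every
     * partition schema equals the table schema (assumed common case), or every
     * partition's serde carries the proposed flag.
     */
    static boolean canUseIdentityConverter(String tableSchema, List<PartitionMeta> partitions) {
        boolean allSchemasMatch = true;
        boolean allSerdesFlexible = true;
        for (PartitionMeta p : partitions) {
            allSchemasMatch &= p.schema.equals(tableSchema);
            allSerdesFlexible &= p.serdeHandlesDifferentTypes;
        }
        return allSchemasMatch || allSerdesFlexible;
    }

    public static void main(String[] args) {
        List<PartitionMeta> mixed = Arrays.asList(
            new PartitionMeta("key:int", true),
            new PartitionMeta("key:string", false));
        // One serde lacks the flag and the schemas differ, so values must be converted.
        System.out.println(canUseIdentityConverter("key:int", mixed)); // prints false
    }
}
```

The check runs once per table rather than once per value, so it would not add to the per-column conversion cost raised in the quoted review comment.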

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, hive.3833.2.patch, 
> hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, 
> hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.



[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559568#comment-13559568
 ] 

Namit Jain commented on HIVE-3833:
--

Addressed the comments including the last one.



> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, hive.3833.2.patch, 
> hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, 
> hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split
> based on their serdes etc., and we don't allow the schema to be changed for
> LazyColumnarBinarySerDe.
> Instead, different partitions should be part of the same split only if
> their partition schemas match exactly. The operator tree object inspectors
> should be based on the partition schema. That would give greater flexibility
> and also help in using the binary serde with RCFile.


