[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system

2015-11-24 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025528#comment-15025528
 ] 

Jesus Camacho Rodriguez commented on HIVE-12508:


[~jpullokkaran], wouldn't it be safer to add this fix too until CALCITE-794 is 
fixed? Otherwise, I will close this as a duplicate.

> HiveAggregateJoinTransposeRule places a heavy load on the metadata system
> -
>
> Key: HIVE-12508
> URL: https://issues.apache.org/jira/browse/HIVE-12508
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12508.patch
>
>
> Finding out whether the input is already unique requires a call to 
> areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy 
> load on the metadata system. This can lead to long CBO planning times.
> This is a temporary fix that avoids the call to that method until then.





[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system

2015-11-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025581#comment-15025581
 ] 

Laljo John Pullokkaran commented on HIVE-12508:
---

Can we run into CALCITE-794 with the HIVE-12503 patch?
My understanding is that CALCITE-794 is not an issue with the metadata system 
itself; rather, it's an issue when a rule fires repeatedly on the same node.


> HiveAggregateJoinTransposeRule places a heavy load on the metadata system
> -
>
> Key: HIVE-12508
> URL: https://issues.apache.org/jira/browse/HIVE-12508
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12508.patch
>
>
> Finding out whether the input is already unique requires a call to 
> areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy 
> load on the metadata system. This can lead to long CBO planning times.
> This is a temporary fix that avoids the call to that method until then.





[jira] [Comment Edited] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025654#comment-15025654
 ] 

Sergey Shelukhin edited comment on HIVE-12341 at 11/24/15 11:11 PM:


Sorry, please diff revisions 2 and 5 on RB, 3 contains generated code and in 4 
I forgot a file 0_o


was (Author: sershe):
Sorry, please diff revisions 2 and -4- 5 on RB, 3 contains generated code and 
in 4 I forgot a file 0_o

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.patch
>
>






[jira] [Comment Edited] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025654#comment-15025654
 ] 

Sergey Shelukhin edited comment on HIVE-12341 at 11/24/15 11:11 PM:


Sorry, please diff revisions 2 and -4- 5 on RB, 3 contains generated code and 
in 4 I forgot a file 0_o


was (Author: sershe):
Sorry, please diff revisions 2 and 4 on RB, 3 contains generated code

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.patch
>
>






[jira] [Updated] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch

2015-11-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12307:
--
Attachment: HIVE-12307.2.patch

[~alangates], I uploaded a new patch with a refactored write() and using your 
wrapper/delegator idea, which makes things look much cleaner.

> Streaming API TransactionBatch.close() must abort any remaining transactions 
> in the batch
> -
>
> Key: HIVE-12307
> URL: https://issues.apache.org/jira/browse/HIVE-12307
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12307.2.patch, HIVE-12307.patch
>
>
> When the client of the TransactionBatch API encounters an error it must close() 
> the batch and start a new one.  This prevents attempts to continue writing to 
> a file that may be damaged in some way.
> The close() should abort any txns that still remain in the 
> batch and close (best effort) all the files it's writing to.  The batch 
> should also put itself into a mode where any future ops on this batch fail.
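
A minimal sketch of that contract, using the wrapper/delegator idea with a hypothetical Batch interface standing in for the real streaming TransactionBatch API:

{code}
// Hypothetical wrapper/delegator sketch: after close(), every op fails fast.
public class ClosedGuardBatch {
  interface Batch { // stand-in for the real TransactionBatch interface
    void write(byte[] record) throws Exception;
    void abort() throws Exception;
    void close() throws Exception;
  }

  static Batch guard(final Batch inner) {
    return new Batch() {
      private boolean closed = false;

      public void write(byte[] record) throws Exception {
        if (closed) throw new IllegalStateException("batch is closed");
        inner.write(record);
      }

      public void abort() throws Exception {
        if (closed) throw new IllegalStateException("batch is closed");
        inner.abort();
      }

      public void close() throws Exception {
        closed = true;   // any future ops on this batch now fail
        inner.abort();   // abort remaining txns first
        inner.close();   // then close files, best effort
      }
    };
  }
}
{code}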





[jira] [Commented] (HIVE-9642) Hive metastore client retries don't happen consistently for all api calls

2015-11-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025733#comment-15025733
 ] 

Thejas M Nair commented on HIVE-9642:
-

+1
Thanks for also adding the test case.

Note for others: this new patch also addresses the cases where the 
MetaStoreClient constructor has errors.


> Hive metastore client retries don't happen consistently for all api calls
> -
>
> Key: HIVE-9642
> URL: https://issues.apache.org/jira/browse/HIVE-9642
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Xiaobing Zhou
>Assignee: Daniel Dai
> Attachments: HIVE-9642.1.patch, HIVE-9642.2.patch, HIVE-9642.3.patch
>
>
> When org.apache.thrift.transport.TTransportException is thrown for issues 
> like socket timeout, the retry via RetryingMetaStoreClient happens only in 
> certain cases.
> Retry happens for the getDatabase call but not for getAllDatabases().
> The reason is RetryingMetaStoreClient checks for TTransportException being 
> the cause for InvocationTargetException. But in case of some calls such as 
> getAllDatabases in HiveMetastoreClient, all exceptions get wrapped in a 
> MetaException. We should remove this unnecessary wrapping of exceptions for 
> certain functions in HMC.
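
A simplified sketch of the retry pattern described above (not the actual RetryingMetaStoreClient code), showing why a MetaException wrapper defeats the cause check:

{code}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import org.apache.thrift.transport.TTransportException;

// Simplified retry handler: retry only when TTransportException is the
// direct cause of the reflective InvocationTargetException.
public class RetrySketch implements InvocationHandler {
  private final Object delegate;
  private final int maxRetries;

  RetrySketch(Object delegate, int maxRetries) {
    this.delegate = delegate;
    this.maxRetries = maxRetries;
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    for (int attempt = 0; ; attempt++) {
      try {
        return method.invoke(delegate, args);
      } catch (InvocationTargetException e) {
        // If a call wraps the transport error in a MetaException, this
        // cause check never matches and no retry happens; that is exactly
        // the inconsistency described above.
        if (attempt < maxRetries && e.getCause() instanceof TTransportException) {
          continue;
        }
        throw e.getCause();
      }
    }
  }
}
{code}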





[jira] [Commented] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025596#comment-15025596
 ] 

Sergey Shelukhin commented on HIVE-12020:
-

+1 pending tests... I didn't check; I assume none of the property files changed 
logically through both transitions (to XML and back).

> Revert log4j2 xml configuration to properties based configuration
> -
>
> Key: HIVE-12020
> URL: https://issues.apache.org/jira/browse/HIVE-12020
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch
>
>
> Log4j 2.4 release brought back properties based configuration. We should 
> revert XML based configuration and use properties based configuration instead 
> (less verbose and will be similar to old log4j properties). 





[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12341:

Attachment: HIVE-12341.03.patch

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.patch
>
>






[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC

2015-11-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025661#comment-15025661
 ] 

Siddharth Seth commented on HIVE-12510:
---

If the NDC is taking care of this, setting it in the thread name isn't 
required.

> LLAP: Append attempt id either to thread name or NDC
> 
>
> Key: HIVE-12510
> URL: https://issues.apache.org/jira/browse/HIVE-12510
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Currently, in LLAP the attempt id gets appended to the thread name and also 
> added to the NDC, creating long log lines like the one below:
> {code}
> [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]]
> {code}
> I think it will be sufficient to add it only to the NDC.
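
For reference, a minimal sketch of the NDC-only approach, assuming log4j2's ThreadContext stack (which backs the NDC) and a pattern layout containing %x:

{code}
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;

public class NdcSketch {
  private static final Logger LOG = LogManager.getLogger(NdcSketch.class);

  static void runTask(String attemptId, Runnable work) {
    ThreadContext.push(attemptId); // %x in the pattern prints this stack
    try {
      LOG.info("task started");    // the thread name no longer needs the id
      work.run();
    } finally {
      ThreadContext.pop();         // always clean up; executor threads are reused
    }
  }
}
{code}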





[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12341:

Attachment: HIVE-12341.03.patch

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.patch
>
>






[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025779#comment-15025779
 ] 

Prasanth Jayachandran commented on HIVE-12510:
--

Fixed in the .3 patch of HIVE-12020. The attempt id will only be set in the NDC now.

> LLAP: Append attempt id either to thread name or NDC
> 
>
> Key: HIVE-12510
> URL: https://issues.apache.org/jira/browse/HIVE-12510
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Currently, in LLAP the attempt id gets appended to the thread name and also 
> added to the NDC, creating long log lines like the one below:
> {code}
> [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]]
> {code}
> I think it will be sufficient to add it only to the NDC.





[jira] [Updated] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration

2015-11-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12020:
-
Attachment: HIVE-12020.3.patch

One more change related to LLAP: HIVE-12510 is addressed in this patch. Removed 
the attempt id from the TezTaskRunner thread name.

> Revert log4j2 xml configuration to properties based configuration
> -
>
> Key: HIVE-12020
> URL: https://issues.apache.org/jira/browse/HIVE-12020
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch, 
> HIVE-12020.3.patch
>
>
> Log4j 2.4 release brought back properties based configuration. We should 
> revert XML based configuration and use properties based configuration instead 
> (less verbose and will be similar to old log4j properties). 





[jira] [Updated] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC

2015-11-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12510:
-
Fix Version/s: 2.0.0

> LLAP: Append attempt id either to thread name or NDC
> 
>
> Key: HIVE-12510
> URL: https://issues.apache.org/jira/browse/HIVE-12510
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
>
> Currently, in LLAP the attempt id gets appended to the thread name and also 
> added to the NDC, creating long log lines like the one below:
> {code}
> [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]]
> {code}
> I think it will be sufficient to add it only to the NDC.





[jira] [Commented] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025612#comment-15025612
 ] 

Prasanth Jayachandran commented on HIVE-12020:
--

Let me list the changes:
1) DataNucleus related logging has changed. Earlier, many specific DataNucleus 
loggers were explicitly added. In this patch, all top-level DataNucleus 
loggers are added (no need for specific loggers). Discussed this with Sushanth 
and he said the new change is good. If we want specific logger changes we 
can do it later.
2) In the log4j2.xml files, I had mistakenly added %x (which is used by the NDC) 
to every pattern layout. I don't think anything other than llap uses the NDC, so 
%x is added only to the llap properties.
3) Log4j version updated to 2.4.1 to work around an NPE with empty loggers.
4) HiveEventCounter has been removed from the root logger configuration. It was 
added by default and I don't think it is of much significance. It publishes the 
count of messages logged at different log levels to hadoop metrics, but I don't 
see any configuration for hadoop-metrics in the hive source. If required, this 
can also be added back.

Other than these changes, it's pretty much a one-to-one copy from 
log4j2.xml.
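
For illustration, a hypothetical fragment of a llap log4j2.properties showing point 2, with %x carrying the NDC in the pattern (appender name and layout are examples, not the exact patch contents):

{code}
appenders = console

appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
# %x prints the NDC stack (the LLAP attempt id); only llap patterns include it
appender.console.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m %x%n
{code}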

> Revert log4j2 xml configuration to properties based configuration
> -
>
> Key: HIVE-12020
> URL: https://issues.apache.org/jira/browse/HIVE-12020
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch
>
>
> Log4j 2.4 release brought back properties based configuration. We should 
> revert XML based configuration and use properties based configuration instead 
> (less verbose and will be similar to old log4j properties). 





[jira] [Commented] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025654#comment-15025654
 ] 

Sergey Shelukhin commented on HIVE-12341:
-

Sorry, please diff revisions 2 and 4 on RB, 3 contains generated code

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.patch
>
>






[jira] [Commented] (HIVE-12513) Change LlapTokenIdentifier to use protbuf

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025716#comment-15025716
 ] 

Sergey Shelukhin commented on HIVE-12513:
-

Token is part of Hadoop security and is Writable. What we have is a token 
identifier; that also has to be writable so that Hadoop security can write 
it, since all the basic parts of TokenIdentifier we inherit from the delegation 
token identifier. We can use protobuf for our own fields and just write the bytes 
into the writable (currently we add nothing to the basic token though), but the 
basic token and the superclass identifier have to stay writable.

> Change LlapTokenIdentifier to use protbuf
> -
>
> Key: HIVE-12513
> URL: https://issues.apache.org/jira/browse/HIVE-12513
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Siddharth Seth
>
> Follow up to HIVE-12341. Currently writable, which can get in the way of 
> upgrades.





[jira] [Updated] (HIVE-12513) Change LlapTokenIdentifier to use protobuf

2015-11-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12513:
--
Summary: Change LlapTokenIdentifier to use protobuf  (was: Change 
LlapTokenIdentifier to use protbuf)

> Change LlapTokenIdentifier to use protobuf
> --
>
> Key: HIVE-12513
> URL: https://issues.apache.org/jira/browse/HIVE-12513
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Siddharth Seth
>
> Follow up to HIVE-12341. Currently writable, which can get in the way of 
> upgrades.





[jira] [Commented] (HIVE-12513) Change LlapTokenIdentifier to use protobuf

2015-11-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025736#comment-15025736
 ] 

Siddharth Seth commented on HIVE-12513:
---

Writable just requires bytes to be written and read back in. A protobuf 
instance wrapped in a writable can be used (this is done in Hadoop to allow for 
changes to the token across versions). Essentially, the serialized bytes end up 
being interpreted as protobuf, with support for unknown fields etc.
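
A hedged sketch of that wrapping, assuming a hypothetical ProtobufWritable and a protobuf message serialized elsewhere with toByteArray():

{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// The Writable contract is satisfied by length-prefixing opaque protobuf
// bytes; the payload itself can evolve (unknown fields are preserved).
public class ProtobufWritable implements Writable {
  private byte[] payload; // e.g. from someProtoMessage.toByteArray()

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(payload.length);
    out.write(payload);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    payload = new byte[in.readInt()];
    in.readFully(payload); // parse later with YourProto.parseFrom(payload)
  }
}
{code}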

> Change LlapTokenIdentifier to use protobuf
> --
>
> Key: HIVE-12513
> URL: https://issues.apache.org/jira/browse/HIVE-12513
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Security
>Reporter: Siddharth Seth
>
> Follow up to HIVE-12341. Currently writable, which can get in the way of 
> upgrades.





[jira] [Updated] (HIVE-12512) Include driver logs in execution-level Operation logs

2015-11-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-12512:
---
Attachment: HIVE-12512.patch

> Include driver logs in execution-level Operation logs
> -
>
> Key: HIVE-12512
> URL: https://issues.apache.org/jira/browse/HIVE-12512
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Minor
> Attachments: HIVE-12512.patch
>
>
> When {{hive.server2.logging.operation.level}} is set to {{EXECUTION}} 
> (default), operation logs do not include Driver logs, which contain useful 
> info like the total number of jobs launched, the stage being executed, etc., 
> that helps track high-level progress. Including them only adds a few more 
> lines to the output.
> {code}
> 15/11/24 14:09:12 INFO ql.Driver: Semantic Analysis Completed
> 15/11/24 14:09:12 INFO ql.Driver: Starting 
> command(queryId=hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1): 
> select count(*) from sample_08
> 15/11/24 14:09:12 INFO ql.Driver: Query ID = 
> hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1
> 15/11/24 14:09:12 INFO ql.Driver: Total jobs = 1
> ...
> 15/11/24 14:09:40 INFO ql.Driver: MapReduce Jobs Launched:
> 15/11/24 14:09:40 INFO ql.Driver: Stage-Stage-1: Map: 1  Reduce: 1   
> Cumulative CPU: 3.58 sec   HDFS Read: 52956 HDFS Write: 4 SUCCESS
> 15/11/24 14:09:40 INFO ql.Driver: Total MapReduce CPU Time Spent: 3 seconds 
> 580 msec
> 15/11/24 14:09:40 INFO ql.Driver: OK
> {code}





[jira] [Updated] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-11-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11878:
--
Attachment: HIVE-11878.2.patch

Latest patch (HIVE-11878_approach3_with_review_comments1.patch) was not in the 
correct name format to kick off the pre-commit tests.  Re-uploading it as 
HIVE-11878.2.patch.

> ClassNotFoundException can possibly  occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878 ClassLoader Issues when Registering 
> Jars.pptx, HIVE-11878.2.patch, HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_approach3_per_session_clasloader.patch, 
> HIVE-11878_approach3_with_review_comments.patch, 
> HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console. Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classloader is the 
> current ThreadContextClassLoader. Once the URLClassloader is created Hive 
> sets that as the current ThreadContextClassloader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a CNF exception.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class 
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, 
> the URLClassLoader *u1* is created and also set as the 
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader 
> created will be *u2* with *u1* as parent and *u2* becomes the new 
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* 
> whereas *u1* only has paths to *j1* (For details see: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load 
> the class using {code} Class.forName("c1", true, 
> Thread.currentThread().getContextClassLoader()) {code}. The 
> currentThreadContext class-loader is *u2*, and it has the path to the class 
> *c1*, but note that Class-loaders work by delegating to parent class-loader 
> first. In this case class *c1* will be found and *defined* by class-loader 
> *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
> initialize) is called in *c1*, which references the class *c2*, *c2* will not 
> be found since the class-loader used to search for *c2* will be *u1* (Since 
> the caller's class-loader is used to load a class)
> I've added a qtest to explain the problem. Please see the attached patch.
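
A toy sketch of the delegation behavior described above (jar paths and class names are hypothetical, and the jars are assumed to exist):

{code}
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo {
  public static void main(String[] args) throws Exception {
    ClassLoader base = Thread.currentThread().getContextClassLoader();
    URL j1 = new URL("file:///tmp/j1.jar"); // contains c1
    URL j2 = new URL("file:///tmp/j2.jar"); // contains c2

    URLClassLoader u1 = new URLClassLoader(new URL[] { j1 }, base);   // sees only j1
    URLClassLoader u2 = new URLClassLoader(new URL[] { j1, j2 }, u1); // sees j1 and j2

    // Parent-first delegation: u2 asks u1 first, and u1 can define c1 itself.
    Class<?> c1 = Class.forName("c1", true, u2);
    System.out.println(c1.getClassLoader() == u1); // true

    // From inside c1, a reference to c2 resolves via u1, which cannot see
    // j2, so it fails with ClassNotFoundException.
  }
}
{code}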





[jira] [Updated] (HIVE-12514) Setup renewal of LLAP tokens

2015-11-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12514:
--
Assignee: (was: Thejas M Nair)

> Setup renewal of LLAP tokens
> 
>
> Key: HIVE-12514
> URL: https://issues.apache.org/jira/browse/HIVE-12514
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Security
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>






[jira] [Updated] (HIVE-11890) Create ORC module

2015-11-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-11890:
-
Attachment: HIVE-11890.patch

This patch is rebased to master.

It does the following:
* creates an orc submodule
* moves a couple more classes to the storage-api module
* moves most of the api and utility classes to the orc module

> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.





[jira] [Updated] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration

2015-11-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12020:
-
Attachment: HIVE-12020.2.patch

This patch includes changes for llap. Also updated log4j2 version from 2.4 to 
2.4.1 as we hit this issue https://issues.apache.org/jira/browse/LOG4J2-1153

> Revert log4j2 xml configuration to properties based configuration
> -
>
> Key: HIVE-12020
> URL: https://issues.apache.org/jira/browse/HIVE-12020
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch
>
>
> Log4j 2.4 release brought back properties based configuration. We should 
> revert XML based configuration and use properties based configuration instead 
> (less verbose and will be similar to old log4j properties). 





[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12341:

Attachment: HIVE-12341.02.patch

Addressed the RB comments. [~sseth], can you take a look at the Tez credentials 
transfer in particular - does that change make sense? I will set up a test for 
it later if all looks good.

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.patch
>
>






[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025745#comment-15025745
 ] 

Xuefu Zhang commented on HIVE-12184:


Patch looks good. A few minor comments on RB.

> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use
> ---
>
> Key: HIVE-12184
> URL: https://issues.apache.org/jira/browse/HIVE-12184
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, 
> HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, 
> HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.patch
>
>
> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use.
> Repro:
> {code}
> 0: jdbc:hive2://localhost:1/default> create database foo;
> No rows affected (0.116 seconds)
> 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int);
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> +-----------+------------+----------+--+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+--+
> | i         | int        |          |
> +-----------+------------+----------+--+
> 1 row selected (0.049 seconds)
> 0: jdbc:hive2://localhost:1/default> use foo;
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field foo (state=08S01,code=1)
> {code}





[jira] [Commented] (HIVE-12487) Fix broken MiniLlap tests

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025831#comment-15025831
 ] 

Hive QA commented on HIVE-12487:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773865/HIVE-12487.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9864 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMarkPartition - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_temp_table_gb1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union31
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union32
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_short_regress
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6118/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6118/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6118/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773865 - PreCommit-HIVE-TRUNK-Build

> Fix broken MiniLlap tests
> -
>
> Key: HIVE-12487
> URL: https://issues.apache.org/jira/browse/HIVE-12487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Aleksei Statkevich
>Assignee: Aleksei Statkevich
>Priority: Critical
> Attachments: HIVE-12487.1.patch, HIVE-12487.2.patch, HIVE-12487.patch
>
>
> Currently MiniLlap tests fail with the following error:
> {code}
> TestMiniLlapCliDriver - did not produce a TEST-*.xml file
> {code}
> Supposedly, it started happening after HIVE-12319.





[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025848#comment-15025848
 ] 

Hive QA commented on HIVE-12483:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773549/HIVE-12483.1-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9788 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1013/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1013/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1013/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773549 - PreCommit-HIVE-SPARK-Build

> Fix precommit Spark test branch
> ---
>
> Key: HIVE-12483
> URL: https://issues.apache.org/jira/browse/HIVE-12483
> Project: Hive
>  Issue Type: Task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12483.1-spark.patch
>
>






[jira] [Commented] (HIVE-11488) Add sessionId and queryId info to HS2 log

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025855#comment-15025855
 ] 

Lefty Leverenz commented on HIVE-11488:
---

Edit permission is easy to get, you just need a Confluence username:

* [About This Wiki -- How to get permission to edit | 
https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit]

> Add sessionId and queryId info to HS2 log
> -
>
> Key: HIVE-11488
> URL: https://issues.apache.org/jira/browse/HIVE-11488
> Project: Hive
>  Issue Type: New Feature
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11488.2.patch, HIVE-11488.3.patch, HIVE-11488.patch
>
>
> Session is critical for a multi-user system like Hive. Currently Hive doesn't 
> log the sessionId to the log file, which sometimes makes debugging and analysis 
> difficult when multiple activities are going on at the same time and the logs 
> from different sessions are mixed together.
> Currently, Hive already has the sessionId saved in SessionState, and there is 
> also another sessionId in SessionHandle (it seems unused and I'm still 
> looking to understand it). Generally we should have one sessionId from the 
> beginning on both the client side and the server side; it seems we have some 
> work to do there first.
> The sessionId can then be added to log4j's mapped diagnostic context 
> (MDC) and configured to be output to the log file through the log4j properties. 
> MDC is per thread, so we need to add the sessionId to the HS2 main thread and 
> then it will be inherited by the child threads. 
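
As a minimal sketch, assuming log4j2's ThreadContext map (its MDC) and a pattern layout that references %X{sessionId}:

{code}
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;

public class MdcSketch {
  private static final Logger LOG = LogManager.getLogger(MdcSketch.class);

  static void handleSession(String sessionId) {
    // Set once on the HS2 handler thread; child-thread inheritance needs
    // the log4j2.isThreadContextMapInheritable system property set to true.
    ThreadContext.put("sessionId", sessionId);
    try {
      LOG.info("query started"); // %X{sessionId} stamps every log line
    } finally {
      ThreadContext.remove("sessionId");
    }
  }
}
{code}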





[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025973#comment-15025973
 ] 

Lefty Leverenz commented on HIVE-12175:
---

Sounds good to me.  Adding a TODOC2.0 label.

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of kryo (2.22) has an issue (see the exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and the 
> kryo community recommends using DefaultInstantiatorStrategy with fallback to 
> StdInstantiatorStrategy. More discussion about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe-based serialization which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}
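
For reference, a sketch of the recommended setup with the kryo 3.x API (Hive shades kryo under org.apache.hive.com.esotericsoftware, so the actual imports differ):

{code}
import com.esotericsoftware.kryo.Kryo;
import org.objenesis.strategy.StdInstantiatorStrategy;

public class KryoSetup {
  static Kryo newKryo() {
    Kryo kryo = new Kryo();
    // Try the no-arg constructor first; fall back to Objenesis for classes
    // without one, such as the list returned by Arrays.asList().
    kryo.setInstantiatorStrategy(
        new Kryo.DefaultInstantiatorStrategy(new StdInstantiatorStrategy()));
    return kryo;
  }
}
{code}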





[jira] [Updated] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12175:
--
Labels: TODOC2.0  (was: )

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of kryo (2.22) has an issue (see the exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and the 
> kryo community recommends using DefaultInstantiatorStrategy with fallback to 
> StdInstantiatorStrategy. More discussion about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe-based serialization which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}





[jira] [Updated] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2015-11-24 Thread Hui Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Zheng updated HIVE-11531:
-
Attachment: HIVE-11531.02.patch

Thanks [~sershe] and [~jcamachorodriguez], I updated the patch.

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
> Attachments: HIVE-11531.02.patch, HIVE-11531.WIP.1.patch, 
> HIVE-11531.WIP.2.patch, HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first-class support for 
> "skip" to the existing LIMIT, or improve ROW_NUMBER for better performance.





[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025936#comment-15025936
 ] 

Lefty Leverenz commented on HIVE-12175:
---

Should [~prasanth_j]'s explanation be documented in the wiki?  Or does this 
need any other documentation?

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of kryo (2.22) has an issue (see the exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and the 
> kryo community recommends using DefaultInstantiatorStrategy with fallback to 
> StdInstantiatorStrategy. More discussion about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe-based serialization which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}





[jira] [Commented] (HIVE-1073) CREATE VIEW followup: track view dependency information in metastore

2015-11-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025987#comment-15025987
 ] 

Carl Steinbach commented on HIVE-1073:
--

Hi [~freepeter], thanks for writing up these notes!

bq. To track the view dependency, I will add a new class MTableDependency (name 
TBD) which contains srcTbl and dstTbl.

Since only views can have dependencies on other tables/views it probably makes 
sense to change the name to MViewDependency, and replace srcTbl with srcView.

> CREATE VIEW followup:  track view dependency information in metastore
> -
>
> Key: HIVE-1073
> URL: https://issues.apache.org/jira/browse/HIVE-1073
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Views
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: Wenlei Xie
>
> Add a generic mechanism for recording the fact that one object depends on 
> another.  First use case (to be implemented as part of this task) would be 
> views depending on tables or other views, but in the future we can also use 
> this for views depending on persistent functions, functions depending on 
> other functions, etc.
> This involves metastore modeling, QL analysis for deriving and recording the 
> dependencies (Ashish says something may already be available from the lineage 
> work), and CLI support for browsing dependencies.





[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025999#comment-15025999
 ] 

Chengxiang Li commented on HIVE-12466:
--

SparkCounters is only used for stats collection now, so yes, I think we may not 
need SparkCounters anymore if counter-based stats collection is removed. As far 
as I know, there are no other Hive features that depend on SparkCounters.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, lots of the following error found in executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}





[jira] [Updated] (HIVE-12411) Remove counter based stats collection mechanism

2015-11-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12411:
--
Labels: TODOC2.0  (was: )

> Remove counter based stats collection mechanism
> ---
>
> Key: HIVE-12411
> URL: https://issues.apache.org/jira/browse/HIVE-12411
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the jdbc and hbase stats 
> collection mechanisms. Now we are targeting the counter based stats collection 
> mechanism. The main reasons are as follows: (1) counter based stats have a 
> limitation on the length of the counter itself; if it is too long, MD5 will 
> be applied. (2) when there are a large number of partitions and columns, we 
> need to create a large number of counters in memory. This puts a heavy 
> load on the M/R AM or Tez AM etc. FS based stats will do a better job.





[jira] [Commented] (HIVE-12411) Remove counter based stats collection mechanism

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025967#comment-15025967
 ] 

Lefty Leverenz commented on HIVE-12411:
---

Doc note:  This changes *hive.stats.dbclass* (removing counter as a value) and 
removes *hive.stats.key.prefix.reserve.length* so the wiki needs to be updated 
for release 2.0.0.

* [Configuration Properties -- hive.stats.dbclass | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.dbclass]
* [Configuration Properties -- hive.stats.key.prefix.reserve.length | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.key.prefix.reserve.length]

The Statistics doc does not mention counter-based stats so no update is 
required, although an explanation of collection mechanisms would be a helpful 
addition.   *hive.stats.dbclass* is discussed in the Usage section.

* [Statistics in Hive | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev]
** [Implementation | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Implementation]
** [Usage | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Usage]

> Remove counter based stats collection mechanism
> ---
>
> Key: HIVE-12411
> URL: https://issues.apache.org/jira/browse/HIVE-12411
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the jdbc and hbase stats 
> collection mechanisms. Now we are targeting the counter based stats collection 
> mechanism. The main reasons are as follows: (1) counter based stats have a 
> limitation on the length of the counter itself; if it is too long, MD5 will 
> be applied. (2) when there are a large number of partitions and columns, we 
> need to create a large number of counters in memory. This puts a heavy 
> load on the M/R AM or Tez AM etc. FS based stats will do a better job.





[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025953#comment-15025953
 ] 

Prasanth Jayachandran commented on HIVE-12175:
--

Maybe we should list all the custom serializers that hive uses in the 
documentation and provide a note to users saying that if any other serializer is 
required at runtime, a runtime exception might be thrown on failure of object 
creation.

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of kryo (2.22) has an issue (see the exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and the 
> kryo community recommends using DefaultInstantiatorStrategy with fallback to 
> StdInstantiatorStrategy. More discussion about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe-based serialization which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}





[jira] [Updated] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce sorting insert

2015-11-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-4240:
-
Labels: TODOC11  (was: )

> optimize hive.enforce.bucketing and hive.enforce sorting insert
> ---
>
> Key: HIVE-4240
> URL: https://issues.apache.org/jira/browse/HIVE-4240
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
>  Labels: TODOC11
> Fix For: 0.11.0
>
> Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch, 
> hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat
>
>
> Consider the following scenario:
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.reducers.max = 1;
> set hive.merge.mapfiles=false;
> set hive.merge.mapredfiles=false;
> -- Create two bucketed and sorted tables
> CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING) 
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING) 
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> FROM src
> INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;
> -- Insert data into the bucketed table by selecting from another bucketed 
> table
> -- This should be a map-only operation
> INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
> SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
> We should not need a reducer to perform the above operation.





[jira] [Updated] (HIVE-11107) Support for Performance regression test suite with TPCDS

2015-11-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11107:
-
Attachment: HIVE-11107.3.patch

> Support for Performance regression test suite with TPCDS
> 
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
>  Issue Type: Task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, 
> HIVE-11107.3.patch
>
>
> Support to add TPCDS queries to the performance regression test suite with 
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the 
> optimizer or any Hive code do not yield unexpected plan changes, i.e., the 
> intention is not to run the entire TPCDS query set, but just "explain 
> plan" for the TPCDS queries.
> As part of this jira, we will manually verify that expected hive 
> optimizations kick in for the queries (for given stats/dataset). If there is 
> a difference in plan within this test suite due to a future commit, it needs 
> to be analyzed and we need to make sure that it is not a regression.
> The test suite can be run on the master branch from itests by:
> {code}
> mvn test -Dtest=TestPerfCliDriver -Phadoop-2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off

2015-11-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025977#comment-15025977
 ] 

Ashutosh Chauhan commented on HIVE-12491:
-

I guess what Gopal is pointing out is that the multiple-PK case is missing, which 
might help this use case (as demonstrated in his WIP patch).
The other thing is that we failed to recognize that, out of the 3 columns, two are 
different UDFs on the same column, so we incorrectly computed the denom for them. 
Ideally, we need to fix both, but doing at least one of these two will help.
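
A minimal sketch of the second idea (a hypothetical helper, not from the WIP 
patch): collapse duplicate NDVs that come from the same underlying column before 
easing the denominator.
{code}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NdvDedup {
  // Keep one NDV per source column, so that year(c) and month(c) over the
  // same column c are counted once in the denominator.
  static List<Long> dedupBySourceColumn(List<Long> ndvs, List<String> srcColumns) {
    Set<String> seen = new HashSet<String>();
    List<Long> result = new ArrayList<Long>();
    for (int i = 0; i < ndvs.size(); i++) {
      if (seen.add(srcColumns.get(i))) {   // first NDV wins per source column
        result.add(ndvs.get(i));
      }
    }
    return result;
  }
}
{code}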

> Column Statistics: 3 attribute join on a 2-source table is off
> --
>
> Key: HIVE-12491
> URL: https://issues.apache.org/jira/browse/HIVE-12491
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12491.WIP.patch
>
>
> The eased out denominator has to detect duplicate row-stats from different 
> attributes.
> {code}
>   private Long getEasedOutDenominator(List<Long> distinctVals) {
>     // Exponential back-off for NDVs.
>     // 1) Descending order sort of NDVs
>     // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4)) * ...
>     Collections.sort(distinctVals, Collections.reverseOrder());
>     long denom = distinctVals.get(0);
>     for (int i = 1; i < distinctVals.size(); i++) {
>       denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i)));
>     }
>     return denom;
>   }
> {code}
> This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns, 2 
> of which are derived from the same column.
> {code}
> Reduce Output Operator (RS_12)
>   key expressions: _col0 (type: bigint), year(_col2) (type: int), 
> month(_col2) (type: int)
>   sort order: +++
>   Map-reduce partition columns: _col0 (type: bigint), year(_col2) 
> (type: int), month(_col2) (type: int)
>   value expressions: _col1 (type: bigint)
>   Join Operator (JOIN_13)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) 
> (type: int)
>   1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) 
> (type: int)
> outputColumnNames: _col3
> {code}
> So the eased out denominator is off by a factor of 30,000 or so, causing OOMs 
> in map-joins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12329) Turn on limit pushdown optimization by default

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025926#comment-15025926
 ] 

Prasanth Jayachandran commented on HIVE-12329:
--

cp_sel.q.out - I am guessing order is not guaranteed for limit pushdown, and 
that's why the change?
insert_into3.q.out - Any idea why a new map task is introduced?

Other than these LGTM, +1

> Turn on limit pushdown optimization by default
> --
>
> Key: HIVE-12329
> URL: https://issues.apache.org/jira/browse/HIVE-12329
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12329.2.patch, HIVE-12329.patch
>
>
> Whenever applicable, this will always help, so this should be on by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off

2015-11-24 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025956#comment-15025956
 ] 

Pengcheng Xiong commented on HIVE-12491:


PK-FK inference in StatsRuleProcFactory is not limited to a single PK and a 
single FK; it is limited to a single PK only. That is, we allow a single PK and 
multiple FKs. In the single-PK, multiple-FK case, we first use the PK-FK 
relationship to estimate the row count, NDV, etc., and then join with the other 
FKs without PK-FK inference. Hope it helps.

> Column Statistics: 3 attribute join on a 2-source table is off
> --
>
> Key: HIVE-12491
> URL: https://issues.apache.org/jira/browse/HIVE-12491
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12491.WIP.patch
>
>
> The eased out denominator has to detect duplicate row-stats from different 
> attributes.
> {code}
>   private Long getEasedOutDenominator(List<Long> distinctVals) {
>     // Exponential back-off for NDVs.
>     // 1) Descending order sort of NDVs
>     // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4)) * ...
>     Collections.sort(distinctVals, Collections.reverseOrder());
>     long denom = distinctVals.get(0);
>     for (int i = 1; i < distinctVals.size(); i++) {
>       denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i)));
>     }
>     return denom;
>   }
> {code}
> This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns, 2 
> of which are derived from the same column.
> {code}
> Reduce Output Operator (RS_12)
>   key expressions: _col0 (type: bigint), year(_col2) (type: int), 
> month(_col2) (type: int)
>   sort order: +++
>   Map-reduce partition columns: _col0 (type: bigint), year(_col2) 
> (type: int), month(_col2) (type: int)
>   value expressions: _col1 (type: bigint)
>   Join Operator (JOIN_13)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) 
> (type: int)
>   1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) 
> (type: int)
> outputColumnNames: _col3
> {code}
> So the eased out denominator is off by a factor of 30,000 or so, causing OOMs 
> in map-joins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2015-11-24 Thread Hui Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026010#comment-15026010
 ] 

Hui Zheng commented on HIVE-11531:
--

Thanks [~jcamachorodriguez]
I have implemented 
{code}
LIMIT n OFFSET skip
{code}
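
For illustration, a sketch of how a UI might page through results with this 
syntax over JDBC (the connection URL and table name are assumptions):
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PaginationDemo {
  public static void main(String[] args) throws Exception {
    int pageSize = 50;
    int page = 3;
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement();
         // Fetch one page; OFFSET skips the rows of the earlier pages.
         ResultSet rs = stmt.executeQuery(
             "SELECT key, value FROM src ORDER BY key"
                 + " LIMIT " + pageSize + " OFFSET " + (page * pageSize))) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + "\t" + rs.getString(2));
      }
    }
  }
}
{code}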


> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
> Attachments: HIVE-11531.02.patch, HIVE-11531.WIP.1.patch, 
> HIVE-11531.WIP.2.patch, HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first class support for 
> "skip" to existing limit, or improve ROW_NUMBER for better performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026017#comment-15026017
 ] 

Rui Li commented on HIVE-12466:
---

If the Spark counter is removed, does HoS support other methods of collecting 
stats, such as fs-based?

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error appear in the executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce sorting insert

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026021#comment-15026021
 ] 

Lefty Leverenz commented on HIVE-4240:
--

Doc note:  This added configuration parameter *hive.optimize.bucketingsorting* 
to HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

HIVE-12331 changes the description of *hive.optimize.bucketingsorting* in 
release 2.0.0.

> optimize hive.enforce.bucketing and hive.enforce sorting insert
> ---
>
> Key: HIVE-4240
> URL: https://issues.apache.org/jira/browse/HIVE-4240
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
>  Labels: TODOC11
> Fix For: 0.11.0
>
> Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch, 
> hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat
>
>
> Consider the following scenario:
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.reducers.max = 1;
> set hive.merge.mapfiles=false;
> set hive.merge.mapredfiles=false;
> -- Create two bucketed and sorted tables
> CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING) 
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING) 
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> FROM src
> INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;
> -- Insert data into the bucketed table by selecting from another bucketed 
> table
> -- This should be a map-only operation
> INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
> SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
> We should not need a reducer to perform the above operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs

2015-11-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12331:
--
Labels: TODOC2.0  (was: )

> Remove hive.enforce.bucketing & hive.enforce.sorting configs
> 
>
> Key: HIVE-12331
> URL: https://issues.apache.org/jira/browse/HIVE-12331
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12331.1.patch, HIVE-12331.patch
>
>
> If a table is created as bucketed and/or sorted and this config is set to 
> false, you will insert data in the wrong buckets and/or sort order, and if 
> you then use these tables in BMJ or SMBJ you will get wrong results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026034#comment-15026034
 ] 

Chengxiang Li commented on HIVE-12466:
--

Yes, it does. At least at the time I implemented the counter-based stats 
collection for Spark, the fs-based method did not relate to any part of our work 
on HoS, so I assume it should work just as well now.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error appear in the executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026042#comment-15026042
 ] 

Xuefu Zhang commented on HIVE-12466:


Thanks, guys. Let's get this in and use a separate JIRA to do the cleanup.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error appear in the executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026048#comment-15026048
 ] 

Lefty Leverenz commented on HIVE-12331:
---

Doc note:  This removes *hive.enforce.bucketing* & *hive.enforce.sorting* from 
HiveConf.java, and changes the description of *hive.optimize.bucketingsorting* 
(created by HIVE-4240 in release 0.11.0 and not documented yet in the wiki) so 
Configuration Properties needs to be updated.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
** [Configuration Properties -- hive.enforce.bucketing | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.enforce.bucketing]
** [Configuration Properties -- hive.enforce.sorting | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.enforce.sorting]
** [Configuration Properties -- hive.optimize.bucketingsorting | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.bucketingsorting]
  (this link will work after the parameter has been documented)

Other wikidocs that need updates because they mention the removed 
*hive.enforce.bucketing* parameter:

* [Hive Transactions -- Configuration (annotate *hive.enforce.bucketing* with 
version information) | 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration]
* [Hive Transactions -- Configuration Values to Set for INSERT, UPDATE, DELETE 
(annotate *hive.enforce.bucketing* with version information) | 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-ConfigurationValuestoSetforINSERT,UPDATE,DELETE]
* [Bucketed Tables (3 instances of *hive.enforce.bucketing*) | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL+BucketedTables]
* [AdminManual Configuration -- Hive Configuration Variables (1 instance of 
*hive.enforce.bucketing*) | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-HiveConfigurationVariables]
* [Configuration Properties -- Transactions and Compactor 
(*hive.enforce.bucketing* in list of parameters that need non-default values) | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]


> Remove hive.enforce.bucketing & hive.enforce.sorting configs
> 
>
> Key: HIVE-12331
> URL: https://issues.apache.org/jira/browse/HIVE-12331
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12331.1.patch, HIVE-12331.patch
>
>
> If a table is created as bucketed and/or sorted and this config is set to 
> false, you will insert data in the wrong buckets and/or sort order, and if 
> you then use these tables in BMJ or SMBJ you will get wrong results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12329) Turn on limit pushdown optimization by default

2015-11-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12329:
--
Labels: TODOC2.0  (was: )

> Turn on limit pushdown optimization by default
> --
>
> Key: HIVE-12329
> URL: https://issues.apache.org/jira/browse/HIVE-12329
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12329.2.patch, HIVE-12329.patch
>
>
> Whenever applicable, this will always help, so this should be on by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12329) Turn on limit pushdown optimization by default

2015-11-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026080#comment-15026080
 ] 

Lefty Leverenz commented on HIVE-12329:
---

Doc note:  This changes the default value and description of 
*hive.limit.pushdown.memory.usage* which was introduced by HIVE-3562 in release 
0.12.0.  It needs to be updated in the wiki:

* [Configuration Properties -- hive.limit.pushdown.memory.usage | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.limit.pushdown.memory.usage]

*hive.limit.pushdown.memory.usage* is also mentioned in Hive on Spark: Getting 
Started but doesn't seem to need revision there -- it just shows a recommended 
value of 0.4:

* [Hive on Spark: Getting Started -- Recommended Configuration | 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark:+Getting+Started#HiveonSpark:GettingStarted-RecommendedConfiguration]

> Turn on limit pushdown optimization by default
> --
>
> Key: HIVE-12329
> URL: https://issues.apache.org/jira/browse/HIVE-12329
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12329.2.patch, HIVE-12329.patch
>
>
> Whenever applicable, this will always help, so this should be on by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026081#comment-15026081
 ] 

Chengxiang Li commented on HIVE-12466:
--

Committed to spark branch, thanks Rui for this contribution.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error appear in the executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026089#comment-15026089
 ] 

Chengxiang Li commented on HIVE-12466:
--

HIVE-12515 has been created for the follow-up cleanup work.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error appear in the executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026121#comment-15026121
 ] 

Hive QA commented on HIVE-12469:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773866/HIVE-12469.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9827 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6119/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6119/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6119/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773866 - PreCommit-HIVE-TRUNK-Build

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address 
> vulnerability
> -
>
> Key: HIVE-12469
> URL: https://issues.apache.org/jira/browse/HIVE-12469
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
>Priority: Blocker
> Attachments: HIVE-12469.2.patch, HIVE-12469.patch
>
>
> Currently the commons-collections (3.2.1) library allows invocation of 
> arbitrary code through {{InvokerTransformer}}; we need to bump the version of 
> commons-collections from 3.2.1 to 3.2.2 to resolve this issue.
> Results of {{mvn dependency:tree}}:
> {code}
> [INFO] 
> 
> [INFO] Building Hive HPL/SQL 2.0.0-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-hplsql ---
> [INFO] org.apache.hive:hive-hplsql:jar:2.0.0-SNAPSHOT
> [INFO] +- com.google.guava:guava:jar:14.0.1:compile
> [INFO] +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}
> {code}
> [INFO] 
> 
> [INFO] Building Hive Packaging 2.0.0-SNAPSHOT
> [INFO] 
> 
> [INFO] +- org.apache.hive:hive-hbase-handler:jar:2.0.0-SNAPSHOT:compile
> [INFO] |  +- org.apache.hbase:hbase-server:jar:1.1.1:compile
> [INFO] |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}
> {code}
> [INFO] 
> 
> [INFO] Building Hive Common 2.0.0-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-common ---
> [INFO] +- org.apache.hadoop:hadoop-common:jar:2.6.0:compile
> [INFO] |  +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}
> {{Hadoop-Common}} dependency also found in: LLAP, Serde, Storage,  Shims, 
> Shims Common, Shims Scheduler)
> {code}
> [INFO] 
> 
> [INFO] Building Hive Ant Utilities 2.0.0-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-ant ---
> [INFO] |  +- commons-collections:commons-collections:jar:3.1:compile
> {code}
> {code}
> [INFO]
>  
> [INFO] 
> 
> [INFO] Building Hive Accumulo Handler 2.0.0-SNAPSHOT
> [INFO] 
> 
> [INFO] +- org.apache.accumulo:accumulo-core:jar:1.6.0:compile
> [INFO] |  +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12487) Fix broken MiniLlap tests

2015-11-24 Thread Aleksei Statkevich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026128#comment-15026128
 ] 

Aleksei Statkevich commented on HIVE-12487:
---

TestMiniLlapCliDriver tests pass fine now.

Spark tests pass fine for me locally. The error during the test run seems to be 
unrelated:
{code}
Unexpected exception java.lang.IllegalStateException: Error trying to obtain 
executor info: java.lang.IllegalStateException: RPC channel is closed. at 
org.apache.hadoop.hive.ql.QTestUtil$1.setSparkSession(QTestUtil.java:1022)}
{code}



> Fix broken MiniLlap tests
> -
>
> Key: HIVE-12487
> URL: https://issues.apache.org/jira/browse/HIVE-12487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Aleksei Statkevich
>Assignee: Aleksei Statkevich
>Priority: Critical
> Attachments: HIVE-12487.1.patch, HIVE-12487.2.patch, HIVE-12487.patch
>
>
> Currently MiniLlap tests fail with the following error:
> {code}
> TestMiniLlapCliDriver - did not produce a TEST-*.xml file
> {code}
> Supposedly, it started happening after HIVE-12319.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-11-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026134#comment-15026134
 ] 

Laljo John Pullokkaran commented on HIVE-11927:
---

[~pxiong] Now that Calcite 1.5 is released, let's update the patch.
We should just provide the executor and reuse the Calcite reduction rules.
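
A minimal sketch of that wiring, assuming the Calcite 1.5 API (standard Calcite 
class names, but not the actual patch):
{code}
import org.apache.calcite.plan.RelOptPlanner;
import org.apache.calcite.rel.rules.ReduceExpressionsRule;
import org.apache.calcite.rex.RexExecutorImpl;

public class ConstantFoldingSetup {
  static void enable(RelOptPlanner planner) {
    // Install an executor so the stock reduction rules can evaluate constants.
    planner.setExecutor(new RexExecutorImpl(null)); // null DataContext: no dynamic variables
    planner.addRule(ReduceExpressionsRule.PROJECT_INSTANCE);
    planner.addRule(ReduceExpressionsRule.FILTER_INSTANCE);
    planner.addRule(ReduceExpressionsRule.JOIN_INSTANCE);
  }
}
{code}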

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, 
> HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, 
> HIVE-11927.06.patch, HIVE-11927.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6113) Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026135#comment-15026135
 ] 

Hive QA commented on HIVE-6113:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773890/HIVE-6113.4.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6120/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6120/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6120/

Messages:
{noformat}
 This message was trimmed, see log for full details 
main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-it-util 
---
[INFO] Compiling 51 source files to 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java:
 Some input files use or override a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java:
 Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hive-it-util ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/itests/util/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it-util ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp/conf
 [copy] Copying 14 files to 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-it-util ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-it-util ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-it-util ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/hive-it-util-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-it-util ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-it-util ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/itests/util/target/hive-it-util-2.0.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-util/2.0.0-SNAPSHOT/hive-it-util-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/itests/util/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-util/2.0.0-SNAPSHOT/hive-it-util-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Integration - Unit Tests 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-it-unit ---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/itests/hive-unit/target
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/itests/hive-unit (includes 
= [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-it-unit ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-it-unit ---
[INFO] Executing tasks

main:
 [exec] + /bin/pwd
 [exec] 
/data/hive-ptest/working/apache-github-source-source/itests/hive-unit
 [exec] + BASE_DIR=./target
 [exec] + HIVE_ROOT=./target/../../../
 [exec] + DOWNLOAD_DIR=./../thirdparty
 [exec] + mkdir -p ./../thirdparty
 [exec] + download 
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz
 spark
 [exec] + 
url=http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz
 [exec] + finalName=spark
 [exec] ++ basename 
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz
 [exec] + tarName=spark-1.5.0-bin-hadoop2-without-hive.tgz
 [exec] + rm -rf ./target/spark
 [exec] 

[jira] [Updated] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12184:
-
Attachment: HIVE-12184.9.patch

> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use
> ---
>
> Key: HIVE-12184
> URL: https://issues.apache.org/jira/browse/HIVE-12184
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, 
> HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, 
> HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.9.patch, HIVE-12184.patch
>
>
> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use.
> Repro:
> {code}
> : jdbc:hive2://localhost:1/default> create database foo;
> No rows affected (0.116 seconds)
> 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int);
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> +-----------+------------+----------+--+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+--+
> | i         | int        |          |
> +-----------+------------+----------+--+
> 1 row selected (0.049 seconds)
> 0: jdbc:hive2://localhost:1/default> use foo;
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field foo (state=08S01,code=1)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026162#comment-15026162
 ] 

Xuefu Zhang commented on HIVE-12184:


+1 pending on test.

> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use
> ---
>
> Key: HIVE-12184
> URL: https://issues.apache.org/jira/browse/HIVE-12184
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, 
> HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, 
> HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.9.patch, HIVE-12184.patch
>
>
> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use.
> Repro:
> {code}
> : jdbc:hive2://localhost:1/default> create database foo;
> No rows affected (0.116 seconds)
> 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int);
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> +-----------+------------+----------+--+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+--+
> | i         | int        |          |
> +-----------+------------+----------+--+
> 1 row selected (0.049 seconds)
> 0: jdbc:hive2://localhost:1/default> use foo;
> 0: jdbc:hive2://localhost:1/default> describe foo.foo;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field foo (state=08S01,code=1)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path

2015-11-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-12055:
-
Target Version/s: 2.0.0

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path

2015-11-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-12055:
-
Attachment: HIVE-12055.patch

Updated to the current HIVE-11890 patch. Passes all of the ORC unit tests and 
qfiles.

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11890) Create ORC module

2015-11-24 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-11890:
-
Target Version/s: 2.0.0

> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12483) Fix precommit Spark test branch

2015-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12483:
---
Attachment: (was: HIVE-12483.1-spark.patch)

> Fix precommit Spark test branch
> ---
>
> Key: HIVE-12483
> URL: https://issues.apache.org/jira/browse/HIVE-12483
> Project: Hive
>  Issue Type: Task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12517) Beeline's use of failed connection(s) causes failures and leaks.

2015-11-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12517:
-
Attachment: HIVE-12517.patch

Attaching a patch with the proposed fix.
Below are results from a test.
{code}
beeline> !connect jdbc:hive2://localhost:1 hive1 hive1
scan complete in 9ms
Connecting to jdbc:hive2://localhost:1
Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 
hive1
Connecting to jdbc:hive2://localhost:1
Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 
hive1
Connecting to jdbc:hive2://localhost:1
Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
2: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 
hive1
Connecting to jdbc:hive2://localhost:1
Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
3: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:11000 hive1 
hive1
Connecting to jdbc:hive2://localhost:11000
Error: Could not open client transport with JDBC Uri: 
jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused 
(state=08S01,code=0)
3: jdbc:hive2://localhost:1> !tables
+------------+--------------+---------------------+-------------+----------+--+
| TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME          | TABLE_TYPE  | REMARKS  |
+------------+--------------+---------------------+-------------+----------+--+
|            | default      | char_nested_1       | TABLE       | NULL     |
|            | default      | src                 | TABLE       | NULL     |
|            | default      | char_nested_struct  | TABLE       | NULL     |
|            | default      | src_thrift          | TABLE       | NULL     |
|            | default      | x                   | TABLE       | NULL     |
+------------+--------------+---------------------+-------------+----------+--+
3: jdbc:hive2://localhost:1> !list
4 active connections:
 #0  open jdbc:hive2://localhost:1
 #1  open jdbc:hive2://localhost:1
 #2  open jdbc:hive2://localhost:1
 #3  open jdbc:hive2://localhost:1
3: jdbc:hive2://localhost:1> !closeall
Closing: 3: jdbc:hive2://localhost:1
Closing: 2: jdbc:hive2://localhost:1
Closing: 1: jdbc:hive2://localhost:1
Closing: 0: jdbc:hive2://localhost:1
beeline> 
{code}

Also, when the first connection attempt is unsuccessful, the beeline prompt is 
currently set to:
{code}
beeline> !connect jdbc:hive2://localhost:11000 hive1 hive1
Connecting to jdbc:hive2://localhost:11000
Error: Could not open client transport with JDBC Uri: 
jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused 
(state=08S01,code=0)
0: jdbc:hive2://localhost:11000 (closed)>
{code}

With the patch, the prompt remains "beeline>", as below:
{code}
beeline> !connect jdbc:hive2://localhost:11000 hive1 hive1
Connecting to jdbc:hive2://localhost:11000
Error: Could not open client transport with JDBC Uri: 
jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused 
(state=08S01,code=0)
beeline>
{code}
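
A minimal sketch of the idea behind the fix (the class and field names below are 
hypothetical, not the actual Beeline code): register the connection and switch 
the prompt only after the JDBC connect succeeds, so a failed attempt leaves no 
trace.
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class ConnectSketch {
  private final List<Connection> connections = new ArrayList<Connection>();
  private String prompt = "beeline>";

  boolean connect(String url, String user, String pass) {
    try {
      Connection conn = DriverManager.getConnection(url, user, pass);
      connections.add(conn);    // only successful connections are tracked
      prompt = url + ">";       // the prompt changes only on success
      return true;
    } catch (SQLException e) {
      System.err.println("Error: " + e.getMessage());
      return false;             // connection list and prompt stay untouched
    }
  }
}
{code}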



> Beeline's use of failed connection(s) causes failures and leaks.
> 
>
> Key: HIVE-12517
> URL: https://issues.apache.org/jira/browse/HIVE-12517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-12517.patch
>
>
> Beeline adds a bad connection to the connection list and makes it the 
> current connection, so any subsequent queries will attempt to use this bad 
> connection and will fail. Even a "!close" would not work.
> 1) all queries fail unless !go is used.
> 2) !closeall cannot close the active connections either.
> 3) !exit will exit while attempting to establish these inactive connections 
> without closing the active connections. So this could hold up server side 
> resources.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 hive1 hive1
> scan complete in 8ms
> Connecting to jdbc:hive2://localhost:1
> Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> !connect 

[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch

2015-11-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026245#comment-15026245
 ] 

Sergio Peña commented on HIVE-12483:


[~xuefuz] schemeAuthority and schemeAuthority2 are passing now. I had to update 
the ptest server running in the spark master instance to make it work. 
There was a race condition causing the errors, but it was solved after the 
update.

Are the other failing tests passing in your local branch as well?

> Fix precommit Spark test branch
> ---
>
> Key: HIVE-12483
> URL: https://issues.apache.org/jira/browse/HIVE-12483
> Project: Hive
>  Issue Type: Task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12498) ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect

2015-11-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12498:
-
Attachment: HIVE-12498.2.patch

Fixed test case to close file and use different file name.

> ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect
> -
>
> Key: HIVE-12498
> URL: https://issues.apache.org/jira/browse/HIVE-12498
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: ACID, ORC
> Attachments: HIVE-12498.1.patch, HIVE-12498.2.patch
>
>
> OrcRecordUpdater does not honor the 
> OrcRecordUpdater.OrcOptions.tableProperties() setting.
> It would need to translate the specified tableProperties (as listed in the 
> OrcTableProperties enum) to the properties that the OrcWriter internally 
> understands (listed in HiveConf.ConfVars).
> This is needed by multiple clients, like the Streaming API and the Compactor.
> {code:java}
> Properties orcTblProps = ..   // get Orc Table Properties from MetaStore;
> AcidOutputFormat.Options updaterOptions =   new 
> OrcRecordUpdater.OrcOptions(conf)
>  .inspector(..)
>  .bucket(..)
>  .minimumTransactionId(..)
>  .maximumTransactionId(..)
>  
> .tableProperties(orcTblProps); // <<== 
> OrcOutputFormat orcOutput =   new ...
> orcOutput.getRecordUpdater(partitionPath, updaterOptions );
> {code}
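
A minimal sketch of the missing translation step (a simplistic 1:1 copy for 
illustration; the real fix must map the OrcTableProperties names onto the 
writer's ConfVars keys, and the helper name is hypothetical):
{code}
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;

public class OrcTablePropsShim {
  // Hypothetical helper: copy recognized table properties onto the
  // Configuration that the ORC writer actually reads.
  static void applyTableProperties(Configuration conf, Properties orcTblProps) {
    for (String name : orcTblProps.stringPropertyNames()) {
      conf.set(name, orcTblProps.getProperty(name));
    }
  }
}
{code}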



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12413) Default mode for hive.mapred.mode should be strict

2015-11-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12413:

Attachment: HIVE-12413.3.patch

> Default mode for hive.mapred.mode should be strict
> --
>
> Key: HIVE-12413
> URL: https://issues.apache.org/jira/browse/HIVE-12413
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12413.1.patch, HIVE-12413.2.patch, 
> HIVE-12413.3.patch, HIVE-12413.patch
>
>
> Non-strict mode allows some questionable semantics and questionable 
> operations. Its better that user makes a conscious choice to enable such a 
> behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system

2015-11-24 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025601#comment-15025601
 ] 

Jesus Camacho Rodriguez commented on HIVE-12508:


The issue was (if I recall correctly) that you end up with cycles in the planning 
graph (between equivalent sets of expressions), and then a metadata provider can 
fire indefinitely.

But I guess that, since we currently execute this rule in isolation in Hive and 
we know this rule will not produce cycles, we could close this one, as the 
metadata provider will never fire indefinitely.

> HiveAggregateJoinTransposeRule places a heavy load on the metadata system
> -
>
> Key: HIVE-12508
> URL: https://issues.apache.org/jira/browse/HIVE-12508
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12508.patch
>
>
> Finding out whether the input is already unique requires a call to 
> areColumnsUnique, which currently (until CALCITE-794 is fixed) places a heavy 
> load on the metadata system. This can lead to long CBO planning times.
> This is a temporary fix that avoids the call to the method until then.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12511) IN clause performs differently than = clause

2015-11-24 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025579#comment-15025579
 ] 

Jimmy Xiang commented on HIVE-12511:


I think we should fix GenericUDFIn to use the common type for comparison instead 
of the generic common type. In this case, for the common type of int and string, 
we should use int instead of string.
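
A plain-Java illustration of the two semantics (a standalone demo, not Hive 
code); the fix should make the IN clause behave like the int comparison:
{code}
public class InClauseDemo {
  public static void main(String[] args) {
    String literal = "01";  // the IN-list value from the query
    int iValue = 1;         // the int column value
    // Compared as strings (the current generic common type): no match.
    System.out.println(literal.equals(String.valueOf(iValue))); // false
    // Compared as ints (the proposed common type): match.
    System.out.println(Integer.parseInt(literal) == iValue);    // true
  }
}
{code}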

> IN clause performs differently than = clause
> 
>
> Key: HIVE-12511
> URL: https://issues.apache.org/jira/browse/HIVE-12511
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> Similar to HIVE-11973, the IN clause performs differently than the = clause for 
> "int" type with string values.
> For example,
> {noformat}
> SELECT * FROM inttest WHERE iValue IN ('01');
> {noformat}
> will not return any rows with int iValue = 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-11-24 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025688#comment-15025688
 ] 

Jason Dere commented on HIVE-11878:
---

So removing JARs from the session will still require closing the existing 
classloader and creating a new one (with the specified JARs omitted from the 
list of URIs), correct?
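
A standalone illustration of the parent-first delegation described in the issue 
(the jar paths and class names are hypothetical, and the jars must exist for the 
forName call to succeed):
{code}
import java.net.URL;
import java.net.URLClassLoader;

public class LoaderDemo {
  public static void main(String[] args) throws Exception {
    // u1 only knows j1.jar; u2 knows both jars and delegates to u1 first.
    URLClassLoader u1 = new URLClassLoader(
        new URL[] {new URL("file:/tmp/j1.jar")},
        ClassLoader.getSystemClassLoader());
    URLClassLoader u2 = new URLClassLoader(
        new URL[] {new URL("file:/tmp/j1.jar"), new URL("file:/tmp/j2.jar")}, u1);
    // Parent-first: u1 finds and *defines* c1 even though we asked u2.
    Class<?> c1 = Class.forName("c1", true, u2);
    System.out.println(c1.getClassLoader()); // prints u1, not u2
    // Any class that c1 references (e.g. c2 in j2.jar) is resolved via u1,
    // which has no path to j2.jar -> ClassNotFoundException at use time.
  }
}
{code}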

> ClassNotFoundException can possibly  occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878 ClassLoader Issues when Registering 
> Jars.pptx, HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_approach3_per_session_clasloader.patch, 
> HIVE-11878_approach3_with_review_comments.patch, 
> HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console, Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classloader is the 
> current ThreadContextClassLoader. Once the URLClassLoader is created, Hive 
> sets it as the current ThreadContextClassLoader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a CNF exception.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class 
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, 
> the URLClassLoader *u1* is created and also set as the 
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader 
> created will be *u2* with *u1* as parent and *u2* becomes the new 
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* 
> whereas *u1* only has paths to *j1* (For details see: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load 
> the class using {code} class.forName("c1", true, 
> Thread.currentThread().getContextClassLoader()) {code} . The 
> currentThreadContext class-loader is *u2*, and it has the path to the class 
> *c1*, but note that Class-loaders work by delegating to parent class-loader 
> first. In this case class *c1* will be found and *defined* by class-loader 
> *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
> initialize) is called in *c1*, which references the class *c2*, *c2* will not 
> be found since the class-loader used to search for *c2* will be *u1* (Since 
> the caller's class-loader is used to load a class)
> I've added a qtest to explain the problem. Please see the attached patch
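A minimal standalone sketch of the delegation behaviour described above (the jar 
paths and the class name "c1" are hypothetical; run it with real jars to 
reproduce):

{code}
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationSketch {
  public static void main(String[] args) throws Exception {
    URL j1 = new URL("file:///tmp/j1.jar"); // contains c1 (which needs c2)
    URL j2 = new URL("file:///tmp/j2.jar"); // contains c2

    URLClassLoader u1 = new URLClassLoader(new URL[] {j1});         // after "add jar j1"
    URLClassLoader u2 = new URLClassLoader(new URL[] {j1, j2}, u1); // after "add jar j2"

    // Parent-first delegation: u2 asks u1 first, u1 can resolve c1,
    // so c1's *defining* loader is u1.
    Class<?> c1 = Class.forName("c1", true, u2);
    System.out.println(c1.getClassLoader() == u1); // true

    // Any reference from c1's code to c2 is resolved through u1, which
    // has no path to j2 -> ClassNotFoundException.
  }
}
{code}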



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025704#comment-15025704
 ] 

Prasanth Jayachandran commented on HIVE-11675:
--

Left some comments in RB

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do a 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless the filter can 
> be pushed down to the metastore or the footer fetched from the local cache; that 
> way the only slow threaded op is the directory listing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12341:

Attachment: HIVE-12341.04.patch

Small fix to retry logic

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.04.patch, 
> HIVE-12341.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-11-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-12366:
-

Assignee: Eugene Koifman  (was: Elias Elmqvist Wulcan)

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause the query to fail, since the locks have timed out by the time 
> the heartbeat is sent.
> Need to remove this gap.
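One way to remove the gap, as a hypothetical sketch (not the actual Heartbeater 
code): schedule the first heartbeat with zero initial delay at lock-acquisition 
time.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HeartbeaterSketch {
  private final ScheduledExecutorService pool =
      Executors.newSingleThreadScheduledExecutor();

  void startOnLockAcquisition(Runnable sendHeartbeat, long intervalMs) {
    // initialDelay = 0: heartbeat immediately, then every intervalMs,
    // so there is no window in which the locks can time out unheartbeated
    pool.scheduleAtFixedRate(sendHeartbeat, 0, intervalMs, TimeUnit.MILLISECONDS);
  }
}
{code}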



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC

2015-11-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-12510.
--
Resolution: Implemented

The fix for this is included in the .3 version of the HIVE-12020 patch

> LLAP: Append attempt id either to thread name or NDC
> 
>
> Key: HIVE-12510
> URL: https://issues.apache.org/jira/browse/HIVE-12510
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Currently, in LLAP the attempt id gets appended to the thread name and also 
> added to the NDC, creating long log lines like the one below:
> {code}
> [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]]
> {code}
> I think it will be sufficient to add it only to the NDC.
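A sketch of the NDC-only approach (assuming log4j2's ThreadContext, which 
provides the NDC stack; illustrative, not the TezTaskRunner code):

{code}
import org.apache.logging.log4j.ThreadContext;

// Push the attempt id for the task's duration instead of also baking it
// into the thread name; it then appears once per log line via %x.
class NdcScope implements AutoCloseable {
  NdcScope(String attemptId) {
    ThreadContext.push(attemptId);
  }

  @Override
  public void close() {
    ThreadContext.pop(); // restore the context when the task finishes
  }
}
{code}

Usage would be try (NdcScope scope = new NdcScope(attemptId)) { ... } around the 
task body.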



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-12483) Fix precommit Spark test branch

2015-11-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12483:
---
Comment: was deleted

(was: 

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773549/HIVE-12483.1-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9788 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773549 - PreCommit-HIVE-SPARK-Build)

> Fix precommit Spark test branch
> ---
>
> Key: HIVE-12483
> URL: https://issues.apache.org/jira/browse/HIVE-12483
> Project: Hive
>  Issue Type: Task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12483.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025362#comment-15025362
 ] 

Sergey Shelukhin commented on HIVE-12510:
-

IIRC this is the Tez naming convention from way before the NDC was added

> LLAP: Append attempt id either to thread name or NDC
> 
>
> Key: HIVE-12510
> URL: https://issues.apache.org/jira/browse/HIVE-12510
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Currently, in LLAP the attempt id gets appended to the thread name and also 
> added to the NDC, creating long log lines like the one below:
> {code}
> [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]]
> {code}
> I think it will be sufficient to add it only to the NDC.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12008) Hive queries failing when using count(*) on column in view

2015-11-24 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12008:

Attachment: HIVE-12008.5.patch

> Hive queries failing when using count(*) on column in view
> --
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch, HIVE-12008.5.patch
>
>
> count(*) on view with get_json_object() UDF and lateral views and unions 
> fails in the master with error:
> 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Error in configuring 
> object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
> ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147)
> ... 22 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> This query works fine in version 1.1.
> The last two qfile unit tests added by HIVE-11384 fail when hive.in.test is 
> false. It may relate to how we handle the prunelist for select. When a select 
> includes every column in a table, the prunelist for the select is empty. That 
> may cause issues when calculating its parent's prunelist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system

2015-11-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025489#comment-15025489
 ] 

Laljo John Pullokkaran commented on HIVE-12508:
---

[~jcamachorodriguez] Given that HIVE-12503 fixes this, we shouldn't be running 
into this issue any more.

> HiveAggregateJoinTransposeRule places a heavy load on the metadata system
> -
>
> Key: HIVE-12508
> URL: https://issues.apache.org/jira/browse/HIVE-12508
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12508.patch
>
>
> Finding out whether the input is already unique requires a call to 
> areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy 
> load on the metadata system. This can lead to long CBO planning times.
> This is a temporary fix that avoids the call to the method until then.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025212#comment-15025212
 ] 

Xuefu Zhang commented on HIVE-12466:


Thanks to Rui/Chengxiang for working on this. I happened to see that 
counter-based stats gathering is completely removed by HIVE-12411. I'd like to 
know its implications. Does it mean that we don't even need SparkCounter at 
all? Are there any impacts on Spark stats collection from the removal? Thanks.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error are found in the 
> executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}
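The pattern behind the error, as a self-contained illustration (hypothetical 
code, not the SparkCounters implementation): an executor-side increment on a 
counter that was never registered on the driver.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class CounterGroupSketch {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  void register(String name) {            // must happen before any increment
    counters.put(name, new AtomicLong());
  }

  void increment(String name) {
    AtomicLong c = counters.get(name);
    if (c == null) {
      // matches the log lines above: the counter was used before creation
      System.err.println("counter[" + name + "] has not initialized before.");
      return;
    }
    c.incrementAndGet();
  }
}
{code}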



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-11-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025457#comment-15025457
 ] 

Brock Noland commented on HIVE-11977:
-

[~dossett] Sorry, I just saw this ping! I moved my mail account and had not yet 
configured my rules appropriately. This patch looks good! Nice work.


[~sershe] - agreed, it'd be great to see this in 1.x.

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Fix For: 2.0.0
>
> Attachments: HIVE-11977.2.patch, HIVE-11977.patch
>
>
> If a zero-length file is in the top-level directory housing an external avro 
> table, all hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader, and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as avro).
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.(DataFileReader.java:97)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.(AvroGenericRecordReader.java:81)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more
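A minimal sketch of the empty-file detection suggested above (Hadoop FileSystem 
API; the helper name is hypothetical, and the real reader works from a split 
rather than a bare path):

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class EmptyAvroFileCheck {
  static boolean isEmptyFile(FileSystem fs, Path file) throws IOException {
    // a zero-length file cannot contain the Avro magic bytes, so skip it
    // instead of letting DataFileReader throw "Not a data file."
    return fs.getFileStatus(file).getLen() == 0;
  }
}
{code}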



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11488) Add sessionId and queryId info to HS2 log

2015-11-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11488:

Fix Version/s: 2.0.0

> Add sessionId and queryId info to HS2 log
> -
>
> Key: HIVE-11488
> URL: https://issues.apache.org/jira/browse/HIVE-11488
> Project: Hive
>  Issue Type: New Feature
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11488.2.patch, HIVE-11488.3.patch, HIVE-11488.patch
>
>
> Session is critical for a multi-user system like Hive. Currently Hive doesn't 
> log the sessionId to the log file, which sometimes makes debugging and analysis 
> difficult when multiple activities are going on at the same time and the logs 
> from different sessions are mixed together.
> Currently, Hive already has the sessionId saved in SessionState, and there is 
> another sessionId in SessionHandle (it seems unused and I'm still looking to 
> understand it). Generally we should have one sessionId from the beginning on 
> both the client side and the server side. It seems we have some work to do on 
> that side first.
> The sessionId can then be added to log4j's mapped diagnostic context (MDC) and 
> configured to appear in the log file through the log4j property. The MDC is 
> per-thread, so we need to add the sessionId to the HS2 main thread, and it will 
> then be inherited by the child threads.
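A sketch of the MDC idea (assuming log4j2, where the MDC is called 
ThreadContext; the key names and the pattern snippet are illustrative):

{code}
import org.apache.logging.log4j.ThreadContext;

// Put the ids on the handling thread; a layout pattern containing
// %X{sessionId} %X{queryId} then prints them on every log line.
class SessionLogContext {
  static void set(String sessionId, String queryId) {
    ThreadContext.put("sessionId", sessionId);
    ThreadContext.put("queryId", queryId);
  }

  static void clear() {
    ThreadContext.remove("sessionId");
    ThreadContext.remove("queryId");
  }
}
{code}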



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025339#comment-15025339
 ] 

Prasanth Jayachandran commented on HIVE-12510:
--

[~sseth]/[~sershe] any reason for the attempt id to be added in both places?

> LLAP: Append attempt id either to thread name or NDC
> 
>
> Key: HIVE-12510
> URL: https://issues.apache.org/jira/browse/HIVE-12510
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Currently, in LLAP the attempt id gets appended to the thread name and also 
> added to the NDC, creating long log lines like the one below:
> {code}
> [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]]
> {code}
> I think it will be sufficient to add it only to the NDC.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12500) JDBC driver not overlaying params supplied via properties object when reading params from ZK

2015-11-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12500:

Summary: JDBC driver not overlaying params supplied via properties object 
when reading params from ZK  (was: JDBC driver not be overlaying params 
supplied via properties object when reading params from ZK)

> JDBC driver not overlaying params supplied via properties object when reading 
> params from ZK
> 
>
> Key: HIVE-12500
> URL: https://issues.apache.org/jira/browse/HIVE-12500
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12500.1.patch
>
>
> It makes sense to set up the connection info in one place. Right now part of 
> connection configuration happens in Utils#parseURL and part in the 
> HiveConnection constructor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch

2015-11-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025266#comment-15025266
 ] 

Eugene Koifman commented on HIVE-12307:
---

bq. I'm +1 on making this package level, but does it do any good to make the 
class non-private and leave the constructor private?
The class is made package level for testing only. The private c'tor ensures that 
it's only constructed via the factory methods, as originally implemented.
bq. Why did you make the isClosed value volatile? 
Heartbeating is commonly done from a separate thread; for example, Storm does it 
that way. Also, it's not unusual for application clean-up logic to come from a 
different thread (for example, calling close() as a form of cancel). So this is 
volatile to make sure it works properly regardless of how the client is 
implemented.
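As a tiny illustration of the volatile point (hypothetical, not the 
TransactionBatch code):

{code}
class BatchFlagSketch {
  private volatile boolean isClosed = false;

  void write(String row) {
    if (isClosed) {
      throw new IllegalStateException("batch is closed");
    }
    // ... append the row ...
  }

  void close() { // may be called from a different thread, e.g. as a cancel
    isClosed = true; // volatile: immediately visible to the writer thread
  }
}
{code}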
I didn't try to address more general thread-safety issues in this patch. Judging by 
https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest#StreamingDataIngest-Example–Non-secureMode
 the original intent was NOT to have multiple threads in a 
StreamingConnection. It's worthwhile to do a thread-safety review, but that was 
not my goal here.

bq. write()
I'll refactor this



bq. SerializationError
This is was meant to indicate that a particular row is bad.  For example 
missing columns, etc.  This gives the client ability to drop this row (or send 
to dead letter queue) since replaying it won't help.  Unfortunately, w/o my 
changes here the client never sees SerializationError - it gets wrapped in 
other exceptions.
bq. abortImpl()
there is https://issues.apache.org/jira/browse/HIVE-12440 for that

> Streaming API TransactionBatch.close() must abort any remaining transactions 
> in the batch
> -
>
> Key: HIVE-12307
> URL: https://issues.apache.org/jira/browse/HIVE-12307
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12307.patch
>
>
> When the client of the TransactionBatch API encounters an error, it must 
> close() the batch and start a new one. This prevents attempts to continue 
> writing to a file that may be damaged in some way.
> The close() should abort any txns that still remain in the batch and close 
> (best effort) all the files it's writing to. The batch should also put itself 
> into a mode where any future ops on this batch fail.
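A sketch of that close() contract (every name here is hypothetical, not the 
actual TransactionBatch implementation):

{code}
class BatchCloseSketch {
  private volatile boolean isClosed = false;

  void close() {
    try {
      abortRemainingTxns();     // abort whatever txns are still open
    } finally {
      closeWritersBestEffort(); // best effort: close the files being written
      isClosed = true;          // future ops check this flag and throw
    }
  }

  private void abortRemainingTxns() { /* stub for the sketch */ }

  private void closeWritersBestEffort() { /* stub for the sketch */ }
}
{code}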



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11488) Add sessionId and queryId info to HS2 log

2015-11-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025393#comment-15025393
 ] 

Aihua Xu commented on HIVE-11488:
-

I'm wondering who needs to edit the doc. I tried to edit it, but it seems I 
don't have permission to edit the page.

> Add sessionId and queryId info to HS2 log
> -
>
> Key: HIVE-11488
> URL: https://issues.apache.org/jira/browse/HIVE-11488
> Project: Hive
>  Issue Type: New Feature
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11488.2.patch, HIVE-11488.3.patch, HIVE-11488.patch
>
>
> Session is critical for a multi-user system like Hive. Currently Hive doesn't 
> log the sessionId to the log file, which sometimes makes debugging and analysis 
> difficult when multiple activities are going on at the same time and the logs 
> from different sessions are mixed together.
> Currently, Hive already has the sessionId saved in SessionState, and there is 
> another sessionId in SessionHandle (it seems unused and I'm still looking to 
> understand it). Generally we should have one sessionId from the beginning on 
> both the client side and the server side. It seems we have some work to do on 
> that side first.
> The sessionId can then be added to log4j's mapped diagnostic context (MDC) and 
> configured to appear in the log file through the log4j property. The MDC is 
> per-thread, so we need to add the sessionId to the HS2 main thread, and it will 
> then be inherited by the child threads.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9599) remove derby, datanucleus and other not related to jdbc client classes from hive-jdbc-standalone.jar

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025447#comment-15025447
 ] 

Hive QA commented on HIVE-9599:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773859/HIVE-9599.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9827 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6117/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6117/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6117/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773859 - PreCommit-HIVE-TRUNK-Build

> remove derby, datanucleus and other not related to jdbc client classes from 
> hive-jdbc-standalone.jar
> 
>
> Key: HIVE-9599
> URL: https://issues.apache.org/jira/browse/HIVE-9599
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9599.1.patch, HIVE-9599.2.patch, HIVE-9599.3.patch, 
> HIVE-9599.3.patch
>
>
> Looks like the following packages (included in hive-jdbc-standalone.jar) are 
> not used when the jdbc client opens a jdbc connection and runs queries:
> {code}
> antlr/
> antlr/actions/cpp/
> antlr/actions/csharp/
> antlr/actions/java/
> antlr/actions/python/
> antlr/ASdebug/
> antlr/build/
> antlr/collections/
> antlr/collections/impl/
> antlr/debug/
> antlr/debug/misc/
> antlr/preprocessor/
> com/google/gson/
> com/google/gson/annotations/
> com/google/gson/internal/
> com/google/gson/internal/bind/
> com/google/gson/reflect/
> com/google/gson/stream/
> com/google/inject/
> com/google/inject/binder/
> com/google/inject/internal/
> com/google/inject/internal/asm/
> com/google/inject/internal/cglib/core/
> com/google/inject/internal/cglib/proxy/
> com/google/inject/internal/cglib/reflect/
> com/google/inject/internal/util/
> com/google/inject/matcher/
> com/google/inject/name/
> com/google/inject/servlet/
> com/google/inject/spi/
> com/google/inject/util/
> com/jamesmurty/utils/
> com/jcraft/jsch/
> com/jcraft/jsch/jce/
> com/jcraft/jsch/jcraft/
> com/jcraft/jsch/jgss/
> com/jolbox/bonecp/
> com/jolbox/bonecp/hooks/
> com/jolbox/bonecp/proxy/
> com/sun/activation/registries/
> com/sun/activation/viewers/
> com/sun/istack/
> com/sun/istack/localization/
> com/sun/istack/logging/
> com/sun/mail/handlers/
> com/sun/mail/iap/
> com/sun/mail/imap/
> com/sun/mail/imap/protocol/
> com/sun/mail/mbox/
> com/sun/mail/pop3/
> com/sun/mail/smtp/
> com/sun/mail/util/
> com/sun/xml/bind/
> com/sun/xml/bind/annotation/
> com/sun/xml/bind/api/
> com/sun/xml/bind/api/impl/
> com/sun/xml/bind/marshaller/
> com/sun/xml/bind/unmarshaller/
> com/sun/xml/bind/util/
> com/sun/xml/bind/v2/
> com/sun/xml/bind/v2/bytecode/
> com/sun/xml/bind/v2/model/annotation/
> com/sun/xml/bind/v2/model/core/
> com/sun/xml/bind/v2/model/impl/
> com/sun/xml/bind/v2/model/nav/
> com/sun/xml/bind/v2/model/runtime/
> com/sun/xml/bind/v2/runtime/
> com/sun/xml/bind/v2/runtime/output/
> com/sun/xml/bind/v2/runtime/property/
> com/sun/xml/bind/v2/runtime/reflect/
> com/sun/xml/bind/v2/runtime/reflect/opt/
> com/sun/xml/bind/v2/runtime/unmarshaller/
> com/sun/xml/bind/v2/schemagen/
> com/sun/xml/bind/v2/schemagen/episode/
> com/sun/xml/bind/v2/schemagen/xmlschema/
> com/sun/xml/bind/v2/util/
> com/sun/xml/txw2/
> com/sun/xml/txw2/annotation/
> com/sun/xml/txw2/output/
> com/thoughtworks/paranamer/
> contribs/mx/
> javax/activation/
> javax/annotation/
> javax/annotation/concurrent/
> javax/annotation/meta/
> javax/annotation/security/
> javax/el/
> javax/inject/
> javax/jdo/
> javax/jdo/annotations/
> javax/jdo/datastore/
> javax/jdo/identity/
> javax/jdo/listener/
> javax/jdo/metadata/
> javax/jdo/spi/
> javax/mail/
> javax/mail/event/
> javax/mail/internet/
> 

[jira] [Commented] (HIVE-12338) Add webui to HiveServer2

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026337#comment-15026337
 ] 

Hive QA commented on HIVE-12338:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774132/HIVE-12338.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 9822 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_nonascii
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_fetchwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_mapwork_table
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_fetchwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentStatements
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpHeaderSize
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testRootScratchDir
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testUdfBlackList
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testUdfBlackListOverride
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testUdfWhiteList
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark
org.apache.hive.jdbc.TestNoSaslAuth.org.apache.hive.jdbc.TestNoSaslAuth
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.TestSchedulerQueue.testFairSchedulerPrimaryQueueMapping
org.apache.hive.jdbc.TestSchedulerQueue.testFairSchedulerQueueMapping
org.apache.hive.jdbc.TestSchedulerQueue.testFairSchedulerSecondaryQueueMapping
org.apache.hive.jdbc.TestSchedulerQueue.testQueueMappingCheckDisabled
org.apache.hive.jdbc.authorization.TestHS2AuthzContext.org.apache.hive.jdbc.authorization.TestHS2AuthzContext
org.apache.hive.jdbc.authorization.TestHS2AuthzSessionContext.org.apache.hive.jdbc.authorization.TestHS2AuthzSessionContext
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthUDFBlacklist.testBlackListedUdfUsage
org.apache.hive.jdbc.miniHS2.TestHiveServer2.org.apache.hive.jdbc.miniHS2.TestHiveServer2
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
org.apache.hive.minikdc.TestHs2HooksWithMiniKdc.org.apache.hive.minikdc.TestHs2HooksWithMiniKdc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6121/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6121/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6121/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 44 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774132 - PreCommit-HIVE-TRUNK-Build

> Add 

[jira] [Commented] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch

2015-11-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024902#comment-15024902
 ] 

Eugene Koifman commented on HIVE-12307:
---

[~alangates] could you review please?

> Streaming API TransactionBatch.close() must abort any remaining transactions 
> in the batch
> -
>
> Key: HIVE-12307
> URL: https://issues.apache.org/jira/browse/HIVE-12307
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12307.patch
>
>
> When the client of the TransactionBatch API encounters an error, it must 
> close() the batch and start a new one. This prevents attempts to continue 
> writing to a file that may be damaged in some way.
> The close() should abort any txns that still remain in the batch and close 
> (best effort) all the files it's writing to. The batch should also put itself 
> into a mode where any future ops on this batch fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12182:
-
Attachment: HIVE-12182.3.patch

Rebasing the last patch.

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.2.patch, HIVE-12182.3.patch, HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--+---+---+--+
> | col_name |   data_type   |comment|
> +--+---+---+--+
> | i| int   | HELLO |
> | j| int   | WORLD |
> |  | NULL  | NULL  |
> | # Partition Information  | NULL  | NULL  |
> | # col_name   | data_type | comment   |
> |  | NULL  | NULL  |
> | j| int   | WORLD |
> +--+---+---+--+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--+---+---+--+
> | col_name |   data_type   |comment|
> +--+---+---+--+
> | i| int   | HELLO |
> | j| int   |   |
> |  | NULL  | NULL  |
> | # Partition Information  | NULL  | NULL  |
> | # col_name   | data_type | comment   |
> |  | NULL  | NULL  |
> | j| int   |   |
> +--+---+---+--+
> 7 rows selected (0.108 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12008) Hive queries failing when using count(*) on column in view

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025019#comment-15025019
 ] 

Hive QA commented on HIVE-12008:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773842/HIVE-12008.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9827 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_view
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6116/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6116/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6116/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773842 - PreCommit-HIVE-TRUNK-Build

> Hive queries failing when using count(*) on column in view
> --
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch
>
>
> count(*) on view with get_json_object() UDF and lateral views and unions 
> fails in the master with error:
> 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Error in configuring 
> object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 

[jira] [Commented] (HIVE-12008) Hive queries failing when using count(*) on column in view

2015-11-24 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025057#comment-15025057
 ] 

Yongzhi Chen commented on HIVE-12008:
-

Need to add the fixes for the tez and spark results too.

> Hive queries failing when using count(*) on column in view
> --
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch
>
>
> count(*) on view with get_json_object() UDF and lateral views and unions 
> fails in the master with error:
> 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Error in configuring 
> object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
> ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147)
> ... 22 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> This query works fine in version 1.1.
> The last two qfile unit tests added by HIVE-11384 fail when hive.in.test is 
> false. It may relate to how we handle the prunelist for select. When a select 
> includes every column in a table, the prunelist for the select is empty. That 
> may cause issues when calculating its parent's prunelist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025072#comment-15025072
 ] 

Prasanth Jayachandran commented on HIVE-12175:
--

This patch is for the master branch only. For branch-1.2.1 you can remove the 
lines related to the registration of the StandardConstant* classes. I would 
recommend fixing the issue separately for branch-1.2.1 instead of upgrading the 
kryo version. I will put up another patch for branch-1 as soon as possible.

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of kryo (2.22) has an issue (see the exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and the 
> kryo community recommends using DefaultInstantiatorStrategy with a fallback to 
> StdInstantiatorStrategy. More discussion of this issue is at 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe-based serialization, which claims to have 
> much better performance for certain types of serialization.
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}
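A sketch of the strategy change recommended in the description above (Kryo 3.x 
API with unshaded package names; Hive shades kryo under 
org.apache.hive.com.esotericsoftware):

{code}
import com.esotericsoftware.kryo.Kryo;
import org.objenesis.strategy.StdInstantiatorStrategy;

public class KryoStrategySketch {
  public static void main(String[] args) {
    Kryo kryo = new Kryo();
    // Try no-arg constructors first; fall back to Objenesis's
    // StdInstantiatorStrategy for classes without one (e.g. the list
    // type returned by Arrays.asList()).
    kryo.setInstantiatorStrategy(new Kryo.DefaultInstantiatorStrategy(
        new StdInstantiatorStrategy()));
    System.out.println("instantiator strategy configured");
  }
}
{code}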



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12509) Regenerate q files after HIVE-12017 went in

2015-11-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024746#comment-15024746
 ] 

Ashutosh Chauhan commented on HIVE-12509:
-

+1

> Regenerate q files after HIVE-12017 went in
> ---
>
> Key: HIVE-12509
> URL: https://issues.apache.org/jira/browse/HIVE-12509
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12509.patch
>
>
> A few q files need to be updated, as they were not updated when HIVE-12017 
> went in.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI

2015-11-24 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024797#comment-15024797
 ] 

Jimmy Xiang commented on HIVE-12484:


Metrics is a good option too. These meta operations should not take much time 
compared to SQL queries.
If some operation could take a long time, it is a good candidate to put on the 
web UI.
Right, the priority is lower than for the SQL statements.

> Show meta operations on HS2 web UI
> --
>
> Key: HIVE-12484
> URL: https://issues.apache.org/jira/browse/HIVE-12484
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Jimmy Xiang
>
> As Mohit pointed out in the review of HIVE-12338, it is nice to show meta 
> operations on HS2 web UI too. So that we can have an end-to-end picture for 
> those operations access HMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged

2015-11-24 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024738#comment-15024738
 ] 

Jesus Camacho Rodriguez commented on HIVE-12465:


Sure, I will, and I will update the issue.

> Hive might produce wrong results when (outer) joins are merged
> --
>
> Key: HIVE-12465
> URL: https://issues.apache.org/jira/browse/HIVE-12465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Attachments: HIVE-12465.01.patch, HIVE-12465.02.patch, 
> HIVE-12465.patch
>
>
> Consider the following query:
> {noformat}
> select * from
>   (select * from tab where tab.key = 0)a
> full outer join
>   (select * from tab_part where tab_part.key = 98)b
> join
>   tab_part c
> on a.key = b.key and b.key = c.key;
> {noformat}
> Hive should execute the full outer join operation (without ON clause) and 
> then the join operation (ON a.key = b.key and b.key = c.key). Instead, it 
> merges both joins, generating the following plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: tab
> filterExpr: (key = 0) (type: boolean)
> Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 0 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: tab_part
> filterExpr: (key = 98) (type: boolean)
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 98) (type: boolean)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 98 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: c
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: key (type: int)
>   sort order: +
>   Map-reduce partition columns: key (type: int)
>   Statistics: Num rows: 500 Data size: 47000 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: value (type: string), ds (type: string)
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Outer Join 0 to 1
>Inner Join 1 to 2
>   keys:
> 0 _col0 (type: int)
> 1 _col0 (type: int)
> 2 key (type: int)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8
>   Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1100 Data size: 103400 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
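
To see why the merge is unsound, here is a hedged worked example (reasoning 
from standard join semantics, not taken from the ticket): after the filters, a 
contains only rows with key 0 and b only rows with key 98. Evaluated as 
written, the full outer join has no ON clause, so every a row pairs with every 
b row, and the subsequent inner join condition a.key = b.key is then false for 
every pair (0 <> 98). In the merged plan above, however, the outer join runs 
ON a.key = b.key; the unmatched b rows survive with NULLs on the a side, and 
the inner join on b.key = c.key can still match them against c, emitting rows 
the unmerged evaluation would not produce.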

[jira] [Updated] (HIVE-12509) Regenerate q files after HIVE-12017 went in

2015-11-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12509:
---
Fix Version/s: 2.0.0

> Regenerate q files after HIVE-12017 went in
> ---
>
> Key: HIVE-12509
> URL: https://issues.apache.org/jira/browse/HIVE-12509
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-12509.patch
>
>
> A few q files need to be updated, as they were not updated when HIVE-12017 
> went in.
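
A hedged note on how such golden files are typically regenerated (the exact 
module and profile flags vary by branch, and some_test.q is a placeholder; 
this is a sketch, not a command taken from the patch):

{noformat}
cd itests/qtest
mvn test -Dtest=TestCliDriver -Dqfile=some_test.q -Dtest.output.overwrite=true
{noformat}

With -Dtest.output.overwrite=true the test driver rewrites the .q.out golden 
file with the current output instead of diffing against it.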



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12509) Regenerate q files after HIVE-12017 went in

2015-11-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12509:
---
Attachment: HIVE-12509.patch

[~ashutoshc], could you +1? It is just the q file updates that I missed when I 
checked in HIVE-12017. Thanks

> Regenerate q files after HIVE-12017 went in
> ---
>
> Key: HIVE-12509
> URL: https://issues.apache.org/jira/browse/HIVE-12509
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12509.patch
>
>
> A few q files need to be updated, as they were not updated when HIVE-12017 
> went in.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025108#comment-15025108
 ] 

Ashutosh Chauhan commented on HIVE-12175:
-

Forgot to ask: which classes need to be registered? If a user adds a UDF with 
her own classes, will that work, given that her new classes are not going to be 
registered with the serializer?
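
A minimal sketch of the registration behavior in question, assuming Kryo's 
default configuration (illustrative only, not code from the patch; vanilla 
package names are used, while Hive shades Kryo under 
org.apache.hive.com.esotericsoftware):

{code}
import com.esotericsoftware.kryo.Kryo;

public class KryoRegistrationSketch {
  public static void main(String[] args) {
    Kryo kryo = new Kryo();
    // With registration not required (Kryo's default), unregistered classes
    // -- such as classes shipped inside a user's UDF jar -- still serialize:
    // Kryo writes the fully qualified class name to the stream instead of a
    // compact registration id.
    kryo.setRegistrationRequired(false);
    // Registering a class is an optimization, not a requirement: it replaces
    // the class name on the wire with a small integer id.
    kryo.register(java.util.ArrayList.class);
  }
}
{code}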

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of kryo (2.22) has an issue (see the exception below and 
> in HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). 
> We need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and 
> the kryo community recommends using DefaultInstantiatorStrategy with a 
> fallback to StdInstantiatorStrategy (see the sketch below). More discussion 
> about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe-based serialization, which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}
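
The configuration recommended in the description, as a minimal sketch (again 
with vanilla package names rather than Hive's shaded ones):

{code}
import com.esotericsoftware.kryo.Kryo;
import org.objenesis.strategy.StdInstantiatorStrategy;

public class KryoInstantiationSketch {
  public static void main(String[] args) {
    Kryo kryo = new Kryo();
    // Try a no-arg constructor first, and fall back to Objenesis only for
    // classes without one, such as the private List type returned by
    // Arrays.asList().
    kryo.setInstantiatorStrategy(
        new Kryo.DefaultInstantiatorStrategy(new StdInstantiatorStrategy()));
  }
}
{code}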



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12399) Native Vector MapJoin can encounter "Null key not expected in MapJoin" and "Unexpected NULL in map join small table" exceptions

2015-11-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025125#comment-15025125
 ] 

Sergey Shelukhin commented on HIVE-12399:
-

+1

> Native Vector MapJoin can encounter  "Null key not expected in MapJoin" and 
> "Unexpected NULL in map join small table" exceptions
> 
>
> Key: HIVE-12399
> URL: https://issues.apache.org/jira/browse/HIVE-12399
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12399.01.patch, HIVE-12399.02.patch
>
>
> Instead of throwing an exception, just filter out NULLs in the Native Vector 
> MapJoin operators.
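
A hypothetical sketch of the filtering approach (assumptions: a long join key 
in column keyColumn, with repeating-vector handling omitted for brevity; this 
is not the attached patch):

{code}
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

public class NullKeyFilterSketch {
  // Shrinks the batch's selection to the rows whose join key is non-NULL,
  // so downstream map join lookups never see a NULL key.
  static void filterNullKeys(VectorizedRowBatch batch, int keyColumn) {
    LongColumnVector keys = (LongColumnVector) batch.cols[keyColumn];
    if (keys.noNulls) {
      return; // no NULLs present, nothing to filter
    }
    int newSize = 0;
    for (int i = 0; i < batch.size; i++) {
      int row = batch.selectedInUse ? batch.selected[i] : i;
      if (!keys.isNull[row]) {
        batch.selected[newSize++] = row; // keep only non-NULL keys
      }
    }
    batch.size = newSize;
    batch.selectedInUse = true;
  }
}
{code}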



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch

2015-11-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025143#comment-15025143
 ] 

Hive QA commented on HIVE-12483:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773549/HIVE-12483.1-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9788 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773549 - PreCommit-HIVE-SPARK-Build

> Fix precommit Spark test branch
> ---
>
> Key: HIVE-12483
> URL: https://issues.apache.org/jira/browse/HIVE-12483
> Project: Hive
>  Issue Type: Task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12483.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >