date:20150910

[jira] [Updated] (HIVE-11482) Add retrying thrift client for HiveServer2

2015-09-10 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11482:
--
Labels: TODOC2.0  (was: )

> Add retrying thrift client for HiveServer2
> --
>
> Key: HIVE-11482
> URL: https://issues.apache.org/jira/browse/HIVE-11482
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Amareshwari Sriramadasu
> Assignee: Akshay Goyal
> Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11482.02.patch
>
>
> Similar to 
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java,
> this improvement request is to add a retrying Thrift client for HiveServer2 
> that retries on Thrift exceptions.
> Here are a few commits made on a forked branch that can be picked:
> https://github.com/InMobi/hive/commit/7fb957fb9c2b6000d37c53294e256460010cb6b7
> https://github.com/InMobi/hive/commit/11e4b330f051c3f58927a276d562446761c9cd6d
> https://github.com/InMobi/hive/commit/241386fd870373a9253dca0bcbdd4ea7e665406c



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-11779) Beeline-cli: support hive.cli.pretty.output.num.cols in new CLI[beeline-cli branch]

2015-09-10 Thread Ferdinand Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-11779:

Assignee: Ke Jia

> Beeline-cli: support hive.cli.pretty.output.num.cols in new CLI[beeline-cli 
> branch]
> ---
>
> Key: HIVE-11779
> URL: https://issues.apache.org/jira/browse/HIVE-11779
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ke Jia
> Assignee: Ke Jia
> Attachments: HIVE-11779.1-beeline-cli.patch, 
> HIVE-11779.2-beeline-cli.patch
>
>
> The old CLI reads "hive.cli.pretty.output.num.cols" from the Hive 
> configuration to determine the number of columns to use when formatting the 
> output of the DESCRIBE PRETTY table_name command. We need to support this 
> configuration in the Beeline-based CLI.




[jira] [Updated] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-10 Thread Illya Yalovyy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-10980:
-
Attachment: HIVE-10980.patch

> Merge of dynamic partitions loads all data to default partition
> ---
>
> Key: HIVE-10980
> URL: https://issues.apache.org/jira/browse/HIVE-10980
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 0.14.0
> Environment: HDP 2.2.4 (also reproduced on Apache Hive built from 
> trunk) 
> Reporter: Illya Yalovyy
> Attachments: HIVE-10980.patch
>
>
> Conditions that lead to the issue:
> 1. Execution engine set to MapReduce
> 2. Partition columns have different types
> 3. Both static and dynamic partitions are used in the query
> 4. Dynamically generated partitions require merge
> Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".
> Steps to reproduce:
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=strict;
> set hive.optimize.sort.dynamic.partition=false;
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> set hive.execution.engine=mr;
> create external table sdp (
>   dataint bigint,
>   hour int,
>   req string,
>   cid string,
>   caid string
> )
> row format delimited
> fields terminated by ',';
> load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
> load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
> ...
> load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;
> create table tdp (cid string, caid string)
> partitioned by (dataint bigint, hour int, req string);
> insert overwrite table tdp partition (dataint=20150316, hour=16, req)
> select cid, caid, req from sdp where dataint=20150316 and hour=16;
> select * from tdp order by caid;
> show partitions tdp;
> Example of the input file:
> 20150316,16,reqA,clusterIdA,cacheId1
> 20150316,16,reqB,clusterIdB,cacheId2 
> 20150316,16,reqA,clusterIdC,cacheId3  
> 20150316,16,reqD,clusterIdD,cacheId4
> 20150316,16,reqA,clusterIdA,cacheId5  
> Actual result:
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__
> 
> dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__  
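
For comparison with the actual result above, here is a minimal sketch of the partitions one would normally expect from the sample rows: one partition per distinct req value under dataint=20150316/hour=16. It is plain Java and does not touch Hive; the class and method names (ExpectedPartitionsSketch, expectedPartitions) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration; it only derives partition names from the sample rows.
class ExpectedPartitionsSketch {
    // Each row is "dataint,hour,req,cid,caid", as in the example input file.
    static TreeSet<String> expectedPartitions(List<String> rows, String dataint, String hour) {
        TreeSet<String> partitions = new TreeSet<>();
        for (String row : rows) {
            String[] f = row.trim().split(",");
            // Keep only rows selected by the static partition values.
            if (f.length == 5 && f[0].equals(dataint) && f[1].equals(hour)) {
                partitions.add("dataint=" + f[0] + "/hour=" + f[1] + "/req=" + f[2]);
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "20150316,16,reqA,clusterIdA,cacheId1",
            "20150316,16,reqB,clusterIdB,cacheId2",
            "20150316,16,reqA,clusterIdC,cacheId3",
            "20150316,16,reqD,clusterIdD,cacheId4",
            "20150316,16,reqA,clusterIdA,cacheId5");
        // One line per partition that the dynamic column "req" should produce.
        for (String p : expectedPartitions(rows, "20150316", "16")) {
            System.out.println(p);
        }
    }
}
```

With the five sample rows this yields partitions for reqA, reqB, and reqD, none of which is the default partition.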




[jira] [Commented] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-10 Thread Wojciech Indyk (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738319#comment-14738319
 ] 

Wojciech Indyk commented on HIVE-11329:
---

[~leftylev], [~spena] OK, I'll prepare the documentation.
The patch works with both Hadoop 1 and Hadoop 2.

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
> Issue Type: Improvement
> Components: HBase Handler
> Affects Versions: 0.14.0
> Reporter: Wojciech Indyk
> Assignee: Wojciech Indyk
> Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix mapping 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase:
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5
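
The requested behaviour can be sketched in a few lines of plain Java. This is only an illustration of stripping a literal prefix from map keys; PrefixMapSketch and stripPrefix are hypothetical names, and nothing here is taken from the HBase handler or the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the HBase handler code.
class PrefixMapSketch {
    // Returns a copy of the map whose keys have the given literal prefix removed.
    static Map<String, Double> stripPrefix(Map<String, Double> columns, String prefix) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : columns.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(prefix)) {
                key = key.substring(prefix.length());
            }
            result.put(key, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> hbaseColumns = new LinkedHashMap<>();
        hbaseColumns.put("tag_one", 0.5);
        hbaseColumns.put("tag_two", 0.5);
        // Keys lose the "tag_" prefix; values are unchanged.
        System.out.println(stripPrefix(hbaseColumns, "tag_"));
    }
}
```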




[jira] [Commented] (HIVE-11482) Add retrying thrift client for HiveServer2

2015-09-10 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738323#comment-14738323
 ] 

Lefty Leverenz commented on HIVE-11482:
---

Doc note:  This adds five HiveServer2 configuration parameters to 
HiveConf.java, which will need to be documented in the wiki for 2.0.0.

The new configs are:

* hive.server2.thrift.client.retry.limit
* hive.server2.thrift.client.connect.retry.limit
* hive.server2.thrift.client.retry.delay.seconds
* hive.server2.thrift.client.user
* hive.server2.thrift.client.password

Wikidoc links:

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]
* [Setting Up HiveServer2 -- Configuration Properties in the hive-site.xml File 
| 
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-ConfigurationPropertiesinthehive-site.xmlFile]
  
(This has a link to the Configuration Properties doc so no new documentation is 
needed, unless you want to add some usage notes.  Perhaps 
hive.server2.thrift.client.user and hive.server2.thrift.client.password should 
be mentioned.)
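
As a usage note, the following is a rough sketch of how a retry limit and a retry delay of the kind listed above might drive a client-side retry loop. It uses no Hive or Thrift classes: RetrySketch and callWithRetry are hypothetical names, IOException stands in for a Thrift transport exception, and the actual patch may behave differently.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hive: retries a call when it throws IOException.
class RetrySketch {
    // retryLimit plays the role of a setting such as hive.server2.thrift.client.retry.limit,
    // delaySeconds the role of hive.server2.thrift.client.retry.delay.seconds (assumed semantics).
    static <T> T callWithRetry(Callable<T> call, int retryLimit, long delaySeconds)
            throws Exception {
        int attempts = Math.max(0, retryLimit) + 1;  // first try plus the allowed retries
        IOException last = null;
        for (int attempt = 0; attempt < attempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;  // treated as retriable; other exceptions propagate immediately
                if (attempt < attempts - 1) {
                    Thread.sleep(delaySeconds * 1000L);
                }
            }
        }
        throw last;  // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated transport failure");
            }
            return "ok";
        }, 5, 0);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```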

> Add retrying thrift client for HiveServer2
> --
>
> Key: HIVE-11482
> URL: https://issues.apache.org/jira/browse/HIVE-11482
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Amareshwari Sriramadasu
> Assignee: Akshay Goyal
> Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11482.02.patch
>
>
> Similar to 
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java,
> this improvement request is to add a retrying Thrift client for HiveServer2 
> that retries on Thrift exceptions.
> Here are a few commits made on a forked branch that can be picked:
> https://github.com/InMobi/hive/commit/7fb957fb9c2b6000d37c53294e256460010cb6b7
> https://github.com/InMobi/hive/commit/11e4b330f051c3f58927a276d562446761c9cd6d
> https://github.com/InMobi/hive/commit/241386fd870373a9253dca0bcbdd4ea7e665406c




[jira] [Commented] (HIVE-11590) AvroDeserializer is very chatty

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738332#comment-14738332
 ] 

Hive QA commented on HIVE-11590:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754821/HIVE-11590.1.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9424 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5219/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5219/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5219/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754821 - PreCommit-HIVE-TRUNK-Build

> AvroDeserializer is very chatty
> ---
>
> Key: HIVE-11590
> URL: https://issues.apache.org/jira/browse/HIVE-11590
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Swarnim Kulkarni
> Assignee: Swarnim Kulkarni
> Attachments: HIVE-11590.1.patch.txt
>
>
> AvroDeserializer is currently very chatty: it logs a large number of messages 
> at INFO level in the MapReduce logs. It would be helpful to move some of these 
> to DEBUG level to keep the logs clean.




[jira] [Commented] (HIVE-11684) Implement limit pushdown through outer join in CBO

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738339#comment-14738339
 ] 

Jesus Camacho Rodriguez commented on HIVE-11684:


[~sershe], I had not seen that issue, but the goal is the same. However, it 
seems no work has been done in HIVE-11530.

As you know, we currently do not hit the CBO path for such simple queries, so 
the new rule will not be applied in those cases.
IMO we should not implement these optimizations twice, i.e. in both Hive and 
Calcite/CBO, which has happened before. I think we should always be able to go 
through the CBO path for query optimization; we need to study where the specific 
overhead comes from, and either 1) reduce it if it is caused by bad 
design/engineering, or 2) disable some of the most advanced features for those 
simple queries.
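
For readers unfamiliar with the optimization named in the issue title, here is a toy sketch of why a limit can be pushed to the left input of a left outer join: every left row produces at least one output row, so the first n left rows are enough to produce n output rows. This is a simplified model with integer rows and a nested-loop join; LimitPushdownSketch and its methods are hypothetical and do not reflect the rule in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: not Hive or Calcite code. Rows are plain integers joined on equality.
class LimitPushdownSketch {
    // Left outer join; an unmatched left row is paired with "null".
    static List<String> leftOuterJoin(List<Integer> left, List<Integer> right) {
        List<String> out = new ArrayList<>();
        for (Integer l : left) {
            boolean matched = false;
            for (Integer r : right) {
                if (l.equals(r)) {
                    out.add(l + "," + r);
                    matched = true;
                }
            }
            if (!matched) {
                out.add(l + ",null");
            }
        }
        return out;
    }

    static <T> List<T> limit(List<T> rows, int n) {
        int end = Math.max(0, Math.min(n, rows.size()));
        return new ArrayList<>(rows.subList(0, end));
    }

    // Plan without pushdown: join everything, then keep n rows.
    static List<String> limitAfterJoin(List<Integer> left, List<Integer> right, int n) {
        return limit(leftOuterJoin(left, right), n);
    }

    // Plan with pushdown: keep n left rows, join, then apply the limit again.
    static List<String> limitPushedToLeft(List<Integer> left, List<Integer> right, int n) {
        return limit(leftOuterJoin(limit(left, n), right), n);
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, 2, 3, 4);
        List<Integer> right = Arrays.asList(2, 2, 5);
        System.out.println(limitAfterJoin(left, right, 2));
        System.out.println(limitPushedToLeft(left, right, 2));
    }
}
```

In this deterministic toy both plans return the same rows; in SQL without an ORDER BY, any n rows of the join result would be acceptable. The outer limit must stay in place because a left row can match several right rows.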

> Implement limit pushdown through outer join in CBO
> --
>
> Key: HIVE-11684
> URL: https://issues.apache.org/jira/browse/HIVE-11684
> Project: Hive
> Issue Type: New Feature
> Components: CBO
> Affects Versions: 2.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, 
> HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.patch
>
>





[jira] [Commented] (HIVE-11301) thrift metastore issue when getting stats results in disconnect

2015-09-10 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738352#comment-14738352
 ] 

Lefty Leverenz commented on HIVE-11301:
---

[~pxiong], versions 1.2.0 and 1.2.1 are already released, so any new commits to 
branch-1.2 will go into a future version 1.2.2.  That's why I think Fix 
Version/s should say 1.2.2 instead of 1.2.0.

> thrift metastore issue when getting stats results in disconnect
> ---
>
> Key: HIVE-11301
> URL: https://issues.apache.org/jira/browse/HIVE-11301
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Sergey Shelukhin
> Assignee: Pengcheng Xiong
> Fix For: 1.2.0, 1.3.0, 2.0.0
>
> Attachments: HIVE-11301.01.patch, HIVE-11301.02.patch
>
>
> On the metastore side it looks like this:
> {noformat}
> 2015-07-17 20:32:27,795 ERROR [pool-3-thread-150]: server.TThreadPoolServer 
> (TThreadPoolServer.java:run(294)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> and then
> {noformat}
> 2015-07-17 20:32:27,796 WARN  [pool-3-thread-150]: 
> transport.TIOStreamTransport (TIOStreamTransport.java:close(112)) - Error 
> closing output stream.
> java.net.SocketException: Socket closed
> at 
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
> at 
> org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110)
> at org.apache.thrift.transport.TSocket.close(TSocket.java:196)
> at 
> org.apache.hadoop.hive.thrift.TFilterTransport.close(TFilterTransport.java:52)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:304)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> which on the client manifests as:
> {noformat}
> 2015-07-17 20:32:27,796 WARN  [main()]: metastore.RetryingMetaStoreClient 
> (RetryingMetaStoreClient.java:invoke(187)) - MetaStoreClient lost connection. 
> Attempting to reconnect.
> org.apache.thrift.transport.TTransportException
> at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinary

[jira] [Commented] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-10 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738353#comment-14738353
 ] 

Lefty Leverenz commented on HIVE-11329:
---

Sorry to be a pest, but I still don't understand the fix version number.  Was 
the patch committed to branch-1 (for upcoming release 1.3.0) in addition to the 
commit to master (for upcoming 2.0.0) that I saw in the dev@hive mailing list?

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
> Issue Type: Improvement
> Components: HBase Handler
> Affects Versions: 0.14.0
> Reporter: Wojciech Indyk
> Assignee: Wojciech Indyk
> Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix mapping 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase:
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5




[jira] [Updated] (HIVE-11684) Implement limit pushdown through outer join in CBO

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11684:
---
Attachment: HIVE-11684.05.patch

> Implement limit pushdown through outer join in CBO
> --
>
> Key: HIVE-11684
> URL: https://issues.apache.org/jira/browse/HIVE-11684
> Project: Hive
> Issue Type: New Feature
> Components: CBO
> Affects Versions: 2.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, 
> HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, 
> HIVE-11684.patch
>
>





[jira] [Updated] (HIVE-11780) Hive support "set role none"

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Attachment: HIVE-11780.001.patch

> Hive support "set role none"
> 
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>





[jira] [Updated] (HIVE-11780) "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Summary: "set role none" support  (was: Hive support "set role none")

> "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>





[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Summary: Add "set role none" support  (was: "set role none" support)

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>





[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Description: HIVE should allow user to disable all roles granted for 
current session by the statement {{SET ROLE NONE; }}

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE; }}




[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Description: HIVE should allow user to disable all roles granted for 
current session by the statement {{SET ROLE NONE;}}  (was: HIVE should allow 
user to disable all roles granted for current session by the statement {{SET 
ROLE NONE; }})

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE;}}




[jira] [Commented] (HIVE-11482) Add retrying thrift client for HiveServer2

2015-09-10 Thread Akshay Goyal (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738410#comment-14738410
 ] 

Akshay Goyal commented on HIVE-11482:
-

[~leftylev] Thanks for pointing that out. It seems I am not authorized to update 
the above-mentioned wikidocs.

> Add retrying thrift client for HiveServer2
> --
>
> Key: HIVE-11482
> URL: https://issues.apache.org/jira/browse/HIVE-11482
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Amareshwari Sriramadasu
> Assignee: Akshay Goyal
> Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11482.02.patch
>
>
> Similar to 
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java,
> this improvement request is to add a retrying Thrift client for HiveServer2 
> that retries on Thrift exceptions.
> Here are a few commits made on a forked branch that can be picked:
> https://github.com/InMobi/hive/commit/7fb957fb9c2b6000d37c53294e256460010cb6b7
> https://github.com/InMobi/hive/commit/11e4b330f051c3f58927a276d562446761c9cd6d
> https://github.com/InMobi/hive/commit/241386fd870373a9253dca0bcbdd4ea7e665406c




[jira] [Commented] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Ferdinand Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738413#comment-14738413
 ] 

Ferdinand Xu commented on HIVE-11780:
-

+1 pending tests

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE;}}




[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Affects Version/s: 1.2.2, 2.0.0, 1.3.0

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 1.3.0, 2.0.0, 1.2.2
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE;}}




[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Fix Version/s: 1.2.2, 2.0.0, 1.3.0

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Components: Authorization
> Affects Versions: 1.3.0, 2.0.0, 1.2.2
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE;}}




[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Component/s: Authorization

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
> Issue Type: Improvement
> Components: Authorization
> Affects Versions: 1.3.0, 2.0.0, 1.2.2
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE;}}




[jira] [Updated] (HIVE-11781) Remove HiveLimit operator

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11781:
---
Attachment: HIVE-11781.patch

> Remove HiveLimit operator
> -
>
> Key: HIVE-11781
> URL: https://issues.apache.org/jira/browse/HIVE-11781
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 2.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11781.patch
>
>
> Calcite's Sort operator covers both sort and limit, so we should extend that 
> one. Further, we should get rid of the HiveLimit operator, which is never used.
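
To illustrate the idea of a single operator that carries both an ordering and a row limit, here is a toy model in plain Java. It is not Calcite's Sort class or Hive's operator; SortLimitSketch and its fields are hypothetical names. A pure limit is simply the case where no ordering is set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model, not Calcite's or Hive's class: one operator with an optional
// ordering and an optional row limit.
class SortLimitSketch {
    final Comparator<Integer> ordering;  // null means "no sort"
    final int fetch;                     // negative means "no limit"

    SortLimitSketch(Comparator<Integer> ordering, int fetch) {
        this.ordering = ordering;
        this.fetch = fetch;
    }

    List<Integer> apply(List<Integer> input) {
        List<Integer> rows = new ArrayList<>(input);
        if (ordering != null) {
            rows.sort(ordering);
        }
        if (fetch >= 0 && fetch < rows.size()) {
            rows = new ArrayList<>(rows.subList(0, fetch));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2);
        // Sort and limit in one operator.
        System.out.println(new SortLimitSketch(Comparator.<Integer>naturalOrder(), 2).apply(input));
        // Limit only: the same operator without an ordering.
        System.out.println(new SortLimitSketch(null, 2).apply(input));
    }
}
```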




[jira] [Updated] (HIVE-11781) Remove HiveLimit operator and rename HiveSort operator

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11781:
---
Summary: Remove HiveLimit operator and rename HiveSort operator  (was: 
Remove HiveLimit operator)

> Remove HiveLimit operator and rename HiveSort operator
> --
>
> Key: HIVE-11781
> URL: https://issues.apache.org/jira/browse/HIVE-11781
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 2.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11781.patch
>
>
> Calcite's Sort operator covers both sort and limit, so we should extend that 
> one. Further, we should get rid of the HiveLimit operator, which is never used.




[jira] [Commented] (HIVE-11779) Beeline-cli: support hive.cli.pretty.output.num.cols in new CLI[beeline-cli branch]

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738528#comment-14738528
 ] 

Hive QA commented on HIVE-11779:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755076/HIVE-11779.2-beeline-cli.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9447 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_describe_pretty
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/37/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/37/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-BEELINE-Build-37/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755076 - PreCommit-HIVE-BEELINE-Build

> Beeline-cli: support hive.cli.pretty.output.num.cols in new CLI[beeline-cli 
> branch]
> ---
>
> Key: HIVE-11779
> URL: https://issues.apache.org/jira/browse/HIVE-11779
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ke Jia
>Assignee: Ke Jia
> Attachments: HIVE-11779.1-beeline-cli.patch, 
> HIVE-11779.2-beeline-cli.patch
>
>
> The old CLI reads "hive.cli.pretty.output.num.cols" from the Hive 
> configuration to set the number of columns used when formatting output generated 
> by the DESCRIBE PRETTY table_name command. We need to support this 
> configuration using Beeline functionality.




[jira] [Resolved] (HIVE-11753) DatabaseMetadata.getColumn returns precision 0 for varchar/decimal

2015-09-10 Thread Yaqiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yaqiong resolved HIVE-11753.

   Resolution: Won't Fix
Fix Version/s: 0.14.1

The issue was fixed in 0.14.1; however, according to HIVE-5847, it was fixed in 
0.14.0.

> DatabaseMetadata.getColumn returns precision 0 for varchar/decimal
> --
>
> Key: HIVE-11753
> URL: https://issues.apache.org/jira/browse/HIVE-11753
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Yaqiong
> Fix For: 0.14.1
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I hit the same error as HIVE-5847 and HIVE-10933. The JDBC version is 
> hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar.
> My test program is below:
> /*** test.java ***/
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> import java.sql.SQLException;
> import java.sql.Statement;
>
> public class test {
>   private static String driverName = "org.apache.hive.jdbc.HiveDriver";
>
>   /**
>    * @param args
>    * @throws SQLException
>    */
>   public static void main(String[] args) throws SQLException {
>     try {
>       Class.forName(driverName);
>     } catch (ClassNotFoundException e) {
>       e.printStackTrace();
>       System.exit(1);
>     }
>     // replace "hadoop" here with the name of the user the queries should run as
>     Connection con =
>         DriverManager.getConnection("jdbc:hive2://:10001/default", "hadoop", "");
>     Statement stmt = con.createStatement();
>     String tableName = "test";
>     stmt.execute("drop table if exists " + tableName);
>     stmt.execute("create table " + tableName + " (key varchar(10)) "
>         + "row format delimited fields terminated by '\t'");
>     // In the getColumns result, column 4 is COLUMN_NAME and column 7 is COLUMN_SIZE.
>     ResultSet res = con.getMetaData().getColumns(null, "default", tableName, null);
>     while (res.next()) {
>       System.out.println("COLUMN_NAME: " + res.getString(4));
>       System.out.println("COLUMN_PRECISION: " + res.getString(7));
>     }
>   }
> }
> Result
> --
> COLUMN_NAME: key
> COLUMN_PRECISION: null




[jira] [Updated] (HIVE-6091) Empty pipeout files are created for connection create/close

2015-09-10 Thread Bing Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-6091:
--
Attachment: HIVE-6091.2.patch

Re-generated the patch based on the latest code in master branch

> Empty pipeout files are created for connection create/close
> ---
>
> Key: HIVE-6091
> URL: https://issues.apache.org/jira/browse/HIVE-6091
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Attachments: HIVE-6091.1.patch, HIVE-6091.2.patch, HIVE-6091.patch
>
>
> Pipeout files are created when a connection is established and removed only 
> when data was produced. Instead we should create them only when data has to 
> be fetched or remove them whether data is fetched or not.




[jira] [Updated] (HIVE-10495) Hive index creation code throws NPE if index table is null

2015-09-10 Thread Bing Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-10495:
---
Attachment: HIVE-10495.2.patch

Re-generated the patch based on the latest master branch.

> Hive index creation code throws NPE if index table is null
> --
>
> Key: HIVE-10495
> URL: https://issues.apache.org/jira/browse/HIVE-10495
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-10495.1.patch, HIVE-10495.2.patch
>
>
> The stack trace would be:
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_index(HiveMetaStore.java:2870)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
> at java.lang.reflect.Method.invoke(Method.java:611)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
> at $Proxy9.add_index(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createIndex(HiveMetaStoreClient.java:962)




[jira] [Updated] (HIVE-10495) Hive index creation code throws NPE if index table is null

2015-09-10 Thread Bing Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-10495:
---
Affects Version/s: 1.2.1

> Hive index creation code throws NPE if index table is null
> --
>
> Key: HIVE-10495
> URL: https://issues.apache.org/jira/browse/HIVE-10495
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-10495.1.patch, HIVE-10495.2.patch
>
>
> The stack trace would be:
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_index(HiveMetaStore.java:2870)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
> at java.lang.reflect.Method.invoke(Method.java:611)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
> at $Proxy9.add_index(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createIndex(HiveMetaStoreClient.java:962)




[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738612#comment-14738612
 ] 

Hive QA commented on HIVE-11768:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754838/HIVE-11768.1.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9423 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hive.common.util.TestShutdownHookManager.shutdownHookManager
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5220/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5220/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5220/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754838 - PreCommit-HIVE-TRUNK-Build

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
> Attachments: HIVE-11768.1.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB on heap.
>   Most of the paths contain a suffix of ".pipeout".
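
For context on the report above: File.deleteOnExit() records the path in a JVM-wide set that is only processed at shutdown, and deleting the file earlier does not remove the entry, so a long-running server that registers one file per session keeps accumulating paths. The following is a minimal sketch of one alternative, not the attached patch. It assumes a hypothetical SessionTempFile wrapper (not a Hive class) that deletes its file when the session closes and never registers it with the exit hook.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical wrapper, for illustration only; not part of Hive.
final class SessionTempFile implements AutoCloseable {
    private final Path path;

    SessionTempFile(String prefix) {
        try {
            // Files.createTempFile does not register anything with DeleteOnExitHook.
            this.path = Files.createTempFile(prefix, ".pipeout");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path path() {
        return path;
    }

    // Delete the file as soon as the owning session is closed.
    @Override
    public void close() {
        try {
            Files.deleteIfExists(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Used in a try-with-resources block, the file lives only as long as the session object, so nothing is left for the JVM exit hook to track.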




[jira] [Updated] (HIVE-11783) Extending HPL/SQL parser

2015-09-10 Thread Dmitry Tolpeko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-11783:
--
Attachment: HIVE-11783.1.patch

Patch created. 

> Extending HPL/SQL parser
> 
>
> Key: HIVE-11783
> URL: https://issues.apache.org/jira/browse/HIVE-11783
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11783.1.patch
>
>
> Need to extend procedural SQL parser and synchronize code base by adding 
> PART_COUNT, PART_COUNT_BY functions as well as CMP ROW_COUNT, CMP SUM and 
> COPY TO HDFS statements.




[jira] [Commented] (HIVE-11751) hive-exec-log4j2.xml settings causes DEBUG messages to be generated and ignored

2015-09-10 Thread Rajesh Balamohan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738688#comment-14738688
 ] 

Rajesh Balamohan commented on HIVE-11751:
-

Verified the patch. It no longer evaluates the DEBUG statements.
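
As a general illustration of why evaluating and then discarding DEBUG messages is costly, here is a minimal sketch. It uses java.util.logging as a stand-in for Log4j2 (an assumption made to keep the example dependency-free), and LazyDebugDemo and expensiveMessage are hypothetical names. It is not related to the patch itself, which changes the logging configuration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo class; java.util.logging stands in for Log4j2 here.
final class LazyDebugDemo {
    // Counts how many times the (pretend) expensive message is built.
    static final AtomicInteger BUILDS = new AtomicInteger();

    static String expensiveMessage() {
        BUILDS.incrementAndGet();
        return "plan=" + String.join(",", "a", "b", "c");
    }

    // Eager: the argument is evaluated before fine() can check the level.
    static void logEager(Logger log) {
        log.fine(expensiveMessage());
    }

    // Lazy: the supplier runs only if FINE is enabled for this logger.
    static void logLazy(Logger log) {
        Supplier<String> msg = LazyDebugDemo::expensiveMessage;
        log.fine(msg);
    }

    // A logger whose level is INFO, so FINE (debug-like) records are dropped.
    static Logger infoLogger() {
        Logger log = Logger.getLogger("lazy.debug.demo");
        log.setLevel(Level.INFO);
        return log;
    }
}
```

With the logger at INFO, logEager still builds the message and throws it away, while logLazy never calls expensiveMessage.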

> hive-exec-log4j2.xml settings causes DEBUG messages to be generated and 
> ignored
> ---
>
> Key: HIVE-11751
> URL: https://issues.apache.org/jira/browse/HIVE-11751
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11751.1.patch, hive-exec-log4j2.xml, 
> hive-log4j2.xml, hiveserver2_log4j.png
>
>
> Setting "INFO" in 
> dist/hive/conf/hive-exec-log4j2.xml fixes the problem. Should it be made the 
> default in hive-exec-log4j2.xml? Passing "--hiveconf hive.log.level=INFO" on 
> the command line has no effect.




[jira] [Updated] (HIVE-11759) Extend new cost model to correctly reflect limit cost

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11759:
---
Attachment: HIVE-11759.patch

> Extend new cost model to correctly reflect limit cost
> -
>
> Key: HIVE-11759
> URL: https://issues.apache.org/jira/browse/HIVE-11759
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11759.patch
>
>





[jira] [Commented] (HIVE-11759) Extend new cost model to correctly reflect limit cost

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738731#comment-14738731
 ] 

Jesus Camacho Rodriguez commented on HIVE-11759:


[~jpullokkaran], could you take a look? Thanks

> Extend new cost model to correctly reflect limit cost
> -
>
> Key: HIVE-11759
> URL: https://issues.apache.org/jira/browse/HIVE-11759
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11759.patch
>
>





[jira] [Updated] (HIVE-11785) Carriage return and new line are processed differently when hive.fetch.task.conversion is set to none

2015-09-10 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11785:

Attachment: test.parquet

> Carriage return and new line are processed differently when 
> hive.fetch.task.conversion is set to none
> -
>
> Key: HIVE-11785
> URL: https://issues.apache.org/jira/browse/HIVE-11785
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: test.parquet
>
>
> Create the table and perform the queries as follows. You will see different 
> results when the setting changes. Both results appear to be incorrect.
> {noformat}
> hive> create table repo (lvalue int, charstring string) stored as parquet;
> OK
> Time taken: 0.34 seconds
> hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
> Loading data to table default.repo
> chgrp: changing ownership of 
> 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
> belong to hive
> Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, 
> rawDataSize=0]
> OK
> Time taken: 0.732 seconds
> hive> set hive.fetch.task.conversion=more;
> hive> select * from repo;
> OK
> 1 newline
> here
> here  carriage return
> 3 both
> here
> Time taken: 0.253 seconds, Fetched: 3 row(s)
> hive> set hive.fetch.task.conversion=none;
> hive> select * from repo;
> Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441752031022_0006, Tracking URL = 
> http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
> Kill Command = 
> /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
> -kill job_1441752031022_0006
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
> 2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
> sec
> MapReduce Total cumulative CPU time: 2 seconds 980 msec
> Ended Job = job_1441752031022_0006
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS 
> Write: 51 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 980 msec
> OK
> 1 newline
> NULL  NULL
> 2 carriage return
> NULL  NULL
> 3 both
> NULL  NULL
> Time taken: 25.131 seconds, Fetched: 6 row(s)
> hive>
> {noformat}




[jira] [Updated] (HIVE-11785) Carriage return and new line are processed differently when hive.fetch.task.conversion is set to none

2015-09-10 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11785:

Description: 
Create the table and perform the queries as follows. You will see different 
results when the setting changes. Both results appear to be incorrect.
The expected result should be:
{noformat}
1   newline
here
2   carriage return
3   both
here
{noformat}

{noformat}
hive> create table repo (lvalue int, charstring string) stored as parquet;
OK
Time taken: 0.34 seconds
hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
Loading data to table default.repo
chgrp: changing ownership of 
'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
belong to hive
Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, rawDataSize=0]
OK
Time taken: 0.732 seconds
hive> set hive.fetch.task.conversion=more;
hive> select * from repo;
OK
1   newline
here
here    carriage return
3   both
here
Time taken: 0.253 seconds, Fetched: 3 row(s)
hive> set hive.fetch.task.conversion=none;
hive> select * from repo;
Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1441752031022_0006, Tracking URL = 
http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
Kill Command = 
/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
-kill job_1441752031022_0006
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
sec
MapReduce Total cumulative CPU time: 2 seconds 980 msec
Ended Job = job_1441752031022_0006
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS Write: 
51 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 980 msec
OK
1   newline
NULL    NULL
2   carriage return
NULL    NULL
3   both
NULL    NULL
Time taken: 25.131 seconds, Fetched: 6 row(s)
hive>
{noformat}

  was:
Create the table and perform the queries as follows. You will see different 
results when the setting changes. Both results appear to be incorrect.

{noformat}
hive> create table repo (lvalue int, charstring string) stored as parquet;
OK
Time taken: 0.34 seconds
hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
Loading data to table default.repo
chgrp: changing ownership of 
'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
belong to hive
Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, rawDataSize=0]
OK
Time taken: 0.732 seconds
hive> set hive.fetch.task.conversion=more;
hive> select * from repo;
OK
1   newline
here
here    carriage return
3   both
here
Time taken: 0.253 seconds, Fetched: 3 row(s)
hive> set hive.fetch.task.conversion=none;
hive> select * from repo;
Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1441752031022_0006, Tracking URL = 
http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
Kill Command = 
/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
-kill job_1441752031022_0006
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
sec
MapReduce Total cumulative CPU time: 2 seconds 980 msec
Ended Job = job_1441752031022_0006
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS Write: 
51 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 980 msec
OK
1   newline
NULL    NULL
2   carriage return
NULL    NULL
3   both
NULL    NULL
Time taken: 25.131 seconds, Fetched: 6 row(s)
hive>
{noformat}


> Carriage return and new line are processed differently when 
> hive.fetch.task.conversion is set to none
> -
>
> Key: HIVE-11785
> URL: https://issues.apache.org/jira/browse/HIVE-11785
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: test.parquet
>
>
> Create the table and perform the queries as follows. You will see different 
> results when the setting changes. Both results appear to be incorrect.
> The expected result should be:
> {noformat}
> 1 newline
> here
> 2 carriage return
> 3 both
> here
> {noformat}

[jira] [Commented] (HIVE-8846) Null checks missing in ORC list and map object inspectors

2015-09-10 Thread Sushil Kumar S (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738770#comment-14738770
 ] 

Sushil Kumar S commented on HIVE-8846:
--

This issue has been fixed as part of HIVE-9111 by [~prasanth_j]; correct me if 
I am wrong.

> Null checks missing in ORC list and map object inspectors
> -
>
> Key: HIVE-8846
> URL: https://issues.apache.org/jira/browse/HIVE-8846
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
>Reporter: Gregory Hart
>Priority: Critical
>  Labels: orcfile
>
> The OrcListObjectInspector and OrcMapObjectInspector classes do not check for 
> null data and instead throw an exception. To comply with the JavaDocs for 
> ListObjectInspector and MapObjectInspector, these classes should be updated 
> to check for null data.
> The following checks should be added for OrcListObjectInspector:
> - getListElement(Object, int) should return null for a null list or an 
> out-of-range index
> - getListLength(Object) should return -1 for data = null
> - getList(Object) should return null for data = null
> The following checks should be added for OrcMapObjectInspector:
> - getMap(Object) should return null for data = null
> - getMapSize(Object) should return -1 for a null map
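
The checks listed above can be sketched as follows. This is only an illustration on plain java.util collections: NullSafeAccess is a hypothetical helper, and the real OrcListObjectInspector and OrcMapObjectInspector operate on ORC-specific types, so the actual fix may look different.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper mirroring the requested checks; not the ORC inspector code.
final class NullSafeAccess {
    // Returns null for a null list or an out-of-range index.
    static Object getListElement(List<?> data, int index) {
        if (data == null || index < 0 || index >= data.size()) {
            return null;
        }
        return data.get(index);
    }

    // Returns -1 for a null list, otherwise its size.
    static int getListLength(List<?> data) {
        return data == null ? -1 : data.size();
    }

    // Returns -1 for a null map, otherwise its size.
    static int getMapSize(Map<?, ?> data) {
        return data == null ? -1 : data.size();
    }
}
```

Returning a sentinel (null or -1) instead of throwing lets callers treat missing data as a regular NULL value rather than a failure.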




[jira] [Updated] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11710:

Attachment: HIVE-11710.patch

Currently we don't set SessionState.info to System.out, which causes the query 
progress to be written to System.err once a set command is issued.

With this change, the progress is output to the console as expected.
{noformat}
0: jdbc:hive2://> set a = 0;
No rows affected (0.003 seconds)
0: jdbc:hive2://> select count(*) from src;
Query ID = axu_20150910095139_f396d686-b4b1-4c5c-b8ed-4f74e2362920
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_local386038793_0002, Tracking URL = http://localhost:8080/
Kill Command = 
/Users/axu/Documents/workspaces/tools/hadoop/hadoop-2.6.0/bin/hadoop job  -kill 
job_local386038793_0002
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2015-09-10 09:51:40,613 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local386038793_0002
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
+--+--+
| _c0  |
+--+--+
| 2|
+--+--+
{noformat}


> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} outputs only the 
> result, with no query progress.




[jira] [Commented] (HIVE-11751) hive-exec-log4j2.xml settings causes DEBUG messages to be generated and ignored

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738779#comment-14738779
 ] 

Hive QA commented on HIVE-11751:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754846/HIVE-11751.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9424 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5221/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5221/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5221/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754846 - PreCommit-HIVE-TRUNK-Build

> hive-exec-log4j2.xml settings causes DEBUG messages to be generated and 
> ignored
> ---
>
> Key: HIVE-11751
> URL: https://issues.apache.org/jira/browse/HIVE-11751
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11751.1.patch, hive-exec-log4j2.xml, 
> hive-log4j2.xml, hiveserver2_log4j.png
>
>
> Setting "INFO" in 
> dist/hive/conf/hive-exec-log4j2.xml fixes the problem. Should it be made the 
> default in hive-exec-log4j2.xml? Passing "--hiveconf hive.log.level=INFO" on 
> the command line has no effect.




[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738780#comment-14738780
 ] 

Aihua Xu commented on HIVE-11710:
-

I didn't create a code review for this since it's a simple change. I'm submitting 
the first patch to see whether it breaks anything.

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} outputs only the 
> result, with no query progress.




[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-10 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738788#comment-14738788
 ] 

Yongzhi Chen commented on HIVE-11745:
-

[~csun], [~szehon] could you review the code? Thanks

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
> +---+++--+
> | t3.a  | t3.d1  | t3.d2  |
> +---+++--+
> +---+++--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +---+++--+
> | t4.a  | t4.d1  | t4.d2  |
> +---+++--+
> +---+++--+
> No rows selected (0.266 seconds)
> {noformat}




[jira] [Commented] (HIVE-11774) Show macro definition for desc function

2015-09-10 Thread Damien Carol (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738812#comment-14738812
 ] 

Damien Carol commented on HIVE-11774:
-

[~navis] Should we add a {{DESC MACRO foo}} statement?

> Show macro definition for desc function 
> 
>
> Key: HIVE-11774
> URL: https://issues.apache.org/jira/browse/HIVE-11774
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11774.1.patch.txt
>
>
> Currently, desc function shows nothing for a macro. It would be helpful if it 
> showed the macro's definition.




[jira] [Updated] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11329:
---
Fix Version/s: (was: 1.3.0)
   2.0.0

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.14.0
>Reporter: Wojciech Indyk
>Assignee: Wojciech Indyk
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5
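
The requested behavior can be sketched as below. This is a minimal illustration on an ordinary Java map, not the HBase handler code from the patch; PrefixStripper and stripPrefix are hypothetical names, and the prefix is passed as a literal string ("tag_") rather than parsed from the "tag_.*" mapping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of the HBase handler.
final class PrefixStripper {
    // Returns a copy of the map with `prefix` removed from every key that starts with it.
    // Keys without the prefix are copied unchanged; insertion order is preserved.
    static Map<String, String> stripPrefix(Map<String, String> columns, String prefix) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : columns.entrySet()) {
            String key = e.getKey();
            String shortKey = key.startsWith(prefix) ? key.substring(prefix.length()) : key;
            out.put(shortKey, e.getValue());
        }
        return out;
    }
}
```

Applied to the record in the description, the keys tag_one and tag_two become one and two while the values stay 0.5.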




[jira] [Updated] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11329:
---
Fix Version/s: 1.3.0

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.14.0
>Reporter: Wojciech Indyk
>Assignee: Wojciech Indyk
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5




[jira] [Commented] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738870#comment-14738870
 ] 

Sergio Peña commented on HIVE-11329:


Thanks [~leftylev] for your help on this. Sorry for the late update, 
but I have now committed the patch to branch-1 as well.

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.14.0
>Reporter: Wojciech Indyk
>Assignee: Wojciech Indyk
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table using an HBase column prefix mapping 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase:
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> Representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> It should be:
> key: one, value: 0.5
> key: two, value: 0.5




[jira] [Commented] (HIVE-11762) TestHCatLoaderEncryption failures when using Hadoop 2.7

2015-09-10 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HIVE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738889#comment-14738889
 ] 

Sergio Peña commented on HIVE-11762:


It looks good [~jdere]
+1

> TestHCatLoaderEncryption failures when using Hadoop 2.7
> ---
>
> Key: HIVE-11762
> URL: https://issues.apache.org/jira/browse/HIVE-11762
> Project: Hive
>  Issue Type: Bug
>  Components: Shims, Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11762.1.patch, HIVE-11762.2.patch
>
>
> When running TestHCatLoaderEncryption with -Dhadoop23.version=2.7.0, we get 
> the following error during setup():
> {noformat}
> testReadDataFromEncryptedHiveTableByPig[5](org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption)
>   Time elapsed: 3.648 sec  <<< ERROR!
> java.lang.NoSuchMethodError: 
> org.apache.hadoop.hdfs.DFSClient.setKeyProvider(Lorg/apache/hadoop/crypto/key/KeyProviderCryptoExtension;)V
>   at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.getMiniDfs(Hadoop23Shims.java:534)
>   at 
> org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.initEncryptionShim(TestHCatLoaderEncryption.java:252)
>   at 
> org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:200)
> {noformat}
> It looks like between Hadoop 2.6 and Hadoop 2.7, the argument to 
> DFSClient.setKeyProvider() changed:
> {noformat}
>@VisibleForTesting
> -  public void setKeyProvider(KeyProviderCryptoExtension provider) {
> -this.provider = provider;
> +  public void setKeyProvider(KeyProvider provider) {
> {noformat}




[jira] [Commented] (HIVE-11606) Bucket map joins fail at hash table construction time

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738940#comment-14738940
 ] 

Hive QA commented on HIVE-11606:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754987/HIVE-11606.4.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9424 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5222/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5222/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5222/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754987 - PreCommit-HIVE-TRUNK-Build

> Bucket map joins fail at hash table construction time
> -
>
> Key: HIVE-11606
> URL: https://issues.apache.org/jira/browse/HIVE-11606
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-11606.1.patch, HIVE-11606.2.patch, 
> HIVE-11606.3.patch, HIVE-11606.4.patch
>
>
> {code}
> info=[Error: Failure while running task:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a 
> power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
> must be a power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>  
> {code}




[jira] [Commented] (HIVE-8846) Null checks missing in ORC list and map object inpsectors

2015-09-10 Thread Gregory Hart (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738957#comment-14738957
 ] 

Gregory Hart commented on HIVE-8846:


HIVE-9111 only fixed the NPEs. It did not fix the IndexOutOfBoundsException 
thrown when getListElement(Object, int) receives an index out of range.

> Null checks missing in ORC list and map object inpsectors
> -
>
> Key: HIVE-8846
> URL: https://issues.apache.org/jira/browse/HIVE-8846
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
>Reporter: Gregory Hart
>Priority: Critical
>  Labels: orcfile
>
> The OrcListObjectInspector and OrcMapObjectInspector classes do not check for 
> null data and instead throw an exception. To comply with the JavaDocs for 
> ListObjectInspector and MapObjectInspector, these classes should be updated 
> to check for null data.
> The following checks should be added for OrcListObjectInspector:
> - getListElement(Object, int) should return null for a null list or an 
> out-of-range index
> - getListLength(Object) should return -1 for data = null
> - getList(Object) should return null for data = null
> The following checks should be added for OrcMapObjectInspector:
> - getMap(Object) should return null for data = null
> - getMapSize(Object) should return -1 for a null map
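
The contract listed above can be sketched in Python as a set of null-safe accessors. This is an illustration of the expected return values only, not the ORC inspector implementation; the function names are hypothetical translations of the Java method names, and Python's None stands in for Java null.

```python
def get_list_element(data, index):
    # Null list or out-of-range index: return None instead of raising.
    if data is None or index < 0 or index >= len(data):
        return None
    return data[index]


def get_list_length(data):
    # -1 signals a null list, as the description requires.
    return -1 if data is None else len(data)


def get_list(data):
    return None if data is None else list(data)


def get_map(data):
    return None if data is None else dict(data)


def get_map_size(data):
    # -1 signals a null map.
    return -1 if data is None else len(data)
```

The out-of-range branch in get_list_element corresponds to the IndexOutOfBoundsException case mentioned in the comment above.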




[jira] [Updated] (HIVE-11784) Extends new cost model to reflect HDFS read/write cost when a new execution phase is created

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11784:
---
Attachment: HIVE-11784.patch

> Extends new cost model to reflect HDFS read/write cost when a new execution 
> phase is created
> 
>
> Key: HIVE-11784
> URL: https://issues.apache.org/jira/browse/HIVE-11784
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11784.patch
>
>





[jira] [Updated] (HIVE-11771) Parquet timestamp conversion errors

2015-09-10 Thread Jimmy Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-11771:
---
Attachment: HIVE-11771.1.patch

> Parquet timestamp conversion errors
> ---
>
> Key: HIVE-11771
> URL: https://issues.apache.org/jira/browse/HIVE-11771
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11771.1.patch
>
>
> We have a problem reading timestamps written to Parquet files by other 
> tools. The value is wrong after the conversion (not the same as it is meant 
> to be).




[jira] [Commented] (HIVE-11784) Extends new cost model to reflect HDFS read/write cost when a new execution phase is created

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739060#comment-14739060
 ] 

Jesus Camacho Rodriguez commented on HIVE-11784:


[~jpullokkaran], could you take a look? Thanks

> Extends new cost model to reflect HDFS read/write cost when a new execution 
> phase is created
> 
>
> Key: HIVE-11784
> URL: https://issues.apache.org/jira/browse/HIVE-11784
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11784.patch
>
>





[jira] [Updated] (HIVE-11784) Extend new cost model to reflect HDFS read/write cost when a new execution phase is created

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11784:
---
Summary: Extend new cost model to reflect HDFS read/write cost when a new 
execution phase is created  (was: Extends new cost model to reflect HDFS 
read/write cost when a new execution phase is created)

> Extend new cost model to reflect HDFS read/write cost when a new execution 
> phase is created
> ---
>
> Key: HIVE-11784
> URL: https://issues.apache.org/jira/browse/HIVE-11784
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11784.patch
>
>





[jira] [Updated] (HIVE-11759) Extend new cost model to correctly reflect limit cost

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11759:
---
Attachment: HIVE-11759.01.patch

> Extend new cost model to correctly reflect limit cost
> -
>
> Key: HIVE-11759
> URL: https://issues.apache.org/jira/browse/HIVE-11759
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11759.01.patch, HIVE-11759.patch
>
>





[jira] [Updated] (HIVE-11789) Support multicolumn in CBO

2015-09-10 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11789:
---
Summary: Support multicolumn in CBO  (was: CBO to support multicolumn)

> Support multicolumn in CBO
> --
>
> Key: HIVE-11789
> URL: https://issues.apache.org/jira/browse/HIVE-11789
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Support multicolumn in CBO on the way in and out, i.e. translate STRUCT to and from multicolumn.




[jira] [Updated] (HIVE-11301) thrift metastore issue when getting stats results in disconnect

2015-09-10 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11301:
---
Fix Version/s: (was: 1.2.0)
   1.2.2

> thrift metastore issue when getting stats results in disconnect
> ---
>
> Key: HIVE-11301
> URL: https://issues.apache.org/jira/browse/HIVE-11301
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Pengcheng Xiong
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11301.01.patch, HIVE-11301.02.patch
>
>
> On metastore side it looks like this:
> {noformat}
> 2015-07-17 20:32:27,795 ERROR [pool-3-thread-150]: server.TThreadPoolServer 
> (TThreadPoolServer.java:run(294)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> and then
> {noformat}
> 2015-07-17 20:32:27,796 WARN  [pool-3-thread-150]: 
> transport.TIOStreamTransport (TIOStreamTransport.java:close(112)) - Error 
> closing output stream.
> java.net.SocketException: Socket closed
> at 
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
> at 
> org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110)
> at org.apache.thrift.transport.TSocket.close(TSocket.java:196)
> at 
> org.apache.hadoop.hive.thrift.TFilterTransport.close(TFilterTransport.java:52)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:304)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Which on client manifests as
> {noformat}
> 2015-07-17 20:32:27,796 WARN  [main()]: metastore.RetryingMetaStoreClient 
> (RetryingMetaStoreClient.java:invoke(187)) - MetaStoreClient lost connection. 
> Attempting to reconnect.
> org.apache.thrift.transport.TTransportException
> at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
> at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_a

[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in colunm stats related tables

2015-09-10 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739130#comment-14739130
 ] 

Pengcheng Xiong commented on HIVE-11786:


I totally agree that these redundant columns violate database normalization 
rules and cause a lot of inconvenience; I have suffered from them myself. This 
seems to be a big change, though. Also cc'ing [~ashutoshc] and [~alangates] to 
check whether it will affect the HBase-based metastore. Thanks.

> Deprecate the use of redundant column in colunm stats related tables
> 
>
> Key: HIVE-11786
> URL: https://issues.apache.org/jira/browse/HIVE-11786
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant 
> columns such as DB_NAME, TABLE_NAME and PARTITION_NAME, since these tables 
> already have foreign keys (TBL_ID or PART_ID) referencing TBLS or PARTITIONS. 
> These redundant columns violate database normalization rules and make 
> column stats related features inconvenient, and sometimes difficult, to 
> implement. For example, when renaming a table, we also have to update the 
> TABLE_NAME column in these tables, which is unnecessary.
> This JIRA first deprecates the use of these columns at the HMS code level. A 
> follow-up JIRA will be opened to cover the DB schema change and upgrade.




[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-09-10 Thread Hari Sankar Sivarama Subramaniyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: HIVE-11634.91.patch

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas partition (ds='2000-04-10') could be 
> pruned. 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by partition pruner to prune the columns which otherwise will not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.
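
The rewrite described above amounts to projecting each struct in the IN list onto its partition columns. The following Python sketch shows that projection and the resulting pruning under stated assumptions: it is not the optimizer code from the patch, and the helper names partition_only_in_list and prune_partitions are hypothetical.

```python
def partition_only_in_list(columns, partition_columns, tuples):
    """Project each IN-list tuple onto the partition columns,
    dropping duplicates while keeping first-seen order."""
    positions = [i for i, c in enumerate(columns) if c in partition_columns]
    projected = []
    for t in tuples:
        p = tuple(t[i] for i in positions)
        if p not in projected:
            projected.append(p)
    return [columns[i] for i in positions], projected


def prune_partitions(partitions, cols, values):
    """Keep only partitions whose values appear in the projected IN list."""
    return [p for p in partitions if tuple(p[c] for c in cols) in values]


# struct(ds, key) IN (struct('2000-04-08', 1), struct('2000-04-09', 2))
cols, values = partition_only_in_list(
    ["ds", "key"], {"ds"}, [("2000-04-08", 1), ("2000-04-09", 2)]
)
all_parts = [{"ds": "2000-04-08"}, {"ds": "2000-04-09"}, {"ds": "2000-04-10"}]
print(prune_partitions(all_parts, cols, values))
```

In this sketch the derived predicate only narrows the partitions scanned; the original struct(ds, key) predicate still has to be applied to the rows, which is why the rewritten query keeps both conditions.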




[jira] [Commented] (HIVE-11605) Incorrect results with bucket map join in tez.

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739180#comment-14739180
 ] 

Hive QA commented on HIVE-11605:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754984/HIVE-11605.4.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9424 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5223/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5223/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5223/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754984 - PreCommit-HIVE-TRUNK-Build

> Incorrect results with bucket map join in tez.
> --
>
> Key: HIVE-11605
> URL: https://issues.apache.org/jira/browse/HIVE-11605
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0, 1.2.0, 1.0.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Critical
> Attachments: HIVE-11605.1.patch, HIVE-11605.3.patch, 
> HIVE-11605.4.patch
>
>
> In some cases, we aggressively try to convert to a bucket map join and this 
> ends up producing incorrect results.




[jira] [Commented] (HIVE-11724) WebHcat get jobs to order jobs on time order with latest at top

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739186#comment-14739186
 ] 

Hive QA commented on HIVE-11724:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754961/HIVE-11724.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5224/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5224/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5224/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5224/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   f4deff1..3df5457  branch-1   -> origin/branch-1
   7a71e50..7014407  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 7a71e50 HIVE-11754 : Not reachable code parts in StatsUtils 
(Navis via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 7014407 HIVE-11761: DoubleWritable hashcode for GroupBy is not 
properly generated (Aihua Xu, reviewed by Chao Sun)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754961 - PreCommit-HIVE-TRUNK-Build

> WebHcat get jobs to order jobs on time order with latest at top
> ---
>
> Key: HIVE-11724
> URL: https://issues.apache.org/jira/browse/HIVE-11724
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Attachments: HIVE-11724.1.patch
>
>
> HIVE-5519 added pagination support to WebHCat. That implementation 
> returns the jobs in lexicographic order, so older jobs show at the 
> top. 
> The improvement is to order them by time with the latest at the top. 
> Typically the latest (or running) jobs are more relevant to the user, so 
> time-based ordering with pagination makes more sense. 
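
The difference between the two orderings can be sketched in Python. This is not the WebHCat patch; it assumes job IDs of the form job_<cluster timestamp>_<sequence number> and uses the ID components as a stand-in for job time, and the helper names are hypothetical.

```python
def job_sort_key(job_id):
    # Assumed ID layout: "job_<clusterTimestamp>_<sequence>".
    _, cluster_ts, seq = job_id.split("_")
    return (int(cluster_ts), int(seq))


def order_latest_first(job_ids):
    """Newest jobs first, comparing numerically rather than as strings."""
    return sorted(job_ids, key=job_sort_key, reverse=True)


jobs = [
    "job_1441843200000_0002",
    "job_1441843200000_10000",
    "job_1441929600000_0001",
    "job_1441843200000_0010",
]
print(sorted(jobs))              # lexicographic: oldest job comes first
print(order_latest_first(jobs))  # time-like order: latest job comes first
```

Comparing the numeric components also keeps sequence numbers with different digit counts in the right order, which a plain string comparison does not guarantee.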




[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739219#comment-14739219
 ] 

Xuefu Zhang commented on HIVE-11710:


[~aihuaxu], thanks for working on this. Could you please point out where in the 
code the set command changes the progress output to System.err without your patch?

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to Beeline in embedded mode with {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session, like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} outputs only the 
> result, with no query progress.




[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-10 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739232#comment-14739232
 ] 

Szehon Ho commented on HIVE-11745:
--

Sorry, I see it. I'll leave the comments there.

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch
>
>
> A single partition spec works, but multiple partition specs do not.
> Steps to reproduce:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
> +-------+--------+--------+--+
> | t3.a  | t3.d1  | t3.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +-------+--------+--------+--+
> | t4.a  | t4.d1  | t4.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.266 seconds)
> {noformat}




[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-10 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739230#comment-14739230
 ] 

Szehon Ho commented on HIVE-11745:
--

Can you make a ReviewBoard?

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
> +---+++--+
> | t3.a  | t3.d1  | t3.d2  |
> +---+++--+
> +---+++--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +---+++--+
> | t4.a  | t4.d1  | t4.d2  |
> +---+++--+
> +---+++--+
> No rows selected (0.266 seconds)
> {noformat}




[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-10 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739244#comment-14739244
 ] 

Szehon Ho commented on HIVE-11745:
--

Left comments. Please make sure the review board is sent to the group so it's 
easier to see as well.

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
> +---+++--+
> | t3.a  | t3.d1  | t3.d2  |
> +---+++--+
> +---+++--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +---+++--+
> | t4.a  | t4.d1  | t4.d2  |
> +---+++--+
> +---+++--+
> No rows selected (0.266 seconds)
> {noformat}




[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in colunm stats related tables

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739276#comment-14739276
 ] 

Sergey Shelukhin commented on HIVE-11786:
-

Have you considered the perf impact on fetching column stats? In theory it 
should be small, but I don't know what DataNucleus would do.

> Deprecate the use of redundant column in colunm stats related tables
> 
>
> Key: HIVE-11786
> URL: https://issues.apache.org/jira/browse/HIVE-11786
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant columns 
> such as DB_NAME, TABLE_NAME, and PARTITION_NAME, since these tables already have 
> foreign keys such as TBL_ID or PART_ID referencing TBLS or PARTITIONS. 
> These redundant columns violate database normalization rules and make column 
> stats related features inconvenient (sometimes difficult) to implement. For 
> example, when renaming a table, we have to update the TABLE_NAME column in 
> these tables as well, which is unnecessary.
> This JIRA is first to deprecate the use of these columns at the HMS code level. A 
> follow-up JIRA will be opened to focus on the DB schema change and upgrade.




[jira] [Commented] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739301#comment-14739301
 ] 

Sergey Shelukhin commented on HIVE-11609:
-

+0.9. Ideally someone familiar with this code should take a look. Otherwise I 
will add 0.1 tomorrow ;)
[~navis]  [~ashutoshc]

> Capability to add a filter to hbase scan via composite key doesn't work
> ---
>
> Key: HIVE-11609
> URL: https://issues.apache.org/jira/browse/HIVE-11609
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-11609.1.patch.txt, HIVE-11609.2.patch.txt
>
>
> It seems like the capability to add filter to an hbase scan which was added 
> as part of HIVE-6411 doesn't work. This is primarily because in the 
> HiveHBaseInputFormat, the filter is added in the getsplits instead of 
> getrecordreader. This works fine for start and stop keys but not for filter 
> because a filter is respected only when an actual scan is performed. This is 
> also related to the initial refactoring that was done as part of HIVE-3420.




[jira] [Commented] (HIVE-11711) Merge hbase-metastore branch to trunk

2015-09-10 Thread Alan Gates (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739304#comment-14739304
 ] 

Alan Gates commented on HIVE-11711:
---

+1 for the merge

> Merge hbase-metastore branch to trunk
> -
>
> Key: HIVE-11711
> URL: https://issues.apache.org/jira/browse/HIVE-11711
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 2.0.0
>
> Attachments: HIVE-11711.1.patch, HIVE-11711.2.patch, 
> HIVE-11711.3.patch, HIVE-11711.4.patch, HIVE-11711.5.patch, 
> HIVE-11711.6.patch, HIVE-11711.7.patch
>
>
> Major development of hbase-metastore is done and it's time to merge the 
> branch back into master.
> Currently hbase-metastore is only invoked when running TestMiniTezCliDriver. 
> The instruction for setting up hbase-metastore is captured in 
> https://cwiki.apache.org/confluence/display/Hive/HBaseMetastoreDevelopmentGuide.




[jira] [Commented] (HIVE-11778) Merge beeline-cli branch to trunk

2015-09-10 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739303#comment-14739303
 ] 

Gunther Hagleitner commented on HIVE-11778:
---

This is proposing to essentially rip out hive CLI in favor of embedded beeline, 
correct? If so, I have two concerns before this goes in:

https://issues.apache.org/jira/browse/HIVE-10516 is still open. The original 
analysis seemed to indicate a sizable perf difference between hive cli and the 
new version. Where is that at now?

https://issues.apache.org/jira/browse/HIVE-10791 is still open with no 
indication that anyone will work on it. I would consider this a regression if 
we lost the UI by ripping out CLI without this. I think this should be closed 
first.

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.




[jira] [Commented] (HIVE-11684) Implement limit pushdown through outer join in CBO

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739307#comment-14739307
 ] 

Sergey Shelukhin commented on HIVE-11684:
-

Makes sense. I think (2) might be a better approach, for example join 
reordering can be disabled for 1-2 joins.
Actually, I wonder if some CBO features should be disable-able via configs, 
like existing Hive optimizations. Should I file a JIRA for that?

Also what is the shipping timeframe for this JIRA? Is the work done on master, 
or on a feature branch?

> Implement limit pushdown through outer join in CBO
> --
>
> Key: HIVE-11684
> URL: https://issues.apache.org/jira/browse/HIVE-11684
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, 
> HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, 
> HIVE-11684.patch
>
>





[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739314#comment-14739314
 ] 

Aihua Xu commented on HIVE-11710:
-

Thanks Xuefu for reviewing the code.

Actually, it's the following magic line that sets sessionState.err to System.err, 
while we didn't set sessionState.info. 

{noformat}
  // TODO: for hadoop jobs, progress is printed out to session.err,
  // we should find a way to feed back job progress to client
  sessionState.err = new PrintStream(System.err, true, "UTF-8");
{noformat}

In the SessionState class, the info stream falls back to the err stream when 
SessionState.info is null.

{noformat}
public static PrintStream getInfoStream() {
  SessionState ss = SessionState.get();
  return ((ss != null) && (ss.info != null)) ? ss.info : getErrStream();
}
{noformat}

After taking a closer look, the patch may not be a perfect fix. Actually, 
SessionState.err is closed after the set command (in HiveCommandOperation.java), 
but the object itself is left in place. So when info is null and err is not null 
(but invalid), nothing is printed. Maybe we should restore SessionState.err 
after the set command is done.
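
The failure mode described above (a non-null but already closed stream that swallows output) can be reproduced outside Hive. The sketch below is only an illustration under stated assumptions: `ClosedErrStreamSketch` and its fields are hypothetical stand-ins, not Hive's SessionState, and a `ByteArrayOutputStream` replaces System.err so the effect can be observed.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical stand-in for the two SessionState streams discussed above;
// this is not Hive's SessionState class.
class ClosedErrStreamSketch {
    PrintStream info; // left null, as in the embedded Beeline case
    PrintStream err;

    // Same fallback shape as getInfoStream(): use info if set, else err.
    PrintStream infoStream() {
        return info != null ? info : err;
    }

    // Returns what reached the sink before and after err is closed,
    // plus the error flag of the closed stream.
    static String[] demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ClosedErrStreamSketch ss = new ClosedErrStreamSketch();
        ss.err = new PrintStream(sink, true);

        ss.infoStream().print("progress-1");
        String before = sink.toString();

        ss.err.close();                      // object stays non-null but is unusable
        ss.infoStream().print("progress-2"); // dropped without any exception
        String after = sink.toString();

        return new String[] { before, after, String.valueOf(ss.err.checkError()) };
    }

    public static void main(String[] args) {
        for (String s : demo()) {
            System.out.println(s);
        }
    }
}
```

Because `PrintStream` never throws on write, the only visible sign is `checkError()`, which is why a null check on err alone would not catch this case.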



> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} will only output 
> result but no query progress.




[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-09-10 Thread Hari Sankar Sivarama Subramaniyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: HIVE-11634.91.patch

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.91.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we could prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by partition pruner to prune the columns which otherwise will not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.
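
As a rough illustration of the rewrite in the quoted description, the partition-only predicate can be derived by projecting each struct in the IN list onto the partition column positions. This is a minimal sketch in plain Java, assuming lists stand in for struct(...) tuples; the class and method names are hypothetical and do not come from Hive's optimizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: plain lists stand in for struct(...) tuples; none of
// these names come from Hive's optimizer.
class StructInProjectionSketch {
    // Projects every tuple of an IN list onto the given positions (the
    // partition columns), keeping first-seen order and dropping duplicates.
    static List<List<Object>> project(List<List<Object>> tuples, int... positions) {
        Set<List<Object>> seen = new LinkedHashSet<>();
        for (List<Object> tuple : tuples) {
            List<Object> projected = new ArrayList<>();
            for (int p : positions) {
                projected.add(tuple.get(p));
            }
            seen.add(projected);
        }
        return new ArrayList<>(seen);
    }

    // A partition can be pruned when its value is absent from the projected list.
    static boolean canPrune(String ds, List<List<Object>> projectedInList) {
        return !projectedInList.contains(Collections.<Object>singletonList(ds));
    }

    public static void main(String[] args) {
        List<List<Object>> inList = Arrays.asList(
            Arrays.<Object>asList("2000-04-08", 1),
            Arrays.<Object>asList("2000-04-09", 2));
        List<List<Object>> dsOnly = project(inList, 0);
        System.out.println(dsOnly);
        System.out.println(canPrune("2000-04-10", dsOnly));
        System.out.println(canPrune("2000-04-08", dsOnly));
    }
}
```

The projected list plays the role of the added (struct(ds)) IN (...) predicate: it is implied by the original predicate, so adding it does not change the result, but it mentions only partition columns and can therefore be evaluated by a partition pruner.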




[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-09-10 Thread Hari Sankar Sivarama Subramaniyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: (was: HIVE-11634.91.patch)

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we could prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by partition pruner to prune the columns which otherwise will not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.




[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739348#comment-14739348
 ] 

Xuefu Zhang commented on HIVE-11710:


Could we also check what Hive CLI is doing w.r.t this?

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} will only output 
> result but no query progress.




[jira] [Commented] (HIVE-11776) LLAP: Generate golden files for all MiniLlapCluster tests

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739371#comment-14739371
 ] 

Sergey Shelukhin commented on HIVE-11776:
-

btw, doesn't orc_llap have golden file output for TestCliDriver only? I thought 
it needs separate outputs per driver

> LLAP: Generate golden files for all MiniLlapCluster tests
> -
>
> Key: HIVE-11776
> URL: https://issues.apache.org/jira/browse/HIVE-11776
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Only orc_llap.q has golden file output. Generate for other tests too.




[jira] [Commented] (HIVE-11778) Merge beeline-cli branch to trunk

2015-09-10 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739373#comment-14739373
 ] 

Xuefu Zhang commented on HIVE-11778:


[~hagleitn], yes, the proposal is to get rid of old Hive CLI, as discussed in 
the dev mailing list. As to HIVE-10791, I had a comment in the JIRA about the 
completeness of the original feature. Could you comment on that and better yet, 
find the expert to complete the feature for Beeline?

As to performance, I think it makes sense to remeasure it to know where we are 
and whether the gap is acceptable.

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.




[jira] [Commented] (HIVE-11776) LLAP: Generate golden files for all MiniLlapCluster tests

2015-09-10 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739375#comment-14739375
 ] 

Prasanth Jayachandran commented on HIVE-11776:
--

It has one for llap and one for the default cli driver:
https://github.com/apache/hive/blob/llap/ql/src/test/results/clientpositive/llap/orc_llap.q.out
https://github.com/apache/hive/blob/llap/ql/src/test/results/clientpositive/orc_llap.q.out

> LLAP: Generate golden files for all MiniLlapCluster tests
> -
>
> Key: HIVE-11776
> URL: https://issues.apache.org/jira/browse/HIVE-11776
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Only orc_llap.q has golden file output. Generate for other tests too.




[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739379#comment-14739379
 ] 

Hive QA commented on HIVE-11645:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754962/HIVE-11645.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9424 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5225/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5225/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5225/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754962 - PreCommit-HIVE-TRUNK-Build

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.2.patch, 
> HIVE-11645.3.patch, HIVE-11645.4.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.




[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-09-10 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739388#comment-14739388
 ] 

Prasanth Jayachandran commented on HIVE-11645:
--

The patch mostly looks good.
{code}
Loaded : 199/199 partitions.
 Time taken for load dynamic partitions : 9.471 seconds.
 Time taken for adding to write entity : 29
{code}

The time unit for write entity is still missing. 
Minor nit: can you remove the space before ":" and the full stop at the ends (to 
make it consistent)? Also, can you leave a blank line before the "Loaded" string? It 
currently sits too close to the summary table.

{code}
VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS  DURATION_SECONDS  CPU_TIME_MILLIS  GC_TIME_MILLIS  INPUT_RECORDS  OUTPUT_RECORDS
Map 1               1                0             0             13.29                0           3,472         13,056               0
Loaded : 199/199 partitions.
{code}

Other than these minor cosmetic tweaks the patch LGTM, +1. 

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.2.patch, 
> HIVE-11645.3.patch, HIVE-11645.4.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.




[jira] [Commented] (HIVE-11510) Metatool updateLocation warning on views

2015-09-10 Thread Sushanth Sowmyan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739405#comment-14739405
 ] 

Sushanth Sowmyan commented on HIVE-11510:
-

+1, committing to branch-1 and master.

> Metatool updateLocation warning on views
> 
>
> Key: HIVE-11510
> URL: https://issues.apache.org/jira/browse/HIVE-11510
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 0.14.0
>Reporter: Eric Czech
>Assignee: Wei Zheng
> Attachments: HIVE-11510.1.patch, HIVE-11510.2.patch, 
> HIVE-11510.3.patch
>
>
> If views are present in a hive database, issuing a 'hive metatool 
> -updateLocation' command will result in an error like this:
> ...
> Warning: Found records with bad LOCATION in SDS table.. 
> bad location URI: null
> bad location URI: null
> bad location URI: null
> 
> Based on the source code for Metatool, it looks like there would then be a 
> "bad location URI: null" message for every view and it also appears this is 
> happening simply because the 'sds' table in the hive schema has a column 
> called location that is NULL only for views.




[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739409#comment-14739409
 ] 

Aihua Xu commented on HIVE-11710:
-

CLI sets as follows in CLIDriver.java.

{noformat}
  ss.out = new PrintStream(System.out, true, "UTF-8");
  ss.info = new PrintStream(System.err, true, "UTF-8");
  ss.err = new CachingPrintStream(System.err, true, "UTF-8");
{noformat}



> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} will only output 
> result but no query progress.




[jira] [Commented] (HIVE-11778) Merge beeline-cli branch to trunk

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739412#comment-14739412
 ] 

Sergey Shelukhin commented on HIVE-11778:
-

At the time HIVE-8495 was done, throwing away CLI was not even on the horizon... 
Given that the in-place UI uses the same data as the scrolling UI, it should not 
be very difficult from that side, so I think it makes much more sense for a 
beeline expert to look at it ;)

I do have to say that on large hive dags the scrolling UI is impossible to use, 
so it would be a blocker. I don't even care much about the progress bar as long 
as it can avoid horizontal scrolling, and update numbers in place (that's just 
me :))

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.




[jira] [Commented] (HIVE-11778) Merge beeline-cli branch to trunk

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739415#comment-14739415
 ] 

Sergey Shelukhin commented on HIVE-11778:
-

Btw, another thing to do before throwing away Hive CLI is enabling beeline CLI Driver on 
HiveQA (HIVE-10884). I was close to getting it to work at some point but HiveQA 
was not able to run the tests for some mysterious reasons.

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.




[jira] [Comment Edited] (HIVE-11778) Merge beeline-cli branch to trunk

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739415#comment-14739415
 ] 

Sergey Shelukhin edited comment on HIVE-11778 at 9/10/15 7:24 PM:
--

Btw, another thing to do before throwing away Hive CLI is enabling beeline CLI Driver on 
HiveQA (HIVE-10884). I was close to getting it to work at some point but 
-HiveQA was not able to run the tests for some mysterious reasons- looks like 
the tests get stuck (sorry forgot we made as much progress as we did :)).


was (Author: sershe):
Btw, another thing to do before throwing away Hive CLI is enabling beeline CLI Driver on 
HiveQA (HIVE-10884). I was close to getting it to work at some point but HiveQA 
was not able to run the tests for some mysterious reasons.

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.




[jira] [Commented] (HIVE-10884) Enable some beeline tests and turn on HIVE-4239 by default

2015-09-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739425#comment-14739425
 ] 

Sergey Shelukhin commented on HIVE-10884:
-

Looks like the state is that the tests get stuck on HiveQA. On a Linux box 
here, I can run any tests as long as I run them one at a time, otherwise after 
the first test it gets stuck. The same thing might be happening on HiveQA.
On an unrelated note, in light of HIVE-11778 I wonder if we should move all 
tests to beeline driver now?
Feel free to assign to yourself if you want to take this over. I won't be able 
to resume working on this anytime soon. 

> Enable some beeline tests and turn on HIVE-4239 by default
> --
>
> Key: HIVE-10884
> URL: https://issues.apache.org/jira/browse/HIVE-10884
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10884.01.patch, HIVE-10884.02.patch, 
> HIVE-10884.03.patch, HIVE-10884.04.patch, HIVE-10884.05.patch, 
> HIVE-10884.06.patch, HIVE-10884.07.patch, HIVE-10884.07.patch, 
> HIVE-10884.patch
>
>
> See comments in HIVE-4239.
> Beeline tests with parallelism need to be enabled to turn compilation 
> parallelism on by default.




[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-09-10 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739430#comment-14739430
 ] 

Prasanth Jayachandran commented on HIVE-11645:
--

Another suggestion:
For every partition that we load we print 2 log lines
{code}
Loaded partition : {wr_returned_date=2003-07-31}
Partition tpcds_bin_partitioned_orc_4.wr_temp{wr_returned_date=2003-07-31} 
stats: [numFiles=1, numRows=1, totalSize=1676, rawDataSize=96]
{code}

Can these be combined into one line in the print summary?
{code}
Loaded partition: {wr_returned_date=2003-07-31} stats: [numFiles=1, numRows=1, 
totalSize=1676, rawDataSize=96]
{code}
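
A minimal sketch of how the combined line could be built, assuming the partition spec and the basic stats are available as ordered maps. The class `PartitionSummary` and the helper `formatLoadedPartition` are hypothetical names used only for illustration; they are not existing Hive methods.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class PartitionSummary {
    // Hypothetical helper (not part of Hive): renders one summary line per
    // loaded partition instead of two separate log lines.
    public static String formatLoadedPartition(Map<String, String> spec,
                                               Map<String, Long> stats) {
        StringJoiner specPart = new StringJoiner(", ", "{", "}");
        spec.forEach((k, v) -> specPart.add(k + "=" + v));

        StringJoiner statsPart = new StringJoiner(", ", "[", "]");
        stats.forEach((k, v) -> statsPart.add(k + "=" + v));

        return "Loaded partition: " + specPart + " stats: " + statsPart;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the fields print in a stable order.
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("wr_returned_date", "2003-07-31");

        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("numFiles", 1L);
        stats.put("numRows", 1L);
        stats.put("totalSize", 1676L);
        stats.put("rawDataSize", 96L);

        System.out.println(formatLoadedPartition(spec, stats));
    }
}
```

With the values from the log lines quoted in this comment, the sketch prints the single-line form proposed here.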

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.2.patch, 
> HIVE-11645.3.patch, HIVE-11645.4.patch, HIVE-11645.patch
>
>
> Currently, updates go to the log file and there is no visible progress on the console.




[jira] [Updated] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-10 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11745:

Attachment: HIVE-11745.2.patch

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch, HIVE-11745.2.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
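> +-------+--------+--------+--+
> | t3.a  | t3.d1  | t3.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +-------+--------+--------+--+
> | t4.a  | t4.d1  | t4.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+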
> No rows selected (0.266 seconds)
> {noformat}




[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739441#comment-14739441
 ] 

Aihua Xu commented on HIVE-11710:
-

Actually after {{IOUtils.cleanup(LOG, parentSession.getSessionState().err);}}, 
System.err will always be in a closed state although the object is there.

It seems we shouldn't close the System.err stream.
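
The effect can be reproduced with plain JDK streams: once a `PrintStream` is closed, later writes are silently dropped. The sketch below is only an illustration under stated assumptions, not the patch. A `ByteArrayOutputStream`-backed `PrintStream` stands in for `System.err`, and `NonClosingStream` is a hypothetical wrapper whose `close()` only flushes, so session cleanup can close its own stream without closing the shared one.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class CloseShieldDemo {
    // Hypothetical wrapper: close() flushes but leaves the underlying stream open.
    static final class NonClosingStream extends FilterOutputStream {
        NonClosingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len); // avoid FilterOutputStream's byte-by-byte default
        }

        @Override
        public void close() throws IOException {
            flush(); // deliberately do not close the wrapped stream
        }
    }

    // Simulates session cleanup closing the session's error stream, then
    // returns everything that reached the shared sink afterwards.
    public static String outputAfterSessionCleanup(boolean shield) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream shared = new PrintStream(sink, true); // stand-in for System.err

        OutputStream target = shield ? new NonClosingStream(shared) : shared;
        PrintStream sessionErr = new PrintStream(target, true);
        sessionErr.println("session output");
        sessionErr.close(); // what the cleanup call does to the session stream

        shared.println("query progress"); // dropped if 'shared' was closed above
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println("unshielded keeps progress: "
                + outputAfterSessionCleanup(false).contains("query progress"));
        System.out.println("shielded keeps progress:   "
                + outputAfterSessionCleanup(true).contains("query progress"));
    }
}
```

Whether to wrap the stream or simply skip closing it when it is `System.err` is a design choice for the actual fix; the sketch only shows why closing the shared stream loses later output.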

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to Beeline in embedded mode with {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session, for example {{set aa=true;}}.
> After that, any query such as {{select count(*) from src;}} outputs only the 
> result, with no query progress.




[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-10 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739443#comment-14739443
 ] 

Yongzhi Chen commented on HIVE-11745:
-

Attached a second patch that fixes the first issue mentioned on the review board 
and adds a comment for the second issue.

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch, HIVE-11745.2.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
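> +-------+--------+--------+--+
> | t3.a  | t3.d1  | t3.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +-------+--------+--------+--+
> | t4.a  | t4.d1  | t4.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+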
> No rows selected (0.266 seconds)
> {noformat}




[jira] [Commented] (HIVE-11727) Hive on Tez through Oozie: Some queries fail with fnf exception

2015-09-10 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739472#comment-14739472
 ] 

Gunther Hagleitner commented on HIVE-11727:
---

Failures are unrelated.

> Hive on Tez through Oozie: Some queries fail with fnf exception
> ---
>
> Key: HIVE-11727
> URL: https://issues.apache.org/jira/browse/HIVE-11727
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-11727.1.patch
>
>
> When we read back row containers from disk, a misconfiguration causes us to 
> look for a non-existing file.
> {noformat}
> Caused by: java.io.FileNotFoundException: File 
> file:/grid/0/hadoop/yarn/local/usercache/appcache/application_1440685000561_0028/container_e26_1440685000561_0028_01_05/container_tokens
>  does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:608)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:821)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:598)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
>   at 
> org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:169)
>   ... 31 more
> {noformat}




[jira] [Updated] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11710:

Attachment: (was: HIVE-11710.patch)

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to Beeline in embedded mode with {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session, for example {{set aa=true;}}.
> After that, any query such as {{select count(*) from src;}} outputs only the 
> result, with no query progress.




[jira] [Updated] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-10 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11710:

Attachment: HIVE-11710.patch

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to Beeline in embedded mode with {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session, for example {{set aa=true;}}.
> After that, any query such as {{select count(*) from src;}} outputs only the 
> result, with no query progress.




[jira] [Commented] (HIVE-11727) Hive on Tez through Oozie: Some queries fail with fnf exception

2015-09-10 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739522#comment-14739522
 ] 

Gunther Hagleitner commented on HIVE-11727:
---

Committed to branch-1 and master

> Hive on Tez through Oozie: Some queries fail with fnf exception
> ---
>
> Key: HIVE-11727
> URL: https://issues.apache.org/jira/browse/HIVE-11727
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 1.3.0
>
> Attachments: HIVE-11727.1.patch
>
>
> When we read back row containers from disk, a misconfiguration causes us to 
> look for a non-existing file.
> {noformat}
> Caused by: java.io.FileNotFoundException: File 
> file:/grid/0/hadoop/yarn/local/usercache/appcache/application_1440685000561_0028/container_e26_1440685000561_0028_01_05/container_tokens
>  does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:608)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:821)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:598)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
>   at 
> org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:169)
>   ... 31 more
> {noformat}




[jira] [Commented] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work

2015-09-10 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739549#comment-14739549
 ] 

Ashutosh Chauhan commented on HIVE-11609:
-

In one of the .q tests, the following line is removed:

filterExpr: ((key.col1 = '238') and (key.col2 = '1238')) (type: boolean)

which indicates that the filter was not pushed to TableScanOp. Is that expected?

> Capability to add a filter to hbase scan via composite key doesn't work
> ---
>
> Key: HIVE-11609
> URL: https://issues.apache.org/jira/browse/HIVE-11609
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-11609.1.patch.txt, HIVE-11609.2.patch.txt
>
>
> It seems like the capability to add a filter to an HBase scan, which was added 
> as part of HIVE-6411, doesn't work. This is primarily because in 
> HiveHBaseInputFormat the filter is added in getSplits instead of 
> getRecordReader. This works fine for start and stop keys but not for a filter, 
> because a filter is respected only when an actual scan is performed. This is 
> also related to the initial refactoring that was done as part of HIVE-3420.




[jira] [Commented] (HIVE-11762) TestHCatLoaderEncryption failures when using Hadoop 2.7

2015-09-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739556#comment-14739556
 ] 

Hive QA commented on HIVE-11762:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755035/HIVE-11762.2.patch

{color:red}ERROR:{color} -1 due to 592 failed/errored test(s), 9424 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join28
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join31
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_stats
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_stats2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestSparkCliD

[jira] [Updated] (HIVE-11678) Add AggregateProjectMergeRule

2015-09-10 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11678:

Attachment: HIVE-11678.3.patch

> Add AggregateProjectMergeRule
> -
>
> Key: HIVE-11678
> URL: https://issues.apache.org/jira/browse/HIVE-11678
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11678.2.patch, HIVE-11678.3.patch, HIVE-11678.patch
>
>
> This will help to get rid of extra projects on top of Aggregation, thus 
> compacting the query plan.




[jira] [Commented] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-10 Thread Illya Yalovyy (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739561#comment-14739561
 ] 

Illya Yalovyy commented on HIVE-10980:
--

Patch is submitted for review:
https://reviews.apache.org/r/38268/

> Merge of dynamic partitions loads all data to default partition
> ---
>
> Key: HIVE-10980
> URL: https://issues.apache.org/jira/browse/HIVE-10980
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
> Environment: HDP 2.2.4 (also reproduced on apache hive built from 
> trunk) 
>Reporter: Illya Yalovyy
> Attachments: HIVE-10980.patch
>
>
> Conditions that lead to the issue:
> 1. Execution engine set to MapReduce
> 2. Partition columns have different types
> 3. Both static and dynamic partitions are used in the query
> 4. Dynamically generated partitions require merge
> Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".
> Steps to reproduce:
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=strict;
> set hive.optimize.sort.dynamic.partition=false;
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> set hive.execution.engine=mr;
> create external table sdp (
>   dataint bigint,
>   hour int,
>   req string,
>   cid string,
>   caid string
> )
> row format delimited
> fields terminated by ',';
> load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
> load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
> ...
> load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;
> create table tdp (cid string, caid string)
> partitioned by (dataint bigint, hour int, req string);
> insert overwrite table tdp partition (dataint=20150316, hour=16, req)
> select cid, caid, req from sdp where dataint=20150316 and hour=16;
> select * from tdp order by caid;
> show partitions tdp;
> Example of the input file:
> 20150316,16,reqA,clusterIdA,cacheId1
> 20150316,16,reqB,clusterIdB,cacheId2 
> 20150316,16,reqA,clusterIdC,cacheId3  
> 20150316,16,reqD,clusterIdD,cacheId4
> 20150316,16,reqA,clusterIdA,cacheId5  
> Actual result:
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__
> 
> dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__  




[jira] [Assigned] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

2015-09-10 Thread Illya Yalovyy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy reassigned HIVE-10980:


Assignee: Illya Yalovyy

> Merge of dynamic partitions loads all data to default partition
> ---
>
> Key: HIVE-10980
> URL: https://issues.apache.org/jira/browse/HIVE-10980
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
> Environment: HDP 2.2.4 (also reproduced on apache hive built from 
> trunk) 
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-10980.patch
>
>
> Conditions that lead to the issue:
> 1. Execution engine set to MapReduce
> 2. Partition columns have different types
> 3. Both static and dynamic partitions are used in the query
> 4. Dynamically generated partitions require merge
> Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".
> Steps to reproduce:
> set hive.exec.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=strict;
> set hive.optimize.sort.dynamic.partition=false;
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> set hive.execution.engine=mr;
> create external table sdp (
>   dataint bigint,
>   hour int,
>   req string,
>   cid string,
>   caid string
> )
> row format delimited
> fields terminated by ',';
> load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
> load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
> ...
> load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;
> create table tdp (cid string, caid string)
> partitioned by (dataint bigint, hour int, req string);
> insert overwrite table tdp partition (dataint=20150316, hour=16, req)
> select cid, caid, req from sdp where dataint=20150316 and hour=16;
> select * from tdp order by caid;
> show partitions tdp;
> Example of the input file:
> 20150316,16,reqA,clusterIdA,cacheId1
> 20150316,16,reqB,clusterIdB,cacheId2 
> 20150316,16,reqA,clusterIdC,cacheId3  
> 20150316,16,reqD,clusterIdD,cacheId4
> 20150316,16,reqA,clusterIdA,cacheId5  
> Actual result:
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
> clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__
> 
> dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__  




[jira] [Updated] (HIVE-11771) Parquet timestamp conversion errors

2015-09-10 Thread Jimmy Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-11771:
---
Attachment: HIVE-11771.2.patch

Attached v2 that is backward compatible.

> Parquet timestamp conversion errors
> ---
>
> Key: HIVE-11771
> URL: https://issues.apache.org/jira/browse/HIVE-11771
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11771.1.patch, HIVE-11771.2.patch
>
>
> We have a problem reading timestamps written to Parquet files by other 
> tools. The value is wrong after the conversion (not the same as it is meant 
> to be).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-11771) Parquet timestamp conversion errors

2015-09-10 Thread Jimmy Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-11771:
---
Priority: Minor  (was: Major)

> Parquet timestamp conversion errors
> ---
>
> Key: HIVE-11771
> URL: https://issues.apache.org/jira/browse/HIVE-11771
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11771.1.patch, HIVE-11771.2.patch
>
>
> We have a problem reading timestamps written to Parquet files by other 
> tools. The value is wrong after the conversion (not the same as it is meant 
> to be).




[jira] [Assigned] (HIVE-11763) Use * instead of sum(hash(*)) on Parquet predicate (PPD) integration tests

2015-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-11763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-11763:
--

Assignee: Sergio Peña

> Use * instead of sum(hash(*)) on Parquet predicate (PPD) integration tests
> --
>
> Key: HIVE-11763
> URL: https://issues.apache.org/jira/browse/HIVE-11763
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11763.1.patch
>
>
> The integration tests for Parquet predicate push down (PPD) use the following 
> query to validate the values filtered:
> {noformat}
> select sum(hash(*)) from ...
> {noformat}
> It would be better to use {{select * from ...}} instead, so we can see that those 
> values are correct. It is difficult to tell whether a value was filtered by 
> looking at the hash.
> Also, we can try to limit the number of rows of the INSERT ... SELECT 
> statement to avoid displaying many rows when validating the data. I think a 
> LIMIT 2 on each SELECT would be enough.
> For example, the parquet_ppd_boolean.ppd has this:
> {noformat}
> insert overwrite table newtypestbl select * from (select cast("apple" as 
> char(10)), cast("bee" as varchar(10)), 0.22, true from src src1 union all 
> select cast("hello" as char(10)), cast("world" as varchar(10)), 11.22, false 
> from src src2) uniontbl;
> {noformat}
> If we use LIMIT 2, then we will reduce the # of rows:
> {noformat}
> insert overwrite table newtypestbl select * from (select cast("apple" as 
> char(10)), cast("bee" as varchar(10)), 0.22, true from src src1 LIMIT 2 union 
> all select cast("hello" as char(10)), cast("world" as varchar(10)), 11.22, 
> false from src src2 LIMIT 2) uniontbl;
> {noformat}
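
To illustrate why a summed hash is hard to validate by eye, here is a small sketch in plain Java. It is an illustration only: `List.hashCode()` stands in for Hive's `hash()` UDF and does not produce the same values, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class HashVsRows {
    // Stand-in for sum(hash(*)): folds every row into one number.
    // Uses List.hashCode() for illustration; this is not Hive's hash() UDF.
    public static long sumOfHashes(List<List<Object>> rows) {
        long sum = 0;
        for (List<Object> row : rows) {
            sum += row.hashCode();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<List<Object>> expected = Arrays.asList(
                Arrays.<Object>asList("apple", "bee", 0.22, true),
                Arrays.<Object>asList("hello", "world", 11.22, false));
        // Suppose the predicate wrongly filtered out the second row.
        List<List<Object>> actual = Arrays.asList(
                Arrays.<Object>asList("apple", "bee", 0.22, true));

        // The hash form yields two opaque numbers; a mismatch shows that
        // something changed, but not which row is missing.
        System.out.println(sumOfHashes(expected) + " vs " + sumOfHashes(actual));

        // The row form shows the missing row directly.
        System.out.println(expected + " vs " + actual);
    }
}
```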




[jira] [Updated] (HIVE-11763) Use * instead of sum(hash(*)) on Parquet predicate (PPD) integration tests

2015-09-10 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-11763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11763:
---
Attachment: HIVE-11763.1.patch

> Use * instead of sum(hash(*)) on Parquet predicate (PPD) integration tests
> --
>
> Key: HIVE-11763
> URL: https://issues.apache.org/jira/browse/HIVE-11763
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11763.1.patch
>
>
> The integration tests for Parquet predicate push down (PPD) use the following 
> query to validate the values filtered:
> {noformat}
> select sum(hash(*)) from ...
> {noformat}
> It would be better to use {{select * from ...}} instead, so we can see that those 
> values are correct. It is difficult to tell whether a value was filtered by 
> looking at the hash.
> Also, we can try to limit the number of rows of the INSERT ... SELECT 
> statement to avoid displaying many rows when validating the data. I think a 
> LIMIT 2 on each SELECT would be enough.
> For example, the parquet_ppd_boolean.ppd has this:
> {noformat}
> insert overwrite table newtypestbl select * from (select cast("apple" as 
> char(10)), cast("bee" as varchar(10)), 0.22, true from src src1 union all 
> select cast("hello" as char(10)), cast("world" as varchar(10)), 11.22, false 
> from src src2) uniontbl;
> {noformat}
> If we use LIMIT 2, then we will reduce the # of rows:
> {noformat}
> insert overwrite table newtypestbl select * from (select cast("apple" as 
> char(10)), cast("bee" as varchar(10)), 0.22, true from src src1 LIMIT 2 union 
> all select cast("hello" as char(10)), cast("world" as varchar(10)), 11.22, 
> false from src src2 LIMIT 2) uniontbl;
> {noformat}




[jira] [Commented] (HIVE-11762) TestHCatLoaderEncryption failures when using Hadoop 2.7

2015-09-10 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739647#comment-14739647
 ] 

Jason Dere commented on HIVE-11762:
---

Whoa, a lot of failures... I ran TestSparkCliDriver and saw the following error:
{noformat}
2015-09-10 14:14:31,970 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - Exception in thread "main" 
java.lang.NoClassDefFoundError: org/apache/hadoop/crypto/key/KeyProvider
2015-09-10 14:14:31,970 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hadoop.hive.shims.Hadoop23Shims.(Hadoop23Shims.java:1058)
2015-09-10 14:14:31,970 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at java.lang.Class.forName0(Native 
Method)
2015-09-10 14:14:31,970 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
java.lang.Class.forName(Class.java:190)
2015-09-10 14:14:31,970 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:146)
2015-09-10 14:14:31,970 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:141)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hadoop.hive.conf.HiveConf$ConfVars.(HiveConf.java:369)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hive.spark.client.rpc.RpcConfiguration.(RpcConfiguration.java:46)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:146)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
java.lang.reflect.Method.invoke(Method.java:606)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.crypto.key.KeyProvider
2015-09-10 14:14:31,971 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
java.net.URLClassLoader$1.run(URLClassLoader.java:366)
2015-09-10 14:14:31,972 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
java.net.URLClassLoader$1.run(URLClassLoader.java:355)
2015-09-10 14:14:31,972 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
java.security.AccessController.doPrivileged(Native Method)
2015-09-10 14:14:31,972 INFO  [stderr-redir-1] client.SparkClientImpl 
(SparkClientImpl.java:run(588)) - at 
java.net.URLClassLoader.findClass(URLClassLoader.java:354)
2015-09-10 14:14:31,972 INFO  [stderr-redir-1] client.S
