[jira] [Commented] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986920#comment-14986920
 ] 

Lefty Leverenz commented on HIVE-11777:
---

+1 for the parameter description of hive.orc.splits.directory.batch.ms.

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.patch
>
>
> In the case of metastore footer PPD, we don't want to make a PPD call, with all
> the attendant SARG, MS and HBase overhead, for each directory. If we wait for some
> time (10ms? some fraction of inputs?) we can make one call without losing
> overall perf.
> For now, make it time-based.
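
The time-based idea in the description can be illustrated with a small sketch. This is not the code from the attached patches; DirectoryBatcher, its constructor arguments and the injected clock are all hypothetical names chosen for illustration. The window length plays the role that a setting such as hive.orc.splits.directory.batch.ms would presumably play, and the callback stands in for the single PPD call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Hypothetical helper, not Hive code: groups directories that arrive within one time window. */
class DirectoryBatcher {
    private final long windowMs;                     // assumed batching window in milliseconds
    private final LongSupplier clockMs;              // injected clock so the sketch is deterministic
    private final Consumer<List<String>> batchCall;  // stands in for the one expensive call per batch
    private final List<String> pending = new ArrayList<>();
    private long windowStart;

    DirectoryBatcher(long windowMs, LongSupplier clockMs, Consumer<List<String>> batchCall) {
        this.windowMs = windowMs;
        this.clockMs = clockMs;
        this.batchCall = batchCall;
    }

    /** Adds a directory; if the current window has already expired, the pending batch is sent first. */
    void add(String dir) {
        long now = clockMs.getAsLong();
        if (!pending.isEmpty() && now - windowStart >= windowMs) {
            flush();
        }
        if (pending.isEmpty()) {
            windowStart = now;  // the first directory of a batch opens a new window
        }
        pending.add(dir);
    }

    /** Sends whatever is pending as a single batch; does nothing when there is no pending directory. */
    void flush() {
        if (!pending.isEmpty()) {
            batchCall.accept(new ArrayList<>(pending));
            pending.clear();
        }
    }
}
```

In this sketch a batch is only sent when a later directory arrives or when flush() is called explicitly, so a caller would still need a final flush() after the last directory.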



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12093) launch local task to process map join cost long time

2015-11-03 Thread liuchuanqi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986955#comment-14986955
 ] 

liuchuanqi commented on HIVE-12093:
---

Yes, after using cast as string, it runs normally.
Closing this issue.


>  launch local task to process map join cost long time 
> --
>
> Key: HIVE-12093
> URL: https://issues.apache.org/jira/browse/HIVE-12093
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: liuchuanqi
>
>  launch local task to process map join cost long time   
> 2015-10-08 19:34:35 INFO 2015-10-08 19:34:35  Starting to launch local task 
> to process map join;  maximum memory = 1908932608
> 2015-10-08 20:07:43 INFO 2015-10-08 20:07:43  Dump the side-table for tag: 1 
> with group count: 148024 into file: 
> file:/tmp/test/6b99a4b8-0db3-4c62-a0f3-20547504b2b4/hive_2015-10-08_19-30-11_948_5184081524408167915-1/-local-10015/HashTable-Stage-33/MapJoin-mapfile71--.hashtable
> 2015-10-08 20:07:43 INFO 2015-10-08 20:07:43  Uploaded 1 File to: 
> file:/tmp/test/6b99a4b8-0db3-4c62-a0f3-20547504b2b4/hive_2015-10-08_19-30-11_948_5184081524408167915-1/-local-10015/HashTable-Stage-33/MapJoin-mapfile71--.hashtable
>  (8922201 bytes)
> 2015-10-08 20:07:43 INFO 2015-10-08 20:07:43  End of local task; Time Taken: 
> 1987.642 sec.





[jira] [Updated] (HIVE-12171) LLAP: BuddyAllocator failures when querying uncompressed data

2015-11-03 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12171:
--
Labels: TODOC2.0  (was: )

> LLAP: BuddyAllocator failures when querying uncompressed data
> -
>
> Key: HIVE-12171
> URL: https://issues.apache.org/jira/browse/HIVE-12171
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12171.01.patch, HIVE-12171.02.patch, 
> HIVE-12171.03.patch, HIVE-12171.other.patch, HIVE-12171.patch
>
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> testing.lineitem where l_shipdate >= '1993-01-01' and l_shipdate < 
> '1994-01-01' ;
> Caused by: org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: Failed to allocate 492; at 0 out of 1
> at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:176)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:882)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:319)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}





[jira] [Commented] (HIVE-11749) To sometimes deadlock when run a query

2015-11-03 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986965#comment-14986965
 ] 

Kai Sasaki commented on HIVE-11749:
---

This problem is related to the {{TableDesc#equals}} method. The equals method 
cannot simply be made {{synchronized}} because of the [consistency 
problem|http://stackoverflow.com/questions/1636399/correctly-synchronizing-equals-in-java].
How can we solve it?
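
One commonly used way to synchronize a two-object comparison without risking a deadlock is to always take the two monitors in a fixed global order. The sketch below shows that technique only; it is not the change proposed for this issue, and GuardedDesc is a hypothetical stand-in rather than Hive's TableDesc. Whether lock ordering fits the actual deadlock in the attached stack trace is an open assumption.

```java
import java.util.Objects;

/** Hypothetical stand-in for a mutable descriptor; this is not Hive's TableDesc. */
class GuardedDesc {
    // Used only when two distinct objects happen to share an identity hash code.
    private static final Object TIE_LOCK = new Object();
    private String name;

    GuardedDesc(String name) {
        this.name = name;
    }

    synchronized void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof GuardedDesc)) {
            return false;
        }
        GuardedDesc other = (GuardedDesc) o;
        int h1 = System.identityHashCode(this);
        int h2 = System.identityHashCode(other);
        if (h1 == h2) {
            // Rare tie: serialize on one shared lock so that both orders are safe.
            synchronized (TIE_LOCK) {
                return lockedEquals(this, other);
            }
        }
        // Lock the object with the smaller identity hash first, so a.equals(b)
        // and b.equals(a) acquire the two monitors in the same order.
        return h1 < h2 ? lockedEquals(this, other) : lockedEquals(other, this);
    }

    private static boolean lockedEquals(GuardedDesc first, GuardedDesc second) {
        synchronized (first) {
            synchronized (second) {
                return Objects.equals(first.name, second.name);
            }
        }
    }

    @Override
    public synchronized int hashCode() {
        return Objects.hashCode(name);
    }
}
```

The ordering rule matters more than the particular key: any total order over the objects would do, and the identity hash is used here only because it needs no extra field.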

> To sometimes deadlock when run a query
> --
>
> Key: HIVE-11749
> URL: https://issues.apache.org/jira/browse/HIVE-11749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Ryu Kobayashi
>Assignee: Kai Sasaki
> Attachments: HIVE-11749.stack-tarace.txt
>
>
> A deadlock occurs sometimes, but not always, when running the query. The environment is as follows:
> * Hadoop 2.6.0
> * Hive 0.13
> * JDK 1.7.0_79
> I will attach the stack trace.





[jira] [Commented] (HIVE-11293) HiveConnection.setAutoCommit(true) throws exception

2015-11-03 Thread Michał Węgrzyn (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986901#comment-14986901
 ] 

Michał Węgrzyn commented on HIVE-11293:
---

Great. Thanks Alan.

> HiveConnection.setAutoCommit(true) throws exception
> ---
>
> Key: HIVE-11293
> URL: https://issues.apache.org/jira/browse/HIVE-11293
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.0.0
>Reporter: Andriy Shumylo
>Assignee: Michał Węgrzyn
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11293.2.patch, HIVE-11293.patch
>
>
> Effectively autoCommit is always true for HiveConnection, however 
> setAutoCommit(true) throws exception, causing problems in existing JDBC code.
> Should be 
> {code}
>   @Override
>   public void setAutoCommit(boolean autoCommit) throws SQLException {
>     if (!autoCommit) {
>       throw new SQLException("disabling autocommit is not supported");
>     }
>   }
> {code}
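
To show the effect on calling code, here is a minimal sketch built around a stub that copies only the proposed method. AutoCommitStub and its accepts helper are hypothetical names for illustration; the real HiveConnection implements java.sql.Connection and is not reproduced here.

```java
import java.sql.SQLException;

/** Hypothetical stub mirroring only the proposed method; this is not the real HiveConnection. */
class AutoCommitStub {
    public void setAutoCommit(boolean autoCommit) throws SQLException {
        // Autocommit is effectively always on, so only an attempt to turn it off is rejected.
        if (!autoCommit) {
            throw new SQLException("disabling autocommit is not supported");
        }
    }

    /** Returns true when the call succeeds, false when it throws SQLException. */
    static boolean accepts(boolean autoCommit) {
        try {
            new AutoCommitStub().setAutoCommit(autoCommit);
            return true;
        } catch (SQLException e) {
            return false;
        }
    }
}
```

With this behaviour, existing JDBC code that calls setAutoCommit(true) defensively keeps working, while setAutoCommit(false) still fails loudly.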





[jira] [Updated] (HIVE-11525) Bucket pruning

2015-11-03 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11525:
---
Attachment: HIVE-12315.1.patch

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.WIP.patch
>
>
> Logically and functionally, bucketing and partitioning are quite similar: 
> both provide a mechanism to segregate and separate a table's data based on 
> its content. Thanks to that, significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design, where PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common, if not a standard feature, in all current 
> RDBMS, while BUCKETING seems to be specific to Hive.
> In a way, BUCKETING could also be called "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Although these two are recognised as separate features 
> in Hive, there should be nothing to prevent leveraging the same existing 
> query/join optimisations across the two.
> BUCKET pruning
> Enable an optimisation equivalent to partition PRUNING for queries on BUCKETED 
> tables.
> The simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all bucket files that 
> belong to a table.
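
The pruning step for an equality predicate can be sketched as follows. This is an illustration, not code from the attached patches. It assumes the bucketing rule is "non-negative hash modulo bucket count" and that an int column hashes to its own value; BucketPruningSketch and its method names are hypothetical.

```java
/** Hypothetical sketch of picking the single bucket for an equality predicate; not Hive code. */
class BucketPruningSketch {
    /** Assumed bucketing rule: clear the sign bit, then take the remainder by the bucket count. */
    static int bucketFor(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    /** Assumes an int column hashes to its own value, as Integer.hashCode does. */
    static int bucketForIntKey(int key, int numBuckets) {
        return bucketFor(Integer.hashCode(key), numBuckets);
    }
}
```

Under these assumptions, a table bucketed into 8 buckets on colA would need only bucket 3 for the predicate colA=123123, since 123123 = 8 * 15390 + 3.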





[jira] [Updated] (HIVE-11525) Bucket pruning

2015-11-03 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11525:
---
Attachment: HIVE-11525.1.patch






[jira] [Updated] (HIVE-11525) Bucket pruning

2015-11-03 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11525:
---
Attachment: (was: HIVE-12315.1.patch)






[jira] [Commented] (HIVE-12078) LLAP: document config settings

2015-11-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986954#comment-14986954
 ] 

Lefty Leverenz commented on HIVE-12078:
---

HIVE-12171 removes *hive.llap.io.cache.orc.arena.size* and adds 
*hive.llap.io.cache.orc.arena.count* in release 2.0.0.

> LLAP: document config settings
> --
>
> Key: HIVE-12078
> URL: https://issues.apache.org/jira/browse/HIVE-12078
> Project: Hive
>  Issue Type: Bug
>Reporter: Lefty Leverenz
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: llap
>
> Attachments: HIVE-12078.patch
>
>
> From HIVE-12060:
> There's a typo in the description of 
> hive.tez.input.generate.consistent.splits: "Whether to generate consisten 
> split" – need "t" for consistent.
> Several of the new hive.llap.* configs don't have descriptions. Are they for 
> internal use only?
> Please add newlines (\n) in the description of 
> hive.llap.queue.metrics.percentiles.intervals and keep the indentation 
> identical for all three lines of the description. (And to pick a nit, a few 
> config description indentations are off by one character, including that one.)





[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-11-03 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11634:
--
Labels: TODOC2.0  (was: )

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>  Labels: TODOC2.0
> Fix For: 1.3.0
>
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, 
> HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, 
> HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, 
> HIVE-11634.995.patch, HIVE-11634.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we could prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by the partition pruner to prune the columns which would otherwise not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.
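
The rewrite described above amounts to projecting each tuple of the IN list onto the partition column and pruning with the resulting value set. The sketch below shows that idea on plain Java collections; it is not the optimizer code from the patch, and StructInPruningSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of deriving a partition-only predicate from a struct IN list; not Hive code. */
class StructInPruningSketch {
    /** Projects each tuple, e.g. (ds, key), onto the partition column at partColIndex. */
    static Set<String> partitionValues(List<Object[]> inList, int partColIndex) {
        Set<String> values = new LinkedHashSet<>();
        for (Object[] tuple : inList) {
            values.add((String) tuple[partColIndex]);
        }
        return values;
    }

    /** Keeps only the partitions whose value appears in the projected set. */
    static List<String> prune(List<String> partitions, Set<String> wanted) {
        List<String> kept = new ArrayList<>();
        for (String partition : partitions) {
            if (wanted.contains(partition)) {
                kept.add(partition);
            }
        }
        return kept;
    }
}
```

For the example in the description, the tuples ('2000-04-08',1) and ('2000-04-09',2) project to the two ds values 2000-04-08 and 2000-04-09, so partition ds='2000-04-10' is dropped while the original struct predicate still filters rows on key.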





[jira] [Resolved] (HIVE-12093) launch local task to process map join cost long time

2015-11-03 Thread liuchuanqi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuchuanqi resolved HIVE-12093.
---
Resolution: Duplicate






[jira] [Commented] (HIVE-12171) LLAP: BuddyAllocator failures when querying uncompressed data

2015-11-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986945#comment-14986945
 ] 

Lefty Leverenz commented on HIVE-12171:
---

Doc note:  This removes configuration parameter 
*hive.llap.io.cache.orc.arena.size* (see HIVE-12078) and adds 
*hive.llap.io.cache.orc.arena.count*, which will need to be documented in the 
wiki along with the other LLAP configs.

* [Hive Configuration Properties (needs a new LLAP section) | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties]






[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-11-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986979#comment-14986979
 ] 

Lefty Leverenz commented on HIVE-11634:
---

Doc notes and questions:

1)  I added a TODOC2.0 label even though this has Fix Version 1.3.0, because I 
don't see a commit to branch-1 in email.  (The commit to master is 
b7986a8fbb950e7f76d70d923cf0d9ebee5e8360.)  Perhaps a branch-1 commit will come 
later.

2)  What is HiveConf.java.orig and why does the commit to master add a new 
configuration parameter to it 
(*hive.vectorized.execution.reducesink.new.enabled*)?  Neither the file nor the 
parameter appears in HIVE-11634.patch or HIVE-11634.995.patch.  HIVE-12290 
added HiveConf.java.orig to master seemingly by accident.

3)  This removes *hive.optimize.point.lookup.extract* from HiveConf.java and 
adds *hive.optimize.partition.columns.separate*, so the wiki needs to be 
updated for release 2.0.0 (or 1.3.0 if the patch is going to be committed to 
branch-1).







[jira] [Commented] (HIVE-10397) LLAP: Implement Tez SplitSizeEstimator for Orc

2015-11-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987436#comment-14987436
 ] 

Eugene Koifman commented on HIVE-10397:
---

[~prasanth_j] [~gopalv] Is this portable to branch-1 or will it pull a bunch of 
other changes?


> LLAP: Implement Tez SplitSizeEstimator for Orc
> --
>
> Key: HIVE-10397
> URL: https://issues.apache.org/jira/browse/HIVE-10397
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-10397.2.patch, HIVE-10397.patch, 
> HIVE-10397.trunk.patch
>
>
> This is the patch for HIVE-7428. For now this will be in the llap branch, as Hive has 
> not bumped up the Tez version yet.





[jira] [Commented] (HIVE-11749) To sometimes deadlock when run a query

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987021#comment-14987021
 ] 

ASF GitHub Bot commented on HIVE-11749:
---

GitHub user Lewuathe opened a pull request:

https://github.com/apache/hive/pull/52

Initial patch

see: https://issues.apache.org/jira/browse/HIVE-11749

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Lewuathe/hive HIVE-11749

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/52.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #52


commit c4895579b98f8c097fa931ec9a6ce8d9f4e0c7ed
Author: Lewuathe 
Date:   2015-11-03T10:03:36Z

Initial patch




> To sometimes deadlock when run a query
> --
>
> Key: HIVE-11749
> URL: https://issues.apache.org/jira/browse/HIVE-11749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Ryu Kobayashi
>Assignee: Kai Sasaki
> Attachments: HIVE-11749.00.patch, HIVE-11749.stack-tarace.txt
>
>
> A deadlock occurs sometimes, but not always, when running the query. The environment is as follows:
> * Hadoop 2.6.0
> * Hive 0.13
> * JDK 1.7.0_79
> I will attach the stack trace.





[jira] [Commented] (HIVE-11749) To sometimes deadlock when run a query

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987022#comment-14987022
 ] 

ASF GitHub Bot commented on HIVE-11749:
---

Github user Lewuathe closed the pull request at:

https://github.com/apache/hive/pull/52







[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987229#comment-14987229
 ] 

Hive QA commented on HIVE-11981:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770240/HIVE-11981.093.patch

{color:green}SUCCESS:{color} +1 due to 16 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 64 failed/errored test(s), 9781 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testDefaultTypes
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testInOutFormat
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testMROutput
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testEmpty
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBaseAndDelta
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderDelta
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactAfterAbort
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactWhileStreaming
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.minorCompactAfterAbort
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.minorCompactWhileStreaming
org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask[3]
org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask[3]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatExternalDynamicCustomLocation[3]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalNonPartitioned.testHCatNonPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalPartitioned.testHCatPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask[3]
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatMutableNonPartitioned.testHCatNonPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatMutablePartitioned.testHCatPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatNonPartitioned.testHCatNonPartitionedTable[3]
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[3]
org.apache.hive.hcatalog.pig.TestE2EScenarios.testReadOrcAndRCFromPig
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown2[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic[3]
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapNullKey[3]
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData[3]
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[3]
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testDateCharTypes[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testPartColsInData[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreFuncAllSimpleTypes[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteChar[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteDate2[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteDate3[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteDate[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteDecimalXY[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteDecimalX[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteDecimal[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteSmallint[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteTimestamp[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteTinyint[3]
org.apache.hive.hcatalog.pig.TestHCatStorer.testWriteVarchar[3]
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits

[jira] [Commented] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987126#comment-14987126
 ] 

Hive QA commented on HIVE-12320:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770217/HIVE-12320.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9760 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_int_type_promotion
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_schema_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_parquet_types
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5899/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5899/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5899/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770217 - PreCommit-HIVE-TRUNK-Build

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.





[jira] [Commented] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987014#comment-14987014
 ] 

Hive QA commented on HIVE-11777:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770198/HIVE-11777.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 425 failed/errored test(s), 9745 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-unionDistinct_1.q-insert_values_non_partitioned.q-insert_update_delete.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_project
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_delete
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_delete_own_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update_own_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_all_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_tmp_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_where_no_match
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_where_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_where_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_implicit_cast_during_insert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_acid_dynamic_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_acid_not_bucketed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into_with_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_nonacid_from_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_update_delete
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_acid_not_bucketed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_dynamic_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_tmp_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_llap_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_dictionary_threshold
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_diff_part_cols
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_diff_part_cols2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_empty_strings
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ends_with_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_file_dump
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_int_type_promotion
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge_incompat1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge_incompat2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_boolean
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_char
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_decimal

[jira] [Updated] (HIVE-11749) To sometimes deadlock when run a query

2015-11-03 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HIVE-11749:
--
Attachment: HIVE-11749.00.patch

> To sometimes deadlock when run a query
> --
>
> Key: HIVE-11749
> URL: https://issues.apache.org/jira/browse/HIVE-11749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Ryu Kobayashi
>Assignee: Kai Sasaki
> Attachments: HIVE-11749.00.patch, HIVE-11749.stack-tarace.txt
>
>
> A query sometimes, but not always, deadlocks when it is run. The environment is as follows:
> * Hadoop 2.6.0
> * Hive 0.13
> * JDK 1.7.0_79
> The stack trace is attached.





[jira] [Commented] (HIVE-12273) Improve user level explain

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987641#comment-14987641
 ] 

Hive QA commented on HIVE-12273:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770271/HIVE-12273.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9730 tests executed
*Failed tests:*
{noformat}
TestCliDriver-ppd_outer_join2.q-groupby3_map_skew.q-merge_dynamic_partition2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-update_orig_table.q-vectorized_bucketmapjoin1.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5902/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5902/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5902/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770271 - PreCommit-HIVE-TRUNK-Build

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch, 
> HIVE-12273.03.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Updated] (HIVE-12202) NPE thrown when reading legacy ACID delta files

2015-11-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12202:
--
Fix Version/s: 1.3.0

> NPE thrown when reading legacy ACID delta files
> ---
>
> Key: HIVE-12202
> URL: https://issues.apache.org/jira/browse/HIVE-12202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Elliot West
>Assignee: Elliot West
>  Labels: transactions
> Fix For: 1.3.0
>
> Attachments: HIVE-12202.0.patch
>
>
> When reading legacy ACID deltas of the form {{delta_$startTxnId_$endTxnId}} a 
> {{NullPointerException}} is thrown on:
> {code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
> if(dmd.getStmtIds().isEmpty()) {
> {code}
> The older ACID data format (pre-Hive 1.3.0), which does not include the 
> statement ID, and code written for that format should still be supported. 
> Therefore the above condition should also include a null check or, 
> alternatively, {{AcidInputFormat.DeltaMetaData}} should never return null and 
> should return an empty list in this specific scenario.
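
The two alternatives described above could be sketched roughly as follows. This is only an illustration, not the attached patch: `DeltaMetaDataSketch`, `getStmtIdsOrEmpty` and `hasNoStatementIds` are hypothetical names standing in for `AcidInputFormat.DeltaMetaData` and the condition in `AcidUtils.deserializeDeltas`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Minimal sketch; these are hypothetical stand-ins, not the real Hive classes.
class LegacyDeltaCheck {

    // Stand-in for AcidInputFormat.DeltaMetaData (assumption: only the
    // statement-ID list matters for this illustration).
    static class DeltaMetaDataSketch {
        private final List<Integer> stmtIds; // null for legacy deltas in this sketch

        DeltaMetaDataSketch(List<Integer> stmtIds) {
            this.stmtIds = stmtIds;
        }

        // Alternative 1: the getter may return null, so callers must check.
        List<Integer> getStmtIds() {
            return stmtIds;
        }

        // Alternative 2: the getter never returns null.
        List<Integer> getStmtIdsOrEmpty() {
            return stmtIds == null ? Collections.<Integer>emptyList() : stmtIds;
        }
    }

    // Null-safe version of the quoted condition.
    static boolean hasNoStatementIds(DeltaMetaDataSketch dmd) {
        List<Integer> ids = dmd.getStmtIds();
        return ids == null || ids.isEmpty();
    }

    public static void main(String[] args) {
        DeltaMetaDataSketch legacy = new DeltaMetaDataSketch(null);
        DeltaMetaDataSketch current = new DeltaMetaDataSketch(Arrays.asList(0, 1));
        System.out.println(hasNoStatementIds(legacy));
        System.out.println(hasNoStatementIds(current));
        System.out.println(legacy.getStmtIdsOrEmpty().isEmpty());
    }
}
```

Returning an empty list from the getter keeps every caller safe, while the explicit null check only protects the one call site that was changed.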





[jira] [Updated] (HIVE-12252) Streaming API HiveEndPoint can be created w/o partitionVals for partitioned table

2015-11-03 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12252:
-
Attachment: HIVE-12252.1.patch

[~ekoifman] Can you review it?

> Streaming API HiveEndPoint can be created w/o partitionVals for partitioned 
> table
> -
>
> Key: HIVE-12252
> URL: https://issues.apache.org/jira/browse/HIVE-12252
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12252.1.patch
>
>
> When this happens, a write from the Streaming API to this end point will 
> succeed, but it will place the data in the table directory, which is not correct.
> The API needs to throw in this case.





[jira] [Updated] (HIVE-11726) Pushed IN predicates to the metastore

2015-11-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11726:
---
Attachment: HIVE-11726.01.patch

Rebased patch and removed part concerning OR/AND predicates transformation into 
IN (which is already covered by HIVE-11634).

Updated patch description to reflect that.

> Pushed IN predicates to the metastore
> -
>
> Key: HIVE-11726
> URL: https://issues.apache.org/jira/browse/HIVE-11726
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11726.01.patch, HIVE-11726.patch
>
>
> The PointLookupOptimizer can turn off some of the optimizations due to its 
> use of tuple IN() clauses.
> HIVE-11573 introduced the extraction of sub-clauses that could be pushed down 
> to the TableScan operators, though they wouldn't be pushed down to the 
> metastore.
> In this issue, we tackle this problem by extending the filter parser of the 
> metastore to support IN clauses, including multiple columns. This allows us to 
> push those additional predicates down through directSQL to the metastore.





[jira] [Updated] (HIVE-12317) Emit current database in lineage info

2015-11-03 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-12317:
---
Attachment: HIVE-12317.1.patch

> Emit current database in lineage info
> -
>
> Key: HIVE-12317
> URL: https://issues.apache.org/jira/browse/HIVE-12317
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12317.1.patch
>
>
> It will be easier to emit current database info explicitly instead of finding 
> out such info from normalized column names.





[jira] [Commented] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-03 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987754#comment-14987754
 ] 

Laljo John Pullokkaran commented on HIVE-12301:
---

I don't think I can get to this today/tomorrow.
[~jcamachorodriguez] Could you take a look?

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch
>
>
> The position in argList is mapped to a wrong column from RS operator





[jira] [Updated] (HIVE-12196) NPE when converting bad timestamp value

2015-11-03 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12196:

Attachment: HIVE-12196.patch

> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
> Attachments: HIVE-12196.patch
>
>
> When I convert a timestamp value that is slightly wrong, the result is a NPE. 
> Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}





[jira] [Commented] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-03 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987892#comment-14987892
 ] 

Naveen Gangam commented on HIVE-12182:
--

Good point. I believe it should, to prevent it from including the quotes 
surrounding the comment in the query.

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
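> +--------------------------+-----------------------+-----------------------+--+
> |         col_name         |       data_type       |        comment        |
> +--------------------------+-----------------------+-----------------------+--+
> | i                        | int                   | HELLO                 |
> | j                        | int                   | WORLD                 |
> |                          | NULL                  | NULL                  |
> | # Partition Information  | NULL                  | NULL                  |
> | # col_name               | data_type             | comment               |
> |                          | NULL                  | NULL                  |
> | j                        | int                   | WORLD                 |
> +--------------------------+-----------------------+-----------------------+--+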
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
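> +--------------------------+-----------------------+-----------------------+--+
> |         col_name         |       data_type       |        comment        |
> +--------------------------+-----------------------+-----------------------+--+
> | i                        | int                   | HELLO                 |
> | j                        | int                   |                       |
> |                          | NULL                  | NULL                  |
> | # Partition Information  | NULL                  | NULL                  |
> | # col_name               | data_type             | comment               |
> |                          | NULL                  | NULL                  |
> | j                        | int                   |                       |
> +--------------------------+-----------------------+-----------------------+--+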
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Commented] (HIVE-12196) NPE when converting bad timestamp value

2015-11-03 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987922#comment-14987922
 ] 

Aihua Xu commented on HIVE-12196:
-

[~rdblue] This behavior actually matches most of the conversion functions. 
For example, cast('abc' as int) or date('abc') returns NULL rather than throwing 
an exception. That happens during runtime execution.

The second one throws the exception at compile time, when the constant string 
is converted to a timestamp, which also seems to make sense.





> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
> Attachments: HIVE-12196.patch
>
>
> When I convert a timestamp value that is slightly wrong, the result is a NPE. 
> Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}





[jira] [Commented] (HIVE-12273) Improve user level explain

2015-11-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987750#comment-14987750
 ] 

Pengcheng Xiong commented on HIVE-12273:


The test case failures are unrelated and they also appear in the other 
pre-commit runs. Pushed to master. Thanks [~ashutoshc] for the review.

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch, 
> HIVE-12273.03.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Commented] (HIVE-12252) Streaming API HiveEndPoint can be created w/o partitionVals for partitioned table

2015-11-03 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987767#comment-14987767
 ] 

Wei Zheng commented on HIVE-12252:
--

There should be two cases to be handled:
1) When the table is partitioned, partitionVals shouldn't be empty
2) When the table is unpartitioned, partitionVals must be empty
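
The two cases above could be checked along these lines. This is a minimal sketch under assumptions: `PartitionValsCheck`, `validate` and `isValid` are hypothetical names, not part of the Streaming API, and the real check would obtain the partitioning information from the metastore.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Minimal sketch; a hypothetical helper, not the HiveEndPoint implementation.
class PartitionValsCheck {

    // Throws when partitionVals does not match the table's partitioning.
    static void validate(boolean tableIsPartitioned, List<String> partitionVals) {
        boolean hasVals = partitionVals != null && !partitionVals.isEmpty();
        if (tableIsPartitioned && !hasVals) {
            throw new IllegalArgumentException(
                "partitionVals must be given for a partitioned table");
        }
        if (!tableIsPartitioned && hasVals) {
            throw new IllegalArgumentException(
                "partitionVals must be empty for an unpartitioned table");
        }
    }

    // Convenience wrapper that reports validity instead of throwing.
    static boolean isValid(boolean tableIsPartitioned, List<String> partitionVals) {
        try {
            validate(tableIsPartitioned, partitionVals);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(true, Arrays.asList("2015", "11")));
        System.out.println(isValid(true, Collections.<String>emptyList()));
        System.out.println(isValid(false, Arrays.asList("2015")));
        System.out.println(isValid(false, Collections.<String>emptyList()));
    }
}
```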

> Streaming API HiveEndPoint can be created w/o partitionVals for partitioned 
> table
> -
>
> Key: HIVE-12252
> URL: https://issues.apache.org/jira/browse/HIVE-12252
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>
> When this happens, a write from the Streaming API to this end point will 
> succeed, but it will place the data in the table directory, which is not correct.
> The API needs to throw in this case.





[jira] [Updated] (HIVE-12063) Pad Decimal numbers with trailing zeros to the scale of the column

2015-11-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-12063:
---
Attachment: HIVE-12063.3.patch

Rebased the patch with the latest master.

> Pad Decimal numbers with trailing zeros to the scale of the column
> --
>
> Key: HIVE-12063
> URL: https://issues.apache.org/jira/browse/HIVE-12063
> Project: Hive
>  Issue Type: Improvement
>  Components: Types
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-12063.1.patch, HIVE-12063.2.patch, 
> HIVE-12063.3.patch, HIVE-12063.patch
>
>
> HIVE-7373 was to address the problems of trimming trailing zeros by Hive, 
> which caused many problems including treating 0.0, 0.00 and so on as 0, which 
> has different precision/scale. Please refer to HIVE-7373 description. 
> However, HIVE-7373 was reverted by HIVE-8745 while the underlying problems 
> remained. HIVE-11835 was resolved recently to address one of the problems, 
> where 0.0, 0.00, and so on cannot be read into decimal(1,1).
> However, HIVE-11835 didn't address the problem that decimal values such as 
> 0.0, 0.00, etc. are shown as 0 in query results. This causes confusion 
> because 0.0 and 0.00 have a different precision/scale than 0.
> The proposal here is to pad zeros for query result to the type's scale. This 
> not only removes the confusion described above, but also aligns with many 
> other DBs. Internal decimal number representation doesn't change, however.
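
The padding idea can be illustrated with plain `java.math.BigDecimal`. This is a sketch only, assuming the column scale is known at formatting time; `DecimalPadding` and `formatForScale` are hypothetical names and this is not how the patch itself is implemented.

```java
import java.math.BigDecimal;

// Minimal sketch; a hypothetical helper, not Hive's decimal formatting code.
class DecimalPadding {

    // Pads the printed value with trailing zeros up to the column's scale.
    // setScale(int) without a rounding mode throws ArithmeticException if the
    // value has more fractional digits than the scale, so nothing is rounded
    // silently. The input value itself is left unchanged.
    static String formatForScale(BigDecimal value, int columnScale) {
        return value.setScale(columnScale).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(formatForScale(new BigDecimal("0"), 2));
        System.out.println(formatForScale(new BigDecimal("3.1"), 3));
    }
}
```

Only the string produced for the query result changes here, which matches the statement that the internal representation stays as it is.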





[jira] [Commented] (HIVE-12266) When client exists abnormally, it doesn't release ACID locks

2015-11-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987884#comment-14987884
 ] 

Eugene Koifman commented on HIVE-12266:
---

committed to master: 
https://github.com/apache/hive/commit/595fa9988fcb3e67b60845b44e1df4cc49ce38b2
patch doesn't apply on branch-1

> When client exists abnormally, it doesn't release ACID locks
> 
>
> Key: HIVE-12266
> URL: https://issues.apache.org/jira/browse/HIVE-12266
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12266.1.patch, HIVE-12266.2.patch, 
> HIVE-12266.3.patch
>
>
> If you start the Hive CLI (locking enabled), run some command that acquires 
> locks, and ^C the shell before the command completes, the locks for the command 
> remain until they time out.
> I believe Beeline has the same issue.
> Need to add proper hooks to release locks when command dies. (As much as 
> possible)
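
One common way to add such a hook in Java is a JVM shutdown hook, which runs on normal exit and on ^C but not when the process is killed forcibly. The sketch below assumes that approach; `LockReleaseOnExit` is a hypothetical class, not Hive's lock manager, and it does not claim to reflect the committed patch.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch; a hypothetical lock tracker, not Hive's lock manager.
class LockReleaseOnExit {
    private final Set<String> heldLocks = ConcurrentHashMap.newKeySet();

    void acquire(String lockName) {
        heldLocks.add(lockName);
    }

    int heldCount() {
        return heldLocks.size();
    }

    // Releases everything still held and reports how many locks were released.
    int releaseAll() {
        int released = heldLocks.size();
        heldLocks.clear(); // a real client would notify the lock service here
        return released;
    }

    // Registers a JVM shutdown hook so locks are released when the client dies.
    void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            releaseAll();
        }));
    }

    public static void main(String[] args) {
        LockReleaseOnExit locks = new LockReleaseOnExit();
        locks.installShutdownHook();
        locks.acquire("default.part_test");
        System.out.println("held before exit: " + locks.heldCount());
    }
}
```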





[jira] [Commented] (HIVE-12317) Emit current database in lineage info

2015-11-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987773#comment-14987773
 ] 

Yongzhi Chen commented on HIVE-12317:
-

The change looks fine.
+1

> Emit current database in lineage info
> -
>
> Key: HIVE-12317
> URL: https://issues.apache.org/jira/browse/HIVE-12317
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12317.1.patch
>
>
> It will be easier to emit current database info explicitly instead of finding 
> out such info from normalized column names.





[jira] [Updated] (HIVE-12273) Improve user level explain

2015-11-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12273:
---
Affects Version/s: 1.2.0
   1.2.1

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch, 
> HIVE-12273.03.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Commented] (HIVE-12196) NPE when converting bad timestamp value

2015-11-03 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987858#comment-14987858
 ] 

Aihua Xu commented on HIVE-12196:
-

A simple fix that checks the value after the conversion: if it's null, 
return null. So if the timestamp is not correct, the result is a null value.
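
The shape of that fix might look like the sketch below. It uses `java.sql.Timestamp.valueOf` as a stand-in parser and a plain millisecond offset as a stand-in for the time zone shift; `NullSafeTimestampConversion`, `parseOrNull` and `shiftOrNull` are hypothetical names, not the code in the attached patch.

```java
import java.sql.Timestamp;

// Minimal sketch; hypothetical helpers, not the from_utc_timestamp UDF itself.
class NullSafeTimestampConversion {

    // Returns null instead of throwing when the text is not a valid timestamp.
    static Timestamp parseOrNull(String text) {
        try {
            return Timestamp.valueOf(text); // expects yyyy-mm-dd hh:mm:ss[.f...]
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    // Checks the converted value before using it, so a bad input yields null
    // rather than a NullPointerException further down.
    static Timestamp shiftOrNull(String text, long offsetMillis) {
        Timestamp ts = parseOrNull(text);
        if (ts == null) {
            return null;
        }
        return new Timestamp(ts.getTime() + offsetMillis);
    }

    public static void main(String[] args) {
        System.out.println(shiftOrNull("2015-04-11-12:24:34.535", 0L));
        System.out.println(shiftOrNull("2015-04-11 12:24:34.535", 0L));
    }
}
```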

> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
> Attachments: HIVE-12196.patch
>
>
> When I convert a timestamp value that is slightly wrong, the result is a NPE. 
> Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}





[jira] [Commented] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987911#comment-14987911
 ] 

Sergey Shelukhin commented on HIVE-12300:
-

{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}
These appear to fail in all the runs concurrent with this one. I will look at the 
other two.

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Commented] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987780#comment-14987780
 ] 

Yongzhi Chen commented on HIVE-12182:
-

[~ngangam], could you add a unit test to catch regressions in the future? Thanks.

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
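> +--------------------------+-----------------------+-----------------------+--+
> |         col_name         |       data_type       |        comment        |
> +--------------------------+-----------------------+-----------------------+--+
> | i                        | int                   | HELLO                 |
> | j                        | int                   | WORLD                 |
> |                          | NULL                  | NULL                  |
> | # Partition Information  | NULL                  | NULL                  |
> | # col_name               | data_type             | comment               |
> |                          | NULL                  | NULL                  |
> | j                        | int                   | WORLD                 |
> +--------------------------+-----------------------+-----------------------+--+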
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--+---+---+--+
> | col_name |   data_type   |comment|
> +--+---+---+--+
> | i| int   | HELLO |
> | j| int   |   |
> |  | NULL  | NULL  |
> | # Partition Information  | NULL  | NULL  |
> | # col_name   | data_type | comment   |
> |  | NULL  | NULL  |
> | j| int   |   |
> +--+---+---+--+
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Commented] (HIVE-11525) Bucket pruning

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987839#comment-14987839
 ] 

Hive QA commented on HIVE-11525:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770276/HIVE-11525.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9761 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketpruning1
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5903/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5903/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5903/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770276 - PreCommit-HIVE-TRUNK-Build

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.WIP.patch
>
>
> Logically and functionally bucketing and partitioning are quite similar - 
> both provide mechanism to segregate and separate the table's data based on 
> its content. Thanks to that significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design where the PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common if not a standard feature in all current 
> RDBMS while BUCKETING seems to be HIVE specific only.
> In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive there should be nothing to prevent leveraging same existing 
> query/join optimisations across the two.
> BUCKET pruning
> Enable partition PRUNING equivalent optimisation for queries on BUCKETED 
> tables
> Simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all file-buckets that 
> belong to a table.
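
The pruning idea in the description can be sketched as follows. This is only an illustration: BucketPruningSketch and bucketFor are hypothetical names, the bucket count of 32 is made up, and the rule used here (non-negative hash modulo the bucket count, with an int column hashing to its own value) is an assumption about how rows are assigned to bucket files, not something stated in this thread.

```java
// Sketch only: hypothetical names, not Hive APIs.
final class BucketPruningSketch {

    // Assumed bucketing rule: clear the sign bit, then take the hash modulo
    // the number of buckets.
    static int bucketFor(int hashCode, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 32; // assumed bucket count for table x
        // For WHERE colA=123123, only this one bucket file would need to be read.
        int bucket = bucketFor(Integer.hashCode(123123), numBuckets);
        System.out.println("read only bucket " + bucket + " of " + numBuckets);
    }
}
```

With a rule like this, an equality predicate on the bucketing column maps to exactly one bucket file, which is what makes pruning possible.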





[jira] [Commented] (HIVE-12196) NPE when converting bad timestamp value

2015-11-03 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987860#comment-14987860
 ] 

Ryan Blue commented on HIVE-12196:
--

[~aihuaxu], this doesn't look like the correct behavior. While it is great to 
not throw the NPE, I don't see why the first query shouldn't also throw a 
SemanticException about the invalid timestamp value.
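
A minimal sketch of the behavior asked for above, assuming the goal is that a malformed literal is rejected with a descriptive error rather than a null propagating into an NPE. TimestampLiteralSketch and parseOrReject are hypothetical names, and java.sql.Timestamp.valueOf stands in for whatever parser Hive actually uses; this is not the patch.

```java
import java.sql.Timestamp;

// Sketch only: hypothetical helper, not the HIVE-12196 patch.
final class TimestampLiteralSketch {

    // Parses a "yyyy-mm-dd hh:mm:ss[.f...]" literal, or fails with a clear message.
    static Timestamp parseOrReject(String literal) {
        if (literal == null) {
            throw new IllegalArgumentException("timestamp literal is null");
        }
        try {
            return Timestamp.valueOf(literal);
        } catch (IllegalArgumentException e) {
            // Re-throw with the offending value so the caller never sees a null result.
            throw new IllegalArgumentException(
                "Unable to convert time literal '" + literal + "' to time value.", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrReject("2015-04-11 12:24:34.535"));
        try {
            parseOrReject("2015-04-11-12:24:34.535");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```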

> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
> Attachments: HIVE-12196.patch
>
>
> When I convert a timestamp value that is slightly wrong, the result is a NPE. 
> Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}





[jira] [Updated] (HIVE-12266) When client exists abnormally, it doesn't release ACID locks

2015-11-03 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12266:
-
Attachment: HIVE-12266.branch-1.patch

Added branch-1 patch

> When client exists abnormally, it doesn't release ACID locks
> 
>
> Key: HIVE-12266
> URL: https://issues.apache.org/jira/browse/HIVE-12266
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12266.1.patch, HIVE-12266.2.patch, 
> HIVE-12266.3.patch, HIVE-12266.branch-1.patch
>
>
> If you start the Hive CLI (locking enabled), run a command that acquires 
> locks, and ^C the shell before the command completes, the locks for that 
> command remain until they time out.
> I believe Beeline has the same issue.
> Need to add proper hooks to release locks when the command dies (as much as 
> possible).
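
One way such a hook could look, as a rough sketch only. LockCleanupSketch and its methods are hypothetical and do not come from the attached patches; a real implementation would call the lock manager where the comment indicates. The sketch relies on the JVM running shutdown hooks on an interrupt such as ^C; it cannot help when the process is killed outright.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: hypothetical registry of held locks with a JVM shutdown hook.
final class LockCleanupSketch {

    private final Set<Long> heldLockIds = ConcurrentHashMap.newKeySet();

    // Record a lock id once the command has acquired it.
    void acquired(long lockId) {
        heldLockIds.add(lockId);
    }

    // Release everything still held; returns how many locks were released.
    int releaseAll() {
        int released = 0;
        for (Long id : heldLockIds) {
            if (heldLockIds.remove(id)) {
                released++; // real code would ask the lock manager to unlock here
            }
        }
        return released;
    }

    // Best effort: runs on normal exit and on ^C.
    void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::releaseAll, "release-locks"));
    }

    public static void main(String[] args) {
        LockCleanupSketch locks = new LockCleanupSketch();
        locks.installShutdownHook();
        locks.acquired(42L);
        System.out.println("lock 42 held; it is released when the JVM shuts down");
    }
}
```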





[jira] [Commented] (HIVE-11525) Bucket pruning

2015-11-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987870#comment-14987870
 ] 

Sergey Shelukhin commented on HIVE-11525:
-

Can you post an RB?

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.WIP.patch
>
>
> Logically and functionally bucketing and partitioning are quite similar - 
> both provide mechanism to segregate and separate the table's data based on 
> its content. Thanks to that significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design where the PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common if not a standard feature in all current 
> RDBMS while BUCKETING seems to be HIVE specific only.
> In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive there should be nothing to prevent leveraging same existing 
> query/join optimisations across the two.
> BUCKET pruning
> Enable partition PRUNING equivalent optimisation for queries on BUCKETED 
> tables
> Simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all file-buckets that 
> belong to a table.





[jira] [Commented] (HIVE-11525) Bucket pruning

2015-11-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987871#comment-14987871
 ] 

Sergey Shelukhin commented on HIVE-11525:
-

out file is missing according to test results

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.WIP.patch
>
>
> Logically and functionally bucketing and partitioning are quite similar - 
> both provide mechanism to segregate and separate the table's data based on 
> its content. Thanks to that significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design where the PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common if not a standard feature in all current 
> RDBMS while BUCKETING seems to be HIVE specific only.
> In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive there should be nothing to prevent leveraging same existing 
> query/join optimisations across the two.
> BUCKET pruning
> Enable partition PRUNING equivalent optimisation for queries on BUCKETED 
> tables
> Simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all file-buckets that 
> belong to a table.





[jira] [Comment Edited] (HIVE-11525) Bucket pruning

2015-11-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987871#comment-14987871
 ] 

Sergey Shelukhin edited comment on HIVE-11525 at 11/3/15 7:03 PM:
--

out file is missing according to test results


was (Author: sershe):
out file is missing according to test results

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.WIP.patch
>
>
> Logically and functionally bucketing and partitioning are quite similar - 
> both provide mechanism to segregate and separate the table's data based on 
> its content. Thanks to that significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design where the PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common if not a standard feature in all current 
> RDBMS while BUCKETING seems to be HIVE specific only.
> In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive there should be nothing to prevent leveraging same existing 
> query/join optimisations across the two.
> BUCKET pruning
> Enable partition PRUNING equivalent optimisation for queries on BUCKETED 
> tables
> Simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all file-buckets that 
> belong to a table.





[jira] [Commented] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987831#comment-14987831
 ] 

Yongzhi Chen commented on HIVE-12182:
-

The following is from BaseSemanticAnalyzer.java.
Should this patch do something similar (use unescapeSQLString ...)?
// child 2 is the optional comment of the column
if (child.getChildCount() == 3) {
  col.setComment(unescapeSQLString(child.getChild(2).getText()));
}




> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+
> | i                        | int        | HELLO    |
> | j                        | int        | WORLD    |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        | WORLD    |
> +--------------------------+------------+----------+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+
> | i                        | int        | HELLO    |
> | j                        | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        |          |
> +--------------------------+------------+----------+
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Updated] (HIVE-12305) CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not pull up constant expressions

2015-11-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12305:
---
Affects Version/s: 1.2.0

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not 
> pull up constant expressions
> ---
>
> Key: HIVE-12305
> URL: https://issues.apache.org/jira/browse/HIVE-12305
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 1.2.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-12305.01.patch
>
>
> to repro, run annotate_stats_groupby.q with return path turned on.





[jira] [Updated] (HIVE-12305) CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not pull up constant expressions

2015-11-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12305:
---
Fix Version/s: 2.0.0

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not 
> pull up constant expressions
> ---
>
> Key: HIVE-12305
> URL: https://issues.apache.org/jira/browse/HIVE-12305
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 1.2.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-12305.01.patch
>
>
> to repro, run annotate_stats_groupby.q with return path turned on.





[jira] [Commented] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean

2015-11-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988224#comment-14988224
 ] 

Pengcheng Xiong commented on HIVE-12328:


[~thejas], may I ask what message you are expecting? cc [~jpullokkaran]

> Join On clause needs  a semantic check to verify expression is boolean
> --
>
> Key: HIVE-12328
> URL: https://issues.apache.org/jira/browse/HIVE-12328
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Thejas M Nair
>Assignee: Pengcheng Xiong
>
> SQL join query fails at query runtime with a poor error message if the 
> expression in the on clause of join is not a boolean.
> Hive should give a proper error message at runtime.





[jira] [Commented] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean

2015-11-03 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988262#comment-14988262
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12328:
--

A question here: is it possible to catch this error at an earlier phase, say 
in FilterOperator.initializeOp() itself?

> Join On clause needs  a semantic check to verify expression is boolean
> --
>
> Key: HIVE-12328
> URL: https://issues.apache.org/jira/browse/HIVE-12328
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Thejas M Nair
>Assignee: Pengcheng Xiong
>
> SQL join query fails at query runtime with a poor error message if the 
> expression in the on clause of join is not a boolean.
> Hive should give a proper error message at runtime.





[jira] [Commented] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean

2015-11-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988280#comment-14988280
 ] 

Ashutosh Chauhan commented on HIVE-12328:
-

Apart from the usability aspect (giving the user quick feedback rather than 
making them wait until runtime), failing late also wastes cluster resources by 
submitting a job that then fails.

[~hsubramaniyan] FilterOp::initializeOp() is too late, since that's called at 
run time. The check needs to happen in the plan generation phase.
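
A plan-time check of this kind could look roughly like the sketch below. Everything in it is hypothetical (the class, the exception, and the representation of an expression's type as a plain type-name string); it only illustrates rejecting a non-boolean ON expression before any job is submitted, and is not how Hive's analyzer is actually structured.

```java
// Sketch only: hypothetical plan-generation check, not Hive code.
final class JoinConditionCheckSketch {

    // Stand-in for a compile-time semantic error.
    static final class SemanticCheckException extends RuntimeException {
        SemanticCheckException(String message) {
            super(message);
        }
    }

    // Rejects a JOIN ... ON expression whose resolved type is not boolean.
    static void requireBoolean(String exprText, String typeName) {
        if (!"boolean".equalsIgnoreCase(typeName)) {
            throw new SemanticCheckException(
                "JOIN ON expression '" + exprText + "' has type " + typeName
                    + "; a boolean expression is required");
        }
    }

    public static void main(String[] args) {
        requireBoolean("a.code = b.code", "boolean"); // accepted
        try {
            requireBoolean("code", "string"); // the query from this issue
        } catch (SemanticCheckException e) {
            System.out.println("FAILED: " + e.getMessage());
        }
    }
}
```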

> Join On clause needs  a semantic check to verify expression is boolean
> --
>
> Key: HIVE-12328
> URL: https://issues.apache.org/jira/browse/HIVE-12328
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Thejas M Nair
>Assignee: Pengcheng Xiong
>
> SQL join query fails at query runtime with a poor error message if the 
> expression in the on clause of join is not a boolean.
> Hive should give a proper error message at runtime.





[jira] [Commented] (HIVE-11726) Pushed IN predicates to the metastore

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988197#comment-14988197
 ] 

Hive QA commented on HIVE-11726:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770355/HIVE-11726.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9767 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5905/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5905/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5905/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770355 - PreCommit-HIVE-TRUNK-Build

> Pushed IN predicates to the metastore
> -
>
> Key: HIVE-11726
> URL: https://issues.apache.org/jira/browse/HIVE-11726
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11726.01.patch, HIVE-11726.patch
>
>
> The PointLookupOptimizer can turn off some of the optimizations due to its 
> use of tuple IN() clauses.
> HIVE-11573 introduced the extraction of sub-clauses that could be pushed down 
> till the TableScan operators, though they wouldn't be pushed down to the 
> metastore.
> In this issue, we tackle this problem by extending the filter parser of the 
> metastore to support IN clauses, including multiple columns. This allows 
> those additional predicates to be pushed down through directSQL to the 
> metastore.





[jira] [Commented] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988426#comment-14988426
 ] 

Hive QA commented on HIVE-12320:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770364/HIVE-12320.2.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9766 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5906/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5906/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5906/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770364 - PreCommit-HIVE-TRUNK-Build

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Types
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.2.patch, HIVE-12320.3.patch, HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.





[jira] [Updated] (HIVE-12297) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with '$' in typeInfo

2015-11-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12297:
---
Affects Version/s: 1.2.0

> CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
> '$' in typeInfo
> ---
>
> Key: HIVE-12297
> URL: https://issues.apache.org/jira/browse/HIVE-12297
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 1.2.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-12297.01.patch
>
>
> To repro, run udf_max.q with return path turned on.





[jira] [Updated] (HIVE-12297) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with '$' in typeInfo

2015-11-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12297:
---
Fix Version/s: 2.0.0

> CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
> '$' in typeInfo
> ---
>
> Key: HIVE-12297
> URL: https://issues.apache.org/jira/browse/HIVE-12297
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 1.2.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-12297.01.patch
>
>
> To repro, run udf_max.q with return path turned on.





[jira] [Commented] (HIVE-12297) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with '$' in typeInfo

2015-11-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988168#comment-14988168
 ] 

Pengcheng Xiong commented on HIVE-12297:


pushed to master, thanks [~ashutoshc] for the review!

> CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
> '$' in typeInfo
> ---
>
> Key: HIVE-12297
> URL: https://issues.apache.org/jira/browse/HIVE-12297
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 1.2.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-12297.01.patch
>
>
> To repro, run udf_max.q with return path turned on.





[jira] [Commented] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean

2015-11-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988215#comment-14988215
 ] 

Thejas M Nair commented on HIVE-12328:
--

cc [~ashutoshc]


> Join On clause needs  a semantic check to verify expression is boolean
> --
>
> Key: HIVE-12328
> URL: https://issues.apache.org/jira/browse/HIVE-12328
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Thejas M Nair
>Assignee: Pengcheng Xiong
>
> SQL join query fails at query runtime with a poor error message if the 
> expression in the on clause of join is not a boolean.
> Hive should give a proper error message at runtime.





[jira] [Commented] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean

2015-11-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988307#comment-14988307
 ] 

Thejas M Nair commented on HIVE-12328:
--

But there seems to be a bigger problem here than just an improper error message. 
The column code exists on both the LHS and RHS, so Hive should have complained 
about an ambiguous column in this case. (FYI, I was trying to use the 
"join .. using" syntax, but used "on" instead.)

The join expression looks broken; the "keys" under "Map Join Operator" is 
empty -
{code}
0: jdbc:hive2://localhost:1/default> explain select * from sample_07 a join 
sample_07 b on (code);
+---------+
| Explain |
+---------+
| Plan not optimized by CBO. |
| |
| Vertex dependency in root stage |
| Map 1 <- Map 2 (BROADCAST_EDGE) |
| |
| Stage-0 |
| Fetch Operator |
| limit:-1 |
| Stage-1 |
| Map 1 |
| File Output Operator [FS_84] |
| compressed:false |
| Statistics:Num rows: 243 Data size: 50660 Basic stats: COMPLETE Column stats: NONE |
| table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"} |
| Select Operator [SEL_83] |
| outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7"]

[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver

2015-11-03 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988190#comment-14988190
 ] 

Alan Gates commented on HIVE-10982:
---

This patch doesn't apply anymore because of HIVE-11718, which makes some 
similar changes.  The big difference is it doesn't allow setting fetchSize as 
part of the connection properties.  If you feel what we have now is good enough 
we can close this as a duplicate.  Or if you still want to be able to set 
fetchSize via the connection properties you can rebase the patch against the 
current code.
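
For illustration, resolving a fetch size from connection properties might look like the sketch below. The property name "fetchSize", the class, and the fallback logic are assumptions made for this example and are not taken from either patch; the default of 50 is the hard-coded value mentioned in the issue description. The resolved value would then be passed to the standard java.sql.Statement.setFetchSize(int).

```java
import java.util.Properties;

// Sketch only: hypothetical helper, not the HIVE-10982 or HIVE-11718 code.
final class FetchSizeSketch {

    static final int DEFAULT_FETCH_SIZE = 50; // value the description says is hard-coded

    // Reads an assumed "fetchSize" connection property, falling back to the
    // default when it is missing, non-numeric, or not positive.
    static int resolveFetchSize(Properties connProps) {
        String raw = connProps.getProperty("fetchSize");
        if (raw == null) {
            return DEFAULT_FETCH_SIZE;
        }
        try {
            int parsed = Integer.parseInt(raw.trim());
            return parsed > 0 ? parsed : DEFAULT_FETCH_SIZE;
        } catch (NumberFormatException e) {
            return DEFAULT_FETCH_SIZE;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("fetchSize", "1000");
        // A driver could then call statement.setFetchSize(resolveFetchSize(props)).
        System.out.println("fetch size = " + resolveFetchSize(props));
    }
}
```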

> Customizable the value of  java.sql.statement.setFetchSize in Hive JDBC Driver
> --
>
> Key: HIVE-10982
> URL: https://issues.apache.org/jira/browse/HIVE-10982
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Attachments: HIVE-10982.1.patch
>
>
> The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, 
> which will be a bottleneck for performance.
> Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose 
> status is open.
> It is also discussed in:
> http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform
> http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E





[jira] [Updated] (HIVE-12075) add analyze command to explictly cache file metadata in HBase metastore

2015-11-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12075:

Attachment: HIVE-12075.01.patch
HIVE-12075.01.nogen.patch

Patches on top of HIVE-11675, so I could start testing them all together. 
Actually applies to master with one small conflict.

> add analyze command to explictly cache file metadata in HBase metastore
> ---
>
> Key: HIVE-12075
> URL: https://issues.apache.org/jira/browse/HIVE-12075
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12075.01.nogen.patch, HIVE-12075.01.patch, 
> HIVE-12075.nogen.patch, HIVE-12075.patch
>
>
> ANALYZE TABLE (spec as usual) CACHE METADATA
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean

2015-11-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988213#comment-14988213
 ] 

Thejas M Nair commented on HIVE-12328:
--

select * from sample_07 a join sample_07 b on (code);
gives -

{code}
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
java.lang.Boolean
at 
org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:119)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)

{code}

> Join On clause needs  a semantic check to verify expression is boolean
> --
>
> Key: HIVE-12328
> URL: https://issues.apache.org/jira/browse/HIVE-12328
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Thejas M Nair
>Assignee: Pengcheng Xiong
>
> SQL join query fails at query runtime with a poor error message if the 
> expression in the on clause of join is not a boolean.
> Hive should give a proper error message at runtime.
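
The kind of check the title asks for can be sketched as follows. This is only an illustration under stated assumptions: the Expr class, its type_name field and the SemanticError exception are hypothetical and do not correspond to Hive's parser or type-checking classes.

```python
from dataclasses import dataclass

class SemanticError(Exception):
    """Raised before execution when a query is not well typed (hypothetical)."""

@dataclass
class Expr:
    text: str       # the expression as written in the query
    type_name: str  # hypothetical resolved type, e.g. "boolean" or "string"

def check_join_condition(expr: Expr) -> None:
    """Reject a JOIN ... ON expression whose resolved type is not boolean."""
    if expr.type_name != "boolean":
        raise SemanticError(
            f"JOIN ON expression '{expr.text}' has type {expr.type_name}; "
            "expected boolean"
        )

check_join_condition(Expr("a.code = b.code", "boolean"))  # passes silently
try:
    check_join_condition(Expr("code", "string"))
except SemanticError as err:
    print(err)
```

Failing this way would replace the ClassCastException from FilterOperator with a message that names the offending expression.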





[jira] [Updated] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12320:

Attachment: HIVE-12320.3.patch

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Types
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.2.patch, HIVE-12320.3.patch, HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.





[jira] [Updated] (HIVE-12327) WebHCat e2e tests TestJob_1 and TestJob_2 fail

2015-11-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-12327:
--
Attachment: HIVE-12327.2.patch

Attach new patch.

> WebHCat e2e tests TestJob_1 and TestJob_2 fail
> --
>
> Key: HIVE-12327
> URL: https://issues.apache.org/jira/browse/HIVE-12327
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12327.1.patch, HIVE-12327.2.patch
>
>
> The tests were added in HIVE-7035. Both are negative tests and check that the 
> HTTP status code is 400. The original patch captures the exception containing a 
> specific message. However, in later versions of Hadoop the message changed, so 
> the exception is no longer matched.





[jira] [Commented] (HIVE-11525) Bucket pruning

2015-11-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988049#comment-14988049
 ] 

Gopal V commented on HIVE-11525:


Posted https://reviews.apache.org/r/39916/

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.2.patch, 
> HIVE-11525.WIP.patch
>
>
> Logically and functionally, bucketing and partitioning are quite similar: 
> both provide a mechanism to segregate and separate the table's data based on 
> its content. Thanks to that, significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design, where PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common, if not a standard feature, in all current 
> RDBMS, while BUCKETING seems to be Hive-specific.
> In a way, BUCKETING could also be called "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive, there should be nothing to prevent leveraging the same 
> existing query/join optimisations across the two.
> BUCKET pruning
> Enable an optimisation equivalent to partition PRUNING for queries on BUCKETED 
> tables.
> The simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all file-buckets that 
> belong to a table.
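
A minimal sketch of the idea for an equality predicate on an integer bucketing column. The hashing rule here (an int hashes to itself, masked to be non-negative, then taken modulo the bucket count) and the bucket count of 32 are assumptions made for illustration; they are not taken from the patch.

```python
def bucket_for_int(value: int, num_buckets: int) -> int:
    """Bucket index for an integer key.

    Assumption: the key hashes to itself, the sign bit is masked off,
    and the result is taken modulo the number of buckets.
    """
    return (value & 0x7FFFFFFF) % num_buckets

def buckets_to_read(predicate_value: int, num_buckets: int) -> list:
    """For 'WHERE colA = <value>' only one bucket can contain matching rows."""
    return [bucket_for_int(predicate_value, num_buckets)]

# Hypothetical table x bucketed on colA into 32 buckets.
print(buckets_to_read(123123, 32))
```

With pruning, the scan for colA=123123 would open one bucket file per partition instead of all 32.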





[jira] [Commented] (HIVE-11749) To sometimes deadlock when run a query

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988028#comment-14988028
 ] 

Hive QA commented on HIVE-11749:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770290/HIVE-11749.00.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9760 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5904/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5904/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5904/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770290 - PreCommit-HIVE-TRUNK-Build

> To sometimes deadlock when run a query
> --
>
> Key: HIVE-11749
> URL: https://issues.apache.org/jira/browse/HIVE-11749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Ryu Kobayashi
>Assignee: Kai Sasaki
> Attachments: HIVE-11749.00.patch, HIVE-11749.stack-tarace.txt
>
>
> A deadlock occurs sometimes, but not always, when running the query. The 
> environment is as follows:
> * Hadoop 2.6.0
> * Hive 0.13
> * JDK 1.7.0_79
> I will attach the stack trace.





[jira] [Commented] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988071#comment-14988071
 ] 

Yongzhi Chen commented on HIVE-12182:
-

I think it is useful to handle the special characters that need escaping, for 
example:
{noformat}
create table tf (val int comment 'it\'s a dog');
describe tf;
+-----------+------------+-------------+--+
| col_name  | data_type  |   comment   |
+-----------+------------+-------------+--+
| val       | int        | it's a dog  |
+-----------+------------+-------------+--+
{noformat}
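
For illustration, the escaping needed when a comment is written back into a DDL statement could be sketched like this. The helper name is hypothetical and the backslash-style escaping is an assumption based on the example above, not the code in the patch.

```python
def quote_comment(comment: str) -> str:
    """Wrap a column comment in single quotes, escaping backslashes first
    and then single quotes (assumed backslash-style escaping)."""
    escaped = comment.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"

# Rebuild the statement from the example above.
ddl = f"create table tf (val int comment {quote_comment(chr(105) + 't' + chr(39) + 's a dog')});"
print(ddl)
```

Backslashes are escaped before quotes so that the backslash added for a quote is not escaped a second time.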


> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
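> +--------------------------+---------------+---------------+--+
> |         col_name         |   data_type   |    comment    |
> +--------------------------+---------------+---------------+--+
> | i                        | int           | HELLO         |
> | j                        | int           | WORLD         |
> |                          | NULL          | NULL          |
> | # Partition Information  | NULL          | NULL          |
> | # col_name               | data_type     | comment       |
> |                          | NULL          | NULL          |
> | j                        | int           | WORLD         |
> +--------------------------+---------------+---------------+--+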
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
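> +--------------------------+---------------+---------------+--+
> |         col_name         |   data_type   |    comment    |
> +--------------------------+---------------+---------------+--+
> | i                        | int           | HELLO         |
> | j                        | int           |               |
> |                          | NULL          | NULL          |
> | # Partition Information  | NULL          | NULL          |
> | # col_name               | data_type     | comment       |
> |                          | NULL          | NULL          |
> | j                        | int           |               |
> +--------------------------+---------------+---------------+--+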
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Updated] (HIVE-12327) WebHCat e2e tests TestJob_1 and TestJob_2 fail

2015-11-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-12327:
--
Attachment: (was: HIVE-12327-1.patch)

> WebHCat e2e tests TestJob_1 and TestJob_2 fail
> --
>
> Key: HIVE-12327
> URL: https://issues.apache.org/jira/browse/HIVE-12327
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
>
> The tests were added in HIVE-7035. Both are negative tests and check that the 
> HTTP status code is 400. The original patch captures the exception containing a 
> specific message. However, in later versions of Hadoop the message changed, so 
> the exception is no longer matched.





[jira] [Updated] (HIVE-12327) WebHCat e2e tests TestJob_1 and TestJob_2 fail

2015-11-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-12327:
--
Attachment: HIVE-12327-1.patch

Here is the new message: java.io.IOException: History file for application 
application_139949638_0011 is not found. 

Attach patch to change the message captured.

> WebHCat e2e tests TestJob_1 and TestJob_2 fail
> --
>
> Key: HIVE-12327
> URL: https://issues.apache.org/jira/browse/HIVE-12327
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
>
> The tests were added in HIVE-7035. Both are negative tests and check that the 
> HTTP status code is 400. The original patch captures the exception containing a 
> specific message. However, in later versions of Hadoop the message changed, so 
> the exception is no longer matched.





[jira] [Updated] (HIVE-11525) Bucket pruning

2015-11-03 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11525:
---
Attachment: HIVE-11525.2.patch

> Bucket pruning
> --
>
> Key: HIVE-11525
> URL: https://issues.apache.org/jira/browse/HIVE-11525
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Assignee: Gopal V
>  Labels: gsoc2015
> Attachments: HIVE-11525.1.patch, HIVE-11525.2.patch, 
> HIVE-11525.WIP.patch
>
>
> Logically and functionally, bucketing and partitioning are quite similar: 
> both provide a mechanism to segregate and separate the table's data based on 
> its content. Thanks to that, significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design, where PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common, if not a standard feature, in all current 
> RDBMS, while BUCKETING seems to be Hive-specific.
> In a way, BUCKETING could also be called "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive, there should be nothing to prevent leveraging the same 
> existing query/join optimisations across the two.
> BUCKET pruning
> Enable an optimisation equivalent to partition PRUNING for queries on BUCKETED 
> tables.
> The simplest example is for queries like:
> "SELECT … FROM x WHERE colA=123123"
> to read only the relevant bucket file rather than all file-buckets that 
> belong to a table.





[jira] [Updated] (HIVE-12327) WebHCat e2e tests TestJob_1 and TestJob_2 fail

2015-11-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-12327:
--
Attachment: HIVE-12327.1.patch

> WebHCat e2e tests TestJob_1 and TestJob_2 fail
> --
>
> Key: HIVE-12327
> URL: https://issues.apache.org/jira/browse/HIVE-12327
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12327.1.patch
>
>
> The tests were added in HIVE-7035. Both are negative tests and check that the 
> HTTP status code is 400. The original patch captures the exception containing a 
> specific message. However, in later versions of Hadoop the message changed, so 
> the exception is no longer matched.





[jira] [Commented] (HIVE-12266) When client exits abnormally, it doesn't release ACID locks

2015-11-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987975#comment-14987975
 ] 

Eugene Koifman commented on HIVE-12266:
---

committed to branch-1 
https://github.com/apache/hive/commit/87e5b4ef2f3a05f1c902b85588d1d96f8fe560b9

> When client exits abnormally, it doesn't release ACID locks
> 
>
> Key: HIVE-12266
> URL: https://issues.apache.org/jira/browse/HIVE-12266
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Fix For: 1.3.0
>
> Attachments: HIVE-12266.1.patch, HIVE-12266.2.patch, 
> HIVE-12266.3.patch, HIVE-12266.branch-1.patch
>
>
> If you start the Hive CLI (locking enabled), run a command that acquires 
> locks, and ^C the shell before the command completes, the locks for the command 
> remain until they time out.
> I believe Beeline has the same issue.
> Need to add proper hooks to release locks when a command dies (as much as 
> possible).
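
A rough sketch of the "release on the way out" pattern the description asks for. The LockManager class and lock names are hypothetical stand-ins, not Hive's transaction manager API, and the sketch only covers interruptions that the process gets a chance to handle (it cannot help if the process is killed outright).

```python
class LockManager:
    """Hypothetical in-memory stand-in for a lock service."""
    def __init__(self):
        self.held = set()

    def acquire(self, name: str) -> None:
        self.held.add(name)

    def release_all(self) -> None:
        self.held.clear()

def run_with_locks(manager: LockManager, lock_names, command):
    """Acquire locks, run the command, and always release the locks,
    even if the command is interrupted (for example by ^C)."""
    for name in lock_names:
        manager.acquire(name)
    try:
        return command()
    finally:
        manager.release_all()

def interrupted_command():
    raise KeyboardInterrupt  # simulates the user pressing ^C mid-command

manager = LockManager()
try:
    run_with_locks(manager, ["table:t1"], interrupted_command)
except KeyboardInterrupt:
    pass
print(manager.held)
```

The "as much as possible" caveat in the description applies here too: a finally block or shutdown hook runs only when the process is allowed to unwind, so a lock timeout is still needed as a backstop.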





[jira] [Commented] (HIVE-12196) NPE when converting bad timestamp value

2015-11-03 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988136#comment-14988136
 ] 

Ryan Blue commented on HIVE-12196:
--

Thanks, it sounds like there are actually two problems. First, the one that you 
fixed which is causing the NPE. Second, whether this behavior should be caught 
and result in an exception or NULL.

I don't see why the timestamp in the first query isn't converted in the 
compiler when the second is. I guess it could be that the type information for 
that function isn't available, so the coercion actually takes place at runtime, 
when you don't want to fail an entire query for a single value (even if it is a 
literal value in this case). I think that could be fixed, but I have no idea 
how much work it would be, so I'll defer to someone with more Hive knowledge.

> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
> Attachments: HIVE-12196.patch
>
>
> When I convert a timestamp value that is slightly wrong, the result is a NPE. 
> Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}
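
The two outcomes under discussion (fail with a clear error, or yield NULL) both require that a malformed value never reaches code that assumes a parsed timestamp. A minimal sketch of the NULL-returning variant, using Python's datetime purely as an illustration; the accepted format string is an assumption and this is not how Hive parses timestamps.

```python
from datetime import datetime

def parse_timestamp_or_none(text: str):
    """Return a datetime for 'YYYY-MM-DD HH:MM:SS.fff' input, or None
    (standing in for SQL NULL) when the text does not match."""
    try:
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    except ValueError:
        return None

print(parse_timestamp_or_none("2015-04-11 12:24:34.535"))
print(parse_timestamp_or_none("2015-04-11-12:24:34.535"))  # the value from the report
```

Callers then have to handle the None case explicitly, which is exactly the step whose absence produces a NullPointerException.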





[jira] [Updated] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12300:

Attachment: HIVE-12300.01.patch

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.01.patch, HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Commented] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987967#comment-14987967
 ] 

Sergey Shelukhin commented on HIVE-12300:
-

[~thejas] do you want to review?

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.01.patch, HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Updated] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12300:

Attachment: (was: HIVE-12300.01.patch)

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.01.patch, HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Updated] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12300:

Attachment: HIVE-12300.01.patch

Making the warning respect hiveserver2 logging config.

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.01.patch, HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: HIVE-11777.05.patch

Fixing the NPE that I have already fixed and lost in the rebase 0_o

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.05.patch, 
> HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to make a PPD call, with all the 
> attendant SARG, MS and HBase overhead, for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time based.
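
The time-based batching described above can be sketched as follows. This is an illustration only: the function name, the representation of arrivals as (time in ms, directory) pairs and the 10 ms window are assumptions, not the logic of the attached patches.

```python
def batch_by_window(arrivals, window_ms):
    """Group directories into batches. A batch opens when its first
    directory arrives and closes window_ms later; each batch would then
    be served by a single call.

    arrivals: list of (time_ms, directory) pairs sorted by time.
    """
    batches = []
    current, deadline = [], None
    for time_ms, directory in arrivals:
        if deadline is not None and time_ms > deadline:
            batches.append(current)       # window expired: flush the batch
            current, deadline = [], None
        if deadline is None:
            deadline = time_ms + window_ms  # first arrival opens a new window
        current.append(directory)
    if current:
        batches.append(current)
    return batches

# Hypothetical arrival times for four directories, with a 10 ms window.
print(batch_by_window([(0, "d1"), (3, "d2"), (9, "d3"), (25, "d4")], 10))
```

The trade-off is the one the description names: a longer window means fewer calls but adds up to window_ms of latency to the first directory in each batch.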





[jira] [Commented] (HIVE-12327) WebHCat e2e tests TestJob_1 and TestJob_2 fail

2015-11-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988100#comment-14988100
 ] 

Thejas M Nair commented on HIVE-12327:
--

[~daijy] I think it would be good to capture the 'is not found' part as well to 
return a 400 error. I.e., check for 'History file .* is not found.'
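
As a quick illustration of that pattern, here is a sketch using Python's re module against the message quoted earlier in this thread. The function name and the mapping to a 400 response are hypothetical; this is not WebHCat code.

```python
import re

# The suggested pattern, with the final dot escaped so it matches literally.
HISTORY_NOT_FOUND = re.compile(r"History file .* is not found\.")

def is_bad_request(message: str) -> bool:
    """True if the exception message should be reported as HTTP 400."""
    return HISTORY_NOT_FOUND.search(message) is not None

msg = ("java.io.IOException: History file for application "
       "application_139949638_0011 is not found.")
print(is_bad_request(msg))
print(is_bad_request("java.io.IOException: History file is corrupted."))
```

Matching both the prefix and the 'is not found' suffix avoids treating unrelated history-file errors as a client error.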



> WebHCat e2e tests TestJob_1 and TestJob_2 fail
> --
>
> Key: HIVE-12327
> URL: https://issues.apache.org/jira/browse/HIVE-12327
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12327.1.patch
>
>
> The tests were added in HIVE-7035. Both are negative tests and check that the 
> HTTP status code is 400. The original patch captures the exception containing a 
> specific message. However, in later versions of Hadoop the message changed, so 
> the exception is no longer matched.





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-03 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: (was: HIVE-10613.1.patch)

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.





[jira] [Commented] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-03 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988533#comment-14988533
 ] 

Jason Dere commented on HIVE-12320:
---

I think this looks ok. One thing I will note is that it is possible to change a 
table column from short to decimal, but if you then try to change it back from 
decimal these rules will kick in and complain because decimal is not implicitly 
convertible to short. So this would require the user to set 
hive.metastore.disallow.incompatible.col.type.changes=false before doing this.
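
The asymmetry can be sketched like this. The conversion table is a small illustrative subset chosen for the example (with "smallint" standing in for "short"); it is an assumption, not a copy of Hive's actual rules, and the function name is hypothetical.

```python
# Illustrative subset of implicit conversions; an assumption for this sketch,
# not a copy of Hive's actual rules.
IMPLICITLY_CONVERTIBLE = {
    "smallint": {"smallint", "int", "bigint", "decimal"},
    "decimal": {"decimal"},
}

def column_type_change_allowed(old_type, new_type, disallow_incompatible=True):
    """Return True if changing a column from old_type to new_type passes the check."""
    if not disallow_incompatible:
        return True  # the check is switched off, so any change is accepted
    return new_type in IMPLICITLY_CONVERTIBLE.get(old_type, {old_type})

print(column_type_change_allowed("smallint", "decimal"))
print(column_type_change_allowed("decimal", "smallint"))
print(column_type_change_allowed("decimal", "smallint", disallow_incompatible=False))
```

Because the relation is one-directional, a widening change cannot be undone without first turning the check off, which is the situation described above.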

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Types
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.2.patch, HIVE-12320.3.patch, HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.





[jira] [Commented] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-03 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988541#comment-14988541
 ] 

Aihua Xu commented on HIVE-12304:
-

I'm sorry. I haven't checked that yet. Let me take a look.

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently the "drop database cascade" command doesn't unregister the functions 
> under the database. If the functions are not unregistered, then in some cases, 
> such as "describe db1.func1", the info for the function will still be shown; 
> or, if the same database is recreated, "drop if exists db1.func1" will throw an 
> exception, since the function is considered to exist in the registry while 
> it doesn't exist in the metastore.
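
The stale-registry problem and the intended fix can be sketched with a toy in-memory registry. The class and method names are hypothetical and do not mirror Hive's FunctionRegistry API; the sketch only shows the cascade step.

```python
class ToyFunctionRegistry:
    """Hypothetical session-level registry keyed by (database, function)."""
    def __init__(self):
        self._functions = {}

    def register(self, db: str, name: str, impl: str) -> None:
        self._functions[(db, name)] = impl

    def exists(self, db: str, name: str) -> bool:
        return (db, name) in self._functions

    def unregister_database(self, db: str) -> int:
        """Remove every function that belongs to db; return how many were removed."""
        doomed = [key for key in self._functions if key[0] == db]
        for key in doomed:
            del self._functions[key]
        return len(doomed)

registry = ToyFunctionRegistry()
registry.register("db1", "func1", "com.example.Func1")  # illustrative class name
registry.register("db2", "func2", "com.example.Func2")
registry.unregister_database("db1")  # the step a cascade drop would need to perform
print(registry.exists("db1", "func1"), registry.exists("db2", "func2"))
```

Without the unregister step the registry and the metastore disagree about db1.func1, which is the inconsistency behind both symptoms in the description.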





[jira] [Commented] (HIVE-12229) Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].

2015-11-03 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988679#comment-14988679
 ] 

Szehon Ho commented on HIVE-12229:
--

Continuing the investigation and findings in HIVE-12230 to not pollute this 
JIRA.  There I am giving another try with a config fix.

> Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].
> --
>
> Key: HIVE-12229
> URL: https://issues.apache.org/jira/browse/HIVE-12229
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
> Attachments: HIVE-12229.1-spark.patch, HIVE-12229.2-spark.patch, 
> HIVE-12229.3-spark.patch
>
>
> A Python script was added to the query, but the script cannot be found 
> during execution in yarn-cluster mode.
> {noformat}
> 15/10/21 21:10:55 INFO exec.ScriptOperator: Executing [/usr/bin/python, 
> q2-sessionize.py, 3600]
> 15/10/21 21:10:55 INFO exec.ScriptOperator: tablename=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: partname=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: alias=null
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 10 rows: used 
> memory = 324896224
> 15/10/21 21:10:55 INFO exec.ScriptOperator: ErrorStreamProcessor calling 
> reporter.progress()
> /usr/bin/python: can't open file 'q2-sessionize.py': [Errno 2] No such file 
> or directory
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread OutputProcessor done
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread ErrorProcessor done
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 100 rows: used 
> memory = 325619920
> 15/10/21 21:10:55 ERROR exec.ScriptOperator: Error in writing to script: 
> Stream closed
> 15/10/21 21:10:55 INFO exec.ScriptOperator: The script did not consume all 
> input data. This is considered as an error.
> 15/10/21 21:10:55 INFO exec.ScriptOperator: set 
> hive.exec.script.allow.partial.consumption=true; to ignore it.
> 15/10/21 21:10:55 ERROR spark.SparkReduceRecordHandler: Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:340)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
> at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99)
> at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: 
> An error occurred while reading or writing to your custom script. It may have 
> crashed with an error.
> at 
> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:453)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:331)
> ... 14 more
> {noformat}





[jira] [Commented] (HIVE-12229) Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].

2015-11-03 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988682#comment-14988682
 ] 

Szehon Ho commented on HIVE-12229:
--

typo: HIVE-12330

> Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].
> --
>
> Key: HIVE-12229
> URL: https://issues.apache.org/jira/browse/HIVE-12229
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
> Attachments: HIVE-12229.1-spark.patch, HIVE-12229.2-spark.patch, 
> HIVE-12229.3-spark.patch
>
>
> A Python script was added to the query, but the script cannot be found 
> during execution in yarn-cluster mode.
> {noformat}
> 15/10/21 21:10:55 INFO exec.ScriptOperator: Executing [/usr/bin/python, 
> q2-sessionize.py, 3600]
> 15/10/21 21:10:55 INFO exec.ScriptOperator: tablename=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: partname=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: alias=null
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 10 rows: used 
> memory = 324896224
> 15/10/21 21:10:55 INFO exec.ScriptOperator: ErrorStreamProcessor calling 
> reporter.progress()
> /usr/bin/python: can't open file 'q2-sessionize.py': [Errno 2] No such file 
> or directory
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread OutputProcessor done
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread ErrorProcessor done
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 100 rows: used 
> memory = 325619920
> 15/10/21 21:10:55 ERROR exec.ScriptOperator: Error in writing to script: 
> Stream closed
> 15/10/21 21:10:55 INFO exec.ScriptOperator: The script did not consume all 
> input data. This is considered as an error.
> 15/10/21 21:10:55 INFO exec.ScriptOperator: set 
> hive.exec.script.allow.partial.consumption=true; to ignore it.
> 15/10/21 21:10:55 ERROR spark.SparkReduceRecordHandler: Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:340)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
> at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99)
> at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: 
> An error occurred while reading or writing to your custom script. It may have 
> crashed with an error.
> at 
> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:453)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:331)
> ... 14 more
> {noformat}





[jira] [Commented] (HIVE-9198) Hive reported exception because that hive's derby version conflict with spark's derby version [Spark Branch]

2015-11-03 Thread Gayathri Murali (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988563#comment-14988563
 ] 

Gayathri Murali commented on HIVE-9198:
---

I am running into the same issue as reported in the original JIRA. Is the 
patch good to use? Are there any instructions on how to use it? 

> Hive reported exception because that hive's derby version conflict with 
> spark's derby version [Spark Branch]
> 
>
> Key: HIVE-9198
> URL: https://issues.apache.org/jira/browse/HIVE-9198
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Pierre Yin
>Assignee: Pierre Yin
> Attachments: HIVE-9198.1-spark.patch, HIVE-9198.1-spark.patch, 
> hive.patch
>
>
> Spark depends on derby-10.10.1.1 while hive-on-spark depends on 
> derby-10.11.1.1. The two versions conflict. Maybe we can adapt the classpath in 
> bin/hive.
> The detailed bug is described below.
> 1. get spark-1.2.0-rc2 code and build spark-assembly-1.2.0-hadoop2.4.1.jar
> 2. get latest code from hive and make packages.
> 3. run hive --auxpath /path/to/spark-assembly-*.jar
> Hive reports the following exception:
> Logging initialized using configuration in 
> jar:file:/home/realityload/hive-0.15.0-SNAPSHOT/lib/hive-common-0.15.0-SNAPSHOT.jar!/hive-log4j.properties
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:449)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:634)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:578)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1481)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2674)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2693)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:430)
> ... 7 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1479)
> ... 12 more
> Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional 
> connection factory
> NestedThrowables:
> java.lang.reflect.InvocationTargetException
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
> at 
> 

[jira] [Updated] (HIVE-12252) Streaming API HiveEndPoint can be created w/o partitionVals for partitioned table

2015-11-03 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12252:
-
Attachment: HIVE-12252.3.patch

[~ekoifman] Can you take another look?

> Streaming API HiveEndPoint can be created w/o partitionVals for partitioned 
> table
> -
>
> Key: HIVE-12252
> URL: https://issues.apache.org/jira/browse/HIVE-12252
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12252.1.patch, HIVE-12252.2.patch, 
> HIVE-12252.3.patch
>
>
> When this happens, the write from the Streaming API to this end point will 
> succeed, but it will place the data in the table directory, which is not correct.
> We need to make the API throw in this case.
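
The check requested in the description can be sketched as follows. This is a minimal Python illustration and not the Streaming API itself: `validate_endpoint` and its parameters are hypothetical names, and the only rule modeled is the one stated above (a partitioned table requires partition values).

```python
from typing import Sequence


def validate_endpoint(table_partition_cols: Sequence[str],
                      partition_vals: Sequence[str]) -> None:
    """Reject an end point that targets a partitioned table without partition values.

    Hypothetical helper for illustration; the real HiveEndPoint API is not
    reproduced here.
    """
    if table_partition_cols and not partition_vals:
        # Failing early prevents data from silently landing in the table directory.
        raise ValueError(
            "partition values are required for a partitioned table "
            f"(partition columns: {list(table_partition_cols)})"
        )
```

Calling `validate_endpoint(["dt"], [])` would raise, while an unpartitioned table (`validate_endpoint([], [])`) passes.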





[jira] [Commented] (HIVE-12330) Fix precommit Spark test part2

2015-11-03 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988677#comment-14988677
 ] 

Szehon Ho commented on HIVE-12330:
--

The issue is that, as part of HIVE-11489, I clean the generated 
'TestSparkCliDriver' before running the test. It turns out multiple tests can 
run on the same machine at the same time, so this cleanup is not safe.

I have tuned the PTest server to run the drivers sequentially rather than in 
parallel, and I am going to give that a try.

> Fix precommit Spark test part2
> --
>
> Key: HIVE-12330
> URL: https://issues.apache.org/jira/browse/HIVE-12330
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> Regression because of HIVE-11489





[jira] [Updated] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-03 Thread Thomas Friedrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Friedrich updated HIVE-10613:

Attachment: HIVE-10613.patch

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to a 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.
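
The intended conversion can be sketched in a few lines. This is a Python illustration under stated assumptions, not the HCatalog code: the two dataclasses are simplified stand-ins for `FieldSchema` and `HCatFieldSchema`, and `to_hcat_field_schema` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSchema:
    """Simplified stand-in for the metastore FieldSchema (assumption)."""
    name: str
    type: str
    comment: Optional[str] = None


@dataclass
class HCatFieldSchema:
    """Simplified stand-in for the HCatalog field schema (assumption)."""
    name: str
    type: str
    comment: Optional[str] = None


def to_hcat_field_schema(fs: FieldSchema) -> HCatFieldSchema:
    # Carry the comment over from the source object instead of leaving it as None.
    return HCatFieldSchema(name=fs.name, type=fs.type, comment=fs.comment)
```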





[jira] [Commented] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-03 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988538#comment-14988538
 ] 

Jason Dere commented on HIVE-12304:
---

[~aihuaxu] do you know what broke the tests here?

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently "drop database cascade" command doesn't unregister the functions 
> under the database. If the functions are not unregistered, in some cases like 
> "describe db1.func1" will still show the info for the function; or if the 
> same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered existing from the registry while 
> it doesn't exist in metastore.
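
The inconsistency described above can be reproduced with a toy model. The Python sketch below is an illustration only: `Session`, its attributes, and the `unregister` flag are hypothetical and do not correspond to Hive classes. It models a persistent function catalog plus an in-memory registry and shows what happens when only the catalog is cleaned.

```python
class Session:
    """Toy model: a persistent metastore catalog plus an in-memory function registry."""

    def __init__(self):
        self.metastore_functions = set()
        self.registry = set()

    def create_function(self, qualified_name):
        # A new function is recorded in both places.
        self.metastore_functions.add(qualified_name)
        self.registry.add(qualified_name)

    def drop_database_cascade(self, db, unregister=True):
        prefix = db + "."
        # The metastore entries are always removed.
        self.metastore_functions = {
            f for f in self.metastore_functions if not f.startswith(prefix)
        }
        # unregister=False models the reported bug: the registry keeps stale entries.
        if unregister:
            self.registry = {f for f in self.registry if not f.startswith(prefix)}


buggy = Session()
buggy.create_function("db1.func1")
buggy.drop_database_cascade("db1", unregister=False)
# The function is gone from the metastore but still visible in the registry.
stale = "db1.func1" in buggy.registry and "db1.func1" not in buggy.metastore_functions
```

With `unregister=True`, both views agree after the drop, which is the behavior the issue asks for.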





[jira] [Commented] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988599#comment-14988599
 ] 

Ashutosh Chauhan commented on HIVE-12320:
-

That's right, and it is indeed intentional: once you go to a wider type, 
going back to a narrower type may give you wrong/corrupt/truncated data. Those 
are the cases we want to prevent by default. If a user still wants a 
narrower type, they can choose to set the config to false to do it.
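
The rule described in this comment can be sketched as a small check. This is a Python illustration under stated assumptions: the widening order below is hypothetical and covers only a few integer types, Hive's actual compatibility rules are not reproduced, and `is_change_allowed` is an invented name.

```python
# Hypothetical widening order for illustration; Hive's real rules cover more types.
WIDENING_RANK = {"tinyint": 0, "smallint": 1, "int": 2, "bigint": 3}


def is_change_allowed(old_type: str, new_type: str,
                      disallow_incompatible: bool = True) -> bool:
    """Allow widening always; allow narrowing only when the check is switched off."""
    if not disallow_incompatible:
        # Mirrors setting the config to false: the user accepts the risk.
        return True
    return WIDENING_RANK[new_type] >= WIDENING_RANK[old_type]
```

Under this sketch, `int` to `bigint` is accepted, while `bigint` to `int` is rejected unless `disallow_incompatible` is set to `False`.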

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Types
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.2.patch, HIVE-12320.3.patch, HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.





[jira] [Updated] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-03 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12304:

Attachment: (was: HIVE-12304.2.patch)

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently "drop database cascade" command doesn't unregister the functions 
> under the database. If the functions are not unregistered, in some cases like 
> "describe db1.func1" will still show the info for the function; or if the 
> same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered existing from the registry while 
> it doesn't exist in metastore.





[jira] [Updated] (HIVE-12252) Streaming API HiveEndPoint can be created w/o partitionVals for partitioned table

2015-11-03 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12252:
-
Attachment: HIVE-12252.2.patch

Made some changes based on [~ekoifman]'s comments

> Streaming API HiveEndPoint can be created w/o partitionVals for partitioned 
> table
> -
>
> Key: HIVE-12252
> URL: https://issues.apache.org/jira/browse/HIVE-12252
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12252.1.patch, HIVE-12252.2.patch
>
>
> When this happens, the write from the Streaming API to this end point will 
> succeed, but it will place the data in the table directory, which is not correct.
> We need to make the API throw in this case.





[jira] [Commented] (HIVE-12288) Extend HIVE-11306 changes to apply to Native vectorized map-joins

2015-11-03 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988546#comment-14988546
 ] 

Matt McCline commented on HIVE-12288:
-

+1 LGTM

> Extend HIVE-11306 changes to apply to Native vectorized map-joins
> -
>
> Key: HIVE-12288
> URL: https://issues.apache.org/jira/browse/HIVE-12288
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 2.0.0
>
> Attachments: HIVE-12288.1.patch
>
>
> HIVE-11306 applies to the old style VectorMapJoin operators, while the 
> specialized operators go via a different codepath into the 
> HybridHybridHashTable.
> Apply similar changes to the setDirect() codepath.





[jira] [Updated] (HIVE-12329) Turn on limit pushdown optimization by default

2015-11-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12329:

Attachment: HIVE-12329.patch

> Turn on limit pushdown optimization by default
> --
>
> Key: HIVE-12329
> URL: https://issues.apache.org/jira/browse/HIVE-12329
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
> Attachments: HIVE-12329.patch
>
>
> Whenever applicable, this will always help, so this should be on by default.





[jira] [Commented] (HIVE-12063) Pad Decimal numbers with trailing zeros to the scale of the column

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988743#comment-14988743
 ] 

Hive QA commented on HIVE-12063:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770377/HIVE-12063.3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9768 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_union_with_udf
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_with_udf
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5908/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5908/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5908/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770377 - PreCommit-HIVE-TRUNK-Build

> Pad Decimal numbers with trailing zeros to the scale of the column
> --
>
> Key: HIVE-12063
> URL: https://issues.apache.org/jira/browse/HIVE-12063
> Project: Hive
>  Issue Type: Improvement
>  Components: Types
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-12063.1.patch, HIVE-12063.2.patch, 
> HIVE-12063.3.patch, HIVE-12063.patch
>
>
> HIVE-7373 was to address the problems of trimming trailing zeros by Hive, 
> which caused many problems including treating 0.0, 0.00 and so on as 0, which 
> has different precision/scale. Please refer to HIVE-7373 description. 
> However, HIVE-7373 was reverted by HIVE-8745 while the underlying problems 
> remained. HIVE-11835 was resolved recently to address one of the problems, 
> where 0.0, 0.00, and so on cannot be read into decimal(1,1).
> However, HIVE-11835 didn't address the problem that decimal values such as 
> 0.0, 0.00, etc. are shown as 0 in query results. This causes confusion, 
> as 0.0 and 0.00 have a different precision/scale than 0.
> The proposal here is to pad zeros for query result to the type's scale. This 
> not only removes the confusion described above, but also aligns with many 
> other DBs. Internal decimal number representation doesn't change, however.
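
The padding proposed above can be illustrated outside Hive. The following is a minimal Python sketch, not Hive's implementation: `pad_to_scale` is a hypothetical helper, and it uses the standard `decimal` module to render a value with exactly as many fractional digits as the column's scale.

```python
from decimal import Decimal


def pad_to_scale(value: str, scale: int) -> str:
    """Render a decimal string with exactly `scale` fractional digits.

    Hypothetical helper for illustration only. It assumes the value already
    fits the column's scale, so quantize() only appends trailing zeros.
    """
    # Build the quantum for the target scale, e.g. scale=2 gives Decimal("0.01").
    quantum = Decimal(1).scaleb(-scale)
    return str(Decimal(value).quantize(quantum))
```

For a `decimal(3,2)` column, both `pad_to_scale("0", 2)` and `pad_to_scale("0.0", 2)` render as `0.00`, so the displayed value reflects the column's scale rather than collapsing to `0`.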





[jira] [Commented] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-03 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988610#comment-14988610
 ] 

Aihua Xu commented on HIVE-12304:
-

Hmm, not sure what's happening there. Somehow the build reports that 2850 tests 
were added. Some of the tests pass when run locally. Reattached the same patch. 

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently "drop database cascade" command doesn't unregister the functions 
> under the database. If the functions are not unregistered, in some cases like 
> "describe db1.func1" will still show the info for the function; or if the 
> same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered existing from the registry while 
> it doesn't exist in metastore.





[jira] [Updated] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-03 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12304:

Attachment: HIVE-12304.2.patch

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently "drop database cascade" command doesn't unregister the functions 
> under the database. If the functions are not unregistered, in some cases like 
> "describe db1.func1" will still show the info for the function; or if the 
> same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered existing from the registry while 
> it doesn't exist in metastore.





[jira] [Commented] (HIVE-12317) Emit current database in lineage info

2015-11-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988614#comment-14988614
 ] 

Hive QA commented on HIVE-12317:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770366/HIVE-12317.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9766 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5907/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5907/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5907/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770366 - PreCommit-HIVE-TRUNK-Build

> Emit current database in lineage info
> -
>
> Key: HIVE-12317
> URL: https://issues.apache.org/jira/browse/HIVE-12317
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12317.1.patch
>
>
> It will be easier to emit current database info explicitly instead of finding 
> out such info from normalized column names.





[jira] [Updated] (HIVE-12330) Fix precommit Spark test part2

2015-11-03 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-12330:
-
Attachment: HIVE-12229.3-spark.patch

> Fix precommit Spark test part2
> --
>
> Key: HIVE-12330
> URL: https://issues.apache.org/jira/browse/HIVE-12330
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-12229.3-spark.patch
>
>
> Regression because of HIVE-11489





[jira] [Commented] (HIVE-12327) WebHCat e2e tests TestJob_1 and TestJob_2 fail

2015-11-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988775#comment-14988775
 ] 

Thejas M Nair commented on HIVE-12327:
--

+1

> WebHCat e2e tests TestJob_1 and TestJob_2 fail
> --
>
> Key: HIVE-12327
> URL: https://issues.apache.org/jira/browse/HIVE-12327
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12327.1.patch, HIVE-12327.2.patch
>
>
> The tests were added in HIVE-7035. Both are negative tests and check whether the 
> HTTP status code is 400. The original patch captures the exception containing a 
> specific message. However, in later versions of Hadoop the message changed, so 
> the exception is no longer captured.





[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs

2015-11-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988783#comment-14988783
 ] 

Gopal V commented on HIVE-12331:


Along with these, I nominate - {{hive.exec.infer.bucket.sort}} & 
{{hive.exec.infer.bucket.sort.num.buckets.power.two}} as common causes of data 
loss when enforce bucketing is false.

> Remove hive.enforce.bucketing & hive.enforce.sorting configs
> 
>
> Key: HIVE-12331
> URL: https://issues.apache.org/jira/browse/HIVE-12331
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>
> If a table is created as bucketed and/or sorted and the corresponding config is 
> set to false, data will be inserted into the wrong buckets and/or in the wrong 
> sort order, and if you then use these tables in a BMJ or SMBJ you will get wrong results.
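
Why misplaced rows lead to wrong join results can be shown with a toy model. The Python sketch below is an illustration under stated assumptions: `bucket_for` uses a plain modulo as a stand-in for the real bucketing hash, and `write_bucketed` and `bucket_join` are hypothetical helpers, not Hive operators. The join compares only buckets with the same number, which is the shortcut a bucket map join relies on.

```python
def bucket_for(key: int, num_buckets: int) -> int:
    # Simplified stand-in for a bucketing hash (assumption): key modulo bucket count.
    return key % num_buckets


def write_bucketed(keys, num_buckets, enforce=True):
    """Distribute keys into buckets; enforce=False dumps every row into bucket 0."""
    buckets = [[] for _ in range(num_buckets)]
    for k in keys:
        idx = bucket_for(k, num_buckets) if enforce else 0
        buckets[idx].append(k)
    return buckets


def bucket_join(left_buckets, right_buckets):
    """Join bucket i of the left side only with bucket i of the right side."""
    matched = []
    for left_rows, right_rows in zip(left_buckets, right_buckets):
        right_keys = set(right_rows)
        matched.extend(k for k in left_rows if k in right_keys)
    return matched


left = write_bucketed([1, 2, 3, 4], 2)
right_ok = write_bucketed([1, 2, 3, 4], 2)
right_bad = write_bucketed([1, 2, 3, 4], 2, enforce=False)

complete = sorted(bucket_join(left, right_ok))     # all four keys match
incomplete = sorted(bucket_join(left, right_bad))  # keys 1 and 3 are silently lost
```

Both inputs contain the same keys, yet the join against the unenforced table drops rows because they sit in a bucket the join never compares them with.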





[jira] [Updated] (HIVE-12316) Improved integration test for Hive

2015-11-03 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-12316:
--
Attachment: HIVE-12316.patch

An initial patch.  Apologies as I know this is large and a lot to absorb.  I 
did try to be exhaustive in the javadoc, which covers both the design and the 
usage.

> Improved integration test for Hive
> --
>
> Key: HIVE-12316
> URL: https://issues.apache.org/jira/browse/HIVE-12316
> Project: Hive
>  Issue Type: New Feature
>  Components: Testing Infrastructure
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-12316.patch
>
>
> In working with Hive testing I have found there are several issues that are 
> causing problems for developers, testers, and users:
> * Because Hive has many tunable knobs (file format, security, etc.) we end up 
> with tests that cover the same functionality with different permutations of 
> these features.
> * The Hive integration tests (ie qfiles) cannot be run on a cluster.  This 
> means we cannot run any of those tests at scale.  The HBase community by 
> contrast uses the same test suite locally and on a cluster, and has found 
> that this helps them greatly in testing.
> * Golden files are a grievous evil.  Test writers are forced to eyeball 
> results the first time they run a test and decide whether they look 
> reasonable, which is error prone and makes testing at scale impossible.  And 
> changes to one part of Hive often end up changing the plan (and the output of 
> explain) thus breaking many tests that are not related.  This is particularly 
> an issue for people working on the optimizer.  
> * The lack of ability to run on a cluster means that when people test Hive at 
> scale, they are forced to develop custom frameworks which can't then benefit 
> the community.
> * There is no easy mechanism to bring user queries into the test suite.
> I propose we build a new testing capability with the following requirements:
> * One test should be able to run all reasonable permutations (mr/tez/spark, 
> orc/parquet/text/rcfile, secure/non-secure etc.)  This doesn't mean it would 
> run every permutation every time, but that the tester could choose which 
> permutation to run.
> * The same tests should run locally and on a cluster.  The tests should 
> support scaling of input data from Ks to Ts.
> * Expected results should be auto-generated whenever possible, and this 
> should work with the scaling of inputs.  The dev should be able to provide 
> expected results or custom expected result generation in cases where 
> auto-generation doesn't make sense.
> * Access to the query plan should be available as an API in the tests so that 
> golden files of explain output are not required.
> * This should run in maven, junit, and java so that developers do not need to 
> manage yet another framework.
> * It should be possible to simulate user data (based on schema and 
> statistics) and quickly incorporate user queries so that tests from user 
> scenarios can be quickly incorporated.




