[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp

2015-02-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302905#comment-14302905
 ] 

Lefty Leverenz commented on HIVE-5472:
--

Does this need any documentation?

> support a simple scalar which returns the current timestamp
> ---
>
> Key: HIVE-5472
> URL: https://issues.apache.org/jira/browse/HIVE-5472
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.11.0
>Reporter: N Campbell
>Assignee: Jason Dere
> Fix For: 1.2.0
>
> Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, 
> HIVE-5472.4.patch
>
>
> ISO-SQL has two forms of this function:
> LOCALTIMESTAMP and CURRENT_TIMESTAMP, where the former is a TIMESTAMP WITHOUT 
> TIME ZONE and the latter a TIMESTAMP WITH TIME ZONE.
> Today this requires a workaround such as:
> select cast ( unix_timestamp() as timestamp ) from T
> Implement a function which computes LOCALTIMESTAMP, i.e. the current 
> timestamp in the user's session time zone.
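
A minimal sketch of the requested behavior next to the cast-based workaround 
(assuming the function is exposed as current_timestamp; the final name is not 
confirmed in this message):

{code}
-- existing workaround: epoch seconds cast to timestamp
select cast(unix_timestamp() as timestamp) from T;
-- requested: current timestamp in the user's session time zone
select current_timestamp from T;
{code}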



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9560) When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will result in value '0' after running 'analyze table TABLE_NAME compute statistics;'

2015-02-02 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302899#comment-14302899
 ] 

Prasanth Jayachandran commented on HIVE-9560:
-

Try using
{code}
analyze table TABLE_NAME compute statistics noscan
{code}
OR
{code}
analyze table TABLE_NAME compute statistics partialscan
{code}

This should get the raw data size properly. The reason 'analyze table 
TABLE_NAME compute statistics' does not work for ORC is that ORC does not 
implement SerDeStats, which some other formats implement. Implementing 
SerDeStats the traditional way would require ORC to report the serialized data 
size for each row. That is inefficient: it means scanning every row, collecting 
stats, and aggregating them. Since ORC already collects column statistics, we 
can use that information to compute the raw data size without scanning each 
row. That is why noscan/partialscan is needed at the end (both do the same 
thing).
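
The noscan path can be sketched end to end (TABLE_NAME is a placeholder):

{code}
set hive.stats.collect.rawdatasize=true;
analyze table TABLE_NAME compute statistics noscan;
-- rawDataSize in the table parameters should now be non-zero,
-- computed from ORC's column statistics without scanning rows
describe extended TABLE_NAME;
{code}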

> When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will 
> result in value '0' after running 'analyze table TABLE_NAME compute 
> statistics;'
> --
>
> Key: HIVE-9560
> URL: https://issues.apache.org/jira/browse/HIVE-9560
> Project: Hive
>  Issue Type: Bug
>Reporter: Xin Hao
>
> When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will 
> result in value '0' after running 'analyze table TABLE_NAME compute 
> statistics;'
> Steps to reproduce:
> (1) set hive.stats.collect.rawdatasize=true;
> (2) Generate an ORC table in Hive; the value of its 'rawDataSize' is NOT zero.
> You can confirm the non-zero 'rawDataSize' by executing 'describe extended 
> TABLE_NAME;'
> (3) Execute 'analyze table TABLE_NAME compute statistics;'
> (4) Execute 'describe extended TABLE_NAME;' again; you will find that the 
> value of 'rawDataSize' has changed to '0'.





[jira] [Commented] (HIVE-9560) When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will result in value '0' after running 'analyze table TABLE_NAME compute statistics;'

2015-02-02 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302891#comment-14302891
 ] 

Xin Hao commented on HIVE-9560:
---

For example, we have an ORC table named 'item'.

(a) Before running 'analyze table item compute statistics;',
the 'rawDataSize' was '884720592'.

The result of 'describe extended item':
Detailed Table Information  Table(tableName:item, dbName:bigbenchorc, 
owner:root, createTime:1421984899, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:i_item_sk, type:bigint, 
comment:null), FieldSchema(name:i_item_id, type:string, comment:null), 
FieldSchema(name:i_rec_start_date, type:string, comment:null), 
FieldSchema(name:i_rec_end_date, type:string, comment:null), 
FieldSchema(name:i_item_desc, type:string, comment:null), 
FieldSchema(name:i_current_price, type:double, comment:null), 
FieldSchema(name:i_wholesale_cost, type:double, comment:null), 
FieldSchema(name:i_brand_id, type:int, comment:null), FieldSchema(name:i_brand, 
type:string, comment:null), FieldSchema(name:i_class_id, type:int, 
comment:null), FieldSchema(name:i_class, type:string, comment:null), 
FieldSchema(name:i_category_id, type:int, comment:null), 
FieldSchema(name:i_category, type:string, comment:null), 
FieldSchema(name:i_manufact_id, type:int, comment:null), 
FieldSchema(name:i_manufact, type:string, comment:null), 
FieldSchema(name:i_size, type:string, comment:null), 
FieldSchema(name:i_formulation, type:string, comment:null), 
FieldSchema(name:i_color, type:string, comment:null), FieldSchema(name:i_units, 
type:string, comment:null), FieldSchema(name:i_container, type:string, 
comment:null), FieldSchema(name:i_manager_id, type:int, comment:null), 
FieldSchema(name:i_product_name, type:string, comment:null)], 
location:hdfs://bhx1:8020/user/hive/warehouse/bigbenchorc.db/item, 
inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, 
parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
partitionKeys:[], parameters:{numFiles=4, transient_lastDdlTime=1421984899, 
COLUMN_STATS_ACCURATE=true, totalSize=83267548, numRows=563518, 
rawDataSize=884720592}, viewOriginalText:null, viewExpandedText:null, 
tableType:MANAGED_TABLE)
Time taken: 0.527 seconds, Fetched: 24 row(s)

(b) After running 'analyze table item compute statistics;',
the 'rawDataSize' changed to '0'.

The result of 'describe extended item':
Detailed Table Information  Table(tableName:item, dbName:bigbenchorc, 
owner:root, createTime:1421984899, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:i_item_sk, type:bigint, 
comment:null), FieldSchema(name:i_item_id, type:string, comment:null), 
FieldSchema(name:i_rec_start_date, type:string, comment:null), 
FieldSchema(name:i_rec_end_date, type:string, comment:null), 
FieldSchema(name:i_item_desc, type:string, comment:null), 
FieldSchema(name:i_current_price, type:double, comment:null), 
FieldSchema(name:i_wholesale_cost, type:double, comment:null), 
FieldSchema(name:i_brand_id, type:int, comment:null), FieldSchema(name:i_brand, 
type:string, comment:null), FieldSchema(name:i_class_id, type:int, 
comment:null), FieldSchema(name:i_class, type:string, comment:null), 
FieldSchema(name:i_category_id, type:int, comment:null), 
FieldSchema(name:i_category, type:string, comment:null), 
FieldSchema(name:i_manufact_id, type:int, comment:null), 
FieldSchema(name:i_manufact, type:string, comment:null), 
FieldSchema(name:i_size, type:string, comment:null), 
FieldSchema(name:i_formulation, type:string, comment:null), 
FieldSchema(name:i_color, type:string, comment:null), FieldSchema(name:i_units, 
type:string, comment:null), FieldSchema(name:i_container, type:string, 
comment:null), FieldSchema(name:i_manager_id, type:int, comment:null), 
FieldSchema(name:i_product_name, type:string, comment:null)], 
location:hdfs://bhx1:8020/user/hive/warehouse/bigbenchorc.db/item, 
inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, 
parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
partitionKeys:[], parameters:{numFiles=4, transient_lastDdlTime=1421984899, 
COLUMN_STATS_ACCURATE=true, totalSize=83267548, numRows=563518, 
rawDataSize=884720592}, viewOriginalText:null,

[jira] [Created] (HIVE-9560) When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will result in value '0' after running 'analyze table TABLE_NAME compute statistics;'

2015-02-02 Thread Xin Hao (JIRA)
Xin Hao created HIVE-9560:
-

 Summary: When hive.stats.collect.rawdatasize=true, 'rawDataSize' 
for an ORC table will result in value '0' after running 'analyze table 
TABLE_NAME compute statistics;'
 Key: HIVE-9560
 URL: https://issues.apache.org/jira/browse/HIVE-9560
 Project: Hive
  Issue Type: Bug
Reporter: Xin Hao


When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will 
result in value '0' after running 'analyze table TABLE_NAME compute statistics;'

Steps to reproduce:
(1) set hive.stats.collect.rawdatasize=true;
(2) Generate an ORC table in Hive; the value of its 'rawDataSize' is NOT zero.
You can confirm the non-zero 'rawDataSize' by executing 'describe extended 
TABLE_NAME;'
(3) Execute 'analyze table TABLE_NAME compute statistics;'
(4) Execute 'describe extended TABLE_NAME;' again; you will find that the 
value of 'rawDataSize' has changed to '0'.
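
The steps above can be sketched as one Hive session (TABLE_NAME is a 
placeholder for the ORC table):

{code}
set hive.stats.collect.rawdatasize=true;
-- rawDataSize in the table parameters is non-zero at this point
describe extended TABLE_NAME;
analyze table TABLE_NAME compute statistics;
-- rawDataSize is now reported as 0 for the ORC table
describe extended TABLE_NAME;
{code}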







[jira] [Updated] (HIVE-9496) Sl4j warning in hive command

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9496:
--
Attachment: HIVE-9496.1.patch

patch #1

> Sl4j warning in hive command
> 
>
> Key: HIVE-9496
> URL: https://issues.apache.org/jira/browse/HIVE-9496
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.14.0
> Environment: HDP 2.2.0 on CentOS.
> With Horton Sand Box and my own cluster.
>Reporter: Philippe Kernevez
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9496.1.patch
>
>
> Each time the 'hive' command is run, we get an SLF4J warning about multiple 
> jars containing SLF4J classes.
> This bug is similar to HIVE-6162, but does not seem to be solved.
> Logging initialized using configuration in 
> file:/etc/hive/conf/hive-log4j.properties
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.0.0-1084/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.0.0-1084/hive/lib/hive-jdbc-0.14.0.2.2.0.0-1084-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]





[jira] [Commented] (HIVE-9496) Sl4j warning in hive command

2015-02-02 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302877#comment-14302877
 ] 

Alexander Pivovarov commented on HIVE-9496:
---

decided to move hive-jdbc-standalone.jar to extlib folder


> Sl4j warning in hive command
> 
>
> Key: HIVE-9496
> URL: https://issues.apache.org/jira/browse/HIVE-9496
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.14.0
> Environment: HDP 2.2.0 on CentOS.
> With Horton Sand Box and my own cluster.
>Reporter: Philippe Kernevez
>Assignee: Alexander Pivovarov
>Priority: Minor
>
> Each time the 'hive' command is run, we get an SLF4J warning about multiple 
> jars containing SLF4J classes.
> This bug is similar to HIVE-6162, but does not seem to be solved.
> Logging initialized using configuration in 
> file:/etc/hive/conf/hive-log4j.properties
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.0.0-1084/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.0.0-1084/hive/lib/hive-jdbc-0.14.0.2.2.0.0-1084-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]





Review Request 30551: HIVE-9496 move hive-jdbc-standalone.jar from lib to extlib folder

2015-02-02 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30551/
---

Review request for hive, Jason Dere and Thejas Nair.


Bugs: HIVE-9496
https://issues.apache.org/jira/browse/HIVE-9496


Repository: hive-git


Description
---

-move hive-jdbc-standalone.jar from lib to extlib folder
-fix hive-jdbc-standalone.jar in beeline.sh


Diffs
-

  bin/ext/beeline.sh a957fe17349bbb2a63590c0939b4ab5dc6cf67d0 
  packaging/src/main/assembly/bin.xml 8e617d83b507041792f216336bfee66049d73165 

Diff: https://reviews.apache.org/r/30551/diff/


Testing
---


Thanks,

Alexander Pivovarov



[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: HIVE-9545.1.patch.txt

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2
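
The compile error comes from a hard import of the JDK-internal 
com.sun.management package, which the IBM JDK does not ship. A portable 
pattern (an illustrative sketch, not necessarily what the HIVE-9545 patch 
does; the class and method names are the standard JDK ones) is to look the 
HotSpot-only class up reflectively and degrade gracefully when it is absent:

```java
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;

public class HeapDumpUtils {

    // True when the HotSpot-only diagnostic bean class exists on this JVM
    // (HotSpot/OpenJDK: yes; IBM J9: no). No compile-time dependency on it.
    public static boolean hotSpotDiagnosticsAvailable() {
        try {
            Class.forName("com.sun.management.HotSpotDiagnosticMXBean");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Triggers a heap dump when supported; a no-op on JVMs without the bean.
    public static void dumpHeap(String path, boolean liveObjectsOnly) throws Exception {
        if (!hotSpotDiagnosticsAvailable()) {
            return;
        }
        Class<?> beanClass = Class.forName("com.sun.management.HotSpotDiagnosticMXBean");
        Object bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic", beanClass);
        Method dump = beanClass.getMethod("dumpHeap", String.class, boolean.class);
        dump.invoke(bean, path, liveObjectsOnly);
    }
}
```

Because the lookup happens at runtime, the source compiles on any JVM; only 
the availability check differs between vendors.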





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Description: 

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

  was:
With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2


> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Description: 
 NO PRECOMMIT TESTS 

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

  was:

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2


> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
>  NO PRECOMMIT TESTS 
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: (was: HIVE-9495.1.patch.txt)

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Assignee: Navis
  Status: Patch Available  (was: Open)

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9495.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: HIVE-9495.1.patch.txt

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
> Attachments: HIVE-9495.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302853#comment-14302853
 ] 

Hive QA commented on HIVE-9350:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12693733/HIVE-9350.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2625/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2625/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2625/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2625/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target 
shims/0.23/target shims/aggregator/target shims/common/target 
shims/scheduler/target packaging/target hbase-handler/target testutils/target 
jdbc/target metastore/target itests/target itests/thirdparty 
itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target 
itests/hive-unit-hadoop2/target itests/hive-minikdc/target 
itests/hive-jmh/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target itests/qtest-spark/target hcatalog/target 
hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target 
accumulo-handler/target hwi/target common/target common/src/gen 
spark-client/target contrib/target service/target serde/target beeline/target 
odbc/target cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1656629.

At revision 1656629.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12693733 - PreCommit-HIVE-TRUNK-Build

> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-9350.1.patch
>
>
> It should be possible for HiveAuthorizer implementations to control whether 
> a user is able to see a table or database in the results of 'show tables' 
> and 'show databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9517) UNION ALL query failed with ArrayIndexOutOfBoundsException [Spark Branch]

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302849#comment-14302849
 ] 

Hive QA commented on HIVE-9517:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696020/HIVE-9517.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7421 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2624/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2624/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2624/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12696020 - PreCommit-HIVE-TRUNK-Build

> UNION ALL query failed with ArrayIndexOutOfBoundsException [Spark Branch]
> -
>
> Key: HIVE-9517
> URL: https://issues.apache.org/jira/browse/HIVE-9517
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9517.1.patch
>
>
> I was running a query from cbo_gby_empty.q:
> {code}
> select unionsrc.key, unionsrc.value FROM (select 'max' as key, max(c_int) as 
> value from cbo_t3 s1
>   UNION  ALL
>   select 'min' as key,  min(c_int) as value from cbo_t3 s2
> UNION ALL
> select 'avg' as key,  avg(c_int) as value from cbo_t3 s3) unionsrc 
> order by unionsrc.key;
> {code}
> and got the following exception:
> {noformat}
> 2015-01-29 15:57:55,948 ERROR [Executor task launch worker-1]: 
> spark.SparkReduceRecordHandler 
> (SparkReduceRecordHandler.java:processRow(299)) - Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) {"key":{"reducesinkkey0":"max"},"value":{"_col0":1.5}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) {"key":{"reducesinkkey0":"max"},"value":{"_col0":1.5}}
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:339)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> VALUE._col0
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:330)
>   ... 17 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:84)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryDouble.init(LazyBinaryDouble.java:43)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apach

[jira] [Commented] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302848#comment-14302848
 ] 

Navis commented on HIVE-9397:
-

Now the OIs are acquired directly from the row schema of the final GBY 
operator. I've also fixed the double-to-float type casting, so the 
stats-optimized and non-optimized paths now produce identical results.
It would be possible to extend StatsOptimizer to accept queries like "select 
min(x)+max(x) from tbl", but that seems better done in a follow-up issue.

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
> 
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Damien Carol
>Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt
>
>
> These queries produce an error :
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 30549: SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30549/
---

Review request for hive.


Bugs: HIVE-9397
https://issues.apache.org/jira/browse/HIVE-9397


Repository: hive-git


Description
---

These queries produce an error :

{code:sql}
DROP TABLE IF EXISTS foo;

CREATE TABLE foo (id int) STORED AS ORC;

INSERT INTO TABLE foo VALUES (1);
INSERT INTO TABLE foo VALUES (2);
INSERT INTO TABLE foo VALUES (3);
INSERT INTO TABLE foo VALUES (4);
INSERT INTO TABLE foo VALUES (5);

SELECT max(id) FROM foo;

ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;

SELECT max(id) FROM foo;
{code}

The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
{noformat}
0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
+-+--+
| _c0 |
+-+--+
org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
0: jdbc:hive2://nc-h04:1/casino>
{noformat}


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 6961d7f 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java d1118f1 
  ql/src/test/results/clientpositive/metadata_only_queries.q.out 90c76ed 
  ql/src/test/results/clientpositive/metadata_only_queries_with_filters.q.out 
5be958f 

Diff: https://reviews.apache.org/r/30549/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9397:

Attachment: HIVE-9397.2.patch.txt

Addressed review comments and fixed the double sub-type handling.

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
> 
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Damien Carol
>Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt
>
>
> These queries produce an error :
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6394) Implement Timestmap in ParquetSerde

2015-02-02 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302837#comment-14302837
 ] 

Yang Yang commented on HIVE-6394:
-

The Parquet spec on logical types, and on Timestamp specifically, 
https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md
says:
"TIMESTAMP_MILLIS is used for a combined logical date and time type. It must 
annotate an int64 that stores the number of milliseconds from the Unix epoch, 
00:00:00.000 on 1 January 1970, UTC."

That is, the spec's type is only precise to milliseconds, and its epoch is 
1970.

But if you look at the Hive-Parquet code in 
https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L142
https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTime.java#L54
it seems that Hive's encoding of timestamps in Parquet follows a different 
scheme: precise to nanoseconds, and starting from "Monday, January 1, 4713" 
(as defined in jodd.datetime.JDateTime).

So is Hive's Parquet timestamp storage completely different from the above 
spec?
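To make the difference concrete, here is a rough Python sketch of the split 
that a Julian-day-based encoding like NanoTime performs (an illustration of 
the int96 "Julian day + nanoseconds of day" layout described above, not the 
actual Hive or jodd code; the function and constant names are hypothetical):

```python
def to_nanotime(epoch_seconds: int, nanos: int = 0):
    # Split a Unix timestamp into the two fields of the INT96 layout:
    # an int32 Julian day number and an int64 nanoseconds-within-day.
    JULIAN_DAY_OF_EPOCH = 2440588          # 1970-01-01 as a Julian day number
    NANOS_PER_DAY = 24 * 60 * 60 * 10**9
    total_nanos = epoch_seconds * 10**9 + nanos
    julian_day = JULIAN_DAY_OF_EPOCH + total_nanos // NANOS_PER_DAY
    nanos_of_day = total_nanos % NANOS_PER_DAY
    return julian_day, nanos_of_day

print(to_nanotime(0))  # (2440588, 0): midnight at the Unix epoch
```

This makes it clear why the two encodings are incompatible: TIMESTAMP_MILLIS 
is a single int64 of milliseconds since 1970, while the layout above carries 
nanosecond precision relative to the Julian-day epoch.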




> Implement Timestmap in ParquetSerde
> ---
>
> Key: HIVE-6394
> URL: https://issues.apache.org/jira/browse/HIVE-6394
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Jarek Jarcec Cecho
>Assignee: Szehon Ho
>  Labels: Parquet
> Fix For: 0.14.0
>
> Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, 
> HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch, 
> HIVE-6394.patch
>
>
> This JIRA is to implement timestamp support in Parquet SerDe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9559) Create UDF to measure strings similarity using q-gram distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9559:
--
Description: 
algo description 
http://stackoverflow.com/questions/1938678/q-gram-approximate-matching-optimisations

{code}
str_sim_qgrams('Test String1', 'Test String2') = 0.78571427f
{code}

another example
{code}
> qgrams('abcde','abdcde',q=2)
   ab bc cd de dc bd
V1  1  1  1  1  0  0
V2  1  0  1  1  1  1
 
> stringdist('abcde', 'abdcde', method='qgram', q=2)
[1] 3
{code}

take SimMetrics as a reference implementation 
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistance.java
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistanceTest.java

  was:
algo description 
http://stackoverflow.com/questions/1938678/q-gram-approximate-matching-optimisations

{code}
str_sim_qgrams("Test String1", "Test String2") = 0.78571427f
{code}

another example
{code}
> qgrams('abcde','abdcde',q=2)
   ab bc cd de dc bd
V1  1  1  1  1  0  0
V2  1  0  1  1  1  1
 
> stringdist('abcde', 'abdcde', method='qgram', q=2)
[1] 3
{code}

take SimMetrics as a reference implementation 
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistance.java
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistanceTest.java


> Create UDF to measure strings similarity using q-gram distance algo
> ---
>
> Key: HIVE-9559
> URL: https://issues.apache.org/jira/browse/HIVE-9559
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description 
> http://stackoverflow.com/questions/1938678/q-gram-approximate-matching-optimisations
> {code}
> str_sim_qgrams('Test String1', 'Test String2') = 0.78571427f
> {code}
> another example
> {code}
> > qgrams('abcde','abdcde',q=2)
>ab bc cd de dc bd
> V1  1  1  1  1  0  0
> V2  1  0  1  1  1  1
>  
> > stringdist('abcde', 'abdcde', method='qgram', q=2)
> [1] 3
> {code}
> take SimMetrics as a reference implementation 
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistance.java
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistanceTest.java
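The q-gram distance shown in the second example ({{[1] 3}}) can be sketched in 
Python. This is an illustrative implementation of the count-vector distance 
that the R {{stringdist}} output above demonstrates, not the SimMetrics code 
(which additionally normalizes padded q-grams into a similarity score); the 
function names are hypothetical:

```python
from collections import Counter

def qgrams(s: str, q: int = 2) -> Counter:
    # Multiset of all contiguous substrings of length q.
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))

def qgram_distance(a: str, b: str, q: int = 2) -> int:
    # Sum of absolute count differences over the union of q-grams,
    # matching R stringdist's method='qgram'.
    ga, gb = qgrams(a, q), qgrams(b, q)
    return sum(abs(ga[g] - gb[g]) for g in ga.keys() | gb.keys())

print(qgram_distance('abcde', 'abdcde', q=2))  # 3
```

For 'abcde' vs 'abdcde' the differing bigrams are bc, dc, and bd, giving the 
distance of 3 in the table above.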



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Description: 
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein('Test String1', 'Test String2') = (12 -1) / 12 = 0.917f
{code}

  was:
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein("Test String1", "Test String2") = (12 -1) / 12 = 0.917f
{code}


> create UDF to measure strings similarity using Levenshtein Distance algo
> 
>
> Key: HIVE-9556
> URL: https://issues.apache.org/jira/browse/HIVE-9556
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
> {code}
> --one edit operation, greatest str len = 12
> str_sim_levenshtein('Test String1', 'Test String2') = (12 -1) / 12 = 
> 0.917f
> {code}
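The proposed similarity formula can be sketched in Python (a minimal 
illustration of the {{(longest_len - distance) / longest_len}} formula above, 
not Hive UDF code; the Python function name simply mirrors the proposed 
{{str_sim_levenshtein}}):

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def str_sim_levenshtein(a: str, b: str) -> float:
    # Similarity = (longest length - edit distance) / longest length.
    longest = max(len(a), len(b))
    return (longest - levenshtein(a, b)) / longest if longest else 1.0

print(round(str_sim_levenshtein('Test String1', 'Test String2'), 3))  # 0.917
```

One substitution over strings of length 12 yields (12 - 1) / 12, matching the 
example in the description.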



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Description: 
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
{code}

reference implementation:
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java

  was:
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine("Test String1", "Test String2") = (2 - 1) / 2 = 0.5f
{code}

reference implementation:
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java


> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java
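The example above can be reproduced with a token-level cosine similarity 
sketch in Python (an illustration of the formula, not the SimMetrics or Hive 
code; the Python function name mirrors the proposed {{str_sim_cosine}}):

```python
import math
from collections import Counter

def str_sim_cosine(a: str, b: str) -> float:
    # Cosine similarity between word-count vectors of the two strings.
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

print(round(str_sim_cosine('Test String1', 'Test String2'), 3))  # 0.5
```

'Test String1' and 'Test String2' share one of two tokens, so the dot product 
is 1 and both norms are sqrt(2), giving 1/2 = 0.5 as in the description.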



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Description: 
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine("Test String1", "Test String2") = (2 - 1) / 2 = 0.5f
{code}

reference implementation:
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java

  was:
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine("Test String1", "Test String2") = (2 - 1) / 2 = 0.5f
{code}


> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine("Test String1", "Test String2") = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9559) Create UDF to measure strings similarity using q-gram distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9559:
--
Description: 
algo description 
http://stackoverflow.com/questions/1938678/q-gram-approximate-matching-optimisations

{code}
str_sim_qgrams("Test String1", "Test String2") = 0.78571427f
{code}

another example
{code}
> qgrams('abcde','abdcde',q=2)
   ab bc cd de dc bd
V1  1  1  1  1  0  0
V2  1  0  1  1  1  1
 
> stringdist('abcde', 'abdcde', method='qgram', q=2)
[1] 3
{code}

take SimMetrics as a reference implementation 
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistance.java
https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistanceTest.java

> Create UDF to measure strings similarity using q-gram distance algo
> ---
>
> Key: HIVE-9559
> URL: https://issues.apache.org/jira/browse/HIVE-9559
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description 
> http://stackoverflow.com/questions/1938678/q-gram-approximate-matching-optimisations
> {code}
> str_sim_qgrams("Test String1", "Test String2") = 0.78571427f
> {code}
> another example
> {code}
> > qgrams('abcde','abdcde',q=2)
>ab bc cd de dc bd
> V1  1  1  1  1  0  0
> V2  1  0  1  1  1  1
>  
> > stringdist('abcde', 'abdcde', method='qgram', q=2)
> [1] 3
> {code}
> take SimMetrics as a reference implementation 
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistance.java
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/QGramsDistanceTest.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9559) Create UDF to measure strings similarity using q-gram distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-9559:
-

 Summary: Create UDF to measure strings similarity using q-gram 
distance algo
 Key: HIVE-9559
 URL: https://issues.apache.org/jira/browse/HIVE-9559
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Summary: create UDF to measure strings similarity using Cosine Similarity 
algo  (was: Create Cosine Similarity UDF)

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine("Test String1", "Test String2") = (2 -1) / 2 = 0.5f
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Description: 
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine("Test String1", "Test String2") = (2 - 1) / 2 = 0.5f
{code}

  was:
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine("Test String1", "Test String2") = (2 -1) / 2 = 0.5f
{code}


> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine("Test String1", "Test String2") = (2 - 1) / 2 = 0.5f
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9556) create UDF to measure strings similarity using Levenshtein Distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Summary: create UDF to measure strings similarity using Levenshtein 
Distance algo  (was: create UDF to measure string similarity using Levenshtein 
Distance algo)

> create UDF to measure strings similarity using Levenshtein Distance algo
> 
>
> Key: HIVE-9556
> URL: https://issues.apache.org/jira/browse/HIVE-9556
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
> {code}
> --one edit operation, greatest str len = 12
> str_sim_levenshtein("Test String1", "Test String2") = (12 -1) / 12 = 
> 0.917f
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9557) Create Cosine Similarity UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Description: 
algo description http://en.wikipedia.org/wiki/Cosine_similarity
{code}
--one word different, total 2 words
str_sim_cosine("Test String1", "Test String2") = (2 -1) / 2 = 0.5f
{code}

  was:
algo description http://en.wikipedia.org/wiki/Cosine_similarity



> Create Cosine Similarity UDF
> 
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine("Test String1", "Test String2") = (2 -1) / 2 = 0.5f
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9556) create UDF to measure string similarity using Levenshtein Distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Description: 
algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
{code}
--one edit operation, greatest str len = 12
str_sim_levenshtein("Test String1", "Test String2") = (12 -1) / 12 = 0.917f
{code}

  was:algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance


> create UDF to measure string similarity using Levenshtein Distance algo
> ---
>
> Key: HIVE-9556
> URL: https://issues.apache.org/jira/browse/HIVE-9556
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
> {code}
> --one edit operation, greatest str len = 12
> str_sim_levenshtein("Test String1", "Test String2") = (12 -1) / 12 = 
> 0.917f
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9556) create UDF to measure string similarity using Levenshtein Distance algo

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Summary: create UDF to measure string similarity using Levenshtein Distance 
algo  (was: create Levenshtein Distance UDF)

> create UDF to measure string similarity using Levenshtein Distance algo
> ---
>
> Key: HIVE-9556
> URL: https://issues.apache.org/jira/browse/HIVE-9556
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9558) [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode

2015-02-02 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-9558:

Status: Patch Available  (was: Open)

> [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable 
> in vectorized mode
> ---
>
> Key: HIVE-9558
> URL: https://issues.apache.org/jira/browse/HIVE-9558
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Dong Chen
>Assignee: Dong Chen
> Attachments: HIVE-9558.patch
>
>
> When using Parquet in vectorized mode, 
> {{VectorColumnAssignFactory.buildAssigners(..)}} does not handle 
> HiveDecimalWritable, HiveCharWritable, or HiveVarcharWritable. 
> We need to fix this and add a test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9558) [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode

2015-02-02 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-9558:

Attachment: HIVE-9558.patch

Uploaded a patch that adds a test case verifying Parquet data types in 
vectorized mode and fixes the failing decimal type.

> [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable 
> in vectorized mode
> ---
>
> Key: HIVE-9558
> URL: https://issues.apache.org/jira/browse/HIVE-9558
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Dong Chen
>Assignee: Dong Chen
> Attachments: HIVE-9558.patch
>
>
> When using Parquet in vectorized mode, 
> {{VectorColumnAssignFactory.buildAssigners(..)}} does not handle 
> HiveDecimalWritable, HiveCharWritable, or HiveVarcharWritable. 
> We need to fix this and add a test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6679) HiveServer2 should support configurable the server side socket timeout and keepalive for various transports types where applicable

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302785#comment-14302785
 ] 

Hive QA commented on HIVE-6679:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696010/HIVE-6679.6.patch

{color:green}SUCCESS:{color} +1 7421 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2623/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2623/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2623/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12696010 - PreCommit-HIVE-TRUNK-Build

> HiveServer2 should support configurable the server side socket timeout and 
> keepalive for various transports types where applicable
> --
>
> Key: HIVE-6679
> URL: https://issues.apache.org/jira/browse/HIVE-6679
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0, 0.14.0
>Reporter: Prasad Mujumdar
>Assignee: Navis
>  Labels: TODOC14, TODOC15
> Fix For: 1.1.0
>
> Attachments: HIVE-6679.1.patch.txt, HIVE-6679.2.patch.txt, 
> HIVE-6679.3.patch, HIVE-6679.4.patch, HIVE-6679.5.patch, HIVE-6679.6.patch
>
>
>  HiveServer2 should support a configurable server-side socket read timeout 
> and TCP keep-alive option. The metastore server already supports this (and so 
> does the old Hive server). 
> We now have multiple client connectivity options: Kerberos, Delegation 
> Token (Digest-MD5), Plain SASL, Plain SASL with SSL, and raw sockets. The 
> configuration should apply to all of these types where possible.
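At the raw-socket level, the two options the description asks to make tunable map to standard Java socket settings; a minimal sketch (the timeout value is illustrative, and these are not HiveServer2's actual configuration properties):

```java
import java.net.Socket;

public class SocketConfigDemo {
    // Apply the two server-side options the issue asks to make configurable.
    static Socket configure(Socket s, int readTimeoutMs, boolean keepAlive) throws Exception {
        s.setSoTimeout(readTimeoutMs);  // socket read timeout (SO_TIMEOUT), in milliseconds
        s.setKeepAlive(keepAlive);      // TCP keep-alive probes (SO_KEEPALIVE)
        return s;
    }

    public static void main(String[] args) throws Exception {
        // An unconnected socket can be configured before it is used.
        Socket s = configure(new Socket(), 20_000, true);
        System.out.println(s.getSoTimeout() + " " + s.getKeepAlive());
        s.close();
    }
}
```

For SASL or SSL transports, these options would have to be applied to the underlying socket before it is wrapped, which is why the description notes the configuration may not apply to every transport type.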



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9558) [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode

2015-02-02 Thread Dong Chen (JIRA)
Dong Chen created HIVE-9558:
---

 Summary: [Parquet] support HiveDecimalWritable, HiveCharWritable, 
HiveVarcharWritable in vectorized mode
 Key: HIVE-9558
 URL: https://issues.apache.org/jira/browse/HIVE-9558
 Project: Hive
  Issue Type: Sub-task
Reporter: Dong Chen
Assignee: Dong Chen


When using Parquet in vectorized mode, 
{{VectorColumnAssignFactory.buildAssigners(..)}} does not handle 
HiveDecimalWritable, HiveCharWritable, or HiveVarcharWritable. 
We need to fix this and add tests.





[jira] [Updated] (HIVE-9557) Create Cosine Similarity UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Component/s: UDF

> Create Cosine Similarity UDF
> 
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
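The formula from the linked page, cos(x, y) = x·y / (‖x‖·‖y‖), can be sketched in plain Java; this is only an illustration of the math, not the proposed UDF code:

```java
public class CosineSimilarityDemo {
    // Cosine similarity: dot product divided by the product of the norms.
    static double cosine(double[] x, double[] y) {
        double dot = 0, nx = 0, ny = 0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            nx  += x[i] * x[i];
            ny  += y[i] * y[i];
        }
        return dot / (Math.sqrt(nx) * Math.sqrt(ny));
    }

    public static void main(String[] args) {
        System.out.println(cosine(new double[]{1, 0}, new double[]{0, 1})); // orthogonal: 0.0
        System.out.println(cosine(new double[]{1, 2}, new double[]{2, 4})); // parallel: ~1.0
    }
}
```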





[jira] [Updated] (HIVE-9520) Create NEXT_DAY UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9520:
--
Component/s: UDF

> Create NEXT_DAY UDF
> ---
>
> Key: HIVE-9520
> URL: https://issues.apache.org/jira/browse/HIVE-9520
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9520.1.patch
>
>
> NEXT_DAY returns the date of the first weekday, named by the second argument, 
> that is later than the given date.
> Example:
> {code}
> select next_day('2001-02-02','TUESDAY') ...;
> OK
> 2001-02-06
> {code}
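The semantics shown in the example can be sketched with java.time; this is a hypothetical illustration of the behavior, not the UDF implementation:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

public class NextDayDemo {
    // Return the first date strictly after 'date' that falls on 'weekday'.
    static LocalDate nextDay(LocalDate date, DayOfWeek weekday) {
        return date.with(TemporalAdjusters.next(weekday));
    }

    public static void main(String[] args) {
        // 2001-02-02 is a Friday; the next Tuesday is 2001-02-06,
        // matching the example in the issue.
        System.out.println(nextDay(LocalDate.parse("2001-02-02"), DayOfWeek.TUESDAY));
    }
}
```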





[jira] [Updated] (HIVE-9556) create Levenshtein Distance UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9556:
--
Description: algorithm description 
http://en.wikipedia.org/wiki/Levenshtein_distance

> create Levenshtein Distance UDF
> ---
>
> Key: HIVE-9556
> URL: https://issues.apache.org/jira/browse/HIVE-9556
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algorithm description http://en.wikipedia.org/wiki/Levenshtein_distance
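The linked algorithm is the classic dynamic-programming edit distance; a minimal Java sketch using two rolling rows (illustrative only, not Hive's implementation):

```java
public class LevenshteinDemo {
    // Edit distance between a and b: minimum number of single-character
    // insertions, deletions, and substitutions to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;  // distance from empty prefix
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1,   // insertion
                                            prev[j] + 1),      // deletion
                                   prev[j - 1] + cost);        // substitution
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // 3
    }
}
```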





[jira] [Updated] (HIVE-9557) Create Cosine Similarity UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9557:
--
Description: 
algo description http://en.wikipedia.org/wiki/Cosine_similarity


> Create Cosine Similarity UDF
> 
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity





[jira] [Created] (HIVE-9557) Create Cosine Similarity UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-9557:
-

 Summary: Create Cosine Similarity UDF
 Key: HIVE-9557
 URL: https://issues.apache.org/jira/browse/HIVE-9557
 Project: Hive
  Issue Type: Improvement
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov








[jira] [Created] (HIVE-9556) create Levenshtein Distance UDF

2015-02-02 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-9556:
-

 Summary: create Levenshtein Distance UDF
 Key: HIVE-9556
 URL: https://issues.apache.org/jira/browse/HIVE-9556
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov








[jira] [Resolved] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9528.
-
Resolution: Not a Problem

> SemanticException: Ambiguous column reference
> -
>
> Key: HIVE-9528
> URL: https://issues.apache.org/jira/browse/HIVE-9528
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Yongzhi Chen
>Assignee: Navis
>
> When running the following query:
> {code}
> SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
> sim a join sim2 b on a.simstr=b.simstr) app
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
> {code}
> This query works fine in Hive 0.10.
> On Apache trunk, the following workaround works:
> {code}
> SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
> a join sim2 b on a.simstr=b.simstr) app;
> {code}





[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302768#comment-14302768
 ] 

Navis commented on HIVE-9528:
-

No, it's HIVE-7733. I've almost forgotten the context, but it was probably 
about enforcing unique column names in the final stage of a subquery, which was 
previously checked when generating the select operator.

> SemanticException: Ambiguous column reference
> -
>
> Key: HIVE-9528
> URL: https://issues.apache.org/jira/browse/HIVE-9528
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Yongzhi Chen
>Assignee: Navis
>
> When running the following query:
> {code}
> SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
> sim a join sim2 b on a.simstr=b.simstr) app
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
> {code}
> This query works fine in Hive 0.10.
> On Apache trunk, the following workaround works:
> {code}
> SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
> a join sim2 b on a.simstr=b.simstr) app;
> {code}





[jira] [Updated] (HIVE-9541) Update people page with new PMC members

2015-02-02 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9541:
--
Assignee: Prasanth Jayachandran  (was: Xuefu Zhang)

> Update people page with new PMC members
> ---
>
> Key: HIVE-9541
> URL: https://issues.apache.org/jira/browse/HIVE-9541
> Project: Hive
>  Issue Type: Improvement
>  Components: Website
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Trivial
> Attachments: HIVE-9541.1.patch, HIVE-9541.2.patch
>
>
> Move [~jdere], [~owen.omalley], [~prasanth_j], [~vikram.dixit] and [~szehon] 
> from committer list to PMC list.
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302756#comment-14302756
 ] 

Navis commented on HIVE-9553:
-

+1

> Fix log-line in Partition Pruner
> 
>
> Key: HIVE-9553
> URL: https://issues.apache.org/jira/browse/HIVE-9553
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Trivial
> Attachments: HIVE-9553.1.patch
>
>
> Minor issue in logging the prune-expression in the PartitionPruner:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
> {code}
> Given the operator precedence order, this should read:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
> {code}
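The precedence problem is easy to reproduce in isolation: in Java, `+` binds tighter than `==`, so the unparenthesized version null-checks the concatenated string (which is never null) instead of `prunerExpr`. A standalone illustration, not the patch itself:

```java
public class PrecedenceDemo {
    // Buggy variant: parses as ((prefix + prunerExpr) == null) ? "" : ...,
    // so the else branch is always taken and the prefix is lost.
    static String buggy(Object prunerExpr) {
        return "prune Expression = " + prunerExpr == null ? "" : String.valueOf(prunerExpr);
    }

    // Fixed variant from the patch: parentheses scope the ternary to prunerExpr.
    static String fixed(Object prunerExpr) {
        return "prune Expression = " + (prunerExpr == null ? "" : prunerExpr);
    }

    public static void main(String[] args) {
        System.out.println(buggy(null)); // "null" -- the intended prefix is lost
        System.out.println(fixed(null)); // "prune Expression = "
    }
}
```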





[jira] [Assigned] (HIVE-9541) Update people page with new PMC members

2015-02-02 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-9541:
-

Assignee: Xuefu Zhang  (was: Prasanth Jayachandran)

> Update people page with new PMC members
> ---
>
> Key: HIVE-9541
> URL: https://issues.apache.org/jira/browse/HIVE-9541
> Project: Hive
>  Issue Type: Improvement
>  Components: Website
>Reporter: Prasanth Jayachandran
>Assignee: Xuefu Zhang
>Priority: Trivial
> Attachments: HIVE-9541.1.patch, HIVE-9541.2.patch
>
>
> Move [~jdere], [~owen.omalley], [~prasanth_j], [~vikram.dixit] and [~szehon] 
> from committer list to PMC list.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-9552) Merge trunk to Spark branch 2/2/2015 [Spark Branch]

2015-02-02 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9552:
--
Attachment: HIVE-9552.patch

> Merge trunk to Spark branch 2/2/2015 [Spark Branch]
> ---
>
> Key: HIVE-9552
> URL: https://issues.apache.org/jira/browse/HIVE-9552
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9552.3-spark.patch, HIVE-9552.patch
>
>






[jira] [Commented] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302743#comment-14302743
 ] 

Hive QA commented on HIVE-9553:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696012/HIVE-9553.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7421 tests executed
*Failed tests:*
{noformat}
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2622/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2622/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2622/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12696012 - PreCommit-HIVE-TRUNK-Build

> Fix log-line in Partition Pruner
> 
>
> Key: HIVE-9553
> URL: https://issues.apache.org/jira/browse/HIVE-9553
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Trivial
> Attachments: HIVE-9553.1.patch
>
>
> Minor issue in logging the prune-expression in the PartitionPruner:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
> {code}
> Given the operator precedence order, this should read:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
> {code}





[jira] [Commented] (HIVE-9529) "alter table .. concatenate" under Tez mode should create TezTask

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302684#comment-14302684
 ] 

Hive QA commented on HIVE-9529:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696009/HIVE-9529.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7419 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2621/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2621/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2621/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12696009 - PreCommit-HIVE-TRUNK-Build

> "alter table .. concatenate" under Tez mode should create TezTask
> -
>
> Key: HIVE-9529
> URL: https://issues.apache.org/jira/browse/HIVE-9529
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0, 1.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9529-branch-1.0.0.patch, 
> HIVE-9529-branch-1.1.0.patch, HIVE-9529.1.patch, HIVE-9529.2.patch
>
>
> The "alter table .. concatenate" DDL command creates an MR task by default. 
> When the Hive CLI is launched with Tez as the execution engine, the scheduling 
> of the MR task for file merging can be delayed until the Tez session expires, 
> because YARN will not have capacity to launch another ApplicationMaster for 
> the MR task. We should create a Tez task instead: when the execution engine is 
> Tez, a TezTask will be created; otherwise an MRTask will be created.





[jira] [Updated] (HIVE-9273) Add option to fire metastore event on insert

2015-02-02 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-9273:
-
Attachment: HIVE-9273.2.patch

New version of the patch that addresses Sushanth's feedback, sort of. :)

I changed FireEventRequest.dbName to be optional, as suggested.

Rather than put in a null guard for FireEventRequest.getData, I instead made 
data required.  I did this because I had removed the type indicator from the 
message in the last patch, so it didn't make sense to allow an event to be 
requested that had no type information.  The type of the union now effectively 
serves as that type indicator.

I agree we should file a separate bug for the HCat append work.

> Add option to fire metastore event on insert
> 
>
> Key: HIVE-9273
> URL: https://issues.apache.org/jira/browse/HIVE-9273
> Project: Hive
>  Issue Type: New Feature
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-9273.2.patch, HIVE-9273.patch
>
>
> HIVE-9271 adds the ability for the client to request firing metastore events. 
>  This can be used in the MoveTask to fire events when an insert is done that 
> does not add partitions to a table.





[jira] [Updated] (HIVE-9273) Add option to fire metastore event on insert

2015-02-02 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-9273:
-
Status: Patch Available  (was: Open)

> Add option to fire metastore event on insert
> 
>
> Key: HIVE-9273
> URL: https://issues.apache.org/jira/browse/HIVE-9273
> Project: Hive
>  Issue Type: New Feature
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-9273.2.patch, HIVE-9273.patch
>
>
> HIVE-9271 adds the ability for the client to request firing metastore events. 
>  This can be used in the MoveTask to fire events when an insert is done that 
> does not add partitions to a table.





[jira] [Updated] (HIVE-9273) Add option to fire metastore event on insert

2015-02-02 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-9273:
-
Status: Open  (was: Patch Available)

> Add option to fire metastore event on insert
> 
>
> Key: HIVE-9273
> URL: https://issues.apache.org/jira/browse/HIVE-9273
> Project: Hive
>  Issue Type: New Feature
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-9273.patch
>
>
> HIVE-9271 adds the ability for the client to request firing metastore events. 
>  This can be used in the MoveTask to fire events when an insert is done that 
> does not add partitions to a table.





Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-02-02 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review70694
---



ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java


hey, sorry for being dumb, but it looks like many tests are being deleted as 
part of this change. Is that true, or are these duplicate tests or tests covered 
elsewhere?


- Brock Noland


On Jan. 29, 2015, 5:12 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30281/
> ---
> 
> (Updated Jan. 29, 2015, 5:12 p.m.)
> 
> 
> Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
> 
> 
> Bugs: HIVE-9333
> https://issues.apache.org/jira/browse/HIVE-9333
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch moves the ParquetHiveSerDe.serialize() implementation to 
> DataWritableWriter class in order to save time in materializing data on 
> serialize().
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
>  ea4109d358f7c48d1e2042e5da299475de4a0a29 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
> 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
>  060b1b722d32f3b2f88304a1a73eb249e150294b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
>  41b5f1c3b0ab43f734f8a211e3e03d5060c75434 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
>  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
> a693aff18516d133abf0aae4847d3fe00b9f1c96 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
>  667d3671547190d363107019cd9a2d105d26d336 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
> 007a665529857bcec612f638a157aa5043562a15 
>   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30281/diff/
> 
> 
> Testing
> ---
> 
> The tests run were the following:
> 
> 1. JMH (Java microbenchmark)
> 
> This benchmark called parquet serialize/write methods using text writable 
> objects. 
> 
> Class.method                  Before Change (ops/s)   After Change (ops/s)
> ---
> ParquetHiveSerDe.serialize    19,113                  249,528  ->  ~13x speed increase
> DataWritableWriter.write      5,033                   5,201    ->  3.34% speed increase
> 
> 
> 2. Write 20 million rows (~1GB file) from Text to Parquet
> 
> I wrote a ~1GB file in Textfile format, then converted it to Parquet format 
> using the following statement: CREATE TABLE parquet STORED AS parquet AS 
> SELECT * FROM text;
> 
> Time (s) it took to write the whole file BEFORE changes: 93.758 s
> Time (s) it took to write the whole file AFTER changes: 83.903 s
> 
> That is about a 10% speed increase.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



[jira] [Commented] (HIVE-9550) ObjectStore.getNextNotification() can return events inside NotificationEventResponse as null which conflicts with its thrift "required" tag

2015-02-02 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302654#comment-14302654
 ] 

Alan Gates commented on HIVE-9550:
--

+1

> ObjectStore.getNextNotification() can return events inside 
> NotificationEventResponse as null which conflicts with its thrift "required" 
> tag
> ---
>
> Key: HIVE-9550
> URL: https://issues.apache.org/jira/browse/HIVE-9550
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9550.patch
>
>
> Per hive_metastore.thrift, the "events" list inside NotificationEventResponse 
> is a required field that cannot be null.
> {code}
> struct NotificationEventResponse {
> 1: required list<NotificationEvent> events,
> }
> {code}
> However, per ObjectStore.java, this events field can be uninitialized if the 
> events retrieved from the metastore is empty instead of null:
> {code}
>   NotificationEventResponse result = new NotificationEventResponse();
>   int maxEvents = rqst.getMaxEvents() > 0 ? rqst.getMaxEvents() : 
> Integer.MAX_VALUE;
>   int numEvents = 0; 
>   while (i.hasNext() && numEvents++ < maxEvents) {
> result.addToEvents(translateDbToThrift(i.next()));
>   }
>   return result;
> {code}
> The fix is simple enough - we need to call result.setEvents(new 
> ArrayList<NotificationEvent>()) before we begin the iteration that does 
> result.addToEvents(...).
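A stripped-down stand-in shows why the field can stay null when no events match; this is a hypothetical, simplified model of a Thrift-generated class, not Hive code:

```java
import java.util.ArrayList;
import java.util.List;

public class RequiredListDemo {
    // Minimal stand-in for the generated response class: the 'events' field
    // is declared 'required' in the IDL, so it must never be null.
    static class Response {
        List<String> events;  // stays null until setEvents/addToEvents runs
        void setEvents(List<String> e) { events = e; }
        void addToEvents(String e) {
            if (events == null) events = new ArrayList<>();  // lazily created
            events.add(e);
        }
    }

    public static void main(String[] args) {
        Response buggy = new Response();
        // Zero matching events: addToEvents is never called, so the
        // required field is left null, violating the contract.
        System.out.println(buggy.events);

        Response fixed = new Response();
        fixed.setEvents(new ArrayList<String>());  // the described fix
        System.out.println(fixed.events);          // empty list, valid
    }
}
```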





[jira] [Updated] (HIVE-9470) Use a generic writable object to run ColumnarStorageBench write/read tests

2015-02-02 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9470:
---
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Thank you Sergio! I have committed this to trunk!

> Use a generic writable object to run ColumnarStorageBench write/read tests 
> --
>
> Key: HIVE-9470
> URL: https://issues.apache.org/jira/browse/HIVE-9470
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Fix For: 1.2.0
>
> Attachments: HIVE-9470.1.patch, HIVE-9470.2.patch
>
>
> The ColumnarStorageBench benchmark class uses a Parquet writable object 
> to run all write/read/serialize/deserialize tests. It would be better to use 
> a more generic writable object (like text writables) to get fairer benchmark 
> comparisons between storage formats.
> Using Parquet writables may give Parquet an advantage when writing.





[jira] [Updated] (HIVE-9303) Parquet files are written with incorrect definition levels

2015-02-02 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9303:
---
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Thank you Sergio! I have committed this to trunk

> Parquet files are written with incorrect definition levels
> --
>
> Key: HIVE-9303
> URL: https://issues.apache.org/jira/browse/HIVE-9303
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Skye Wanderman-Milne
>Assignee: Sergio Peña
> Fix For: 1.2.0
>
> Attachments: HIVE-9303.1.patch, HIVE-9303.1.patch
>
>
> The definition level, which determines which level of nesting is NULL, 
> appears to always be n or n-1, where n is the maximum definition level. This 
> means that only the innermost level of nesting can be NULL. This is only 
> relevant for Parquet files. For example:
> {code:sql}
> CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>)
> STORED AS TEXTFILE;
> INSERT OVERWRITE TABLE text_tbl
> SELECT IF(false, named_struct("b", named_struct("c", 1)), NULL)
> FROM tbl LIMIT 1;
> CREATE TABLE parq_tbl
> STORED AS PARQUET
> AS SELECT * FROM text_tbl;
> SELECT * FROM text_tbl;
> => NULL # right
> SELECT * FROM parq_tbl;
> => {"b":{"c":null}} # wrong
> {code}





[jira] [Commented] (HIVE-8379) NanoTimeUtils performs some work needlessly

2015-02-02 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302648#comment-14302648
 ] 

Brock Noland commented on HIVE-8379:


[~spena] - this patch needs a rebase

> NanoTimeUtils performs some work needlessly
> ---
>
> Key: HIVE-8379
> URL: https://issues.apache.org/jira/browse/HIVE-8379
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-8379.1.patch
>
>
> Portions of the math done with the constants can be pre-computed:
> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java#L70





[jira] [Updated] (HIVE-9522) Improve the speed of select count(*) statement for a parquet table with big input(~1GB)

2015-02-02 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9522:
---
Description: Note that this comparison must be done after calculating stats 
and setting the properties specified here: 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark:+Getting+Started

> Improve the speed of select count(*) statement for a parquet table with big 
> input(~1GB)
> ---
>
> Key: HIVE-9522
> URL: https://issues.apache.org/jira/browse/HIVE-9522
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>
> Note that this comparison must be done after calculating stats and setting 
> the properties specified here: 
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark:+Getting+Started





[jira] [Commented] (HIVE-9554) Rename 0.15 upgrade scripts to 1.1

2015-02-02 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302646#comment-14302646
 ] 

Alan Gates commented on HIVE-9554:
--

+1

> Rename 0.15 upgrade scripts to 1.1
> --
>
> Key: HIVE-9554
> URL: https://issues.apache.org/jira/browse/HIVE-9554
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: HIVE-9554.1.patch
>
>






[jira] [Updated] (HIVE-9529) "alter table .. concatenate" under Tez mode should create TezTask

2015-02-02 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9529:

Attachment: HIVE-9529-branch-1.1.0.patch
HIVE-9529-branch-1.0.0.patch

Patches for branch-1.0.0 and branch-1.1.0, in case someone wants them in the 
future.

> "alter table .. concatenate" under Tez mode should create TezTask
> -
>
> Key: HIVE-9529
> URL: https://issues.apache.org/jira/browse/HIVE-9529
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0, 1.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9529-branch-1.0.0.patch, 
> HIVE-9529-branch-1.1.0.patch, HIVE-9529.1.patch, HIVE-9529.2.patch
>
>
> The "alter table .. concatenate" DDL command creates an MR task by default. 
> When the Hive CLI is launched with Tez as the execution engine, the scheduling 
> of the MR task for file merging can be delayed until the Tez session expires, 
> because YARN will not have capacity to launch another ApplicationMaster for 
> the MR task. We should create a Tez task instead: when the execution engine is 
> Tez, a TezTask will be created; otherwise an MRTask will be created.





[jira] [Updated] (HIVE-7175) Provide password file option to beeline

2015-02-02 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7175:
---
Attachment: HIVE-7175.branch-13.patch

Patch for 0.13.0 in case anyone needs it.

> Provide password file option to beeline
> ---
>
> Key: HIVE-7175
> URL: https://issues.apache.org/jira/browse/HIVE-7175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Clients
>Affects Versions: 0.13.0
>Reporter: Robert Justice
>Assignee: Dr. Wendell Urth
>  Labels: features, security
> Fix For: 1.2.0
>
> Attachments: HIVE-7175.1.patch, HIVE-7175.branch-13.patch, 
> HIVE-7175.patch
>
>
> For people connecting to HiveServer2 with LDAP authentication enabled, in 
> order to batch-run commands, users currently have to provide the password 
> openly on the command line. They could use some expect scripting, but a valid 
> improvement would be to provide a password file option, similar to other 
> Hadoop CLI commands (e.g. Sqoop), to be more secure.





[jira] [Commented] (HIVE-9552) Merge trunk to Spark branch 2/2/2015 [Spark Branch]

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302624#comment-14302624
 ] 

Hive QA commented on HIVE-9552:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696047/HIVE-9552.3-spark.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 7468 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_explode2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_external_time
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_read_backward_compatible_files
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/706/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/706/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-706/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12696047 - PreCommit-HIVE-SPARK-Build

> Merge trunk to Spark branch 2/2/2015 [Spark Branch]
> ---
>
> Key: HIVE-9552
> URL: https://issues.apache.org/jira/browse/HIVE-9552
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9552.3-spark.patch
>
>






[jira] [Commented] (HIVE-9549) Include missing directories in source tarball

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302620#comment-14302620
 ] 

Hive QA commented on HIVE-9549:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696004/HIVE-9549.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7420 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2620/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2620/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2620/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}


ATTACHMENT ID: 12696004 - PreCommit-HIVE-TRUNK-Build

> Include missing directories in source tarball
> -
>
> Key: HIVE-9549
> URL: https://issues.apache.org/jira/browse/HIVE-9549
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: HIVE-9549.patch, HIVE-9549.patch
>
>






[jira] [Updated] (HIVE-9521) Drop support for Java6

2015-02-02 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9521:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

The JIRA already had two +1's, so I committed it to trunk. Thanks [~ndimiduk]!

> Drop support for Java6
> --
>
> Key: HIVE-9521
> URL: https://issues.apache.org/jira/browse/HIVE-9521
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 1.2.0
>
> Attachments: HIVE-9521.00.patch
>
>
> As logical continuation of HIVE-4583, let's start using java7 syntax as well.





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302613#comment-14302613
 ] 

Navis commented on HIVE-6099:
-

[~ashutoshc] Could we remove this optimization? I'm sure it has not been valid 
from the start.

> Multi insert does not work properly with distinct count
> ---
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0
>Reporter: Pavan Gadam Manohar
>Assignee: Navis
>  Labels: count, distinct, insert, multi-insert
> Attachments: explain_hive_0.10.0.txt, with_disabled.txt, 
> with_enabled.txt
>
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Let us say user tommy belongs to both user types 111 and 
> 123. Insert 2 records into the table created above.
> select * from  Table_A;
> hive>  select * from table_a;
> OK
> tommy   123 2013-12-02
> tommy   111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_A';
>  
> CREATE EXTERNAL TABLE dest_Table_B
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_B';
> Step 4) Multi insert statement
> from Table_A a
> INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
>  
> INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
> ;
>  
> Step 5) Verify results.
> hive>  select * from dest_table_a;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.116 seconds
> hive>  select * from dest_table_b;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.13 seconds
> Conclusion: Hive gives a count of 2 for distinct users although there is 
> only one distinct user. After trying many datasets, we observed that Hive is 
> computing Type111Users + Type123Users = DistinctUsers, which is wrong.
> hive> select count(distinct a.user) from table_a a;
> Gives:
> Total MapReduce CPU Time Spent: 4 seconds 350 msec
> OK
> 1
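The wrong arithmetic described in the conclusion can be reproduced with plain Python sets (a sketch; the user and type values mirror the example above):

```python
# Users of each type from the example: tommy has both type 111 and type 123.
type111_users = {"tommy"}
type123_users = {"tommy"}

# Correct distinct count: the size of the union of the two sets.
correct_distinct = len(type111_users | type123_users)

# What the buggy plan effectively computes: the sum of the per-type counts,
# which double-counts users belonging to both types.
buggy_distinct = len(type111_users) + len(type123_users)

print(correct_distinct)  # 1
print(buggy_distinct)    # 2, matching the wrong value Hive returns
```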





[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.

2015-02-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302595#comment-14302595
 ] 

Lefty Leverenz commented on HIVE-9500:
--

Doc note:  "hive.serialization.extend.nesting.levels" and 
"hive.serialization.extend.additional.nesting.levels" are SerDe properties, not 
HiveConf properties.  Their documentation should go in the Hive Types wikidoc, 
either in a new section for structs or in the existing "Complex Types" section, 
and perhaps in the DDL wikidoc.  (One of these days we should add a section on 
SerDe properties to the Hive SerDes wikidoc.)

* [Hive Data Types -- Complex Types | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-ComplexTypes]
* [DDL -- Create Table -- Row Format, Storage Format, and SerDe | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe]
* [DDL -- Alter Table -- Add SerDe Properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties]
* [Hive SerDes | https://cwiki.apache.org/confluence/display/Hive/SerDe]

Javadoc descriptions for the properties are in 
serde/src/java/org/apache/hadoop/hive/serde2/SerDeParameters.java:

{code}
+   /**
+* To be backward-compatible, initialize the first 3 separators to 
+   * be the given values. Default number of separators to be 8; If only
+   * hive.serialization.extend.nesting.levels is set, extend the number of
+   * separators to be 24; if 
hive.serialization.extend.additional.nesting.levels
+   * is set, extend the number of separators to 154.
+* @param tbl
+*/
{code}

Editorial review:  Please align the asterisks and change the first semicolon to 
a period ("to be 8; If only...").
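The separator-count rules stated in that Javadoc can be sketched as a small helper (an illustration only, not Hive's actual implementation; the property names are taken from the patch):

```python
def nesting_separator_count(props):
    """Return the number of separators LazySimpleSerDe would use, following
    the rules in the Javadoc above: 8 by default, 24 when
    hive.serialization.extend.nesting.levels is set, and 154 when
    hive.serialization.extend.additional.nesting.levels is set."""
    if props.get("hive.serialization.extend.additional.nesting.levels") == "true":
        return 154
    if props.get("hive.serialization.extend.nesting.levels") == "true":
        return 24
    return 8

print(nesting_separator_count({}))  # 8
print(nesting_separator_count(
    {"hive.serialization.extend.nesting.levels": "true"}))  # 24
```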

> Support nested structs over 24 levels.
> --
>
> Key: HIVE-9500
> URL: https://issues.apache.org/jira/browse/HIVE-9500
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: SerDe
> Fix For: 1.2.0
>
> Attachments: HIVE-9500.1.patch
>
>
> Customer has deeply nested avro structure and is receiving the following 
> error when performing queries.
> 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException 
> org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
> supported for LazySimpleSerde is 23 Unable to work with level 24
> Currently we support up to 24 levels of nested structs when 
> hive.serialization.extend.nesting.levels is set to true, while the customers 
> have the requirement to support more than that. 
> It would be better to make the supported levels configurable or completely 
> removed (i.e., we can support any number of levels). 





[jira] [Commented] (HIVE-9521) Drop support for Java6

2015-02-02 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302555#comment-14302555
 ] 

Nick Dimiduk commented on HIVE-9521:


Test failure looks unrelated and console output looks clean. Can I get a 
+1/commit? :)

> Drop support for Java6
> --
>
> Key: HIVE-9521
> URL: https://issues.apache.org/jira/browse/HIVE-9521
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 1.2.0
>
> Attachments: HIVE-9521.00.patch
>
>
> As logical continuation of HIVE-4583, let's start using java7 syntax as well.





[jira] [Commented] (HIVE-7175) Provide password file option to beeline

2015-02-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302548#comment-14302548
 ] 

Thejas M Nair commented on HIVE-7175:
-

+1
Thanks [~wendell.urth] and [~vgumashta]!


> Provide password file option to beeline
> ---
>
> Key: HIVE-7175
> URL: https://issues.apache.org/jira/browse/HIVE-7175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Clients
>Affects Versions: 0.13.0
>Reporter: Robert Justice
>Assignee: Dr. Wendell Urth
>  Labels: features, security
> Fix For: 1.2.0
>
> Attachments: HIVE-7175.1.patch, HIVE-7175.patch
>
>
> For people connecting to Hive Server 2 with LDAP authentication enabled, in 
> order to batch run commands, we currently have to provide the password openly 
> in the command line.   They could use some expect scripting, but I think a 
> valid improvement would be to provide a password file option similar to other 
> CLI commands in hadoop (e.g. sqoop) to be more secure.
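The pattern being requested — read the secret from a permission-restricted file instead of the command line — can be sketched like this (a generic illustration; the actual beeline option name and behavior are defined by the patch):

```python
import os
import stat

def read_password_file(path):
    """Read a password from a file, refusing group/world-readable files,
    so the secret never appears on the command line or in `ps` output."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRGRP | stat.S_IROTH):
        raise PermissionError(f"{path} must not be group/world-readable")
    with open(path) as f:
        # Strip the trailing newline most editors add.
        return f.read().rstrip("\n")

# Usage sketch: create the file with 0600 permissions, then read it.
with open("hive_pw.txt", "w") as f:
    f.write("s3cret\n")
os.chmod("hive_pw.txt", 0o600)
print(read_password_file("hive_pw.txt"))  # s3cret
```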





Re: Review Request 30532: Provide password file option to beeline

2015-02-02 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30532/#review70661
---

Ship it!


Ship It!

- Thejas Nair


On Feb. 3, 2015, 12:11 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30532/
> ---
> 
> (Updated Feb. 3, 2015, 12:11 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-7175
> https://issues.apache.org/jira/browse/HIVE-7175
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-7175
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 630ead4 
>   beeline/src/main/resources/BeeLine.properties d038d46 
>   beeline/src/test/org/apache/hive/beeline/TestBeelineArgParsing.java a6ee93a 
> 
> Diff: https://reviews.apache.org/r/30532/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-02-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
-
Description: 
Given a table FOO(a int, b int, c int), ANSI SQL supports insert into FOO(c,b) 
select x,y from T.  The expectation is that 'x' is written to column 'c' and 
'y' is written to column 'b' and 'a' is set to NULL, assuming column 'a' is 
NULLABLE.

Hive does not support this.  In Hive one has to ensure that the data producing 
statement has a schema that matches target table schema.

Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
target schema is explicitly provided, missing columns will be set to NULL if 
they are NULLABLE, otherwise an error will be raised.

If/when DEFAULT clause is supported, this can be enhanced to set default value 
rather than NULL.

Thus, given {noformat}
create table source (a int, b int);
create table target (x int, y int, z int);
{noformat}
{noformat}insert into target(y,z) select * from source;{noformat}
will mean 
{noformat}insert into target select null as x, a, b from source;{noformat}
and 
{noformat}insert into target(z,y) select * from source;{noformat}
will mean 
{noformat}insert into target select null as x, b, a from source;{noformat}


  was:
Given a table FOO(a int, b int, c int), ANSI SQL supports insert into FOO(c,b) 
select x,y from T.  The expectation is that 'x' is written to column 'c' and 
'y' is written column 'b' and 'a' is set to NULL, assuming column 'a' is 
NULLABLE.

Hive does not support this.  In Hive one has to ensure that the data producing 
statement has a schema that matches target table schema.

Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
target schema is explicitly provided, missing columns will be set to NULL if 
they are NULLABLE, otherwise an error will be raised.

If/when DEFAULT clause is supported, this can be enhanced to set default value 
rather than NULL.


> allow column list specification in INSERT statement
> ---
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor, SQL
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written to column 'b' and 'a' is set to NULL, assuming column 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, b, a from source;{noformat}
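The column-mapping rule spelled out above can be sketched as a small rewrite function (an illustration of the described semantics, not Hive's planner code):

```python
def rewrite_select(target_schema, insert_columns, source_exprs):
    """Map source expressions onto the insert column list, padding every
    unmentioned target column with NULL, as the description specifies."""
    if len(insert_columns) != len(source_exprs):
        raise ValueError("column list and select list differ in length")
    by_column = dict(zip(insert_columns, source_exprs))
    return [by_column.get(col, "null") for col in target_schema]

# target (x int, y int, z int), source (a int, b int):
print(rewrite_select(["x", "y", "z"], ["y", "z"], ["a", "b"]))
# ['null', 'a', 'b']  -> insert into target select null, a, b from source
print(rewrite_select(["x", "y", "z"], ["z", "y"], ["a", "b"]))
# ['null', 'b', 'a']  -> insert into target select null, b, a from source
```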





[jira] [Updated] (HIVE-7175) Provide password file option to beeline

2015-02-02 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7175:
---
Attachment: HIVE-7175.1.patch

Patch based on trunk.

cc [~thejas]

> Provide password file option to beeline
> ---
>
> Key: HIVE-7175
> URL: https://issues.apache.org/jira/browse/HIVE-7175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Clients
>Affects Versions: 0.13.0
>Reporter: Robert Justice
>Assignee: Dr. Wendell Urth
>  Labels: features, security
> Fix For: 1.2.0
>
> Attachments: HIVE-7175.1.patch, HIVE-7175.patch
>
>
> For people connecting to Hive Server 2 with LDAP authentication enabled, in 
> order to batch run commands, we currently have to provide the password openly 
> in the command line.   They could use some expect scripting, but I think a 
> valid improvement would be to provide a password file option similar to other 
> CLI commands in hadoop (e.g. sqoop) to be more secure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 30532: Provide password file option to beeline

2015-02-02 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30532/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-7175
https://issues.apache.org/jira/browse/HIVE-7175


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-7175


Diffs
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 630ead4 
  beeline/src/main/resources/BeeLine.properties d038d46 
  beeline/src/test/org/apache/hive/beeline/TestBeelineArgParsing.java a6ee93a 

Diff: https://reviews.apache.org/r/30532/diff/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302505#comment-14302505
 ] 

Hive QA commented on HIVE-9500:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696001/HIVE-9500.1.patch

{color:red}ERROR:{color} -1 due to 154 failed/errored test(s), 7421 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ma
{noformat}

[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index

2015-02-02 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302507#comment-14302507
 ] 

Owen O'Malley commented on HIVE-9188:
-

Suggestions:
* Pick m to always be a multiple of 64 (since you are using longs as the 
representation)
* change the representation of BloomFilter in orc_proto to record the number of 
hash functions and not the size or fpp.
* use fixed64 for the bit field
* you'll also need to update the specification in the wiki with the change to 
the format 
(https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-orc-specORCFormatSpecification)
* revert the spurious change to CliDriver.java
* revert the spurious change to .gitignore
* it seems suboptimal to convert long values to bytes before hashing
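The first suggestion — sizing the bit array in whole 64-bit words — can be sketched with the standard bloom filter formulas (an illustration assuming the usual optimal-m/optimal-k derivation, not the code under review):

```python
import math

def bloom_parameters(n, fpp):
    """Return (m, k): the bit-array size m rounded up to a multiple of 64
    so it packs exactly into longs, and the hash-function count k."""
    # Optimal number of bits for n entries at false-positive probability fpp.
    m = math.ceil(-n * math.log(fpp) / (math.log(2) ** 2))
    m = ((m + 63) // 64) * 64          # round up to whole 64-bit words
    # Optimal number of hash functions for the chosen m.
    k = max(1, round(m / n * math.log(2)))
    return m, k

m, k = bloom_parameters(10000, 0.05)
print(m % 64)  # 0 -- m packs into longs with no wasted partial word
```

Recording only n-hash-functions (k) and the bit field in the protobuf, as suggested, is then enough: m is recoverable as 64 times the number of stored longs.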


> BloomFilter in ORC row group index
> --
>
> Key: HIVE-9188
> URL: https://issues.apache.org/jira/browse/HIVE-9188
> Project: Hive
>  Issue Type: New Feature
>  Components: File Formats
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: orcfile
> Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, 
> HIVE-9188.4.patch, HIVE-9188.5.patch, HIVE-9188.6.patch
>
>
> BloomFilters are well known probabilistic data structure for set membership 
> checking. We can use bloom filters in ORC index for better row group pruning. 
> Currently, ORC row group index uses min/max statistics to eliminate row 
> groups (stripes as well) that do not satisfy predicate condition specified in 
> the query. But in some cases, the efficiency of min/max based elimination is 
> not optimal (unsorted columns with wide range of entries). Bloom filters can 
> be an effective and efficient alternative for row group/split elimination for 
> point queries or queries with IN clause.





[jira] [Updated] (HIVE-9552) Merge trunk to Spark branch 2/2/2015 [Spark Branch]

2015-02-02 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9552:
--
Attachment: HIVE-9552.3-spark.patch

> Merge trunk to Spark branch 2/2/2015 [Spark Branch]
> ---
>
> Key: HIVE-9552
> URL: https://issues.apache.org/jira/browse/HIVE-9552
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9552.3-spark.patch
>
>






[jira] [Updated] (HIVE-9552) Merge trunk to Spark branch 2/2/2015 [Spark Branch]

2015-02-02 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9552:
--
Attachment: (was: HIVE-9552.1-spark.patch)

> Merge trunk to Spark branch 2/2/2015 [Spark Branch]
> ---
>
> Key: HIVE-9552
> URL: https://issues.apache.org/jira/browse/HIVE-9552
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9552.3-spark.patch
>
>






[jira] [Commented] (HIVE-9555) assorted ORC refactorings for LLAP on trunk

2015-02-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302337#comment-14302337
 ] 

Sergey Shelukhin commented on HIVE-9555:


[~prasanth_j] can you take a look? RB is broken for me, cannot upload yet

> assorted ORC refactorings for LLAP on trunk
> ---
>
> Key: HIVE-9555
> URL: https://issues.apache.org/jira/browse/HIVE-9555
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-9555.patch
>
>
> To minimize conflicts and given that ORC is being developed rapidly on trunk, 
> I would like to refactor some parts of ORC "in advance" based on the changes 
> in LLAP branch. Mostly it concerns making parts of ORC code (esp. SARG, but 
> also some internal methods) more modular and easier to use from alternative 
> codepaths. There's also significant change to how data reading is handled - 
> BufferChunk inherits from DiskRange; the reader receives a list of 
> DiskRange-s (as before), but instead of making a list of buffer chunks it 
> replaces ranges with buffer chunks in the original (linked) list. 





[jira] [Commented] (HIVE-9456) Make Hive support unicode with MSSQL as Metastore backend

2015-02-02 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302332#comment-14302332
 ] 

Brock Noland commented on HIVE-9456:


bq. 1.1->1.2 with this

yes this option makes sense

> Make Hive support unicode with MSSQL as Metastore backend
> -
>
> Key: HIVE-9456
> URL: https://issues.apache.org/jira/browse/HIVE-9456
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9456.1.patch
>
>
> There are significant issues when Hive uses MSSQL as metastore backend to 
> support unicode, since MSSQL handles varchar and nvarchar datatypes 
> differently. Hive 0.14 metastore mssql script DDL was using varchar as 
> datatype, which can't handle multi-bytes/unicode characters, e.g., Chinese 
> chars. This JIRA is going to track implementation of unicode support in that 
> case.





[jira] [Updated] (HIVE-9554) Rename 0.15 upgrade scripts to 1.1

2015-02-02 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9554:
---
Status: Patch Available  (was: Open)

> Rename 0.15 upgrade scripts to 1.1
> --
>
> Key: HIVE-9554
> URL: https://issues.apache.org/jira/browse/HIVE-9554
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: HIVE-9554.1.patch
>
>






[jira] [Updated] (HIVE-9555) assorted ORC refactorings for LLAP on trunk

2015-02-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9555:
---
Status: Patch Available  (was: Open)

> assorted ORC refactorings for LLAP on trunk
> ---
>
> Key: HIVE-9555
> URL: https://issues.apache.org/jira/browse/HIVE-9555
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-9555.patch
>
>
> To minimize conflicts and given that ORC is being developed rapidly on trunk, 
> I would like to refactor some parts of ORC "in advance" based on the changes 
> in LLAP branch. Mostly it concerns making parts of ORC code (esp. SARG, but 
> also some internal methods) more modular and easier to use from alternative 
> codepaths. There's also significant change to how data reading is handled - 
> BufferChunk inherits from DiskRange; the reader receives a list of 
> DiskRange-s (as before), but instead of making a list of buffer chunks it 
> replaces ranges with buffer chunks in the original (linked) list. 





[jira] [Commented] (HIVE-9554) Rename 0.15 upgrade scripts to 1.1

2015-02-02 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302329#comment-14302329
 ] 

Brock Noland commented on HIVE-9554:


Attached patch:

1) Updates hive.shortname in pom.xml
2) Renames scripts from 0.15 to 1.1
3) Updates the VERSION table to 1.1

> Rename 0.15 upgrade scripts to 1.1
> --
>
> Key: HIVE-9554
> URL: https://issues.apache.org/jira/browse/HIVE-9554
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: HIVE-9554.1.patch
>
>






[jira] [Updated] (HIVE-9555) assorted ORC refactorings for LLAP on trunk

2015-02-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9555:
---
Attachment: HIVE-9555.patch

Patch. It will also help find bugs in these changes...

> assorted ORC refactorings for LLAP on trunk
> ---
>
> Key: HIVE-9555
> URL: https://issues.apache.org/jira/browse/HIVE-9555
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-9555.patch
>
>
> To minimize conflicts and given that ORC is being developed rapidly on trunk, 
> I would like to refactor some parts of ORC "in advance" based on the changes 
> in LLAP branch. Mostly it concerns making parts of ORC code (esp. SARG, but 
> also some internal methods) more modular and easier to use from alternative 
> codepaths. There's also a significant change to how data reading is handled: 
> BufferChunk inherits from DiskRange; the reader receives a list of 
> DiskRange-s (as before), but instead of building a separate list of buffer 
> chunks it replaces ranges with buffer chunks in the original (linked) list. 





[jira] [Commented] (HIVE-9456) Make Hive support unicode with MSSQL as Metastore backend

2015-02-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302328#comment-14302328
 ] 

Thejas M Nair commented on HIVE-9456:
-

FYI, the 0.14 to 1.0 upgrade is a no-op, as the metastore schema did not change 
between those releases. I made this change in HIVE-9514.

> Make Hive support unicode with MSSQL as Metastore backend
> -
>
> Key: HIVE-9456
> URL: https://issues.apache.org/jira/browse/HIVE-9456
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9456.1.patch
>
>
> There are significant issues when Hive uses MSSQL as the metastore backend 
> with Unicode, since MSSQL handles the varchar and nvarchar datatypes 
> differently. The Hive 0.14 metastore MSSQL DDL script used varchar as the 
> datatype, which can't handle multi-byte/Unicode characters, e.g., Chinese 
> characters. This JIRA tracks implementation of Unicode support in that 
> case.





[jira] [Updated] (HIVE-9554) Rename 0.15 upgrade scripts to 1.1

2015-02-02 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9554:
---
Attachment: HIVE-9554.1.patch

> Rename 0.15 upgrade scripts to 1.1
> --
>
> Key: HIVE-9554
> URL: https://issues.apache.org/jira/browse/HIVE-9554
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: HIVE-9554.1.patch
>
>






[jira] [Created] (HIVE-9555) assorted ORC refactorings for LLAP on trunk

2015-02-02 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-9555:
--

 Summary: assorted ORC refactorings for LLAP on trunk
 Key: HIVE-9555
 URL: https://issues.apache.org/jira/browse/HIVE-9555
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


To minimize conflicts and given that ORC is being developed rapidly on trunk, I 
would like to refactor some parts of ORC "in advance" based on the changes in 
LLAP branch. Mostly it concerns making parts of ORC code (esp. SARG, but also 
some internal methods) more modular and easier to use from alternative 
codepaths. There's also a significant change to how data reading is handled: 
BufferChunk inherits from DiskRange; the reader receives a list of DiskRange-s 
(as before), but instead of building a separate list of buffer chunks it 
replaces ranges with buffer chunks in the original (linked) list. 
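The in-place replacement described above can be sketched with plain collections. This is an illustration only, with hypothetical simplified types; the real ORC classes carry offsets and buffers and live in the attached patch:

```java
import java.util.LinkedList;
import java.util.ListIterator;

// Hypothetical stand-ins for the ORC types named in the description.
class DiskRange {
    final long offset, end;
    DiskRange(long offset, long end) { this.offset = offset; this.end = end; }
}

class BufferChunk extends DiskRange {
    final byte[] data;
    BufferChunk(long offset, long end, byte[] data) {
        super(offset, end);
        this.data = data;
    }
}

public class ReplaceSketch {
    // Replace each plain DiskRange with a BufferChunk holding the bytes read
    // for that range, mutating the original linked list in place instead of
    // building a second list.
    static void fillRanges(LinkedList<DiskRange> ranges) {
        ListIterator<DiskRange> it = ranges.listIterator();
        while (it.hasNext()) {
            DiskRange r = it.next();
            byte[] data = new byte[(int) (r.end - r.offset)]; // stand-in read
            it.set(new BufferChunk(r.offset, r.end, data));   // in-place swap
        }
    }

    public static void main(String[] args) {
        LinkedList<DiskRange> ranges = new LinkedList<>();
        ranges.add(new DiskRange(0, 10));
        ranges.add(new DiskRange(10, 25));
        fillRanges(ranges);
        System.out.println(ranges.getFirst() instanceof BufferChunk); // true
    }
}
```

The point of the in-place swap is that callers holding the original list see the filled buffers without any list copying.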





[jira] [Commented] (HIVE-9456) Make Hive support unicode with MSSQL as Metastore backend

2015-02-02 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302324#comment-14302324
 ] 

Brock Noland commented on HIVE-9456:


Thank you very much. HIVE-9554 should go into trunk, and then this patch can be 
done on top of it. I will have a patch there shortly.

> Make Hive support unicode with MSSQL as Metastore backend
> -
>
> Key: HIVE-9456
> URL: https://issues.apache.org/jira/browse/HIVE-9456
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9456.1.patch
>
>
> There are significant issues when Hive uses MSSQL as the metastore backend 
> with Unicode, since MSSQL handles the varchar and nvarchar datatypes 
> differently. The Hive 0.14 metastore MSSQL DDL script used varchar as the 
> datatype, which can't handle multi-byte/Unicode characters, e.g., Chinese 
> characters. This JIRA tracks implementation of Unicode support in that 
> case.





[jira] [Created] (HIVE-9554) Rename 0.15 upgrade scripts to 1.1

2015-02-02 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9554:
--

 Summary: Rename 0.15 upgrade scripts to 1.1
 Key: HIVE-9554
 URL: https://issues.apache.org/jira/browse/HIVE-9554
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Blocker








[jira] [Updated] (HIVE-9554) Rename 0.15 upgrade scripts to 1.1

2015-02-02 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9554:
---
Affects Version/s: 1.1.0
   1.2.0
Fix Version/s: 1.1.0

> Rename 0.15 upgrade scripts to 1.1
> --
>
> Key: HIVE-9554
> URL: https://issues.apache.org/jira/browse/HIVE-9554
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.0, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Blocker
> Fix For: 1.1.0
>
>






Re: Review Request 30472: HIVE-9520 NEXT_DAY udf

2015-02-02 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30472/
---

(Updated Feb. 2, 2015, 11:33 p.m.)


Review request for hive, Jason Dere and Thejas Nair.


Bugs: HIVE-9520
https://issues.apache.org/jira/browse/HIVE-9520


Repository: hive-git


Description (updated)
---

Example
select next_day('2015-02-02','Fri') ...;
OK
2015-02-06
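The semantics of the example can be sketched with java.time. This is an illustration only; the actual UDF implementation is in the attached diff (GenericUDFNextDay.java):

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

public class NextDayDemo {
    // next_day(date, dow): the first date strictly after 'date' that falls
    // on the given day of the week.
    static LocalDate nextDay(String date, DayOfWeek dow) {
        return LocalDate.parse(date).with(TemporalAdjusters.next(dow));
    }

    public static void main(String[] args) {
        // 2015-02-02 is a Monday, so the next Friday is 2015-02-06.
        System.out.println(nextDay("2015-02-02", DayOfWeek.FRIDAY)); // 2015-02-06
    }
}
```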


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
23d77ca4cc2e2a44b62f62ddbd4826df092bcfe8 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFNextDay.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFNextDay.java 
PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_next_day_error_1.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_next_day_error_2.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_next_day_error_3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_next_day.q PRE-CREATION 
  ql/src/test/results/clientnegative/udf_next_day_error_1.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/udf_next_day_error_2.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/udf_next_day_error_3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 
36c8743a61c55a714352d358a5d9cc0deb4cef2c 
  ql/src/test/results/clientpositive/udf_next_day.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/30472/diff/


Testing
---


Thanks,

Alexander Pivovarov



Re: Review Request 30472: HIVE-9520 NEXT_DAY udf

2015-02-02 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30472/
---

(Updated Feb. 2, 2015, 11:31 p.m.)


Review request for hive, Jason Dere and Thejas Nair.


Bugs: HIVE-9520
https://issues.apache.org/jira/browse/HIVE-9520


Repository: hive-git


Description (updated)
---

Example
select next_day('2001-02-02','TUESDAY') ...;
OK
2001-02-06


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
23d77ca4cc2e2a44b62f62ddbd4826df092bcfe8 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFNextDay.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFNextDay.java 
PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_next_day_error_1.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_next_day_error_2.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_next_day_error_3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_next_day.q PRE-CREATION 
  ql/src/test/results/clientnegative/udf_next_day_error_1.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/udf_next_day_error_2.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/udf_next_day_error_3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 
36c8743a61c55a714352d358a5d9cc0deb4cef2c 
  ql/src/test/results/clientpositive/udf_next_day.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/30472/diff/


Testing
---


Thanks,

Alexander Pivovarov



[jira] [Commented] (HIVE-9456) Make Hive support unicode with MSSQL as Metastore backend

2015-02-02 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302308#comment-14302308
 ] 

Sushanth Sowmyan commented on HIVE-9456:


+1 to content, that looks reasonable to me.

There is one other thing to take care of though, before committing this. 
[~vikram.dixit]/[~brocknoland]/[~alangates] : given that it's not 0.15 anymore 
but 1.2, what's the plan for db upgrade scripts? Should we have a 0.14->1.0 
without this, a 1.0->1.1 without this, and 1.1->1.2 with this?

> Make Hive support unicode with MSSQL as Metastore backend
> -
>
> Key: HIVE-9456
> URL: https://issues.apache.org/jira/browse/HIVE-9456
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HIVE-9456.1.patch
>
>
> There are significant issues when Hive uses MSSQL as the metastore backend 
> with Unicode, since MSSQL handles the varchar and nvarchar datatypes 
> differently. The Hive 0.14 metastore MSSQL DDL script used varchar as the 
> datatype, which can't handle multi-byte/Unicode characters, e.g., Chinese 
> characters. This JIRA tracks implementation of Unicode support in that 
> case.





[jira] [Updated] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-02 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9350:

Status: Patch Available  (was: Open)

> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-9350.1.patch
>
>
> It should be possible for HiveAuthorizer implementations to control whether a 
> user is able to see a table or database in the results of 'show tables' and 
> 'show databases', respectively.





[jira] [Updated] (HIVE-9143) select user(), current_user()

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9143:
--
Attachment: HIVE-9143.3.patch

patch #3

> select user(), current_user()
> -
>
> Key: HIVE-9143
> URL: https://issues.apache.org/jira/browse/HIVE-9143
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Hari Sekhon
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch, HIVE-9143.3.patch
>
>
> Feature request to add support for determining, within an HQL session, which 
> user I am currently connected as - a long-standing MySQL ability:
> {code}mysql> select user(), current_user();
> +++
> | user() | current_user() |
> +++
> | root@localhost | root@localhost |
> +++
> 1 row in set (0.00 sec)
> {code}
> which doesn't seem to have a counterpart in Hive at this time:
> {code}0: jdbc:hive2://:100> select user();
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Invalid function 'user' (state=42000,code=4)
> 0: jdbc:hive2://:100> select current_user();
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10011]: Line 1:7 Invalid function 'current_user' 
> (state=42000,code=10011){code}
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon





[jira] [Commented] (HIVE-9529) "alter table .. concatenate" under Tez mode should create TezTask

2015-02-02 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302091#comment-14302091
 ] 

Gunther Hagleitner commented on HIVE-9529:
--

+1

> "alter table .. concatenate" under Tez mode should create TezTask
> -
>
> Key: HIVE-9529
> URL: https://issues.apache.org/jira/browse/HIVE-9529
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0, 1.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9529.1.patch, HIVE-9529.2.patch
>
>
> The "alter table .. concatenate" DDL command creates an MR task by default. 
> When the Hive CLI is launched with Tez as the execution engine, scheduling of 
> the MR task for file merging could be delayed until the Tez session expires. 
> This happens because YARN will not have the capacity to launch another 
> AppMaster for the MR task. We should create a Tez task to overcome this: when 
> the execution engine is Tez, a TezTask will be created; otherwise an MRTask 
> will be created.





[jira] [Updated] (HIVE-9143) select user(), current_user()

2015-02-02 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9143:
--
Status: In Progress  (was: Patch Available)

> select user(), current_user()
> -
>
> Key: HIVE-9143
> URL: https://issues.apache.org/jira/browse/HIVE-9143
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Hari Sekhon
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch
>
>
> Feature request to add support for determining, within an HQL session, which 
> user I am currently connected as - a long-standing MySQL ability:
> {code}mysql> select user(), current_user();
> +++
> | user() | current_user() |
> +++
> | root@localhost | root@localhost |
> +++
> 1 row in set (0.00 sec)
> {code}
> which doesn't seem to have a counterpart in Hive at this time:
> {code}0: jdbc:hive2://:100> select user();
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Invalid function 'user' (state=42000,code=4)
> 0: jdbc:hive2://:100> select current_user();
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10011]: Line 1:7 Invalid function 'current_user' 
> (state=42000,code=10011){code}
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon





Re: Review Request 30487: HIVE-9143 impl current_user() udf

2015-02-02 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30487/
---

(Updated Feb. 2, 2015, 10:42 p.m.)


Review request for hive.


Changes
---

fixed 4 issues found by Thejas


Bugs: HIVE-9143
https://issues.apache.org/jira/browse/HIVE-9143


Repository: hive-git


Description
---

HIVE-9143 impl current_user() udf


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
bfb4dc292b6b9f1ee342bb2e28b2afc722bb3167 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentUser.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_current_user.q PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 
e21b54bd6c3b5f51eb45c733bbe838fd78abe641 
  ql/src/test/results/clientpositive/udf_current_user.q.out PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/Utils.java 
c851dc2cb28876aef77811ead397429a2338cde4 

Diff: https://reviews.apache.org/r/30487/diff/


Testing
---


Thanks,

Alexander Pivovarov



[jira] [Commented] (HIVE-9550) ObjectStore.getNextNotification() can return events inside NotificationEventResponse as null which conflicts with its thrift "required" tag

2015-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302083#comment-14302083
 ] 

Hive QA commented on HIVE-9550:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12696000/HIVE-9550.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7420 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2618/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2618/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2618/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12696000 - PreCommit-HIVE-TRUNK-Build

> ObjectStore.getNextNotification() can return events inside 
> NotificationEventResponse as null which conflicts with its thrift "required" 
> tag
> ---
>
> Key: HIVE-9550
> URL: https://issues.apache.org/jira/browse/HIVE-9550
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9550.patch
>
>
> Per hive_metastore.thrift, the "events" list inside NotificationEventResponse 
> is a required field that cannot be null.
> {code}
> struct NotificationEventResponse {
> 1: required list<NotificationEvent> events,
> }
> {code}
> However, per ObjectStore.java, this events field can be uninitialized if the 
> events retrieved from the metastore is empty instead of null:
> {code}
>   NotificationEventResponse result = new NotificationEventResponse();
>   int maxEvents = rqst.getMaxEvents() > 0 ? rqst.getMaxEvents() : 
> Integer.MAX_VALUE;
>   int numEvents = 0; 
>   while (i.hasNext() && numEvents++ < maxEvents) {
> result.addToEvents(translateDbToThrift(i.next()));
>   }
>   return result;
> {code}
> The fix is simple enough - we need to call result.setEvents(new 
> ArrayList<NotificationEvent>()) before we begin the iteration that calls 
> result.addToEvents(...).
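A minimal sketch of that fix, using a hand-written stand-in for the thrift-generated class (hypothetical; the real NotificationEventResponse is generated from hive_metastore.thrift, and the real events are NotificationEvent objects rather than strings):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in mimicking the thrift-generated accessors: addToEvents() lazily
// creates the list, so with zero events the field would otherwise stay null.
class NotificationEventResponse {
    private List<String> events;

    void addToEvents(String e) {
        if (events == null) events = new ArrayList<>();
        events.add(e);
    }
    void setEvents(List<String> events) { this.events = events; }
    List<String> getEvents() { return events; }
}

public class FixSketch {
    static NotificationEventResponse build(Iterator<String> i, int maxEvents) {
        NotificationEventResponse result = new NotificationEventResponse();
        // The fix: initialize the required field up front so an empty result
        // carries an empty list rather than null.
        result.setEvents(new ArrayList<>());
        int numEvents = 0;
        while (i.hasNext() && numEvents++ < maxEvents) {
            result.addToEvents(i.next());
        }
        return result;
    }

    public static void main(String[] args) {
        NotificationEventResponse empty =
            build(java.util.Collections.<String>emptyIterator(), 10);
        System.out.println(empty.getEvents() != null); // true: field is set
    }
}
```

Without the setEvents() call, serializing the empty response would violate the "required" constraint in the thrift definition.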





[jira] [Updated] (HIVE-9517) UNION ALL query failed with ArrayIndexOutOfBoundsException [Spark Branch]

2015-02-02 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9517:
---
Status: Patch Available  (was: Open)

> UNION ALL query failed with ArrayIndexOutOfBoundsException [Spark Branch]
> -
>
> Key: HIVE-9517
> URL: https://issues.apache.org/jira/browse/HIVE-9517
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9517.1.patch
>
>
> I was running a query from cbo_gby_empty.q:
> {code}
> select unionsrc.key, unionsrc.value FROM (select 'max' as key, max(c_int) as 
> value from cbo_t3 s1
>   UNION  ALL
>   select 'min' as key,  min(c_int) as value from cbo_t3 s2
> UNION ALL
> select 'avg' as key,  avg(c_int) as value from cbo_t3 s3) unionsrc 
> order by unionsrc.key;
> {code}
> and got the following exception:
> {noformat}
> 2015-01-29 15:57:55,948 ERROR [Executor task launch worker-1]: 
> spark.SparkReduceRecordHandler 
> (SparkReduceRecordHandler.java:processRow(299)) - Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) {"key":{"reducesinkkey0":"max"},"value":{"_col0":1.5}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) {"key":{"reducesinkkey0":"max"},"value":{"_col0":1.5}}
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:339)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> VALUE._col0
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:330)
>   ... 17 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:84)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryDouble.init(LazyBinaryDouble.java:43)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
> {noformat}





[jira] [Updated] (HIVE-9517) UNION ALL query failed with ArrayIndexOutOfBoundsException [Spark Branch]

2015-02-02 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9517:
---
Attachment: HIVE-9517.1.patch

> UNION ALL query failed with ArrayIndexOutOfBoundsException [Spark Branch]
> -
>
> Key: HIVE-9517
> URL: https://issues.apache.org/jira/browse/HIVE-9517
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9517.1.patch
>
>
> I was running a query from cbo_gby_empty.q:
> {code}
> select unionsrc.key, unionsrc.value FROM (select 'max' as key, max(c_int) as 
> value from cbo_t3 s1
>   UNION  ALL
>   select 'min' as key,  min(c_int) as value from cbo_t3 s2
> UNION ALL
> select 'avg' as key,  avg(c_int) as value from cbo_t3 s3) unionsrc 
> order by unionsrc.key;
> {code}
> and got the following exception:
> {noformat}
> 2015-01-29 15:57:55,948 ERROR [Executor task launch worker-1]: 
> spark.SparkReduceRecordHandler 
> (SparkReduceRecordHandler.java:processRow(299)) - Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) {"key":{"reducesinkkey0":"max"},"value":{"_col0":1.5}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) {"key":{"reducesinkkey0":"max"},"value":{"_col0":1.5}}
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:339)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> VALUE._col0
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:330)
>   ... 17 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:84)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryDouble.init(LazyBinaryDouble.java:43)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
> {noformat}





[jira] [Updated] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-02 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9553:
---
Affects Version/s: 0.14.0

> Fix log-line in Partition Pruner
> 
>
> Key: HIVE-9553
> URL: https://issues.apache.org/jira/browse/HIVE-9553
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Trivial
> Attachments: HIVE-9553.1.patch
>
>
> Minor issue in logging the prune-expression in the PartitionPruner:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
> {code}
> Given the operator precedence order, this should read:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
> {code}




