[jira] [Commented] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151855#comment-15151855
 ] 

Prasanth Jayachandran commented on HIVE-13083:
--

The master patch is not exactly related, but the same test case failed with an 
NPE because HiveDecimalWritable can be null. The null check is implicit for 
HiveDecimal but not for the writable variant. 
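
As a minimal sketch of that distinction (assuming a HiveDecimalObjectInspector 
named inspector; the surrounding writer code is abbreviated):

{code}
// Implicit: the HiveDecimal conversion returns null for bad input, and the
// caller already branches on that result.
HiveDecimal dec = inspector.getPrimitiveJavaObject(obj);
if (dec != null) {
  // ... write dec ...
}

// Not implicit: the writable itself can be null, so it needs its own guard
// before any method call; otherwise we hit the NPE seen in the test case.
HiveDecimalWritable w = inspector.getPrimitiveWritableObject(obj);
if (w != null) {
  // ... write w ...
}
{code}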

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Yi Zhang
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch, HIVE-13083.1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}
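
A simplified sketch of the control flow described above (condensed from the 
linked WriterImpl code; method bodies are abbreviated):

{code}
void write(Object obj) throws IOException {
  super.write(obj);  // records "obj != null" in the isPresent stream
  if (obj != null) {
    HiveDecimal decimal =
        ((HiveDecimalObjectInspector) inspector).getPrimitiveJavaObject(obj);
    if (decimal == null) {
      // BUG: isPresent already says "present", but nothing is ever written
      // to the DATA stream, which therefore stays 0 bytes long.
      return;
    }
    // ... write decimal to the DATA and SCALE streams ...
  }
}
{code}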



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13015) Bundle Log4j2 jars with hive-exec

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-13015.
--
Resolution: Fixed

> Bundle Log4j2 jars with hive-exec
> -
>
> Key: HIVE-13015
> URL: https://issues.apache.org/jira/browse/HIVE-13015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.1.0
>
> Attachments: HIVE-13015.1.patch, HIVE-13015.1.patch, 
> HIVE-13015.2.patch, HIVE-13015.3.patch
>
>
> In some of the recent test runs, we are seeing multiple bindings for SLF4J 
> that cause issues with the Log4j2 logger. 
> {code}
> SLF4J: Found binding in 
> [jar:file:/grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1454694331819_0001/container_e06_1454694331819_0001_01_02/app/install/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> {code}
> We have added explicit exclusions for slf4j-log4j12, but some library is 
> pulling it in transitively and it is getting packaged with the Hive libs. 
> Hive also currently uses SLF4J version 1.7.5. We should add dependency 
> convergence for slf4j and also remove packaging of slf4j-log4j12.*.jar.
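
A quick way to confirm at runtime which binding actually won (a standalone 
diagnostic, not part of any patch here):

{code}
import org.slf4j.LoggerFactory;

public class WhichSlf4jBinding {
  public static void main(String[] args) {
    // Log4j2's binding reports org.apache.logging.slf4j.Log4jLoggerFactory;
    // a stray slf4j-log4j12 jar reports org.slf4j.impl.Log4j12LoggerFactory.
    System.out.println(LoggerFactory.getILoggerFactory().getClass().getName());
  }
}
{code}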



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151854#comment-15151854
 ] 

Gopal V commented on HIVE-13083:


I understand the branch-1 patch: basically, an invalid/empty string that 
parses to null will no longer be recorded in the isPresent stream as present.

The master patch needs more explanation for me.

{code}
+  } else {
+    vector[elementNum].set(hiveDec);
+  }
{code}

This definitely needs an isNull[elementNum] = false; (see HIVE-12827).
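
A sketch of the full assignment being asked for (field names follow the 
DecimalColumnVector convention; illustrative, not the committed patch):

{code}
if (hiveDec == null) {
  noNulls = false;
  isNull[elementNum] = true;
} else {
  // Column vectors are reused across batches, so a stale "true" left over
  // from a previous batch must be cleared explicitly (see HIVE-12827).
  isNull[elementNum] = false;
  vector[elementNum].set(hiveDec);
}
{code}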

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Yi Zhang
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch, HIVE-13083.1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13015) Bundle Log4j2 jars with hive-exec

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13015:
-
Fix Version/s: 2.1.0
   Status: In Progress  (was: Patch Available)

Committed to master. [~sershe] Can we get this in for 2.0.1?

> Bundle Log4j2 jars with hive-exec
> -
>
> Key: HIVE-13015
> URL: https://issues.apache.org/jira/browse/HIVE-13015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.1.0
>
> Attachments: HIVE-13015.1.patch, HIVE-13015.1.patch, 
> HIVE-13015.2.patch, HIVE-13015.3.patch
>
>
> In some of the recent test runs, we are seeing multiple bindings for SLF4J 
> that cause issues with the Log4j2 logger. 
> {code}
> SLF4J: Found binding in 
> [jar:file:/grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1454694331819_0001/container_e06_1454694331819_0001_01_02/app/install/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> {code}
> We have added explicit exclusions for slf4j-log4j12, but some library is 
> pulling it in transitively and it is getting packaged with the Hive libs. 
> Hive also currently uses SLF4J version 1.7.5. We should add dependency 
> convergence for slf4j and also remove packaging of slf4j-log4j12.*.jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13070) Precommit HMS tests should run in addition to precommit normal tests, not instead of

2016-02-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151837#comment-15151837
 ] 

Hive QA commented on HIVE-13070:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788230/HIVE-13070.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9775 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_stats_only_null
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_only_null
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7015/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7015/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7015/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788230 - PreCommit-HIVE-TRUNK-Build

> Precommit HMS tests should run in addition to precommit normal tests, not 
> instead of
> 
>
> Key: HIVE-13070
> URL: https://issues.apache.org/jira/browse/HIVE-13070
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13070.patch
>
>
> When a patch changes files in the metastore upgrade scripts folder, the 
> precommit HMS tests are triggered. The problem is that the precommit HMS run 
> marks the patch as tested, so the normal precommit tests are never triggered.
> I hit the issue while testing HIVE-12994.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13083:
-
Reporter: Yi Zhang  (was: Prasanth Jayachandran)

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Yi Zhang
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch, HIVE-13083.1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151827#comment-15151827
 ] 

Prasanth Jayachandran commented on HIVE-13083:
--

[~gopalv] Can you please review this patch? [~sershe] Can we get this for 2.0.1?

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch, HIVE-13083.1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13083:
-
Attachment: HIVE-13083.1.patch

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch, HIVE-13083.1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13083:
-
Status: Patch Available  (was: Open)

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.0.0, 0.14.0, 0.13.0, 1.3.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch, HIVE-13083.1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12064) prevent transactional=false

2016-02-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12064:
-
Attachment: HIVE-12064.5.patch

Thanks [~alangates] for the review. Made the two changes based on the comment.

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-12064.2.patch, HIVE-12064.3.patch, 
> HIVE-12064.4.patch, HIVE-12064.5.patch, HIVE-12064.patch
>
>
> Currently the table property transactional=true must be set to make a table 
> behave in an ACID-compliant way.
> This is misleading in that it suggests changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table is 
> different from that of plain tables, so changing this property may cause 
> wrong data to be returned.
> We should prevent setting transactional=false.
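
A minimal sketch of the guard this implies, as it might look in an 
alter-table validation path (names are illustrative, not the actual patch):

{code}
// Reject flipping an ACID table back to non-ACID: the on-disk layout is
// already ACID-specific, so reading it as a plain table returns wrong data.
String oldVal = oldTable.getParameters().get("transactional");
String newVal = newTable.getParameters().get("transactional");
if ("true".equalsIgnoreCase(oldVal) && !"true".equalsIgnoreCase(newVal)) {
  throw new MetaException(
      "TBLPROPERTIES with 'transactional'='true' cannot be unset");
}
{code}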



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12064) prevent transactional=false

2016-02-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12064:
-
Status: Patch Available  (was: Open)

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-12064.2.patch, HIVE-12064.3.patch, 
> HIVE-12064.4.patch, HIVE-12064.5.patch, HIVE-12064.patch
>
>
> Currently the table property transactional=true must be set to make a table 
> behave in an ACID-compliant way.
> This is misleading in that it suggests changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table is 
> different from that of plain tables, so changing this property may cause 
> wrong data to be returned.
> We should prevent setting transactional=false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12064) prevent transactional=false

2016-02-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12064:
-
Status: Open  (was: Patch Available)

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-12064.2.patch, HIVE-12064.3.patch, 
> HIVE-12064.4.patch, HIVE-12064.5.patch, HIVE-12064.patch
>
>
> Currently the table property transactional=true must be set to make a table 
> behave in an ACID-compliant way.
> This is misleading in that it suggests changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table is 
> different from that of plain tables, so changing this property may cause 
> wrong data to be returned.
> We should prevent setting transactional=false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13083:
-
Attachment: HIVE-13083-branch-1.patch

> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13083) Writing HiveDecimal to ORC can wrongly suppress present stream

2016-02-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151805#comment-15151805
 ] 

Prasanth Jayachandran commented on HIVE-13083:
--

[~owen.omalley] fyi..


> Writing HiveDecimal to ORC can wrongly suppress present stream
> --
>
> Key: HIVE-13083
> URL: https://issues.apache.org/jira/browse/HIVE-13083
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13083-branch-1.patch
>
>
> HIVE-3976 can cause ORC files to be unreadable. The changes introduced in 
> HIVE-3976 for DecimalTreeWriter can create null values after updating the 
> isPresent stream. 
> https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#L1337
> As a result of the above return statement, the isPresent stream state can 
> become wrong: the isPresent stream thinks all values are non-null and is 
> hence suppressed, but the data stream will have length 0. When reading such 
> files we get the following exception:
> {code}
> Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed 
> stream Stream for column 3 kind DATA position: 0 length: 0 range: 0 offset: 0 
> limit: 0
> at 
> org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
> at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13082) Enable constant propagation optimization in query with left semi join

2016-02-17 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-13082:
---
Status: Patch Available  (was: Open)

> Enable constant propagation optimization in query with left semi join
> -
>
> Key: HIVE-13082
> URL: https://issues.apache.org/jira/browse/HIVE-13082
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13082.patch
>
>
> Currently constant folding is only allowed for inner or unique join, I think 
> it is also applicable and allowed for left semi join. Otherwise the query 
> like following having multiple joins with left semi joins will fail:
> {code} 
> select table1.id, table1.val, table2.val2 from table1 inner join table2 on 
> table1.val = 't1val01' and table1.id = table2.id left semi join table3 on 
> table1.dimid = table3.id;
> {code}
> with errors:
> {code}
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.6.0.jar:?]
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45]
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45]
>   at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> {code}
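
A sketch of the kind of check the patch would relax (constant names follow 
org.apache.hadoop.hive.ql.plan.JoinDesc; the helper itself is illustrative, 
not the exact optimizer code):

{code}
// Folding a constant across a join is safe when the join type never emits
// padded-null rows for the folded side; left semi join qualifies alongside
// inner and unique joins.
private static boolean allowsConstantFolding(JoinCondDesc cond) {
  int type = cond.getType();
  return type == JoinDesc.INNER_JOIN
      || type == JoinDesc.UNIQUE_JOIN
      || type == JoinDesc.LEFT_SEMI_JOIN;
}
{code}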



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13015) Bundle Log4j2 jars with hive-exec

2016-02-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151757#comment-15151757
 ] 

Prasanth Jayachandran commented on HIVE-13015:
--

I see. Thanks for the explanation!

> Bundle Log4j2 jars with hive-exec
> -
>
> Key: HIVE-13015
> URL: https://issues.apache.org/jira/browse/HIVE-13015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Attachments: HIVE-13015.1.patch, HIVE-13015.1.patch, 
> HIVE-13015.2.patch, HIVE-13015.3.patch
>
>
> In some of the recent test runs, we are seeing multiple bindings for SLF4j 
> that causes issues with LOG4j2 logger. 
> {code}
> SLF4J: Found binding in 
> [jar:file:/grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1454694331819_0001/container_e06_1454694331819_0001_01_02/app/install/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> {code}
> We have added explicit exclusions for slf4j-log4j12 but some library is 
> pulling it transitively and it's getting packaged with hive libs. Also hive 
> currently uses version 1.7.5 for slf4j. We should add dependency convergence 
> for sl4fj and also remove packaging of slf4j-log4j12.*.jar 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13082) Enable constant propagation optimization in query with left semi join

2016-02-17 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-13082:
---
Attachment: HIVE-13082.patch

> Enable constant propagation optimization in query with left semi join
> -
>
> Key: HIVE-13082
> URL: https://issues.apache.org/jira/browse/HIVE-13082
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13082.patch
>
>
> Currently constant folding is only allowed for inner or unique join, I think 
> it is also applicable and allowed for left semi join. Otherwise the query 
> like following having multiple joins with left semi joins will fail:
> {code} 
> select table1.id, table1.val, table2.val2 from table1 inner join table2 on 
> table1.val = 't1val01' and table1.id = table2.id left semi join table3 on 
> table1.dimid = table3.id;
> {code}
> with errors:
> {code}
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.6.0.jar:?]
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45]
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45]
>   at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13015) Bundle Log4j2 jars with hive-exec

2016-02-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151754#comment-15151754
 ] 

Gopal V commented on HIVE-13015:


I thought so too, but transitive dependencies are pulled in for compilation, 
not for shading - they weren't being picked up by the hive-exec shader.

I was getting:

{code}
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/logging/log4j/spi/LoggerAdapter
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.newInstance(Class.java:412)
at 
org.apache.commons.logging.LogFactory.createFactory(LogFactory.java:1090)
at org.apache.commons.logging.LogFactory$2.run(LogFactory.java:1003)
at java.security.AccessController.doPrivileged(Native Method)
at 
org.apache.commons.logging.LogFactory.newFactory(LogFactory.java:1000)
at org.apache.commons.logging.LogFactory.getFactory(LogFactory.java:554)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
at 
org.apache.hadoop.service.AbstractService.<init>(AbstractService.java:43)
Caused by: java.lang.ClassNotFoundException: 
org.apache.logging.log4j.spi.LoggerAdapter
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 11 more
{code}

> Bundle Log4j2 jars with hive-exec
> -
>
> Key: HIVE-13015
> URL: https://issues.apache.org/jira/browse/HIVE-13015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Attachments: HIVE-13015.1.patch, HIVE-13015.1.patch, 
> HIVE-13015.2.patch, HIVE-13015.3.patch
>
>
> In some of the recent test runs, we are seeing multiple bindings for SLF4J 
> that cause issues with the Log4j2 logger. 
> {code}
> SLF4J: Found binding in 
> [jar:file:/grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1454694331819_0001/container_e06_1454694331819_0001_01_02/app/install/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> {code}
> We have added explicit exclusions for slf4j-log4j12, but some library is 
> pulling it in transitively and it is getting packaged with the Hive libs. 
> Hive also currently uses SLF4J version 1.7.5. We should add dependency 
> convergence for slf4j and also remove packaging of slf4j-log4j12.*.jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13077) LLAP: Scrub daemon-site.xml from client configs

2016-02-17 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151751#comment-15151751
 ] 

Siddharth Seth commented on HIVE-13077:
---

Looks good. +1. llap-daemon-site.xml isn't supposed to have any configs, and is 
not used in the AM.

> LLAP: Scrub daemon-site.xml from client configs
> ---
>
> Key: HIVE-13077
> URL: https://issues.apache.org/jira/browse/HIVE-13077
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13077.1.patch
>
>
> {code}
> if (llapMode) {
>   // add configs for llap-daemon-site.xml + localize llap jars
>   // they cannot be referred to directly as it would be a circular dependency
>   conf.addResource("llap-daemon-site.xml");
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13081) Support for DELETE/UPDATE... JOIN

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13081:
---
Component/s: Transactions

> Support for DELETE/UPDATE... JOIN
> -
>
> Key: HIVE-13081
> URL: https://issues.apache.org/jira/browse/HIVE-13081
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Fabian Tan
>Priority: Minor
>
> I noticed that DELETE ... JOIN does not appear to work with Hive 1.x.
> For example:
> hive> delete from t_bic_pel_order_acid LEFT OUTER JOIN t_bic_pel_order_poc7 
> ON (t_bic_pel_order_acid.order_no=t_bic_pel_order_poc7.order_no);
> FAILED: ParseException line 1:33 missing EOF at 'LEFT' near 
> 't_bic_pel_order_acid'
> Looking at Hive's documentation, the supported DELETE syntax appears to be 
> limited to DELETE FROM ..., with no indication that joins are supported.
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#Languag...
> I could not find any JIRA ticket or examples of other end users using this; 
> therefore, I'm creating this ticket as a feature request to support 
> DELETE/UPDATE ... JOIN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13079) LLAP: Allow reading log4j properties from default JAR resources

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151723#comment-15151723
 ] 

Sergey Shelukhin edited comment on HIVE-13079 at 2/18/16 4:15 AM:
--

I'm pretty sure I filed this JIRA before somewhere. +1 


was (Author: sershe):
I'm pretty sure I filed this JIRA before somewhere. +1 pending tests

> LLAP: Allow reading log4j properties from default JAR resources
> ---
>
> Key: HIVE-13079
> URL: https://issues.apache.org/jira/browse/HIVE-13079
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13079.1.patch
>
>
> If the log4j2 configuration is not overridden by the user, the Slider 
> package creation fails, since the config is generated from a URL.
> Allow the .properties file to be created from default JAR resources if the 
> user provides no overrides.
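
A minimal sketch of that fallback (the resource name and the override lookup 
are assumptions, not the actual patch):

{code}
import java.io.InputStream;
import java.net.URL;
import java.util.Properties;

// Prefer the user-supplied log4j2 properties; otherwise fall back to the
// default file packaged inside the jar itself.
URL config = userOverrideUrl;  // null when the user provided no override
if (config == null) {
  config = Thread.currentThread().getContextClassLoader()
      .getResource("llap-daemon-log4j2.properties");
}
Properties props = new Properties();
try (InputStream in = config.openStream()) {
  props.load(in);
}
{code}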



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13079) LLAP: Allow reading log4j properties from default JAR resources

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151723#comment-15151723
 ] 

Sergey Shelukhin commented on HIVE-13079:
-

I'm pretty sure I filed this JIRA before somewhere. +1 pending tests

> LLAP: Allow reading log4j properties from default JAR resources
> ---
>
> Key: HIVE-13079
> URL: https://issues.apache.org/jira/browse/HIVE-13079
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13079.1.patch
>
>
> If the log4j2 configuration is not overridden by the user, the Slider 
> package creation fails, since the config is generated from a URL.
> Allow the .properties file to be created from default JAR resources if the 
> user provides no overrides.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13015) Bundle Log4j2 jars with hive-exec

2016-02-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13015:
-
Assignee: Gopal V  (was: Prasanth Jayachandran)

> Bundle Log4j2 jars with hive-exec
> -
>
> Key: HIVE-13015
> URL: https://issues.apache.org/jira/browse/HIVE-13015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Attachments: HIVE-13015.1.patch, HIVE-13015.1.patch, 
> HIVE-13015.2.patch, HIVE-13015.3.patch
>
>
> In some of the recent test runs, we are seeing multiple bindings for SLF4J 
> that cause issues with the Log4j2 logger. 
> {code}
> SLF4J: Found binding in 
> [jar:file:/grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1454694331819_0001/container_e06_1454694331819_0001_01_02/app/install/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> {code}
> We have added explicit exclusions for slf4j-log4j12, but some library is 
> pulling it in transitively and it is getting packaged with the Hive libs. 
> Hive also currently uses SLF4J version 1.7.5. We should add dependency 
> convergence for slf4j and also remove packaging of slf4j-log4j12.*.jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13015) Bundle Log4j2 jars with hive-exec

2016-02-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151711#comment-15151711
 ] 

Hive QA commented on HIVE-13015:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788197/HIVE-13015.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9790 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-unionDistinct_1.q-insert_update_delete.q-selectDistinctStar.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7014/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7014/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7014/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788197 - PreCommit-HIVE-TRUNK-Build

> Bundle Log4j2 jars with hive-exec
> -
>
> Key: HIVE-13015
> URL: https://issues.apache.org/jira/browse/HIVE-13015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13015.1.patch, HIVE-13015.1.patch, 
> HIVE-13015.2.patch, HIVE-13015.3.patch
>
>
> In some of the recent test runs, we are seeing multiple bindings for SLF4J 
> that cause issues with the Log4j2 logger. 
> {code}
> SLF4J: Found binding in 
> [jar:file:/grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1454694331819_0001/container_e06_1454694331819_0001_01_02/app/install/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> {code}
> We have added explicit exclusions for slf4j-log4j12, but some library is 
> pulling it in transitively and it is getting packaged with the Hive libs. 
> Hive also currently uses SLF4J version 1.7.5. We should add dependency 
> convergence for slf4j and also remove packaging of slf4j-log4j12.*.jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12981) ThriftCLIService uses incompatible getShortName() implementation

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12981:

Affects Version/s: (was: 2.1.0)
   2.0.0

> ThriftCLIService uses incompatible getShortName() implementation
> 
>
> Key: HIVE-12981
> URL: https://issues.apache.org/jira/browse/HIVE-12981
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Authorization, CLI, Security
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bolke de Bruin
>Assignee: Bolke de Bruin
>Priority: Critical
>  Labels: kerberos
> Fix For: 1.3.0, 1.2.2, 2.1.0
>
> Attachments: 0001-HIVE-12981-Use-KerberosName.patch, 
> HIVE-12981-branch-1.2.patch, HIVE-12981.01.patch, HIVE-12981.patch
>
>
> ThriftCLIService has a local getShortName() implementation that assumes the 
> short name is always the part before "@" and "/". This is not always the 
> case, as Kerberos rules (from Hadoop's KerberosName) might actually 
> transform a name into something else.
> Considering a pending change to getShortName() (HADOOP-12751) and the normal 
> use of KerberosName in other parts of Hive, it only seems logical to use the 
> standard implementation.
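
For reference, a minimal sketch of the standard implementation being 
suggested (the Hadoop API is real; the wrapper method is illustrative):

{code}
import java.io.IOException;
import org.apache.hadoop.security.authentication.util.KerberosName;

// Applies the configured auth_to_local rules instead of naively splitting
// the principal on '@' and '/'.
static String getShortName(String principal) throws IOException {
  return new KerberosName(principal).getShortName();
}
{code}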



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13048) Rogue SQL statement in an upgrade SQL file for oracle.

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13048:

   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch!

> Rogue SQL statement in an upgrade SQL file for oracle.
> --
>
> Key: HIVE-13048
> URL: https://issues.apache.org/jira/browse/HIVE-13048
> Project: Hive
>  Issue Type: Bug
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.1.0
>
> Attachments: HIVE-13048.patch
>
>
> metastore/scripts/upgrade/oracle/033-HIVE-12892.oracle.sql has
>   VERSION_COMMENT VARCHAR(255) NOT NULL
> CREATE TABLE CHANGE_VERSION (
>   CHANGE_VERSION_ID NUMBER NOT NULL,
>   VERSION NUMBER NOT NULL,
>   TOPIC VARCHAR(255) NOT NULL
> );
> ALTER TABLE CHANGE_VERSION ADD CONSTRAINT CHANGE_VERSION_PK PRIMARY KEY 
> (CHANGE_VERSION_ID);
> CREATE UNIQUE INDEX UNIQUE_CHANGE_VERSION ON CHANGE_VERSION (TOPIC);
> The first line appears to be a typo and should not really be there. I noticed 
> that in the METASTORE-Test precommit builds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12981) ThriftCLIService uses incompatible getShortName() implementation

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12981:

   Resolution: Fixed
Fix Version/s: 2.1.0
   1.2.2
   1.3.0
   Status: Resolved  (was: Patch Available)

Committed everywhere. Thanks for the patch!

> ThriftCLIService uses incompatible getShortName() implementation
> 
>
> Key: HIVE-12981
> URL: https://issues.apache.org/jira/browse/HIVE-12981
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Authorization, CLI, Security
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Bolke de Bruin
>Assignee: Bolke de Bruin
>Priority: Critical
>  Labels: kerberos
> Fix For: 1.3.0, 1.2.2, 2.1.0
>
> Attachments: 0001-HIVE-12981-Use-KerberosName.patch, 
> HIVE-12981-branch-1.2.patch, HIVE-12981.01.patch, HIVE-12981.patch
>
>
> ThriftCLIService has a local getShortName() implementation that assumes the 
> short name is always the part before "@" and "/". This is not always the 
> case, as Kerberos rules (from Hadoop's KerberosName) might actually 
> transform a name into something else.
> Considering a pending change to getShortName() (HADOOP-12751) and the normal 
> use of KerberosName in other parts of Hive, it only seems logical to use the 
> standard implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12981) ThriftCLIService uses incompatible getShortName() implementation

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12981:

Assignee: Bolke de Bruin  (was: Thejas M Nair)

> ThriftCLIService uses incompatible getShortName() implementation
> 
>
> Key: HIVE-12981
> URL: https://issues.apache.org/jira/browse/HIVE-12981
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Authorization, CLI, Security
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Bolke de Bruin
>Assignee: Bolke de Bruin
>Priority: Critical
>  Labels: kerberos
> Attachments: 0001-HIVE-12981-Use-KerberosName.patch, 
> HIVE-12981-branch-1.2.patch, HIVE-12981.01.patch, HIVE-12981.patch
>
>
> ThriftCLIService has a local getShortName() implementation that assumes the 
> short name is always the part before "@" and "/". This is not always the 
> case, as Kerberos rules (from Hadoop's KerberosName) might actually 
> transform a name into something else.
> Considering a pending change to getShortName() (HADOOP-12751) and the normal 
> use of KerberosName in other parts of Hive, it only seems logical to use the 
> standard implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13063) Create UDFs for CHAR and REPLACE

2016-02-17 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Attachment: (was: HIVE-13063.patch)

> Create UDFs for CHAR and REPLACE 
> -
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.master.patch, Screen Shot 2016-02-17 at 
> 7.20.57 PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFs for these functions.
> CHAR: convert n, where n is in [0, 256), into the ASCII equivalent as a 
> varchar. If n is less than 0 or greater than 255, return the empty string. 
> If n is 0, return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example: SELECT REPLACE('Hack and Hue', 'H', 'BL');
> returns 'BLack and BLue'.
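
A minimal sketch of the CHAR semantics spelled out above (plain Java for 
clarity; the actual UDFChar presumably wraps equivalent logic in Hive's UDF 
plumbing):

{code}
// n < 0 or n > 255 -> empty string; n == 0 -> null; otherwise the character.
static String charUdf(double n) {
  int i = (int) n;             // e.g. 68.12 -> 68 -> "D"
  if (i < 0 || i > 255) {
    return "";                 // e.g. -1 or 32457964 -> ""
  }
  if (i == 0) {
    return null;
  }
  return String.valueOf((char) i);
}
{code}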



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13063) Create UDFs for CHAR and REPLACE

2016-02-17 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Attachment: HIVE-13063.master.patch

> Create UDFs for CHAR and REPLACE 
> -
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.master.patch, Screen Shot 2016-02-17 at 
> 7.20.57 PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFs for these functions.
> CHAR: convert n, where n is in [0, 256), into the ASCII equivalent as a 
> varchar. If n is less than 0 or greater than 255, return the empty string. 
> If n is 0, return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example: SELECT REPLACE('Hack and Hue', 'H', 'BL');
> returns 'BLack and BLue'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13063) Create UDFs for CHAR and REPLACE

2016-02-17 Thread Alejandro Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151678#comment-15151678
 ] 

Alejandro Fernandez commented on HIVE-13063:


Verified it worked with:

{code}
ADD JAR hdfs://c6401.ambari.apache.org:8020/tmp/hive-exec.jar;
CREATE TEMPORARY FUNCTION char_udf AS 'org.apache.hadoop.hive.ql.udf.UDFChar';
CREATE TEMPORARY FUNCTION replace_udf AS 
'org.apache.hadoop.hive.ql.udf.UDFReplace';

SHOW FUNCTIONS;

DESCRIBE FUNCTION char_udf;
DESCRIBE FUNCTION replace_udf;

select char_udf(-1), 
char_udf(0), 
char_udf(1), 
char_udf(48), 
char_udf(65), 
char_udf(68.12), 
char_udf(32457964);

select replace_udf('', '', ''), 
replace_udf(null, '', ''), 
replace_udf('', null, ''), 
replace_udf('', '', null), 
replace_udf('Hack and Hue', 'H', 'BL'), 
replace_udf('ABABrdvABrk', 'AB', 'a');
{code}

> Create UDFs for CHAR and REPLACE 
> -
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFs for these functions.
> CHAR: convert n, where n is in [0, 256), into the ASCII equivalent as a 
> varchar. If n is less than 0 or greater than 255, return the empty string. 
> If n is 0, return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example: SELECT REPLACE('Hack and Hue', 'H', 'BL');
> returns 'BLack and BLue'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13063) Create UDFs for CHAR and REPLACE

2016-02-17 Thread Alejandro Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated HIVE-13063:
---
Attachment: Screen Shot 2016-02-17 at 7.20.57 PM.png
Screen Shot 2016-02-17 at 7.21.07 PM.png

> Create UDFs for CHAR and REPLACE 
> -
>
> Key: HIVE-13063
> URL: https://issues.apache.org/jira/browse/HIVE-13063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Alejandro Fernandez
>Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: HIVE-13063.patch, Screen Shot 2016-02-17 at 7.20.57 
> PM.png, Screen Shot 2016-02-17 at 7.21.07 PM.png
>
>
> Create UDFs for these functions.
> CHAR: convert n, where n is in [0, 256), into the ASCII equivalent as a 
> varchar. If n is less than 0 or greater than 255, return the empty string. If 
> n is 0, return null.
> REPLACE: replace all substrings of 'str' that match 'search' with 'rep'.
> Example: SELECT REPLACE('Hack and Hue', 'H', 'BL');
> Equals 'BLack and BLue'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13029) NVDIMM support for LLAP Cache

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13029:
---
Priority: Critical  (was: Major)

> NVDIMM support for LLAP Cache
> -
>
> Key: HIVE-13029
> URL: https://issues.apache.org/jira/browse/HIVE-13029
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
>
> LLAP cache has been designed so that the cache can be offloaded easily to a 
> pmem API without restart coherence.
> The tricky part about NVDIMMs are restart coherence, while most of the cache 
> gains can be obtained without keeping state across refreshes, since LLAP is 
> not the system of record, HDFS is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10355) LLAP: randomness in random machine scheduling is not random enough

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151594#comment-15151594
 ] 

Sergey Shelukhin commented on HIVE-10355:
-

Isn't anything less random than using pure randomness? ;) 

> LLAP: randomness in random machine scheduling is not random enough
> --
>
> Key: HIVE-10355
> URL: https://issues.apache.org/jira/browse/HIVE-10355
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Gopal V
>
> Based on discussion; there's some skew towards 0th machine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10648) LLAP: registry; Tez attempted to schedule to daemon that didn't exist

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V resolved HIVE-10648.

Resolution: Not A Problem

> LLAP: registry; Tez attempted to schedule to daemon that didn't exist
> -
>
> Key: HIVE-10648
> URL: https://issues.apache.org/jira/browse/HIVE-10648
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Gopal V
>
> I can post logs externally; for now, the app IDs on the test cluster are 
> application_1429683757595_0784 and application_1429683757595_0783, and I also 
> have the logs copied over.
> AM found the node (same logs for other nodes):
> {noformat}
> 2015-05-07 12:13:28,074 INFO 
> [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] 
> impl.LlapYarnRegistryImpl: Adding new worker 
> 342f4992-2608-43ab-a119-b50882e35f75 which mapped to DynamicServiceInstance 
> [alive=true, host=cn059-10.l42scl.hortonworks.com:15001 with 
> resources=]
> 
> 2015-05-07 12:13:28,082 INFO [Dispatcher thread: Central] node.AMNodeTracker: 
> Num cluster nodes = 19
> {noformat}
> Trouble is, this node never actually existed... The cluster only had 15 
> nodes. 
> As the job was progressing, AM repeatedly tried to schedule to this node and 
> failed. There was no other LLAP cluster running at the same time.
> In fact, given that I always start a 15-node cluster I am not sure where 
> 19-node data could conceivably come from...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10355) LLAP: randomness in random machine scheduling is not random enough

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V resolved HIVE-10355.

Resolution: Won't Fix

We have made it much less random with consistent splits and cache affinity.

At this point, the best method is less random than using pure randomness.

> LLAP: randomness in random machine scheduling is not random enough
> --
>
> Key: HIVE-10355
> URL: https://issues.apache.org/jira/browse/HIVE-10355
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Gopal V
>
> Based on discussion; there's some skew towards 0th machine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11431) Vectorization: select * Left Semi Join projections NPE

2016-02-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151587#comment-15151587
 ] 

Matt McCline commented on HIVE-11431:
-

No progress. There is a work-around: specify the columns you want instead of 
using select *.
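
For instance, a hedged sketch of that work-around against the repro below (the 
column list is assumed from the predicate; the actual schema may differ):

{code}
-- work-around sketch: project explicit columns instead of "select *",
-- so the unprojected column "d" from tmp1 is never extracted
select tmp2.c0, tmp2.c1
from tmp2 left semi join tmp1 where c1 = id and c0 = q;
{code}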

> Vectorization: select * Left Semi Join projections NPE
> --
>
> Key: HIVE-11431
> URL: https://issues.apache.org/jira/browse/HIVE-11431
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: left-semi-bug.sql
>
>
> The "select *" is meant to only apply to the left most table, not the right 
> most - the unprojected "d" from tmp1 triggers this NPE.
> {code}
> select * from tmp2 left semi join tmp1 where c1 = id and c0 = q;
> {code}
> {code}
> Caused by: java.lang.NullPointerException
> at java.lang.System.arraycopy(Native Method)
> at org.apache.hadoop.io.Text.set(Text.java:225)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow$StringExtractorByValue.extract(VectorExtractRow.java:472)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:732)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:96)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:136)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:117)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13077) LLAP: Scrub daemon-site.xml from client configs

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13077:
---
Status: Patch Available  (was: Open)

> LLAP: Scrub daemon-site.xml from client configs
> ---
>
> Key: HIVE-13077
> URL: https://issues.apache.org/jira/browse/HIVE-13077
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13077.1.patch
>
>
> {code}
>  if (llapMode) {
>   // add configs for llap-daemon-site.xml + localize llap jars
>   // they cannot be referred to directly as it would be a circular 
> dependency
>   conf.addResource("llap-daemon-site.xml");
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13077) LLAP: Scrub daemon-site.xml from client configs

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13077:
---
Attachment: HIVE-13077.1.patch

> LLAP: Scrub daemon-site.xml from client configs
> ---
>
> Key: HIVE-13077
> URL: https://issues.apache.org/jira/browse/HIVE-13077
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13077.1.patch
>
>
> {code}
>  if (llapMode) {
>   // add configs for llap-daemon-site.xml + localize llap jars
>   // they cannot be referred to directly as it would be a circular 
> dependency
>   conf.addResource("llap-daemon-site.xml");
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13021) GenericUDAFEvaluator.isEstimable(agg) always returns false

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13021:
---
   Resolution: Fixed
Fix Version/s: 2.1.0
   1.3.0
 Release Note: GenericUDAFEvaluator.isEstimable(agg) always returns false 
(Gopal V, reviewed by Prasanth Jayachandran)
   Status: Resolved  (was: Patch Available)

Good catch [~Spring].

Pushed to master & branch-1.
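
For context, the retention problem in a standalone sketch (illustrative only, 
not the Hive source):

{code}
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class RetentionDemo {
  @Retention(RetentionPolicy.CLASS)  // kept in the .class file, invisible at run time
  @interface AggregationType { }

  @AggregationType
  static class Buffer { }

  public static void main(String[] args) {
    // Prints "null": CLASS retention cannot be read via reflection, so an
    // isEstimable()-style annotation check always falls through to false.
    // RetentionPolicy.RUNTIME makes the annotation visible at run time.
    System.out.println(Buffer.class.getAnnotation(AggregationType.class));
  }
}
{code}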

> GenericUDAFEvaluator.isEstimable(agg) always returns false
> --
>
> Key: HIVE-13021
> URL: https://issues.apache.org/jira/browse/HIVE-13021
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Sergey Zadoroshnyak
>Assignee: Gopal V
>Priority: Critical
>  Labels: Performance
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13021.1.patch
>
>
> GenericUDAFEvaluator.isEstimable(agg) always returns false, because the 
> annotation AggregationType has the default RetentionPolicy.CLASS and is not 
> retained by the VM at run time.
> As a result, the estimate method will never be executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11526) LLAP: implement LLAP UI as a separate service - part 1

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11526:
---
   Resolution: Fixed
Fix Version/s: 2.1.0
 Release Note: LLAP: implement LLAP UI as a separate service - part 1 (Yuya 
OZAWA, reviewed by Gopal V)
   Status: Resolved  (was: Patch Available)

> LLAP: implement LLAP UI as a separate service - part 1
> --
>
> Key: HIVE-11526
> URL: https://issues.apache.org/jira/browse/HIVE-11526
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Yuya OZAWA
> Fix For: 2.1.0
>
> Attachments: HIVE-11526.2.patch, HIVE-11526.3.patch, 
> HIVE-11526.patch, llap_monitor_design.pdf
>
>
> The specifics are vague at this point. 
> Hadoop metrics can be output, as well as metrics we collect and output in 
> jmx, as well as those we collect per fragment and log right now. 
> This service can do LLAP-specific views, and per-query aggregation.
> [~gopalv] may have some information on how to reuse existing solutions for 
> part of the work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11526) LLAP: implement LLAP UI as a separate service - part 1

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11526:
---
Summary: LLAP: implement LLAP UI as a separate service - part 1  (was: 
LLAP: implement LLAP UI as a separate service)

> LLAP: implement LLAP UI as a separate service - part 1
> --
>
> Key: HIVE-11526
> URL: https://issues.apache.org/jira/browse/HIVE-11526
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Yuya OZAWA
> Attachments: HIVE-11526.2.patch, HIVE-11526.3.patch, 
> HIVE-11526.patch, llap_monitor_design.pdf
>
>
> The specifics are vague at this point. 
> Hadoop metrics can be output, as well as metrics we collect and output in 
> jmx, as well as those we collect per fragment and log right now. 
> This service can do LLAP-specific views, and per-query aggregation.
> [~gopalv] may have some information on how to reuse existing solutions for 
> part of the work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13071) Impossible cast from LongColumnVector to TimestampColumnVector in VectorColumnAssignFactory.buildObjectAssign method

2016-02-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-13071:
---

Assignee: Matt McCline

> Impossible cast from LongColumnVector to TimestampColumnVector in 
> VectorColumnAssignFactory.buildObjectAssign method
> 
>
> Key: HIVE-13071
> URL: https://issues.apache.org/jira/browse/HIVE-13071
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Hai Xu
>Assignee: Matt McCline
> Attachments: HIVE-13071.patch
>
>
> In VectorColumnAssignFactory.buildObjectAssign, the cast from 
> LongColumnVector to TimestampColumnVector is impossible; it will always throw 
> a ClassCastException.
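
The failure mode in a standalone sketch (vector classes as in 
org.apache.hadoop.hive.ql.exec.vector):

{code}
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector;

public class CastDemo {
  public static void main(String[] args) {
    // LongColumnVector and TimestampColumnVector are sibling subclasses of
    // ColumnVector, so this downcast can never succeed at run time:
    ColumnVector cv = new LongColumnVector();
    TimestampColumnVector tcv = (TimestampColumnVector) cv;  // ClassCastException
  }
}
{code}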



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13079) LLAP: Allow reading log4j properties from default JAR resources

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13079:
---
Status: Patch Available  (was: Open)

> LLAP: Allow reading log4j properties from default JAR resources
> ---
>
> Key: HIVE-13079
> URL: https://issues.apache.org/jira/browse/HIVE-13079
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13079.1.patch
>
>
> If the log4j2 configuration is not overridden by the user, the Slider pkg 
> creation fails since the config is generated from a URL.
> Allow for the .properties file to be created from default JAR resources if 
> the user provides no overrides.
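
A hedged sketch of the intended fallback (the resource name and lookup are 
assumptions, not the actual patch):

{code}
// Hypothetical sketch: fall back to the log4j2 properties bundled in a JAR
// when the user supplies no override, instead of failing on a missing URL.
class Log4jFallbackSketch {
  static java.net.URL resolve(java.net.URL userOverride) {
    if (userOverride != null) {
      return userOverride;  // user-provided config wins
    }
    return Thread.currentThread().getContextClassLoader()
        .getResource("llap-daemon-log4j2.properties");  // assumed resource name
  }
}
{code}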



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13079) LLAP: Allow reading log4j properties from default JAR resources

2016-02-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13079:
---
Attachment: HIVE-13079.1.patch

> LLAP: Allow reading log4j properties from default JAR resources
> ---
>
> Key: HIVE-13079
> URL: https://issues.apache.org/jira/browse/HIVE-13079
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-13079.1.patch
>
>
> If the log4j2 configuration is not overridden by the user, the Slider pkg 
> creation fails since the config is generated from a URL.
> Allow for the .properties file to be created from default JAR resources if 
> the user provides no overrides.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13051:

Attachment: (was: HIVE-13501.patch)

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13501.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13051:

Attachment: HIVE-13501.patch

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13501.patch, HIVE-13501.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151549#comment-15151549
 ] 

Sergey Shelukhin commented on HIVE-13051:
-

Needs a small update.

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13501.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151542#comment-15151542
 ] 

Sergey Shelukhin commented on HIVE-13051:
-

The main issue is that a failure can apparently keep the old timer running, 
resulting in failures on some subsequent check from another operation. 
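
As a sketch of the monotonic-clock part of the fix (illustrative only, not the 
actual metastore Deadline class):

{code}
// Illustrative sketch; the real Deadline class differs.
class DeadlineSketch {
  private final long startNs = System.nanoTime();  // monotonic, unlike currentTimeMillis
  private final long timeoutNs;

  DeadlineSketch(long timeoutMs) {
    this.timeoutNs = timeoutMs * 1_000_000L;
  }

  boolean expired() {
    // Immune to wall-clock adjustments such as ntpd stepping the time.
    return System.nanoTime() - startNs > timeoutNs;
  }
  // Callers should also reset the deadline in a finally block, so a failed
  // operation cannot leave a stale timer running for the next one.
}
{code}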

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13501.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13051:

Status: Patch Available  (was: Open)

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13501.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13051:

Attachment: HIVE-13501.patch

For now, not fixing the config change, only the more crucial issues. 
[~prasanth_j] [~ashutoshc] can you take a look?

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13501.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11526) LLAP: implement LLAP UI as a separate service

2016-02-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151529#comment-15151529
 ] 

Hive QA commented on HIVE-11526:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788183/HIVE-11526.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9790 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthHttp.org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7013/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7013/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7013/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788183 - PreCommit-HIVE-TRUNK-Build

> LLAP: implement LLAP UI as a separate service
> -
>
> Key: HIVE-11526
> URL: https://issues.apache.org/jira/browse/HIVE-11526
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Yuya OZAWA
> Attachments: HIVE-11526.2.patch, HIVE-11526.3.patch, 
> HIVE-11526.patch, llap_monitor_design.pdf
>
>
> The specifics are vague at this point. 
> Hadoop metrics can be output, as well as metrics we collect and output in 
> jmx, as well as those we collect per fragment and log right now. 
> This service can do LLAP-specific views, and per-query aggregation.
> [~gopalv] may have some information on how to reuse existing solutions for 
> part of the work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13051:

Description: 
currentTimeMillis is not a correct way to measure intervals of time; it can 
easily be adjusted e.g. by ntpd. System.nanoTime should be used.
It's also unsafe for failure cases, and doesn't appear to update from config 
updates correctly.

  was:currentTimeMillis is not a correct way to measure intervals of time; it 
can easily be adjusted e.g. by ntpd. System.nanoTime should be used.


> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-13051:
---

Assignee: Sergey Shelukhin

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13051) Deadline class has numerous issues

2016-02-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13051:

Summary: Deadline class has numerous issues  (was: Deadline class should 
not use currentTimeMillis)

> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.

2016-02-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10632:
-
Status: Patch Available  (was: Open)

> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before 
> compaction.
> ---
>
> Key: HIVE-10632
> URL: https://issues.apache.org/jira/browse/HIVE-10632
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-10632.1.patch, HIVE-10632.2.patch, 
> HIVE-10632.3.patch, HIVE-10632.4.patch, HIVE-10632.5.patch
>
>
> The compaction process will clean up entries in  TXNS, 
> COMPLETED_TXN_COMPONENTS, TXN_COMPONENTS.  If the table/partition is dropped 
> before compaction is complete there will be data left in these tables.  Need 
> to investigate if there are other situations where this may happen and 
> address it.
> see HIVE-10595 for additional info



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.

2016-02-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10632:
-
Status: Open  (was: Patch Available)

> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before 
> compaction.
> ---
>
> Key: HIVE-10632
> URL: https://issues.apache.org/jira/browse/HIVE-10632
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-10632.1.patch, HIVE-10632.2.patch, 
> HIVE-10632.3.patch, HIVE-10632.4.patch, HIVE-10632.5.patch
>
>
> The compaction process will clean up entries in  TXNS, 
> COMPLETED_TXN_COMPONENTS, TXN_COMPONENTS.  If the table/partition is dropped 
> before compaction is complete there will be data left in these tables.  Need 
> to investigate if there are other situations where this may happen and 
> address it.
> see HIVE-10595 for additional info



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12808) Logical PPD: Push filter clauses through PTF(Windowing) into TS

2016-02-17 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-12808:
--
Status: Patch Available  (was: Open)

> Logical PPD: Push filter clauses through PTF(Windowing) into TS
> ---
>
> Key: HIVE-12808
> URL: https://issues.apache.org/jira/browse/HIVE-12808
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-12808.01.patch, HIVE-12808.02.patch, 
> HIVE-12808.03.patch, HIVE-12808.04.patch
>
>
> Simplified repro case of [HCC 
> #8880|https://community.hortonworks.com/questions/8880/hive-on-tez-pushdown-predicate-doesnt-work-in-part.html],
>  with the slow query showing the push-down miss. 
> And the manually rewritten query to indicate the expected one.
> Part of the problem could be the window range not being split apart for PPD, 
> but the FIL is not pushed down even if the rownum filter is removed.
> {code}
> create temporary table positions (regionid string, id bigint, deviceid 
> string, ts string);
> insert into positions values('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 
> 1422792010, '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-01'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-01'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-02'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-02');
> -- slow query
> explain
> WITH t1 AS 
> ( 
>  SELECT   *, 
>   Row_number() over ( PARTITION BY regionid, id, deviceid 
> ORDER BY ts DESC) AS rownos
>  FROM positions ), 
> latestposition as ( 
>SELECT * 
>FROM   t1 
>WHERE  rownos = 1) 
> SELECT * 
> FROM   latestposition 
> WHERE  regionid='1d6a0be1-6366-4692-9597-ebd5cd0f01d1' 
> AND    id=1422792010 
> AND    deviceid='6c5d1a30-2331-448b-a726-a380d6b3a432';
> -- fast query
> explain
> WITH t1 AS 
> ( 
>  SELECT   *, 
>   Row_number() over ( PARTITION BY regionid, id, deviceid 
> ORDER BY ts DESC) AS rownos
>  FROM positions 
>  WHERE  regionid='1d6a0be1-6366-4692-9597-ebd5cd0f01d1' 
>  AND    id=1422792010 
>  AND    deviceid='6c5d1a30-2331-448b-a726-a380d6b3a432'
> ),latestposition as ( 
>SELECT * 
>FROM   t1 
>WHERE  rownos = 1) 
> SELECT * 
> FROM   latestposition 
> ;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12808) Logical PPD: Push filter clauses through PTF(Windowing) into TS

2016-02-17 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-12808:
--
Attachment: HIVE-12808.04.patch

> Logical PPD: Push filter clauses through PTF(Windowing) into TS
> ---
>
> Key: HIVE-12808
> URL: https://issues.apache.org/jira/browse/HIVE-12808
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-12808.01.patch, HIVE-12808.02.patch, 
> HIVE-12808.03.patch, HIVE-12808.04.patch
>
>
> Simplified repro case of [HCC 
> #8880|https://community.hortonworks.com/questions/8880/hive-on-tez-pushdown-predicate-doesnt-work-in-part.html],
>  with the slow query showing the push-down miss. 
> And the manually rewritten query to indicate the expected one.
> Part of the problem could be the window range not being split apart for PPD, 
> but the FIL is not pushed down even if the rownum filter is removed.
> {code}
> create temporary table positions (regionid string, id bigint, deviceid 
> string, ts string);
> insert into positions values('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 
> 1422792010, '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-01'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-01'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-02'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-02');
> -- slow query
> explain
> WITH t1 AS 
> ( 
>  SELECT   *, 
>   Row_number() over ( PARTITION BY regionid, id, deviceid 
> ORDER BY ts DESC) AS rownos
>  FROM positions ), 
> latestposition as ( 
>SELECT * 
>FROM   t1 
>WHERE  rownos = 1) 
> SELECT * 
> FROM   latestposition 
> WHERE  regionid='1d6a0be1-6366-4692-9597-ebd5cd0f01d1' 
> AND    id=1422792010 
> AND    deviceid='6c5d1a30-2331-448b-a726-a380d6b3a432';
> -- fast query
> explain
> WITH t1 AS 
> ( 
>  SELECT   *, 
>   Row_number() over ( PARTITION BY regionid, id, deviceid 
> ORDER BY ts DESC) AS rownos
>  FROM positions 
>  WHERE  regionid='1d6a0be1-6366-4692-9597-ebd5cd0f01d1' 
>  AND    id=1422792010 
>  AND    deviceid='6c5d1a30-2331-448b-a726-a380d6b3a432'
> ),latestposition as ( 
>SELECT * 
>FROM   t1 
>WHERE  rownos = 1) 
> SELECT * 
> FROM   latestposition 
> ;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11526) LLAP: implement LLAP UI as a separate service

2016-02-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151437#comment-15151437
 ] 

Gopal V commented on HIVE-11526:


[~hagleitn]: BSD 3-clause (new) license is in the approved list of licenses - 
the old BSD license had an advertising clause, which is not.

http://www.apache.org/legal/resolved.html

> LLAP: implement LLAP UI as a separate service
> -
>
> Key: HIVE-11526
> URL: https://issues.apache.org/jira/browse/HIVE-11526
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Yuya OZAWA
> Attachments: HIVE-11526.2.patch, HIVE-11526.3.patch, 
> HIVE-11526.patch, llap_monitor_design.pdf
>
>
> The specifics are vague at this point. 
> Hadoop metrics can be output, as well as metrics we collect and output in 
> jmx, as well as those we collect per fragment and log right now. 
> This service can do LLAP-specific views, and per-query aggregation.
> [~gopalv] may have some information on how to reuse existing solutions for 
> part of the work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-02-17 Thread Austin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151421#comment-15151421
 ] 

Austin Lee commented on HIVE-12679:
---

I meant my original thinking was to put the logic to pick up the actual 
implementation in Hive.java, not in SessionHiveMetaStoreClient.java.

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Fix For: 1.2.1
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard-coded class name in Hive.java, and the QL component operates only on the 
> IMetaStoreClient interface, so the change would be minimal and quite similar 
> to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve is a user who wishes to 
> use an implementation of this interface without the dependency on the Thrift 
> server.
>   
> Thank you,
> Austin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-02-17 Thread Austin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151354#comment-15151354
 ] 

Austin Lee commented on HIVE-12679:
---

Thanks for your suggestion on the approach.  Instead of having 
SessionHiveMetaStoreClient directly extend HiveMetaStoreClient via inheritance, 
if I understand you correctly, we can accomplish what I am proposing via 
composition, i.e., by creating a new member of type IMetaStoreClient in 
SessionHiveMetaStoreClient and using HiveConf to determine its concrete 
implementation at runtime?  I was thinking of putting this logic in 
SessionHiveMetaStoreClient, but looking at the latest code in 2.1-snapshot, 
your approach might make more sense.

As for the use case that I have in mind, I am really after more flexibility, 
e.g. not having a dependency on Thrift, not having to run in embedded mode just 
to eliminate the Thrift dependency, etc.
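
A hedged sketch of that composition (the property name and constructor shape 
are assumptions, not the actual proposal):

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;

// Hypothetical sketch of HiveConf-driven selection; the property name
// "hive.metastore.client.impl" and the (HiveConf) constructor are assumed.
class ClientFactorySketch {
  static IMetaStoreClient create(HiveConf conf) throws Exception {
    String clsName = conf.get("hive.metastore.client.impl",
        SessionHiveMetaStoreClient.class.getName());
    return (IMetaStoreClient) Class.forName(clsName)
        .getConstructor(HiveConf.class)
        .newInstance(conf);
  }
}
{code}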

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Fix For: 1.2.1
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard-coded class name in Hive.java, and the QL component operates only on the 
> IMetaStoreClient interface, so the change would be minimal and quite similar 
> to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve is a user who wishes to 
> use an implementation of this interface without the dependency on the Thrift 
> server.
>   
> Thank you,
> Austin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-02-17 Thread Austin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Lee updated HIVE-12679:
--
Fix Version/s: 1.2.1

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Fix For: 1.2.1
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard-coded class name in Hive.java, and the QL component operates only on the 
> IMetaStoreClient interface, so the change would be minimal and quite similar 
> to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve is a user who wishes to 
> use an implementation of this interface without the dependency on the Thrift 
> server.
>   
> Thank you,
> Austin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-02-17 Thread Austin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Lee updated HIVE-12679:
--
Labels: metastore  (was: )

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Fix For: 1.2.1
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard-coded class name in Hive.java, and the QL component operates only on the 
> IMetaStoreClient interface, so the change would be minimal and quite similar 
> to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve is a user who wishes to 
> use an implementation of this interface without the dependency on the Thrift 
> server.
>   
> Thank you,
> Austin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12165) wrong result when hive.optimize.sampling.orderby=true with some aggregate functions

2016-02-17 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12165:

Affects Version/s: 2.1.0
   Status: Patch Available  (was: Open)

> wrong result when hive.optimize.sampling.orderby=true with some aggregate 
> functions
> ---
>
> Key: HIVE-12165
> URL: https://issues.apache.org/jira/browse/HIVE-12165
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
> Environment: hortonworks  2.3
>Reporter: ErwanMAS
>Assignee: Aihua Xu
>Priority: Critical
> Attachments: HIVE-12165.patch
>
>
> This simple query gives a wrong result when I use the parallel order.
> {noformat}
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> {noformat}
> Current wrong result:
> {noformat}
> c0      c1      c2      c3
> 32740   32740   0       163695
> 113172  113172  163700  729555
> 54088   54088   729560  95
> {noformat}
> Right result:
> {noformat}
> c0      c1      c2      c3
> 100     100     0       99
> {noformat}
> The sql script for my test 
> {noformat}
> drop table foobar_1 ;
> create table foobar_1 ( dummyint int  , dummystr string ) ;
> insert into table foobar_1 select count(*),'dummy 0'  from foobar_1 ;
> drop table foobar_1M ;
> create table foobar_1M ( dummyint bigint  , dummystr string ) ;
> insert overwrite table foobar_1M
>select val_int  , concat('dummy ',val_int) from
>  ( select (((((d_1*10)+d_2)*10+d_3)*10+d_4)*10+d_5)*10+d_6 as 
> val_int from foobar_1
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_1 as d_1
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_2 as d_2
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_3 as d_3
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_4 as d_4
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_5 as d_5
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_6 as d_6  ) as f ;
> set hive.optimize.sampling.orderby.number=1;
> set hive.optimize.sampling.orderby.percent=0.1f;
> set mapreduce.job.reduces=3 ;
> set hive.optimize.sampling.orderby=false;
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> set hive.optimize.sampling.orderby=true;
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> {noformat}
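
For what it's worth, a hedged reading of the wrong output above: the three rows 
line up with mapreduce.job.reduces=3, which suggests each reducer aggregated 
only its own range partition. Until the fix, the script's own guard works 
around it for global aggregates:

{code}
-- workaround sketch (from the repro itself): force the non-sampled plan
-- when a single global aggregate is wanted
set hive.optimize.sampling.orderby=false;
select count(*), count(distinct dummyint), min(dummyint), max(dummyint)
from foobar_1M;
{code}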



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12165) wrong result when hive.optimize.sampling.orderby=true with some aggregate functions

2016-02-17 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12165:

Attachment: HIVE-12165.patch

> wrong result when hive.optimize.sampling.orderby=true with some aggregate 
> functions
> ---
>
> Key: HIVE-12165
> URL: https://issues.apache.org/jira/browse/HIVE-12165
> Project: Hive
>  Issue Type: Bug
> Environment: hortonworks  2.3
>Reporter: ErwanMAS
>Assignee: Aihua Xu
>Priority: Critical
> Attachments: HIVE-12165.patch
>
>
> This simple query gives a wrong result when I use the parallel order.
> {noformat}
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> {noformat}
> Current wrong result:
> {noformat}
> c0      c1      c2      c3
> 32740   32740   0       163695
> 113172  113172  163700  729555
> 54088   54088   729560  95
> {noformat}
> Right result:
> {noformat}
> c0      c1      c2      c3
> 100     100     0       99
> {noformat}
> The sql script for my test 
> {noformat}
> drop table foobar_1 ;
> create table foobar_1 ( dummyint int  , dummystr string ) ;
> insert into table foobar_1 select count(*),'dummy 0'  from foobar_1 ;
> drop table foobar_1M ;
> create table foobar_1M ( dummyint bigint  , dummystr string ) ;
> insert overwrite table foobar_1M
>select val_int  , concat('dummy ',val_int) from
>  ( select (((((d_1*10)+d_2)*10+d_3)*10+d_4)*10+d_5)*10+d_6 as 
> val_int from foobar_1
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_1 as d_1
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_2 as d_2
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_3 as d_3
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_4 as d_4
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_5 as d_5
>  lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
> tbl_6 as d_6  ) as f ;
> set hive.optimize.sampling.orderby.number=1;
> set hive.optimize.sampling.orderby.percent=0.1f;
> set mapreduce.job.reduces=3 ;
> set hive.optimize.sampling.orderby=false;
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> set hive.optimize.sampling.orderby=true;
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12927) HBase metastore: sequences should be one per row, not all in one row

2016-02-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151284#comment-15151284
 ] 

Hive QA commented on HIVE-12927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788163/HIVE-12927.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9805 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7012/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7012/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788163 - PreCommit-HIVE-TRUNK-Build

> HBase metastore: sequences should be one per row, not all in one row
> 
>
> Key: HIVE-12927
> URL: https://issues.apache.org/jira/browse/HIVE-12927
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>Priority: Critical
> Attachments: HIVE-12927.2.patch, HIVE-12927.patch
>
>
> {noformat}
>   long getNextSequence(byte[] sequence) throws IOException {
> {noformat}
> Is not safe in the presence of any concurrency. It should use the HBase 
> increment API.
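
A sketch of the increment-based alternative (table, family, and qualifier names 
are made up; the HBase API call itself is real):

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch of a one-sequence-per-row counter using HBase's
// atomic, server-side increment; table/family/qualifier names are made up.
class SequenceSketch {
  static long nextSequence(Connection conn, byte[] sequenceRow)
      throws IOException {
    try (Table seq = conn.getTable(TableName.valueOf("HBMS_SEQUENCES"))) {
      return seq.incrementColumnValue(
          sequenceRow,          // one row per sequence, not all in one row
          Bytes.toBytes("f"),
          Bytes.toBytes("seq"),
          1L);                  // safe under concurrency
    }
  }
}
{code}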



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13027) Async loggers for LLAP

2016-02-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151212#comment-15151212
 ] 

Prasanth Jayachandran commented on HIVE-13027:
--

The automatic log4j2 configurator will look for the file specified via 
-Dlog4j.configurationFile. If this file cannot be located in the classpath, it 
will default to ERROR,console. 

Programmatic initialization of logging happens in CliDriver and when running 
the metastore as a service. Programmatic initialization looks for 
hive-log4j2.properties. mr/ExecDriver, which initializes the logging for the 
child JVM, uses hive-exec-log4j2.properties. I suspect the initialization is 
using hive-exec-log4j2.properties, as it is an MR-only code path. 
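
For reference, both knobs are plain log4j2 mechanisms (general log4j2 usage, 
not necessarily what this patch changes):

{noformat}
# pick the configuration file explicitly
-Dlog4j.configurationFile=hive-log4j2.properties
# make all loggers asynchronous via the context selector
-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
{noformat}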

> Async loggers for LLAP
> --
>
> Key: HIVE-13027
> URL: https://issues.apache.org/jira/browse/HIVE-13027
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13027.1.patch
>
>
> LOG4j2's async logger claims to have 6-68 times better performance than 
> synchronous logger. https://logging.apache.org/log4j/2.x/manual/async.html
> We should use that for LLAP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13077) LLAP: Scrub daemon-site.xml from client configs

2016-02-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151172#comment-15151172
 ] 

Gopal V commented on HIVE-13077:


hive-site.xml is shared between them.

After the last set of ServiceDriver patches, llap-daemon-site.xml no longer 
exists on any client machine, but only inside the slider PKG.

So without this fix, queries will fail in LLAP mode.

> LLAP: Scrub daemon-site.xml from client configs
> ---
>
> Key: HIVE-13077
> URL: https://issues.apache.org/jira/browse/HIVE-13077
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>
> {code}
>  if (llapMode) {
>   // add configs for llap-daemon-site.xml + localize llap jars
>   // they cannot be referred to directly as it would be a circular 
> dependency
>   conf.addResource("llap-daemon-site.xml");
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-17 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151171#comment-15151171
 ] 

Sergio Peña commented on HIVE-13039:


Never mind. I will revert the patch and re-commit it.
Is it only for branch-1?

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, 
> HIVE-13039.2.branch-1.txt, HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.
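
Until the fix is picked up, a hedged workaround sketch (BETWEEN is inclusive on 
both endpoints, so the explicit bounds are equivalent and may avoid the 
affected pushdown conversion):

{code}
select * from parquet_tbl
where ldate >= '2016-02-03' and ldate <= '2016-02-03';
{code}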



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-17 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151170#comment-15151170
 ] 

Sergio Peña commented on HIVE-13039:


Thanks.
Could you create another JIRA for the new failure? I already committed the 
changes :P.

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, 
> HIVE-13039.2.branch-1.txt, HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13077) LLAP: Scrub daemon-site.xml from client configs

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151163#comment-15151163
 ] 

Sergey Shelukhin commented on HIVE-13077:
-

Hmm. Some configs, for example the ones allowing UDFs (and maybe others), are 
read both on the client and on the server. How will this work?

> LLAP: Scrub daemon-site.xml from client configs
> ---
>
> Key: HIVE-13077
> URL: https://issues.apache.org/jira/browse/HIVE-13077
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>
> {code}
>  if (llapMode) {
>   // add configs for llap-daemon-site.xml + localize llap jars
>   // they cannot be referred to directly as it would be a circular 
> depedency
>   conf.addResource("llap-daemon-site.xml");
> {code}
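
For what it's worth, a minimal sketch of what client-side scrubbing could look 
like, assuming it is done purely by key prefix (the {{hive.llap.daemon.}} 
prefix and the prefix-based approach itself are assumptions, not the committed 
design; keys read on both client and server would have to be kept out of it):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

public class ScrubSketch {
  // Drop daemon-only keys from the client Configuration by prefix.
  static void scrubDaemonKeys(Configuration conf) {
    List<String> toDrop = new ArrayList<>();
    for (Map.Entry<String, String> e : conf) {
      if (e.getKey().startsWith("hive.llap.daemon.")) {  // assumed prefix
        toDrop.add(e.getKey());
      }
    }
    for (String key : toDrop) {
      conf.unset(key);  // remove after iterating to avoid mutating mid-loop
    }
  }
}
{code}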



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-17 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13039:

Attachment: HIVE-13039.2.branch-1.txt

Thanks [~spena], I found a test failure which occurs in branch-1 but not in master.
I committed the fix for that test to branch-1 too. 

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, 
> HIVE-13039.2.branch-1.txt, HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13040) Handle empty bucket creations more efficiently

2016-02-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151108#comment-15151108
 ] 

Hive QA commented on HIVE-13040:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788264/HIVE-13040.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 9745 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_const
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_neg_float
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_explode2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_partition_metadataonly
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataOnlyOptimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_min_structvalue
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partInit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_smb_cache
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union_fast_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_partition_metadataonly
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversionAndMajorCompaction
org.apache.hadoop.hive.ql.io.TestAcidUtils.testObsoleteOriginals
org.apache.hadoop.hive.ql.io.TestAcidUtils.testOriginal
org.apache.hadoop.hive.ql.io.TestAcidUtils.testOriginalDeltas
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testEtlCombinedStrategy
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testFileGenerator
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testSplitGenFailure
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7011/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7011/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7011/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 42 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788264 - PreCommit-HIVE-TRUNK-Build

> Handle empty bucket creations more efficiently 
> ---
>
> Key: HIVE-13040
> URL: https://issues.apache.org/jira/browse/HIVE-13040
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.0.0, 

[jira] [Commented] (HIVE-13065) Hive throws NPE when writing map type data to a HBase backed table

2016-02-17 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151100#comment-15151100
 ] 

Aihua Xu commented on HIVE-13065:
-

The patch looks good. +1.

> Hive throws NPE when writing map type data to a HBase backed table
> --
>
> Key: HIVE-13065
> URL: https://issues.apache.org/jira/browse/HIVE-13065
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13065.1.patch
>
>
> Hive throws NPE when writing data to a HBase backed table with below 
> conditions:
> # There is a map type column
> # The map type column has NULL in its values
> Below are the reproduce steps:
> *1) Create a HBase backed Hive table*
> {code:sql}
> create table hbase_test (id bigint, data map<string,string>)
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> with serdeproperties ("hbase.columns.mapping" = ":key,cf:map_col")
> tblproperties ("hbase.table.name" = "hive_test");
> {code}
> *2) insert data into above table*
> {code:sql}
> insert overwrite table hbase_test select 1 as id, map('abcd', null) as data 
> from src limit 1;
> {code}
> The mapreduce job for insert query fails. Error messages are as below:
> {noformat}
> 2016-02-15 02:26:33,225 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) {"key":{},"value":{"_col0":1,"_col1":{"abcd":null}}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:265)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{},"value":{"_col0":1,"_col1":{"abcd":null}}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:731)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:666)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:221)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:236)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:275)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:222)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
>   at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
>   ... 15 more
> {noformat}
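
A standalone sketch of the kind of guard that avoids this NPE (not Hive's 
actual HBaseRowSerializer; the {{\N}} null marker mirrors Hive's default serde 
null sequence, which is an assumption here): when a map value is null, write 
the null sequence instead of dereferencing the value.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapNullGuardSketch {
  private static final byte[] NULL_SEQ = "\\N".getBytes(StandardCharsets.UTF_8);

  static byte[] serializeMap(Map<String, String> data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    boolean first = true;
    for (Map.Entry<String, String> e : data.entrySet()) {
      if (!first) {
        out.write(',');
      }
      first = false;
      out.write(e.getKey().getBytes(StandardCharsets.UTF_8));
      out.write(':');
      String value = e.getValue();
      if (value == null) {
        out.write(NULL_SEQ);  // the guard: never dereference a null value
      } else {
        out.write(value.getBytes(StandardCharsets.UTF_8));
      }
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    Map<String, String> m = new LinkedHashMap<>();
    m.put("abcd", null);  // the failing case from the report
    System.out.println(new String(serializeMap(m), StandardCharsets.UTF_8));  // abcd:\N
  }
}
{code}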



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13040) Handle empty bucket creations more efficiently

2016-02-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13040:

Attachment: HIVE-13040.4.patch

> Handle empty bucket creations more efficiently 
> ---
>
> Key: HIVE-13040
> URL: https://issues.apache.org/jira/browse/HIVE-13040
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13040.2.patch, HIVE-13040.3.patch, 
> HIVE-13040.4.patch, HIVE-13040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13040) Handle empty bucket creations more efficiently

2016-02-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13040:

Status: Open  (was: Patch Available)

> Handle empty bucket creations more efficiently 
> ---
>
> Key: HIVE-13040
> URL: https://issues.apache.org/jira/browse/HIVE-13040
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.1.0, 1.2.0, 1.0.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13040.2.patch, HIVE-13040.3.patch, 
> HIVE-13040.4.patch, HIVE-13040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13076) Implement FK/PK "rely novalidate" constraints for better CBO

2016-02-17 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan reassigned HIVE-13076:


Assignee: Hari Sankar Sivarama Subramaniyan

> Implement FK/PK "rely novalidate" constraints for better CBO
> 
>
> Key: HIVE-13076
> URL: https://issues.apache.org/jira/browse/HIVE-13076
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Reporter: Ruslan Dautkhanov
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> Oracle has "RELY NOVALIDATE" option for constraints.. Could be easier for 
> Hive to start with something like that for PK/FK constraints. So CBO has more 
> information for optimizations. It does not have to actually check if that 
> constraint is relationship is true; it can just "rely" on that constraint.. 
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like 
> this - https://issues.apache.org/jira/browse/HIVE-13019



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12927) HBase metastore: sequences should be one per row, not all in one row

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151050#comment-15151050
 ] 

Sergey Shelukhin commented on HIVE-12927:
-

+1

> HBase metastore: sequences should be one per row, not all in one row
> 
>
> Key: HIVE-12927
> URL: https://issues.apache.org/jira/browse/HIVE-12927
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>Priority: Critical
> Attachments: HIVE-12927.2.patch, HIVE-12927.patch
>
>
> {noformat}
>   long getNextSequence(byte[] sequence) throws IOException {
> {noformat}
> It is not safe in the presence of any concurrency. It should use the HBase 
> increment API.
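
A minimal sketch of the increment-based version (the column family and 
qualifier names below are made up for illustration; the layout is one row per 
sequence, keyed by the sequence name):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SequenceSketch {
  // HBase performs the increment server-side, so two metastores can never
  // hand out the same value, unlike a read-modify-write in client code.
  static long getNextSequence(Table sequencesTable, byte[] sequence) throws IOException {
    return sequencesTable.incrementColumnValue(
        sequence,               // row key: one row per sequence
        Bytes.toBytes("c"),     // column family (hypothetical)
        Bytes.toBytes("next"),  // qualifier (hypothetical)
        1L);                    // atomic server-side +1
  }
}
{code}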



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler

2016-02-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151017#comment-15151017
 ] 

Alan Gates commented on HIVE-13013:
---

+1, looks good.

> Further Improve concurrency in TxnHandler
> -
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13013.patch
>
>
> There are still a few operations in TxnHandler that run at Serializable 
> isolation.
> Most or all of them can be dropped to READ_COMMITTED now that we have SELECT 
> ... FOR UPDATE support. This will reduce the number of deadlocks in the DBs.
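
A minimal sketch of the pattern (the exact statement is illustrative, though 
NEXT_TXN_ID/ntxn_next do exist in the metastore schema): row-level locking via 
SELECT ... FOR UPDATE lets the surrounding transaction run at READ_COMMITTED 
instead of SERIALIZABLE.

{code:java}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class TxnHandlerSketch {
  static long readNextTxnId(Connection dbConn) throws Exception {
    dbConn.setAutoCommit(false);
    // READ_COMMITTED is enough once the row itself is locked.
    dbConn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
    try (Statement stmt = dbConn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT ntxn_next FROM NEXT_TXN_ID FOR UPDATE")) {  // locks the row
      rs.next();
      return rs.getLong(1);  // caller updates the row and commits
    }
  }
}
{code}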



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12981) ThriftCLIService uses incompatible getShortName() implementation

2016-02-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151011#comment-15151011
 ] 

Sergey Shelukhin commented on HIVE-12981:
-

Sorry, will commit shortly, I've been a little bit busy.

> ThriftCLIService uses incompatible getShortName() implementation
> 
>
> Key: HIVE-12981
> URL: https://issues.apache.org/jira/browse/HIVE-12981
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Authorization, CLI, Security
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Bolke de Bruin
>Assignee: Thejas M Nair
>Priority: Critical
>  Labels: kerberos
> Attachments: 0001-HIVE-12981-Use-KerberosName.patch, 
> HIVE-12981-branch-1.2.patch, HIVE-12981.01.patch, HIVE-12981.patch
>
>
> ThriftCLIService has a local implementation, getShortName(), that assumes a 
> short name is always the part before "@" and "/". This is not always the case, 
> as Kerberos rules (from Hadoop's KerberosName) might actually transform a 
> name into something else.
> Considering a pending change to getShortName() (HADOOP-12751) and the normal 
> use of KerberosName in other parts of Hive, it only seems logical to use the 
> standard implementation.
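
A minimal standalone sketch of the difference (the principal and the 
auth_to_local rule below are made up for illustration):

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.authentication.util.KerberosName;

public class ShortNameSketch {
  // Naive variant: assumes the short name is everything before '/' or '@'.
  static String naiveShortName(String principal) {
    return principal.split("[/@]")[0];
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical auth_to_local rule that rewrites service principals.
    KerberosName.setRules(
        "RULE:[2:$1@$0](.*@EXAMPLE\\.COM)s/@.*/_svc/\nDEFAULT");
    String principal = "hive/gateway.example.com@EXAMPLE.COM";  // made-up principal
    System.out.println(naiveShortName(principal));                   // hive
    System.out.println(new KerberosName(principal).getShortName());  // hive_svc
  }
}
{code}

The two results differ whenever a rewrite rule applies, which is exactly the 
case the naive split gets wrong.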



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11607) Export tables broken for data > 32 MB

2016-02-17 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150958#comment-15150958
 ] 

Ravi Prakash commented on HIVE-11607:
-

We ran into this as well. Thanks a lot for the fix, folks!
One thing I did notice though is 
https://github.com/apache/hive/blob/master/shims/0.23/pom.xml#L186, where the 
scope is {code}provided{code}.
Unfortunately, hadoop distcp is in the tools directory, which has been dropped 
from being loaded on the default classpath. So we had to play some games to get 
the Hive CLI to pick up this jar.

> Export tables broken for data > 32 MB
> -
>
> Key: HIVE-11607
> URL: https://issues.apache.org/jira/browse/HIVE-11607
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11607.2.patch, HIVE-11607.3.patch, HIVE-11607.patch
>
>
> Broken for both hadoop-1 as well as hadoop-2 line



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-17 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150904#comment-15150904
 ] 

Naveen Gangam commented on HIVE-12994:
--


aah, makes sense. Thanks for the explanation.

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> The SQL standard does not specify the default behavior. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.
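
Not Hive code, but a small JDK illustration of the default semantics described 
above (nulls sorting as lower than any non-null value):

{code:java}
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullsOrderingSketch {
  public static void main(String[] args) {
    List<Integer> asc = Arrays.asList(3, null, 1);
    asc.sort(Comparator.nullsFirst(Comparator.<Integer>naturalOrder()));
    System.out.println(asc);   // [null, 1, 3]  -- ASC default: NULLS FIRST

    List<Integer> desc = Arrays.asList(3, null, 1);
    desc.sort(Comparator.nullsLast(Comparator.<Integer>reverseOrder()));
    System.out.println(desc);  // [3, 1, null]  -- DESC default: NULLS LAST
  }
}
{code}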



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13056) delegation tokens do not work with HS2 when used with http transport and kerberos

2016-02-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13056:

Fix Version/s: 2.0.1

> delegation tokens do not work with HS2 when used with http transport and 
> kerberos
> -
>
> Key: HIVE-13056
> URL: https://issues.apache.org/jira/browse/HIVE-13056
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Affects Versions: 1.2.1
>Reporter: Cheng Xu
>Assignee: Sushanth Sowmyan
>Priority: Critical
> Fix For: 2.1.0, 2.0.1
>
> Attachments: HIVE-13056.patch
>
>
> We're getting a HiveSQLException on secure Windows clusters.
> {code}
> 2016-02-08 
> 13:48:09,535|beaver.machine|INFO|6114|140264674350912|MainThread|Job ID : 
> 000-160208134528402-oozie-oozi-W
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|Workflow 
> Name : hive2-wf
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|App Path 
>  : 
> wasb://oozie1-hb...@humbtestings5jp.blob.core.windows.net/user/hrt_qa/test_hiveserver2
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|Status   
>  : KILLED
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|Run  
>  : 0
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|User 
>  : hrt_qa
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|Group
>  : -
> 2016-02-08 
> 13:48:09,547|beaver.machine|INFO|6114|140264674350912|MainThread|Created  
>  : 2016-02-08 13:47 GMT
> 2016-02-08 
> 13:48:09,548|beaver.machine|INFO|6114|140264674350912|MainThread|Started  
>  : 2016-02-08 13:47 GMT
> 2016-02-08 
> 13:48:09,552|beaver.machine|INFO|6114|140264674350912|MainThread|Last 
> Modified : 2016-02-08 13:48 GMT
> 2016-02-08 
> 13:48:09,553|beaver.machine|INFO|6114|140264674350912|MainThread|Ended
>  : 2016-02-08 13:48 GMT
> 2016-02-08 
> 13:48:09,553|beaver.machine|INFO|6114|140264674350912|MainThread|CoordAction 
> ID: -
> 2016-02-08 13:48:09,566|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,566|beaver.machine|INFO|6114|140264674350912|MainThread|Actions
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|ID   
>  Status
> Ext ID Ext Status Err Code
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,571|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@:start:
>   OK-  OK 
> -
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@hive-node
> ERROR -  ERROR  
> HiveSQLException
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@fail
>  OK-  OK  
>E0729
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13056) delegation tokens do not work with HS2 when used with http transport and kerberos

2016-02-17 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150886#comment-15150886
 ] 

Sushanth Sowmyan commented on HIVE-13056:
-

Committed to branch-2.0 as well, now that it is open again, to be part of an 
eventual 2.0.1 if there is one.

> delegation tokens do not work with HS2 when used with http transport and 
> kerberos
> -
>
> Key: HIVE-13056
> URL: https://issues.apache.org/jira/browse/HIVE-13056
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Affects Versions: 1.2.1
>Reporter: Cheng Xu
>Assignee: Sushanth Sowmyan
>Priority: Critical
> Fix For: 2.1.0, 2.0.1
>
> Attachments: HIVE-13056.patch
>
>
> We're getting a HiveSQLException on secure Windows clusters.
> {code}
> 2016-02-08 
> 13:48:09,535|beaver.machine|INFO|6114|140264674350912|MainThread|Job ID : 
> 000-160208134528402-oozie-oozi-W
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|Workflow 
> Name : hive2-wf
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|App Path 
>  : 
> wasb://oozie1-hb...@humbtestings5jp.blob.core.windows.net/user/hrt_qa/test_hiveserver2
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|Status   
>  : KILLED
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|Run  
>  : 0
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|User 
>  : hrt_qa
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|Group
>  : -
> 2016-02-08 
> 13:48:09,547|beaver.machine|INFO|6114|140264674350912|MainThread|Created  
>  : 2016-02-08 13:47 GMT
> 2016-02-08 
> 13:48:09,548|beaver.machine|INFO|6114|140264674350912|MainThread|Started  
>  : 2016-02-08 13:47 GMT
> 2016-02-08 
> 13:48:09,552|beaver.machine|INFO|6114|140264674350912|MainThread|Last 
> Modified : 2016-02-08 13:48 GMT
> 2016-02-08 
> 13:48:09,553|beaver.machine|INFO|6114|140264674350912|MainThread|Ended
>  : 2016-02-08 13:48 GMT
> 2016-02-08 
> 13:48:09,553|beaver.machine|INFO|6114|140264674350912|MainThread|CoordAction 
> ID: -
> 2016-02-08 13:48:09,566|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,566|beaver.machine|INFO|6114|140264674350912|MainThread|Actions
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|ID   
>  Status
> Ext ID Ext Status Err Code
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,571|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@:start:
>   OK-  OK 
> -
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@hive-node
> ERROR -  ERROR  
> HiveSQLException
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@fail
>  OK-  OK  
>E0729
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10115) HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled

2016-02-17 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150884#comment-15150884
 ] 

Sushanth Sowmyan commented on HIVE-10115:
-

Committed to branch-2.0 as well, now that it is open again, to be part of an 
eventual 2.0.1 if there is one.

> HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and 
> Delegation token(DIGEST) when alternate authentication is enabled
> ---
>
> Key: HIVE-10115
> URL: https://issues.apache.org/jira/browse/HIVE-10115
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 1.1.0
>Reporter: Mubashir Kazia
>Assignee: Mubashir Kazia
>  Labels: patch
> Fix For: 1.3.0, 2.1.0, 2.0.1
>
> Attachments: HIVE-10115.0.patch, HIVE-10115.2.patch
>
>
> In a Kerberized cluster, when alternate authentication is enabled on HS2, it 
> should also accept Kerberos authentication. The reason this is important is 
> that when we enable LDAP authentication, HS2 stops accepting delegation 
> token authentication, so we are forced to enter usernames and passwords in the 
> Oozie configuration.
> The whole idea of SASL is that multiple authentication mechanisms can be 
> offered. If we disable Kerberos (GSSAPI) and delegation token (DIGEST) 
> authentication when we enable LDAP authentication, it defeats the purpose of SASL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10115) HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled

2016-02-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10115:

Fix Version/s: 2.0.1

> HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and 
> Delegation token(DIGEST) when alternate authentication is enabled
> ---
>
> Key: HIVE-10115
> URL: https://issues.apache.org/jira/browse/HIVE-10115
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 1.1.0
>Reporter: Mubashir Kazia
>Assignee: Mubashir Kazia
>  Labels: patch
> Fix For: 1.3.0, 2.1.0, 2.0.1
>
> Attachments: HIVE-10115.0.patch, HIVE-10115.2.patch
>
>
> In a Kerberized cluster, when alternate authentication is enabled on HS2, it 
> should also accept Kerberos authentication. The reason this is important is 
> that when we enable LDAP authentication, HS2 stops accepting delegation 
> token authentication, so we are forced to enter usernames and passwords in the 
> Oozie configuration.
> The whole idea of SASL is that multiple authentication mechanisms can be 
> offered. If we disable Kerberos (GSSAPI) and delegation token (DIGEST) 
> authentication when we enable LDAP authentication, it defeats the purpose of SASL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-17 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150882#comment-15150882
 ] 

Jesus Camacho Rodriguez commented on HIVE-12994:


[~ngangam], thanks for checking.

Yes, indeed the other scripts need to be added to the patch; in fact, I 
uploaded the patch because I wanted to trigger a QA run to check whether it was 
complete or needed more work. However, I hit a couple of issues with 
metastore testing (cf. HIVE-13062, HIVE-13070), thus no results from QA yet.

No worries, the final patch will contain those scripts.

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> The SQL standard does not specify the default behavior. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-17 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150871#comment-15150871
 ] 

Naveen Gangam commented on HIVE-12994:
--

[~jcamachorodriguez] The patch only contains HMS schema changes for Derby. 
Shouldn't the schema changes also be applied to the Oracle, MySQL and PostgreSQL 
scripts? Am I missing something? Thanks

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.05.patch, 
> HIVE-12994.06.patch, HIVE-12994.06.patch, HIVE-12994.07.patch, 
> HIVE-12994.08.patch, HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> The SQL standard does not specify the default behavior. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13056) delegation tokens do not work with HS2 when used with http transport and kerberos

2016-02-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13056:

   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks, Thejas!

> delegation tokens do not work with HS2 when used with http transport and 
> kerberos
> -
>
> Key: HIVE-13056
> URL: https://issues.apache.org/jira/browse/HIVE-13056
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Affects Versions: 1.2.1
>Reporter: Cheng Xu
>Assignee: Sushanth Sowmyan
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-13056.patch
>
>
> We're getting a HiveSQLException on secure Windows clusters.
> {code}
> 2016-02-08 
> 13:48:09,535|beaver.machine|INFO|6114|140264674350912|MainThread|Job ID : 
> 000-160208134528402-oozie-oozi-W
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|Workflow 
> Name : hive2-wf
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|App Path 
>  : 
> wasb://oozie1-hb...@humbtestings5jp.blob.core.windows.net/user/hrt_qa/test_hiveserver2
> 2016-02-08 
> 13:48:09,536|beaver.machine|INFO|6114|140264674350912|MainThread|Status   
>  : KILLED
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|Run  
>  : 0
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|User 
>  : hrt_qa
> 2016-02-08 
> 13:48:09,537|beaver.machine|INFO|6114|140264674350912|MainThread|Group
>  : -
> 2016-02-08 
> 13:48:09,547|beaver.machine|INFO|6114|140264674350912|MainThread|Created  
>  : 2016-02-08 13:47 GMT
> 2016-02-08 
> 13:48:09,548|beaver.machine|INFO|6114|140264674350912|MainThread|Started  
>  : 2016-02-08 13:47 GMT
> 2016-02-08 
> 13:48:09,552|beaver.machine|INFO|6114|140264674350912|MainThread|Last 
> Modified : 2016-02-08 13:48 GMT
> 2016-02-08 
> 13:48:09,553|beaver.machine|INFO|6114|140264674350912|MainThread|Ended
>  : 2016-02-08 13:48 GMT
> 2016-02-08 
> 13:48:09,553|beaver.machine|INFO|6114|140264674350912|MainThread|CoordAction 
> ID: -
> 2016-02-08 13:48:09,566|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,566|beaver.machine|INFO|6114|140264674350912|MainThread|Actions
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|ID   
>  Status
> Ext ID Ext Status Err Code
> 2016-02-08 
> 13:48:09,567|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,571|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@:start:
>   OK-  OK 
> -
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@hive-node
> ERROR -  ERROR  
> HiveSQLException
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|000-160208134528402-oozie-oozi-W@fail
>  OK-  OK  
>E0729
> 2016-02-08 
> 13:48:09,572|beaver.machine|INFO|6114|140264674350912|MainThread|
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13075) Metastore shuts down when no delegation token is found in ZooKeeper

2016-02-17 Thread Piotr Wikieł (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Wikieł updated HIVE-13075:

Affects Version/s: 1.3.0
   1.2.1
  Description: 
{{ZooKeeperTokenStore}} looks [as 
follows|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/ZooKeeperTokenStore.java#L397]:

{code:java}
@Override
public DelegationTokenInformation getToken(DelegationTokenIdentifier 
tokenIdentifier) {
  byte[] tokenBytes = zkGetData(getTokenPath(tokenIdentifier));
  try {
return 
HiveDelegationTokenSupport.decodeDelegationTokenInformation(tokenBytes);
  } catch (Exception ex) {
throw new TokenStoreException("Failed to decode token", ex);
  }
}
{code}

which is slightly different from [DBTokenStore 
implementation|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/DBTokenStore.java#L85]
 that is protected against {{tokenBytes==null}} because nullable {{tokenBytes}} 
causes NPE to be thrown in 
[HiveDelegationTokenSupport#decodeDelegationTokenInformation|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/security/token/delegation/HiveDelegationTokenSupport.java#L51]

Furthermore, the NPE thrown here causes 
[TokenStoreDelegationTokenSecretManager.ExpiredTokenRemover|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/TokenStoreDelegationTokenSecretManager.java#L333]
 to catch it and exit the metastore.

{{null}} from 
{{[zkGetData()|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/ZooKeeperTokenStore.java#L284]}}
 is possible during a ZooKeeper failure or (and that was our case) when another 
metastore instance removes tokens during its {{ExpiredTokenRemover}} run. There 
were two possible solutions to this problem:

 * a distributed lock in ZooKeeper, acquired for the duration of one metastore 
instance's ExpiredTokenRemover run,
 * a simple null check

I think the simple null check, as already done in {{DBTokenStore}}, is sufficient.

Patch will be attached.
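
A minimal sketch of the guarded variant (illustrative only; the attached patch 
may differ in detail):

{code:java}
@Override
public DelegationTokenInformation getToken(DelegationTokenIdentifier tokenIdentifier) {
  byte[] tokenBytes = zkGetData(getTokenPath(tokenIdentifier));
  if (tokenBytes == null) {
    // Token already gone, e.g. removed by another metastore's ExpiredTokenRemover,
    // or unreadable during a ZooKeeper failure: report "no token" instead of NPE.
    return null;
  }
  try {
    return HiveDelegationTokenSupport.decodeDelegationTokenInformation(tokenBytes);
  } catch (Exception ex) {
    throw new TokenStoreException("Failed to decode token", ex);
  }
}
{code}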

Sorry for the edit, but I think it is worth mentioning that a possible 
workaround for this issue is setting 
{{hive.cluster.delegation.key.update-interval}}, 
{{hive.cluster.delegation.token.renew-interval}} and 
{{hive.cluster.delegation.token.max-lifetime}} to one year, as described 
[here|https://community.cloudera.com/t5/Web-UI-Hue-Beeswax/Potential-misconfiguration-detected-Hue-Hive-Editor-HiveServer2/m-p/26117/highlight/true#M763].
 But in my opinion it is not an engineer's way of doing things ;)

  was:
{{ZooKeeperTokenStore}} looks [as 
follows|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/ZooKeeperTokenStore.java#L397]:

{code:java}
@Override
public DelegationTokenInformation getToken(DelegationTokenIdentifier 
tokenIdentifier) {
  byte[] tokenBytes = zkGetData(getTokenPath(tokenIdentifier));
  try {
return 
HiveDelegationTokenSupport.decodeDelegationTokenInformation(tokenBytes);
  } catch (Exception ex) {
throw new TokenStoreException("Failed to decode token", ex);
  }
}
{code}

which is slightly different from [DBTokenStore 
implementation|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/DBTokenStore.java#L85]
 that is protected against {{tokenBytes==null}} because nullable {{tokenBytes}} 
causes NPE to be thrown in 
[HiveDelegationTokenSupport#decodeDelegationTokenInformation|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/security/token/delegation/HiveDelegationTokenSupport.java#L51]

Furthermore, the NPE thrown here causes 
[TokenStoreDelegationTokenSecretManager.ExpiredTokenRemover|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/TokenStoreDelegationTokenSecretManager.java#L333]
 to catch it and exit the metastore.

{{null}} from 
{{[zkGetData()|https://github.com/apache/hive/blob/branch-1.2/shims/common/src/main/java/org/apache/hadoop/hive/thrift/ZooKeeperTokenStore.java#L284]}}
 is possible during a ZooKeeper failure or (and that was our case) when another 
metastore instance removes tokens during its {{ExpiredTokenRemover}} run. There 
were two possible solutions to this problem:

 * a distributed lock in ZooKeeper, acquired for the duration of one metastore 
instance's ExpiredTokenRemover run,
 * a simple null check

I think the simple null check, as already done in {{DBTokenStore}}, is sufficient.

Patch will be attached.


> Metastore shuts down when no delegation token is found in ZooKeeper
> ---
>
> Key: HIVE-13075
> URL: https://issues.apache.org/jira/browse/HIVE-13075
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.2.1
>

[jira] [Commented] (HIVE-12856) LLAP: update (add/remove) the UDFs available in LLAP when they are changed (refresh periodically)

2016-02-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150821#comment-15150821
 ] 

Hive QA commented on HIVE-12856:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12788154/HIVE-12856.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9790 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7010/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7010/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7010/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12788154 - PreCommit-HIVE-TRUNK-Build

> LLAP: update (add/remove) the UDFs available in LLAP when they are changed 
> (refresh periodically)
> -
>
> Key: HIVE-12856
> URL: https://issues.apache.org/jira/browse/HIVE-12856
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12856.01.nogen.patch, HIVE-12856.01.patch, 
> HIVE-12856.02.nogen.patch, HIVE-12856.02.patch, HIVE-12856.nogen.patch, 
> HIVE-12856.patch
>
>
> I don't think re-querying the functions is going to scale, and the sessions 
> obviously cannot notify all LLAP clusters of every change. We should add 
> global versioning to metastore functions to track changes, and then possibly 
> add a notification mechanism, potentially through ZK to avoid overloading the 
> metastore itself.
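
A hypothetical sketch of the ZK notification idea using Curator (the znode 
path, the ZK quorum string and the daemon-side refresh hook are all 
assumptions, not a committed design): each daemon watches a single znode 
holding the global function-version counter and refreshes only when it changes.

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.ChildData;
import org.apache.curator.framework.recipes.cache.NodeCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class FnVersionWatcher {
  public static void main(String[] args) throws Exception {
    CuratorFramework zk = CuratorFrameworkFactory.newClient(
        "zk1:2181", new ExponentialBackoffRetry(1000, 3));  // hypothetical quorum
    zk.start();
    final NodeCache cache = new NodeCache(zk, "/hive/fn-version");  // hypothetical path
    cache.getListenable().addListener(() -> {
      ChildData data = cache.getCurrentData();
      if (data != null) {
        long version = Long.parseLong(new String(data.getData(), "UTF-8"));
        // refreshLocalUdfRegistry(version);  // daemon-side hook (assumed)
        System.out.println("Functions changed; re-querying at version " + version);
      }
    });
    cache.start();
    Thread.sleep(Long.MAX_VALUE);  // keep watching
  }
}
{code}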



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-17 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150798#comment-15150798
 ] 

Sergio Peña commented on HIVE-13039:


Thanks. I committed to branch-1 as well.


> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, 
> HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-17 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13039:
---
Fix Version/s: 1.3.0

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, 
> HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13040) Handle empty bucket creations more efficiently

2016-02-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13040:

Status: Patch Available  (was: Open)

> Handle empty bucket creations more efficiently 
> ---
>
> Key: HIVE-13040
> URL: https://issues.apache.org/jira/browse/HIVE-13040
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.1.0, 1.2.0, 1.0.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13040.2.patch, HIVE-13040.3.patch, HIVE-13040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13040) Handle empty bucket creations more efficiently

2016-02-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13040:

Status: Open  (was: Patch Available)

> Handle empty bucket creations more efficiently 
> ---
>
> Key: HIVE-13040
> URL: https://issues.apache.org/jira/browse/HIVE-13040
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.1.0, 1.2.0, 1.0.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13040.2.patch, HIVE-13040.3.patch, HIVE-13040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13040) Handle empty bucket creations more efficiently

2016-02-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13040:

Attachment: HIVE-13040.3.patch

> Handle empty bucket creations more efficiently 
> ---
>
> Key: HIVE-13040
> URL: https://issues.apache.org/jira/browse/HIVE-13040
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13040.2.patch, HIVE-13040.3.patch, HIVE-13040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9576) ALTER TABLE STORED AS - change storage format and/or compression

2016-02-17 Thread Stephen Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150740#comment-15150740
 ] 

Stephen Miller commented on HIVE-9576:
--

If it helps, the following command does work:

CREATE TABLE MyNewTable
LIKE MyOldTable
STORED AS (e.g.) PARQUET;

So at least you can copy over the schema easily.

I suspect changing the storage type under the hood is trickier than it sounds if 
you want the table to remain writable during the process; there would be delta 
files involved.

> ALTER TABLE STORED AS - change storage format and/or compression
> 
>
> Key: HIVE-9576
> URL: https://issues.apache.org/jira/browse/HIVE-9576
> Project: Hive
>  Issue Type: New Feature
>  Components: Compression, File Formats, Parser, Query Planning, Query 
> Processor, Serializers/Deserializers, SQL, StorageHandler
>Affects Versions: 0.14.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>
> Feature request to support rewriting a table into a different format or 
> compression level via a single alter table statement:
> {code}ALTER TABLE ... STORED AS... TBLPROPERTIES(...){code}
> Currently we create a new table of a different format and do
> {code}INSERT OVERWRITE newtable SELECT * FROM oldtable{code}
> but a colleague has just asked me why this can't be handled by a single alter 
> statement via a tempory table replacement under the hood, which seems like a 
> fair question.
> Best Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-17 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150704#comment-15150704
 ] 

Yongzhi Chen commented on HIVE-13039:
-

[~spena], I tried to add the tests, but each one hits many build errors. I think 
those files were added by other fixes which are not in branch-1.
And this jira has its own tests, so it is safe to keep only the newly 
added tests. 

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 2.1.0
>
> Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, 
> HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13065) Hive throws NPE when writing map type data to a HBase backed table

2016-02-17 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150670#comment-15150670
 ] 

Yongzhi Chen commented on HIVE-13065:
-

The failures are not related.

> Hive throws NPE when writing map type data to a HBase backed table
> --
>
> Key: HIVE-13065
> URL: https://issues.apache.org/jira/browse/HIVE-13065
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13065.1.patch
>
>
> Hive throws NPE when writing data to a HBase backed table with below 
> conditions:
> # There is a map type column
> # The map type column has NULL in its values
> Below are the reproduce steps:
> *1) Create a HBase backed Hive table*
> {code:sql}
> create table hbase_test (id bigint, data map<string,string>)
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> with serdeproperties ("hbase.columns.mapping" = ":key,cf:map_col")
> tblproperties ("hbase.table.name" = "hive_test");
> {code}
> *2) insert data into above table*
> {code:sql}
> insert overwrite table hbase_test select 1 as id, map('abcd', null) as data 
> from src limit 1;
> {code}
> The mapreduce job for insert query fails. Error messages are as below:
> {noformat}
> 2016-02-15 02:26:33,225 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) {"key":{},"value":{"_col0":1,"_col1":{"abcd":null}}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:265)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{},"value":{"_col0":1,"_col1":{"abcd":null}}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:731)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:666)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:221)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:236)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:275)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:222)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
>   at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
>   ... 15 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

