[jira] [Commented] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115307#comment-16115307
 ] 

Hive QA commented on HIVE-17181:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880482/HIVE-17181.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10990 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6269/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6269/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6269/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880482 - PreCommit-HIVE-Build

> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.
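> In the meantime, a minimal sketch of the manual workaround (the partition-key name {{dt}} and 
> its type are hypothetical; this assumes {{HCatSchema.append()}} and the {{HCatFieldSchema}} 
> constructor behave as in current HCatalog):
> {code:java}
> // getTableSchema() returns only the record schema; partition keys are missing.
> HCatSchema schema = HCatOutputFormat.getTableSchema(conf);
> // Append each dynamic-partition key by hand so the writer sees the complete row.
> schema.append(new HCatFieldSchema("dt", HCatFieldSchema.Type.STRING, null));
> HCatOutputFormat.setSchema(conf, schema);
> {code}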



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17255) OrcInputFormat.Context relying on wrong property

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115291#comment-16115291
 ] 

Hive QA commented on HIVE-17255:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880490/HIVE-17255.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 56 failed/errored test(s), 10989 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat6]
 (batchId=7)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.ql.TestAcidOnTezWithSplitUpdate.testMapJoinOnMR 
(batchId=218)
org.apache.hadoop.hive.ql.TestAcidOnTezWithSplitUpdate.testMapJoinOnTez 
(batchId=218)
org.apache.hadoop.hive.ql.TestAcidOnTezWithSplitUpdate.testMergeJoinOnMR 
(batchId=218)
org.apache.hadoop.hive.ql.TestAcidOnTezWithSplitUpdate.testMergeJoinOnTez 
(batchId=218)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testACIDwithSchemaEvolutionAndCompaction
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testBucketCodec 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testBucketizedInputFormat
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testDeleteIn 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testDynamicPartitionsMerge
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testDynamicPartitionsMerge2
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge2 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMerge3 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMergeWithPredicate
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMultiInsert 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMultiInsertStatement
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testNonAcidToAcidConversion1
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testNonAcidToAcidConversion2
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testNonAcidToAcidConversion3
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testOrcNoPPD 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testOrcPPD 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testUpdateMixedCase 
(batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.updateDeletePartitioned
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.writeBetweenWorkerAndCleaner
 (batchId=282)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testACIDwithSchemaEvolutionAndCompaction
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAlterTable
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketCodec
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketizedInputFormat
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDeleteIn
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMergeWithPredicate
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsertStatement
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion1
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion2
 (batchId=279)
org.apache.hadoop.hive.ql.TestTxnCommands

[jira] [Commented] (HIVE-15686) Partitions on Remote HDFS break encryption-zone checks

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115270#comment-16115270
 ] 

Hive QA commented on HIVE-15686:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880098/HIVE-15686.branch-2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6267/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6267/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6267/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-05 04:48:49.470
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6267/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-05 04:48:49.473
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 5e06155 HIVE-17234 Remove HBase metastore from master (Alan 
Gates, reviewed by Daniel Dai and Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5e06155 HIVE-17234 Remove HBase metastore from master (Alan 
Gates, reviewed by Daniel Dai and Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-05 04:48:55.200
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: No 
such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880098 - PreCommit-HIVE-Build

> Partitions on Remote HDFS break encryption-zone checks
> --
>
> Key: HIVE-15686
> URL: https://issues.apache.org/jira/browse/HIVE-15686
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-15686.branch-2.patch
>
>
> This is in relation to HIVE-13243, which fixes encryption-zone checks for 
> external tables.
> Unfortunately, this is still borked for partitions with remote HDFS paths. 
> The code fails as follows:
> {noformat}
> 2015-12-09 19:26:14,997 ERROR [pool-4-thread-1476] server.TThreadPoolServer 
> (TThreadPoolServer.java:run_aroundBody0(305)) - Error occurred during 
> processing of message.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://remote-cluster-nn1.myth.net:8020/dbs/mythdb/myth_table/dt=20170120, 
> expected: hdfs://local-cluster-n1.myth.net:8020
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:1985)
> at 
> org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:262)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1290)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.checkTrashPurgeCombination(HiveMetaStore.java:1746)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_partitions_req(HiveMetaStore.java:2974)
> at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke

[jira] [Commented] (HIVE-17247) HoS DPP: UDFs on the partition column side does not evaluate correctly

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115269#comment-16115269
 ] 

Hive QA commented on HIVE-17247:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880480/HIVE-17247.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10989 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=239)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert1] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
 (batchId=221)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6266/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6266/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6266/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880480 - PreCommit-HIVE-Build

> HoS DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-17247
> URL: https://issues.apache.org/jira/browse/HIVE-17247
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17247.1.patch
>
>
> Same problem as HIVE-12473 and HIVE-12667.
> The query below (uses tables from {{spark_dynamic_partition_pruning.q}}) 
> returns incorrect results:
> {code}
> select count(*) from srcpart join srcpart_date on (day(srcpart.ds) = 
> day(srcpart_date.ds)) where srcpart_date.`date` = '2008-04-08';
> {code}
> It returns a value of 0 when DPP is on; when it is disabled, it returns 1000.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115253#comment-16115253
 ] 

Hive QA commented on HIVE-17254:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880484/HIVE-17254.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10989 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6265/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6265/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6265/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880484 - PreCommit-HIVE-Build

> Skip updating AccessTime of recycled files in ReplChangeManager
> ---
>
> Key: HIVE-17254
> URL: https://issues.apache.org/jira/browse/HIVE-17254
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17254.1.patch
>
>
> For recycled files, we update both ModifyTime and AccessTime:
> fs.setTimes(path, now, now);
> On some versions of HDFS, this is not allowed when 
> "dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved 
> in HDFS-9208, we don't use AccessTime in CM, so the update could be skipped and 
> we wouldn't have to fail in this scenario.
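> A minimal sketch of the proposed skip, relying on FileSystem.setTimes() treating -1 as 
> "leave this timestamp unchanged" (sketch only, not necessarily the exact patch):
> {code:java}
> long now = System.currentTimeMillis();
> // Only bump ModifyTime; -1 leaves AccessTime untouched, so the call no longer
> // trips over dfs.namenode.accesstime.precision = 0.
> fs.setTimes(path, now, -1);
> {code}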



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17216) Additional qtests for HoS DPP

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115221#comment-16115221
 ] 

Hive QA commented on HIVE-17216:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880447/HIVE-17216.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10989 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6264/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6264/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6264/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880447 - PreCommit-HIVE-Build

> Additional qtests for HoS DPP
> -
>
> Key: HIVE-17216
> URL: https://issues.apache.org/jira/browse/HIVE-17216
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17216.1.patch
>
>
> There are a few queries that we can add to the HoS DPP tests to increase 
> coverage. There are a few query patterns that the current tests don't cover.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17256) add a notion of a guaranteed task to LLAP

2017-08-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115211#comment-16115211
 ] 

Sergey Shelukhin commented on HIVE-17256:
-

[~sseth], can you review this? Thanks. Most of the patch is actually protobuf 
changes, I think.

> add a notion of a guaranteed task to LLAP
> -
>
> Key: HIVE-17256
> URL: https://issues.apache.org/jira/browse/HIVE-17256
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17256.patch
>
>
> Tasks are basically on two levels, guaranteed and speculative, with 
> speculative being the default. As long as no one uses the new flag, the tasks 
> behave the same.
> All the tasks that do have the flag also behave the same with regard to each 
> other.
> The difference is that a guaranteed task is always higher priority than, and 
> preempts, a speculative task.
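> A hypothetical illustration of that ordering (not the actual LLAP scheduler code):
> {code:java}
> import java.util.Comparator;
> 
> public class GuaranteedFirst {
>   // Assumed shape of a task entry: the new flag plus the existing priority
>   // (smaller value = more urgent).
>   static class TaskInfo {
>     final boolean guaranteed;
>     final int priority;
>     TaskInfo(boolean guaranteed, int priority) { this.guaranteed = guaranteed; this.priority = priority; }
>   }
> 
>   // Guaranteed tasks always sort ahead of speculative ones; within a level,
>   // tasks keep their existing relative order by priority.
>   static final Comparator<TaskInfo> ORDER =
>       Comparator.comparing((TaskInfo t) -> !t.guaranteed)
>                 .thenComparingInt(t -> t.priority);
> }
> {code}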



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17256) add a notion of a guaranteed task to LLAP

2017-08-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17256:

Attachment: HIVE-17256.patch

Updating the protocol and the internal scheduling, and adding the tests.


> add a notion of a guaranteed task to LLAP
> -
>
> Key: HIVE-17256
> URL: https://issues.apache.org/jira/browse/HIVE-17256
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17256.patch
>
>
> Tasks are basically on two levels, guaranteed and speculative, with 
> speculative being the default. As long as no one uses the new flag, the tasks 
> behave the same.
> All the tasks that do have the flag also behave the same with regard to each 
> other.
> The difference is that a guaranteed task is always higher priority than, and 
> preempts, a speculative task.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17256) add a notion of a guaranteed task to LLAP

2017-08-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17256:
---


> add a notion of a guaranteed task to LLAP
> -
>
> Key: HIVE-17256
> URL: https://issues.apache.org/jira/browse/HIVE-17256
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Tasks are basically on two levels, guaranteed and speculative, with 
> speculative being the default. As long as no one uses the new flag, the tasks 
> behave the same.
> All the tasks that do have the flag also behave the same with regard to each 
> other.
> The difference is that a guaranteed task is always higher priority than, and 
> preempts, a speculative task.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115189#comment-16115189
 ] 

Hive QA commented on HIVE-17246:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880483/HIVE-17246.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10990 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=239)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6263/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6263/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6263/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880483 - PreCommit-HIVE-Build

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17089) make acid 2.0 the default

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115158#comment-16115158
 ] 

Hive QA commented on HIVE-17089:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880486/HIVE-17089.05.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 10952 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=239)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.io.TestAcidUtils.testAcidOperationalPropertiesSettersAndGetters
 (batchId=260)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testSplitGenReadOps 
(batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testSplitGenReadOpsLocalCache
 (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testSplitGenReadOpsLocalCacheChangeFileLen
 (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testSplitGenReadOpsLocalCacheChangeModificationTime
 (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testEmpty (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBaseAndDelta 
(batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderDelta 
(batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta
 (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta
 (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta
 (batchId=263)
org.apache.hadoop.hive.ql.io.orc.TestOrcRecordUpdater.testUpdates (batchId=263)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6262/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6262/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6262/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880486 - PreCommit-HIVE-Build

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch, 
> HIVE-17089.05.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place after the start of compaction (we need to add a mechanism to 
> put a table in read-only mode - this way it can still be read while it's 
> being compacted).  Then upgrade to Hive 3.0 can take place.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115151#comment-16115151
 ] 

Thejas M Nair commented on HIVE-17181:
--

Thanks for the test, Mithun.
Won't the JUnit fail() method stop further query execution?

Will line 214 below get executed?
{code}
212 catch (Exception unexpected) {
213   fail("Unexpected failure! " + unexpected.getMessage());
214   unexpected.printStackTrace();
{code}

One option seems to be just letting that exception happen - 
https://stackoverflow.com/questions/16596418/how-to-handle-exceptions-in-junit
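For example, a minimal sketch of that option (hypothetical test name; uses org.junit.Test):
{code:java}
// Declare the checked exception instead of catching it. Any unexpected exception
// then fails the test on its own, with the full stack trace in the report, and the
// fail()-before-printStackTrace() ordering question goes away.
@Test
public void testSetSchemaWithPartitionKeys() throws Exception {
    // ... run the HCatOutputFormat calls / queries directly, no try/catch needed ...
}
{code}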


> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-8472) Add ALTER DATABASE SET LOCATION

2017-08-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115084#comment-16115084
 ] 

Alan Gates commented on HIVE-8472:
--

One other point, to save Lefty the time of saying it: the docs need to be changed to 
reflect this new functionality.  In particular, they need to call out that 
changing the location does not move any existing data.  It does not even affect 
where existing tables will write new data (including new partitions).  It just 
changes where new tables will be located by default.

> Add ALTER DATABASE SET LOCATION
> ---
>
> Key: HIVE-8472
> URL: https://issues.apache.org/jira/browse/HIVE-8472
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Jeremy Beard
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8472.1.patch
>
>
> Similarly to ALTER TABLE tablename SET LOCATION, it would be helpful if there 
> was an equivalent for databases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-8472) Add ALTER DATABASE SET LOCATION

2017-08-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115082#comment-16115082
 ] 

Alan Gates commented on HIVE-8472:
--

In HiveMetaStore.alter_database, you want to call get_database_core to get the 
old db, not get_database.  get_database will fire listener events and the 
metric tracking functions that determine how much time is being spent in what 
methods.

+1 to adding an AlterDatabase event.  You should add a test in the event firing 
tests to make sure it is properly triggered whenever a database is altered.
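Roughly, the suggestion looks like this (method and field names here are my assumptions, not the patch):
{code:java}
// Read the old Database without firing listener events or timing metrics,
// apply the new location, persist it, and then fire an explicit AlterDatabase event.
Database oldDb = get_database_core(dbName);
Database newDb = oldDb.deepCopy();
newDb.setLocationUri(newLocationUri);
getMS().alterDatabase(dbName, newDb);
// notify MetaStoreEventListener implementations with the new AlterDatabase event here
{code}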

> Add ALTER DATABASE SET LOCATION
> ---
>
> Key: HIVE-8472
> URL: https://issues.apache.org/jira/browse/HIVE-8472
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Jeremy Beard
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8472.1.patch
>
>
> Similarly to ALTER TABLE tablename SET LOCATION, it would be helpful if there 
> was an equivalent for databases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager

2017-08-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115073#comment-16115073
 ] 

Thejas M Nair commented on HIVE-17254:
--

+1 pending tests


> Skip updating AccessTime of recycled files in ReplChangeManager
> ---
>
> Key: HIVE-17254
> URL: https://issues.apache.org/jira/browse/HIVE-17254
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17254.1.patch
>
>
> For recycled files, we update both ModifyTime and AccessTime:
> fs.setTimes(path, now, now);
> On some versions of HDFS, this is not allowed when 
> "dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved 
> in HDFS-9208, we don't use AccessTime in CM, so the update could be skipped and 
> we wouldn't have to fail in this scenario.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17255) OrcInputFormat.Context relying on wrong property

2017-08-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17255:
--
Status: Patch Available  (was: Open)

> OrcInputFormat.Context relying on wrong property
> 
>
> Key: HIVE-17255
> URL: https://issues.apache.org/jira/browse/HIVE-17255
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17255.01.patch
>
>
> constructor of Context() has
> boolean isTableTransactional = 
> conf.getBoolean(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, false).
> This looks wrong.  Everywhere else we use 
> ConfVars.HIVE_TRANSACTIONAL_TABLE_SCAN.
> (yet someone does set it - can't find where)
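> A minimal sketch of the contrast (accessor names assumed, not taken from the patch):
> {code:java}
> // What the Context() constructor reads today (a table-property name):
> boolean isTableTransactional =
>     conf.getBoolean(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, false);
> // What the rest of the code base keys off for ACID table scans:
> boolean isAcidScan =
>     HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_TRANSACTIONAL_TABLE_SCAN);
> {code}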



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17255) OrcInputFormat.Context relying on wrong property

2017-08-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17255:
--
Attachment: HIVE-17255.01.patch

> OrcInputFormat.Context relying on wrong property
> 
>
> Key: HIVE-17255
> URL: https://issues.apache.org/jira/browse/HIVE-17255
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17255.01.patch
>
>
> constructor of Context() has
> boolean isTableTransactional = 
> conf.getBoolean(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, false).
> This looks wrong.  Everywhere else we use 
> ConfVars.HIVE_TRANSACTIONAL_TABLE_SCAN.
> (yet someone does set it - can't find where)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17255) OrcInputFormat.Context relying on wrong property

2017-08-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17255:
-


> OrcInputFormat.Context relying on wrong property
> 
>
> Key: HIVE-17255
> URL: https://issues.apache.org/jira/browse/HIVE-17255
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> constructor of Context() has
> boolean isTableTransactional = 
> conf.getBoolean(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, false).
> This looks wrong.  Everywhere else we use 
> ConfVars.HIVE_TRANSACTIONAL_TABLE_SCAN.
> (yet someone does set it - can't find where)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Attachment: HIVE-17089.05.patch

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch, 
> HIVE-17089.05.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place after the start of compaction (we need to add a mechanism to 
> put a table in read-only mode - this way it can still be read while it's 
> being compacted).  Then upgrade to Hive 3.0 can take place.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Status: Patch Available  (was: Open)

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Status: Open  (was: Patch Available)

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization

2017-08-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-17235:
-
Attachment: HIVE-17235.patch

This patch clones LongColumnVector and adds some testing.

> Add ORC Decimal64 Serialization/Deserialization
> ---
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Attachment: (was: HIVE-17246-2.patch)

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager

2017-08-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-17254:
--
Status: Patch Available  (was: Open)

> Skip updating AccessTime of recycled files in ReplChangeManager
> ---
>
> Key: HIVE-17254
> URL: https://issues.apache.org/jira/browse/HIVE-17254
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17254.1.patch
>
>
> For recycled files, we update both ModifyTime and AccessTime:
> fs.setTimes(path, now, now);
> On some versions of HDFS, this is not allowed when 
> "dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved 
> in HDFS-9208, we don't use AccessTime in CM, so the update could be skipped and 
> we wouldn't have to fail in this scenario.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager

2017-08-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-17254:
--
Attachment: HIVE-17254.1.patch

The HDFS issue is fixed in the bundled Hadoop (2.8.0), so it would be hard to write 
a test case. Manually tested, and it works.

> Skip updating AccessTime of recycled files in ReplChangeManager
> ---
>
> Key: HIVE-17254
> URL: https://issues.apache.org/jira/browse/HIVE-17254
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17254.1.patch
>
>
> For recycled files, we update both ModifyTime and AccessTime:
> fs.setTimes(path, now, now);
> On some versions of HDFS, this is not allowed when 
> "dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved 
> in HDFS-9208, we don't use AccessTime in CM, so the update could be skipped and 
> we wouldn't have to fail in this scenario.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Attachment: (was: HIVE-17246.patch)

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246-2.patch, HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Attachment: HIVE-17246.patch

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246-2.patch, HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager

2017-08-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned HIVE-17254:
-


> Skip updating AccessTime of recycled files in ReplChangeManager
> ---
>
> Key: HIVE-17254
> URL: https://issues.apache.org/jira/browse/HIVE-17254
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>
> For recycled files, we update both ModifyTime and AccessTime:
> fs.setTimes(path, now, now);
> On some versions of HDFS, this is not allowed when 
> "dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved 
> in HDFS-9208, we don't use AccessTime in CM, so the update could be skipped and 
> we wouldn't have to fail in this scenario.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-04 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17181:

Attachment: HIVE-17181.2.patch

Thanks for the review, [~thejas]. :] I've added a test.

> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-04 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17181:

Status: Open  (was: Patch Available)

> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-04 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17181:

Status: Patch Available  (was: Open)

> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.2.patch, 
> HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115035#comment-16115035
 ] 

Taklon Stephen Wu commented on HIVE-17246:
--

Thanks, [~spena], for quickly looking at it. I will wait till the auto tests pass.

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246-2.patch, HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation referenced as an alias
> ** test HAVING with the aggregation referenced as a field
> ** test HAVING with an aggregation function that does not exist in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17247) HoS DPP: UDFs on the partition column side does not evaluate correctly

2017-08-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17247:

Attachment: HIVE-17247.1.patch

> HoS DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-17247
> URL: https://issues.apache.org/jira/browse/HIVE-17247
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17247.1.patch
>
>
> Same problem as HIVE-12473 and HIVE-12667.
> The query below (uses tables from {{spark_dynamic_partition_pruning.q}}) 
> returns incorrect results:
> {code}
> select count(*) from srcpart join srcpart_date on (day(srcpart.ds) = 
> day(srcpart_date.ds)) where srcpart_date.`date` = '2008-04-08';
> {code}
> It returns a value of 0 when DPP is on; when it is disabled, it returns 1000.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17247) HoS DPP: UDFs on the partition column side does not evaluate correctly

2017-08-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17247:

Status: Patch Available  (was: Open)

> HoS DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-17247
> URL: https://issues.apache.org/jira/browse/HIVE-17247
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17247.1.patch
>
>
> Same problem as HIVE-12473 and HIVE-12667.
> The query below (uses tables from {{spark_dynamic_partition_pruning.q}}) 
> returns incorrect results:
> {code}
> select count(*) from srcpart join srcpart_date on (day(srcpart.ds) = 
> day(srcpart_date.ds)) where srcpart_date.`date` = '2008-04-08';
> {code}
> It returns a value of 0 when DPP is on; when it is disabled, it returns 1000.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17234) Remove HBase metastore from master

2017-08-04 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17234:
--
   Resolution: Fixed
 Hadoop Flags: Incompatible change
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Patch committed.  Thanks Daniel and Sergey for taking a look.

> Remove HBase metastore from master
> --
>
> Key: HIVE-17234
> URL: https://issues.apache.org/jira/browse/HIVE-17234
> Project: Hive
>  Issue Type: Task
>  Components: HBase Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 3.0.0
>
> Attachments: HIVE-17234.patch
>
>
> No new development has been done on the HBase metastore in at least a year, 
> and to my knowledge no one is using it (nor is it even in a state to be fully 
> usable).  Given the lack of interest in continuing to develop it, we should 
> remove it rather than leave dead code hanging around and extra tests taking 
> up time in test runs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17253) Adding SUMMARY statement to HPL/SQL

2017-08-04 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-17253:
--
Description: 
Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
result) similar to Python and R.

For each column output the data type, number of distinct values, non-NULL rows, 
mean, std, percentiles, min, max. Output additional stats for categorical 
columns. This helps perform quick and easy exploratory data analysis for SQL 
devs and business users.  http://hplsql.org/summary

  was:
Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
result) similar to Python and R.

For each column output the data type, number of distinct values, non-NULL rows, 
mean, std, percentiles, min, max. Output additional stats for categorical 
columns. This helps perform quick and easy explanatory data analysis for SQL 
devs and business users.  http://hplsql.org/summary


> Adding SUMMARY statement to HPL/SQL
> ---
>
> Key: HIVE-17253
> URL: https://issues.apache.org/jira/browse/HIVE-17253
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>
> Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
> result) similar to Python and R.
> For each column output the data type, number of distinct values, non-NULL 
> rows, mean, std, percentiles, min, max. Output additional stats for 
> categorical columns. This helps perform quick and easy exploratory data 
> analysis for SQL devs and business users.  http://hplsql.org/summary
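
As a rough illustration of the per-column statistics such a SUMMARY would report, here is plain HiveQL for a single hypothetical numeric column {{amount}} of a table {{sales}} (the SUMMARY statement itself would produce this for every column automatically):

{code:sql}
SELECT COUNT(*)                                          AS total_rows,
       COUNT(amount)                                     AS non_null_rows,
       COUNT(DISTINCT amount)                            AS distinct_values,
       AVG(amount)                                       AS mean,
       STDDEV(amount)                                    AS std,
       PERCENTILE_APPROX(amount, ARRAY(0.25, 0.5, 0.75)) AS percentiles,
       MIN(amount)                                       AS min_value,
       MAX(amount)                                       AS max_value
FROM sales;
{code}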



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17252) Insecure YARN Fair Scheduler when using HiveServer2 non-impersonation mode

2017-08-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114905#comment-16114905
 ] 

Xuefu Zhang commented on HIVE-17252:


I don't think Hive sets any value for mapreduce.job.queuename by default. In 
fact, users are expected to set the queue name correctly themselves. Hive doesn't 
manage user-to-queue mapping either. Please refer to YARN queue access control for 
queue permissions.

> Insecure YARN Fair Scheduler when using HiveServer2 non-impersonation mode
> --
>
> Key: HIVE-17252
> URL: https://issues.apache.org/jira/browse/HIVE-17252
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Vugar Karimli
>
> Hi,
> I am using Hive version 1.1.0 with Hadoop 2.6.0. As you know, when Kerberos 
> and Sentry are enabled in a Hadoop cluster, HiveServer2 user impersonation should 
> be turned off (hive.server2.enable.doAs=false) to force all queries in the 
> background to be executed by the hive user instead of the logged-in user. 
> In this case, by default HiveServer2 takes the Fair Scheduler into account, 
> sets the mapreduce.job.queuename parameter according to the logged-in Hive username, 
> and correctly executes the query in the user's YARN queue, for example in the 
> root.users.user_name queue.
> The problem is that any user can modify the mapreduce.job.queuename parameter, 
> setting another user's queue name (set 
> mapreduce.job.queuename=root.users.other_user_name), and execute the query in 
> another user's YARN queue. YARN queue ACLs also don't help here, because in YARN 
> all queries are executed by the hive user, not by the logged-in user.
> Is it possible to prevent HiveServer2 users from changing the mapreduce.job.queuename 
> parameter?
> Best Regards,
> Vugar.
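
One commonly used safeguard (an assumption about the deployment, not something Hive enforces by default) is to add the property to {{hive.conf.restricted.list}} in hive-site.xml, so that HiveServer2 sessions cannot override it at runtime with a SET command:

{code:xml}
<property>
  <name>hive.conf.restricted.list</name>
  <!-- keep the existing restricted entries and append the queue property -->
  <value>hive.security.authenticator.manager,hive.security.authorization.manager,mapreduce.job.queuename</value>
</property>
{code}

With this in place, the queue would have to be assigned by the administrator (or a placement rule) rather than by the session.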



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15686) Partitions on Remote HDFS break encryption-zone checks

2017-08-04 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114890#comment-16114890
 ] 

Owen O'Malley commented on HIVE-15686:
--

+1

> Partitions on Remote HDFS break encryption-zone checks
> --
>
> Key: HIVE-15686
> URL: https://issues.apache.org/jira/browse/HIVE-15686
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-15686.branch-2.patch
>
>
> This is in relation to HIVE-13243, which fixes encryption-zone checks for 
> external tables.
> Unfortunately, this is still borked for partitions with remote HDFS paths. 
> The code fails as follows:
> {noformat}
> 2015-12-09 19:26:14,997 ERROR [pool-4-thread-1476] server.TThreadPoolServer 
> (TThreadPoolServer.java:run_aroundBody0(305)) - Error occurred during 
> processing of message.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://remote-cluster-nn1.myth.net:8020/dbs/mythdb/myth_table/dt=20170120, 
> expected: hdfs://local-cluster-n1.myth.net:8020
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:1985)
> at 
> org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:262)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1290)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.checkTrashPurgeCombination(HiveMetaStore.java:1746)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_partitions_req(HiveMetaStore.java:2974)
> at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy5.drop_partitions_req(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_partitions_req.getResult(ThriftHiveMetastore.java:10005)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_partitions_req.getResult(ThriftHiveMetastore.java:9989)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$2.run(HadoopThriftAuthBridge.java:767)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$2.run(HadoopThriftAuthBridge.java:763)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:763)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody0(TThreadPoolServer.java:285)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody1$advice(TThreadPoolServer.java:101)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:1)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I have a really simple fix.
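
A minimal sketch of the kind of fix implied here (an assumption about the approach, not the attached patch): resolve the HDFS admin handle from the partition path's own URI instead of the shim bound to the local cluster's default FileSystem, so checkPath() no longer rejects remote paths.

{code:java}
// Sketch only: decide encryption status per path URI, so remote HDFS paths are legal.
private boolean isPathEncrypted(Path partitionPath, Configuration conf) throws IOException {
  FileSystem fs = partitionPath.getFileSystem(conf);      // may be a remote cluster
  if (!"hdfs".equalsIgnoreCase(fs.getUri().getScheme())) {
    return false;                                         // non-HDFS paths have no encryption zones
  }
  HdfsAdmin admin = new HdfsAdmin(fs.getUri(), conf);
  return admin.getEncryptionZoneForPath(partitionPath) != null;
}
{code}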



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Status: Patch Available  (was: In Progress)

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246-2.patch, HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation as an alias
> ** test HAVING with the aggregation as a field
> ** test HAVING with an aggregation function that does not appear in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Taklon Stephen Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taklon Stephen Wu updated HIVE-17246:
-
Status: In Progress  (was: Patch Available)

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246-2.patch, HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation as an alias
> ** test HAVING with the aggregation as a field
> ** test HAVING with an aggregation function that does not appear in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-04 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114857#comment-16114857
 ] 

Sahil Takiar edited comment on HIVE-17213 at 8/4/17 7:19 PM:
-

Actually, [~csun] it may be best to commit this first. If I make the changes to 
master-mr2.properties now, it might break ptest complaining that the group 
mainProperties.$\{miniSparkOnYarn.only.query.files} doesn't exist.


was (Author: stakiar):
Actually, [~csun] it may be best to commit this first. If I make the changes to 
master-mr2.properties now, it might break ptest complaining that the group 
mainProperties.${miniSparkOnYarn.only.query.files} doesn't exist.

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch, HIVE-17213.3.patch, HIVE-17213.4.patch, HIVE-17213.5.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks 
> properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-04 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114857#comment-16114857
 ] 

Sahil Takiar commented on HIVE-17213:
-

Actually, [~csun] it may be best to commit this first. If I make the changes to 
master-mr2.properties now, it might break ptest complaining that the group 
mainProperties.${miniSparkOnYarn.only.query.files} doesn't exist.

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch, HIVE-17213.3.patch, HIVE-17213.4.patch, HIVE-17213.5.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks 
> properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17216) Additional qtests for HoS DPP

2017-08-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17216:

Attachment: HIVE-17216.1.patch

> Additional qtests for HoS DPP
> -
>
> Key: HIVE-17216
> URL: https://issues.apache.org/jira/browse/HIVE-17216
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17216.1.patch
>
>
> There are a few queries that we can add to the HoS DPP tests to increase 
> coverage. There are a few query patterns that the current tests don't cover.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17216) Additional qtests for HoS DPP

2017-08-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17216:

Status: Patch Available  (was: Open)

> Additional qtests for HoS DPP
> -
>
> Key: HIVE-17216
> URL: https://issues.apache.org/jira/browse/HIVE-17216
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17216.1.patch
>
>
> There are a few queries that we can add to the HoS DPP tests to increase 
> coverage. There are a few query patterns that the current tests don't cover.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-04 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17220:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Test failures are unrelated to this change and are happening in master already.

Committed patch to master. Thanks Gopal for the review!

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-17220.1.patch, HIVE-17220.2.patch, 
> HIVE-17220.3.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloom filter probes as a bottleneck for 
> some of the TPC-DS queries, resulting in L1 data-cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any level of 
> the cache, and the hash bits corresponding to a single key map to different, 
> widely spread segments of the bitset. This can result in K-1 memory accesses 
> (K being the number of hash functions) in the worst case for every key that 
> gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify the same. Following is the JMH perf 
> profile for bloom filter probing
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use cache efficient bloom filter for semijoin probing.
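
For reference, a minimal sketch of the cache-line-blocked technique (an illustration of the idea, not the committed patch): all k probe bits for a key are confined to one 64-byte block of the bitset, so a probe touches at most one cache line instead of up to k scattered ones.

{code:java}
/** Minimal sketch of a cache-line-blocked Bloom filter. */
public class BlockedBloomFilter {
  private static final int LONGS_PER_BLOCK = 8;   // 8 x 64 bits = 512 bits = one 64-byte line
  private final long[] bits;
  private final int numBlocks;
  private final int k;                            // number of hash functions

  public BlockedBloomFilter(int numBlocks, int k) {
    this.numBlocks = numBlocks;
    this.k = k;
    this.bits = new long[numBlocks * LONGS_PER_BLOCK];
  }

  public void add(long key) {
    setOrTest(key, true);
  }

  public boolean mightContain(long key) {
    return setOrTest(key, false);
  }

  private boolean setOrTest(long key, boolean set) {
    long h1 = mix(key);
    long h2 = mix(h1);
    // All k probe bits fall inside the same 512-bit block.
    int base = (int) Long.remainderUnsigned(h1, numBlocks) * LONGS_PER_BLOCK;
    for (int i = 1; i <= k; i++) {
      int bit = (int) ((h1 + i * h2) & 511);      // bit offset within the block
      long mask = 1L << (bit & 63);
      int word = base + (bit >>> 6);
      if (set) {
        bits[word] |= mask;
      } else if ((bits[word] & mask) == 0) {
        return false;                             // definitely not present
      }
    }
    return true;                                  // added, or possibly present
  }

  private static long mix(long x) {               // splitmix64-style finalizer
    x ^= x >>> 30; x *= 0xbf58476d1ce4e5b9L;
    x ^= x >>> 27; x *= 0x94d049bb133111ebL;
    return x ^ (x >>> 31);
  }
}
{code}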



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17181) HCatOutputFormat should expose complete output-schema (including partition-keys) for dynamic-partitioning MR jobs

2017-08-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114791#comment-16114791
 ] 

Thejas M Nair commented on HIVE-17181:
--

The change looks good to me. Can you also please add a unit test?


> HCatOutputFormat should expose complete output-schema (including 
> partition-keys) for dynamic-partitioning MR jobs
> -
>
> Key: HIVE-17181
> URL: https://issues.apache.org/jira/browse/HIVE-17181
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17181.1.patch, HIVE-17181.branch-2.patch
>
>
> Map/Reduce jobs that use HCatalog APIs to write to Hive tables using Dynamic 
> partitioning are expected to call the following API methods:
> # {{HCatOutputFormat.setOutput()}} to indicate which table/partitions to 
> write to. This call populates the {{OutputJobInfo}} with details fetched from 
> the Metastore.
> # {{HCatOutputFormat.setSchema()}} to indicate the output-schema for the data 
> being written.
> It is a common mistake to invoke {{HCatOutputFormat.setSchema()}} as follows:
> {code:java}
> HCatOutputFormat.setSchema(conf, HCatOutputFormat.getTableSchema(conf));
> {code}
> Unfortunately, {{getTableSchema()}} returns only the record-schema, not the 
> entire table's schema. We'll need a better API for use in M/R jobs to get the 
> complete table-schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17169) Avoid extra call to KeyProvider::getMetadata()

2017-08-04 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114767#comment-16114767
 ] 

Owen O'Malley commented on HIVE-17169:
--

+1

Although I note that in general encryption block size is not the same as the 
key length. I believe HDFS only currently supports AES128 and not AES256, so I 
don't think this is a big issue currently. Clearly Hadoop's CipherSuite should 
also include a method for key length. 

Block size: AES128 & AES256 = 128
Key size: AES128 = 128, AES256 = 256


> Avoid extra call to KeyProvider::getMetadata()
> --
>
> Key: HIVE-17169
> URL: https://issues.apache.org/jira/browse/HIVE-17169
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17169.1.patch
>
>
> Here's the code from {{Hadoop23Shims}}:
> {code:title=Hadoop23Shims.java|borderStyle=solid}
> @Override
> public int comparePathKeyStrength(Path path1, Path path2) throws 
> IOException {
>   EncryptionZone zone1, zone2;
>   zone1 = hdfsAdmin.getEncryptionZoneForPath(path1);
>   zone2 = hdfsAdmin.getEncryptionZoneForPath(path2);
>   if (zone1 == null && zone2 == null) {
> return 0;
>   } else if (zone1 == null) {
> return -1;
>   } else if (zone2 == null) {
> return 1;
>   }
>   return compareKeyStrength(zone1.getKeyName(), zone2.getKeyName());
> }
> private int compareKeyStrength(String keyname1, String keyname2) throws 
> IOException {
>   KeyProvider.Metadata meta1, meta2;
>   if (keyProvider == null) {
> throw new IOException("HDFS security key provider is not configured 
> on your server.");
>   }
>   meta1 = keyProvider.getMetadata(keyname1);
>   meta2 = keyProvider.getMetadata(keyname2);
>   if (meta1.getBitLength() < meta2.getBitLength()) {
> return -1;
>   } else if (meta1.getBitLength() == meta2.getBitLength()) {
> return 0;
>   } else {
> return 1;
>   }
> }
>   }
> {code}
> It turns out that {{EncryptionZone}} already has the cipher's bit-length 
> stored in a member variable. One shouldn't need an additional name-node call 
> ({{KeyProvider::getMetadata()}}) only to fetch it again.
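
A sketch of the shape of the change being suggested (an assumption for illustration only; note the caveat in the comment above that CipherSuite currently exposes the block size, not the key length, so this is only a proxy for strength):

{code:java}
// Sketch only: read the strength off the EncryptionZone itself instead of calling
// KeyProvider.getMetadata() again for each key name.
private int compareKeyStrength(EncryptionZone zone1, EncryptionZone zone2) {
  int size1 = zone1.getSuite().getAlgorithmBlockSize();
  int size2 = zone2.getSuite().getAlgorithmBlockSize();
  return Integer.compare(size1, size2);
}
{code}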



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17253) Adding SUMMARY statement to HPL/SQL

2017-08-04 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-17253:
--
Description: 
Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
result) similar to Python and R.

For each column output the data type, number of distinct values, non-NULL rows, 
mean, std, percentiles, min, max. Output additional stats for categorical 
columns. This helps perform quick and easy explanatory data analysis for SQL 
devs and business users.  http://hplsql.org/summary

  was:
Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
result) similar to Python and R.

For each column output the data type, number of distinct values, non-NULL rows, 
mean, std, percentiles, min, max. Output additional stats for categorical 
columns. This helps perform quick and easy explanatory data analysis for SQL 
devs and business users.   


> Adding SUMMARY statement to HPL/SQL
> ---
>
> Key: HIVE-17253
> URL: https://issues.apache.org/jira/browse/HIVE-17253
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>
> Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
> result) similar to Python and R.
> For each column output the data type, number of distinct values, non-NULL 
> rows, mean, std, percentiles, min, max. Output additional stats for 
> categorical columns. This helps perform quick and easy explanatory data 
> analysis for SQL devs and business users.  http://hplsql.org/summary



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17253) Adding SUMMARY statement to HPL/SQL

2017-08-04 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko reassigned HIVE-17253:
-


> Adding SUMMARY statement to HPL/SQL
> ---
>
> Key: HIVE-17253
> URL: https://issues.apache.org/jira/browse/HIVE-17253
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>
> Adding SUMMARY statement to HPL/SQL to describe a data set (table, query 
> result) similar to Python and R.
> For each column output the data type, number of distinct values, non-NULL 
> rows, mean, std, percentiles, min, max. Output additional stats for 
> categorical columns. This helps perform quick and easy explanatory data 
> analysis for SQL devs and business users.   



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16294) Support snapshot for truncate table

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114706#comment-16114706
 ] 

Hive QA commented on HIVE-16294:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880372/HIVE-16294.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11144 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=236)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6261/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6261/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6261/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880372 - PreCommit-HIVE-Build

> Support snapshot for truncate table
> ---
>
> Key: HIVE-16294
> URL: https://issues.apache.org/jira/browse/HIVE-16294
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Vihang Karajgaonkar
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16294.01.patch, HIVE-16294.02.patch, 
> HIVE-16294.03.patch, HIVE-16294.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (HIVE-17229) HiveMetastore HMSHandler locks during initialization, even though its static variable threadPool is not null

2017-08-04 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reopened HIVE-17229:
-

Sorry, [~yuan_zac], there has been a misunderstanding. This issue is not 
resolved yet. I have modified your patch so that it applies properly to master, 
and I'm reopening the issue to run tests.

> HiveMetastore HMSHandler locks during initialization, even though its static 
> variable threadPool is not null
> 
>
> Key: HIVE-17229
> URL: https://issues.apache.org/jira/browse/HIVE-17229
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
> Attachments: HIVE-17229.2.patch, HIVE-17229.patch
>
>
> A thread pool has been used to accelerate the add-partitions operation since 
> [HIVE-13901|https://issues.apache.org/jira/browse/HIVE-13901]. 
> However, HMSHandler acquires a lock during initialization every time, even 
> though its static variable threadPool is not null
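
For reference, a sketch of the double-checked initialization the description implies (an illustration of the pattern, not the attached patch): only the very first call pays for the lock, later calls see the already-built pool through the volatile read.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class PartitionAddPool {
  private static volatile ExecutorService threadPool;

  // Sketch only: synchronize just on the first call; subsequent calls skip the lock.
  public static ExecutorService get(int poolSize) {
    if (threadPool == null) {
      synchronized (PartitionAddPool.class) {
        if (threadPool == null) {
          threadPool = Executors.newFixedThreadPool(poolSize);
        }
      }
    }
    return threadPool;
  }
}
{code}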



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17229) HiveMetastore HMSHandler locks during initialization, even though its static variable threadPool is not null

2017-08-04 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17229:

Status: Patch Available  (was: Reopened)

> HiveMetastore HMSHandler locks during initialization, even though its static 
> variable threadPool is not null
> 
>
> Key: HIVE-17229
> URL: https://issues.apache.org/jira/browse/HIVE-17229
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
> Attachments: HIVE-17229.2.patch, HIVE-17229.patch
>
>
> A thread pool has been used to accelerate the add-partitions operation since 
> [HIVE-13901|https://issues.apache.org/jira/browse/HIVE-13901]. 
> However, HMSHandler acquires a lock during initialization every time, even 
> though its static variable threadPool is not null



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-04 Thread sarun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114609#comment-16114609
 ] 

sarun commented on HIVE-17115:
--

[~erik.fang] Can you please upload the test case?

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. 
> The Thrift client hangs there, with an exception such as the following in the 
> HiveMetaStore service's log:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
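
A sketch of the shape of a fix (an assumption for illustration, not the attached patch): also catch LinkageErrors such as NoClassDefFoundError when instantiating the SerDe, so the Thrift worker thread survives and the client receives a MetaException instead of hanging. The {{loadOrFail}} helper and its {{loader}} argument are hypothetical stand-ins for whatever actually creates the deserializer.

{code:java}
import org.apache.hadoop.hive.metastore.api.MetaException;

public final class SerDeGuard {
  /** Sketch only: 'loader' stands in for the code that instantiates the SerDe
   *  (e.g. the body of MetaStoreUtils.getDeserializer). */
  public static <T> T loadOrFail(java.util.concurrent.Callable<T> loader, String what)
      throws MetaException {
    try {
      return loader.call();
    } catch (Exception | LinkageError e) {
      // Convert missing-class errors into a client-visible failure.
      throw new MetaException("Could not load SerDe for " + what + ": " + e);
    }
  }
}
{code}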



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17240) function ACOS(2) and ASIN(2) should be null

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114603#comment-16114603
 ] 

Hive QA commented on HIVE-17240:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880352/HIVE-17240.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11144 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf2] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_udf2] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCompareCliDriver.testCliDriver[vectorized_math_funcs]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf2]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6260/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6260/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6260/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880352 - PreCommit-HIVE-Build

> function ACOS(2) and ASIN(2) should be null
> ---
>
> Key: HIVE-17240
> URL: https://issues.apache.org/jira/browse/HIVE-17240
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.1.1, 1.2.2, 2.2.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
> Attachments: HIVE-17240.1.patch, HIVE-17240.2.patch
>
>
> {{acos(2)}} should be NULL, the same as in MySQL:
> {code:sql}
> hive> desc function extended acos;
> OK
> acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise
> Example:
>   > SELECT acos(1) FROM src LIMIT 1;
>   0
>   > SELECT acos(2) FROM src LIMIT 1;
>   NULL
> Time taken: 0.009 seconds, Fetched: 6 row(s)
> hive> select acos(2);
> OK
> NaN
> Time taken: 0.437 seconds, Fetched: 1 row(s)
> {code}
> {code:sql}
> mysql>  select acos(2);
> +-+
> | acos(2) |
> +-+
> |NULL |
> +-+
> 1 row in set (0.00 sec)
> {code}
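
A sketch of the kind of change implied (an assumption for illustration, not the attached patch): the UDF returns NULL instead of NaN when the argument falls outside the domain [-1, 1].

{code:java}
import org.apache.hadoop.hive.serde2.io.DoubleWritable;

// Sketch only, modelled on the shape of Hive's math UDFs.
public class AcosSketch {
  private final DoubleWritable result = new DoubleWritable();

  public DoubleWritable evaluate(DoubleWritable a) {
    if (a == null) {
      return null;
    }
    double d = a.get();
    if (d < -1.0 || d > 1.0) {
      return null;            // acos is undefined here; return NULL rather than NaN
    }
    result.set(Math.acos(d));
    return result;
  }
}
{code}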



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization

2017-08-04 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114568#comment-16114568
 ] 

Owen O'Malley commented on HIVE-17235:
--

I think we should make a new type that looks like:

{code}
class Decimal64ColumnVector extends ColumnVector {
  long[] vector;
  int precision;
  int scale;
}
{code}

It will be extremely fast and provide a fast conduit to ORC. 

> Add ORC Decimal64 Serialization/Deserialization
> ---
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114533#comment-16114533
 ] 

Hive QA commented on HIVE-17148:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880173/HIVE-17148.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11145 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[innerjoin1] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[semijoin4] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[semijoin5] (batchId=15)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_2]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query24] 
(batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query8] (batchId=236)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_in] 
(batchId=128)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6259/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6259/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6259/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880173 - PreCommit-HIVE-Build

> Incorrect result for Hive join query with COALESCE in WHERE condition
> -
>
> Key: HIVE-17148
> URL: https://issues.apache.org/jira/browse/HIVE-17148
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.1
>Reporter: Vlad Gudikov
>Assignee: Vlad Gudikov
> Attachments: HIVE-17148.1.patch, HIVE-17148.patch
>
>
> The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo 
> enabled:
> STEPS TO REPRODUCE:
> {code}
> Step 1: Create a table ct1
> create table ct1 (a1 string,b1 string);
> Step 2: Create a table ct2
> create table ct2 (a2 string);
> Step 3 : Insert following data into table ct1
> insert into table ct1 (a1) values ('1');
> Step 4 : Insert following data into table ct2
> insert into table ct2 (a2) values ('1');
> Step 5 : Execute the following query 
> select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2;
> {code}
> ACTUAL RESULT:
> {code}
> The query returns nothing;
> {code}
> EXPECTED RESULT:
> {code}
> 1   NULL1
> {code}
> The issue seems to be caused by an incorrect query plan. In the plan we can 
> see:
> predicate:(a1 is not null and b1 is not null)
> which does not look correct. As a result, it filters out all rows in which 
> any column mentioned in the COALESCE has a null value.
> Please find the query plan below:
> {code}
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1
>   File Output Operator [FS_10]
> Map Join Operator [MAPJOIN_15] (rows=1 width=4)
>   
> Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE]
>   BROADCAST [RS_7]
> PartitionCols:_col0
> Select Operator [SEL_5] (rows=1 width=1)
>   Ou

[jira] [Commented] (HIVE-17246) Add having related blobstore query test

2017-08-04 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114454#comment-16114454
 ] 

Sergio Peña commented on HIVE-17246:


The patch looks good [~wutak...@amazon.com]
+1

> Add having related blobstore query test
> ---
>
> Key: HIVE-17246
> URL: https://issues.apache.org/jira/browse/HIVE-17246
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.1.1
>Reporter: Taklon Stephen Wu
>Assignee: Taklon Stephen Wu
> Attachments: HIVE-17246-2.patch, HIVE-17246.patch
>
>
> This patch introduces the following regression test into the hive-blobstore 
> qtest module:
> * having.q -> Test the HAVING clause for aggregation functions such as COUNT(), 
> MAX(), and MIN()
> ** test HAVING with the aggregation as an alias
> ** test HAVING with the aggregation as a field
> ** test HAVING with an aggregation function that does not appear in the field list



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114447#comment-16114447
 ] 

Hive QA commented on HIVE-16895:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880359/HIVE-16895.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11144 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentStatements (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6258/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6258/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6258/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880359 - PreCommit-HIVE-Build

>  Multi-threaded execution of bootstrap dump of partitions
> -
>
> Key: HIVE-16895
> URL: https://issues.apache.org/jira/browse/HIVE-16895
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-16895.1.patch, HIVE-16895.2.patch
>
>
> To allow faster execution of the bootstrap dump phase, we dump multiple partitions 
> from the same table simultaneously. 
> Even though dumping functions is not going to be a blocker, moving to 
> similar execution modes for all metastore objects will make the code more 
> coherent. 
> Bootstrap dump at the db level does:
> * bootstrap of all tables
> ** bootstrap of all partitions in a table (scope of the current jira), as sketched below 
> * bootstrap of all functions 
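
A sketch of the execution mode this describes (generic Java, not the actual repl dump code; {{dumpPartition}} is a hypothetical stand-in for the per-partition dump work):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class ParallelPartitionDump {
  // Sketch only: dump the partitions of one table concurrently on a bounded pool.
  public static void dumpAll(List<String> partitions, int nThreads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (String partition : partitions) {
        futures.add(pool.submit(() -> dumpPartition(partition)));
      }
      for (Future<?> f : futures) {
        f.get();                      // propagate the first failure, if any
      }
    } finally {
      pool.shutdown();
    }
  }

  private static void dumpPartition(String partition) {
    // hypothetical: write this partition's metadata and data to the dump directory
  }
}
{code}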



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16758) Better Select Number of Replications

2017-08-04 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114388#comment-16114388
 ] 

BELUGA BEHR commented on HIVE-16758:


[~csun] ? :)

> Better Select Number of Replications
> 
>
> Key: HIVE-16758
> URL: https://issues.apache.org/jira/browse/HIVE-16758
> Project: Hive
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16758.1.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.java}}
> We should be smarter about how we pick a replication number.  We should add a 
> new configuration equivalent to {{mapreduce.client.submit.file.replication}}. 
>  This value should be around the square root of the number of nodes and not 
> hard-coded in the code.
> {code}
> public static final String DFS_REPLICATION_MAX = "dfs.replication.max";
> private int minReplication = 10;
>   @Override
>   protected void initializeOp(Configuration hconf) throws HiveException {
> ...
> int dfsMaxReplication = hconf.getInt(DFS_REPLICATION_MAX, minReplication);
> // minReplication value should not cross the value of dfs.replication.max
> minReplication = Math.min(minReplication, dfsMaxReplication);
>   }
> {code}
> https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
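
For illustration, a sketch of the proposed selection logic (the new configuration property name here is hypothetical; only dfs.replication.max is an existing Hadoop setting):

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class ReplicationChooser {
  // Sketch only: pick a replication factor near sqrt(cluster size), let a new
  // (hypothetical) property override it, and never exceed dfs.replication.max.
  public static short chooseReplication(Configuration conf, int clusterSize) {
    int byClusterSize = (int) Math.ceil(Math.sqrt(Math.max(1, clusterSize)));
    int requested = conf.getInt("hive.spark.hashtable.file.replication", byClusterSize);
    int dfsMax = conf.getInt("dfs.replication.max", 512);
    return (short) Math.max(1, Math.min(requested, dfsMax));
  }
}
{code}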



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-08-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114359#comment-16114359
 ] 

Hive QA commented on HIVE-16896:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880360/HIVE-16896.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6257/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6257/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6257/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-04 13:33:05.249
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6257/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-04 13:33:05.251
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   0bf8314..ceec583  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 0bf8314 Add License Header for HIVE-17144
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/spark_union_merge.q
Removing ql/src/test/results/clientpositive/spark/spark_union_merge.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at ceec583 HIVE-17208: Repl dump should pass in db/table 
information to authorization API (Daniel Dai, reviewed by Thejas Nair)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-04 13:33:11.444
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: No such 
file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java:
 No such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/WarehouseInstance.java:
 No such file or directory
error: a/ql/if/queryplan.thrift: No such file or directory
error: a/ql/src/gen/thrift/gen-cpp/queryplan_types.cpp: No such file or 
directory
error: a/ql/src/gen/thrift/gen-cpp/queryplan_types.h: No such file or directory
error: 
a/ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java:
 No such file or directory
error: a/ql/src/gen/thrift/gen-php/Types.php: No such file or directory
error: a/ql/src/gen/thrift/gen-py/queryplan/ttypes.py: No such file or directory
error: a/ql/src/gen/thrift/gen-rb/queryplan_types.rb: No such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/Context.java: No such file or 
directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java: No such 
file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java: No 
such file or directory
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java: No 
such file or directory
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java: 
No such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/ImportTableDesc.java: No 
such file or directory
error: a/ql/src/test/results/clientnegative/repl_load_requires_admin.q.out: No 
such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880360 - PreCommit-HIVE-Build

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> ---

[jira] [Commented] (HIVE-14352) Beeline can't run sub-second queries in HTTP mode

2017-08-04 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114338#comment-16114338
 ] 

Barna Zsombor Klara commented on HIVE-14352:


Hi [~gopalv],
I had a look at this Jira (better late than never, I guess...) but I can't seem 
to reproduce it. For reference, this query was run from Beeline 
connected to HS2 over HTTP and has an execution time of less than 1s:
{code}
0: jdbc:hive2://localhost:10001/default (default)> select * from btest;
select * from btest;
DEBUG : Acquired the compile lock.
INFO  : Compiling 
command(queryId=zsomborklara_20170804210130_6a35b6b4-a27d-4fc8-bf1b-d85d144358cc):
 select * from btest
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:btest.col1, type:string, comment:null), 
FieldSchema(name:btest.col2, type:int, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=zsomborklara_20170804210130_6a35b6b4-a27d-4fc8-bf1b-d85d144358cc);
 Time taken: 0.068 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing 
command(queryId=zsomborklara_20170804210130_6a35b6b4-a27d-4fc8-bf1b-d85d144358cc):
 select * from btest
INFO  : PREHOOK: query: select * from btest
INFO  : PREHOOK: type: QUERY
INFO  : PREHOOK: Input: default@btest
INFO  : PREHOOK: Output: 
file:/var/folders/mf/zwgh3vt55q7b7bz5bl147_s0gp/T/zsomborklara/7f168485-3645-4349-9be4-9b7cd791e573/hive_2017-08-04_21-01-30_116_341068566559606460-2/-mr-10001
INFO  : POSTHOOK: query: select * from btest
INFO  : POSTHOOK: type: QUERY
INFO  : POSTHOOK: Input: default@btest
INFO  : POSTHOOK: Output: 
file:/var/folders/mf/zwgh3vt55q7b7bz5bl147_s0gp/T/zsomborklara/7f168485-3645-4349-9be4-9b7cd791e573/hive_2017-08-04_21-01-30_116_341068566559606460-2/-mr-10001
INFO  : Completed executing 
command(queryId=zsomborklara_20170804210130_6a35b6b4-a27d-4fc8-bf1b-d85d144358cc);
 Time taken: 0.003 seconds
INFO  : OK
DEBUG : Shutting down query select * from btest
+-------------+-------------+
| btest.col1  | btest.col2  |
+-------------+-------------+
| aaa         | 1           |
+-------------+-------------+
1 row selected (0.157 seconds)
{code}

Looking at the code: while it's true that we have a Thread.sleep(1000L), the 
log thread is interrupted if the query takes less than 1 second.
{code}
InPlaceUpdateStream.EventNotifier eventNotifier =
    new InPlaceUpdateStream.EventNotifier();
logThread = new Thread(createLogRunnable(stmnt, eventNotifier));
logThread.setDaemon(true);
logThread.start();
if (stmnt instanceof HiveStatement) {
  HiveStatement hiveStatement = (HiveStatement) stmnt;
  hiveStatement.setInPlaceUpdateStream(
      new BeelineInPlaceUpdateStream(
          beeLine.getErrorStream(),
          eventNotifier
      ));
}
hasResults = stmnt.execute(sql);
logThread.interrupt();
{code}

> Beeline can't run sub-second queries in HTTP mode
> -
>
> Key: HIVE-14352
> URL: https://issues.apache.org/jira/browse/HIVE-14352
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> Even a 12ms query execution takes 1000+ ms in Beeline.
> {code}
>   private static final int DEFAULT_QUERY_PROGRESS_INTERVAL = 1000;
> ...
>   while (hiveStatement.hasMoreLogs()) {
>   Thread.sleep(DEFAULT_QUERY_PROGRESS_INTERVAL);
>   }
> {code}
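
For illustration, here is a minimal, self-contained sketch (not Beeline's actual code) of the interrupt behaviour described in the comment above: interrupting the sleeping log thread wakes it immediately, so a sub-second query does not have to wait out the full 1000 ms interval.

{code:java}
// Standalone sketch: interrupting a thread that sleeps on the progress
// interval makes it return immediately instead of after the full second.
public class InterruptSleepSketch {
  public static void main(String[] args) throws InterruptedException {
    Thread logThread = new Thread(() -> {
      try {
        Thread.sleep(1000L);   // stands in for DEFAULT_QUERY_PROGRESS_INTERVAL
        System.out.println("slept the full interval");
      } catch (InterruptedException e) {
        System.out.println("interrupted early; log loop exits");
      }
    });
    logThread.setDaemon(true);
    logThread.start();

    Thread.sleep(50L);         // simulate a fast, sub-second query
    logThread.interrupt();     // same pattern as the snippet above
    logThread.join();
  }
}
{code}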



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17251) Remove usage of org.apache.pig.ResourceStatistics#setmBytes method in HCatLoader

2017-08-04 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned HIVE-17251:
-

Assignee: Adam Szita

> Remove usage of org.apache.pig.ResourceStatistics#setmBytes method in 
> HCatLoader
> 
>
> Key: HIVE-17251
> URL: https://issues.apache.org/jira/browse/HIVE-17251
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Reporter: Nandor Kollar
>Assignee: Adam Szita
>Priority: Minor
>
> org.apache.pig.ResourceStatistics#setmBytes is marked as deprecated, and is 
> going to be removed from Pig. Is it possible to use the proper 
> replacement method (ResourceStatistics#setSizeInBytes) instead?
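
For reference, a minimal sketch of what the swap could look like (illustrative only, not the actual HCatLoader patch):

{code:java}
import org.apache.pig.ResourceStatistics;

// Illustrative helper, not HCatLoader itself: report the size via the
// byte-based setter instead of the deprecated megabyte-based one.
public class StatsSketch {
  static ResourceStatistics buildStats(long totalSizeInBytes) {
    ResourceStatistics stats = new ResourceStatistics();
    // Deprecated and slated for removal from Pig:
    // stats.setmBytes(totalSizeInBytes / (1024L * 1024L));
    stats.setSizeInBytes(totalSizeInBytes);
    return stats;
  }
}
{code}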



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16294) Support snapshot for truncate table

2017-08-04 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114161#comment-16114161
 ] 

Peter Vary commented on HIVE-16294:
---

+1 pending tests

> Support snapshot for truncate table
> ---
>
> Key: HIVE-16294
> URL: https://issues.apache.org/jira/browse/HIVE-16294
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Vihang Karajgaonkar
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16294.01.patch, HIVE-16294.02.patch, 
> HIVE-16294.03.patch, HIVE-16294.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16294) Support snapshot for truncate table

2017-08-04 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16294:
---
Attachment: HIVE-16294.04.patch

Added Hadoop jira to the comment.

> Support snapshot for truncate table
> ---
>
> Key: HIVE-16294
> URL: https://issues.apache.org/jira/browse/HIVE-16294
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Vihang Karajgaonkar
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16294.01.patch, HIVE-16294.02.patch, 
> HIVE-16294.03.patch, HIVE-16294.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14786) Beeline displays binary column data as string instead of byte array

2017-08-04 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114098#comment-16114098
 ] 

Barna Zsombor Klara commented on HIVE-14786:


Failures should not be related.

> Beeline displays binary column data as string instead of byte array
> ---
>
> Key: HIVE-14786
> URL: https://issues.apache.org/jira/browse/HIVE-14786
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: Ram Mettu
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Attachments: HIVE-14786.01.patch, HIVE-14786.02.patch, 
> HIVE-14786.03.patch
>
>
> In Beeline, doing a SELECT binaryColName FROM tableName; results in the data 
> being displayed as a string (which looks corrupted due to unprintable chars). 
> Instead, Beeline should display binary columns as a byte array.
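
For illustration, a generic JDBC-level sketch (not Beeline's internals) of rendering a binary column as a byte array rather than a raw string:

{code:java}
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Arrays;

// Binary columns are fetched as byte[] and printed as an array literal
// instead of being coerced into a String with unprintable characters.
public class BinaryColumnRendering {
  static String render(ResultSet rs, int col) throws SQLException {
    if (rs.getMetaData().getColumnType(col) == Types.BINARY) {
      byte[] bytes = rs.getBytes(col);
      return bytes == null ? "NULL" : Arrays.toString(bytes);  // e.g. [104, 105]
    }
    return rs.getString(col);
  }
}
{code}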



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17167) Create metastore specific configuration tool

2017-08-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114076#comment-16114076
 ] 

Lefty Leverenz commented on HIVE-17167:
---

bq. Do we have a "wait and see" label for docs?

Agreed, this shouldn't be documented right now.  But a TODOC label is a 
searchable reminder, not necessarily a call to action.  

Is the metastore split likely to happen before or after 3.0.0 is released?  I'd 
use a TODOC3.0 label for now but you can create a new one if you prefer.

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 3.0.0
>
> Attachments: HIVE-17167.2.patch, HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf as that is in hive common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.
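
A rough sketch of that shape (illustrative names only, not the actual MetastoreConf API):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Enum of keys plus static accessors over Hadoop's Configuration; since
// HiveConf extends Configuration, a HiveConf object can be passed in as-is.
public final class MetastoreConfSketch {
  public enum ConfVars {
    THRIFT_URIS("metastore.thrift.uris", ""),
    CONNECTION_POOL_SIZE("metastore.connection.pool.size", "10");

    final String key;
    final String defaultVal;
    ConfVars(String key, String defaultVal) {
      this.key = key;
      this.defaultVal = defaultVal;
    }
  }

  public static String getVar(Configuration conf, ConfVars var) {
    return conf.get(var.key, var.defaultVal);
  }

  public static void setVar(Configuration conf, ConfVars var, String value) {
    conf.set(var.key, value);
  }

  private MetastoreConfSketch() {}
}
{code}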



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17229) HiveMetastore HMSHandler locks during initialization, even though its static variable threadPool is not null

2017-08-04 Thread Zac Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated HIVE-17229:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> HiveMetastore HMSHandler locks during initialization, even though its static 
> variable threadPool is not null
> 
>
> Key: HIVE-17229
> URL: https://issues.apache.org/jira/browse/HIVE-17229
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
> Attachments: HIVE-17229.2.patch, HIVE-17229.patch
>
>
> A thread pool has been used to speed up the add-partitions operation since 
> [HIVE-13901|https://issues.apache.org/jira/browse/HIVE-13901]. 
> However, HMSHandler takes a lock during every initialization, even 
> though its static variable threadPool is not null.
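
A minimal sketch of the usual fix (illustrative only, not the actual HMSHandler code): check the static field before synchronizing, so the lock is only contended until the pool has been created once.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Double-checked lazy init: after the first initialization, callers hit the
// volatile read on the fast path and never take the lock again.
public class HandlerInitSketch {
  private static volatile ExecutorService threadPool;

  static void initThreadPoolIfNeeded(int poolSize) {
    if (threadPool == null) {                       // fast path, no lock
      synchronized (HandlerInitSketch.class) {
        if (threadPool == null) {                   // re-check under the lock
          threadPool = Executors.newFixedThreadPool(poolSize);
        }
      }
    }
  }
}
{code}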



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17229) HiveMetastore HMSHandler locks during initialization, even though its static variable threadPool is not null

2017-08-04 Thread Zac Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114055#comment-16114055
 ] 

Zac Zhou commented on HIVE-17229:
-

Thanks a lot, Mithun

> HiveMetastore HMSHandler locks during initialization, even though its static 
> variable threadPool is not null
> 
>
> Key: HIVE-17229
> URL: https://issues.apache.org/jira/browse/HIVE-17229
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
> Attachments: HIVE-17229.2.patch, HIVE-17229.patch
>
>
> A thread pool has been used to speed up the add-partitions operation since 
> [HIVE-13901|https://issues.apache.org/jira/browse/HIVE-13901]. 
> However, HMSHandler takes a lock during every initialization, even 
> though its static variable threadPool is not null.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-08-04 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114031#comment-16114031
 ] 

anishek commented on HIVE-17144:


Thanks everyone for helping on this one!

> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17144.1.patch
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use a filesystem copy and not distcp 
> to do the job.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)