[jira] [Comment Edited] (HIVE-17771) Implement create and show resource plan.

2017-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207051#comment-16207051
 ] 

Thejas M Nair edited comment on HIVE-17771 at 10/17/17 5:52 AM:


bq. I see that admin commands like grant role, etc. don't have any privileges 
associated with them in HiveOperation.java. How does one control access to that 
stuff?
The HiveAuthorizer interface has methods for those operations, which authorization 
plugins need to implement. Since the plugin itself is invoked for those methods, no 
additional checkPrivileges call is made to check with the authorizer plugin for them.
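
As an illustration of this flow, here is a minimal Java sketch, assuming simplified 
method names rather than the exact HiveAuthorizer signatures; it is not Hive code:

{code}
// Hypothetical, simplified authorizer plugin sketch: admin commands such as
// GRANT ROLE are routed to a dedicated plugin method, so no separate
// checkPrivileges() call is needed for them.
public class SketchAuthorizer /* stands in for a HiveAuthorizer implementation */ {

  // Invoked directly for GRANT ROLE; the plugin decides here whether the
  // current user may grant the role.
  public void grantRole(String role, String granteeUser, String currentUser) {
    if (!isAdmin(currentUser)) {
      throw new RuntimeException("Permission denied: " + currentUser
          + " cannot grant role " + role);
    }
    // ... persist the role grant ...
  }

  // Regular queries go through a privilege check instead.
  public void checkPrivileges(String operation, String currentUser) {
    // ... verify the privileges required for the operation ...
  }

  private boolean isAdmin(String user) {
    return "admin".equals(user);  // placeholder policy
  }
}
{code}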



was (Author: thejas):
bq. I see that admin commands like grant role, etc. don't have any privileges 
associated with them in HiveOperation.java. How does one control access to that 
stuff?
The HiveAuthorizer interface has methods for those operations. Since the authorizer 
plugin implements these operations, no additional checkPrivileges call is made to 
check with the authorizer plugin.


> Implement create and show resource plan.
> 
>
> Key: HIVE-17771
> URL: https://issues.apache.org/jira/browse/HIVE-17771
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17771.01.patch
>
>
> Please see the parent jira about LLAP workload management.
> This jira implements the CREATE RESOURCE PLAN and SHOW RESOURCE PLAN commands in 
> Hive to configure resource plans for LLAP workloads.
> The following commands are proposed as part of this jira:
> CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism;
> SHOW RESOURCE PLAN;
> It will be followed up with more jiras to add pools, triggers, and copying of 
> resource plans, as well as DROP commands for each of them.
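
For illustration, a minimal JDBC sketch (not part of the patch) that submits the 
proposed statements through HiveServer2; the connection URL and plan name are 
placeholders:

{code}
// Hypothetical usage sketch of the proposed resource-plan DDL over the
// standard HiveServer2 JDBC driver. URL and plan name are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ResourcePlanExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // Create a plan that allows up to 4 concurrent queries.
      stmt.execute("CREATE RESOURCE PLAN etl_plan WITH QUERY_PARALLELISM 4");
      // List the configured resource plans.
      try (ResultSet rs = stmt.executeQuery("SHOW RESOURCE PLAN")) {
        while (rs.next()) {
          System.out.println(rs.getString(1));
        }
      }
    }
  }
}
{code}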



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17771) Implement create and show resource plan.

2017-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207051#comment-16207051
 ] 

Thejas M Nair commented on HIVE-17771:
--

bq. I see that admin commands like grant role, etc. don't have any privileges 
associated with them in HiveOperation.java. How does one control access to that 
stuff?
The HiveAuthorizer interface has methods for those operations. Since the authorizer 
plugin implements these operations, no additional checkPrivileges call is made to 
check with the authorizer plugin.


> Implement create and show resource plan.
> 
>
> Key: HIVE-17771
> URL: https://issues.apache.org/jira/browse/HIVE-17771
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17771.01.patch
>
>
> Please see the parent jira about LLAP workload management.
> This jira implements the CREATE RESOURCE PLAN and SHOW RESOURCE PLAN commands in 
> Hive to configure resource plans for LLAP workloads.
> The following commands are proposed as part of this jira:
> CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism;
> SHOW RESOURCE PLAN;
> It will be followed up with more jiras to add pools, triggers, and copying of 
> resource plans, as well as DROP commands for each of them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207048#comment-16207048
 ] 

Thejas M Nair commented on HIVE-17825:
--

+1

> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.
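
For context, a minimal Java sketch of the general fix pattern (not the actual 
patch): read the file list with try-with-resources so the underlying stream and 
DataNode socket are always closed. The path handling below is a placeholder.

{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileListReader {
  // Reads the _files list from HDFS; the reader (and the FSDataInputStream it
  // wraps) is closed even on errors, so no CLOSE_WAIT sockets pile up.
  public static List<String> readFileList(Configuration conf, Path filesList)
      throws IOException {
    List<String> sourceFiles = new ArrayList<>();
    FileSystem fs = filesList.getFileSystem(conf);
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(fs.open(filesList), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        sourceFiles.add(line);
      }
    }
    return sourceFiles;
  }
}
{code}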



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207040#comment-16207040
 ] 

anishek commented on HIVE-17825:


[~thejas]/[~daijy] can you please review.

> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17825:
---
Status: Patch Available  (was: Open)

> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17825:
--
Labels: pull-request-available  (was: )

> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207039#comment-16207039
 ] 

ASF GitHub Bot commented on HIVE-17825:
---

GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/262

HIVE-17825: Socket not closed when trying to read files to copy over in 
replication from metadata



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive HIVE-17825

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit ade76e64ec6180eccf51b12173b73be4a9ca8370
Author: Anishek Agarwal 
Date:   2017-10-17T05:23:35Z

HIVE-17825: Socket not closed when trying to read files to copy over in 
replication from metadata




> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17825:
---
Attachment: HIVE-17825.0.patch

> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207028#comment-16207028
 ] 

Hive QA commented on HIVE-15104:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892511/HIVE-15104.7.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7341/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7341/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7341/

Messages:
{noformat}
 This message was trimmed, see log for full details 
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7341/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-17 05:22:03.427
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 8c3f0e4 HIVE-17815: prevent OOM with Atlas Hive hook (Anishek 
Agarwal reviewed by Thejas Nair)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 8c3f0e4 HIVE-17815: prevent OOM with Atlas Hive hook (Anishek 
Agarwal reviewed by Thejas Nair)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-17 05:22:03.911
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
patching file hive-kryo-registrator/pom.xml
patching file 
hive-kryo-registrator/src/main/java/org/apache/hive/spark/HiveKryoRegistrator.java
patching file itests/src/test/resources/testconfiguration.properties
patching file packaging/pom.xml
patching file pom.xml
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
patching file ql/src/test/queries/clientpositive/spark_opt_shuffle_serde.q
patching file 
ql/src/test/results/clientpositive/spark/spark_opt_shuffle_serde.q.out
patching file 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java
patching file 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: protoc version: 250, detected platform: linux/amd64
protoc-jar: executing: [/tmp/protoc9130095883787762036.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
DataNucleus Enhancer (version 4.1.17) for API "JDO"
DataNucleus Enhancer : Classpath
>>  /usr/share/maven/boot/plexus-classworlds-2.x.jar
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTable
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MConstraint
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MSerDeInfo
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MOrder
ENHANCED (Persistable) : 

[jira] [Updated] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2017-10-16 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17825:
---
Summary: Socket not closed when trying to read files to copy over in 
replication from metadata  (was: Connection Leak when trying to read files to 
copy over in replication from metadata)

> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
> Fix For: 3.0.0
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207027#comment-16207027
 ] 

Hive QA commented on HIVE-17433:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892510/HIVE-17433.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7340/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7340/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-10-17 05:21:16.710
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7340/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-17 05:21:16.712
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   8fea117..8c3f0e4  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 8fea117 HIVE-17371 : Move tokenstores to metastore module 
(Vihang Karajgaonkar, reviewed by Alan Gates, Thejas M Nair)
+ git clean -f -d
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 8c3f0e4 HIVE-17815: prevent OOM with Atlas Hive hook (Anishek 
Agarwal reviewed by Thejas Nair)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-17 05:21:20.549
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/test/results/clientpositive/llap/vector_groupby_grouping_id2.q.out:612
error: 
ql/src/test/results/clientpositive/llap/vector_groupby_grouping_id2.q.out: 
patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892510 - PreCommit-HIVE-Build

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long, which we are calling Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimizations where query benchmarks really show they help.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals in Hive for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison, 
> addition, and later perhaps a few GroupBy aggregations like sum, avg, min, 
> max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> 
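
For illustration, a small standalone Java sketch of the scaled-long idea behind 
Decimal64 (a decimal with precision <= 18 stored as an unscaled long plus a 
scale); the class and method names are made up and are not the 
Decimal64ColumnVector API:

{code}
public class Decimal64Sketch {
  // 123.45 with scale 2 is stored as the unscaled long 12345.
  static long toDecimal64(java.math.BigDecimal value, int scale) {
    return value.setScale(scale).unscaledValue().longValueExact();
  }

  // Addition of two values with the same scale is plain long addition,
  // which is what makes vectorized Decimal64 arithmetic cheap.
  static long add(long a, long b) {
    return a + b;
  }

  static java.math.BigDecimal fromDecimal64(long unscaled, int scale) {
    return java.math.BigDecimal.valueOf(unscaled, scale);
  }

  public static void main(String[] args) {
    long a = toDecimal64(new java.math.BigDecimal("123.45"), 2);  // 12345
    long b = toDecimal64(new java.math.BigDecimal("0.55"), 2);    // 55
    System.out.println(fromDecimal64(add(a, b), 2));              // 124.00
  }
}
{code}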

[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207024#comment-16207024
 ] 

Hive QA commented on HIVE-8937:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892509/HIVE-8937.001.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11242 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7339/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7339/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7339/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892509 - PreCommit-HIVE-Build

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> The hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a single regex, not comma-separated 
> regexes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17825) Connection Leak when trying to read files to copy over in replication from metadata

2017-10-16 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-17825:
--


> Connection Leak when trying to read files to copy over in replication from 
> metadata
> ---
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
> Fix For: 3.0.0
>
>
> For replication we create a _files file in HDFS which lists the source files to be 
> copied over for a table/partition. _files is read in ReplCopyTask to determine 
> which files to copy. The file operations on _files are not handled correctly and 
> we leave the streams open, which leads to a lot of CLOSE_WAIT connections to the 
> source DataNodes from HS2 on the replica cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17815) prevent OOM with Atlas Hive hook

2017-10-16 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17815:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> prevent OOM with Atlas Hive hook 
> -
>
> Key: HIVE-17815
> URL: https://issues.apache.org/jira/browse/HIVE-17815
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17815.0.patch, HIVE-17815.1.patch
>
>
> As part of HIVE-17814 we are going to handle the issue with respect to Hive as 
> well as the post-execution hook APIs. However, for Atlas, which is a commonly 
> used Hive post-execution hook, we want to prevent additional memory usage. Atlas 
> currently does not handle or work on replication queries, so overloading the 
> hookContext with TaskRunner objects just uses a lot of memory. The same should 
> be true for other execution hooks as well, since replication is new 
> functionality.
> This task is to reduce that memory usage for replication-related queries.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17815) prevent OOM with Atlas Hive hook

2017-10-16 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207016#comment-16207016
 ] 

anishek commented on HIVE-17815:


Committed to master. [~thejas], thanks for the review!

> prevent OOM with Atlas Hive hook 
> -
>
> Key: HIVE-17815
> URL: https://issues.apache.org/jira/browse/HIVE-17815
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17815.0.patch, HIVE-17815.1.patch
>
>
> As part of HIVE-17814 we are going to handle the issue with respect to Hive as 
> well as the post-execution hook APIs. However, for Atlas, which is a commonly 
> used Hive post-execution hook, we want to prevent additional memory usage. Atlas 
> currently does not handle or work on replication queries, so overloading the 
> hookContext with TaskRunner objects just uses a lot of memory. The same should 
> be true for other execution hooks as well, since replication is new 
> functionality.
> This task is to reduce that memory usage for replication-related queries.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207015#comment-16207015
 ] 

Thejas M Nair commented on HIVE-8937:
-

[~ajisakaa]
I guess it would be more accurate to say that 
hive.security.authorization.sqlstd.confwhitelist.append is a second Java regex that 
is matched in addition to the regex in 
"hive.security.authorization.sqlstd.confwhitelist".
I hope that conveys that users don't need to include a leading 
"|" in the value.

"A Java regex, to be used in addition to regex set in 
hive.security.authorization.sqlstd.confwhitelist. Using this regex instead of 
updating the original regex means that you can append to the default set by SQL 
standard authorization instead of replacing it entirely."


> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> The hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a single regex, not comma-separated 
> regexes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17815) prevent OOM with Atlas Hive hook

2017-10-16 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17815:
---
Attachment: HIVE-17815.1.patch

with comment 

> prevent OOM with Atlas Hive hook 
> -
>
> Key: HIVE-17815
> URL: https://issues.apache.org/jira/browse/HIVE-17815
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17815.0.patch, HIVE-17815.1.patch
>
>
> As part of HIVE-17814 we are going to handle the issue with respect to Hive as 
> well as the post-execution hook APIs. However, for Atlas, which is a commonly 
> used Hive post-execution hook, we want to prevent additional memory usage. Atlas 
> currently does not handle or work on replication queries, so overloading the 
> hookContext with TaskRunner objects just uses a lot of memory. The same should 
> be true for other execution hooks as well, since replication is new 
> functionality.
> This task is to reduce that memory usage for replication-related queries.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table definit

2017-10-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HIVE-12408:
-
Status: Patch Available  (was: Open)

> SQLStdAuthorizer expects external table creator to be owner of directory, 
> does not respect rwx group permission. Only one user could ever create an 
> external table definition to dir!
> -
>
> Key: HIVE-12408
> URL: https://issues.apache.org/jira/browse/HIVE-12408
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security, SQLStandardAuthorization
>Affects Versions: 0.14.0
> Environment: HDP 2.2 + Kerberos
>Reporter: Hari Sekhon
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: HIVE-12408.001.patch
>
>
> When trying to create an external table via beeline in Hive using the 
> SQLStdAuthorizer it expects the table creator to be the owner of the 
> directory path and ignores the group rwx permission that is granted to the 
> user.
> {code}Error: Error while compiling statement: FAILED: 
> HiveAccessControlException Permission denied: Principal [name=hari, 
> type=USER] does not have following privileges for operation CREATETABLE 
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, 
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to 
> create external table definitions to a given location. For example this is a 
> flume landing directory with json data, and the /etl tree is owned by the 
> flume user. Even chowning the tree to another user would still break access 
> to other users who are able to read the directory in hdfs but would still be 
> unable to create external tables on top of it.
> This looks like a remnant of the owner only access model in SQLStdAuth and is 
> a separate issue to HIVE-11864 / HIVE-12324.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table definit

2017-10-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HIVE-12408:
-
Attachment: HIVE-12408.001.patch

Attaching a patch.

> SQLStdAuthorizer expects external table creator to be owner of directory, 
> does not respect rwx group permission. Only one user could ever create an 
> external table definition to dir!
> -
>
> Key: HIVE-12408
> URL: https://issues.apache.org/jira/browse/HIVE-12408
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security, SQLStandardAuthorization
>Affects Versions: 0.14.0
> Environment: HDP 2.2 + Kerberos
>Reporter: Hari Sekhon
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: HIVE-12408.001.patch
>
>
> When trying to create an external table via beeline in Hive using the 
> SQLStdAuthorizer it expects the table creator to be the owner of the 
> directory path and ignores the group rwx permission that is granted to the 
> user.
> {code}Error: Error while compiling statement: FAILED: 
> HiveAccessControlException Permission denied: Principal [name=hari, 
> type=USER] does not have following privileges for operation CREATETABLE 
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, 
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to 
> create external table definitions to a given location. For example this is a 
> flume landing directory with json data, and the /etl tree is owned by the 
> flume user. Even chowning the tree to another user would still break access 
> to other users who are able to read the directory in hdfs but would still be 
> unable to create external tables on top of it.
> This looks like a remnant of the owner only access model in SQLStdAuth and is 
> a separate issue to HIVE-11864 / HIVE-12324.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17821) TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if partName=null and table is partitioned

2017-10-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206985#comment-16206985
 ] 

Lefty Leverenz commented on HIVE-17821:
---

Eugene, in the Summary is "now" a typo for "not"?  (Just a guess, since this is 
Greek to me.)

> TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if 
> partName=null and table is partitioned
> ---
>
> Key: HIVE-17821
> URL: https://issues.apache.org/jira/browse/HIVE-17821
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> LM may acquire read locks on the table when writing a partition.
> There is no need to make an entry for the table if we know it's partitioned 
> since any I/U/D must affect a partition (or a set of them).
> Pass isPartitioned() in LockComponent/LockRequest, or look it up in TxnHandler.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17806) Create directory for metrics file if it doesn't exist

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206977#comment-16206977
 ] 

Hive QA commented on HIVE-17806:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892497/HIVE-17806.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11242 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7338/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7338/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7338/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892497 - PreCommit-HIVE-Build

> Create directory for metrics file if it doesn't exist
> -
>
> Key: HIVE-17806
> URL: https://issues.apache.org/jira/browse/HIVE-17806
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17806.01.patch, HIVE-17806.02.patch
>
>
> HIVE-17563 changed the metrics code to use local (java.nio) file system 
> operations instead of Hadoop local file system operations. There is an 
> unintended side effect: Hadoop file systems create the directory if it doesn't 
> exist and the java.nio interfaces don't. The purpose of this fix is to restore 
> the original behavior to avoid surprises.
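
For illustration, a minimal java.nio sketch (not the actual patch) of creating 
the parent directory before writing the metrics file; the path is a placeholder:

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MetricsFileWriter {
  // With java.nio the parent directory must be created explicitly, whereas the
  // Hadoop local FileSystem would have created it implicitly on write.
  public static void writeMetrics(String json) throws IOException {
    Path metricsFile = Paths.get("/tmp/hive/metrics/report.json");
    // No-op when the directory already exists.
    Files.createDirectories(metricsFile.getParent());
    Files.write(metricsFile, json.getBytes(StandardCharsets.UTF_8));
  }
}
{code}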



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17574) Avoid multiple copies of HDFS-based jars when localizing job-jars

2017-10-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206967#comment-16206967
 ] 

Lefty Leverenz commented on HIVE-17574:
---

Thanks for the documentation, [~mithun].  I added a link to this jira.

Will the patch be committed to branch-2.3 also?  If not, perhaps the version 
given in the doc should be "2.2.1 and 2.4+" or some such.

> Avoid multiple copies of HDFS-based jars when localizing job-jars
> -
>
> Key: HIVE-17574
> URL: https://issues.apache.org/jira/browse/HIVE-17574
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0, 2.4.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Attachments: HIVE-17574.1-branch-2.2.patch, 
> HIVE-17574.1-branch-2.patch, HIVE-17574.1.patch, HIVE-17574.2.patch
>
>
> Raising this on behalf of [~selinazh]. (For my own reference: YHIVE-1035.)
> This has to do with the classpaths of Hive actions run from Oozie, and 
> affects scripts that add jars/resources from HDFS locations.
> As part of Oozie's "sharelib" deploys, foundation jars (such as Hive jars) 
> tend to be stored in HDFS paths, as are any custom user-libraries used in 
> workflows. An {{ADD JAR|FILE|ARCHIVE}} statement in a Hive script causes the 
> following steps to occur:
> # Files are downloaded from HDFS to local temp dir.
> # UDFs are resolved/validated.
> # All jars/files, including those just downloaded from HDFS, are shipped 
> right back to HDFS-based scratch-directories, for job submission.
> For HDFS-based files, this is wasteful and time-consuming. #3 above should 
> skip shipping HDFS-based resources, and add those directly to the Tez session.
> We have a patch that's being used internally at Yahoo.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206948#comment-16206948
 ] 

Hive QA commented on HIVE-17822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892483/HIVE-17822.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11242 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7337/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7337/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7337/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892483 - PreCommit-HIVE-Build

> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> The Maven shade plugin does not have an option to skip shading. Adding it under 
> a profile can allow skipping the shade step, reducing build times.
> Maven build profiling shows the druid and jdbc shade plugins to be the slowest 
> (also hive-exec). For devs not working on druid or jdbc, it would be good to 
> have an option to skip shading via a profile. With this it would be possible to 
> get a sub-minute dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17824) msck repair table should drop the missing partitions from metastore

2017-10-16 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-17824:
--


> msck repair table should drop the missing partitions from metastore
> ---
>
> Key: HIVE-17824
> URL: https://issues.apache.org/jira/browse/HIVE-17824
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> {{msck repair table }} is often used in environments where new 
> partitions are loaded as directories on HDFS or S3 and users want to create 
> the missing partitions in bulk. However, it currently only supports adding 
> missing partitions. If there are any partitions which are present in the 
> metastore but not on the FileSystem, it should also delete them so that it 
> truly repairs the table metadata.
> We should be careful not to break backwards compatibility, so we should 
> introduce a new config or keyword to support deleting such unnecessary 
> partitions from the metastore. That way users who want the old behavior can 
> easily turn it off.
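
For illustration only, a JDBC sketch of how such an opt-in might look; the 
trailing SYNC PARTITIONS keyword is a made-up placeholder for whatever config or 
keyword this jira ends up adding, and the URL and table name are placeholders:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MsckExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // Current behavior: register partitions found on the FileSystem.
      stmt.execute("MSCK REPAIR TABLE web_logs");
      // Hypothetical opt-in: also drop partitions that exist only in the
      // metastore, so the metadata truly matches the FileSystem.
      stmt.execute("MSCK REPAIR TABLE web_logs SYNC PARTITIONS");
    }
  }
}
{code}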



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17111) Add TestLocalSparkCliDriver

2017-10-16 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206945#comment-16206945
 ] 

Rui Li commented on HIVE-17111:
---

I think the new test needs an ORDER BY to give deterministic output. Besides, 
should it be included in {{master-mr2.properties}}?

> Add TestLocalSparkCliDriver
> ---
>
> Key: HIVE-17111
> URL: https://issues.apache.org/jira/browse/HIVE-17111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-17111.1.patch
>
>
> The TestSparkCliDriver sets spark.master to local-cluster[2,2,1024], but 
> HoS still decides to use the RemoteHiveSparkClient rather than the 
> LocalHiveSparkClient.
> The issue is with the following check in HiveSparkClientFactory:
> {code}
> if (master.equals("local") || master.startsWith("local[")) {
>   // With local spark context, all user sessions share the same spark context.
>   return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf));
> } else {
>   return new RemoteHiveSparkClient(hiveconf, sparkConf);
> }
> {code}
> When the {{master.startsWith("local[")}} check runs against this value of 
> spark.master, it sees that it doesn't start with {{local[}} and then decides to 
> use the RemoteHiveSparkClient.
> We should fix this so that the LocalHiveSparkClient is used. It should speed 
> up some of the tests, and also make qtests easier to debug since everything 
> will now run in the same process.
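
As a quick standalone illustration of the check described above (a sketch, not 
Hive code):

{code}
public class MasterCheckDemo {
  public static void main(String[] args) {
    String master = "local-cluster[2,2,1024]";
    System.out.println(master.equals("local"));       // false
    System.out.println(master.startsWith("local["));  // false -> RemoteHiveSparkClient is chosen
  }
}
{code}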



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table defini

2017-10-16 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-12408:


Assignee: Akira Ajisaka  (was: Thejas M Nair)

> SQLStdAuthorizer expects external table creator to be owner of directory, 
> does not respect rwx group permission. Only one user could ever create an 
> external table definition to dir!
> -
>
> Key: HIVE-12408
> URL: https://issues.apache.org/jira/browse/HIVE-12408
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security, SQLStandardAuthorization
>Affects Versions: 0.14.0
> Environment: HDP 2.2 + Kerberos
>Reporter: Hari Sekhon
>Assignee: Akira Ajisaka
>Priority: Critical
>
> When trying to create an external table via beeline in Hive using the 
> SQLStdAuthorizer it expects the table creator to be the owner of the 
> directory path and ignores the group rwx permission that is granted to the 
> user.
> {code}Error: Error while compiling statement: FAILED: 
> HiveAccessControlException Permission denied: Principal [name=hari, 
> type=USER] does not have following privileges for operation CREATETABLE 
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, 
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to 
> create external table definitions to a given location. For example this is a 
> flume landing directory with json data, and the /etl tree is owned by the 
> flume user. Even chowning the tree to another user would still break access 
> to other users who are able to read the directory in hdfs but would still be 
> unable to create external tables on top of it.
> This looks like a remnant of the owner only access model in SQLStdAuth and is 
> a separate issue to HIVE-11864 / HIVE-12324.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table defin

2017-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206939#comment-16206939
 ] 

Thejas M Nair commented on HIVE-12408:
--

Sure, assigning the bug to you

> SQLStdAuthorizer expects external table creator to be owner of directory, 
> does not respect rwx group permission. Only one user could ever create an 
> external table definition to dir!
> -
>
> Key: HIVE-12408
> URL: https://issues.apache.org/jira/browse/HIVE-12408
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security, SQLStandardAuthorization
>Affects Versions: 0.14.0
> Environment: HDP 2.2 + Kerberos
>Reporter: Hari Sekhon
>Assignee: Akira Ajisaka
>Priority: Critical
>
> When trying to create an external table via beeline in Hive using the 
> SQLStdAuthorizer it expects the table creator to be the owner of the 
> directory path and ignores the group rwx permission that is granted to the 
> user.
> {code}Error: Error while compiling statement: FAILED: 
> HiveAccessControlException Permission denied: Principal [name=hari, 
> type=USER] does not have following privileges for operation CREATETABLE 
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, 
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to 
> create external table definitions to a given location. For example this is a 
> flume landing directory with json data, and the /etl tree is owned by the 
> flume user. Even chowning the tree to another user would still break access 
> to other users who are able to read the directory in hdfs but would still be 
> unable to create external tables on top of it.
> This looks like a remnant of the owner only access model in SQLStdAuth and is 
> a separate issue to HIVE-11864 / HIVE-12324.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private

2017-10-16 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206938#comment-16206938
 ] 

Vihang Karajgaonkar commented on HIVE-17425:


I like the idea of discouraging devs from using the conf.get() and conf.set() 
methods directly. Can we change the conf.get() and conf.set() usages in the patch? 
E.g. in Metrics.java I see the following line: {{String reportersToStart = 
conf.get(MetastoreConf.ConfVars.METRICS_REPORTERS.getVarname());}} Is there a 
reason why we can't use {{MetastoreConf.get(conf, 
MetastoreConf.ConfVars.METRICS_REPORTERS)}}? The same applies to the other places 
in the patch as well.
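
For illustration, a small sketch of the two access patterns being compared; the 
import paths and the two-argument MetastoreConf.getVar accessor are assumptions, 
so adjust to the real API:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.conf.MetastoreConf;

public class ConfAccessDemo {
  static void show(Configuration conf) {
    // Discouraged: callers depend on the raw key string behind the ConfVar.
    String direct =
        conf.get(MetastoreConf.ConfVars.METRICS_REPORTERS.getVarname());
    // Preferred (as suggested in the comment): go through the MetastoreConf
    // accessor so the key strings can stay encapsulated.
    String viaAccessor =
        MetastoreConf.getVar(conf, MetastoreConf.ConfVars.METRICS_REPORTERS);
    System.out.println(direct + " / " + viaAccessor);
  }
}
{code}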



> Change MetastoreConf.ConfVars internal members to be private
> 
>
> Key: HIVE-17425
> URL: https://issues.apache.org/jira/browse/HIVE-17425
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17425.patch
>
>
> MetastoreConf's dual use of metastore keys and Hive keys is causing confusion 
> for developers.  We should make the relevant members private and provide 
> getter methods with comments on when it is appropriate to use them.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17371) Move tokenstores to metastore module

2017-10-16 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17371:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 3.0.0
>
> Attachments: HIVE-17371.01.patch, HIVE-17371.02.patch, 
> HIVE-17371.03.patch, HIVE-17371.04.patch, HIVE-17371.05.patch, 
> HIVE-17371.06.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}
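
As an illustration of why the asSubclass call above rejects the old implementations, 
here is a hypothetical defensive variant, reusing the names from the snippet above. 
This is not the fix in the attached patches, only a sketch of the failure mode.

{code:java}
// Hypothetical sketch: fail fast with a clear message when the configured
// class implements the old org.apache.hadoop.hive.thrift interface instead
// of the metastore.security one expected here.
Class<?> raw = Class.forName(tokenStoreClassName);
if (!DelegationTokenStore.class.isAssignableFrom(raw)) {
  throw new IOException("Token store " + tokenStoreClassName
      + " does not implement "
      + "org.apache.hadoop.hive.metastore.security.DelegationTokenStore");
}
return ReflectionUtils.newInstance(
    raw.asSubclass(DelegationTokenStore.class), conf);
{code}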



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module

2017-10-16 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206934#comment-16206934
 ] 

Vihang Karajgaonkar commented on HIVE-17371:


Patch merged to master. Thanks for the review [~alangates] and [~thejas]

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17371.01.patch, HIVE-17371.02.patch, 
> HIVE-17371.03.patch, HIVE-17371.04.patch, HIVE-17371.05.patch, 
> HIVE-17371.06.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table defin

2017-10-16 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206931#comment-16206931
 ] 

Akira Ajisaka commented on HIVE-12408:
--

Thank you [~thejas] for the comment. Do you mind if I create a patch for this 
issue?

> SQLStdAuthorizer expects external table creator to be owner of directory, 
> does not respect rwx group permission. Only one user could ever create an 
> external table definition to dir!
> -
>
> Key: HIVE-12408
> URL: https://issues.apache.org/jira/browse/HIVE-12408
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Security, SQLStandardAuthorization
>Affects Versions: 0.14.0
> Environment: HDP 2.2 + Kerberos
>Reporter: Hari Sekhon
>Assignee: Thejas M Nair
>Priority: Critical
>
> When trying to create an external table via beeline in Hive using the 
> SQLStdAuthorizer it expects the table creator to be the owner of the 
> directory path and ignores the group rwx permission that is granted to the 
> user.
> {code}Error: Error while compiling statement: FAILED: 
> HiveAccessControlException Permission denied: Principal [name=hari, 
> type=USER] does not have following privileges for operation CREATETABLE 
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, 
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to 
> create external table definitions to a given location. For example this is a 
> flume landing directory with json data, and the /etl tree is owned by the 
> flume user. Even chowning the tree to another user would not help: other 
> users who are able to read the directory in HDFS would still be unable to 
> create external tables on top of it.
> This looks like a remnant of the owner only access model in SQLStdAuth and is 
> a separate issue to HIVE-11864 / HIVE-12324.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-17823:
-

Assignee: Dapeng Sun

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> The JIRA is targeted at fixing the HoS Qtest file failures caused by the 
> subquery exists fix introduced in HIVE-17726.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Description: The JIRA is targeted at fixing the HoS Qtest file failures caused 
by the subquery exists fix introduced in HIVE-17726.

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>
> The JIRA is targeted at fixing the HoS Qtest file failures caused by the 
> subquery exists fix introduced in HIVE-17726.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module

2017-10-16 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206925#comment-16206925
 ] 

Vihang Karajgaonkar commented on HIVE-17371:


All of these tests have been failing for many past builds and are unrelated.

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17371.01.patch, HIVE-17371.02.patch, 
> HIVE-17371.03.patch, HIVE-17371.04.patch, HIVE-17371.05.patch, 
> HIVE-17371.06.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-16 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.7.patch

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
> Attachments: HIVE-15104.1.patch, HIVE-15104.2.patch, 
> HIVE-15104.3.patch, HIVE-15104.4.patch, HIVE-15104.5.patch, 
> HIVE-15104.6.patch, HIVE-15104.7.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, will generate different amounts 
> of shuffle data.
> I think it is because Hive on MR serializes only part of the HiveKey, while Hive 
> on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?
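
For context, a minimal, self-contained sketch of the kind of custom Kryo serializer 
that limits what goes into the shuffle. The key class and its fields are hypothetical 
stand-ins, not Hive's actual HiveKey handling, and the Kryo 2.x/3.x Serializer API is 
assumed.

{code:java}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class KeySerializerSketch {
  // Hypothetical stand-in for a shuffle key like HiveKey.
  static class ShuffleKey {
    byte[] bytes;
    int cachedHashCode; // derived field, not worth shipping over the wire
  }

  // Writes only the byte payload, not the full Java object graph.
  static class ShuffleKeySerializer extends Serializer<ShuffleKey> {
    @Override
    public void write(Kryo kryo, Output output, ShuffleKey key) {
      output.writeInt(key.bytes.length, true);
      output.writeBytes(key.bytes);
    }

    @Override
    public ShuffleKey read(Kryo kryo, Input input, Class<ShuffleKey> type) {
      ShuffleKey key = new ShuffleKey();
      key.bytes = input.readBytes(input.readInt(true));
      return key;
    }
  }

  public static Kryo newKryo() {
    Kryo kryo = new Kryo();
    kryo.register(ShuffleKey.class, new ShuffleKeySerializer());
    return kryo;
  }
}
{code}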



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-16 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17433:

Status: Patch Available  (was: Open)

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long; we are calling these Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimization where query benchmarks really show it helps.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison, 
> addition, and later perhaps a few GroupBy aggregations like sum, avg, min, 
> max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> later JIRAs so they participate in the Decimal64 performance optimization.
> The idea is when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  Upon an input format seeing Decimal64ColumnVector being 
> used, the input format can fill that column vector with decimal64 longs 
> instead of HiveDecimalWritable objects of DecimalColumnVector.
> There will be a Hive configuration variable 
> hive.vectorized.input.format.supports.enabled that has a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY 
> key, value
> Will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: 
> FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: 
> Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, 
> outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...
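
For readers following along, here is a small, self-contained mock of the annotation 
mechanism described above: a "supports" annotation on an input format that a planner 
can inspect before choosing Decimal64ColumnVector over DecimalColumnVector. All class 
and enum names below are illustrative stand-ins, not Hive's actual types.

{code:java}
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class Decimal64SupportSketch {
  enum Support { DECIMAL_64 }

  @Retention(RetentionPolicy.RUNTIME)
  @interface VectorizedInputFormatSupports {
    Support[] supports();
  }

  // An input format advertising Decimal64 support, as in the description.
  @VectorizedInputFormatSupports(supports = {Support.DECIMAL_64})
  static class MyTextInputFormat { }

  // The Vectorizer-style check: does this input format support DECIMAL_64?
  static boolean supportsDecimal64(Class<?> inputFormat) {
    VectorizedInputFormatSupports ann =
        inputFormat.getAnnotation(VectorizedInputFormatSupports.class);
    if (ann == null) {
      return false;
    }
    for (Support s : ann.supports()) {
      if (s == Support.DECIMAL_64) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(supportsDecimal64(MyTextInputFormat.class)); // true
  }
}
{code}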



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-16 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17433:

Attachment: HIVE-17433.03.patch

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long; we are calling these Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimization where query benchmarks really show it helps.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison, 
> addition, and later perhaps a few GroupBy aggregations like sum, avg, min, 
> max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> later JIRAs so they participate in the Decimal64 performance optimization.
> The idea is when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  Upon an input format seeing Decimal64ColumnVector being 
> used, the input format can fill that column vector with decimal64 longs 
> instead of HiveDecimalWritable objects of DecimalColumnVector.
> There will be a Hive configuration variable 
> hive.vectorized.input.format.supports.enabled that has a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY 
> key, value
> Will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: 
> FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: 
> Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, 
> outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206919#comment-16206919
 ] 

Akira Ajisaka commented on HIVE-8937:
-

Thanks! Attached a patch.

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a regex, not comma separated 
> regexes.
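
For illustration, a minimal sketch of what "a regex, not comma-separated regexes" means 
for the value of these parameters; the parameter names matched below are hypothetical 
examples.

{code:java}
import java.util.regex.Pattern;

public class WhitelistRegexSketch {
  public static void main(String[] args) {
    // One Java regex: alternative patterns are joined with '|', not commas.
    Pattern whitelist =
        Pattern.compile("myapp\\.setting\\..*|another\\.allowed\\.param");
    System.out.println(whitelist.matcher("myapp.setting.verbose").matches());        // true
    System.out.println(whitelist.matcher("hive.exec.dynamic.partition").matches());  // false
  }
}
{code}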



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HIVE-8937:

Status: Patch Available  (was: Open)

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a regex, not comma separated 
> regexes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HIVE-8937:

Component/s: Documentation

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a regex, not comma separated 
> regexes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HIVE-8937:

Attachment: HIVE-8937.001.patch

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a regex, not comma separated 
> regexes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params

2017-10-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HIVE-8937:
---

Assignee: Akira Ajisaka  (was: Thejas M Nair)

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> 
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Thejas M Nair
>Assignee: Akira Ajisaka
> Attachments: HIVE-8937.001.patch
>
>
> hive.security.authorization.sqlstd.confwhitelist.* param description in 
> HiveConf is incorrect. The expected value is a regex, not comma separated 
> regexes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206907#comment-16206907
 ] 

Hive QA commented on HIVE-17371:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892484/HIVE-17371.06.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11242 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7336/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7336/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7336/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892484 - PreCommit-HIVE-Build

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17371.01.patch, HIVE-17371.02.patch, 
> HIVE-17371.03.patch, HIVE-17371.04.patch, HIVE-17371.05.patch, 
> HIVE-17371.06.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Affects Version/s: 3.0.0

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17756) Enable subquery related Qtests for Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun resolved HIVE-17756.
---
Resolution: Fixed

Thanks all for the comments. I have opened a new JIRA, HIVE-17823, to fix it.

> Enable subquery related Qtests for Hive on Spark
> 
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 3.0.0
>
> Attachments: HIVE-17756.001.patch
>
>
> HIVE-15456 and HIVE-15192 use Calcite to decorrelate and plan subqueries. 
> This JIRA is to introduce subquery tests and verify the subquery plans for 
> Hive on Spark.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17802) Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206848#comment-16206848
 ] 

Hive QA commented on HIVE-17802:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892473/HIVE-17802.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11240 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat.testOutputFormat 
(batchId=192)
org.apache.hive.hcatalog.mapreduce.TestHCatOutputFormat.testSetOutput 
(batchId=192)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7335/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7335/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7335/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892473 - PreCommit-HIVE-Build

> Remove unnecessary calls to FileSystem.setOwner() from 
> FileOutputCommitterContainer
> ---
>
> Key: HIVE-17802
> URL: https://issues.apache.org/jira/browse/HIVE-17802
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Attachments: HIVE-17802.1.patch
>
>
> For large Pig/HCat queries that produce a large number of 
> partitions/directories/files, we have seen cases where the HDFS NameNode 
> groaned under the weight of {{FileSystem.setOwner()}} calls, originating from 
> the commit-step. This was the result of the following code in 
> FileOutputCommitterContainer:
> {code:java}
> private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission 
> permission,
>   List acls, String group, boolean recursive)
> throws IOException {
> ...
> if (recursive) {
>   for (FileStatus fileStatus : fs.listStatus(dir)) {
> if (fileStatus.isDir()) {
>   applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, 
> group, true);
> } else {
>   fs.setPermission(fileStatus.getPath(), permission);
>   chown(fs, fileStatus.getPath(), group);
> }
>   }
> }
>   }
>   private void chown(FileSystem fs, Path file, String group) throws 
> IOException {
> try {
>   fs.setOwner(file, null, group);
> } catch (AccessControlException ignore) {
>   // Some users have wrong table group, ignore it.
>   LOG.warn("Failed to change group of partition directories/files: " + 
> file, ignore);
> }
>   }
> {code}
> One call per file/directory is far too many. We have a patch that reduces the 
> namenode pressure.
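
One possible direction, sketched below purely for illustration in the same abbreviated 
style as the snippet above (this is not necessarily what the attached patch does): check 
the current group first, so a setOwner() RPC is only issued for paths whose group 
actually differs.

{code:java}
// Hypothetical helper: skip the NameNode round trip when the group already matches.
private void chownIfNeeded(FileSystem fs, FileStatus stat, String group)
    throws IOException {
  if (group == null || group.equals(stat.getGroup())) {
    return; // already correct, no setOwner() call needed
  }
  try {
    fs.setOwner(stat.getPath(), null, group);
  } catch (AccessControlException ignore) {
    // Some users have wrong table group, ignore it.
    LOG.warn("Failed to change group of: " + stat.getPath(), ignore);
  }
}
{code}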



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (HIVE-16669) Fine tune Compaction to take advantage of Acid 2.0

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16669:
--
Comment: was deleted

(was: OK, this is more than just fine tuning.  Suppose we have 
base_8
delta_9
delete_delta_10 - this affects rows in base_8

Minor compaction (as currently implemented, inherited from Acid 1) will 
produce delta_9_10, which means all deletes by txn 10 affecting rows in base_8 
are lost.

so HIVE-17089 is effectively incomplete w/o this)

> Fine tune Compaction to take advantage of Acid 2.0
> --
>
> Key: HIVE-16669
> URL: https://issues.apache.org/jira/browse/HIVE-16669
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> * There is little point using 2.0 vectorized reader since there is no 
> operator pipeline in compaction
> * If minor compaction just concats delete_delta files together, then the 2 
> stage compaction should always ensure that we have a limited number of Orc 
> readers to do the merging and current OrcRawRecordMerger should be fine
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16669) Fine tune Compaction to take advantage of Acid 2.0

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16669:
--
Priority: Critical  (was: Blocker)

> Fine tune Compaction to take advantage of Acid 2.0
> --
>
> Key: HIVE-16669
> URL: https://issues.apache.org/jira/browse/HIVE-16669
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> * There is little point using 2.0 vectorized reader since there is no 
> operator pipeline in compaction
> * If minor compaction just concats delete_delta files together, then the 2 
> stage compaction should always ensure that we have a limited number of Orc 
> readers to do the merging and current OrcRawRecordMerger should be fine
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17792) Enable Bucket Map Join when there are extra keys other than bucketed columns

2017-10-16 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206838#comment-16206838
 ] 

Deepak Jaiswal commented on HIVE-17792:
---

Thanks for the review [~jdere].

> Enable Bucket Map Join when there are extra keys other than bucketed columns
> 
>
> Key: HIVE-17792
> URL: https://issues.apache.org/jira/browse/HIVE-17792
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17792.1.patch, HIVE-17792.2.patch, 
> HIVE-17792.3.patch, HIVE-17792.4.patch, HIVE-17792.5.patch
>
>
> Currently this won't go through Bucket Map Join (BMJ):
> CREATE TABLE tab_part (key int, value string) PARTITIONED BY(ds STRING) 
> CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
> CREATE TABLE tab(key int, value string) PARTITIONED BY(ds STRING) STORED AS 
> TEXTFILE;
> select a.key, a.value, b.value
> from tab a join tab_part b on a.key = b.key and a.value = b.value;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17806) Create directory for metrics file if it doesn't exist

2017-10-16 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17806:
--
Attachment: HIVE-17806.02.patch

Addressed code review comments by Andrew Sherman

> Create directory for metrics file if it doesn't exist
> -
>
> Key: HIVE-17806
> URL: https://issues.apache.org/jira/browse/HIVE-17806
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17806.01.patch, HIVE-17806.02.patch
>
>
> HIVE-17563 changed metrics code to use local file system operations instead 
> of Hadoop local file system operations. There is an unintended side effect: 
> Hadoop file systems create the directory if it doesn't exist, while the java.nio 
> interfaces don't. The purpose of this fix is to revert the behavior to the 
> original one to avoid surprises.
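
To make the behavioral difference concrete, here is a minimal, self-contained java.nio 
sketch (the path is hypothetical): unlike the Hadoop local FileSystem, java.nio writes 
do not create missing parent directories, so the caller has to do it explicitly.

{code:java}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MetricsDirSketch {
  public static void main(String[] args) throws Exception {
    Path metricsFile = Paths.get("/tmp/hive-metrics/report.json");
    // Without this line, Files.write below throws NoSuchFileException if the
    // directory is missing; the Hadoop local FS used to create it implicitly.
    Files.createDirectories(metricsFile.getParent());
    Files.write(metricsFile, "{}".getBytes(StandardCharsets.UTF_8));
  }
}
{code}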



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17792) Enable Bucket Map Join when there are extra keys other than bucketed columns

2017-10-16 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206828#comment-16206828
 ] 

Jason Dere commented on HIVE-17792:
---

+1

> Enable Bucket Map Join when there are extra keys other than bucketed columns
> 
>
> Key: HIVE-17792
> URL: https://issues.apache.org/jira/browse/HIVE-17792
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17792.1.patch, HIVE-17792.2.patch, 
> HIVE-17792.3.patch, HIVE-17792.4.patch, HIVE-17792.5.patch
>
>
> Currently this won't go through Bucket Map Join (BMJ):
> CREATE TABLE tab_part (key int, value string) PARTITIONED BY(ds STRING) 
> CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
> CREATE TABLE tab(key int, value string) PARTITIONED BY(ds STRING) STORED AS 
> TEXTFILE;
> select a.key, a.value, b.value
> from tab a join tab_part b on a.key = b.key and a.value = b.value;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17054) Expose SQL database constraints to Calcite

2017-10-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17054:
---
Fix Version/s: 3.0.0

> Expose SQL database constraints to Calcite
> --
>
> Key: HIVE-17054
> URL: https://issues.apache.org/jira/browse/HIVE-17054
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
>
> Hive already has support to declare multiple SQL constraints (PRIMARY KEY, 
> FOREIGN KEY, UNIQUE, and NOT NULL). Although these constraints cannot be 
> currently enforced on the data, they can be made available to the optimizer 
> by using the 'RELY' keyword.
> Currently, even when they are declared with the RELY keyword, they are not 
> exposed to Calcite.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17053) Enable improved Calcite MV-based rewriting rules

2017-10-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-17053.

Resolution: Duplicate

> Enable improved Calcite MV-based rewriting rules
> 
>
> Key: HIVE-17053
> URL: https://issues.apache.org/jira/browse/HIVE-17053
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer, Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Integrate the rules introduced in CALCITE-1731.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17672) Upgrade Calcite version to 1.14

2017-10-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17672:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]!

> Upgrade Calcite version to 1.14
> ---
>
> Key: HIVE-17672
> URL: https://issues.apache.org/jira/browse/HIVE-17672
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17672.01.patch, HIVE-17672.02.patch, 
> HIVE-17672.03.patch, HIVE-17672.04.patch, HIVE-17672.05.patch
>
>
> Calcite 1.14.0 has been recently released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206789#comment-16206789
 ] 

Hive QA commented on HIVE-17812:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892468/HIVE-17812.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7334/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7334/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7334/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-10-16 23:46:05.347
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7334/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-16 23:46:05.349
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   599a74f..45b9b8d  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 599a74f HIVE-17391 Compaction fails if there is an empty value 
in tblproperties (Steve Yeom via Eugene Koifman)
+ git clean -f -d
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/functions/HiveSqlSumEmptyIsZeroAggFunction.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEpochMilli.java
Removing ql/src/test/queries/clientpositive/timestamptz_3.q
Removing ql/src/test/results/clientpositive/timestamptz_3.q.out
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 45b9b8d HIVE-16688 Make sure Alter Table to set transaction=true 
acquires X lock (Eugene Koifman, reviewed by Alan Gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-16 23:46:06.919
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java:19
error: ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java: 
patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892468 - PreCommit-HIVE-Build

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17672) Upgrade Calcite version to 1.14

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206788#comment-16206788
 ] 

Hive QA commented on HIVE-17672:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892464/HIVE-17672.05.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11241 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7333/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7333/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7333/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892464 - PreCommit-HIVE-Build

> Upgrade Calcite version to 1.14
> ---
>
> Key: HIVE-17672
> URL: https://issues.apache.org/jira/browse/HIVE-17672
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17672.01.patch, HIVE-17672.02.patch, 
> HIVE-17672.03.patch, HIVE-17672.04.patch, HIVE-17672.05.patch
>
>
> Calcite 1.14.0 has been recently released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17822:
-
Description: 
The Maven shade plugin does not have an option to skip shading. Adding one under a 
profile can help skip the shade step and reduce build times.

The Maven build profile shows the druid and jdbc shade plugins to be the slowest 
(also hive-exec). For devs not working on druid or jdbc, it will be good to have an 
option to skip shading via a profile. With this it will be possible to get a 
sub-minute dev build.

  was:
Maven shade plugin does not have option to skip. Adding it under a profile can 
help with skip shade reducing build times.

Maven build profile shows druid and jdbc shade plugin to be slowest (also 
hive-exec). For devs not working on druid or jdbc, it will be good to have an 
option to skip shading via a profile. With this it will be possible to get a 
subsecond dev build.


> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> Maven shade plugin does not have option to skip. Adding it under a profile 
> can help with skip shade reducing build times.
> Maven build profile shows druid and jdbc shade plugin to be slowest (also 
> hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> subminute dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206768#comment-16206768
 ] 

Prasanth Jayachandran commented on HIVE-17822:
--

[~ashutoshc] Can you please review this change?

> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> Maven shade plugin does not have option to skip. Adding it under a profile 
> can help with skip shade reducing build times.
> Maven build profile shows druid and jdbc shade plugin to be slowest (also 
> hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> subsecond dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16688) Make sure Alter Table to set transaction=true acquires X lock

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16688:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master.
Thanks Alan for the review.

> Make sure Alter Table to set transaction=true acquires X lock
> -
>
> Key: HIVE-16688
> URL: https://issues.apache.org/jira/browse/HIVE-16688
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16688.01.patch, HIVE-16688.02.patch
>
>
> Suppose we have a non-acid table with some data.
> An insert op starts (long running) (with hive.txn.strict.locking.mode=false 
> this takes a shared lock).
> An alter table runs to add (transactional=true)
> An update is run which will read the list of "original" files and assign IDs 
> on the fly which are written to a delta file.
> The long running insert completes.
> Another update is run which now sees a different set of "original" files and 
> will (most likely) assign different IDs.
> Need to make sure to mutex this
> To clarify: The X lock is acquired for "An alter table runs to add 
> (transactional=true)"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-16 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206744#comment-16206744
 ] 

Alan Gates commented on HIVE-17812:
---

To clarify my comments above, the biggest compatibility issue is in 
ListenerEvent, where I changed the constructor to take IHMSHandler rather than 
HMSHandler.  Since every event inherits from ListenerEvent, this trickles down 
to all of the events.  Also, in a future patch HMSHandler.getHiveConf will have 
to be removed since HMSHandler will no longer have access to HiveConf once it 
moves to standalone-metastore.
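
A simplified, hypothetical sketch of what that constructor change looks like from an 
event subclass's point of view (names trimmed; this is not the actual class hierarchy):

{code:java}
// The event base class now holds the handler through an interface, so events
// and custom listeners compile against IHMSHandler rather than the concrete
// HMSHandler class.
public abstract class ListenerEventSketch {
  // Stand-in for the real IHMSHandler interface.
  public interface IHMSHandler { }

  private final IHMSHandler handler;

  protected ListenerEventSketch(IHMSHandler handler) { // previously took the concrete handler
    this.handler = handler;
  }

  public IHMSHandler getHandler() {
    return handler;
  }
}
{code}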

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206727#comment-16206727
 ] 

Prasanth Jayachandran commented on HIVE-17822:
--

Some numbers
{code}
# Base: clean offline quiet build
$ time mvn clean install -DskipTests -o -q
real    3m9.005s
user    7m14.864s
sys     0m40.295s

# Parallel (using 1C gave best build times) build
$ time mvn clean install -DskipTests -T 1C -o -q
real    2m24.415s
user    8m12.243s
sys     0m54.905s

# With MAVEN_OPTS
$ time MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1" mvn clean install -DskipTests -T 1C -o -q
real    2m12.872s
user    7m46.879s
sys     0m49.696s

# Skip clean
$ MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1"
$ time MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1" mvn install -DskipTests -T 1C -o -q
real    1m31.403s
user    5m13.439s
sys     0m37.885s

# Skip shade for jdbc and druid-handler (requires HIVE-17822)
# NOTE: if you are changing/testing jdbc or druid you may want to skip this step
$ time MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1" mvn install -DskipShade -DskipTests -T 1C -o -q
real    1m20.130s
user    4m37.645s
sys     0m39.897s

# Skip remote resource plugin
$ time MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1" mvn install -DskipShade -DskipTests -Dremoteresources.skip=true -T 1C -o -q
real    0m37.485s
user    0m52.652s
sys     0m14.118s

# Build ql and downstream modules
$ time MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1" mvn install -DskipShade -DskipTests -Dremoteresources.skip=true -T 1C -o -q -pl ql -amd
real    0m31.827s
user    1m50.349s
sys     0m9.494s

# Build llap-server and downstream modules
$ time MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1" mvn install -DskipShade -DskipTests -Dremoteresources.skip=true -T 1C -o -q -pl llap-server -amd
real    0m9.147s
user    0m20.189s
sys     0m3.056s
{code}

> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> Maven shade plugin does not have option to skip. Adding it under a profile 
> can help with skip shade reducing build times.
> Maven build profile shows druid and jdbc shade plugin to be slowest (also 
> hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> subsecond dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17371) Move tokenstores to metastore module

2017-10-16 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17371:
---
Attachment: HIVE-17371.06.patch

I thought I had fixed this conflict while rebasing. Turns out I didn't add the 
fix to the patch. Attaching it with the fix for the compile issue.

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17371.01.patch, HIVE-17371.02.patch, 
> HIVE-17371.03.patch, HIVE-17371.04.patch, HIVE-17371.05.patch, 
> HIVE-17371.06.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17822:
-
Status: Patch Available  (was: Open)

> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> Maven shade plugin does not have an option to skip shading. Adding one under a 
> profile can help reduce build times by skipping the shade step.
> Profiling the Maven build shows the druid and jdbc shade plugins to be the slowest 
> (also hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> sub-second dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206701#comment-16206701
 ] 

Prasanth Jayachandran commented on HIVE-17822:
--

Users have to pass -DskipShade to skip druid and jdbc shading. hive-exec shading is 
not skipped, since submodules depend on it in most cases. 

> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> Maven shade plugin does not have an option to skip shading. Adding one under a 
> profile can help reduce build times by skipping the shade step.
> Profiling the Maven build shows the druid and jdbc shade plugins to be the slowest 
> (also hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> sub-second dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17822:
-
Attachment: HIVE-17822.1.patch

> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17822.1.patch
>
>
> Maven shade plugin does not have an option to skip shading. Adding one under a 
> profile can help reduce build times by skipping the shade step.
> Profiling the Maven build shows the druid and jdbc shade plugins to be the slowest 
> (also hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> sub-second dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17822) Provide an option to skip shading of jars

2017-10-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17822:



> Provide an option to skip shading of jars
> -
>
> Key: HIVE-17822
> URL: https://issues.apache.org/jira/browse/HIVE-17822
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Maven shade plugin does not have an option to skip shading. Adding one under a 
> profile can help reduce build times by skipping the shade step.
> Profiling the Maven build shows the druid and jdbc shade plugins to be the slowest 
> (also hive-exec). For devs not working on druid or jdbc, it will be good to have an 
> option to skip shading via a profile. With this it will be possible to get a 
> sub-second dev build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private

2017-10-16 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206695#comment-16206695
 ] 

Alexander Kolbasov commented on HIVE-17425:
---

The patch can't be applied to the latest Hive code because things moved to 
standalone-metastore. Can you rebase it on top of your changes that moved things to 
standalone-metastore?

> Change MetastoreConf.ConfVars internal members to be private
> 
>
> Key: HIVE-17425
> URL: https://issues.apache.org/jira/browse/HIVE-17425
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17425.patch
>
>
> MetastoreConf's dual use of metastore keys and Hive keys is causing confusion 
> for developers.  We should make the relevant members private and provide 
> getter methods with comments on when it is appropriate to use them.  
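
As a rough sketch of that direction (the enum, keys, and accessor names below are invented for illustration and are not the actual MetastoreConf API), the pattern would look something like this:

{code:java}
// Minimal sketch, assuming a ConfVars-style enum that carries both key spellings.
// The raw fields become private and callers go through documented accessors.
public enum ExampleConfVars {
  EXAMPLE_VAR("metastore.example.setting", "hive.example.setting");

  private final String metastoreName;  // metastore-scoped key
  private final String hiveName;       // legacy Hive-scoped key

  ExampleConfVars(String metastoreName, String hiveName) {
    this.metastoreName = metastoreName;
    this.hiveName = hiveName;
  }

  /** Use this when reading or writing values with the metastore-scoped key. */
  public String getVarname() { return metastoreName; }

  /** Use this only for backwards compatibility with configs that still set the Hive key. */
  public String getHiveName() { return hiveName; }
}
{code}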



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17821) TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if partName=null and table is partitioned

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17821:
--
Priority: Minor  (was: Critical)

> TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if 
> partName=null and table is partitioned
> ---
>
> Key: HIVE-17821
> URL: https://issues.apache.org/jira/browse/HIVE-17821
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> The LM may acquire read locks on the table when writing a partition.
> There is no need to make an entry for the table if we know it's partitioned, 
> since any I/U/D must affect a partition (or a set of them).
> Pass isPartitioned() in LockComponent/LockRequest, or look it up in TxnHandler.
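
For illustration only, the proposed rule can be expressed as a small predicate; the names below are hypothetical and this is not TxnHandler's actual code:

{code:java}
// Minimal sketch of the proposed rule, assuming the lock request can tell us whether
// the target table is partitioned. A write against a partitioned table that has no
// concrete partition name yet (i.e. a dynamic-partition write) gets no table-level
// TXN_COMPONENTS row here; addDynamicPartitions() records the real partitions later.
public final class TxnComponentsPolicy {
  private TxnComponentsPolicy() {}

  public static boolean needsTxnComponentsEntry(boolean isWrite,
                                                boolean tableIsPartitioned,
                                                String partName) {
    if (!isWrite) {
      return false;                     // only writes get an entry in this sketch
    }
    if (tableIsPartitioned && partName == null) {
      return false;                     // dynamic-partition write: recorded later
    }
    return true;                        // unpartitioned table, or a concrete partition
  }
}
{code}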



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17821) TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if partName=null and table is partitioned

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17821:
-


> TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if 
> partName=null and table is partitioned
> ---
>
> Key: HIVE-17821
> URL: https://issues.apache.org/jira/browse/HIVE-17821
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> The LM may acquire read locks on the table when writing a partition.
> There is no need to make an entry for the table if we know it's partitioned, 
> since any I/U/D must affect a partition (or a set of them).
> Pass isPartitioned() in LockComponent/LockRequest, or look it up in TxnHandler.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17813) hive.exec.move.files.from.source.dir does not work with partitioned tables

2017-10-16 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206667#comment-16206667
 ] 

Ashutosh Chauhan commented on HIVE-17813:
-

+1

> hive.exec.move.files.from.source.dir does not work with partitioned tables
> --
>
> Key: HIVE-17813
> URL: https://issues.apache.org/jira/browse/HIVE-17813
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17813.1.patch
>
>
> Setting hive.exec.move.files.from.source.dir=true causes data to not be moved 
> properly during inserts to partitioned tables.
> Looks like the file path checking in Utilities.moveSpecifiedFiles() needs to 
> recursively check into directories.
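
As a generic sketch of the recursive check being suggested (this is not Utilities.moveSpecifiedFiles() itself, just an illustration built on the Hadoop FileSystem API):

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustration only: collect files under a source directory, descending into
// sub-directories so that files inside partition directories are also picked up.
public final class RecursiveFileLister {
  private RecursiveFileLister() {}

  public static List<Path> listFilesRecursively(FileSystem fs, Path dir) throws IOException {
    List<Path> files = new ArrayList<>();
    collect(fs, dir, files);
    return files;
  }

  private static void collect(FileSystem fs, Path dir, List<Path> out) throws IOException {
    for (FileStatus status : fs.listStatus(dir)) {
      if (status.isDirectory()) {
        collect(fs, status.getPath(), out);   // descend into partition sub-directories
      } else {
        out.add(status.getPath());
      }
    }
  }
}
{code}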



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17820) Add buckets.q test for blobstores

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206663#comment-16206663
 ] 

Hive QA commented on HIVE-17820:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892438/HIVE-17820.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11240 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7328/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7328/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7328/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892438 - PreCommit-HIVE-Build

> Add buckets.q test for blobstores
> -
>
> Key: HIVE-17820
> URL: https://issues.apache.org/jira/browse/HIVE-17820
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
> Attachments: HIVE-17820.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17672) Upgrade Calcite version to 1.14

2017-10-16 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206661#comment-16206661
 ] 

Ashutosh Chauhan commented on HIVE-17672:
-

+1

> Upgrade Calcite version to 1.14
> ---
>
> Key: HIVE-17672
> URL: https://issues.apache.org/jira/browse/HIVE-17672
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17672.01.patch, HIVE-17672.02.patch, 
> HIVE-17672.03.patch, HIVE-17672.04.patch, HIVE-17672.05.patch
>
>
> Calcite 1.14.0 has been recently released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-13795) TxnHandler should know if operation is using dynamic partitions

2017-10-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634542#comment-15634542
 ] 

Eugene Koifman edited comment on HIVE-13795 at 10/16/17 10:01 PM:
--

HIVE-14943 adds logic to the lock acquisition code path to know if it's part of a DP 
write, and the delete SQL stmt is removed from addDynamicPartitions()


was (Author: ekoifman):
HIVE-14943 add logic lock acquisition code path to know if it's part of DB 
write and the delete SQL stmt is removed from addDynamicParitions()

> TxnHandler should know if operation is using dynamic partitions
> ---
>
> Key: HIVE-13795
> URL: https://issues.apache.org/jira/browse/HIVE-13795
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> See TxnHandler.checkLock() and the comments around "isPartOfDynamicPartitionInsert". 
> If TxnHandler knew whether it is being called as part of an op running with dynamic 
> partitions, it could be more efficient: in that case we don't have to write to 
> TXN_COMPONENTS at all during lock acquisition. Conversely, if not running with 
> DynPart, we can kill the current txn on lock grant rather than wait until commit time.
> If addDynamicPartitions() also knew about DynPart, it could eliminate the "Delete 
> from Txn_components..." statement.
> This is an important perf optimization, as it allows us to detect early that 
> concurrent txns will have a WW conflict.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17706) Add a possibility to run the BeeLine tests on the default database

2017-10-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206640#comment-16206640
 ] 

Lefty Leverenz commented on HIVE-17706:
---

[~pvary], you committed this to master so please update the status and fix 
version.

Also, in the future please make sure JIRA has a +1 before you commit.  In this 
case [~zsombor.klara] okayed the patch on the review board, but we need a 
record of that in JIRA too.

Here are the policy details:

* [How to Commit -- Review (3rd paragraph) | 
https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-Review]
* [How to Commit -- Commit (step #1) | 
https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-Commit]

> Add a possibility to run the BeeLine tests on the default database
> --
>
> Key: HIVE-17706
> URL: https://issues.apache.org/jira/browse/HIVE-17706
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-17706.2.patch, HIVE-17706.3.patch, 
> HIVE-17706.4.patch, HIVE-17706.patch
>
>
> Currently it is possible to run the BeeLine tests sequentially, but it still 
> relies on cleaning up after the tests by cleaning up the database. Some of 
> the tests could be run only against the default database. We need a cleanup 
> mechanism between the tests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16688) Make sure Alter Table to set transaction=true acquires X lock

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16688:
--
Description: 
suppose we have non-acid table with some data
An insert op starts (long running)  (with hive.txn.strict.locking.mode=false 
this takes shared lock)
An alter table runs to add (transactional=true)
An update is run which will read the list of "original" files and assign IDs on 
the fly which are written to a delta file.
The long running insert completes.
Another update is run which now sees a different set of "original" files and 
will (most likely) assign different IDs.

Need to make sure to mutex this

To clarify: The X lock is acquired for "An alter table runs to add 
(transactional=true)"

  was:
suppose we have non-acid table with some data
An insert op starts (long running)  (with hive.txn.strict.locking.mode=false 
this takes shared lock)
An alter table runs to add (transactional=true)
An update is run which will read the list of "original" files and assign IDs on 
the fly which are written to a delta file.
The long running insert completes.
Another update is run which now sees a different set of "original" files and 
will (most likely) assign different IDs.

Need to make sure to mutex this


> Make sure Alter Table to set transaction=true acquires X lock
> -
>
> Key: HIVE-16688
> URL: https://issues.apache.org/jira/browse/HIVE-16688
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16688.01.patch, HIVE-16688.02.patch
>
>
> suppose we have non-acid table with some data
> An insert op starts (long running)  (with hive.txn.strict.locking.mode=false 
> this takes shared lock)
> An alter table runs to add (transactional=true)
> An update is run which will read the list of "original" files and assign IDs 
> on the fly which are written to a delta file.
> The long running insert completes.
> Another update is run which now sees a different set of "original" files and 
> will (most likely) assign different IDs.
> Need to make sure to mutex this
> To clarify: The X lock is acquired for "An alter table runs to add 
> (transactional=true)"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Priority: Critical  (was: Major)

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16688) Make sure Alter Table to set transaction=true acquires X lock

2017-10-16 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206629#comment-16206629
 ] 

Alan Gates commented on HIVE-16688:
---

+1

> Make sure Alter Table to set transaction=true acquires X lock
> -
>
> Key: HIVE-16688
> URL: https://issues.apache.org/jira/browse/HIVE-16688
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16688.01.patch, HIVE-16688.02.patch
>
>
> suppose we have non-acid table with some data
> An insert op starts (long running)  (with hive.txn.strict.locking.mode=false 
> this takes shared lock)
> An alter table runs to add (transactional=true)
> An update is run which will read the list of "original" files and assign IDs 
> on the fly which are written to a delta file.
> The long running insert completes.
> Another update is run which now sees a different set of "original" files and 
> will (most likely) assign different IDs.
> Need to make sure to mutex this



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17802) Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer

2017-10-16 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17802:

Attachment: HIVE-17802.1.patch

Cumulative patch with HIVE-17802, HIVE-17803, and part of HIVE-13989.

> Remove unnecessary calls to FileSystem.setOwner() from 
> FileOutputCommitterContainer
> ---
>
> Key: HIVE-17802
> URL: https://issues.apache.org/jira/browse/HIVE-17802
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Attachments: HIVE-17802.1.patch
>
>
> For large Pig/HCat queries that produce a large number of 
> partitions/directories/files, we have seen cases where the HDFS NameNode 
> groaned under the weight of {{FileSystem.setOwner()}} calls, originating from 
> the commit-step. This was the result of the following code in 
> FileOutputCommitterContainer:
> {code:java}
> private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission permission,
>     List acls, String group, boolean recursive) throws IOException {
>   ...
>   if (recursive) {
>     for (FileStatus fileStatus : fs.listStatus(dir)) {
>       if (fileStatus.isDir()) {
>         applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, group, true);
>       } else {
>         fs.setPermission(fileStatus.getPath(), permission);
>         chown(fs, fileStatus.getPath(), group);
>       }
>     }
>   }
> }
> private void chown(FileSystem fs, Path file, String group) throws IOException {
>   try {
>     fs.setOwner(file, null, group);
>   } catch (AccessControlException ignore) {
>     // Some users have wrong table group, ignore it.
>     LOG.warn("Failed to change group of partition directories/files: " + file, ignore);
>   }
> }
> {code}
> One call per file/directory is far too many. We have a patch that reduces the 
> namenode pressure.
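
One hypothetical way to reduce the setOwner() traffic, shown for illustration only and not taken from the attached patch: the FileStatus objects returned by listStatus() already carry the current group, so the chown call can be skipped whenever the group is already correct.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.AccessControlException;

// Sketch only: skip the setOwner() RPC when the listing already shows the desired
// group, avoiding one NameNode write call per file on large commits.
public final class GroupFixup {
  private GroupFixup() {}

  static void chownIfNeeded(FileSystem fs, FileStatus fileStatus, String group)
      throws IOException {
    if (group == null || group.equals(fileStatus.getGroup())) {
      return;  // group already correct (or not requested): no RPC issued
    }
    try {
      fs.setOwner(fileStatus.getPath(), null, group);
    } catch (AccessControlException ignore) {
      // Mirror the original behaviour: some users have a wrong table group, ignore it.
    }
  }
}
{code}

Whether this alone is enough depends on how often the group actually differs in practice; the attached patch may well take a different approach.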



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17802) Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer

2017-10-16 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17802:

Status: Patch Available  (was: Open)

> Remove unnecessary calls to FileSystem.setOwner() from 
> FileOutputCommitterContainer
> ---
>
> Key: HIVE-17802
> URL: https://issues.apache.org/jira/browse/HIVE-17802
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
> Attachments: HIVE-17802.1.patch
>
>
> For large Pig/HCat queries that produce a large number of 
> partitions/directories/files, we have seen cases where the HDFS NameNode 
> groaned under the weight of {{FileSystem.setOwner()}} calls, originating from 
> the commit-step. This was the result of the following code in 
> FileOutputCommitterContainer:
> {code:java}
> private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission permission,
>     List acls, String group, boolean recursive) throws IOException {
>   ...
>   if (recursive) {
>     for (FileStatus fileStatus : fs.listStatus(dir)) {
>       if (fileStatus.isDir()) {
>         applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, group, true);
>       } else {
>         fs.setPermission(fileStatus.getPath(), permission);
>         chown(fs, fileStatus.getPath(), group);
>       }
>     }
>   }
> }
> private void chown(FileSystem fs, Path file, String group) throws IOException {
>   try {
>     fs.setOwner(file, null, group);
>   } catch (AccessControlException ignore) {
>     // Some users have wrong table group, ignore it.
>     LOG.warn("Failed to change group of partition directories/files: " + file, ignore);
>   }
> }
> {code}
> One call per file/directory is far too many. We have a patch that reduces the 
> namenode pressure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-16 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17812:
--
Status: Patch Available  (was: Open)

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need to be moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and its 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-16 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17812:
--
Attachment: HIVE-17812.2.patch

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need to be moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and its 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-16 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17812:
--
Status: Open  (was: Patch Available)

Failure in TestTxnCommands is real.  Will post a new patch.

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.patch
>
>
> There are several remaining pieces that need to be moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and its 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17609) Tool to manipulate delegation tokens

2017-10-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206592#comment-16206592
 ] 

Lefty Leverenz commented on HIVE-17609:
---

Can this be documented on an existing wiki page, or does it need a page of its 
own with links from other pages?

Added TODOC2.4 and TODOC2.2 labels (note that it's 2.2.1, not 2.2.0).

> Tool to manipulate delegation tokens
> 
>
> Key: HIVE-17609
> URL: https://issues.apache.org/jira/browse/HIVE-17609
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-17609.1-branch-2.2.patch, 
> HIVE-17609.1-branch-2.patch, HIVE-17609.1.patch, HIVE-17609.2.patch
>
>
> This was precipitated by OOZIE-2797. We had a case in production where the 
> number of active metastore delegation tokens outstripped the ZooKeeper 
> {{jute.maxBuffer}} size. Delegation tokens could neither be fetched nor 
> cancelled. 
> The root cause turned out to be a miscommunication, causing delegation tokens 
> fetched by Oozie *not* to be cancelled automatically from HCat. This was 
> sorted out as part of OOZIE-2797.
> The issue exposed how poor the log messages were in the code pertaining to 
> token fetch/cancellation. We also found the need for a tool to query/list/purge 
> delegation tokens that might have expired already. This patch introduces such 
> a tool and improves the log messages.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17609) Tool to manipulate delegation tokens

2017-10-16 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17609:
--
Labels: TODOC2.2 TODOC2.4  (was: )

> Tool to manipulate delegation tokens
> 
>
> Key: HIVE-17609
> URL: https://issues.apache.org/jira/browse/HIVE-17609
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Security
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-17609.1-branch-2.2.patch, 
> HIVE-17609.1-branch-2.patch, HIVE-17609.1.patch, HIVE-17609.2.patch
>
>
> This was precipitated by OOZIE-2797. We had a case in production where the 
> number of active metastore delegation tokens outstripped the ZooKeeper 
> {{jute.maxBuffer}} size. Delegation tokens could neither be fetched nor 
> cancelled. 
> The root cause turned out to be a miscommunication, causing delegation tokens 
> fetched by Oozie *not* to be cancelled automatically from HCat. This was 
> sorted out as part of OOZIE-2797.
> The issue exposed how poor the log messages were in the code pertaining to 
> token fetch/cancellation. We also found the need for a tool to query/list/purge 
> delegation tokens that might have expired already. This patch introduces such 
> a tool and improves the log messages.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17672) Upgrade Calcite version to 1.14

2017-10-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17672:
---
Attachment: HIVE-17672.05.patch

> Upgrade Calcite version to 1.14
> ---
>
> Key: HIVE-17672
> URL: https://issues.apache.org/jira/browse/HIVE-17672
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17672.01.patch, HIVE-17672.02.patch, 
> HIVE-17672.03.patch, HIVE-17672.04.patch, HIVE-17672.05.patch
>
>
> Calcite 1.14.0 has been recently released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206569#comment-16206569
 ] 

Hive QA commented on HIVE-17371:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892436/HIVE-17371.05.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7327/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7327/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7327/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-10-16 20:39:12.652
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7327/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-16 20:39:12.655
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   031cfa2..599a74f  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 031cfa2 HIVE-17214 heck/fix conversion of unbucketed non-acid to 
acid (Eugene Koifman, reviewed by Sergey Shelukhin)
+ git clean -f -d
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 599a74f HIVE-17391 Compaction fails if there is an empty value 
in tblproperties (Steve Yeom via Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-16 20:39:17.498
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p1
patching file beeline/src/test/org/apache/hive/beeline/ProxyAuthTest.java
patching file 
hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java
patching file 
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/Security.java
patching file 
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java
patching file 
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestHiveAuthFactory.java
patching file 
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithDBTokenStore.java
patching file 
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/security/TestDBTokenStore.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/security/TestZooKeeperTokenStore.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestDBTokenStore.java
patching file 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java
patching file jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
patching file jdbc/src/java/org/apache/hive/jdbc/Utils.java
patching file 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
patching file 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
patching file 
metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
patching file service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java
patching file service/src/java/org/apache/hive/service/auth/HttpAuthUtils.java
patching file 
service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java
patching file 
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
patching file 
service/src/java/org/apache/hive/service/cli/session/SessionUtils.java
patching file 

[jira] [Commented] (HIVE-17731) add a backward compat option for external users to HIVE-11985

2017-10-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206550#comment-16206550
 ] 

Lefty Leverenz commented on HIVE-17731:
---

Doc note:  This adds *hive.legacy.schema.for.all.serdes* to HiveConf.java, so 
it needs to be documented in the wiki.

* [Configuration Properties -- SerDes | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-SerDes]

Added a TODOC2.4 label.

> add a backward compat option for external users to HIVE-11985
> -
>
> Key: HIVE-17731
> URL: https://issues.apache.org/jira/browse/HIVE-17731
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.4
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17731.patch
>
>
> See HIVE-11985.
> Some external callers (e.g. Presto) do not appear to process types from 
> deserializer correctly, relying on DB types. Ideally, it should be resolved 
> via HIVE-17714, hiding the custom SerDe logic from users.
> For now we can add a backward compatibility config for such cases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17710) LockManager and External tables

2017-10-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206547#comment-16206547
 ] 

Eugene Koifman commented on HIVE-17710:
---

Also consider closing an implicit txn immediately after compiling if we know it 
only uses External tables.

Somewhat orthogonal: it may be useful to have a ReadOnly txn mode - this could 
simplify TxnHandler.commitTxn().

> LockManager and External tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Should the LM take locks on External tables?  Out of the box, the Acid LM is 
> conservative, which can cause throughput issues.
> A better strategy may be to exclude External tables but enable an explicit "lock 
> table/partition" command (only on external tables?).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17731) add a backward compat option for external users to HIVE-11985

2017-10-16 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17731:
--
Labels: TODOC2.4  (was: )

> add a backward compat option for external users to HIVE-11985
> -
>
> Key: HIVE-17731
> URL: https://issues.apache.org/jira/browse/HIVE-17731
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.4
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17731.patch
>
>
> See HIVE-11985.
> Some external callers (e.g. Presto) do not appear to process types from 
> deserializer correctly, relying on DB types. Ideally, it should be resolved 
> via HIVE-17714, hiding the custom SerDe logic from users.
> For now we can add a backward compatibility config for such cases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16139) Clarify Acid concurrency model

2017-10-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206541#comment-16206541
 ] 

Eugene Koifman edited comment on HIVE-16139 at 10/16/17 8:22 PM:
-

https://community.hortonworks.com/questions/140685/locking-granularity-on-a-table.html

http://mail-archives.apache.org/mod_mbox/hive-user/201710.mbox/browser





was (Author: ekoifman):
https://community.hortonworks.com/questions/140685/locking-granularity-on-a-table.html


> Clarify Acid concurrency model
> --
>
> Key: HIVE-16139
> URL: https://issues.apache.org/jira/browse/HIVE-16139
> Project: Hive
>  Issue Type: Task
>  Components: Documentation, Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Need to clarify the rules in 1 place - it's spread out across multiple 
> locations.
> FYI [~cartershanklin]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17760) Create a unit test which validates HIVE-9423 does not regress

2017-10-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206543#comment-16206543
 ] 

Hive QA commented on HIVE-17760:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892431/HIVE-17760.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11234 tests 
executed
*Failed tests:*
{noformat}
TestThriftCLIServiceWithHttp - did not produce a TEST-*.xml file (likely timed 
out) (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hive.beeline.TestBeelinePasswordOption.testMultiConnect (batchId=224)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7326/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7326/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7326/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892431 - PreCommit-HIVE-Build

> Create a unit test which validates HIVE-9423 does not regress 
> --
>
> Key: HIVE-17760
> URL: https://issues.apache.org/jira/browse/HIVE-17760
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17760.1.patch, HIVE-17760.2.patch, 
> HIVE-17760.3.patch, HIVE-17760.4.patch
>
>
> During [HIVE-9423] we verified that when the Thrift server pool is exhausted, 
> the Beeline connection times out and provides a meaningful error message.
> Create a unit test which verifies this and helps to keep this feature working.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16139) Clarify Acid concurrency model

2017-10-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206541#comment-16206541
 ] 

Eugene Koifman commented on HIVE-16139:
---

https://community.hortonworks.com/questions/140685/locking-granularity-on-a-table.html


> Clarify Acid concurrency model
> --
>
> Key: HIVE-16139
> URL: https://issues.apache.org/jira/browse/HIVE-16139
> Project: Hive
>  Issue Type: Task
>  Components: Documentation, Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Need to clarify the rules in 1 place - it's spread out across multiple 
> locations.
> FYI [~cartershanklin]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17391) Compaction fails if there is an empty value in tblproperties

2017-10-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-17391.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

committed to master (Hive 3.0)
thanks Steve for the contribution

> Compaction fails if there is an empty value in tblproperties
> 
>
> Key: HIVE-17391
> URL: https://issues.apache.org/jira/browse/HIVE-17391
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Steve Yeom
> Fix For: 3.0.0
>
> Attachments: HIVE-17391.01.patch, HIVE-17391.02.patch, 
> HIVE-17391.03.patch
>
>
> create table t1 (a int) tblproperties ('serialization.null.format'='');
> alter table t1 compact 'major';
> fails



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17756) Enable subquery related Qtests for Hive on Spark

2017-10-16 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206531#comment-16206531
 ] 

Vineet Garg commented on HIVE-17756:


[~dapengsun] Can you open a jira and regenerate failing tests? I can take a 
look then.

> Enable subquery related Qtests for Hive on Spark
> 
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 3.0.0
>
> Attachments: HIVE-17756.001.patch
>
>
> HIVE-15456 and HIVE-15192 use Calcite to decorrelate and plan subqueries. 
> This JIRA is to introduce subquery tests and verify the subquery plans for 
> Hive on Spark.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17726) Using exists may lead to incorrect results

2017-10-16 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206525#comment-16206525
 ] 

Peter Vary commented on HIVE-17726:
---

[~vgarg]: Check with [~dapengsun] and [~xuefuz]; they might already be working 
on a fix started from HIVE-17756.

Thanks,
Peter

> Using exists may lead to incorrect results
> --
>
> Key: HIVE-17726
> URL: https://issues.apache.org/jira/browse/HIVE-17726
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zoltan Haindrich
>Assignee: Vineet Garg
> Attachments: HIVE-17726.1.patch, HIVE-17726.2.patch, 
> HIVE-17726.3.patch
>
>
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer);
> insert into tx1   values  (1, 1),
> (1, 2),
> (1, 3);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b limit 1);
> {code}
> current results are 6 and 2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17391) Compaction fails if there is an empty value in tblproperties

2017-10-16 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206511#comment-16206511
 ] 

Steve Yeom edited comment on HIVE-17391 at 10/16/17 8:02 PM:
-

[~eugene.koifman]
Reflected on Eugene's comments. 
Thanks, Eugene. 


was (Author: steveyeom2017):
Reflected on Eugene's comments. 
Thanks, Eugene. 

> Compaction fails if there is an empty value in tblproperties
> 
>
> Key: HIVE-17391
> URL: https://issues.apache.org/jira/browse/HIVE-17391
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Steve Yeom
> Attachments: HIVE-17391.01.patch, HIVE-17391.02.patch, 
> HIVE-17391.03.patch
>
>
> create table t1 (a int) tblproperties ('serialization.null.format'='');
> alter table t1 compact 'major';
> fails



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17391) Compaction fails if there is an empty value in tblproperties

2017-10-16 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-17391:
--
Attachment: HIVE-17391.03.patch

Reflected on Eugene's comments. 
Thanks, Eugene. 

> Compaction fails if there is an empty value in tblproperties
> 
>
> Key: HIVE-17391
> URL: https://issues.apache.org/jira/browse/HIVE-17391
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Steve Yeom
> Attachments: HIVE-17391.01.patch, HIVE-17391.02.patch, 
> HIVE-17391.03.patch
>
>
> create table t1 (a int) tblproperties ('serialization.null.format'='');
> alter table t1 compact 'major';
> fails



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17391) Compaction fails if there is an empty value in tblproperties

2017-10-16 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-17391:
--
Status: Open  (was: Patch Available)

> Compaction fails if there is an empty value in tblproperties
> 
>
> Key: HIVE-17391
> URL: https://issues.apache.org/jira/browse/HIVE-17391
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.3.0, 2.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Steve Yeom
> Attachments: HIVE-17391.01.patch, HIVE-17391.02.patch
>
>
> create table t1 (a int) tblproperties ('serialization.null.format'='');
> alter table t1 compact 'major';
> fails



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17534) Add a config to turn off parquet vectorization

2017-10-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206502#comment-16206502
 ] 

Lefty Leverenz commented on HIVE-17534:
---

Doc note:  This adds *hive.vectorized.input.format.excludes* to HiveConf.java.  
Thanks for the doc, Vihang.

* [hive.vectorized.input.format.excludes | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.input.format.excludes]

> Add a config to turn off parquet vectorization
> --
>
> Key: HIVE-17534
> URL: https://issues.apache.org/jira/browse/HIVE-17534
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17534.01.patch, HIVE-17534.02.patch, 
> HIVE-17534.03.patch, HIVE-17534.04-branch-2.patch
>
>
> It should be a good addition to give an option for users to turn off parquet 
> vectorization without affecting vectorization on other file formats. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17815) prevent OOM with Atlas Hive hook

2017-10-16 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206444#comment-16206444
 ] 

Thejas M Nair commented on HIVE-17815:
--

[~anishek] +1. Can you also add a comment in the code about why that's done?


> prevent OOM with Atlas Hive hook 
> -
>
> Key: HIVE-17815
> URL: https://issues.apache.org/jira/browse/HIVE-17815
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17815.0.patch
>
>
> As part of HIVE-17814 we are going to handle the issue w.r.t. Hive as well 
> as the post-execution hook APIs. However, for Atlas, which is a commonly used 
> Hive post-execution hook, we want to prevent additional memory usage. Also, 
> Atlas currently does not handle / work on replication queries, so 
> overloading the hookContext with TaskRunner objects is just using a lot of 
> memory.  The same should be true for other execution hooks as well, since 
> replication is a new functionality.
> This task is to reduce that for replication-related queries. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

