[jira] [Commented] (HIVE-21922) Allow keytabs to be reused in LLAP yarn applications through Yarn localization

2019-07-30 Thread Adam Szita (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895934#comment-16895934
 ] 

Adam Szita commented on HIVE-21922:
---

After consulting with other folks, it looks like this change is not desirable. 
In the Hadoop world we're abusing Kerberos entities, i.e. hive/host1@realm and 
hive/host2@realm are interpreted by UGI as the same Hive user. Still, we need 
different principals per host: if a single principal were used for Hive across 
the whole cluster, LDAP could revoke permissions because of the frequent 
renewals it would see.

Thus marking this change as resolved.
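
To illustrate the point about hive/host1@realm and hive/host2@realm resolving to the same user, here is a minimal sketch. It assumes a mapping that keeps only the primary component of a principal of the form primary/instance@REALM. The class and method names are hypothetical, and this is not Hadoop's actual principal-mapping code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified illustration only; not the Hadoop/UGI implementation. */
final class PrincipalShortNameSketch {
    // Assumed principal layout: primary[/instance][@REALM]
    private static final Pattern PRINCIPAL =
        Pattern.compile("([^/@]+)(?:/([^/@]+))?(?:@([^/@]+))?");

    /** Returns the primary component, dropping the host instance and the realm. */
    static String shortName(String principal) {
        Matcher m = PRINCIPAL.matcher(principal);
        if (!m.matches()) {
            throw new IllegalArgumentException("Malformed principal: " + principal);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        // Two distinct Kerberos principals, one per host...
        System.out.println(shortName("hive/host1@realm"));
        // ...that collapse to the same short user name under this mapping.
        System.out.println(shortName("hive/host2@realm"));
    }
}
```

Under this assumption the per-host principals stay distinct for Kerberos while the application layer sees a single user.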

> Allow keytabs to be reused in LLAP yarn applications through Yarn localization
> --
>
> Key: HIVE-21922
> URL: https://issues.apache.org/jira/browse/HIVE-21922
> Project: Hive
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-21922.0.patch, HIVE-21922.1.patch, 
> HIVE-21922.2.patch
>
>
> In secure clusters LLAP has to be able to reach keytab files for kerberos 
> login.
> Currently _hive.llap.task.scheduler.am.registry.keytab.file_ and 
> _hive.llap.daemon.keytab.file_ configs are used to define the path of such 
> keytabs on the Tez AM and LLAP daemon side respectively. Both presume local 
> file system paths only - hence all nodes in the LLAP cluster (even those that 
> eventually don't end up executing a daemon...) have to have Hive's keytab 
> preinstalled on them.
> The above is described by this strategy: 
> [Pre-installed_Keytabs_for_AM_and_containers|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Pre-installed_Keytabs_for_AM_and_containers]
> Another approach can be 
> [Keytabs_for_AM_and_containers_distributed_via_YARN|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Keytabs_for_AM_and_containers_distributed_via_YARN]
>  where we rely on HDFS and Yarn resource localization, and no prior keytab 
> distribution is required. I intend to make this strategy an option for 
> Hive-LLAP in this jira.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-21922) Allow keytabs to be reused in LLAP yarn applications through Yarn localization

2019-07-30 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-21922:
--
Resolution: Invalid
Status: Resolved  (was: Patch Available)



[jira] [Updated] (HIVE-22059) hive-exec jar doesn't contain (fasterxml) jackson library

2019-07-30 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22059:

Attachment: HIVE-22059.02.patch

> hive-exec jar doesn't contain (fasterxml) jackson library
> -
>
> Key: HIVE-22059
> URL: https://issues.apache.org/jira/browse/HIVE-22059
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22059.01.patch, HIVE-22059.02.patch
>
>
> While deploying the master branch into a container I've noticed that the jackson 
> libraries are not guaranteed to be available at runtime - this is probably 
> due to the fact that we are still using both the "old" codehaus jackson and 
> the "new" fasterxml one.
> {code:java}
> ]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1564408646590_0005_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1564408646590_0005_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
> INFO : Completed executing command(queryId=vagrant_20190729141949_8d8c7f0d-0ac4-4d76-ba12-6ec01561b040); Time taken: 5.127 seconds
> INFO : Concurrency mode is disabled, not creating a lock manager
> Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1564408646590_0005_1_00, diagnostics=[Vertex vertex_1564408646590_0005_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: _dummy_table initializer failed, vertex=vertex_1564408646590_0005_1_00 [Map 1], java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/ObjectMapper
> at org.apache.hadoop.hive.ql.exec.Utilities.(Utilities.java:226)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:428)
> at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:508)
> at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:488)
> at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:337)
> at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:122)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
> at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.databind.ObjectMapper
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 19 more
> ]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1564408646590_0005_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1564408646590_0005_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 (state=08S01,code=2)
> {code}





[jira] [Commented] (HIVE-22054) Avoid recursive listing to check if a directory is empty

2019-07-30 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896009#comment-16896009
 ] 

Steve Loughran commented on HIVE-22054:
---

You are correct, the getContentSummary call will be horribly slow on S3; I didn't 
know anyone used it. Filed HADOOP-16468 to speed it up, but it'll still be 
issuing {{descendants/1000}} LIST calls, which cost money as well as time.

For directories where the parent is deleted, things are low cost today; this 
patch will deliver significant speedups in the case where the parent directory 
is not empty and one or more subdirectories have a deep tree. It's the depth 
that is potentially more expensive than the number of entries in a directory.



> Avoid recursive listing to check if a directory is empty
> 
>
> Key: HIVE-22054
> URL: https://issues.apache.org/jira/browse/HIVE-22054
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0, 1.2.0, 2.1.0, 3.1.1, 2.3.5
>Reporter: Prabhas Kumar Samanta
>Assignee: Prabhas Kumar Samanta
>Priority: Major
> Attachments: HIVE-22054.2.patch, HIVE-22054.patch
>
>
> During drop partition on a managed table, we first delete the directory 
> corresponding to the partition. After that we recursively delete the parent 
> directory as well if the parent directory becomes empty. To do this emptiness 
> check, we call Warehouse::getContentSummary(), which in turn recursively 
> checks all files and subdirectories. This is a costly operation when a 
> directory has a lot of files or subdirectories. The overhead is even more 
> prominent for cloud-based file systems like S3, and for an emptiness check it 
> is unnecessary.
> This recursive listing was introduced as part of HIVE-5220. Code snippet 
> for reference:
> {code:java}
> // Warehouse.java
> public boolean isEmpty(Path path) throws IOException, MetaException {
>   ContentSummary contents = getFs(path).getContentSummary(path);
>   if (contents != null && contents.getFileCount() == 0
>       && contents.getDirectoryCount() == 1) {
>     return true;
>   }
>   return false;
> }
> // HiveMetaStore.java
> private void deleteParentRecursive(Path parent, int depth, boolean mustPurge,
>     boolean needRecycle) throws IOException, MetaException {
>   if (depth > 0 && parent != null && wh.isWritable(parent)) {
>     if (wh.isDir(parent) && wh.isEmpty(parent)) {
>       wh.deleteDir(parent, true, mustPurge, needRecycle);
>     }
>     deleteParentRecursive(parent.getParent(), depth - 1, mustPurge, needRecycle);
>   }
> }
> // Note: FileSystem::getContentSummary() performs a recursive listing.{code}
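
For comparison, a non-recursive emptiness check can stop at the first direct child. The sketch below is not the attached patch: it uses java.nio on a local file system purely for illustration, and the class name is hypothetical. With a Hadoop FileSystem the analogous approach would be a single-level listing of the directory instead of getContentSummary().

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustration on the local file system; the Hive code works on Hadoop paths. */
final class ShallowEmptyCheckSketch {
    /** True if dir has no direct children; never descends into subdirectories. */
    static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            // Looking at the first entry is enough; deeper levels are irrelevant.
            return !children.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        Path parent = Files.createTempDirectory("partition-parent");
        System.out.println(isEmpty(parent));   // no children yet
        Path child = Files.createDirectory(parent.resolve("ds=2019-07-30"));
        System.out.println(isEmpty(parent));   // one child, however deep its tree
        Files.delete(child);
        Files.delete(parent);
    }
}
```

The cost of this check depends only on the immediate directory, not on the depth or size of the tree below it.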





[jira] [Assigned] (HIVE-22060) Replacing "catch Throwable" with a more restricted exception class

2019-07-30 Thread Ivan Suller (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Suller reassigned HIVE-22060:
--


> Replacing "catch Throwable" with a more restricted exception class
> --
>
> Key: HIVE-22060
> URL: https://issues.apache.org/jira/browse/HIVE-22060
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Major
>
> Catching Throwable is considered unsafe in Java. A Throwable can be any Error, 
> and Errors are JVM errors after which the state of the JVM is not guaranteed; 
> thus the cleanest way to "handle" such an error is to let it kill the current thread.
> I ran a quick scan and found almost 400 "catch Throwable" in the current 
> codebase. I opened this ticket as a conversation starter to:
> - discuss if we want to eliminate this issue
> - if we want to do it what's the best way to do it
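
As a starting point for the discussion, here is a minimal sketch of the narrower pattern. The class and method names are hypothetical and not taken from the Hive codebase. It catches Exception, so ordinary failures are handled while an Error still propagates to the caller.

```java
/** Hypothetical example; not taken from the Hive codebase. */
final class NarrowCatchSketch {
    /** Runs a task, handling ordinary exceptions but letting JVM Errors propagate. */
    static String runSafely(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (Exception e) {
            // Exception does not cover Error subclasses such as OutOfMemoryError.
            return "handled: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runSafely(() -> { throw new IllegalStateException("bad input"); }));
        try {
            runSafely(() -> { throw new OutOfMemoryError("simulated"); });
        } catch (OutOfMemoryError e) {
            System.out.println("Error escaped the narrow catch: " + e.getMessage());
        }
    }
}
```

Whether Exception or something even narrower is the right replacement would have to be decided per call site.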





[jira] [Updated] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20683:

Attachment: HIVE-20683.2.patch

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, HIVE-20683.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0





[jira] [Updated] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20683:

Attachment: (was: HIVE-20683.2.patch)



[jira] [Updated] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20683:

Attachment: HIVE-20683.2.patch



[jira] [Commented] (HIVE-22059) hive-exec jar doesn't contain (fasterxml) jackson library

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896153#comment-16896153
 ] 

Hive QA commented on HIVE-22059:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976196/HIVE-22059.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18201/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18201/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18201/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-07-30 13:51:08.433
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-18201/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-07-30 13:51:08.437
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at ba26fcf HIVE-22036 : HMS should identify events corresponding to 
replicated database for Atlas HMS hook. (Ashutosh Bapat reviewed by Mahesh 
Kumar Behera)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at ba26fcf HIVE-22036 : HMS should identify events corresponding to 
replicated database for Atlas HMS hook. (Ashutosh Bapat reviewed by Mahesh 
Kumar Behera)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-07-30 13:51:11.403
+ rm -rf ../yetus_PreCommit-HIVE-Build-18201
+ mkdir ../yetus_PreCommit-HIVE-Build-18201
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-18201
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-18201/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
fatal: unrecognized input
fatal: unrecognized input
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-18201
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976196 - PreCommit-HIVE-Build


[jira] [Assigned] (HIVE-22062) WriteId is not updated for a partitioned ACID table when schema changes

2019-07-30 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab reassigned HIVE-22062:
---


> WriteId is not updated for a partitioned ACID table when schema changes
> ---
>
> Key: HIVE-22062
> URL: https://issues.apache.org/jira/browse/HIVE-22062
> Project: Hive
>  Issue Type: Bug
>Reporter: Gabor Kaszab
>Assignee: Laszlo Kovari
>Priority: Major
>  Labels: ACID
>
> Changing the schema (e.g. adding a new column) of a non-partitioned ACID 
> table results in the table-level writeId being incremented.
> However, if you do the same on a partitioned ACID table then neither the 
> table-level nor the partition-level writeIds are updated. In this case I would 
> expect the table-level writeId to be incremented to reflect that the table has 
> been changed.
> Note that get_valid_write_ids() shows that the high watermark is incremented 
> even though the writeId isn't.





[jira] [Updated] (HIVE-22062) WriteId is not updated for a partitioned ACID table when schema changes

2019-07-30 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated HIVE-22062:

Description: 
Changing the schema (e.g. adding a new column) of a non-partitioned ACID table 
results in the table-level writeId being incremented. This is as expected.

However, if you do the same on a partitioned ACID table then neither the 
table-level nor the partition-level writeIds are updated. In this case I would 
expect the table-level writeId to be incremented to reflect that the table has 
been changed.
Note that get_valid_write_ids() shows that the high watermark is incremented 
even though the writeId isn't.

  was:
Changing the schema (e.g. adding a new column) of a non-partitioned ACID table 
results in the table-level writeId being incremented.

However, if you do the same on a partitioned ACID table then neither the 
table-level nor the partition-level writeIds are updated. In this case I would 
expect the table-level writeId to be incremented to reflect that the table has 
been changed.
Note that get_valid_write_ids() shows that the high watermark is incremented 
even though the writeId isn't.


> WriteId is not updated for a partitioned ACID table when schema changes
> ---
>
> Key: HIVE-22062
> URL: https://issues.apache.org/jira/browse/HIVE-22062
> Project: Hive
>  Issue Type: Bug
>Reporter: Gabor Kaszab
>Assignee: Laszlo Kovari
>Priority: Major
>  Labels: ACID
>
> Changing the schema (e.g. adding a new column) of a non-partitioned ACID 
> table results in the table-level writeId being incremented. This is as 
> expected.
> However, if you do the same on a partitioned ACID table then neither the 
> table-level nor the partition-level writeIds are updated. In this case I would 
> expect the table-level writeId to be incremented to reflect that the table has 
> been changed.
> Note that get_valid_write_ids() shows that the high watermark is incremented 
> even though the writeId isn't.





[jira] [Work logged] (HIVE-21960) HMS tasks on replica

2019-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21960?focusedWorklogId=285083&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-285083
 ]

ASF GitHub Bot logged work on HIVE-21960:
-

Author: ASF GitHub Bot
Created on: 30/Jul/19 16:27
Start Date: 30/Jul/19 16:27
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #735: 
HIVE-21960 : Avoid running stats updater and partition management task on a 
replicated table.
URL: https://github.com/apache/hive/pull/735#discussion_r308820482
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
 ##
 @@ -220,6 +221,16 @@ private void stopWorkers() {
 String skipParam = table.getParameters().get(SKIP_STATS_AUTOUPDATE_PROPERTY);
 if ("true".equalsIgnoreCase(skipParam)) return null;
 
+// If the table is being replicated into,
+// 1. the stats are also replicated from the source, so we don't need those to be calculated
+//    on the target again
+// 2. updating stats requires a writeId to be created. Hence writeIds on source and target
+//    can get out of sync when stats are updated. That can cause consistency issues.
+String replTrgtParam = table.getParameters().get(ReplConst.REPL_TARGET_PROPERTY);
 
 Review comment:
   Done. Please check.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 285083)
Time Spent: 1h 20m  (was: 1h 10m)

> HMS tasks on replica
> 
>
> Key: HIVE-21960
> URL: https://issues.apache.org/jira/browse/HIVE-21960
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21960.01.patch, HIVE-21960.02.patch, 
> HIVE-21960.03.patch, Replication and House keeping tasks.pdf
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> An HMS performs a number of housekeeping tasks. Assess whether
>  # They are required to be performed in the replicated data
>  # Performing those on replicated data causes any issues and how to fix those.





[jira] [Updated] (HIVE-21960) HMS tasks on replica

2019-07-30 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21960:
--
Status: In Progress  (was: Patch Available)



[jira] [Updated] (HIVE-21960) HMS tasks on replica

2019-07-30 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21960:
--
Attachment: HIVE-21960.04.patch
Status: Patch Available  (was: In Progress)

Addressed [~maheshk114]'s comments and rebased the patches.

> HMS tasks on replica
> 
>
> Key: HIVE-21960
> URL: https://issues.apache.org/jira/browse/HIVE-21960
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21960.01.patch, HIVE-21960.02.patch, 
> HIVE-21960.03.patch, HIVE-21960.04.patch, Replication and House keeping 
> tasks.pdf
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> An HMS performs a number of housekeeping tasks. Assess whether
>  # They are required to be performed in the replicated data
>  # Performing those on replicated data causes any issues and how to fix those.





[jira] [Work logged] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22046?focusedWorklogId=285096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-285096
 ]

ASF GitHub Bot logged work on HIVE-22046:
-

Author: ASF GitHub Bot
Created on: 30/Jul/19 16:50
Start Date: 30/Jul/19 16:50
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #741: HIVE-22046
URL: https://github.com/apache/hive/pull/741
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 285096)
Time Spent: 10m
Remaining Estimate: 0h

> Differentiate among column stats computed by different engines
> --
>
> Key: HIVE-22046
> URL: https://issues.apache.org/jira/browse/HIVE-22046
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22046.01.patch, HIVE-22046.02.patch, 
> HIVE-22046.03.patch, HIVE-22046.04.patch, HIVE-22046.05.patch, 
> HIVE-22046.06.patch, HIVE-22046.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is to keep column stats computed by different engines, e.g., Hive and 
> Impala, from stepping on each other. In the longer term, we may introduce a common 
> representation for the column statistics stored by different engines.
> For this issue, we will add a new column 'engine' to the TAB_COL_STATS HMS table 
> (unpartitioned tables) and to the PART_COL_STATS HMS table (partitioned tables). 
> This will prevent conflicts in column-level stats.
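
As a rough illustration of the 'engine' column described above, the following Java sketch models the stats table as an in-memory map keyed by table, column, and engine. The class and method names (EngineScopedColumnStats, put, get) and the use of a distinct-value count as the stored statistic are hypothetical; they are not taken from the HIVE-22046 patch or from the HMS schema. The sketch only shows why adding the engine to the key keeps one engine's stats from overwriting another's.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical in-memory stand-in for TAB_COL_STATS; not the HMS schema or API.
final class EngineScopedColumnStats {

    // Composite key. Including 'engine' is the point of the sketch: without it,
    // Hive and Impala stats for the same table/column would share one entry.
    private static final class Key {
        private final String table;
        private final String column;
        private final String engine;

        Key(String table, String column, String engine) {
            this.table = table;
            this.column = column;
            this.engine = engine;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return table.equals(other.table)
                && column.equals(other.column)
                && engine.equals(other.engine);
        }

        @Override
        public int hashCode() {
            return Objects.hash(table, column, engine);
        }
    }

    // Maps (table, column, engine) to a number-of-distinct-values estimate.
    private final Map<Key, Long> ndvByKey = new HashMap<>();

    void put(String table, String column, String engine, long ndv) {
        ndvByKey.put(new Key(table, column, engine), ndv);
    }

    Optional<Long> get(String table, String column, String engine) {
        return Optional.ofNullable(ndvByKey.get(new Key(table, column, engine)));
    }
}
```

With this key, a write by one engine leaves the entry of another engine for the same table and column untouched, which is the conflict the description wants to avoid.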





[jira] [Updated] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22046:
--
Labels: pull-request-available  (was: )

> Differentiate among column stats computed by different engines
> --
>
> Key: HIVE-22046
> URL: https://issues.apache.org/jira/browse/HIVE-22046
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22046.01.patch, HIVE-22046.02.patch, 
> HIVE-22046.03.patch, HIVE-22046.04.patch, HIVE-22046.05.patch, 
> HIVE-22046.06.patch, HIVE-22046.patch
>
>
> The goal is to keep column stats computed by different engines, e.g., Hive and 
> Impala, from stepping on each other. In the longer term, we may introduce a common 
> representation for the column statistics stored by different engines.
> For this issue, we will add a new column 'engine' to the TAB_COL_STATS HMS table 
> (unpartitioned tables) and to the PART_COL_STATS HMS table (partitioned tables). 
> This will prevent conflicts in column-level stats.





[jira] [Updated] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22046:
---
Attachment: HIVE-22046.07.patch

> Differentiate among column stats computed by different engines
> --
>
> Key: HIVE-22046
> URL: https://issues.apache.org/jira/browse/HIVE-22046
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22046.01.patch, HIVE-22046.02.patch, 
> HIVE-22046.03.patch, HIVE-22046.04.patch, HIVE-22046.05.patch, 
> HIVE-22046.06.patch, HIVE-22046.07.patch, HIVE-22046.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is to keep column stats computed by different engines, e.g., Hive and 
> Impala, from stepping on each other. In the longer term, we may introduce a common 
> representation for the column statistics stored by different engines.
> For this issue, we will add a new column 'engine' to the TAB_COL_STATS HMS table 
> (unpartitioned tables) and to the PART_COL_STATS HMS table (partitioned tables). 
> This will prevent conflicts in column-level stats.





[jira] [Commented] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896367#comment-16896367
 ] 

Vineet Garg commented on HIVE-21991:


[~jcamachorodriguez] Can you take a look please?

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Assigned] (HIVE-22063) Ranger Authorization in Hive based on object ownership - HMS code path

2019-07-30 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An reassigned HIVE-22063:
-


> Ranger Authorization in Hive based on object ownership - HMS code path
> --
>
> Key: HIVE-22063
> URL: https://issues.apache.org/jira/browse/HIVE-22063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
>
> This takes care of adding the owner and ownertype in the HMS code path





[jira] [Work started] (HIVE-22063) Ranger Authorization in Hive based on object ownership - HMS code path

2019-07-30 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22063 started by Sam An.
-
> Ranger Authorization in Hive based on object ownership - HMS code path
> --
>
> Key: HIVE-22063
> URL: https://issues.apache.org/jira/browse/HIVE-22063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
>
> This takes care of adding the owner and ownertype in the HMS code path





[jira] [Updated] (HIVE-22063) Ranger Authorization in Hive based on object ownership - HMS code path

2019-07-30 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-22063:
--
Attachment: HIVE-22063.1.patch
Status: Patch Available  (was: In Progress)

> Ranger Authorization in Hive based on object ownership - HMS code path
> --
>
> Key: HIVE-22063
> URL: https://issues.apache.org/jira/browse/HIVE-22063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-22063.1.patch
>
>
> This takes care of adding the owner and ownertype in the HMS code path





[jira] [Commented] (HIVE-21917) COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-07-30 Thread Rajkumar Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896377#comment-16896377
 ] 

Rajkumar Singh commented on HIVE-21917:
---

It looks like these two methods from the Initiator thread will invoke getListing, 
getFileInfo and a UGI close on HDFS:
https://github.com/apache/hive/blob/c5624f62b42a038864d3c79e44a778d64d05a1f7/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorThread.java#L165
https://github.com/apache/hive/blob/a72b9ac6dd7a2e027b30a01effbce4668324f055/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L265


> COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs
> 
>
> Key: HIVE-21917
> URL: https://issues.apache.org/jira/browse/HIVE-21917
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Craig Condit
>Priority: Major
>
> The Initiator thread in the metastore repeatedly loops over entries in the 
> COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
> need to be compacted. However, entries are never removed from this table 
> except by a completed Compactor run.
> In a cluster where most tables / partitions are write-once read-many, this 
> results in stale entries in this table never being cleaned up. In a small 
> test cluster, we have observed approximately 45k entries in this table 
> (virtually equal to the number of partitions in the cluster) while < 100 of 
> these tables have delta files at all. Since most of the tables will never get 
> enough writes to trigger a compaction (and in fact have only ever been 
> written to once), the initiator thread keeps trying to evaluate them on every 
> loop.
> On this test cluster, it takes approximately 10 minutes to loop through all 
> the entries and results in severe performance degradation on metastore 
> operations. With the default run timing of 5 minutes, the initiator basically 
> never stops running.
> On a production cluster with 2M partitions, this would be a non-starter.
> The initiator thread should proactively remove entries from 
> COMPLETED_TXN_COMPONENTS when it determines that a compaction is not needed, 
> so that they are not evaluated again on the next loop.
>  
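
The proposal in the description above can be sketched in Java as a single pass that collects the candidates needing compaction and drops every other entry, so that it is not evaluated again on the next loop. This is only a sketch under stated assumptions: InitiatorPruningSketch and scanAndPrune are hypothetical names, the Set of strings stands in for the COMPLETED_TXN_COMPONENTS table, and the Predicate stands in for whatever check the Initiator uses to decide that a compaction is needed. None of this is the actual Initiator code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch; the Set stands in for COMPLETED_TXN_COMPONENTS.
final class InitiatorPruningSketch {

    // Returns the entries that need compaction. Entries that do not need one
    // are removed from completedTxnComponents so the next loop skips them.
    static List<String> scanAndPrune(Set<String> completedTxnComponents,
                                     Predicate<String> needsCompaction) {
        List<String> toCompact = new ArrayList<>();
        Iterator<String> it = completedTxnComponents.iterator();
        while (it.hasNext()) {
            String entry = it.next();
            if (needsCompaction.test(entry)) {
                // Kept in the set: per the description, a completed
                // Compactor run is what removes these entries.
                toCompact.add(entry);
            } else {
                // Proactive cleanup proposed in this issue.
                it.remove();
            }
        }
        return toCompact;
    }
}
```

In a write-once read-many cluster like the one described, most entries would take the removal branch on the first pass, so later passes would only iterate over the few entries that still have pending work.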





[jira] [Commented] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896385#comment-16896385
 ] 

Jesus Camacho Rodriguez commented on HIVE-22046:


[~vihang], can you review this patch?
https://github.com/apache/hive/pull/741
Thanks

> Differentiate among column stats computed by different engines
> --
>
> Key: HIVE-22046
> URL: https://issues.apache.org/jira/browse/HIVE-22046
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22046.01.patch, HIVE-22046.02.patch, 
> HIVE-22046.03.patch, HIVE-22046.04.patch, HIVE-22046.05.patch, 
> HIVE-22046.06.patch, HIVE-22046.07.patch, HIVE-22046.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is to keep column stats computed by different engines, e.g., Hive and 
> Impala, from stepping on each other. In the longer term, we may introduce a common 
> representation for the column statistics stored by different engines.
> For this issue, we will add a new column 'engine' to the TAB_COL_STATS HMS table 
> (unpartitioned tables) and to the PART_COL_STATS HMS table (partitioned tables). 
> This will prevent conflicts in column-level stats.





[jira] [Comment Edited] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896385#comment-16896385
 ] 

Jesus Camacho Rodriguez edited comment on HIVE-22046 at 7/30/19 6:28 PM:
-

[~vihangk1], can you review this patch?
https://github.com/apache/hive/pull/741
Thanks


was (Author: jcamachorodriguez):
[~vihang], can you review this patch?
https://github.com/apache/hive/pull/741
Thanks

> Differentiate among column stats computed by different engines
> --
>
> Key: HIVE-22046
> URL: https://issues.apache.org/jira/browse/HIVE-22046
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22046.01.patch, HIVE-22046.02.patch, 
> HIVE-22046.03.patch, HIVE-22046.04.patch, HIVE-22046.05.patch, 
> HIVE-22046.06.patch, HIVE-22046.07.patch, HIVE-22046.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is to keep column stats computed by different engines, e.g., Hive and 
> Impala, from stepping on each other. In the longer term, we may introduce a common 
> representation for the column statistics stored by different engines.
> For this issue, we will add a new column 'engine' to the TAB_COL_STATS HMS table 
> (unpartitioned tables) and to the PART_COL_STATS HMS table (partitioned tables). 
> This will prevent conflicts in column-level stats.





[jira] [Commented] (HIVE-22045) HIVE-21711 introduced regression in data load

2019-07-30 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896393#comment-16896393
 ] 

Vineet Garg commented on HIVE-22045:


bq. Is that going to be another perf regression?
I suppose not. Because of {{isBlobStorage}}, LOAD DATA regressed significantly, 
although it helped CTAS/CM statements. With this patch, the behavior for CTAS/CM 
should be the same as with {{isBlobStorage}}.

bq. The FileSinkDesc:isCTas* method needs more docs in the declaration.
Uploaded a patch which should address this.

> HIVE-21711 introduced regression in data load
> -
>
> Key: HIVE-22045
> URL: https://issues.apache.org/jira/browse/HIVE-22045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22045.1.patch, HIVE-22045.2.patch, 
> HIVE-22045.3.patch, HIVE-22045.4.patch
>
>
> A better fix for HIVE-21711 is to specialize the handling of CTAS/Create MV 
> statements to avoid the intermediate rename operation, but keep the 
> intermediate rename for INSERT etc. statements, since otherwise the final move 
> by file operation is significantly slower for such statements.





[jira] [Updated] (HIVE-22045) HIVE-21711 introduced regression in data load

2019-07-30 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22045:
---
Attachment: HIVE-22045.4.patch

> HIVE-21711 introduced regression in data load
> -
>
> Key: HIVE-22045
> URL: https://issues.apache.org/jira/browse/HIVE-22045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22045.1.patch, HIVE-22045.2.patch, 
> HIVE-22045.3.patch, HIVE-22045.4.patch
>
>
> A better fix for HIVE-21711 is to specialize the handling of CTAS/Create MV 
> statements to avoid the intermediate rename operation, but keep the 
> intermediate rename for INSERT etc. statements, since otherwise the final move 
> by file operation is significantly slower for such statements.





[jira] [Updated] (HIVE-22045) HIVE-21711 introduced regression in data load

2019-07-30 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22045:
---
Status: Open  (was: Patch Available)

> HIVE-21711 introduced regression in data load
> -
>
> Key: HIVE-22045
> URL: https://issues.apache.org/jira/browse/HIVE-22045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22045.1.patch, HIVE-22045.2.patch, 
> HIVE-22045.3.patch, HIVE-22045.4.patch
>
>
> A better fix for HIVE-21711 is to specialize the handling of CTAS/Create MV 
> statements to avoid the intermediate rename operation, but keep the 
> intermediate rename for INSERT etc. statements, since otherwise the final move 
> by file operation is significantly slower for such statements.





[jira] [Updated] (HIVE-22045) HIVE-21711 introduced regression in data load

2019-07-30 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22045:
---
Status: Patch Available  (was: Open)

> HIVE-21711 introduced regression in data load
> -
>
> Key: HIVE-22045
> URL: https://issues.apache.org/jira/browse/HIVE-22045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22045.1.patch, HIVE-22045.2.patch, 
> HIVE-22045.3.patch, HIVE-22045.4.patch
>
>
> A better fix for HIVE-21711 is to specialize the handling of CTAS/Create MV 
> statements to avoid the intermediate rename operation, but keep the 
> intermediate rename for INSERT etc. statements, since otherwise the final move 
> by file operation is significantly slower for such statements.





[jira] [Commented] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896398#comment-16896398
 ] 

Jesus Camacho Rodriguez commented on HIVE-21991:


A couple of questions:
- Merge of files in {{orc_merge9.q}} is not happening anymore, is that expected?
- The _min_ value for some columns in stripe statistics in {{orc_file_dump.q}} 
has changed, is this expected?

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Commented] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896455#comment-16896455
 ] 

Vineet Garg commented on HIVE-21991:


bq Merge of files in orc_merge9.q is not happening anymore, is that expected?
This is expected and is due to {{Writer version mismatch}}. It looks like the 
writer version was updated in ORC 1.5.6.

bq. The min value for some columns in stripe statistics in orc_file_dump.q has 
changed, is this expected?
This is also expected. There is a bug fix in ORC 1.5.6 which fixes incorrect 
min/max.

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Comment Edited] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896455#comment-16896455
 ] 

Vineet Garg edited comment on HIVE-21991 at 7/30/19 7:50 PM:
-

bq. Merge of files in orc_merge9.q is not happening anymore, is that expected?
This is expected and is due to {{Writer version mismatch}}. It looks like the 
writer version was updated in ORC 1.5.6.

bq. The min value for some columns in stripe statistics in orc_file_dump.q has 
changed, is this expected?
This is also expected. There is a bug fix in ORC 1.5.6 which fixes incorrect 
min/max.


was (Author: vgarg):
bq Merge of files in orc_merge9.q is not happening anymore, is that expected?
This is expected and is due to {{Writer version mismatch}}. It looks like the 
writer version was updated in ORC 1.5.6.

bq. The min value for some columns in stripe statistics in orc_file_dump.q has 
changed, is this expected?
This is also expected. There is a bug fix in ORC 1.5.6 which fixes incorrect 
min/max.

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Commented] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896457#comment-16896457
 ] 

Jesus Camacho Rodriguez commented on HIVE-21991:


+1

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Commented] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896460#comment-16896460
 ] 

Hive QA commented on HIVE-20683:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 57s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 11s{color} | {color:blue} ql in master has 2250 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 29s{color} | {color:blue} druid-handler in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 20s{color} | {color:blue} itests/qtest-druid in master has 7 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 42s{color} | {color:red} ql: The patch generated 6 new + 35 unchanged - 0 fixed = 41 total (was 35) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 11s{color} | {color:red} druid-handler: The patch generated 13 new + 0 unchanged - 0 fixed = 13 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 56s{color} | {color:red} root: The patch generated 19 new + 35 unchanged - 0 fixed = 54 total (was 35) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 39s{color} | {color:red} druid-handler generated 2 new + 3 unchanged - 0 fixed = 5 total (was 3) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 31s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:druid-handler |
|  |  Null passed for non-null parameter of new org.apache.druid.segment.virtual.ExpressionVirtualColumn(String, String, ValueType, ExprMacroTable) in org.apache.hadoop.hive.druid.DruidStorageHandlerUtils.extractColName(ExprNodeDesc, List)  At DruidStorageHandlerUtils.java:[line 1165] |
|  |  Switch statement found in org.apache.hadoop.hive.druid.DruidStorageHandlerUtils.addDynamicFilters(Query, ExprNodeGenericFuncDesc, Configuration, boolean) where default case is missing  At DruidStorageHandlerUtils.java:[lines 974-1003] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  checkstyle  |
| uname | Linux hiveptest-server-upstream 3

[jira] [Commented] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896466#comment-16896466
 ] 

Hive QA commented on HIVE-20683:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976218/HIVE-20683.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 16716 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3] (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[semijoin_hint] (batchId=167)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query1b] (batchId=296)
org.apache.hadoop.hive.llap.cache.TestBuddyAllocator.testMTT[2] (batchId=360)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18202/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18202/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18202/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976218 - PreCommit-HIVE-Build

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, HIVE-20683.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates a BETWEEN filter with min-max and a BLOOM 
> filter for filtering one side of a semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0
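
To make the BETWEEN plus Bloom filter combination in the description above more concrete, here is a minimal Java sketch of such a dynamic filter: it is built from the join keys of one side and then used to discard rows on the other side. Everything in it is an assumption made for illustration: the class name DynamicSemiJoinFilterSketch, the long-typed keys, the two hash probes, and the mixing constants are hypothetical, and this is neither Hive's nor Druid's Bloom filter implementation.

```java
import java.util.BitSet;

// Hypothetical sketch of a min-max (BETWEEN) check combined with a Bloom filter.
final class DynamicSemiJoinFilterSketch {

    // Arbitrary odd constants used to derive two different bit positions per key.
    private static final long SEED_A = 0x9E3779B97F4A7C15L;
    private static final long SEED_B = 0xC2B2AE3D27D4EB4FL;

    private final BitSet bits;
    private final int size;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    // size is the number of bits in the filter and must be positive.
    DynamicSemiJoinFilterSketch(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // Called for every join key seen on the build side.
    void add(long key) {
        min = Math.min(min, key);
        max = Math.max(max, key);
        bits.set(index(key, SEED_A));
        bits.set(index(key, SEED_B));
    }

    // Called for every row on the probe side. A false result is definitive;
    // a true result may be a false positive and still has to be joined.
    boolean mightContain(long key) {
        if (key < min || key > max) {
            return false; // the BETWEEN min AND max part
        }
        return bits.get(index(key, SEED_A)) && bits.get(index(key, SEED_B));
    }

    private int index(long key, long seed) {
        long h = (key ^ seed) * 0xFF51AFD7ED558CCDL;
        h ^= (h >>> 33);
        return (int) Math.floorMod(h, (long) size);
    }
}
```

The range check is cheap and rejects keys outside the observed bounds outright, while the bit probes reject many of the keys that fall inside the range but were never added. Neither check can reject a key that was actually added.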





[jira] [Updated] (HIVE-21637) Synchronized metastore cache

2019-07-30 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21637:
--
Attachment: HIVE-21637.52.patch

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, 
> HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.13.patch, 
> HIVE-21637.14.patch, HIVE-21637.15.patch, HIVE-21637.16.patch, 
> HIVE-21637.17.patch, HIVE-21637.18.patch, HIVE-21637.19.patch, 
> HIVE-21637.19.patch, HIVE-21637.2.patch, HIVE-21637.20.patch, 
> HIVE-21637.21.patch, HIVE-21637.22.patch, HIVE-21637.23.patch, 
> HIVE-21637.24.patch, HIVE-21637.25.patch, HIVE-21637.26.patch, 
> HIVE-21637.27.patch, HIVE-21637.28.patch, HIVE-21637.29.patch, 
> HIVE-21637.3.patch, HIVE-21637.30.patch, HIVE-21637.31.patch, 
> HIVE-21637.32.patch, HIVE-21637.33.patch, HIVE-21637.34.patch, 
> HIVE-21637.35.patch, HIVE-21637.36.patch, HIVE-21637.37.patch, 
> HIVE-21637.38.patch, HIVE-21637.39.patch, HIVE-21637.4.patch, 
> HIVE-21637.40.patch, HIVE-21637.41.patch, HIVE-21637.42.patch, 
> HIVE-21637.43.patch, HIVE-21637.44.patch, HIVE-21637.45.patch, 
> HIVE-21637.46.patch, HIVE-21637.47.patch, HIVE-21637.48.patch, 
> HIVE-21637.49.patch, HIVE-21637.5.patch, HIVE-21637.50.patch, 
> HIVE-21637.51.patch, HIVE-21637.52.patch, HIVE-21637.6.patch, 
> HIVE-21637.7.patch, HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is updated 
> asynchronously, so in an HMS HA setting we only get eventual consistency. In 
> this Jira, we try to make it synchronized.





[jira] [Commented] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896486#comment-16896486
 ] 

Vineet Garg commented on HIVE-21991:


Pushed to master. Thanks [~jcamachorodriguez]

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Updated] (HIVE-21991) Upgrade ORC version to 1.5.6

2019-07-30 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21991:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Upgrade ORC version to 1.5.6
> 
>
> Key: HIVE-21991
> URL: https://issues.apache.org/jira/browse/HIVE-21991
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21991.1.patch, HIVE-21991.2.patch, 
> HIVE-21991.3.patch, HIVE-21991.4.patch, HIVE-21991.5.patch, 
> HIVE-21991.6.patch, HIVE-21991.7.patch, HIVE-21991.8.patch
>
>






[jira] [Commented] (HIVE-19994) Impala "drop table" fails with Hive Metastore exception

2019-07-30 Thread Karthik Manamcheri (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896492#comment-16896492
 ] 

Karthik Manamcheri commented on HIVE-19994:
---

Adding the foreign-key line to package.jdo should not affect functionality. It 
would affect performance because DataNucleus cannot optimize effectively.

> Impala "drop table" fails with Hive Metastore exception
> ---
>
> Key: HIVE-19994
> URL: https://issues.apache.org/jira/browse/HIVE-19994
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: Hadoop distribution: CDH 5.14.2
> Hive version:  1.1.0-cdh5.14.2
> Impala version: 2.11.0
> Kudu version: 1.6.0
>  
>Reporter: Rodion Myronov
>Assignee: Karthik Manamcheri
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19994.1.patch, metastore_exception.txt
>
>
> "drop table" statement in Impala shell fails with the following exception:
> {{ImpalaRuntimeException: Error making 'dropTable' RPC to Hive Metastore: 
> CAUSED BY: MetaException: One or more instances could not be deleted}}
>  
> Metastore log file shows that "DELETE FROM `PARTITION_KEYS` WHERE `TBL_ID`=?" 
> statement fails because of foreign key violation (full stacktrace will be 
> added):
> {{Caused by: java.sql.BatchUpdateException: Cannot delete or update a parent 
> row: a foreign key constraint fails 
> ("hivemetastore_emtig3vtq7qp1tiooo07sb70ud"."COLUMNS_V2", CONSTRAINT 
> "COLUMNS_V2_FK1" FOREIGN KEY ("CD_ID") REFERENCES "CDS" ("CD_ID"))}}
>  
> The table is created and then dropped as a part of ETL process executed every 
> hour. Most of the time it works fine, the issue is not reproducible at will.
> Table creation script is:
> {{CREATE TABLE IF NOT EXISTS price_advisor_ouput.t_switching_coef_source}}
> {{( }}
> {{...fields here...}}
> {{PRIMARY KEY (...PK field here...)}}
> {{)}}
> {{PARTITION BY HASH(matrix_pcd) PARTITIONS 3}}
> {{STORED AS KUDU;}}
>  
> Not sure how to approach diagnostics and fix, so any input will be really 
> appreciated. 
> Thanks in advance, 
> Rodion Myronov





[jira] [Updated] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20683:

Attachment: HIVE-20683.3.patch

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0
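
The min-max (BETWEEN) and Bloom filter combination described above can be illustrated with a small standalone sketch. This is only an illustration of the idea, not Hive's or Druid's code: the class name SemiJoinFilterSketch, its two-hash scheme, and the bit-array size are assumptions made for the example.

```java
import java.util.BitSet;
import java.util.List;

/** Illustrative only: a tiny min-max plus Bloom filter, not Hive's or Druid's implementation. */
final class SemiJoinFilterSketch {
    private final long min;
    private final long max;
    private final BitSet bits;
    private final int numBits;

    /** Builds the filter from the keys seen on the build side of the semi-join. */
    SemiJoinFilterSketch(List<Long> buildSideKeys, int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
        long lo = Long.MAX_VALUE;
        long hi = Long.MIN_VALUE;
        for (long key : buildSideKeys) {
            lo = Math.min(lo, key);
            hi = Math.max(hi, key);
            bits.set(index(key, 0));
            bits.set(index(key, 1));
        }
        this.min = lo;
        this.max = hi;
    }

    // Two cheap hash positions per key (an assumption); real Bloom filters use stronger hashes.
    private int index(long key, int seed) {
        long h = (key + seed) * 0x9E3779B97F4A7C15L;
        h ^= (h >>> 32);
        return (int) Math.floorMod(h, (long) numBits);
    }

    /** False means the key is definitely absent; true means it may be present. */
    boolean mightContain(long key) {
        if (key < min || key > max) {
            return false; // rejected by the BETWEEN (min-max) check
        }
        return bits.get(index(key, 0)) && bits.get(index(key, 1));
    }
}
```

A probe key outside [min, max] is rejected immediately. A key inside the range is rejected only if one of its bits is unset, so false positives are possible but false negatives are not.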





[jira] [Commented] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-07-30 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896539#comment-16896539
 ] 

Nishant Bangarwa commented on HIVE-20683:
-

Fixed checkstyle and updated the qfiles for dynamic_semijoin_reduction_3 and 
semijoin_hint.

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0





[jira] [Commented] (HIVE-21838) Hive Metastore Translation: Add API call to tell client why table has limited access

2019-07-30 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896550#comment-16896550
 ] 

Jason Dere commented on HIVE-21838:
---

Left some minor nits on RB, but otherwise looks good.

> Hive Metastore Translation: Add API call to tell client why table has limited 
> access
> 
>
> Key: HIVE-21838
> URL: https://issues.apache.org/jira/browse/HIVE-21838
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Yongzhi Chen
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21838.10.patch, HIVE-21838.11.patch, 
> HIVE-21838.12.patch, HIVE-21838.13.patch, HIVE-21838.13.patch, 
> HIVE-21838.2.patch, HIVE-21838.3.patch, HIVE-21838.4.patch, 
> HIVE-21838.5.patch, HIVE-21838.6.patch, HIVE-21838.7.patch, 
> HIVE-21838.8.patch, HIVE-21838.9.patch, HIVE-21838.patch
>
>
> When a table access type is Read-only or None, we need a way to tell clients 
> why. 





[jira] [Commented] (HIVE-21962) Replacing ArrayList params with List in and around PlanUtils and MapWork

2019-07-30 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896558#comment-16896558
 ] 

Laszlo Bodor commented on HIVE-21962:
-

+1

> Replacing ArrayList params with List in and around PlanUtils and MapWork
> 
>
> Key: HIVE-21962
> URL: https://issues.apache.org/jira/browse/HIVE-21962
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Minor
> Attachments: HIVE-21962.1.patch, HIVE-21962.1.patch, 
> HIVE-21962.2.patch, HIVE-21962.2.patch
>
>
> Using the implementing class as a parameter type is usually bad practice. OO 
> design suggests using the least restrictive interface instead. ArrayList is 
> used as a parameter in many methods - this issue is just a small part of that work.
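
As a minimal sketch of the point above (the class and method names here are hypothetical and not taken from PlanUtils or MapWork), a parameter typed as List accepts any list implementation, whereas an ArrayList parameter would force every caller to build an ArrayList first:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class ListParamSketch {
    // Hypothetical helper: accepts any List, so callers are not forced to create an ArrayList.
    static int totalLength(List<String> columnNames) {
        int total = 0;
        for (String name : columnNames) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> fromArrayList = new ArrayList<>(Arrays.asList("key", "value"));
        List<String> fromLinkedList = new LinkedList<>(Arrays.asList("key", "value"));
        // All three calls compile because the parameter type is the interface.
        System.out.println(totalLength(fromArrayList));
        System.out.println(totalLength(fromLinkedList));
        System.out.println(totalLength(Arrays.asList("key", "value")));
    }
}
```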





[jira] [Commented] (HIVE-21944) Remove unused methods, fields and variables from Vectorizer

2019-07-30 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896565#comment-16896565
 ] 

Laszlo Bodor commented on HIVE-21944:
-

[~isuller]: I've just reviewed another patch of yours and found one line in it 
that contradicts this patch.
In 
[HIVE-21962.2.patch|https://issues.apache.org/jira/secure/attachment/12975370/HIVE-21962.2.patch#file-6]
 you refactored something in the Vectorizer to avoid a needless shallow copy 
(as far as I can understand):
from
{code}
mapWork.setVectorizationEnabledConditionsMet(new ArrayList(enabledConditionsMetSet));
{code}
to
{code}
mapWork.setVectorizationEnabledConditionsMet(enabledConditionsMetSet);
{code}

However, the latest patch here seems to touch the same code but does not 
include that change:
https://issues.apache.org/jira/secure/attachment/12975371/HIVE-21944.1.patch
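
To make the difference between the two variants concrete, here is a small standalone sketch. The class name ShallowCopySketch and its contents are assumptions for illustration only and are unrelated to the actual Vectorizer code. Passing the collection directly shares it with the callee, while wrapping it in new ArrayList<>(...) allocates a second list that holds the same elements:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ShallowCopySketch {
    /** Returns {size of the alias, size of the copy} after the source list grows. */
    static int[] sizesAfterSourceGrows() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> alias = source;                  // no copy: both names refer to one list
        List<String> copy = new ArrayList<>(source);  // shallow copy: new list, same elements
        source.add("c");                              // visible through the alias, not the copy
        return new int[] {alias.size(), copy.size()};
    }
}
```

Skipping the copy saves an allocation, but it is only safe when the caller does not modify the collection afterwards.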

Apart from that, this looks good to me.

> Remove unused methods, fields and variables from Vectorizer
> ---
>
> Key: HIVE-21944
> URL: https://issues.apache.org/jira/browse/HIVE-21944
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21944.1.patch, HIVE-21944.1.patch, 
> HIVE-21944.1.patch, HIVE-21944.1.patch
>
>
> It seems there are many unused fields, variables and methods in 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer class. Removing them 
> would make the code easier to understand.





[jira] [Updated] (HIVE-22064) CBO: Rewrite year() predicate to a constant condition

2019-07-30 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-22064:
---
Component/s: CBO

> CBO: Rewrite year() predicate to a constant condition
> -
>
> Key: HIVE-22064
> URL: https://issues.apache.org/jira/browse/HIVE-22064
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Gopal V
>Priority: Major
>
> {code}
> CREATE TEMPORARY TABLE users (signup_date date, user_id bigint) stored as ORC;
> INSERT INTO users values("1999-01-01", 1);
> EXPLAIN ANALYZE
> SELECT year(signup_date), count(distinct user_id) from users where 
> year(signup_date) BETWEEN 2017 and 2019
> GROUP BY year(signup_date);
> {code}
> The YEAR() or EXTRACT(YEAR) predicate is not rewritten into a constant 
> condition for push-down.
> {code}
> EXPLAIN ANALYZE
> SELECT year(signup_date), count(distinct user_id) from users where 
> signup_date BETWEEN DATE'2017-01-01' and DATE'2019-12-31'
> GROUP BY year(signup_date);
> {code}
> This form does push down into the storage layers.
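
The equivalence behind the requested rewrite can be sketched in plain Java. This is an illustration under stated assumptions, not the CBO rule itself: the class YearRangeSketch and its method names are hypothetical, and java.time.LocalDate stands in for Hive's DATE type.

```java
import java.time.LocalDate;

final class YearRangeSketch {
    /** Returns {first day, last day} covering every date whose year lies in [fromYear, toYear]. */
    static LocalDate[] toDateRange(int fromYear, int toYear) {
        LocalDate start = LocalDate.of(fromYear, 1, 1);
        LocalDate end = LocalDate.of(toYear, 12, 31);
        return new LocalDate[] {start, end};
    }

    /** The original predicate: year(d) BETWEEN fromYear AND toYear. */
    static boolean yearPredicate(LocalDate d, int fromYear, int toYear) {
        return d.getYear() >= fromYear && d.getYear() <= toYear;
    }

    /** The rewritten predicate: d BETWEEN start AND end, with no function call on the column. */
    static boolean rangePredicate(LocalDate d, LocalDate[] range) {
        return !d.isBefore(range[0]) && !d.isAfter(range[1]);
    }
}
```

Both predicates accept exactly the same dates, but the second compares the column directly against constants, which is the form a storage layer can evaluate against per-file or per-stripe min/max statistics.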





[jira] [Commented] (HIVE-21960) HMS tasks on replica

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896641#comment-16896641
 ] 

Hive QA commented on HIVE-21960:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 26s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 34s{color} | {color:blue} standalone-metastore/metastore-common in master has 31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 10s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  6s{color} | {color:blue} ql in master has 2250 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 10s{color} | {color:green} The patch metastore-common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 18s{color} | {color:green} standalone-metastore/metastore-server: The patch generated 0 new + 50 unchanged - 1 fixed = 50 total (was 51) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} ql: The patch generated 0 new + 78 unchanged - 1 fixed = 78 total (was 79) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 20s{color} | {color:red} itests/hive-unit: The patch generated 12 new + 237 unchanged - 12 fixed = 249 total (was 249) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 13s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18203/dev-support/hive-personality.sh |
| git revision | master / b8afcc3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-18203/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18203/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.

[jira] [Commented] (HIVE-21960) HMS tasks on replica

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896651#comment-16896651
 ] 

Hive QA commented on HIVE-21960:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976238/HIVE-21960.04.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 16689 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=160)

[external_jdbc_table_typeconversion.q,vector_udf_octet_length.q,schema_evol_orc_acidvec_table_update.q,materialized_view_rewrite_part_2.q,vector_decimal_5.q,vector_case_when_conversion.q,escape1.q,schema_evol_orc_acid_table_update_llap_io.q,cte_mat_5.q,acid_meta_columns_decode.q,vector_string_decimal.q,results_cache_lifetime.q,cross_prod_3.q,join46.q,dynpart_sort_optimization2.q,tez_bmj_schema_evolution.q,insert_into_default_keyword.q,bucketmapjoin4.q,vector_orc_null_check.q,semijoin7.q,uber_reduce.q,schema_evol_orc_nonvec_part_all_complex.q,is_distinct_from.q,schema_evol_text_vec_part_all_complex_llap_io.q,auto_sortmerge_join_3.q,vectorization_9.q,materialized_view_create_rewrite.q,merge2.q,join_nulls.q,bucketmapjoin2.q]
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesBootstrapWithJsonMessage.testRetryAcidTablesBootstrapFromDifferentDump (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testRetryAcidTablesBootstrapFromDifferentDump (batchId=248)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18203/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18203/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18203/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976238 - PreCommit-HIVE-Build

> HMS tasks on replica
> 
>
> Key: HIVE-21960
> URL: https://issues.apache.org/jira/browse/HIVE-21960
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21960.01.patch, HIVE-21960.02.patch, 
> HIVE-21960.03.patch, HIVE-21960.04.patch, Replication and House keeping 
> tasks.pdf
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> An HMS performs a number of housekeeping tasks. Assess whether
>  # They are required to be performed in the replicated data
>  # Performing those on replicated data causes any issues and how to fix those.





[jira] [Updated] (HIVE-21838) Hive Metastore Translation: Add API call to tell client why table has limited access

2019-07-30 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21838:
-
Attachment: HIVE-21838.14.patch

> Hive Metastore Translation: Add API call to tell client why table has limited 
> access
> 
>
> Key: HIVE-21838
> URL: https://issues.apache.org/jira/browse/HIVE-21838
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Yongzhi Chen
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21838.10.patch, HIVE-21838.11.patch, 
> HIVE-21838.12.patch, HIVE-21838.13.patch, HIVE-21838.13.patch, 
> HIVE-21838.14.patch, HIVE-21838.2.patch, HIVE-21838.3.patch, 
> HIVE-21838.4.patch, HIVE-21838.5.patch, HIVE-21838.6.patch, 
> HIVE-21838.7.patch, HIVE-21838.8.patch, HIVE-21838.9.patch, HIVE-21838.patch
>
>
> When a table access type is Read-only or None, we need a way to tell clients 
> why. 





[jira] [Updated] (HIVE-21838) Hive Metastore Translation: Add API call to tell client why table has limited access

2019-07-30 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21838:
-
Status: Open  (was: Patch Available)

> Hive Metastore Translation: Add API call to tell client why table has limited 
> access
> 
>
> Key: HIVE-21838
> URL: https://issues.apache.org/jira/browse/HIVE-21838
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Yongzhi Chen
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21838.10.patch, HIVE-21838.11.patch, 
> HIVE-21838.12.patch, HIVE-21838.13.patch, HIVE-21838.13.patch, 
> HIVE-21838.14.patch, HIVE-21838.2.patch, HIVE-21838.3.patch, 
> HIVE-21838.4.patch, HIVE-21838.5.patch, HIVE-21838.6.patch, 
> HIVE-21838.7.patch, HIVE-21838.8.patch, HIVE-21838.9.patch, HIVE-21838.patch
>
>
> When a table access type is Read-only or None, we need a way to tell clients 
> why. 





[jira] [Updated] (HIVE-21838) Hive Metastore Translation: Add API call to tell client why table has limited access

2019-07-30 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-21838:
-
Status: Patch Available  (was: Open)

Thanks [~jdere]. I have uploaded a new patch with the feedback from RB. 

> Hive Metastore Translation: Add API call to tell client why table has limited 
> access
> 
>
> Key: HIVE-21838
> URL: https://issues.apache.org/jira/browse/HIVE-21838
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Yongzhi Chen
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21838.10.patch, HIVE-21838.11.patch, 
> HIVE-21838.12.patch, HIVE-21838.13.patch, HIVE-21838.13.patch, 
> HIVE-21838.14.patch, HIVE-21838.2.patch, HIVE-21838.3.patch, 
> HIVE-21838.4.patch, HIVE-21838.5.patch, HIVE-21838.6.patch, 
> HIVE-21838.7.patch, HIVE-21838.8.patch, HIVE-21838.9.patch, HIVE-21838.patch
>
>
> When a table access type is Read-only or None, we need a way to tell clients 
> why. 





[jira] [Updated] (HIVE-13781) Tez Job failed with FileNotFoundException when partition dir doesnt exists

2019-07-30 Thread zhangbutao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-13781:
--
Fix Version/s: 4.0.0

> Tez Job failed with FileNotFoundException when partition dir doesnt exists 
> ---
>
> Key: HIVE-13781
> URL: https://issues.apache.org/jira/browse/HIVE-13781
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Query Planning
>Affects Versions: 0.14.0, 2.0.0, 3.1.1
>Reporter: Feng Yuan
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-13781.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I have a partitioned table a with partition column "day". The metadata has 
> partitions 20160501 and 20160502, but the directory for partition 20160501 does not exist.
> When I use the Tez engine to run hive -e "select day,count(*) from a where 
> xx=xx group by day",
> Hive throws FileNotFoundException,
> but MR works.
> Repro example:
> CREATE EXTERNAL TABLE `a`(
>   `a` string)
> PARTITIONED BY ( 
>   `l_date` string);
> insert overwrite table a partition(l_date='2016-04-08') values (1),(2);
> insert overwrite table a partition(l_date='2016-04-09') values (1),(2);
> hadoop dfs -rm -r -f /warehouse/a/l_date=2016-04-09
> select l_date,count(*) from a where a='1' group by l_date;
> error:
> ut: a initializer failed, vertex=vertex_1463493135662_10445_1_00 [Map 1], 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://bfdhadoopcool/warehouse/test.db/a/l_date=2015-04-09
>   at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
>   at 
> org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>   at 
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:300)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:402)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:129)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
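
One way to tolerate the situation reported above is to drop partition locations whose directory is missing before computing splits. The sketch below only illustrates that idea and makes no claim about how the attached patch or the MR path handles it. The class MissingPartitionSketch is hypothetical, and it uses java.nio.file on a local file system instead of Hadoop's FileSystem API to stay self-contained.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class MissingPartitionSketch {
    /** Keeps only the partition directories that actually exist on the file system. */
    static List<Path> existingPartitionDirs(List<Path> partitionDirs) {
        List<Path> existing = new ArrayList<>();
        for (Path dir : partitionDirs) {
            if (Files.isDirectory(dir)) {
                existing.add(dir); // safe to list later
            }
            // A missing directory is skipped here and treated as an empty partition.
        }
        return existing;
    }
}
```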





[jira] [Commented] (HIVE-10685) Alter table concatenate oparetor will cause duplicate data

2019-07-30 Thread Yuanbo Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896731#comment-16896731
 ] 

Yuanbo Liu commented on HIVE-10685:
---

[~wangbaoyun]
Sorry to interrupt. Have you found any way to recover those merged files?

> Alter table concatenate oparetor will cause duplicate data
> --
>
> Key: HIVE-10685
> URL: https://issues.apache.org/jira/browse/HIVE-10685
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 1.2.1
>Reporter: guoliming
>Assignee: guoliming
>Priority: Critical
> Fix For: 1.2.1
>
> Attachments: HIVE-10685.patch, HIVE-10685.patch
>
>
> The "Orders" table has 15 rows and is stored as ORC. 
> {noformat}
> hive> select count(*) from orders;
> OK
> 15
> Time taken: 37.692 seconds, Fetched: 1 row(s)
> {noformat}
> The table contains 14 files; the size of each file is about 2.1 ~ 3.2 GB.
> After executing the command ALTER TABLE orders CONCATENATE;
> the table now has 1530115000 rows.
> My Hive version is 1.1.0.





[jira] [Commented] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896748#comment-16896748
 ] 

Hive QA commented on HIVE-22046:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 23s{color} | {color:blue} standalone-metastore/metastore-common in master has 31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 35s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 12s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  2s{color} | {color:blue} ql in master has 2250 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 41s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 48s{color} | {color:blue} itests/util in master has 44 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 14s{color} | {color:red} standalone-metastore/metastore-common: The patch generated 9 new + 409 unchanged - 8 fixed = 418 total (was 417) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} standalone-metastore/metastore-server: The patch generated 77 new + 1935 unchanged - 38 fixed = 2012 total (was 1973) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 47s{color} | {color:red} ql: The patch generated 5 new + 1127 unchanged - 3 fixed = 1132 total (was 1130) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} itests/hcatalog-unit: The patch generated 3 new + 26 
unchanged - 1 fixed = 29 total (was 27) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
20s{color} | {color:red} itests/hive-unit: The patch generated 13 new + 300 
unchanged - 1 fixed = 313 total (was 301) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
23s{color} | {color:red} standalone-metastore/metastore-server generated 2 new 
+ 177 unchanged - 2 fixed = 179 total (was 179) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m 
19s{color} | {color:red} standalone-metastore_metastore-common generated 2 new 
+ 47 unchanged - 0 fixed = 49 total (was 47) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  instanceof will always return false in 
org.

[jira] [Commented] (HIVE-22046) Differentiate among column stats computed by different engines

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896758#comment-16896758
 ] 

Hive QA commented on HIVE-22046:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976243/HIVE-22046.07.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16715 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18204/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18204/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18204/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976243 - PreCommit-HIVE-Build

> Differentiate among column stats computed by different engines
> --
>
> Key: HIVE-22046
> URL: https://issues.apache.org/jira/browse/HIVE-22046
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22046.01.patch, HIVE-22046.02.patch, 
> HIVE-22046.03.patch, HIVE-22046.04.patch, HIVE-22046.05.patch, 
> HIVE-22046.06.patch, HIVE-22046.07.patch, HIVE-22046.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is to prevent column stats computed by different engines, e.g., 
> Hive and Impala, from stepping on each other. In the longer term, we may 
> introduce a common representation for the column statistics stored by 
> different engines.
> For this issue, we will add a new column 'engine' to the TAB_COL_STATS HMS 
> table (unpartitioned tables) and to the PART_COL_STATS HMS table (partitioned 
> tables). This will prevent conflicts in column-level stats.
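
To illustrate why an 'engine' column avoids the conflict described above, here is a minimal sketch using an in-memory SQLite table. The table name col_stats, its columns, and the helper functions are hypothetical simplifications for illustration only; they are not the actual TAB_COL_STATS or PART_COL_STATS schema, and this is not the code from the patch.

```python
import sqlite3

# Hypothetical, simplified stand-in for a column-stats table.
# The real HMS tables have many more columns; only the keying idea is shown.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE col_stats (
        table_name  TEXT NOT NULL,
        column_name TEXT NOT NULL,
        engine      TEXT NOT NULL,
        num_nulls   INTEGER,
        PRIMARY KEY (table_name, column_name, engine)
    )
    """
)


def put_stats(table, column, engine, num_nulls):
    """Store stats for one column as computed by one engine."""
    conn.execute(
        "INSERT OR REPLACE INTO col_stats VALUES (?, ?, ?, ?)",
        (table, column, engine, num_nulls),
    )


def get_stats(table, column, engine):
    """Return num_nulls as recorded by the given engine, or None if absent."""
    row = conn.execute(
        "SELECT num_nulls FROM col_stats "
        "WHERE table_name = ? AND column_name = ? AND engine = ?",
        (table, column, engine),
    ).fetchone()
    return None if row is None else row[0]


# Two engines record stats for the same column without overwriting each other.
put_stats("orders", "price", "hive", 10)
put_stats("orders", "price", "impala", 12)
```

Because the engine is part of the key in this sketch, a later write by one engine replaces only that engine's row and leaves the other engine's stats untouched.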





[jira] [Commented] (HIVE-21944) Remove unused methods, fields and variables from Vectorizer

2019-07-30 Thread Ivan Suller (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896790#comment-16896790
 ] 

Ivan Suller commented on HIVE-21944:


[~abstractdog] I deliberately didn't make the same change twice. This way the 
two patches could be tested separately.

> Remove unused methods, fields and variables from Vectorizer
> ---
>
> Key: HIVE-21944
> URL: https://issues.apache.org/jira/browse/HIVE-21944
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Attachments: HIVE-21944.1.patch, HIVE-21944.1.patch, 
> HIVE-21944.1.patch, HIVE-21944.1.patch
>
>
> It seems there are many unused fields, variables and methods in 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer class. Removing them 
> would make the code easier to understand.





[jira] [Commented] (HIVE-22045) HIVE-21711 introduced regression in data load

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896800#comment-16896800
 ] 

Hive QA commented on HIVE-22045:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
3s{color} | {color:blue} ql in master has 2250 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
47s{color} | {color:red} ql: The patch generated 2 new + 665 unchanged - 0 
fixed = 667 total (was 665) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18205/dev-support/hive-personality.sh
 |
| git revision | master / b8afcc3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18205/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18205/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HIVE-21711 introduced regression in data load
> -
>
> Key: HIVE-22045
> URL: https://issues.apache.org/jira/browse/HIVE-22045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22045.1.patch, HIVE-22045.2.patch, 
> HIVE-22045.3.patch, HIVE-22045.4.patch
>
>
> A better fix for HIVE-21711 is to specialize the handling of CTAS/Create MV 
> statements to avoid the intermittent rename operation, but keep the 
> intermittent rename for INSERT etc. statements, since otherwise the final 
> file-by-file move operation is significantly slow for such statements.





[jira] [Updated] (HIVE-22008) Apache Hive 2.3.4 - Issue with combination of Like operator & newline (\n) character in data

2019-07-30 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-22008:
---
Affects Version/s: 4.0.0

> Apache Hive 2.3.4 -  Issue with combination of Like operator & newline (\n) 
> character in data
> -
>
> Key: HIVE-22008
> URL: https://issues.apache.org/jira/browse/HIVE-22008
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0, 2.3.4
>Reporter: Shankar
>Assignee: Gopal V
>Priority: Major
>
> I am facing some issues while using the *Like* operator with the *newline* 
> (\n) character. Below is the detailed description:
>  
>  
> {color:#263238}*-- Hive 
> Queries *{color}
> {color:#263238} -- consider these the steps to reproduce.{color}
>  
> {color:#263238}create table default.withdraw({color}
> {color:#263238}id string{color}
> {color:#263238}) stored as parquet;{color}
> {color:#263238} {color}
>  
> {color:#263238}
> *insert into default.withdraw select 'withdraw\ncash';*{color}
> -- note here, added '\n' character
>  
> *{color:#263238}--1)  result = {color}{color:#6aa84f}success{color}*
> {color:#263238}hive> select * from default.withdraw where id like 
> '%withdraw%';{color}
> {color:#263238}OK{color}
> {color:#263238}withdraw{color}
> {color:#263238}cash{color}
> {color:#263238}Time taken: 0.078 seconds, Fetched: *1 row(s)*{color}
>  
> *--2)* *{color:#263238}result = {color}{color:#cc}wrong{color}*
> {color:#263238}hive> select * from default.withdraw where id like 
> '%withdraw%cash';{color}
> {color:#263238}OK{color}
> {color:#263238}Time taken: 0.066 seconds{color}
>  
> *--3)* *{color:#263238}result = {color}{color:#6aa84f}success{color}*
> {color:#263238}hive> select * from default.withdraw where id like 
> '%cash%';{color}
> {color:#263238}OK{color}
> {color:#263238}withdraw{color}
> {color:#263238}cash{color}
> {color:#263238}Time taken: 0.086 seconds, Fetched: 1 row(s){color}
>  
>  
>  
>  
> {color:#263238}*-- Presto 
> Queries -*{color}
> {color:#263238}FYI - Presto (v0.221) is using above table meta store. We 
> tested above queries on presto too. {color}
> {color:#263238} {color}
>  
> *--1)*  *{color:#263238}result ={color}* *{color:#6aa84f}success{color}*
> presto> select * from default.withdraw where id like '%withdraw%';
>    id    
> --
> withdraw 
> cash     
> (1 row)
>  
> *--2)* *{color:#263238}result ={color}* *{color:#6aa84f}success{color}*
> presto> select * from default.withdraw where id like '%withdraw%cash';
>    id    
> --
> withdraw 
> cash     
> (1 row)
>  
> *--3)* *{color:#263238}result ={color}* *{color:#6aa84f}success{color}*
> presto> select * from default.withdraw where id like '%cash%';
>    id    
> --
> withdraw 
> cash     
> (1 row){color:#263238}
> {color}
>  
>  
> *--* 
> *--* 
>  
>  





[jira] [Assigned] (HIVE-22008) Apache Hive 2.3.4 - Issue with combination of Like operator & newline (\n) character in data

2019-07-30 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-22008:
--

Assignee: Gopal V

> Apache Hive 2.3.4 -  Issue with combination of Like operator & newline (\n) 
> character in data
> -
>
> Key: HIVE-22008
> URL: https://issues.apache.org/jira/browse/HIVE-22008
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.4
>Reporter: Shankar
>Assignee: Gopal V
>Priority: Major
>





[jira] [Commented] (HIVE-22008) Apache Hive 2.3.4 - Issue with combination of Like operator & newline (\n) character in data

2019-07-30 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896804#comment-16896804
 ] 

Gopal V commented on HIVE-22008:


Interestingly, this is working when it hits the ORC fast-path.

{code}
CREATE TEMPORARY TABLE SplitLines(`id` string) STORED AS ORC;
INSERT INTO SplitLines SELECT 'withdraw\ncash';
SELECT * FROM SplitLines WHERE `id` LIKE '%withdraw%cash' ORDER BY id;
SELECT count(*) FROM SplitLines WHERE `id` LIKE '%withdraw%cash';
{code}

That scenario works fine, which means this is a simple enough fix for Parquet 
with a simple SEL.
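
One mechanism that would produce exactly the results in the report is sketched below. It is an assumption for illustration, not something confirmed in this thread: patterns of the form '%literal%' are assumed to take a plain substring shortcut, while any other pattern is translated to a regular expression in which '.' does not match a newline unless a DOTALL flag is set. The function names like_to_regex and sql_like are hypothetical, and escape characters in LIKE patterns are not handled.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a regular expression.

    '%' becomes '.*', '_' becomes '.', and every other character is
    matched literally. Escape characters are not handled in this sketch.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def sql_like(value, pattern, dotall=False):
    """Evaluate `value LIKE pattern` under the assumptions stated above."""
    inner = pattern[1:-1]
    is_simple_contains = (
        len(pattern) >= 2
        and pattern[0] == "%"
        and pattern[-1] == "%"
        and "%" not in inner
        and "_" not in inner
    )
    if is_simple_contains:
        # Assumed shortcut: no regex involved, so newlines are irrelevant.
        return inner in value
    # General path: without DOTALL, '.' refuses to match '\n'.
    flags = re.DOTALL if dotall else 0
    return re.fullmatch(like_to_regex(pattern), value, flags) is not None


value = "withdraw\ncash"
for pat in ("%withdraw%", "%withdraw%cash", "%cash%"):
    print(pat, sql_like(value, pat), sql_like(value, pat, dotall=True))
```

Under these assumptions only '%withdraw%cash' goes through the regex path, so it is the only query affected by the embedded newline, and enabling DOTALL makes it match.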

> Apache Hive 2.3.4 -  Issue with combination of Like operator & newline (\n) 
> character in data
> -
>
> Key: HIVE-22008
> URL: https://issues.apache.org/jira/browse/HIVE-22008
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0, 2.3.4
>Reporter: Shankar
>Assignee: Gopal V
>Priority: Major
>





[jira] [Updated] (HIVE-22008) Apache Hive 2.3.4 - Issue with combination of Like operator & newline (\n) character in data

2019-07-30 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-22008:
---
Attachment: HIVE-22008.1.patch

> Apache Hive 2.3.4 -  Issue with combination of Like operator & newline (\n) 
> character in data
> -
>
> Key: HIVE-22008
> URL: https://issues.apache.org/jira/browse/HIVE-22008
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0, 2.3.4
>Reporter: Shankar
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-22008.1.patch
>
>





[jira] [Commented] (HIVE-22045) HIVE-21711 introduced regression in data load

2019-07-30 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896816#comment-16896816
 ] 

Hive QA commented on HIVE-22045:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976254/HIVE-22045.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16715 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18205/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18205/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18205/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976254 - PreCommit-HIVE-Build

> HIVE-21711 introduced regression in data load
> -
>
> Key: HIVE-22045
> URL: https://issues.apache.org/jira/browse/HIVE-22045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22045.1.patch, HIVE-22045.2.patch, 
> HIVE-22045.3.patch, HIVE-22045.4.patch
>
>
> A better fix for HIVE-21711 is to specialize the handling of CTAS/Create MV 
> statements to avoid the intermittent rename operation, but keep the 
> intermittent rename for INSERT etc. statements, since otherwise the final 
> file-by-file move operation is significantly slow for such statements.


