[jira] [Comment Edited] (HIVE-16318) LLAP cache: address some issues in 2.2/2.3

2017-09-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948726#comment-15948726
 ] 

Lefty Leverenz edited comment on HIVE-16318 at 9/16/17 4:09 AM:


Doc note:  This adds *hive.llap.io.metadata.fraction* and changes the default 
value of *hive.llap.io.allocator.alloc.min*, so they need to be documented in 
the wiki for release 2.2.0.

* [Configuration Properties -- LLAP I/O | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAPI/O]
* [Configuration Properties -- LLAP I/O -- hive.llap.io.allocator.alloc.min | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.llap.io.allocator.alloc.min]

By the way, a prior change of default value for 
*hive.llap.io.allocator.alloc.min* hasn't been documented yet -- see HIVE-13346.

Added a TODOC2.2.0 label.

Update 15/Sep/17:  HIVE-15665 removes the deprecated config 
*hive.llap.io.metadata.fraction* in release 3.0.0.


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.llap.io.metadata.fraction* and changes the default 
value of *hive.llap.io.allocator.alloc.min*, so they need to be documented in 
the wiki for release 2.2.0.

* [Configuration Properties -- LLAP I/O | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAPI/O]
* [Configuration Properties -- LLAP I/O -- hive.llap.io.allocator.alloc.min | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.llap.io.allocator.alloc.min]

By the way, a prior change of default value for 
*hive.llap.io.allocator.alloc.min* hasn't been documented yet -- see HIVE-13346.

Added a TODOC2.2.0 label.

> LLAP cache: address some issues in 2.2/2.3
> --
>
> Key: HIVE-16318
> URL: https://issues.apache.org/jira/browse/HIVE-16318
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.2.0
> Fix For: 2.2.0, 2.3.0, 3.0.0
>
> Attachments: HIVE-16318.01.patch, HIVE-16318.02.patch, 
> HIVE-16318.03.patch, HIVE-16318.04.patch, HIVE-16318.patch
>
>
> We've run into HIVE-16233 and HIVE-15665, and given that the 2.2 and 2.3 
> releases are approaching, we are going to add workarounds for them, and then 
> commit the above patches and revert the workarounds as soon as we can.
> Unfortunately this will result in the cache wasting some memory on some 
> datasets, but the alternatives, when they are encountered (usually only on 
> large datasets), are worse.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage

2017-09-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168804#comment-16168804
 ] 

Lefty Leverenz commented on HIVE-15665:
---

Doc note:  This removes the deprecated configuration parameter 
*hive.llap.io.metadata.fraction*, which was introduced by HIVE-16318 in release 
2.2.0 and is not documented in the wiki yet.  It still needs to be documented, 
with version information.

* [Configuration Properties -- LLAP I/O | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAPI/O]

Added a TODOC3.0 label.

> LLAP: OrcFileMetadata objects in cache can impact heap usage
> 
>
> Key: HIVE-15665
> URL: https://issues.apache.org/jira/browse/HIVE-15665
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, 
> HIVE-15665.03.patch, HIVE-15665.04.patch, HIVE-15665.05.patch, 
> HIVE-15665.06.patch, HIVE-15665.07.patch, HIVE-15665.08.patch, 
> HIVE-15665.09.patch, HIVE-15665.10.patch, HIVE-15665.11.patch, 
> HIVE-15665.12.patch, HIVE-15665.patch
>
>
> OrcFileMetadata internally has filestats, stripestats etc which are allocated 
> in heap. On large data sets, this could have an impact on the heap usage and 
> the memory usage by different executors in LLAP.





[jira] [Updated] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage

2017-09-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15665:
--
Labels: TODOC3.0  (was: )

> LLAP: OrcFileMetadata objects in cache can impact heap usage
> 
>
> Key: HIVE-15665
> URL: https://issues.apache.org/jira/browse/HIVE-15665
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, 
> HIVE-15665.03.patch, HIVE-15665.04.patch, HIVE-15665.05.patch, 
> HIVE-15665.06.patch, HIVE-15665.07.patch, HIVE-15665.08.patch, 
> HIVE-15665.09.patch, HIVE-15665.10.patch, HIVE-15665.11.patch, 
> HIVE-15665.12.patch, HIVE-15665.patch
>
>
> OrcFileMetadata internally has filestats, stripestats etc which are allocated 
> in heap. On large data sets, this could have an impact on the heap usage and 
> the memory usage by different executors in LLAP.





[jira] [Commented] (HIVE-17529) Bucket Map Join : Sets incorrect edge type causing execution failure

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168803#comment-16168803
 ] 

Hive QA commented on HIVE-17529:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887283/HIVE-17529.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_handler_snapshot]
 (batchId=96)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=137)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteVarchar 
(batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6836/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6836/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6836/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887283 - PreCommit-HIVE-Build

> Bucket Map Join : Sets incorrect edge type causing execution failure
> 
>
> Key: HIVE-17529
> URL: https://issues.apache.org/jira/browse/HIVE-17529
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17529.1.patch, HIVE-17529.2.patch
>
>
> While traversing the tree to generate tasks, a bucket mapjoin may set its 
> edge as CUSTOM_SIMPLE_EDGE instead of CUSTOM_EDGE if the bigtable has not 
> already been traversed, causing Tez to assert and fail the vertex.





[jira] [Commented] (HIVE-16500) Remove parser references from PrivilegeType

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168779#comment-16168779
 ] 

Hive QA commented on HIVE-16500:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887281/HIVE-16500.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6835/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6835/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6835/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887281 - PreCommit-HIVE-Build

> Remove parser references from PrivilegeType
> ---
>
> Key: HIVE-16500
> URL: https://issues.apache.org/jira/browse/HIVE-16500
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Server Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16500.1.patch, HIVE-16500.1.patch, 
> HIVE-16500.2.patch, HIVE-16500.3.patch, HIVE-16500.4.patch
>
>
> the authorization uses {{PrivilegeType}}, but that shouldn't depend on parser 
> tokens
> https://github.com/apache/hive/blob/ff67cdda1c538dc65087878eeba3e165cf3230f4/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/PrivilegeType.java#L31





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168751#comment-16168751
 ] 

Vineet Garg commented on HIVE-17536:


This is for queries without a FROM clause, e.g. {{select 1}}. getBasicStats 
returns -1, and then {{estimateRowSizeFromSchema}} isn't able to estimate since 
there is no data, so we end up returning -1. Previously, on encountering 0 rows, 
we were instead returning 1.
One way of handling this is to change getNumRows to return 1 instead of -1.

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.
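
The distinction requested above can be sketched as follows. This is a minimal illustration, not the StatsUtils code: the class name `BasicStatSketch`, the constant `STATS_ABSENT`, and the idea of reading the statistic from a string-valued parameter map under a key such as "numRows" are all assumptions made for the example.

```java
import java.util.Map;

// Hypothetical sketch; not Hive's StatsUtils.
final class BasicStatSketch {
    /** Sentinel meaning "no statistic recorded", as opposed to a recorded zero. */
    static final long STATS_ABSENT = -1L;

    private BasicStatSketch() {}

    /**
     * Returns the recorded value for statKey, or STATS_ABSENT when the key
     * is missing or its value cannot be parsed as a long.
     */
    static long getBasicStat(Map<String, String> params, String statKey) {
        String raw = (params == null) ? null : params.get(statKey);
        if (raw == null) {
            return STATS_ABSENT; // statistics missing in the metastore
        }
        try {
            return Long.parseLong(raw.trim()); // may legitimately be 0
        } catch (NumberFormatException e) {
            return STATS_ABSENT; // treat unparsable values as missing
        }
    }
}
```

A caller can then treat a negative result as "unknown" and fall back to an estimate, while a result of 0 means the table really is empty.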





[jira] [Commented] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168746#comment-16168746
 ] 

Hive QA commented on HIVE-15053:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887280/HIVE-15053.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11038 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hadoop.hive.ql.parse.TestParseNegativeDriver.testCliDriver[wrong_distinct2]
 (batchId=238)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6834/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6834/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6834/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887280 - PreCommit-HIVE-Build

> Beeline#addlocaldriver - reduce classpath scanning
> --
>
> Key: HIVE-15053
> URL: https://issues.apache.org/jira/browse/HIVE-15053
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, 
> HIVE-15053.1.patch, HIVE-15053.2.patch, HIVE-15053.3.patch
>
>
> There is classpath-scanning machinery inside {{ClassNameCompleter}}.
> I think its sole purpose is to scan for JDBC drivers (but I'm not entirely 
> sure).
> If it is indeed looking for JDBC drivers, then it can possibly be removed 
> without any issues, because modern JDBC drivers usually advertise their driver 
> as a service-loadable class for {{java.sql.Driver}}
> http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html
> Auto-Loading of JDBC Driver
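
The service-loading mechanism mentioned above can be sketched with the standard `java.util.ServiceLoader` API. This is only an illustration of how drivers that register themselves for {{java.sql.Driver}} can be found without scanning the classpath; the class and method names are hypothetical and this is not what Beeline or {{ClassNameCompleter}} does.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical sketch; not Beeline code.
final class DriverDiscoverySketch {
    private DriverDiscoverySketch() {}

    /**
     * Returns the class names of all JDBC drivers that advertise themselves
     * as java.sql.Driver service providers and are visible to the loader.
     */
    static List<String> discoverDriverClassNames(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        // ServiceLoader reads the provider registrations; no class scanning.
        for (Driver driver : ServiceLoader.load(Driver.class, loader)) {
            names.add(driver.getClass().getName());
        }
        return names;
    }
}
```

The result depends on which driver jars are on the classpath, so the list may be empty. Drivers that predate JDBC 4 and do not register themselves this way would not be found.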





[jira] [Updated] (HIVE-15899) check CTAS over acid table

2017-09-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15899:
--
Attachment: HIVE-15899.07.patch

> check CTAS over acid table 
> ---
>
> Key: HIVE-15899
> URL: https://issues.apache.org/jira/browse/HIVE-15899
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15899.01.patch, HIVE-15899.02.patch, 
> HIVE-15899.03.patch, HIVE-15899.04.patch, HIVE-15899.05.patch, 
> HIVE-15899.07.patch
>
>
> Need to add a test to check whether CREATE TABLE AS (CTAS) works correctly 
> with acid tables.





[jira] [Commented] (HIVE-16827) Merge stats task and column stats task into a single task

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168700#comment-16168700
 ] 

Hive QA commented on HIVE-16827:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887276/HIVE-16827.03.patch

{color:green}SUCCESS:{color} +1 due to 39 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11045 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6833/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6833/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6833/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887276 - PreCommit-HIVE-Build

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch
>
>
> Within the task, we can specify whether to compute basic stats only or column 
> stats only or both.





[jira] [Assigned] (HIVE-17547) MoveTask for Acid tables race condition

2017-09-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17547:
-


> MoveTask for Acid tables race condition
> ---
>
> Key: HIVE-17547
> URL: https://issues.apache.org/jira/browse/HIVE-17547
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Consider Hive.moveAcidFiles().
> It starts out with something like
> {noformat}
>   └── -ext-1
> │   └── 00_0
> │   ├── _orc_acid_version
> │   └── delta_019_019
> │   └── bucket_0
> │   └── 00_1
> │   ├── _orc_acid_version
> │   └── delta_019_019
> │   └── bucket_1
> {noformat}
> for a write to a bucketed table.
> The "move" handles each 00_N separately.  The first one creates 
> delta_019_019 under the table/partition dir; the others just add 
> bucket_N there.
> That means there is a small window where someone may "ls 
> table/part/delta_019_019" and not see all the buckets.





[jira] [Commented] (HIVE-17515) Use SHA-256 for GenericUDFMaskHash to improve security

2017-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168653#comment-16168653
 ] 

Thejas M Nair commented on HIVE-17515:
--

Patch committed to master. Thanks for the patch [~taoli-hwx]!


> Use SHA-256 for GenericUDFMaskHash to improve security
> --
>
> Key: HIVE-17515
> URL: https://issues.apache.org/jira/browse/HIVE-17515
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17515.1.patch
>
>
> See HIVE-17226 for detailed description.





[jira] [Updated] (HIVE-17226) Use strong hashing as security improvement

2017-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17226:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Use strong hashing as security improvement
> --
>
> Key: HIVE-17226
> URL: https://issues.apache.org/jira/browse/HIVE-17226
> Project: Hive
>  Issue Type: Improvement
>  Components: Security
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
>
> There have been 2 places identified where weak hashing needs to be replaced 
> by SHA256.
> 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). Mostly SHA is 
> mapped to SHA-1, which is not secure enough according to today's standards. 
> We should use SHA-256 instead.
> 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak 
> and should be replaced by DigestUtils.sha256Hex.
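
The replacement described above can be sketched with the JDK's own `MessageDigest` API. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, and it is not the code in CookieSigner.java or GenericUDFMaskHash.java. The point is that the algorithm is named explicitly as "SHA-256" rather than the ambiguous "SHA".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; not Hive's CookieSigner or GenericUDFMaskHash.
final class Sha256Sketch {
    private Sha256Sketch() {}

    /** Returns the lowercase hex SHA-256 digest of the UTF-8 bytes of input. */
    static String sha256Hex(String input) {
        try {
            // Name the algorithm explicitly instead of relying on "SHA".
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Note that switching the digest changes every hash value produced, so anything that stored or compared the old SHA-1 or MD5 output would see different results after the change.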





[jira] [Commented] (HIVE-17514) Use SHA-256 for cookie signer to improve security

2017-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168651#comment-16168651
 ] 

Thejas M Nair commented on HIVE-17514:
--

Patch committed to master. Thanks for the patch [~taoli-hwx]!


> Use SHA-256 for cookie signer to improve security
> -
>
> Key: HIVE-17514
> URL: https://issues.apache.org/jira/browse/HIVE-17514
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17514.1.patch
>
>
> See HIVE-17226 for detailed description.





[jira] [Updated] (HIVE-17226) Use strong hashing as security improvement

2017-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17226:
-
Attachment: (was: HIVE-17226.1.patch)

> Use strong hashing as security improvement
> --
>
> Key: HIVE-17226
> URL: https://issues.apache.org/jira/browse/HIVE-17226
> Project: Hive
>  Issue Type: Improvement
>  Components: Security
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
>
> There have been 2 places identified where weak hashing needs to be replaced 
> by SHA256.
> 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). Mostly SHA is 
> mapped to SHA-1, which is not secure enough according to today's standards. 
> We should use SHA-256 instead.
> 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak 
> and should be replaced by DigestUtils.sha256Hex.





[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-15 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: HIVE-17483.1.patch

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.





[jira] [Updated] (HIVE-17515) Use SHA-256 for GenericUDFMaskHash to improve security

2017-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17515:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Use SHA-256 for GenericUDFMaskHash to improve security
> --
>
> Key: HIVE-17515
> URL: https://issues.apache.org/jira/browse/HIVE-17515
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17515.1.patch
>
>
> See HIVE-17226 for detailed description.





[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-15 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Status: Patch Available  (was: Open)

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.





[jira] [Updated] (HIVE-17514) Use SHA-256 for cookie signer to improve security

2017-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17514:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Use SHA-256 for cookie signer to improve security
> -
>
> Key: HIVE-17514
> URL: https://issues.apache.org/jira/browse/HIVE-17514
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17514.1.patch
>
>
> See HIVE-17226 for detailed description.





[jira] [Commented] (HIVE-17138) FileSinkOperator doesn't create empty files for acid path

2017-09-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168645#comment-16168645
 ] 

Eugene Koifman commented on HIVE-17138:
---

In acid1, arguably only the base/ has to have a full complement of bucket files.
In acid2, all insert deltas should as well.
In particular, Compactor should make sure to produce empty buckets, which it 
doesn't currently.

> FileSinkOperator doesn't create empty files for acid path
> -
>
> Key: HIVE-17138
> URL: https://issues.apache.org/jira/browse/HIVE-17138
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> For bucketed tables, FileSinkOperator is expected (in some cases)  to produce 
> a specific number of files even if they are empty.
> FileSinkOperator.closeOp(boolean abort) has logic to create files even if 
> empty.
> This doesn't properly work for the Acid path.  For Insert, the 
> OrcRecordUpdater(s) is set up in createBucketForFileIdx() which creates the 
> actual bucketN file (as of HIVE-14007, it does it regardless of whether 
> RecordUpdater sees any rows).  This causes empty (i.e. ORC metadata only) 
> bucket files to be created for multiFileSpray=true if a particular 
> FileSinkOperator.process() sees at least 1 row.  For example,
> {noformat}
> create table fourbuckets (a int, b int) clustered by (a) into 4 buckets 
> stored as orc TBLPROPERTIES ('transactional'='true');
> insert into fourbuckets values(0,1),(1,1);
> with mapreduce.job.reduces = 1 or 2 
> {noformat}
> For the Update/Delete path, OrcRecordWriter is created lazily when the 1st 
> row that needs to land there is seen.  Thus it never creates empty buckets no 
> matter what the value of _skipFiles_ in closeOp(boolean) is.
> Once Split Update does the split early (in the operator pipeline), only the 
> Insert path will matter, since base and delta are the only files that split 
> computation, etc. looks at.  delete_delta is only for Acid internals so there 
> is never any reason to create empty files there.
> Also make sure to close RecordUpdaters in FileSinkOperator.abortWriters().





[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Status: Patch Available  (was: Open)

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch, 
> HIVE-17422.3.patch
>
>
> Currently during incremental dump, the non-native/temporary table info is 
> partially dumped in metadata file and will be ignored later by the repl load. 
> We can optimize it by moving the check (whether the table should be exported 
> or not) earlier so that we don't save any info to dump file for such types of 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method.
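
The idea of moving the check earlier can be sketched as below. Everything here is hypothetical and only illustrates the control flow: `TableInfo`, `shouldExport`, and `dump` are stand-ins invented for the example, and the real EximUtil.shouldExportTable signature and the dump handlers are not reproduced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Hive's replication dump code.
final class DumpFilterSketch {
    private DumpFilterSketch() {}

    /** Hypothetical stand-in for the table metadata attached to an event. */
    static final class TableInfo {
        final String name;
        final boolean nonNative;
        final boolean temporary;

        TableInfo(String name, boolean nonNative, boolean temporary) {
            this.name = name;
            this.nonNative = nonNative;
            this.temporary = temporary;
        }
    }

    /** Hypothetical predicate playing the role of the export check. */
    static boolean shouldExport(TableInfo table) {
        return !table.nonNative && !table.temporary;
    }

    /**
     * Returns the names of the tables that would be written to the dump.
     * The check runs before anything is written, so skipped tables leave
     * no partial metadata behind.
     */
    static List<String> dump(List<TableInfo> eventTables) {
        List<String> written = new ArrayList<>();
        for (TableInfo table : eventTables) {
            if (!shouldExport(table)) {
                continue; // skipped up front, nothing is saved for this table
            }
            written.add(table.name);
        }
        return written;
    }
}
```

Running the check once in the common path, rather than in each handler, is what lets every scenario that reaches the dump method share the same filtering.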





[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Attachment: HIVE-17422.3.patch



[jira] [Updated] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17422:
--
Status: Open  (was: Patch Available)



[jira] [Commented] (HIVE-17196) CM: ReplCopyTask should retain the original file names even if copied from CM path.

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168627#comment-16168627
 ] 

Hive QA commented on HIVE-17196:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887273/HIVE-17196.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6832/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6832/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6832/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887273 - PreCommit-HIVE-Build

> CM: ReplCopyTask should retain the original file names even if copied from CM 
> path.
> ---
>
> Key: HIVE-17196
> URL: https://issues.apache.org/jira/browse/HIVE-17196
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Daniel Dai
> Fix For: 3.0.0
>
> Attachments: HIVE-17196.1.patch, HIVE-17196.2.patch
>
>
> Consider the below scenario,
> 1. Insert into table T1 with value(X).
> 2. Insert into table T1 with value(X).
> 3. Truncate the table T1. 
> – This step backs up 2 files with the same content to cmroot, which ends up 
> with one file in cmroot as the checksums match.
> 4. Incremental repl with above 3 operations.
> – In this step, both insert event files are read from cmroot, where copying 
> one overwrites the other because the file name is the same in the cm path 
> (checksum as file name).
> So, this leads to data loss, and hence it is necessary to retain the original 
> file names even if we copy from the cm path.
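
The scenario above can be reproduced with a small simulation. This is a sketch under assumptions, not the ReplCopyTask code: directories are modeled as maps from file name to content, `checksum` is a stand-in for the real checksum, and the file names `000000_0` and `000000_0_copy_1` are only illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation; directories are modeled as maps of file name -> content.
final class CmCopySketch {

    // Stand-in for a content checksum (the real checksum algorithm is not modeled).
    static String checksum(String content) {
        return Integer.toHexString(content.hashCode());
    }

    // Copies each event file from the cm root into the target directory.
    // Each event is {originalFileName, cmFileName}.
    // keepOriginalName = false reproduces the reported behavior (checksum as file name).
    static Map<String, String> replCopy(String[][] events, Map<String, String> cmRoot,
                                        boolean keepOriginalName) {
        Map<String, String> target = new HashMap<>();
        for (String[] event : events) {
            String originalName = event[0];
            String cmName = event[1];
            String destName = keepOriginalName ? originalName : cmName;
            target.put(destName, cmRoot.get(cmName)); // an equal name overwrites
        }
        return target;
    }

    public static void main(String[] args) {
        String sum = checksum("X");
        Map<String, String> cmRoot = new HashMap<>();
        cmRoot.put(sum, "X"); // two identical files collapse into one cm entry
        String[][] events = {{"000000_0", sum}, {"000000_0_copy_1", sum}};
        System.out.println(replCopy(events, cmRoot, false).size()); // 1: one insert is lost
        System.out.println(replCopy(events, cmRoot, true).size());  // 2: both inserts kept
    }
}
```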





[jira] [Updated] (HIVE-17482) External LLAP client: acquire locks for tables queried directly by LLAP

2017-09-15 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17482:
--
Attachment: HIVE-17482.3.patch

Precommit build never ran .. re-attaching patch.

> External LLAP client: acquire locks for tables queried directly by LLAP
> ---
>
> Key: HIVE-17482
> URL: https://issues.apache.org/jira/browse/HIVE-17482
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17482.1.patch, HIVE-17482.2.patch, 
> HIVE-17482.3.patch
>
>
> When using the LLAP external client with simple queries (filter/project of 
> single table), the appropriate locks should be taken on the table being read 
> like they are for normal Hive queries. This is important in the case of 
> transactional tables being queried, since the compactor relies on the 
> presence of table locks to determine whether it can safely delete old 
> versions of compacted files without affecting currently running queries.
> This does not have to happen in the complex query case, since a query is used 
> (with the appropriate locking mechanisms) to create/populate the temp table 
> holding the results to the complex query.





[jira] [Commented] (HIVE-17479) Staging directories do not get cleaned up for update/delete queries

2017-09-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168609#comment-16168609
 ] 

Eugene Koifman commented on HIVE-17479:
---

[~jcamachorodriguez] it looks like you've changed the patch after I +1'd it.  
Can you explain the changes?

> Staging directories do not get cleaned up for update/delete queries
> ---
>
> Key: HIVE-17479
> URL: https://issues.apache.org/jira/browse/HIVE-17479
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-17479.01.patch, HIVE-17479.02.patch, 
> HIVE-17479.patch
>
>
> When these queries are internally rewritten, a new context is created with a 
> new execution id. This id is used to create the scratch directories. However, 
> only the original context is cleared, and thus only the directories created 
> with the original execution id are cleaned up.
> The solution is to pass the execution id to the new context when the queries 
> are internally rewritten.
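
A minimal model of the leak and the fix described above. Everything here is an assumption for illustration: `QueryContext` is not the Hive Context class, the `/tmp/hive/` prefix and the ids `exec-1`/`exec-2` are made up, and the file system is a plain set of directory names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model; not the actual Hive Context class.
final class ScratchDirSketch {

    static final class QueryContext {
        final String executionId;
        final Set<String> fileSystem; // stand-in for the directories that exist

        QueryContext(String executionId, Set<String> fileSystem) {
            this.executionId = executionId;
            this.fileSystem = fileSystem;
        }

        // Scratch directories are derived from the execution id.
        String createScratchDir() {
            String dir = "/tmp/hive/" + executionId;
            fileSystem.add(dir);
            return dir;
        }

        // Clearing removes only the directory tied to this context's id.
        void clear() {
            fileSystem.remove("/tmp/hive/" + executionId);
        }
    }

    // Returns the directories left behind after the original context is cleared.
    static Set<String> runRewrittenQuery(boolean reuseExecutionId) {
        Set<String> fs = new HashSet<>();
        QueryContext original = new QueryContext("exec-1", fs);
        String rewrittenId = reuseExecutionId ? original.executionId : "exec-2";
        QueryContext rewritten = new QueryContext(rewrittenId, fs);
        rewritten.createScratchDir();
        original.clear(); // only the original context is cleared
        return fs;
    }
}
```

In this model `runRewrittenQuery(false)` leaves one directory behind, while `runRewrittenQuery(true)` leaves none, because the rewritten context shares the id that the original context cleans up.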





[jira] [Commented] (HIVE-17515) Use SHA-256 for GenericUDFMaskHash to improve security

2017-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168587#comment-16168587
 ] 

Thejas M Nair commented on HIVE-17515:
--

+1


> Use SHA-256 for GenericUDFMaskHash to improve security
> --
>
> Key: HIVE-17515
> URL: https://issues.apache.org/jira/browse/HIVE-17515
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17515.1.patch
>
>
> See HIVE-17226 for detailed description.





[jira] [Commented] (HIVE-17514) Use SHA-256 for cookie signer to improve security

2017-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168586#comment-16168586
 ] 

Thejas M Nair commented on HIVE-17514:
--

+1


> Use SHA-256 for cookie signer to improve security
> -
>
> Key: HIVE-17514
> URL: https://issues.apache.org/jira/browse/HIVE-17514
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17514.1.patch
>
>
> See HIVE-17226 for detailed description.





[jira] [Updated] (HIVE-17479) Staging directories do not get cleaned up for update/delete queries

2017-09-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17479:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Failures are unrelated; pushed to master. Thanks for reviewing [~ashutoshc], 
[~ekoifman]



[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168555#comment-16168555
 ] 

Ashutosh Chauhan commented on HIVE-17536:
-

Some explain plans have -1, which is not what users will expect.

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.
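
One way to make the two cases distinguishable, sketched under assumptions: `getBasicStat`, `STATS_ABSENT` and the parameter map are hypothetical and are not the StatsUtil API; the `numRows` key is used only as an example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not the actual StatsUtil method.
final class BasicStatSketch {

    // Sentinel returned when no statistic is recorded.
    static final long STATS_ABSENT = -1L;

    // Returns the stat value, or -1 when the parameter is missing or not numeric.
    static long getBasicStat(Map<String, String> tableParams, String statKey) {
        String raw = tableParams.get(statKey);
        if (raw == null) {
            return STATS_ABSENT;
        }
        try {
            return Long.parseLong(raw);
        } catch (NumberFormatException e) {
            return STATS_ABSENT;
        }
    }

    public static void main(String[] args) {
        Map<String, String> emptyTable = new HashMap<>();
        emptyTable.put("numRows", "0");
        Map<String, String> noStats = new HashMap<>();
        System.out.println(getBasicStat(emptyTable, "numRows")); // 0: real zero
        System.out.println(getBasicStat(noStats, "numRows"));    // -1: stats missing
    }
}
```

Any caller that displays the value would have to handle the sentinel explicitly so that it is not shown to users as a row count.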





[jira] [Comment Edited] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez

2017-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168535#comment-16168535
 ] 

Thejas M Nair edited comment on HIVE-17502 at 9/15/17 9:15 PM:
---

The SessionState holds state for the session/JDBC connection, so it's meant to be 
used for everything in that session.
However, it is not a class that we have worked on to ensure thread safety.

I had created a patch to disable parallel use of a single session because of 
safety issues, but forgot to commit it - HIVE-14247.

[~thai.bui]
What is the purpose of sharing the same connection for different queries at 
the same time? 
It's not something that APIs like JDBC easily allow, and not something usually 
recommended in the RDBMS world. Sharing the session won't give you any 
significant advantages in terms of resource utilization in LLAP mode AFAIK.




was (Author: thejas):
The SessionState holds state for the session/JDBC connection, so it's meant to be 
used for everything in that session.
However, it is not a class that we have worked on to ensure thread safety.

I had created a patch to disable parallel use of a single session because of 
safety issues, but forgot to commit it - HIVE-11402.

[~thai.bui]
What is the purpose of sharing the same connection for different queries at 
the same time? 
It's not something that APIs like JDBC easily allow, and not something usually 
recommended in the RDBMS world. Sharing the session won't give you any 
significant advantages in terms of resource utilization in LLAP mode AFAIK.



> Reuse of default session should not throw an exception in LLAP w/ Tez
> -
>
> Key: HIVE-17502
> URL: https://issues.apache.org/jira/browse/HIVE-17502
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.1, 2.2.0
> Environment: HDP 2.6.1.0-129, Hue 4
>Reporter: Thai Bui
>Assignee: Thai Bui
>
> Hive2 w/ LLAP on Tez doesn't allow a currently used, default session to be 
> skipped mostly because of this line 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365.
> However, some clients such as Hue 4, allow multiple sessions to be used per 
> user. Under this configuration, a Thrift client will send a request to either 
> reuse or open a new session. The reuse request could include the session id 
> of a currently used snippet being executed in Hue, this causes HS2 to throw 
> an exception:
> {noformat}
> 2017-09-10T17:51:36,548 INFO  [Thread-89]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(512)) - The current user: 
> hive, session user: hive
> 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task 
> (TezTask.java:execute(232)) - Failed to execute tez graph.
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session 
> sessionId=5b61a578-6336-41c5-860d-9838166f97fe, queueName=llap, user=hive, 
> doAs=false, isOpen=true, isDefault=true, expires in 591015330ms should have 
> been returned to the pool
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> {noformat}
> Note that every query is issued as a single 'hive' user to share the LLAP 
> daemon pool, a set of pre-determined number of AMs is initialized at setup 
> time. Thus, HS2 should allow new sessions from a Thrift client to be used out 
> of the pool, or an existing session to be skipped and an unused session from 
> the pool to be returned. The logic to throw an exception in the  
> `canWorkWithSameSession` doesn't make sense to me.
> I have a solution to fix this issue in my local branch at 
> https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70.
>  When applied, the log will become like so
> {noformat}
> 2017-09-10T09:15:33,578 INFO  [Thread-239]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(533)) - Skipping default 
> session 

[jira] [Issue Comment Deleted] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez

2017-09-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17502:

Comment: was deleted

(was: HIVE-14247 is the correct link for the patch)

> Reuse of default session should not throw an exception in LLAP w/ Tez
> -
>
> Key: HIVE-17502
> URL: https://issues.apache.org/jira/browse/HIVE-17502
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.1, 2.2.0
> Environment: HDP 2.6.1.0-129, Hue 4
>Reporter: Thai Bui
>Assignee: Thai Bui
>
> Hive2 w/ LLAP on Tez doesn't allow a currently used, default session to be 
> skipped mostly because of this line 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365.
> However, some clients such as Hue 4, allow multiple sessions to be used per 
> user. Under this configuration, a Thrift client will send a request to either 
> reuse or open a new session. The reuse request could include the session id 
> of a currently used snippet being executed in Hue, this causes HS2 to throw 
> an exception:
> {noformat}
> 2017-09-10T17:51:36,548 INFO  [Thread-89]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(512)) - The current user: 
> hive, session user: hive
> 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task 
> (TezTask.java:execute(232)) - Failed to execute tez graph.
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session 
> sessionId=5b61a578-6336-41c5-860d-9838166f97fe, queueName=llap, user=hive, 
> doAs=false, isOpen=true, isDefault=true, expires in 591015330ms should have 
> been returned to the pool
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> {noformat}
> Note that every query is issued as a single 'hive' user to share the LLAP 
> daemon pool, a set of pre-determined number of AMs is initialized at setup 
> time. Thus, HS2 should allow new sessions from a Thrift client to be used out 
> of the pool, or an existing session to be skipped and an unused session from 
> the pool to be returned. The logic to throw an exception in the  
> `canWorkWithSameSession` doesn't make sense to me.
> I have a solution to fix this issue in my local branch at 
> https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70.
>  When applied, the log will become like so
> {noformat}
> 2017-09-10T09:15:33,578 INFO  [Thread-239]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(533)) - Skipping default 
> session sessionId=6638b1da-0f8a-405e-85f0-9586f484e6de, queueName=llap, 
> user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591868732ms 
> since it is being used.
> {noformat}
> A test case is provided in my branch to demonstrate how it works. If possible 
> I would like this patch to be applied to version 2.1, 2.2 and master. Since 
> we are using 2.1 LLAP in production with Hue 4, this patch is critical to our 
> success.
> Alternatively, if this patch is too broad in scope, I propose adding an 
> option to allow "skipping of currently used default sessions". With this new 
> option defaulting to "false", existing behavior won't change unless the option 
> is turned on.
> I will prepare an official patch if this change to master &/ the other 
> branches is acceptable. I'm not a contributor &/ committer, this will be my 
> first time contributing to Hive and the Apache foundation. Any early review 
> is greatly appreciated, thanks!
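
The proposed behavior (skip a default session that is in use and hand out an unused one) can be sketched as follows. This is an illustration under assumptions, not TezSessionPoolManager and not the linked commit: `SessionPoolSketch` and `PoolSession` are hypothetical, and an empty pool throws here instead of waiting.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pool; not the actual TezSessionPoolManager.
final class SessionPoolSketch {

    static final class PoolSession {
        final String id;
        boolean inUse;

        PoolSession(String id) {
            this.id = id;
        }
    }

    private final Deque<PoolSession> idle = new ArrayDeque<>();

    SessionPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            idle.add(new PoolSession("session-" + i));
        }
    }

    // Reuses the requested session only if it is free; otherwise skips it
    // and hands out an unused session from the pool instead of throwing.
    synchronized PoolSession getSession(PoolSession requested) {
        if (requested != null && !requested.inUse) {
            idle.remove(requested);
            requested.inUse = true;
            return requested;
        }
        PoolSession next = idle.poll();
        if (next == null) {
            // Assumption: this sketch fails fast rather than blocking.
            throw new IllegalStateException("no idle session in the pool");
        }
        next.inUse = true;
        return next;
    }

    // Marks the session as free and puts it back into the pool.
    synchronized void returnSession(PoolSession session) {
        session.inUse = false;
        idle.add(session);
    }
}
```

Note that this only covers picking a different pool session; it does not address concurrent use of a single session's state.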





[jira] [Commented] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez

2017-09-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168537#comment-16168537
 ] 

Sergey Shelukhin commented on HIVE-17502:
-

HIVE-14247 is the correct link for the patch



[jira] [Commented] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez

2017-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168535#comment-16168535
 ] 

Thejas M Nair commented on HIVE-17502:
--

The SessionState holds state for the session/JDBC connection, so it's meant to be 
used for everything in that session.
However, it is not a class that we have worked on to ensure thread safety.

I had created a patch to disable parallel use of a single session because of 
safety issues, but forgot to commit it - HIVE-11402.

[~thai.bui]
What is the purpose of sharing the same connection for different queries at 
the same time? 
It's not something that APIs like JDBC easily allow, and not something usually 
recommended in the RDBMS world. Sharing the session won't give you any 
significant advantages in terms of resource utilization in LLAP mode AFAIK.




[jira] [Commented] (HIVE-15899) check CTAS over acid table

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168525#comment-16168525
 ] 

Hive QA commented on HIVE-15899:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887251/HIVE-15899.05.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 202 failed/errored test(s), 6763 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=49)

[jira] [Updated] (HIVE-14247) Disable parallel query execution within a session

2017-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-14247:
-
Status: Open  (was: Patch Available)

> Disable parallel query execution within a session
> -
>
> Key: HIVE-14247
> URL: https://issues.apache.org/jira/browse/HIVE-14247
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-14247.1.patch
>
>
> HIVE-11402 leaves the parallel compilation enabled within a session. 
> This is a patch for those who want it to be disabled by default. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-14247) Disable parallel query execution within a session

2017-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-14247:
-
Status: Patch Available  (was: Open)

> Disable parallel query execution within a session
> -
>
> Key: HIVE-14247
> URL: https://issues.apache.org/jira/browse/HIVE-14247
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-14247.1.patch
>
>
> HIVE-11402 leaves the parallel compilation enabled within a session. 
> This is a patch for those who want it to be disabled by default. 





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Patch Available  (was: Open)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.
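
Editor's sketch of the proposal above, not the attached patch: a minimal Java illustration of returning a sentinel when a statistic is absent. The class name, the method getBasicStat, and the constant STAT_UNAVAILABLE are hypothetical; the table parameters are modelled as a plain Map, and the "numRows" key is used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class BasicStatSketch {
    /** Hypothetical sentinel meaning "no value recorded for this statistic". */
    static final long STAT_UNAVAILABLE = -1L;

    /**
     * Hypothetical stand-in for getBasicStatForTable: looks up one statistic
     * in the table parameters and keeps "missing" distinct from "zero".
     */
    static long getBasicStat(Map<String, String> tableParams, String statName) {
        String raw = tableParams.get(statName);
        if (raw == null) {
            return STAT_UNAVAILABLE; // statistic is absent
        }
        try {
            return Long.parseLong(raw); // may legitimately be 0
        } catch (NumberFormatException e) {
            return STAT_UNAVAILABLE; // unparsable value treated as absent
        }
    }

    public static void main(String[] args) {
        Map<String, String> noStats = new HashMap<>();
        Map<String, String> emptyTable = new HashMap<>();
        emptyTable.put("numRows", "0");
        System.out.println("missing stats -> " + getBasicStat(noStats, "numRows"));
        System.out.println("zero rows     -> " + getBasicStat(emptyTable, "numRows"));
    }
}
```

Callers that previously treated 0 as "unknown" would have to check for the sentinel explicitly, which is the behavioural change the issue asks for.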





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Attachment: HIVE-17536.2.patch

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Open  (was: Patch Available)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Commented] (HIVE-17530) ClassCastException when converting uniontype

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168444#comment-16168444
 ] 

Hive QA commented on HIVE-17530:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887239/HIVE-17530.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testStoreFuncAllSimpleTypes 
(batchId=183)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTimestamp 
(batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6830/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6830/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6830/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887239 - PreCommit-HIVE-Build

> ClassCastException when converting uniontype
> 
>
> Key: HIVE-17530
> URL: https://issues.apache.org/jira/browse/HIVE-17530
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 3.0.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-17530.1.patch, HIVE-17530.2.patch
>
>
> To repro:
> {noformat}
> SET hive.exec.schema.evolution = false;
> CREATE TABLE avro_orc_partitioned_uniontype (a uniontype) 
> PARTITIONED BY (b int) STORED AS ORC;
> INSERT INTO avro_orc_partitioned_uniontype PARTITION (b=1) SELECT 
> create_union(1, true, value) FROM src LIMIT 5;
> ALTER TABLE avro_orc_partitioned_uniontype SET FILEFORMAT AVRO;
> SELECT * FROM avro_orc_partitioned_uniontype;
> {noformat}
> The exception you get is:
> {code}
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.util.ArrayList cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.UnionObject
> {code}
> The issue is that StandardUnionObjectInspector was creating and returning an 
> ArrayList rather than a UnionObject.
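
Editor's sketch of the failure mode described above, under stated assumptions: the UnionObject interface below is a local stand-in written for this example (it is not imported from Hive), and createBuggy/createFixed are hypothetical names. It only shows why a consumer that casts to UnionObject fails when handed an ArrayList.

```java
import java.util.ArrayList;
import java.util.List;

class UnionCastSketch {
    /** Local stand-in for a union value: a tag plus the tagged object. */
    interface UnionObject {
        byte getTag();
        Object getObject();
    }

    static final class SimpleUnion implements UnionObject {
        private final byte tag;
        private final Object value;
        SimpleUnion(byte tag, Object value) { this.tag = tag; this.value = value; }
        public byte getTag() { return tag; }
        public Object getObject() { return value; }
    }

    /** A consumer that assumes it always receives a UnionObject. */
    static byte readTag(Object o) {
        return ((UnionObject) o).getTag();
    }

    /** Buggy shape: packs tag and value into a list. */
    static Object createBuggy(byte tag, Object value) {
        List<Object> l = new ArrayList<>();
        l.add(tag);
        l.add(value);
        return l;
    }

    /** Fixed shape: returns an object that implements UnionObject. */
    static Object createFixed(byte tag, Object value) {
        return new SimpleUnion(tag, value);
    }

    public static void main(String[] args) {
        try {
            readTag(createBuggy((byte) 1, "x"));
        } catch (ClassCastException e) {
            System.out.println("buggy creator fails with ClassCastException");
        }
        System.out.println("fixed creator tag = " + readTag(createFixed((byte) 1, "x")));
    }
}
```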





[jira] [Updated] (HIVE-15212) merge branch into master

2017-09-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15212:

Attachment: HIVE-15212.13.patch

A small update

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch
>
>






[jira] [Updated] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17545:

Attachment: HIVE-17545.1.patch

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.
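
Editor's sketch of what "configurable, on by default" could look like, assuming nothing about the actual patch: the property name example.spark.rdd.cache.enabled and the resolver names are invented for illustration, and java.util.Properties stands in for the Hive configuration object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class OptimizerFlagSketch {
    /** Hypothetical property name; the real key is defined by the patch, not here. */
    static final String RDD_CACHE_KEY = "example.spark.rdd.cache.enabled";

    /** Builds the list of optimizations to run, honouring the flag. */
    static List<String> buildOptimizations(Properties conf) {
        List<String> steps = new ArrayList<>();
        steps.add("AlwaysOnStep");
        // Default is "true" so behaviour is unchanged unless a user opts out.
        boolean enabled = Boolean.parseBoolean(conf.getProperty(RDD_CACHE_KEY, "true"));
        if (enabled) {
            steps.add("RddCachingOptimization");
        }
        return steps;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        Properties disabled = new Properties();
        disabled.setProperty(RDD_CACHE_KEY, "false");
        System.out.println("default:  " + buildOptimizations(defaults));
        System.out.println("disabled: " + buildOptimizations(disabled));
    }
}
```

Defaulting the flag to true is what preserves backwards compatibility: existing deployments see the same plan unless they set the property.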





[jira] [Updated] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17545:

Status: Patch Available  (was: Open)

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168424#comment-16168424
 ] 

Sahil Takiar commented on HIVE-17545:
-

Adding tests for this will be difficult until HIVE-17546 has been done.

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Assigned] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17545:
---


> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Updated] (HIVE-15212) merge branch into master

2017-09-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15212:

Attachment: HIVE-15212.13.patch

Rinse, repeat

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch
>
>






[jira] [Commented] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez

2017-09-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168391#comment-16168391
 ] 

Sergey Shelukhin commented on HIVE-17502:
-

The problem is the methods in HiveSessionImpl that call 
SessionState.setCurrentSessionState(sessionState): they set the same 
sessionState object on different threads when this pattern is used (a client 
requesting the same Hive session from different threads).
So the SessionState object is reused across threads even though it is accessed 
through a thread-local.
My understanding was that this was not supported, so the client should open 
separate sessions for parallel requests. However, I'm not sure about that.
[~thejas] [~vgumashta] can you clarify on Hive session usage w/HS2?

If the same SessionState can be reused otherwise, it needs to be changed to 
support multiple Tez sessions.
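
Editor's sketch of the point made in this comment, in plain Java with no Hive classes: a ThreadLocal gives each thread its own slot, but if every thread stores the same object in its slot, the object itself is still shared. SessionStateLike and its tezSession field are hypothetical stand-ins.

```java
class ThreadLocalSharingSketch {
    /** Hypothetical stand-in for a per-session state object. */
    static final class SessionStateLike {
        volatile String tezSession;
    }

    static final ThreadLocal<SessionStateLike> CURRENT = new ThreadLocal<>();

    /** Returns true if two threads end up seeing the very same state object. */
    static boolean sharedAcrossThreads() {
        final SessionStateLike shared = new SessionStateLike();
        final SessionStateLike[] seen = new SessionStateLike[2];
        // Each thread installs the same object into its own thread-local slot.
        Thread t1 = new Thread(() -> { CURRENT.set(shared); seen[0] = CURRENT.get(); });
        Thread t2 = new Thread(() -> { CURRENT.set(shared); seen[1] = CURRENT.get(); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] == seen[1];
    }

    public static void main(String[] args) {
        System.out.println("same object on both threads: " + sharedAcrossThreads());
    }
}
```

Any single-valued field on the shared object (such as one Tez session reference) is therefore contended by parallel requests, which is why the comment says the state would need to support multiple Tez sessions.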

> Reuse of default session should not throw an exception in LLAP w/ Tez
> -
>
> Key: HIVE-17502
> URL: https://issues.apache.org/jira/browse/HIVE-17502
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.1, 2.2.0
> Environment: HDP 2.6.1.0-129, Hue 4
>Reporter: Thai Bui
>Assignee: Thai Bui
>
> Hive2 w/ LLAP on Tez doesn't allow a currently used, default session to be 
> skipped mostly because of this line 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365.
> However, some clients such as Hue 4, allow multiple sessions to be used per 
> user. Under this configuration, a Thrift client will send a request to either 
> reuse or open a new session. The reuse request could include the session id 
> of a currently used snippet being executed in Hue, which causes HS2 to throw 
> an exception:
> {noformat}
> 2017-09-10T17:51:36,548 INFO  [Thread-89]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(512)) - The current user: 
> hive, session user: hive
> 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task 
> (TezTask.java:execute(232)) - Failed to execute tez graph.
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session 
> sessionId=5b61a578-6336-41c5-860d-9838166f97fe, queueName=llap, user=hive, 
> doAs=false, isOpen=true, isDefault=true, expires in 591015330ms should have 
> been returned to the pool
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> {noformat}
> Note that every query is issued as a single 'hive' user to share the LLAP 
> daemon pool, and a pre-determined number of AMs is initialized at setup 
> time. Thus, HS2 should allow new sessions from a Thrift client to be used out 
> of the pool, or an existing session to be skipped and an unused session from 
> the pool to be returned. The logic that throws an exception in 
> `canWorkWithSameSession` doesn't make sense to me.
> I have a solution to fix this issue in my local branch at 
> https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70.
>  When applied, the log will become like so
> {noformat}
> 2017-09-10T09:15:33,578 INFO  [Thread-239]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(533)) - Skipping default 
> session sessionId=6638b1da-0f8a-405e-85f0-9586f484e6de, queueName=llap, 
> user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591868732ms 
> since it is being used.
> {noformat}
> A test case is provided in my branch to demonstrate how it works. If possible 
> I would like this patch to be applied to version 2.1, 2.2 and master. Since 
> we are using 2.1 LLAP in production with Hue 4, this patch is critical to our 
> success.
> Alternatively, if this patch is too broad in scope, I propose adding an 
> option to allow "skipping of currently used default sessions". With this new 
> option default to "false", existing behavior won't change unless the option 
> is turned on.
> I will prepare an official patch if this change to 
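
Editor's sketch of the behaviour requested in the description above (skip a default session that is in use and hand out an unused one). This is not the code from the linked commit; SessionPoolSketch, PooledSession and their methods are hypothetical, and a real pool might block instead of returning null when exhausted.

```java
import java.util.ArrayList;
import java.util.List;

class SessionPoolSketch {
    /** Hypothetical pooled session with an in-use marker. */
    static final class PooledSession {
        final String id;
        boolean inUse;
        PooledSession(String id) { this.id = id; }
    }

    private final List<PooledSession> pool = new ArrayList<>();

    SessionPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            pool.add(new PooledSession("s" + i));
        }
    }

    /**
     * Reuses the requested session when it is free; otherwise skips it and
     * returns any unused session instead of throwing.
     */
    synchronized PooledSession getSession(PooledSession requested) {
        if (requested != null && !requested.inUse) {
            requested.inUse = true;
            return requested;
        }
        for (PooledSession s : pool) {
            if (!s.inUse) {
                s.inUse = true;
                return s;
            }
        }
        return null; // pool exhausted in this simplified sketch
    }

    synchronized void returnSession(PooledSession s) {
        s.inUse = false;
    }

    public static void main(String[] args) {
        SessionPoolSketch mgr = new SessionPoolSketch(2);
        PooledSession a = mgr.getSession(null);
        PooledSession b = mgr.getSession(a); // a is busy, so another one is used
        System.out.println("first=" + a.id + ", second=" + b.id);
    }
}
```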

[jira] [Updated] (HIVE-17542) Make HoS CombineEquivalentWorkResolver Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17542:

Status: Patch Available  (was: Open)

> Make HoS CombineEquivalentWorkResolver Configurable
> ---
>
> Key: HIVE-17542
> URL: https://issues.apache.org/jira/browse/HIVE-17542
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17542.1.patch
>
>
> The {{CombineEquivalentWorkResolver}} is run by default. We should make it 
> configurable so that users can disable it in case there are any issues. We 
> can enable it by default to preserve backwards compatibility.





[jira] [Updated] (HIVE-17542) Make HoS CombineEquivalentWorkResolver Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17542:

Attachment: HIVE-17542.1.patch

> Make HoS CombineEquivalentWorkResolver Configurable
> ---
>
> Key: HIVE-17542
> URL: https://issues.apache.org/jira/browse/HIVE-17542
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17542.1.patch
>
>
> The {{CombineEquivalentWorkResolver}} is run by default. We should make it 
> configurable so that users can disable it in case there are any issues. We 
> can enable it by default to preserve backwards compatibility.





[jira] [Updated] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage

2017-09-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15665:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Test failures are old or unrelated. Thanks for the review! Committed to master

> LLAP: OrcFileMetadata objects in cache can impact heap usage
> 
>
> Key: HIVE-15665
> URL: https://issues.apache.org/jira/browse/HIVE-15665
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, 
> HIVE-15665.03.patch, HIVE-15665.04.patch, HIVE-15665.05.patch, 
> HIVE-15665.06.patch, HIVE-15665.07.patch, HIVE-15665.08.patch, 
> HIVE-15665.09.patch, HIVE-15665.10.patch, HIVE-15665.11.patch, 
> HIVE-15665.12.patch, HIVE-15665.patch
>
>
> OrcFileMetadata internally has filestats, stripestats etc which are allocated 
> in heap. On large data sets, this could have an impact on the heap usage and 
> the memory usage by different executors in LLAP.





[jira] [Assigned] (HIVE-17544) Add Parsed Tree as input for Authorization

2017-09-15 Thread Na Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Li reassigned HIVE-17544:



> Add Parsed Tree as input for Authorization
> --
>
> Key: HIVE-17544
> URL: https://issues.apache.org/jira/browse/HIVE-17544
> Project: Hive
>  Issue Type: Task
>  Components: Authorization
>Affects Versions: 2.1.1
>Reporter: Na Li
>Assignee: Aihua Xu
>Priority: Critical
>
> Right now, for authorization 2, the 
> HiveAuthorizationValidator.checkPrivileges(HiveOperationType var1, 
> List<HivePrivilegeObject> var2, List<HivePrivilegeObject> var3, 
> HiveAuthzContext var4) method does not take the parsed SQL command as 
> input. Therefore, Sentry has to parse the command again.
> The API should be changed to include the parsed result as input, so Sentry 
> does not need to parse the sql command string again.
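
Editor's sketch of the motivation above, with every name hypothetical: a toy parser counts how often it runs, to contrast a check that receives only the command string (and must parse again) with a check that receives the already parsed result. None of this mirrors the real Hive or Sentry interfaces.

```java
import java.util.ArrayList;
import java.util.List;

class AuthzInputSketch {
    /** Counts parser invocations so the duplicate work is visible. */
    static int parseCount = 0;

    /** Hypothetical parsed form: an operation and the tables it touches. */
    static final class ParsedCommand {
        final String operation;
        final List<String> tables;
        ParsedCommand(String operation, List<String> tables) {
            this.operation = operation;
            this.tables = tables;
        }
    }

    /** Toy parser for commands of the form "OP t1,t2". */
    static ParsedCommand parse(String sql) {
        parseCount++;
        String[] parts = sql.trim().split("\\s+", 2);
        List<String> tables = new ArrayList<>();
        if (parts.length > 1) {
            for (String t : parts[1].split(",")) {
                tables.add(t.trim());
            }
        }
        return new ParsedCommand(parts[0].toUpperCase(), tables);
    }

    static boolean allowed(ParsedCommand p) {
        return !p.tables.contains("secret");
    }

    /** Current shape: the plugin only has the string and parses it again. */
    static boolean checkWithString(String sql) {
        return allowed(parse(sql));
    }

    /** Proposed shape: the engine passes along what it already parsed. */
    static boolean checkWithParsed(ParsedCommand parsed) {
        return allowed(parsed);
    }

    public static void main(String[] args) {
        ParsedCommand parsed = parse("select a,b"); // engine parses once
        checkWithParsed(parsed);
        System.out.println("parses with parsed input: " + parseCount);
        checkWithString("select a,b");
        System.out.println("parses after string-only check: " + parseCount);
    }
}
```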





[jira] [Commented] (HIVE-17157) Add InterfaceAudience and InterfaceStability annotations for ObjectInspector APIs

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168331#comment-16168331
 ] 

Hive QA commented on HIVE-17157:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887224/HIVE-17157.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[fp_literal_arithmetic] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=225)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6829/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6829/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6829/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887224 - PreCommit-HIVE-Build

> Add InterfaceAudience and InterfaceStability annotations for ObjectInspector 
> APIs
> -
>
> Key: HIVE-17157
> URL: https://issues.apache.org/jira/browse/HIVE-17157
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17157.1.patch
>
>






[jira] [Commented] (HIVE-17541) Move testing related methods from MetaStoreUtils to some testing related utility

2017-09-15 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168313#comment-16168313
 ] 

Alan Gates commented on HIVE-17541:
---

+1

> Move testing related methods from MetaStoreUtils to some testing related 
> utility
> 
>
> Key: HIVE-17541
> URL: https://issues.apache.org/jira/browse/HIVE-17541
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17541.01.patch
>
>
> MetaStoreUtils has a very wide range of methods... the last time I tried to 
> do some modularization related to it, it always came back problematic :)
> The most useful observation I made is that it doesn't necessarily need the 
> {{HMSHandler}} import.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17422) Skip non-native/temporary tables for all major table/partition related scenarios

2017-09-15 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168298#comment-16168298
 ] 

Tao Li commented on HIVE-17422:
---

Fixing the test failure "testNoopReplEximCommands".

> Skip non-native/temporary tables for all major table/partition related 
> scenarios
> 
>
> Key: HIVE-17422
> URL: https://issues.apache.org/jira/browse/HIVE-17422
> Project: Hive
>  Issue Type: Improvement
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17422.1.patch, HIVE-17422.2.patch
>
>
> Currently during incremental dump, the non-native/temporary table info is 
> partially dumped in metadata file and will be ignored later by the repl load. 
> We can optimize it by moving the check (whether the table should be exported 
> or not) earlier so that we don't save any info to dump file for such types of 
> tables. CreateTableHandler already has this optimization, so we just need to 
> apply similar logic to other scenarios.
> The change is to apply the EximUtil.shouldExportTable check to all scenarios 
> (e.g. alter table) that call into the common dump method. 
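
Editor's sketch of the "check earlier" idea above, assuming a simplified model: TableInfo, shouldExport and dumpAlterEvents are hypothetical stand-ins (shouldExport plays the role the description gives to EximUtil.shouldExportTable), and the dump is modelled as a list of strings rather than files.

```java
import java.util.ArrayList;
import java.util.List;

class DumpFilterSketch {
    /** Hypothetical table descriptor with the two properties that matter here. */
    static final class TableInfo {
        final String name;
        final boolean nonNative;
        final boolean temporary;
        TableInfo(String name, boolean nonNative, boolean temporary) {
            this.name = name;
            this.nonNative = nonNative;
            this.temporary = temporary;
        }
    }

    /** Stand-in for the export check: skip non-native and temporary tables. */
    static boolean shouldExport(TableInfo t) {
        return !t.nonNative && !t.temporary;
    }

    /** Applies the check before anything is written for the event. */
    static List<String> dumpAlterEvents(List<TableInfo> altered) {
        List<String> dumped = new ArrayList<>();
        for (TableInfo t : altered) {
            if (!shouldExport(t)) {
                continue; // nothing is saved for tables the load would ignore anyway
            }
            dumped.add("ALTER_TABLE:" + t.name);
        }
        return dumped;
    }

    public static void main(String[] args) {
        List<TableInfo> altered = new ArrayList<>();
        altered.add(new TableInfo("orders", false, false));
        altered.add(new TableInfo("hbase_backed", true, false));
        altered.add(new TableInfo("scratch", false, true));
        System.out.println(dumpAlterEvents(altered));
    }
}
```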





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Patch Available  (was: Open)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: HIVE-17496.4.patch

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: (was: HIVE-17496.4.patch)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Open  (was: Patch Available)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Patch Available  (was: Open)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: HIVE-17496.4.patch

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch, HIVE-17496.4.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Open  (was: Patch Available)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: HIVE-17496.4.patch

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Attachment: (was: HIVE-17496.4.patch)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Commented] (HIVE-16898) Validation of source file after distcp in repl load

2017-09-15 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168283#comment-16168283
 ] 

Daniel Dai commented on HIVE-16898:
---

Created a review board for it.

> Validation of source file after distcp in repl load 
> 
>
> Key: HIVE-16898
> URL: https://issues.apache.org/jira/browse/HIVE-16898
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: Daniel Dai
> Fix For: 3.0.0
>
> Attachments: HIVE-16898.1.patch
>
>
> The source file can change between the time the source and destination 
> paths for distcp are decided and the time distcp is invoked, so distcp might 
> copy the wrong file to the destination. We should therefore add a check on 
> the checksum of the source file path after distcp finishes, to make sure the 
> file at that path did not change during the copy. If it has changed, take 
> additional steps: delete the previously copied file on the destination, copy 
> the new source, and repeat the same process until the correct file is copied. 
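
The check-and-retry loop described above could look roughly like the following. This is a local-filesystem sketch under stated assumptions, not the patch: Files.copy stands in for distcp, a SHA-256 digest stands in for the HDFS file checksum, and the class name, method names, and maxAttempts parameter are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class VerifiedCopySketch {

    /** SHA-256 digest of a file's contents (stand-in for an HDFS checksum). */
    static byte[] checksum(Path file) throws IOException {
        try {
            return MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is available on every JVM
        }
    }

    /**
     * Copies src to dst, then re-checks the source checksum. If the source
     * changed while the copy ran, the stale destination file is deleted and
     * the copy is repeated, up to maxAttempts times.
     */
    static boolean copyWithVerification(Path src, Path dst, int maxAttempts) throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            byte[] before = checksum(src);
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING); // stand-in for distcp
            byte[] after = checksum(src);
            if (Arrays.equals(before, after)) {
                return true; // source was stable, so dst holds the intended file
            }
            Files.deleteIfExists(dst); // drop the possibly wrong copy and retry
        }
        return false;
    }
}
```

Bounding the number of attempts is a design choice of this sketch; the description only says to repeat until the correct file is copied.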





[jira] [Assigned] (HIVE-17542) Make HoS CombineEquivalentWorkResolver Configurable

2017-09-15 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17542:
---


> Make HoS CombineEquivalentWorkResolver Configurable
> ---
>
> Key: HIVE-17542
> URL: https://issues.apache.org/jira/browse/HIVE-17542
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> The {{CombineEquivalentWorkResolver}} is run by default. We should make it 
> configurable so that users can disable it in case there are any issues. We 
> can enable it by default to preserve backwards compatibility.





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168225#comment-16168225
 ] 

Ashutosh Chauhan commented on HIVE-17536:
-

That's correct. I missed it.

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Commented] (HIVE-17479) Staging directories do not get cleaned up for update/delete queries

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168223#comment-16168223
 ] 

Hive QA commented on HIVE-17479:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12886921/HIVE-17479.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6828/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6828/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6828/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12886921 - PreCommit-HIVE-Build

> Staging directories do not get cleaned up for update/delete queries
> ---
>
> Key: HIVE-17479
> URL: https://issues.apache.org/jira/browse/HIVE-17479
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17479.01.patch, HIVE-17479.02.patch, 
> HIVE-17479.patch
>
>
> When these queries are internally rewritten, a new context is created with a 
> new execution id. This id is used to create the scratch directories. However, 
> only the original context is cleared, and thus only the directories created 
> with the original execution id are removed.
> The solution is to pass the execution id to the new context when the queries 
> are internally rewritten.
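
A toy model of the leak and of the proposed fix, offered as a sketch only. Ctx is a hypothetical stand-in and not Hive's Context class; the set of strings stands in for the file system, and the paths and ids are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class ScratchDirSketch {

    /** Toy stand-in for a file system: the scratch directories that exist. */
    static final Set<String> existingDirs = new HashSet<>();

    /** Toy stand-in for a query context. */
    static class Ctx {
        final String executionId;

        Ctx(String executionId) {
            this.executionId = executionId;
        }

        /** Creates a scratch directory whose name is derived from the execution id. */
        void createScratchDir() {
            existingDirs.add("/tmp/scratch/" + executionId);
        }

        /** Removes only the scratch directory that belongs to this context's id. */
        void clear() {
            existingDirs.remove("/tmp/scratch/" + executionId);
        }
    }

    /** Runs a rewritten query and returns how many scratch dirs are left behind. */
    static int leakedDirs(boolean reuseExecutionId) {
        existingDirs.clear();
        Ctx original = new Ctx("exec-1");
        // The rewrite creates a second context; the fix hands it the original id.
        Ctx rewritten = new Ctx(reuseExecutionId ? original.executionId : "exec-2");
        rewritten.createScratchDir();
        original.clear(); // only the original context is ever cleared
        return existingDirs.size();
    }
}
```

With a fresh id the rewritten context's directory survives the cleanup; with the shared id, clearing the original context removes it.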





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168219#comment-16168219
 ] 

Vineet Garg commented on HIVE-17536:


[~ashutoshc] if the stat key is missing, the following code will throw a 
NumberFormatException: {code}Long.parseLong(params.get(statType)){code} This 
is because {code}params.get(statType){code} returns null for a missing stat 
key. So this case is already accounted for.
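
For illustration, a self-contained sketch of that behavior. It is not the code in the attached patch: the class and method names are hypothetical, -1 is the sentinel suggested in the issue description, and "numRows" and "totalSize" are used only as example keys.

```java
import java.util.HashMap;
import java.util.Map;

public class BasicStatSketch {

    /** Sentinel for "no statistics recorded", as opposed to a real value of zero. */
    static final long STATS_MISSING = -1L;

    static long getBasicStat(Map<String, String> params, String statType) {
        if (params == null) {
            return STATS_MISSING; // a null map would otherwise cause a NullPointerException
        }
        try {
            // params.get returns null for an absent key, and Long.parseLong(null)
            // throws NumberFormatException rather than NullPointerException.
            return Long.parseLong(params.get(statType));
        } catch (NumberFormatException e) {
            return STATS_MISSING;
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("numRows", "0");
        System.out.println(getBasicStat(params, "numRows"));   // a recorded zero
        System.out.println(getBasicStat(params, "totalSize")); // key absent
    }
}
```

A single catch block covers both the missing key and a malformed value; only a null map needs a separate guard.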

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-15 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168216#comment-16168216
 ] 

Chao Sun commented on HIVE-17261:
-

[~Ferd]: does [HIVE-14836|https://issues.apache.org/jira/browse/HIVE-14836] 
depend on this JIRA?

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-17261.10.patch, HIVE-17261.11.patch, 
> HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, 
> HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.7.patch, 
> HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive uses the deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see the interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> The old interface sets rowgroupoffset values, which causes the dictionary 
> filter to be skipped in Parquet.





[jira] [Updated] (HIVE-17521) Improve defaults for few runtime configs

2017-09-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17521:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. 

> Improve defaults for few runtime configs
> 
>
> Key: HIVE-17521
> URL: https://issues.apache.org/jira/browse/HIVE-17521
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17521.2.patch, HIVE-17521.patch
>
>






[jira] [Commented] (HIVE-14836) Test the predicate pushing down support for Parquet vectorization read path

2017-09-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168207#comment-16168207
 ] 

Xuefu Zhang commented on HIVE-14836:


Hi [~Ferd], thanks for your patch. I'm a little confused. The JIRA title and 
the patch itself seem to be about adding tests, but the JIRA description 
suggests a feature. Could you clarify a little bit? Thanks.

> Test the predicate pushing down support for Parquet vectorization read path
> ---
>
> Key: HIVE-14836
> URL: https://issues.apache.org/jira/browse/HIVE-14836
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: pull-request-available
> Attachments: HIVE-14836.patch
>
>
> Currently we filter blocks using predicate pushdown. We should support it 
> in the page reader as well to improve its efficiency. 





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}
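
To make the expected behavior concrete, here is a sketch of a progressively shrinking estimate under the textbook independence assumption, where each equality predicate keeps 1/NDV of its input rows. This is not the fix in the attached patches. The table row count comes from the plans above; the distinct-value counts (NDV) are assumptions chosen for illustration.

```java
public class DrillDownEstimateSketch {

    /**
     * Estimates the rows surviving a conjunction of equality predicates,
     * assuming the columns are independent and values are uniformly spread.
     */
    static long estimateRows(long tableRows, long... distinctValueCounts) {
        double rows = tableRows;
        for (long ndv : distinctValueCounts) {
            rows /= Math.max(1L, ndv); // each predicate keeps 1/NDV of its input
        }
        return Math.max(1L, Math.round(rows)); // never estimate below one row
    }

    public static void main(String[] args) {
        long tableRows = 73049L; // from the plans above
        long yearNdv = 201L;     // assumed; gives about 363 rows for d_year alone
        long monthNdv = 12L;     // assumed
        long dayNdv = 31L;       // assumed
        System.out.println(estimateRows(tableRows, yearNdv));
        System.out.println(estimateRows(tableRows, yearNdv, monthNdv));
        System.out.println(estimateRows(tableRows, yearNdv, monthNdv, dayNdv));
    }
}
```

With these assumed counts the three estimates come out as 363, 30, and 1, rather than 363 for all three queries.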





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Open  (was: Patch Available)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Open  (was: Patch Available)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Updated] (HIVE-17496) Bootstrap repl is not cleaning up staging dirs

2017-09-15 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17496:
--
Status: Patch Available  (was: Open)

> Bootstrap repl is not cleaning up staging dirs
> --
>
> Key: HIVE-17496
> URL: https://issues.apache.org/jira/browse/HIVE-17496
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17496.1.patch, HIVE-17496.2.patch, 
> HIVE-17496.3.patch
>
>
> This will put more pressure on the HDFS file limit.





[jira] [Commented] (HIVE-17537) Move Warehouse class to standalone metastore

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168078#comment-16168078
 ] 

Hive QA commented on HIVE-17537:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887207/HIVE-17537.patch

{color:green}SUCCESS:{color} +1 due to 16 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate (batchId=183)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalX 
(batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6827/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6827/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6827/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887207 - PreCommit-HIVE-Build

> Move Warehouse class to standalone metastore
> 
>
> Key: HIVE-17537
> URL: https://issues.apache.org/jira/browse/HIVE-17537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17537.patch
>
>
> Move the Warehouse class.  This is done in its own JIRA as it was somewhat 
> more involved than some of the other classes.





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168061#comment-16168061
 ] 

Ashutosh Chauhan commented on HIVE-17536:
-

The stat key may also be missing from the map, resulting in an NPE. That 
should be accounted for as well.

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167949#comment-16167949
 ] 

Hive QA commented on HIVE-17536:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887195/HIVE-17536.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 56 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[select_dummy_source] 
(batchId=239)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_table] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_precision2] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_into1] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[select_dummy_source] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats9] (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamp_literal] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_add_months] 
(batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_aes_decrypt] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_aes_encrypt] 
(batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_bitwise_shiftleft] 
(batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_bitwise_shiftright] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_bitwise_shiftrightunsigned]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_cbrt] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_crc32] (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_current_database] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_date_format] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_decode] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_factorial] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_from_utc_timestamp] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_last_day] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_levenshtein] 
(batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_first_n] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_hash] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_last_n] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_show_first_n] 
(batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_mask_show_last_n] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_md5] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_months_between] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_nullif] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_quarter] (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sha1] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sha2] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_soundex] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_substring_index] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_to_utc_timestamp] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_trunc] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_width_bucket] 
(batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_stack] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_tablesample_rows] 
(batchId=49)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[insert_into1] 
(batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)

[jira] [Commented] (HIVE-17338) Utilities.get*Tasks multiple methods duplicate code

2017-09-15 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167943#comment-16167943
 ] 

Zoltan Haindrich commented on HIVE-17338:
-

+1

> Utilities.get*Tasks multiple methods duplicate code
> ---
>
> Key: HIVE-17338
> URL: https://issues.apache.org/jira/browse/HIVE-17338
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Gergely Hajós
> Attachments: HIVE-17338.1.patch, HIVE-17338.2.patch, 
> HIVE-17338.2.patch, HIVE-17338.3.patch
>
>
> As discussed in https://github.com/apache/hive/pull/212/files, the 3 
> functions can share a more general function.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning

2017-09-15 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167925#comment-16167925
 ] 

Zoltan Haindrich commented on HIVE-15053:
-

[~Ferd] the classpath scanning did cause problems earlier. I think it is safe 
to assume that all JDBC drivers nowadays have this functionality; it was 
already covered in a doc for Java 6.

I did look around and concluded that all relevant JDBC drivers are supported; 
[see 
here|https://issues.apache.org/jira/browse/HIVE-15053?focusedCommentId=15609340=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15609340]
 

> Beeline#addlocaldriver - reduce classpath scanning
> --
>
> Key: HIVE-15053
> URL: https://issues.apache.org/jira/browse/HIVE-15053
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, 
> HIVE-15053.1.patch, HIVE-15053.2.patch, HIVE-15053.3.patch
>
>
> There is classpath scanning machinery inside {{ClassNameCompleter}}.
> I think its sole purpose is to scan for JDBC drivers (but I am not entirely 
> sure).
> If it is indeed looking for JDBC drivers, then it can possibly be removed 
> without any issues, because modern JDBC drivers usually advertise their driver 
> as a service-loadable class for {{java.sql.Driver}}:
> http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html
> Auto-Loading of JDBC Driver





[jira] [Updated] (HIVE-17541) Move testing related methods from MetaStoreUtils to some testing related utility

2017-09-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17541:

Status: Patch Available  (was: Open)

> Move testing related methods from MetaStoreUtils to some testing related 
> utility
> 
>
> Key: HIVE-17541
> URL: https://issues.apache.org/jira/browse/HIVE-17541
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17541.01.patch
>
>
> MetaStoreUtils has a very wide range of methods. Whenever I tried to do some 
> modularization related to it, it always came back as problematic :)
> The most useful observation I made is that it doesn't necessarily need the 
> {{HMSHandler}} import.





[jira] [Updated] (HIVE-17541) Move testing related methods from MetaStoreUtils to some testing related utility

2017-09-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17541:

Attachment: HIVE-17541.01.patch

#1)

* move methods out from {{MetaStoreUtils}} into {{MetaStoreTestUtils}}
* {{HMSHandler}} free
* create metastore test-jar

> Move testing related methods from MetaStoreUtils to some testing related 
> utility
> 
>
> Key: HIVE-17541
> URL: https://issues.apache.org/jira/browse/HIVE-17541
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17541.01.patch
>
>
> MetaStoreUtils has a very wide range of methods. Whenever I tried to do some 
> modularization related to it, it always came back as problematic :)
> The most useful observation I made is that it doesn't necessarily need the 
> {{HMSHandler}} import.





[jira] [Assigned] (HIVE-17541) Move testing related methods from MetaStoreUtils to some testing related utility

2017-09-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-17541:
---

Assignee: Zoltan Haindrich

> Move testing related methods from MetaStoreUtils to some testing related 
> utility
> 
>
> Key: HIVE-17541
> URL: https://issues.apache.org/jira/browse/HIVE-17541
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> MetaStoreUtils has a very wide range of methods. Whenever I tried to do some 
> modularization related to it, it always came back as problematic :)
> The most useful observation I made is that it doesn't necessarily need the 
> {{HMSHandler}} import.





[jira] [Commented] (HIVE-17530) ClassCastException when converting uniontype

2017-09-15 Thread Ratandeep Ratti (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167897#comment-16167897
 ] 

Ratandeep Ratti commented on HIVE-17530:


LGTM [~erwaman]

> ClassCastException when converting uniontype
> 
>
> Key: HIVE-17530
> URL: https://issues.apache.org/jira/browse/HIVE-17530
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 3.0.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-17530.1.patch, HIVE-17530.2.patch
>
>
> To repro:
> {noformat}
> SET hive.exec.schema.evolution = false;
> CREATE TABLE avro_orc_partitioned_uniontype (a uniontype) 
> PARTITIONED BY (b int) STORED AS ORC;
> INSERT INTO avro_orc_partitioned_uniontype PARTITION (b=1) SELECT 
> create_union(1, true, value) FROM src LIMIT 5;
> ALTER TABLE avro_orc_partitioned_uniontype SET FILEFORMAT AVRO;
> SELECT * FROM avro_orc_partitioned_uniontype;
> {noformat}
> The exception you get is:
> {code}
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.util.ArrayList cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.UnionObject
> {code}
> The issue is that StandardUnionObjectInspector was creating and returning an 
> ArrayList rather than a UnionObject.





[jira] [Commented] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning

2017-09-15 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167888#comment-16167888
 ] 

Ferdinand Xu commented on HIVE-15053:
-

Thanks [~kgyrtkirk]. The patch does avoid the classpath scan for drivers. Just as 
[~pvary] mentioned, I am not quite sure whether it is compatible with other 
drivers that are not service loadable (if any exist). Besides cleaner code, are 
there any other benefits, such as performance? 
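
One way to check the compatibility concern above is to look inside a driver jar for the standard provider file {{META-INF/services/java.sql.Driver}}, which is how JDBC 4 drivers advertise themselves to {{java.util.ServiceLoader}}. The following is only a minimal Python sketch and is not part of the patch; the helper name {{advertised_jdbc_drivers}} is hypothetical.

```python
import zipfile

# Standard provider-configuration entry used by JDBC 4 drivers.
SERVICE_ENTRY = "META-INF/services/java.sql.Driver"


def advertised_jdbc_drivers(jar_path):
    """Return the driver class names a jar advertises for java.sql.Driver.

    An empty list means the jar does not ship the provider file, i.e. the
    driver is not service loadable in the sense discussed in this thread.
    """
    with zipfile.ZipFile(jar_path) as jar:
        if SERVICE_ENTRY not in jar.namelist():
            return []
        text = jar.read(SERVICE_ENTRY).decode("utf-8")

    drivers = []
    for line in text.splitlines():
        # Provider files allow '#' comments and blank lines.
        name = line.split("#", 1)[0].strip()
        if name:
            drivers.append(name)
    return drivers
```

Calling the helper on each jar that users pass to Beeline would show which drivers, if any, still depend on classpath scanning.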

> Beeline#addlocaldriver - reduce classpath scanning
> --
>
> Key: HIVE-15053
> URL: https://issues.apache.org/jira/browse/HIVE-15053
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, 
> HIVE-15053.1.patch, HIVE-15053.2.patch, HIVE-15053.3.patch
>
>
> There is classpath scanning machinery inside {{ClassNameCompleter}}.
> I think its sole purpose is to scan for JDBC drivers (but I am not entirely 
> sure).
> If it is indeed looking for JDBC drivers, then it can possibly be removed 
> without any issues, because modern JDBC drivers usually advertise their driver 
> as a service-loadable class for {{java.sql.Driver}}:
> http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html
> Auto-Loading of JDBC Driver





[jira] [Commented] (HIVE-17495) CachedStore: prewarm improvement (avoid multiple sql calls to read partition column stats), refactoring and caching some aggregate stats

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167852#comment-16167852
 ] 

Hive QA commented on HIVE-17495:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887189/HIVE-17495.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extrapolate_part_stats_partial]
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[tunable_ndv] (batchId=43)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testDatabaseOps 
(batchId=200)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testPartitionOps 
(batchId=200)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testTableOps 
(batchId=200)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6825/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6825/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6825/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887189 - PreCommit-HIVE-Build

> CachedStore: prewarm improvement (avoid multiple sql calls to read partition 
> column stats), refactoring and caching some aggregate stats
> 
>
> Key: HIVE-17495
> URL: https://issues.apache.org/jira/browse/HIVE-17495
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-17495.1.patch, HIVE-17495.2.patch
>
>






[jira] [Updated] (HIVE-17519) Transpose column stats display

2017-09-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17519:

Status: Patch Available  (was: Open)

> Transpose column stats display
> --
>
> Key: HIVE-17519
> URL: https://issues.apache.org/jira/browse/HIVE-17519
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17519.01.patch
>
>
> Currently {{describe formatted table1 insert_num}} shows the column 
> information in a table-like format, which is very hard to read because 
> there are too many columns:
> {code}
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment            bitVector
> 
> insert_num  int                                                                                              from deserializer
> {code}
> I think it would be better to show the same information like this:
> {code}
> col_name  insert_num  
> data_type int 
> min   
> max   
> num_nulls 
> distinct_count
> avg_col_len   
> max_col_len   
> num_trues 
> num_falses
> comment   from deserializer   
> bitVector 
> {code}
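
The proposed layout amounts to printing one name/value pair per line instead of one wide row. A minimal Python sketch of that transposition follows; it is an illustration only, not the code in the patch, and the function name {{transpose_stats}} and the column width of 24 are assumptions.

```python
def transpose_stats(headers, values, width=24):
    """Render one (header, value) pair per line instead of one wide row."""
    lines = []
    for name, value in zip(headers, values):
        # Left-align the header in a fixed-width column; drop trailing padding
        # when the value is empty.
        lines.append(f"{name:<{width}}{value}".rstrip())
    return "\n".join(lines)


headers = ["col_name", "data_type", "min", "max", "num_nulls", "distinct_count",
           "avg_col_len", "max_col_len", "num_trues", "num_falses", "comment",
           "bitVector"]
values = ["insert_num", "int", "", "", "", "", "", "", "", "",
          "from deserializer", ""]

print(transpose_stats(headers, values))
```

With twelve statistics per column, the transposed form stays within a narrow terminal while the row form does not.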





[jira] [Assigned] (HIVE-17451) Cannot read decimal from avro file created with HIVE

2017-09-15 Thread Ganesh Tripathi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesh Tripathi reassigned HIVE-17451:
--

Assignee: Ganesh Tripathi

> Cannot read decimal from avro file created with HIVE
> 
>
> Key: HIVE-17451
> URL: https://issues.apache.org/jira/browse/HIVE-17451
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: liviu
>Assignee: Ganesh Tripathi
>Priority: Blocker
>
> Hi,
> When we export decimal data from a Hive managed table to a Hive Avro external 
> table (as bytes with decimal logicalType), the value from the Avro file cannot 
> be read with any other tools (e.g. avro-tools, Spark, DataStage).
> _+Scenario:+_
> *create a Hive managed table and insert a decimal record:*
> {code:java}
> create table test_decimal (col1 decimal(20,2));
> insert into table test_decimal values (3.12);
> {code}
> *create avro schema /tmp/test_decimal.avsc with below content:*
> {code:java}
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
>     "name" : "col1",
>     "type" : [ "null", {
>       "type" : "bytes",
>       "logicalType" : "decimal",
>       "precision" : 20,
>       "scale" : 2
>     } ],
>     "default" : null,
>     "columnName" : "col1",
>     "sqlType" : "2"
>   } ],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *create a Hive external table stored as Avro:*
> {code:java}
> create external table test_decimal_avro
> STORED AS AVRO
> LOCATION '/tmp/test_decimal'
> TBLPROPERTIES (
>   'avro.schema.url'='/tmp/test_decimal.avsc',
>   'orc.compress'='SNAPPY');
> {code}
> *insert data in avro external table from hive managed table:*
> {code:java}
> set hive.exec.compress.output=true;
> set hive.exec.compress.intermediate=true;
> set avro.output.codec=snappy; 
> insert overwrite table test_decimal_avro select * from test_decimal;
> {code}
> *successfully reading data from hive avro table through hive cli:*
> {code:java}
> select * from test_decimal_avro;
> OK
> 3.12
> {code}
> *the Avro schema read from the created Avro file is OK:*
> {code:java}
> hadoop jar /avro-tools.jar getschema /tmp/test_decimal/00_0
> {
>   "type" : "record",
>   "name" : "decimal_test_avro",
>   "fields" : [ {
>     "name" : "col1",
>     "type" : [ "null", {
>       "type" : "bytes",
>       "logicalType" : "decimal",
>       "precision" : 20,
>       "scale" : 2
>     } ],
>     "default" : null,
>     "columnName" : "col1",
>     "sqlType" : "2"
>   } ],
>   "tableName" : "decimal_test_avro"
> }
> {code}
> *reading data from the Avro file with avro-tools gives an 
> {color:#d04437}error{color}: got {color:#d04437}"\u00018"{color} instead of 
> the correct value:*
> {code:java}
> hadoop jar avro-tools.jar tojson /tmp/test_decimal/00_0
> {"col1":{"bytes":"\u00018"}}
> {code}
> *Reading data into a Spark DataFrame also gives an error: got 
> {color:#d04437}[01 38]{color}, and {color:#d04437}8{color} when converted to 
> string, instead of the correct "3.12" value:*
> {code:java}
> val df = sql.read.avro("/tmp/test_decimal")
> df: org.apache.spark.sql.DataFrame = [col1: binary]
> scala> df.show()
> +-------+
> |   col1|
> +-------+
> |[01 38]|
> +-------+
> scala> df.withColumn("col2", 'col1.cast("String")).select("col2").show()
> +----+
> |col2|
> +----+
> |  8|
> +----+
> {code}
> Is this a Hive bug, or is there anything else I can do in order to get correct 
> values in the Avro file created by Hive?
> Thanks,
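
For reference, the bytes in the report can be decoded by hand. The Avro specification defines the {{decimal}} logical type on {{bytes}} as the two's-complement big-endian representation of the unscaled integer, so {{[01 38]}} is 0x0138 = 312, which with scale 2 is 3.12. Also, 0x38 is the ASCII code of "8", which matches the string shown after the cast. This suggests, as an assumption rather than a confirmed diagnosis, that the file holds the expected value and that the readers display the raw bytes without applying the logical type. A minimal Python sketch of the decoding follows; {{decode_avro_decimal}} is a hypothetical helper name.

```python
from decimal import Decimal


def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode an Avro 'decimal' logical value stored as bytes."""
    # The bytes hold the unscaled value as a big-endian two's-complement integer.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# Bytes reported by Spark for the row inserted as 3.12 with scale 2.
print(decode_avro_decimal(bytes([0x01, 0x38]), scale=2))  # prints 3.12
```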





[jira] [Updated] (HIVE-17519) Transpose column stats display

2017-09-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17519:

Attachment: HIVE-17519.01.patch

#1)

* transpose stats for {{describe formatted table1 col1}}
* some cleanup
* fix column order: bitVector/comment were in reverse order
* removed an extra linefeed after "Partition Information"; it seemed out of place 

> Transpose column stats display
> --
>
> Key: HIVE-17519
> URL: https://issues.apache.org/jira/browse/HIVE-17519
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17519.01.patch
>
>
> Currently {{describe formatted table1 insert_num}} shows the column 
> information in a table-like format, which is very hard to read because 
> there are too many columns:
> {code}
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment            bitVector
> 
> insert_num  int                                                                                              from deserializer
> {code}
> I think it would be better to show the same information like this:
> {code}
> col_name  insert_num  
> data_type int 
> min   
> max   
> num_nulls 
> distinct_count
> avg_col_len   
> max_col_len   
> num_trues 
> num_falses
> comment   from deserializer   
> bitVector 
> {code}





[jira] [Commented] (HIVE-17338) Utilities.get*Tasks multiple methods duplicate code

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167769#comment-16167769
 ] 

Hive QA commented on HIVE-17338:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887301/HIVE-17338.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2]
 (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6824/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6824/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6824/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887301 - PreCommit-HIVE-Build

> Utilities.get*Tasks multiple methods duplicate code
> ---
>
> Key: HIVE-17338
> URL: https://issues.apache.org/jira/browse/HIVE-17338
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Gergely Hajós
> Attachments: HIVE-17338.1.patch, HIVE-17338.2.patch, 
> HIVE-17338.2.patch, HIVE-17338.3.patch
>
>
> As discussed in https://github.com/apache/hive/pull/212/files, the 3 
> functions can share a more general function.





[jira] [Updated] (HIVE-17512) Not use doAs if distcp privileged user same as user running hive

2017-09-15 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17512:
---
Attachment: HIVE-17512.2.patch

Rebased on master. The test failures in the previous builds also fail on Apache 
master, but the test report does not show the history of the failed tests and 
indicates that they fail as a result of the current patch. Those failures are 
not related to the code change in the patch. This rebased patch will trigger 
one more run to see if anything changes. 

> Not use doAs if distcp privileged user same as user running hive
> 
>
> Key: HIVE-17512
> URL: https://issues.apache.org/jira/browse/HIVE-17512
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17512.1.patch, HIVE-17512.2.patch
>
>






[jira] [Commented] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167684#comment-16167684
 ] 

Hive QA commented on HIVE-17535:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887176/HIVE-17535.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 915 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[mapjoin2] 
(batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[select_dummy_source] 
(batchId=239)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[join2] 
(batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[map_join] 
(batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[nested_outer_join]
 (batchId=242)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[allcolref_in_udf] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_onto_nocurrent_db]
 (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_table] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[archive_excludeHadoop20] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[archive_multi] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_explain] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_1] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join10] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join11] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join12] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join13] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join16] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join22] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join24] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join27] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join32] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join33] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values]
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats2] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask]
 (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_10] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_11] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_12] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_14] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_15] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_1] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_3] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_4] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_5] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_7] 
(batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avrotblsjoin] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_if_with_path_filter]
 (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] 

[jira] [Commented] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning

2017-09-15 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167673#comment-16167673
 ] 

Peter Vary commented on HIVE-15053:
---

[~kgyrtkirk]: Looking at your patch, I like how it simplifies the code. My only 
concern is that it can be an incompatible change if someone uses a driver which 
is not service loadable. I would value [~Ferd]'s opinion on this, since he is 
the original author of HIVE-9302, which introduced this feature.

Thanks,
Peter

> Beeline#addlocaldriver - reduce classpath scanning
> --
>
> Key: HIVE-15053
> URL: https://issues.apache.org/jira/browse/HIVE-15053
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, 
> HIVE-15053.1.patch, HIVE-15053.2.patch, HIVE-15053.3.patch
>
>
> There is classpath scanning machinery inside {{ClassNameCompleter}}.
> I think its sole purpose is to scan for JDBC drivers (but I am not entirely 
> sure).
> If it is indeed looking for JDBC drivers, then it can possibly be removed 
> without any issues, because modern JDBC drivers usually advertise their driver 
> as a service-loadable class for {{java.sql.Driver}}:
> http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html
> Auto-Loading of JDBC Driver





[jira] [Commented] (HIVE-17338) Utilities.get*Tasks multiple methods duplicate code

2017-09-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167616#comment-16167616
 ] 

Gergely Hajós commented on HIVE-17338:
--

[~kgyrtkirk] I've deleted too much. Corrected again!

> Utilities.get*Tasks multiple methods duplicate code
> ---
>
> Key: HIVE-17338
> URL: https://issues.apache.org/jira/browse/HIVE-17338
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Gergely Hajós
> Attachments: HIVE-17338.1.patch, HIVE-17338.2.patch, 
> HIVE-17338.2.patch, HIVE-17338.3.patch
>
>
> As discussed in https://github.com/apache/hive/pull/212/files, the 3 
> functions can share a more general function.





[jira] [Commented] (HIVE-17527) Support replication for rename/move table across database

2017-09-15 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167615#comment-16167615
 ] 

anishek commented on HIVE-17527:


+1 

cc [~thejas]/[~daijy]

> Support replication for rename/move table across database
> -
>
> Key: HIVE-17527
> URL: https://issues.apache.org/jira/browse/HIVE-17527
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17527.01.patch
>
>
> Rename/move table across database should be supported for replication. The 
> scenario is as follows.
> 1. Create 2 databases (db1 and db2) in source cluster.
> 2. Create the table db1.tbl1.
> 3. Run bootstrap replication for db1 and db2 to target cluster.
> 4. Rename db1.tbl1 to db2.tbl1 in source.
> 5. Run incremental replication for both db1 and db2.
> - The db1 dump misses the rename table operation, as no event is generated for 
> db1. So the table still exists after load.
> - The db2 load skips the rename event, as the source table is missing in the target.
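
The scenario above can be reproduced with a toy model. The Python sketch below is not Hive's replication code. It assumes that a rename event is logged under the destination database only, that each per-database dump contains only the events logged for that database, and that the load for a database looks for the old table inside that same database. The names {{incremental_load}}, {{logged_in}}, {{old}} and {{new}} are hypothetical.

```python
def incremental_load(target, events):
    """Toy model of per-database incremental replication.

    target: dict mapping database name -> set of table names on the target.
    events: list of rename events, each logged under a single database.
    """
    for db in sorted(target):
        # Assumption: the dump for `db` only contains events logged under `db`.
        for event in (e for e in events if e["logged_in"] == db):
            old_table = event["old"]
            new_db, new_table = event["new"]
            # Assumption: the load for `db` looks for the old table inside `db`.
            if old_table not in target[db]:
                continue  # rename skipped: nothing to rename in this database
            target[db].remove(old_table)
            target[new_db].add(new_table)
    return target


# Steps 1-3: after bootstrap the target mirrors the source.
target = {"db1": {"tbl1"}, "db2": set()}
# Step 4: the rename db1.tbl1 -> db2.tbl1 is logged under db2 only.
events = [{"logged_in": "db2", "old": "tbl1", "new": ("db2", "tbl1")}]
# Step 5: incremental replication of both databases leaves the target unchanged.
print(incremental_load(target, events))
```

Under these assumptions the target still has db1.tbl1 and no db2.tbl1 after the incremental run, while the source has the opposite, which matches the two bullet points above.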





[jira] [Updated] (HIVE-17338) Utilities.get*Tasks multiple methods duplicate code

2017-09-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Hajós updated HIVE-17338:
-
Attachment: HIVE-17338.3.patch

> Utilities.get*Tasks multiple methods duplicate code
> ---
>
> Key: HIVE-17338
> URL: https://issues.apache.org/jira/browse/HIVE-17338
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Gergely Hajós
> Attachments: HIVE-17338.1.patch, HIVE-17338.2.patch, 
> HIVE-17338.2.patch, HIVE-17338.3.patch
>
>
> As discussed in https://github.com/apache/hive/pull/212/files, the 3 
> functions can share a more general function.





[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage

2017-09-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167583#comment-16167583
 ] 

Hive QA commented on HIVE-15665:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12887170/HIVE-15665.12.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ptf_matchpath] (batchId=242)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6822/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6822/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6822/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12887170 - PreCommit-HIVE-Build

> LLAP: OrcFileMetadata objects in cache can impact heap usage
> 
>
> Key: HIVE-15665
> URL: https://issues.apache.org/jira/browse/HIVE-15665
> Project: Hive
> Issue Type: Improvement
> Components: llap
> Reporter: Rajesh Balamohan
> Assignee: Sergey Shelukhin
> Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, 
> HIVE-15665.03.patch, HIVE-15665.04.patch, HIVE-15665.05.patch, 
> HIVE-15665.06.patch, HIVE-15665.07.patch, HIVE-15665.08.patch, 
> HIVE-15665.09.patch, HIVE-15665.10.patch, HIVE-15665.11.patch, 
> HIVE-15665.12.patch, HIVE-15665.patch
>
>
> OrcFileMetadata internally holds file stats, stripe stats, etc., which are 
> allocated on the heap. On large data sets, this can affect heap usage and 
> the memory usage of the different executors in LLAP.





[jira] [Updated] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2017-09-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17313:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thank you, [~olegd], for fixing it!

> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
> Issue Type: Bug
> Reporter: Oleg Danilov
> Assignee: Oleg Danilov
> Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-17313.patch
>
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.TextConverter(
>         inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
>     return new PrimitiveObjectInspectorConverter.StringConverter(
>         inputOI);
>   }
> case CHAR:
> {code}
> In practice this works correctly, since outputOI is always an instance of 
> either WritableStringObjectInspector or JavaStringObjectInspector, but it 
> would be better to rewrite this case to rule out a possible fall-through.
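
To illustrate the hazard described above, here is a minimal, self-contained sketch. The inspector types, the Category enum and the string return values are hypothetical stand-ins rather than Hive's classes, and the "safe" variant shows only one possible way to close the gap (failing fast), not necessarily what the committed patch does.

```java
// Hypothetical stand-ins for the inspector types named in the issue.
interface ObjectInspector {}
class WritableStringOI implements ObjectInspector {}
class JavaStringOI implements ObjectInspector {}
class OtherOI implements ObjectInspector {}

class FallThroughSketch {
    enum Category { STRING, CHAR }

    // Mirrors the reported shape: if neither branch matches,
    // control silently falls through into the CHAR case.
    static String risky(Category category, ObjectInspector outputOI) {
        switch (category) {
        case STRING:
            if (outputOI instanceof WritableStringOI) {
                return "TextConverter";
            } else if (outputOI instanceof JavaStringOI) {
                return "StringConverter";
            }
            // No return or throw here, so execution continues below.
        case CHAR:
            return "CharConverter";
        default:
            throw new IllegalArgumentException("Unsupported category: " + category);
        }
    }

    // One way to rule out the fall-through: fail fast on an unexpected inspector.
    static String safe(Category category, ObjectInspector outputOI) {
        switch (category) {
        case STRING:
            if (outputOI instanceof WritableStringOI) {
                return "TextConverter";
            } else if (outputOI instanceof JavaStringOI) {
                return "StringConverter";
            }
            throw new IllegalStateException(
                "Unexpected string inspector: " + outputOI.getClass().getName());
        case CHAR:
            return "CharConverter";
        default:
            throw new IllegalArgumentException("Unsupported category: " + category);
        }
    }
}
```

With an unexpected inspector, risky(...) quietly returns the CHAR converter, while safe(...) raises an exception that points at the real problem.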





[jira] [Commented] (HIVE-16898) Validation of source file after distcp in repl load

2017-09-15 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167575#comment-16167575
 ] 

anishek commented on HIVE-16898:


[~daijy], can you please provide a pull request for this?


> Validation of source file after distcp in repl load 
> 
>
> Key: HIVE-16898
> URL: https://issues.apache.org/jira/browse/HIVE-16898
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: Daniel Dai
> Fix For: 3.0.0
>
> Attachments: HIVE-16898.1.patch
>
>
> The source file can change between the time the source and destination paths 
> for distcp are decided and the time distcp is invoked, so distcp might copy 
> the wrong file to the destination. We should therefore add an additional 
> check on the checksum of the source file path after distcp finishes, to make 
> sure the path did not change during the copy. If it has changed, take 
> additional steps to delete the previous file on the destination, copy the new 
> source, and repeat the same process until the correct file is copied.
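
The copy, verify and retry loop described above can be sketched as follows. This is only an illustration under stated assumptions: a local Files.copy stands in for distcp, SHA-256 stands in for whatever checksum the replication code uses, and the class name, method names and maxAttempts bound are hypothetical (the description itself says to repeat until the correct file is copied).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class VerifiedCopy {
    // Checksum of the whole file; SHA-256 is an assumption for this sketch.
    static byte[] checksum(Path file) throws IOException, NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
    }

    // Copies source to dest and re-checks the source checksum afterwards.
    // Returns the number of attempts that were needed.
    static int copyUntilStable(Path source, Path dest, int maxAttempts)
            throws IOException, NoSuchAlgorithmException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            byte[] before = checksum(source);
            // Stand-in for the distcp invocation.
            Files.copy(source, dest, StandardCopyOption.REPLACE_EXISTING);
            byte[] after = checksum(source);
            if (Arrays.equals(before, after)) {
                return attempt; // source did not change while copying
            }
            // Source changed: discard the possibly stale copy and try again.
            Files.deleteIfExists(dest);
        }
        throw new IOException(
            "Source kept changing after " + maxAttempts + " attempts: " + source);
    }
}
```

Bounding the retries keeps a constantly rewritten source from looping forever; whether a bound is wanted here is a design choice the description leaves open.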





[jira] [Assigned] (HIVE-17539) User impersonation failure is not propagated by server as a failure to client

2017-09-15 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-17539:
--


> User impersonation failure is not propagated by server as a failure to client
> -
>
> Key: HIVE-17539
> URL: https://issues.apache.org/jira/browse/HIVE-17539
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: anishek
> Priority: Critical
> Fix For: 3.0.0
>
>
> As part of HIVE-17512 we fixed distCp user impersonation for the case where 
> doAs = false and the configured "hive.distcp.privileged.doAs" user is the 
> same as the user running hiveServer. However, if the source changes in the 
> HIVE-17512 patch are not applied and the corresponding HIVE-17512 test is run 
> against the older code, an impersonation error appears in the Hive server 
> logs, yet the driver returns an "exitValue" of 0. This is wrong: since the 
> copy failed, we should return an appropriate error code.
> Also, since the table is created and only the data is missing, the 
> last.repl.id on the table may hold the latest value even though the data is 
> missing. Coupled with no error being returned to the client, this can lead 
> to serious replication inconsistencies.
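
As a generic illustration of the symptom (not a claim about where Hive actually loses the error), the sketch below contrasts a runner that logs a copy failure but still reports 0 with one that reports a non-zero code. All names here are hypothetical.

```java
import java.util.concurrent.Callable;

class ExitCodeSketch {
    // Problematic shape: the failure is logged, but the caller still sees 0.
    static int runSwallowing(Callable<Void> copy) {
        try {
            copy.call();
        } catch (Exception e) {
            System.err.println("copy failed: " + e.getMessage());
        }
        return 0;
    }

    // Propagating shape: a failed copy yields a non-zero exit value.
    static int runPropagating(Callable<Void> copy) {
        try {
            copy.call();
            return 0;
        } catch (Exception e) {
            System.err.println("copy failed: " + e.getMessage());
            return 1;
        }
    }
}
```

A client that checks only the exit value cannot tell the first variant's failure from a success, which is how a table could end up with an up-to-date last.repl.id but no data.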




