[jira] [Commented] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994302#comment-15994302
 ] 

Gopal V commented on HIVE-16570:


Waiting for more accurate profiles from the earlier patch - so far the "could be 
faster" has not been identified as being related to this patch.

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16570.patch
>
>
> Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994291#comment-15994291
 ] 

Hive QA commented on HIVE-16570:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866076/HIVE-16570.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10638 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30] (batchId=148)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5012/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5012/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866076 - PreCommit-HIVE-Build

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16570.patch
>
>
> Some things could be faster.





[jira] [Commented] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994266#comment-15994266
 ] 

Pengcheng Xiong commented on HIVE-16485:


The query and plan can be seen in the attachments. [~pallavkul], can you verify 
it using the JSON in the attachment? cc'ing [~ashutoshc], [~hagleitn]

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485.04.patch, HIVE-16485-disableMasking, plan, 
> query
>
>






[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Patch Available  (was: Open)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485.04.patch, HIVE-16485-disableMasking, plan, 
> query
>
>






[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Open  (was: Patch Available)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485.04.patch, HIVE-16485-disableMasking, plan, 
> query
>
>






[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: HIVE-16485.04.patch

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485.04.patch, HIVE-16485-disableMasking, plan, 
> query
>
>






[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: plan

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485-disableMasking, plan, query
>
>






[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: query

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485-disableMasking, plan, query
>
>






[jira] [Commented] (HIVE-16143) Improve msck repair batching

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994263#comment-15994263
 ] 

Hive QA commented on HIVE-16143:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866074/HIVE-16143.06.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10650 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[table_nonprintable] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30] (batchId=148)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5011/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5011/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5011/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866074 - PreCommit-HIVE-Build

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch, HIVE-16143.02.patch, 
> HIVE-16143.03.patch, HIVE-16143.04.patch, HIVE-16143.05.patch, 
> HIVE-16143.06.patch
>
>
> Currently, the {{msck repair table}} command batches the partitions it creates 
> in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There are a couple of possible 
> improvements to this batching logic:
> {noformat}
> int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
> if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>   int counter = 0;
>   for (CheckResult.PartitionResult part : partsNotInMs) {
>     counter++;
>     apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>     repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>         + ':' + part.getPartitionName());
>     if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>       db.createPartitions(apd);
>       apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>     }
>   }
> } else {
>   for (CheckResult.PartitionResult part : partsNotInMs) {
>     apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>     repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>         + ':' + part.getPartitionName());
>   }
>   db.createPartitions(apd);
> }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value to make the command run faster, but 
> end up with worse performance because the code falls back to adding partitions 
> one by one. Users are then expected to find a tuned batch size that works well 
> for their environment. The code could handle this situation better by 
> exponentially decaying the batch size instead of falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one, irrespective of whether some of them were already added 
> successfully. If we need to fall back to one by one, we should at least skip 
> the ones which we know for sure are already added.
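The exponential-decay idea in point 1, combined with the bookkeeping from point 2, could look roughly like the sketch below. This is a minimal, self-contained illustration, not Hive's actual metastore API: the {{Metastore}} interface, the class name, and the string-typed partitions are all stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of points 1 and 2: halve the batch size on failure instead of
// falling back to one-by-one, and never retry partitions that a successful
// batch has already committed. Metastore is a hypothetical stand-in.
public class BatchedPartitionAdder {
    public interface Metastore {
        void createPartitions(List<String> partitions) throws Exception;
    }

    /** Returns the partitions that were successfully added, in order. */
    public static List<String> addInBatches(Metastore ms, List<String> parts, int batchSize) {
        List<String> added = new ArrayList<>();
        int pos = 0;
        int size = Math.max(1, batchSize);
        while (pos < parts.size()) {
            List<String> batch = new ArrayList<>(
                parts.subList(pos, Math.min(pos + size, parts.size())));
            try {
                ms.createPartitions(batch);
                added.addAll(batch);   // committed batches are never retried (point 2)
                pos += batch.size();
            } catch (Exception e) {
                if (size == 1) {
                    pos++;             // a partition that fails even alone is skipped
                } else {
                    size /= 2;         // exponential decay instead of one-by-one (point 1)
                }
            }
        }
        return added;
    }
}
```

With a decaying batch size the worst case degrades gracefully toward single-partition adds, instead of jumping straight there after one oversized batch fails.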





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-05-02 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994247#comment-15994247
 ] 

Rui Li commented on HIVE-16047:
---

Patch reverted in master and branch-2.3.

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931





[jira] [Commented] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994225#comment-15994225
 ] 

Hive QA commented on HIVE-16568:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866062/HIVE-16568.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10639 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.llap.TestRow.testUsage (batchId=279)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5010/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5010/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5010/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866062 - PreCommit-HIVE-Build

> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16568.1.patch
>
>
> Currently it supports just primitive types





[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-05-02 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994215#comment-15994215
 ] 

Rui Li commented on HIVE-16047:
---

Hi [~spena], I wasn't aware the patch is not in branch-2.2. I'll revert the 
changes in master and branch-2.3. Meanwhile, it'd be great if the HDFS team 
could quash the log and make sure future Hadoop versions work with Hive 2.3. 
Otherwise we'll be reverting for no good reason.

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931





[jira] [Commented] (HIVE-16500) Remove parser references from PrivilegeType

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994184#comment-15994184
 ] 

Hive QA commented on HIVE-16500:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866054/HIVE-16500.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 106 failed/errored test(s), 10638 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition_authorization] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_1] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_1_sql_std] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_3] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_4] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_5] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_6] (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_7] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_8] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_9] (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_admin_almighty1] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_alter_table_exchange_partition] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_create_temp_table] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_delete] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_grant_option_role] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_grant_public_role] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_grant_table_priv] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_insert] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_load] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_non_id] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_revoke_table_priv] (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_show_grant] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_update] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_1] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_1] (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_2] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_3] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_4] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_21_export_authsuccess] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_22_import_exist_authsuccess] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_23_import_part_authsuccess] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_24_import_nonexist_authsuccess] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_25_export_parentpath_has_inaccessible_children] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auth] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[keyword_1] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_exist_part_authsuccess] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_nonpart_authsuccess] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_part_authsuccess] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_authorization_sqlstd] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[view_authorization_sqlstd] (batchId=47)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[authorization_2] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)

[jira] [Assigned] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16572:
--


> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> The column stats for the table sample_pt partition (dummy=1) is as following:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> The COLUMN_STATS flags in the partition description still read true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.





[jira] [Commented] (HIVE-16562) Issues with nullif / fetch task

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994148#comment-15994148
 ] 

Hive QA commented on HIVE-16562:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866051/HIVE-16562.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10639 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5008/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5008/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5008/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866051 - PreCommit-HIVE-Build

> Issues with nullif / fetch task
> ---
>
> Key: HIVE-16562
> URL: https://issues.apache.org/jira/browse/HIVE-16562
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16562.1.patch
>
>
> HIVE-13555 adds support for nullif. I'm encountering issues with nullif on 
> master (3.0.0-SNAPSHOT rdac3786d86462e4d08d62d23115e6b7a3e534f5d).
> Cluster-side jobs work fine but client-side ones don't.
> Consider these two tables:
> e011_02:
> Columns c1 = float, c2 = double
> 1.0   1.0
> 1.5   1.5
> 2.0   2.0
> test:
> Columns c1 = int, c2 = int
> Data:
> 1 1
> 2 2
> And this query:
> select nullif(c1, c2) from e011_02;
> With e011_02 I get:
> {code}
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   ... 13 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyFloat cannot be cast to org.apache.hadoop.io.FloatWritable
>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector.get(WritableFloatObjectInspector.java:36)
>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.comparePrimitiveObjects(PrimitiveObjectInspectorUtils.java:412)
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDFNullif.evaluate(GenericUDFNullif.java:93)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>

[jira] [Commented] (HIVE-16550) Semijoin Hints should be able to skip the optimization if needed.

2017-05-02 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994147#comment-15994147
 ] 

Jason Dere commented on HIVE-16550:
---

+1 pending tests

> Semijoin Hints should be able to skip the optimization if needed.
> -
>
> Key: HIVE-16550
> URL: https://issues.apache.org/jira/browse/HIVE-16550
> Project: Hive
>  Issue Type: Improvement
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16550.1.patch, HIVE-16550.2.patch, 
> HIVE-16550.3.patch
>
>
> Currently semi join hints are designed to enforce a particular semi join; 
> however, it should also be possible to skip the optimization altogether in a 
> query using hints.





[jira] [Updated] (HIVE-16550) Semijoin Hints should be able to skip the optimization if needed.

2017-05-02 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16550:
--
Attachment: HIVE-16550.3.patch

Addressed Jason's comments. Added some more tests.

> Semijoin Hints should be able to skip the optimization if needed.
> -
>
> Key: HIVE-16550
> URL: https://issues.apache.org/jira/browse/HIVE-16550
> Project: Hive
>  Issue Type: Improvement
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16550.1.patch, HIVE-16550.2.patch, 
> HIVE-16550.3.patch
>
>
> Currently semi join hints are designed to enforce a particular semi join; 
> however, it should also be possible to skip the optimization altogether in a 
> query using hints.





[jira] [Commented] (HIVE-16562) Issues with nullif / fetch task

2017-05-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994137#comment-15994137
 ] 

Ashutosh Chauhan commented on HIVE-16562:
-

Patch looks good. Can you also add a test with mixed types, e.g., int and float?
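For readers following along: the failure mode in the quoted stack trace is a blind cast inside the object inspector. The sketch below mimics it with illustrative stand-in classes only (these are not Hive's real {{LazyFloat}}, {{FloatWritable}}, or inspector implementations): an inspector that assumes the writable representation throws ClassCastException when the fetch-task path hands it a lazily deserialized object instead.

```java
// Illustrative stand-ins only - not Hive's actual classes.
class LazyFloat {                 // stand-in for serde2.lazy.LazyFloat
    final float value;
    LazyFloat(float value) { this.value = value; }
}

class FloatWritable {             // stand-in for org.apache.hadoop.io.FloatWritable
    final float value;
    FloatWritable(float value) { this.value = value; }
}

class WritableFloatInspector {
    // The blind cast assumes every input uses the writable representation...
    float get(Object o) {
        return ((FloatWritable) o).value;  // ...so a LazyFloat here throws ClassCastException
    }
}
```

The fix direction is for the UDF to read values through the object inspector actually associated with each input (or convert the inputs first) rather than assuming a particular concrete representation.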

> Issues with nullif / fetch task
> ---
>
> Key: HIVE-16562
> URL: https://issues.apache.org/jira/browse/HIVE-16562
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16562.1.patch
>
>
> HIVE-13555 adds support for nullif. I'm encountering issues with nullif on 
> master (3.0.0-SNAPSHOT rdac3786d86462e4d08d62d23115e6b7a3e534f5d).
> Cluster-side jobs work fine but client-side ones don't.
> Consider these two tables:
> e011_02:
> Columns c1 = float, c2 = double
> 1.0   1.0
> 1.5   1.5
> 2.0   2.0
> test:
> Columns c1 = int, c2 = int
> Data:
> 1 1
> 2 2
> And this query:
> select nullif(c1, c2) from e011_02;
> With e011_02 I get:
> {code}
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   ... 13 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyFloat cannot be cast to org.apache.hadoop.io.FloatWritable
>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector.get(WritableFloatObjectInspector.java:36)
>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.comparePrimitiveObjects(PrimitiveObjectInspectorUtils.java:412)
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDFNullif.evaluate(GenericUDFNullif.java:93)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   ... 18 more
> {code}
> With 
> select nullif(c1, c2) from test;
> I get:
> {code}
> 2017-05-01T03:32:19,905 ERROR [cbaf5380-5b06-4531-aeb9-524c62314a46 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at 
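Both failures above occur while evaluating NULLIF through ObjectInspectors. For reference, the expected NULLIF semantics can be modeled in a few lines (a Python sketch of the SQL behavior, not Hive's GenericUDFNullif implementation):

```python
def nullif(a, b):
    """SQL NULLIF(a, b): NULL when the arguments compare equal, else a.

    Equivalent to CASE WHEN a = b THEN NULL ELSE a END; a NULL first
    argument propagates as NULL because the comparison is unknown.
    """
    if a is None:
        return None
    if b is not None and a == b:
        return None
    return a

# Rows from the e011_02 example above: c1 float, c2 double, all pairs equal.
rows = [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0)]
print([nullif(c1, c2) for c1, c2 in rows])  # [None, None, None]
```

The reported bug is not in these semantics but in the fetch-task path, where lazily deserialized values (LazyFloat) reach an inspector that expects Writable objects.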

[jira] [Commented] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994136#comment-15994136
 ] 

Jason Dere commented on HIVE-16568:
---

This might increase memory overhead, certainly in the case of complex types 
where the list/map/struct needs to be traversed/converted. But I'm not sure how 
else to provide the complex types, unless we make clients use the 
ObjectInspector interface to access the various fields of the complex types.

Will look into doing more tests.

> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16568.1.patch
>
>
> Currently just supports primitive types





[jira] [Commented] (HIVE-13583) E061-14: Search Conditions

2017-05-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994133#comment-15994133
 ] 

Ashutosh Chauhan commented on HIVE-13583:
-

Patch looks good. Per the Hive QA report, some tests need to be updated.
Can you also please add the following tests:
{code}
select NULL is true, NULL is not true, NULL is false, NULL is not false from 
t1;
{code}
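For context, the boolean tests requested above follow SQL three-valued logic. Modeling NULL as None, a sketch (independent of Hive's implementation) behaves like this:

```python
def is_true(x):
    # "x IS TRUE" is a test, not a comparison: it yields TRUE only when x
    # is TRUE, and FALSE for both FALSE and NULL (it never yields NULL).
    return x is True

def is_not_true(x):
    return not is_true(x)

def is_false(x):
    return x is False

def is_not_false(x):
    return not is_false(x)

# NULL is true -> False, NULL is not true -> True,
# NULL is false -> False, NULL is not false -> True
print([f(None) for f in (is_true, is_not_true, is_false, is_not_false)])
```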

> E061-14: Search Conditions
> --
>
> Key: HIVE-13583
> URL: https://issues.apache.org/jira/browse/HIVE-13583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13583.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Support for various forms of search conditions are mandatory in the SQL 
> standard. For example, " is not true;" Hive should support those 
> forms mandated by the standard.





[jira] [Updated] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-05-02 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-16465:

Fix Version/s: 2.3.0

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch, 
> HIVE-16465-branch-2.3.patch
>
>
> A NullPointerException occurs when vectorization is enabled for the Parquet 
> file format. It is caused by a null InputSplit.
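The defensive pattern for an NPE like this can be sketched as follows (hypothetical reader names, not the actual patch):

```python
def create_record_reader(split):
    # Hypothetical guard: fall back to the row-mode reader when no
    # InputSplit is available, instead of dereferencing None and failing.
    if split is None:
        return "row-mode-reader"
    return "vectorized-parquet-reader"

print(create_record_reader(None))       # row-mode-reader
print(create_record_reader("split-0"))  # vectorized-parquet-reader
```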





[jira] [Commented] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-05-02 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994116#comment-15994116
 ] 

Ferdinand Xu commented on HIVE-16465:
-

Committed to the branch 2.3. Thanks [~pxiong].

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch, 
> HIVE-16465-branch-2.3.patch
>
>
> A NullPointerException occurs when vectorization is enabled for the Parquet 
> file format. It is caused by a null InputSplit.





[jira] [Commented] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994114#comment-15994114
 ] 

Pengcheng Xiong commented on HIVE-16465:


Sure. I saw you already ran ptest on 2.3. Thus, it is safe to cherry-pick. 

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch, 
> HIVE-16465-branch-2.3.patch
>
>
> A NullPointerException occurs when vectorization is enabled for the Parquet 
> file format. It is caused by a null InputSplit.





[jira] [Comment Edited] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994114#comment-15994114
 ] 

Pengcheng Xiong edited comment on HIVE-16465 at 5/3/17 12:44 AM:
-

Sure. I saw you already ran ptest on 2.3. Thus, it is safe to cherry-pick. Make 
sure you change the fix version as well after you cherry-pick. Thanks!


was (Author: pxiong):
Sure. I saw you already run ptest on 2.3. Thus, it is safe to cherry-pick. 

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch, 
> HIVE-16465-branch-2.3.patch
>
>
> A NullPointerException occurs when vectorization is enabled for the Parquet 
> file format. It is caused by a null InputSplit.





[jira] [Commented] (HIVE-13583) E061-14: Search Conditions

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994111#comment-15994111
 ] 

Hive QA commented on HIVE-13583:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866050/HIVE-13583.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10639 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_isnull_isnotnull] 
(batchId=36)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.ql.parse.TestIUD.testDeleteWithWhere (batchId=260)
org.apache.hadoop.hive.ql.parse.TestIUD.testStandardInsertIntoTable 
(batchId=260)
org.apache.hadoop.hive.ql.parse.TestIUD.testUpdateWithWhereSingleSet 
(batchId=260)
org.apache.hadoop.hive.ql.parse.TestIUD.testUpdateWithWhereSingleSetExpr 
(batchId=260)
org.apache.hadoop.hive.ql.parse.TestMergeStatement.test1 (batchId=259)
org.apache.hadoop.hive.ql.parse.TestMergeStatement.test4 (batchId=259)
org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish
 (batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5007/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5007/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5007/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866050 - PreCommit-HIVE-Build

> E061-14: Search Conditions
> --
>
> Key: HIVE-13583
> URL: https://issues.apache.org/jira/browse/HIVE-13583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13583.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Support for various forms of search conditions are mandatory in the SQL 
> standard. For example, " is not true;" Hive should support those 
> forms mandated by the standard.





[jira] [Commented] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-05-02 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994110#comment-15994110
 ] 

Ferdinand Xu commented on HIVE-16465:
-

Hi [~pxiong], can we include this patch in release 2.3? This issue makes Parquet 
vectorization unusable due to an NPE. It was introduced by HIVE-12767. Thanks.

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch, 
> HIVE-16465-branch-2.3.patch
>
>
> A NullPointerException occurs when vectorization is enabled for the Parquet 
> file format. It is caused by a null InputSplit.





[jira] [Resolved] (HIVE-14529) Union All query returns incorrect results.

2017-05-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-14529.
-
   Resolution: Duplicate
Fix Version/s: 2.1.1
   2.2.0

> Union All query returns incorrect results.
> --
>
> Key: HIVE-14529
> URL: https://issues.apache.org/jira/browse/HIVE-14529
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.0
> Environment: Hadoop 2.6
> Hive 2.1
>Reporter: wenhe li
> Fix For: 2.2.0, 2.1.1
>
>
> create table dw_tmp.l_test1 (id bigint,val string,trans_date string) row 
> format delimited fields terminated by ' ' ;
> create table dw_tmp.l_test2 (id bigint,val string,trans_date string) row 
> format delimited fields terminated by ' ' ;  
> select * from dw_tmp.l_test1;
> 1   table_1  2016-08-11
> select * from dw_tmp.l_test2;
> 2   table_2  2016-08-11
> -- right like this
> select 
> id,
> 'table_1' ,
> trans_date
> from dw_tmp.l_test1
> union all
> select 
> id,
> val,
> trans_date
> from dw_tmp.l_test2 ;
> 1   table_1 2016-08-11
> 2   table_2 2016-08-11
> -- incorrect
> select 
> id,
> 999,
> 'table_1' ,
> trans_date
> from dw_tmp.l_test1
> union all
> select 
> id,
> 999,
> val,
> trans_date
> from dw_tmp.l_test2 ;
> 1   999 table_1 2016-08-11
> 2   999 table_1 2016-08-11 <-- here is wrong
> -- incorrect
> select 
> id,
> 999,
> 666,
> 'table_1' ,
> trans_date
> from dw_tmp.l_test1
> union all
> select 
> id,
> 999,
> 666,
> val,
> trans_date
> from dw_tmp.l_test2 ;
> 1   999 666 table_1 2016-08-11
> 2   999 666 table_1 2016-08-11 <-- here is wrong
> -- right
> select 
> id,
> 999,
> 'table_1' ,
> trans_date,
> '2016-11-11'
> from dw_tmp.l_test1
> union all
> select 
> id,
> 999,
> val,
> trans_date,
> trans_date
> from dw_tmp.l_test2 ;
> 1   999 table_1 2016-08-11  2016-11-11
> 2   999 table_2 2016-08-11  2016-08-11
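The wrong results are consistent with the second branch's val column being folded into the first branch's matching literals. A plain-Python sketch of correct versus (assumed) buggy evaluation of the first incorrect query:

```python
t1 = [(1, "table_1", "2016-08-11")]
t2 = [(2, "table_2", "2016-08-11")]

def union_all_correct():
    # Each UNION ALL branch must be evaluated independently, as SQL requires.
    branch1 = [(id_, 999, "table_1", d) for id_, _val, d in t1]
    branch2 = [(id_, 999, val, d) for id_, val, d in t2]
    return branch1 + branch2

def union_all_buggy():
    # Assumed buggy plan: because the surrounding constants (999) line up,
    # the second branch's val column is replaced by the first branch's literal.
    branch1 = [(id_, 999, "table_1", d) for id_, _val, d in t1]
    branch2 = [(id_, 999, "table_1", d) for id_, _val, d in t2]
    return branch1 + branch2

print(union_all_correct()[1])  # (2, 999, 'table_2', '2016-08-11')
print(union_all_buggy()[1])    # (2, 999, 'table_1', '2016-08-11') <- reported bug
```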





[jira] [Commented] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994106#comment-15994106
 ] 

Gunther Hagleitner commented on HIVE-16568:
---

I'm +1 on this, but I think it would be good to expand testing a bit. If I read 
it right, you're basically checking only a single row with a handful of complex 
objects. Some more nesting and variance in the data (nulls, for instance) might 
be good.

Also, a question: with the switch from writable to object in row, will this 
increase the memory overhead a lot if people cache a bunch of them?

> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16568.1.patch
>
>
> Currently just supports primitive types





[jira] [Commented] (HIVE-16558) In the hiveserver2.jsp page, when you click Drilldown to view the details of the Closed Queries, the Chinese show garbled

2017-05-02 Thread ZhangBing Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994101#comment-15994101
 ] 

ZhangBing Lin commented on HIVE-16558:
--

[~nzhang], can you please take a quick look?

> In the hiveserver2.jsp page, when you click Drilldown to view the details of 
> the Closed Queries, the Chinese show garbled
> -
>
> Key: HIVE-16558
> URL: https://issues.apache.org/jira/browse/HIVE-16558
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Fix For: 3.0.0
>
> Attachments: HIVE-16558.1.patch
>
>
> In QueryProfileImpl.jamon, we see the following settings:
> [HTML template snippet; tags stripped by the mail archive. It sets the page 
> title to "HiveServer2" but declares no response charset.]
> So we should set the response charset to UTF-8, which avoids garbled Chinese 
> (and other languages). Please check it!
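The garbling is a classic charset mismatch: UTF-8 bytes decoded with a single-byte default such as ISO-8859-1. A quick Python illustration (unrelated to the Jamon template itself):

```python
text = "查询"  # Chinese text, e.g. in a query string shown on the page
utf8_bytes = text.encode("utf-8")

# A browser that falls back to a single-byte charset, because the response
# declared none, decodes each UTF-8 byte as its own character:
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled != text)                     # True: mojibake
# Declaring charset=utf-8 lets the bytes round-trip correctly:
print(utf8_bytes.decode("utf-8") == text)  # True
```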





[jira] [Updated] (HIVE-12188) DoAs does not work properly in non-kerberos secured HS2

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12188:
---
Component/s: Security

> DoAs does not work properly in non-kerberos secured HS2
> ---
>
> Key: HIVE-12188
> URL: https://issues.apache.org/jira/browse/HIVE-12188
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12188.patch
>
>
> The case with the following settings is valid, but it still does not work 
> correctly in the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs= true
> hive.metastore.sasl.enabled=true (with HMS Kerberos enabled)
> ==
> Currently, HS2 can fetch a delegation token from a Kerberos-secured HMS only 
> when HS2 itself is also Kerberos-secured.





[jira] [Updated] (HIVE-12965) Insert overwrite local directory should perserve the overwritten directory permission

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12965:
---
Component/s: Security

> Insert overwrite local directory should perserve the overwritten directory 
> permission
> -
>
> Key: HIVE-12965
> URL: https://issues.apache.org/jira/browse/HIVE-12965
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12965.1.patch, HIVE-12965.2.patch, 
> HIVE-12965.3.patch, HIVE-12965.patch
>
>
> In Hive, "insert overwrite local directory" first deletes the target directory 
> if it exists, recreates it, and then copies the files from the source 
> directory to the new local directory. This process sometimes changes the 
> permissions of the overwritten local directory, leaving some applications 
> unable to access its content.





[jira] [Updated] (HIVE-13401) Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token authentication

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-13401:
---
Component/s: Security

> Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token 
> authentication
> 
>
> Key: HIVE-13401
> URL: https://issues.apache.org/jira/browse/HIVE-13401
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.0
>
> Attachments: HIVE-13401-branch2.0.1.patch, HIVE-13401.patch
>
>
> When HS2 runs in a Kerberos cluster but with another SASL authentication 
> mechanism (e.g. LDAP) enabled, Kerberos/delegation-token authentication fails. 
> This is because the HS2 server uses TSetIpAddressProcessor when other 
> authentication is enabled. 





[jira] [Updated] (HIVE-12270) Add DBTokenStore support to HS2 delegation token

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12270:
---
Component/s: Security
 Authentication

> Add DBTokenStore support to HS2 delegation token
> 
>
> Key: HIVE-12270
> URL: https://issues.apache.org/jira/browse/HIVE-12270
> Project: Hive
>  Issue Type: New Feature
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.0
>
> Attachments: HIVE-12270.1.nothrift.patch, HIVE-12270.1.patch, 
> HIVE-12270.2.patch, HIVE-12270.3.nothrift.patch, HIVE-12270.3.patch, 
> HIVE-12270.nothrift.patch
>
>
> DBTokenStore was initially introduced by HIVE-3255 in Hive 0.12, mainly for 
> HMS delegation tokens. Later, in Hive 0.13, HS2 delegation token support was 
> introduced by HIVE-5155, but it used MemoryTokenStore as the token store. The 
> approach of HIVE-9622, which uses the shared RawStore (or HMSHandler) to 
> access token/key information in the HMS DB directly from HS2, does not seem 
> right for supporting DBTokenStore in HS2. I think we should use 
> HiveMetaStoreClient in HS2 instead.





[jira] [Updated] (HIVE-14697) Can not access kerberized HS2 Web UI

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14697:
---
Component/s: Security

> Can not access kerberized HS2 Web UI
> 
>
> Key: HIVE-14697
> URL: https://issues.apache.org/jira/browse/HIVE-14697
> Project: Hive
>  Issue Type: Bug
>  Components: Security, Web UI
>Affects Versions: 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-14697.patch
>
>
> Failed to access kerberized HS2 WebUI with following error msg:
> {code}
> curl -v -u : --negotiate http://util185.phx2.cbsig.net:10002/ 
> > GET / HTTP/1.1 
> > Host: util185.phx2.cbsig.net:10002 
> > Authorization: Negotiate YIIU7...[redacted]... 
> > User-Agent: curl/7.42.1 
> > Accept: */* 
> > 
> < HTTP/1.1 413 FULL head 
> < Content-Length: 0 
> < Connection: close 
> < Server: Jetty(7.6.0.v20120127) 
> {code}
> This is because Jetty's default request header size (4K) is too small in some 
> Kerberos cases.
> So this patch increases the request header size to 64K.
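The size math is easy to check: a SPNEGO Negotiate token carrying a large Kerberos ticket is base64-encoded into a single Authorization header and can easily exceed a 4K limit. A rough sketch with a synthetic token (illustrative sizes only):

```python
import base64

# A Kerberos service ticket with group/PAC data can run to ~10 KB of raw bytes.
raw_ticket = b"\x00" * 10_000
header_value = "Negotiate " + base64.b64encode(raw_ticket).decode("ascii")

default_limit = 4 * 1024   # the 4K default mentioned in the description
patched_limit = 64 * 1024  # the 64K limit the patch raises it to

print(len(header_value))                  # ~13 KB after base64 expansion (4/3)
print(len(header_value) > default_limit)  # True -> "413 FULL head" from Jetty
print(len(header_value) < patched_limit)  # True -> fits after the patch
```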





[jira] [Updated] (HIVE-14359) Hive on Spark might fail in HS2 with LDAP authentication in a kerberized cluster

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14359:
---
Component/s: Spark

> Hive on Spark might fail in HS2 with LDAP authentication in a kerberized 
> cluster
> 
>
> Key: HIVE-14359
> URL: https://issues.apache.org/jira/browse/HIVE-14359
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-14359.patch
>
>
> When HS2 is used as a gateway for LDAP users to access and run queries in a 
> kerberized cluster, its authentication mode is configured as LDAP, and HoS 
> might then fail for the same reason as HIVE-10594. 
> hive.server2.authentication is not the proper property to determine whether a 
> cluster is kerberized; hadoop.security.authentication should be used instead.
> The failure is in the Spark client communicating with the rest of Hadoop, as 
> it assumes Kerberos does not need to be used.
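The fix described amounts to keying the decision off the cluster's security setting rather than HS2's client-facing authentication mode. A minimal sketch, with the configuration modeled as a plain dict:

```python
def cluster_is_kerberized(conf):
    # Wrong signal: hive.server2.authentication describes how clients log in
    # to HS2 (LDAP here), not whether the Hadoop cluster itself is kerberized.
    # Right signal: hadoop.security.authentication.
    return conf.get("hadoop.security.authentication", "simple").lower() == "kerberos"

conf = {"hive.server2.authentication": "LDAP",
        "hadoop.security.authentication": "kerberos"}
print(cluster_is_kerberized(conf))  # True -> the Spark client must use Kerberos
```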





[jira] [Updated] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15653:
---
Component/s: Statistics

> Some ALTER TABLE commands drop table stats
> --
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Statistics
>Affects Versions: 1.1.0
>Reporter: Alexander Behm
>Assignee: Chaoyu Tang
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15653.1.patch, HIVE-15653.2.patch, 
> HIVE-15653.3.patch, HIVE-15653.4.patch, HIVE-15653.5.patch, 
> HIVE-15653.6.patch, HIVE-15653.patch
>
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some 
> ALTER TABLE operations, but certainly not for others. Personally, I think 
> ALTER TABLE should only change what was requested by the user, without any 
> side effects that may be unclear to users. In particular, collecting stats 
> can be an expensive operation, so it's rather inconvenient for users if they 
> get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_namedata_type   comment 
>
> i int 
>
> # Detailed Table Information   
> Database: default  
> Owner:abehm
> CreateTime:   Tue Jan 17 18:13:34 PST 2017 
> LastAccessTime:   UNKNOWN  
> Protect Mode: None 
> Retention:0
> Location: hdfs://localhost:20500/test-warehouse/t  
> Table Type:   MANAGED_TABLE
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   false   
>   last_modified_byabehm   
>   last_modified_time  1484705748  
>   numFiles1   
>   numRows -1  
>   rawDataSize -1  
>   testtest
>   totalSize   2   
>   transient_lastDdlTime   1484705748  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.
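Conceptually this is a lost update on the table's parameter map: the alter path flips the stats flag even though no data changed. A sketch with plain Python dicts (not the Metastore code):

```python
def alter_tblproperties_buggy(params, new_props):
    updated = dict(params, **new_props)
    # The side effect the reporter complains about: any alter invalidates stats.
    updated["COLUMN_STATS_ACCURATE"] = "false"
    return updated

def alter_tblproperties_expected(params, new_props):
    # Only apply what the user asked for; leave the stats flags untouched.
    return dict(params, **new_props)

params = {"COLUMN_STATS_ACCURATE": "true", "numRows": "1"}
print(alter_tblproperties_buggy(params, {"test": "test"})["COLUMN_STATS_ACCURATE"])
# false
print(alter_tblproperties_expected(params, {"test": "test"})["COLUMN_STATS_ACCURATE"])
# true
```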





[jira] [Updated] (HIVE-15485) Investigate the DoAs failure in HoS

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15485:
---
Component/s: Spark

> Investigate the DoAs failure in HoS
> ---
>
> Key: HIVE-15485
> URL: https://issues.apache.org/jira/browse/HIVE-15485
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0
>
> Attachments: HIVE-15485.1.patch, HIVE-15485.2.patch, HIVE-15485.patch
>
>
> With DoAs enabled, HoS failed with following errors:
> {code}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> systest tries to renew a token with renewer hive
>   at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:484)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7543)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:555)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:674)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:999)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
> {code}
> It is related to the change from HIVE-14383. It looks like that SparkSubmit 
> logs in Kerberos with passed in hive principal/keytab and then tries to 
> create a hdfs delegation token for user systest with renewer hive.





[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Component/s: Statistics

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0
>
> Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, 
> HIVE-16189.2.patch, HIVE-16189.patch
>
>
> If a table rename fails while moving the data to the renamed table's folder, 
> the changes in TAB_COL_STATS are not rolled back, which leads to invalid 
> column stats.





[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Component/s: Statistics

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exists
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describe the rename table partition (dummy =3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16394:
---
Component/s: Spark

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0
>
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename only takes effect the first time HoS executes a 
> query. After that, changing mapreduce.job.queuename does not change the YARN 
> scheduler queue used by the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16555) Add a new thrift API call for get_metastore_uuid

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16555:
---
Attachment: HIVE-16555.03.patch

Attaching a patch with no prefix. Tested locally using the smart-apply-patch.sh 
script. Hopefully this works now.

> Add a new thrift API call for get_metastore_uuid
> 
>
> Key: HIVE-16555
> URL: https://issues.apache.org/jira/browse/HIVE-16555
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16555.01.patch, HIVE-16555.02.patch, 
> HIVE-16555.03.patch
>
>
> Sub-task of the main JIRA to add the new thrift API



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16550) Semijoin Hints should be able to skip the optimization if needed.

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994044#comment-15994044
 ] 

Hive QA commented on HIVE-16550:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866045/HIVE-16550.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10637 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[setop_no_distinct] 
(batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30]
 (batchId=148)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5006/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5006/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5006/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866045 - PreCommit-HIVE-Build

> Semijoin Hints should be able to skip the optimization if needed.
> -
>
> Key: HIVE-16550
> URL: https://issues.apache.org/jira/browse/HIVE-16550
> Project: Hive
>  Issue Type: Improvement
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16550.1.patch, HIVE-16550.2.patch
>
>
> Currently, semi join hints are designed to enforce a particular semi join; 
> however, it should also be possible to skip the optimization altogether in a 
> query using hints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16555) Add a new thrift API call for get_metastore_uuid

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993988#comment-15993988
 ] 

Vihang Karajgaonkar commented on HIVE-16555:


Looks like HIVE-16534 was also committed recently, and it also regenerated the 
thrift files. That's why the patch is not merging; there are conflicts. Will 
upload another patch.

> Add a new thrift API call for get_metastore_uuid
> 
>
> Key: HIVE-16555
> URL: https://issues.apache.org/jira/browse/HIVE-16555
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16555.01.patch, HIVE-16555.02.patch
>
>
> Sub-task of the main JIRA to add the new thrift API



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16571) HiveServer2: Prefer LIFO over round-robin for Tez session reuse

2017-05-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993976#comment-15993976
 ] 

Sergey Shelukhin commented on HIVE-16571:
-

+1 pending tests... some cluster testing might also be good

> HiveServer2: Prefer LIFO over round-robin for Tez session reuse
> ---
>
> Key: HIVE-16571
> URL: https://issues.apache.org/jira/browse/HIVE-16571
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Tez
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16571.patch
>
>
> Currently Tez session reuse is entirely round-robin, which means a single 
> user might have to run up to 32 queries before reusing a warm session on a 
> HiveServer2.
> This is not the case when session reuse is disabled, with a user warming up 
> their session on the first query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16571) HiveServer2: Prefer LIFO over round-robin for Tez session reuse

2017-05-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16571:
---
Attachment: HIVE-16571.patch

> HiveServer2: Prefer LIFO over round-robin for Tez session reuse
> ---
>
> Key: HIVE-16571
> URL: https://issues.apache.org/jira/browse/HIVE-16571
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Tez
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16571.patch
>
>
> Currently Tez session reuse is entirely round-robin, which means a single 
> user might have to run up to 32 queries before reusing a warm session on a 
> HiveServer2.
> This is not the case when session reuse is disabled, with a user warming up 
> their session on the first query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16571) HiveServer2: Prefer LIFO over round-robin for Tez session reuse

2017-05-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16571:
---
Status: Patch Available  (was: Open)

> HiveServer2: Prefer LIFO over round-robin for Tez session reuse
> ---
>
> Key: HIVE-16571
> URL: https://issues.apache.org/jira/browse/HIVE-16571
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Tez
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16571.patch
>
>
> Currently Tez session reuse is entirely round-robin, which means a single 
> user might have to run up to 32 queries before reusing a warm session on a 
> HiveServer2.
> This is not the case when session reuse is disabled, with a user warming up 
> their session on the first query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16571) HiveServer2: Prefer LIFO over round-robin for Tez session reuse

2017-05-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-16571:
--

Assignee: Gopal V

> HiveServer2: Prefer LIFO over round-robin for Tez session reuse
> ---
>
> Key: HIVE-16571
> URL: https://issues.apache.org/jira/browse/HIVE-16571
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Tez
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16571.patch
>
>
> Currently Tez session reuse is entirely round-robin, which means a single 
> user might have to run up to 32 queries before reusing a warm session on a 
> HiveServer2.
> This is not the case when session reuse is disabled, with a user warming up 
> their session on the first query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16563) Alter table partition set location should use fully qualified path for non-default FS

2017-05-02 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-16563:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Xuefu for the review.

> Alter table partition set location should use fully qualified path for 
> non-default FS
> -
>
> Key: HIVE-16563
> URL: https://issues.apache.org/jira/browse/HIVE-16563
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 3.0.0
>
> Attachments: HIVE-16563.1.patch
>
>
> Similar to HIVE-6374, for the command {{ALTER TABLE .. PARTITION(..) SET LOCATION 
> ..}}, if the location path is not fully qualified and Hive is not using the 
> default namenode, Hive should use the fully qualified path for the partition.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16555) Add a new thrift API call for get_metastore_uuid

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993946#comment-15993946
 ] 

Hive QA commented on HIVE-16555:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866030/HIVE-16555.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5005/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5005/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5005/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-05-02 22:54:10.498
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-5005/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-05-02 22:54:10.501
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1af9802 HIVE-15396: Basic Stats are not collected when for 
managed tables with LOCATION specified (Sahil Takiar, reviewed by Pengcheng 
Xiong)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1af9802 HIVE-15396: Basic Stats are not collected when for 
managed tables with LOCATION specified (Sahil Takiar, reviewed by Pengcheng 
Xiong)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-05-02 22:54:11.468
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java:
 No such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestEmbeddedHiveMetaStore.java:
 No such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:
 No such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRemoteHiveMetaStore.java:
 No such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestSetUGIOnOnlyClient.java:
 No such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestSetUGIOnOnlyServer.java:
 No such file or directory
error: a/metastore/if/hive_metastore.thrift: No such file or directory
error: a/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp: No such file 
or directory
error: a/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h: No such file 
or directory
error: 
a/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp: No 
such file or directory
error: a/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp: No such 
file or directory
error: a/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h: No such file 
or directory
error: 
a/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java:
 No such file or directory
error: a/metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php: No 
such file or directory
error: a/metastore/src/gen/thrift/gen-php/metastore/Types.php: No such file or 
directory
error: 
a/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote: No 
such file or directory
error: a/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py: 
No such file or directory
error: a/metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py: No such file 
or directory
error: a/metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb: No such file 
or directory
error: 

[jira] [Commented] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993937#comment-15993937
 ] 

Hive QA commented on HIVE-16552:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866025/HIVE-16552.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10637 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=234)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5004/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5004/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5004/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866025 - PreCommit-HIVE-Build

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.1.patch, HIVE-16552.2.patch, HIVE-16552.patch
>
>
> It's commonly desirable to block bad and big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job containing too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, and it's 
> possible for one query to trigger multiple Spark jobs (such as in the case 
> of a map-join). Nevertheless, the proposed approach is still helpful.
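The proposed guard can be sketched as a simple check before job submission. The class and method names below are illustrative only, not Hive's actual API; the real patch defines the hook point inside the Spark client:

```java
// Hypothetical sketch of the proposed hive.spark.job.max.tasks guard.
// SparkTaskLimit and its methods are illustrative names, not Hive's API.
public class SparkTaskLimit {

    // A maxTasks of -1 mirrors the proposed default: no limit.
    public static boolean exceedsLimit(int totalTasks, int maxTasks) {
        return maxTasks >= 0 && totalTasks > maxTasks;
    }

    // Abort the query (here: throw) when the job is over the limit.
    public static void checkTaskLimit(int totalTasks, int maxTasks) {
        if (exceedsLimit(totalTasks, maxTasks)) {
            throw new IllegalStateException("Spark job has " + totalTasks
                + " tasks, exceeding hive.spark.job.max.tasks=" + maxTasks);
        }
    }

    public static void main(String[] args) {
        checkTaskLimit(100, -1);   // no limit configured, passes
        checkTaskLimit(100, 200);  // under the limit, passes
        System.out.println(exceedsLimit(300, 200)); // prints: true
    }
}
```

Note that, as the comment in the issue points out, such a check is per Spark job, so a multi-job query would be evaluated once per job.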



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2017-05-02 Thread Rajesh Chandramohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993928#comment-15993928
 ] 

Rajesh Chandramohan commented on HIVE-5837:
---

Why are we insisting on having hive.server2.enable.doAs=false along with 
SQLSTDAuth? We would like to support a proxy user executing Hive queries as 
well, right? 


> SQL standard based secure authorization for hive
> 
>
> Key: HIVE-5837
> URL: https://issues.apache.org/jira/browse/HIVE-5837
> Project: Hive
>  Issue Type: New Feature
>  Components: Authorization, SQLStandardAuthorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: SQL standard authorization hive.pdf
>
>
> The current default authorization is incomplete and not secure. The 
> alternative of storage based authorization provides security but does not 
> provide fine grained authorization.
> The proposal is to support secure fine grained authorization in hive using 
> SQL standard based authorization model.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16556) Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES table

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16556:
---
Attachment: HIVE-16556.02.patch

There was a filename conflict after I rebased to the latest code. Modified the 
file names accordingly.

> Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES 
> table
> 
>
> Key: HIVE-16556
> URL: https://issues.apache.org/jira/browse/HIVE-16556
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16556.01.patch, HIVE-16556.02.patch
>
>
> sub-task to modify schema tool and its related changes so that the new table 
> is added to the schema when schematool initializes or upgrades the schema.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16570:

Attachment: (was: HIVE-16570.patch)

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16570.patch
>
>
> Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16570:

Attachment: HIVE-16570.patch

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16570.patch
>
>
> Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16567) NPE when reading Parquet file when getting old timestamp configuration

2017-05-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993914#comment-15993914
 ] 

Thejas M Nair commented on HIVE-16567:
--

Thanks. LGTM


> NPE when reading Parquet file when getting old timestamp configuration
> --
>
> Key: HIVE-16567
> URL: https://issues.apache.org/jira/browse/HIVE-16567
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Priority: Blocker
> Attachments: HIVE-16567.01.patch
>
>
> In branch-1.2, the file 
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> throws an NPE on line 148:
> {code}
>  boolean skipConversion = 
> Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
> {code}
> when the metadata reference is null.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16556) Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES table

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993910#comment-15993910
 ] 

Vihang Karajgaonkar commented on HIVE-16556:


[~spena] [~stakiar] Can you also take a look? I tested these scripts using 
{{metastore-upgrade-test.sh}} against Postgres, MySQL, and Derby databases. 
The tool currently does not support Oracle and MSSQL; I am looking into how I 
can test on those databases as well.

> Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES 
> table
> 
>
> Key: HIVE-16556
> URL: https://issues.apache.org/jira/browse/HIVE-16556
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16556.01.patch
>
>
> sub-task to modify schema tool and its related changes so that the new table 
> is added to the schema when schematool initializes or upgrades the schema.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16570:

Status: Patch Available  (was: Open)

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16570.patch
>
>
> Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16570:

Attachment: HIVE-16570.patch

[~gopalv] this restores storing the ctx outside the key (I just took the 
patch from the original JIRA and merged it with what is in master to preserve 
other differences, like the local variable); feel free to put it in e.g. the 
batch if that seems better.
Also got rid of the arraycopy.

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16570.patch
>
>
> Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16143) Improve msck repair batching

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993905#comment-15993905
 ] 

Vihang Karajgaonkar commented on HIVE-16143:


[~spena] [~stakiar] Can you please review?

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch, HIVE-16143.02.patch, 
> HIVE-16143.03.patch, HIVE-16143.04.patch, HIVE-16143.05.patch, 
> HIVE-16143.06.patch
>
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> Following snippet shows the batching logic. There can be couple of 
> improvements to this batching logic:
> {noformat} 
> int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
> int counter = 0;
> for (CheckResult.PartitionResult part : partsNotInMs) {
>   counter++;
>   
> apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>   repairOutput.add("Repair: Added partition to metastore " + 
> msckDesc.getTableName()
>   + ':' + part.getPartitionName());
>   if (counter % batch_size == 0 || counter == 
> partsNotInMs.size()) {
> db.createPartitions(apd);
> apd = new AddPartitionDesc(table.getDbName(), 
> table.getTableName(), false);
>   }
> }
>   } else {
> for (CheckResult.PartitionResult part : partsNotInMs) {
>   
> apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>   repairOutput.add("Repair: Added partition to metastore " + 
> msckDesc.getTableName()
>   + ':' + part.getPartitionName());
> }
> db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by 
> one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. It is easy for 
> users to increase the batch size to a higher value to make the command run 
> faster but end up with worse performance because the code falls back to 
> adding one by one. Users are then expected to determine a tuned batch size 
> that works well for their environment. I think the code could handle this 
> situation better by exponentially decaying the batch size instead of 
> falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were already added 
> successfully. If we need to fall back to one by one, we should at least skip 
> the ones we know for sure were added successfully.
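The decaying-batch-size idea in point 1 (which also addresses point 2, since succeeded batches are never retried) can be sketched as follows. This is an illustrative outline, not the patch's implementation; the PartitionSink interface is a hypothetical stand-in for the metastore call (db.createPartitions) that msck repair wraps:

```java
// Illustrative sketch: retry partition creation with an exponentially
// decaying batch size instead of falling back straight to one-by-one.
public class DecayingBatcher {

    // Hypothetical stand-in for the metastore bulk-add call.
    public interface PartitionSink {
        void create(java.util.List<String> batch); // may throw RuntimeException
    }

    public static void addAll(java.util.List<String> parts, int batchSize,
                              PartitionSink sink) {
        int start = 0;
        int size = Math.max(batchSize, 1);
        while (start < parts.size()) {
            int end = Math.min(start + size, parts.size());
            try {
                sink.create(parts.subList(start, end));
                start = end;                      // batch succeeded, move past it
            } catch (RuntimeException e) {
                if (size == 1) {
                    throw e;                      // even single adds fail, give up
                }
                size = Math.max(size / 2, 1);     // halve the batch, retry same span
            }
        }
    }

    public static void main(String[] args) {
        java.util.List<String> parts =
            java.util.Arrays.asList("p1", "p2", "p3", "p4", "p5");
        java.util.List<String> added = new java.util.ArrayList<>();
        // Simulated metastore that rejects batches larger than 2 partitions.
        addAll(parts, 4, batch -> {
            if (batch.size() > 2) throw new RuntimeException("batch too large");
            added.addAll(batch);
        });
        System.out.println(added); // all five partitions end up added
    }
}
```

Because already-succeeded batches are never revisited, a mid-run failure only re-attempts the failing span at a smaller size, rather than re-adding everything one by one.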



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16143) Improve msck repair batching

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16143:
---
Attachment: HIVE-16143.06.patch

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch, HIVE-16143.02.patch, 
> HIVE-16143.03.patch, HIVE-16143.04.patch, HIVE-16143.05.patch, 
> HIVE-16143.06.patch
>
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> Following snippet shows the batching logic. There can be couple of 
> improvements to this batching logic:
> {noformat} 
> int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
> int counter = 0;
> for (CheckResult.PartitionResult part : partsNotInMs) {
>   counter++;
>   
> apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>   repairOutput.add("Repair: Added partition to metastore " + 
> msckDesc.getTableName()
>   + ':' + part.getPartitionName());
>   if (counter % batch_size == 0 || counter == 
> partsNotInMs.size()) {
> db.createPartitions(apd);
> apd = new AddPartitionDesc(table.getDbName(), 
> table.getTableName(), false);
>   }
> }
>   } else {
> for (CheckResult.PartitionResult part : partsNotInMs) {
>   
> apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>   repairOutput.add("Repair: Added partition to metastore " + 
> msckDesc.getTableName()
>   + ':' + part.getPartitionName());
> }
> db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by 
> one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value hoping to make the command run 
> faster, yet end up with worse performance because the code falls back to 
> adding one by one. Users are then expected to determine a tuned batch size 
> that works well for their environment. The code could handle this situation 
> better by exponentially decaying the batch size instead of falling back to 
> one by one.
> 2. The other issue with this implementation: if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were already added 
> successfully. If we need to fall back to one by one, we should at least skip 
> the ones which we know for sure were already added.
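A minimal sketch of the exponential-decay idea in point 1, outside of Hive's actual classes — the `BatchSink` interface and all names here are hypothetical stand-ins for `db.createPartitions` and `partsNotInMs`, not the real API:

```java
import java.util.List;

public class DecayingBatcher {
    /** Functional stand-in for db.createPartitions on one batch. */
    interface BatchSink {
        void createPartitions(List<String> batch) throws Exception;
    }

    /**
     * Adds partitions in batches, halving the batch size whenever a batch
     * fails, down to a floor of 1 (the old one-by-one behavior).
     * Returns the number of partitions successfully added.
     */
    static int addWithDecay(List<String> parts, int batchSize, BatchSink sink) {
        int added = 0;
        int size = Math.max(1, batchSize);
        int pos = 0;
        while (pos < parts.size()) {
            List<String> batch = parts.subList(pos, Math.min(pos + size, parts.size()));
            try {
                sink.createPartitions(batch);
                pos += batch.size();
                added += batch.size();
            } catch (Exception e) {
                if (size == 1) {
                    pos++; // even a single add failed: skip this partition
                } else {
                    size = Math.max(1, size / 2); // decay the batch size and retry
                }
            }
        }
        return added;
    }
}
```

With this shape, a transient metastore failure only shrinks the batch size for the retries, instead of abandoning batching entirely.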



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16570:

Description: Some things could be faster.  (was: Some things could be 
faster. Some things could also be more correct.)

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> Some things could be faster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-16570:
---

Assignee: Sergey Shelukhin

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> Some things could be faster



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16570) improve upon HIVE-16523

2017-05-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16570:

Description: Some things could be faster. Some things could also be more 
correct.  (was: Some things could be faster)

> improve upon HIVE-16523
> ---
>
> Key: HIVE-16570
> URL: https://issues.apache.org/jira/browse/HIVE-16570
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> Some things could be faster. Some things could also be more correct.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16143) Improve msck repair batching

2017-05-02 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16143:
---
Attachment: HIVE-16143.05.patch

We should not rely on the msck output strings "Repair: Added partition..." 
when we diff q.out files, since we are iterating over a Set in the code and 
the iteration order is not guaranteed. Modified the existing msck repair test 
files to use the {{show partitions}} command instead.

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch, HIVE-16143.02.patch, 
> HIVE-16143.03.patch, HIVE-16143.04.patch, HIVE-16143.05.patch
>
>
> Currently, the {{msck repair table}} command batches the partitions it 
> creates in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There are a couple of 
> possible improvements to this batching logic:
> {noformat} 
> int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>     int counter = 0;
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       counter++;
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>       if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>         db.createPartitions(apd);
>         apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>       }
>     }
>   } else {
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>     }
>     db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value hoping to make the command run 
> faster, yet end up with worse performance because the code falls back to 
> adding one by one. Users are then expected to determine a tuned batch size 
> that works well for their environment. The code could handle this situation 
> better by exponentially decaying the batch size instead of falling back to 
> one by one.
> 2. The other issue with this implementation: if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were already added 
> successfully. If we need to fall back to one by one, we should at least skip 
> the ones which we know for sure were already added.
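Point 2 could be sketched roughly as follows: remember which batches were committed, so that a one-by-one fallback only retries the remainder. The `Sink` interface and all names here are hypothetical stand-ins, not Hive's actual API:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchFallback {
    /** Functional stand-in for the bulk metastore call. */
    interface Sink {
        void create(List<String> batch) throws Exception;
    }

    /**
     * Attempts batched adds; on the first failure, returns only the
     * partitions NOT yet committed, so the one-by-one fallback does not
     * re-add partitions from batches that already succeeded.
     * Returns an empty list on full success.
     */
    static List<String> addBatched(List<String> parts, int batchSize, Sink sink) {
        List<String> remaining = new ArrayList<>(parts);
        for (int pos = 0; pos < parts.size(); pos += batchSize) {
            List<String> batch = parts.subList(pos, Math.min(pos + batchSize, parts.size()));
            try {
                sink.create(batch);
                remaining.subList(0, batch.size()).clear(); // committed: drop from retry set
            } catch (Exception e) {
                return remaining; // fall back to one-by-one on these only
            }
        }
        return remaining;
    }
}
```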



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16501) Add rej/orig to .gitignore ; remove *.orig files

2017-05-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993864#comment-15993864
 ] 

Eugene Koifman commented on HIVE-16501:
---

[~kgyrtkirk], [~ashutoshc], why was this done?
It makes it very easy to miss cases where a patch didn't apply cleanly, 
because you don't see the .rej files.

This doesn't seem like something that should be done globally to affect 
everyone's environment.

> Add rej/orig to .gitignore ; remove *.orig files
> 
>
> Key: HIVE-16501
> URL: https://issues.apache.org/jira/browse/HIVE-16501
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-16501.1.patch
>
>
> sometimes git reject/orig files made their way into the repo...
> it would be better to just ignore them :)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16566) Set column stats default as true when creating new tables/partitions

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993863#comment-15993863
 ] 

Hive QA commented on HIVE-16566:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866021/HIVE-16566.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10639 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=234)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5003/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5003/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5003/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866021 - PreCommit-HIVE-Build

> Set column stats default as true when creating new tables/partitions
> 
>
> Key: HIVE-16566
> URL: https://issues.apache.org/jira/browse/HIVE-16566
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16566.01.patch, HIVE-16566.02.patch, 
> HIVE-16566.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993860#comment-15993860
 ] 

Jason Dere commented on HIVE-16568:
---

RB at https://reviews.apache.org/r/58934/
[~hagleitn] [~sseth] [~prasanth_j] can you take a look?

> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16568.1.patch
>
>
> Currently just supports primitive types



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16568:
--
Status: Patch Available  (was: Open)

> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16568.1.patch
>
>
> Currently just supports primitive types



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16568:
--
Attachment: HIVE-16568.1.patch

> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16568.1.patch
>
>
> Currently just supports primitive types



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16073) Fix partition column check during runtime filtering

2017-05-02 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16073:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Fix partition column check during runtime filtering
> ---
>
> Key: HIVE-16073
> URL: https://issues.apache.org/jira/browse/HIVE-16073
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16073.1.patch
>
>
> Followup of incorrect partition column check from HIVE-16022.
> Couple things to look at:
> 1. Does this check need to happen at all? Seems like this was a workaround.
> 2. If it is necessary, the logic looked incorrect.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16289) add hints for semijoin reduction

2017-05-02 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16289:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> add hints for semijoin reduction
> 
>
> Key: HIVE-16289
> URL: https://issues.apache.org/jira/browse/HIVE-16289
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16289.01.patch, HIVE-16289.patch
>
>
> For now hints will only impact bloom filter size if semijoin is enabled.
> In a follow-up, after some cost-based semi-join decision logic is added, they 
> may also influence it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (HIVE-16073) Fix partition column check during runtime filtering

2017-05-02 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reopened HIVE-16073:
---

> Fix partition column check during runtime filtering
> ---
>
> Key: HIVE-16073
> URL: https://issues.apache.org/jira/browse/HIVE-16073
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16073.1.patch
>
>
> Followup of incorrect partition column check from HIVE-16022.
> Couple things to look at:
> 1. Does this check need to happen at all? Seems like this was a workaround.
> 2. If it is necessary, the logic looked incorrect.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15224) replace org.json usage in branch-1 with as minor changes as possible

2017-05-02 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993835#comment-15993835
 ] 

Zoltan Haindrich commented on HIVE-15224:
-

+1 looks good to me

> replace org.json usage in branch-1 with as minor changes as possible
> 
>
> Key: HIVE-15224
> URL: https://issues.apache.org/jira/browse/HIVE-15224
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
> Attachments: HIVE-15224.1-branch-1.patch, 
> HIVE-15224.2-branch-1.patch, HIVE-15224.3-branch-1.patch
>
>
> branch-1 / master have diverged in many ways - StatsCollector has changed; 
> EximUtil supports new replication
> ...so backporting any changes from master would be hard.
> maybe we should use some drop-in replacement like the android one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16567) NPE when reading Parquet file when getting old timestamp configuration

2017-05-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993811#comment-15993811
 ] 

Matt McCline edited comment on HIVE-16567 at 5/2/17 9:36 PM:
-

Yes (by guesswork).  The person reporting the issue was trying to read new 
Parquet data written by Spark.


was (Author: mmccline):
Yes (by guesswork)

> NPE when reading Parquet file when getting old timestamp configuration
> --
>
> Key: HIVE-16567
> URL: https://issues.apache.org/jira/browse/HIVE-16567
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Priority: Blocker
> Attachments: HIVE-16567.01.patch
>
>
> In branch-1.2, the file 
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> throws an NPE on line 148:
> {code}
>  boolean skipConversion = 
> Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
> {code}
> when the metadata reference is null.
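One possible null-safe guard is sketched below — this is an illustration, not necessarily what the attached patch does. The key string mirrors the config name, and the default of {{true}} matches the HiveConf default discussed in this thread:

```java
import java.util.Map;

public class SkipConversionGuard {
    // Mirrors HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname
    static final String KEY = "hive.parquet.timestamp.skip.conversion";
    static final boolean DEFAULT = true; // HiveConf default at the time

    /** Null-safe lookup: a missing metadata map or key falls back to the default. */
    static boolean skipConversion(Map<String, String> metadata) {
        if (metadata == null || metadata.get(KEY) == null) {
            return DEFAULT;
        }
        return Boolean.parseBoolean(metadata.get(KEY));
    }
}
```

This avoids the NPE for files without Hive-written metadata (e.g. Parquet data written by Spark) while preserving the configured behavior when the key is present.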



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16354) Modularization efforts - change some dependencies to smaller client/api modules

2017-05-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16354:

Attachment: allinwonder.3.patch

> Modularization efforts - change some dependencies to smaller client/api 
> modules
> ---
>
> Key: HIVE-16354
> URL: https://issues.apache.org/jira/browse/HIVE-16354
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Server Infrastructure
>Reporter: Zoltan Haindrich
> Attachments: allinwonder.1.patch, allinwonder.2.patch, 
> allinwonder.3.patch
>
>
> in HIVE-16214 I've identified some pieces which might be good to move to new 
> modules...since that I've looked into it a bit more what could be done in 
> this aspect...and to prevent going backward in this path; or get stuck at 
> some point - I would like to be able to propose smaller changes prior to 
> creating any modules...
> The goal here is to remove the unneeded dependencies from the modules which 
> doesn't necessarily need them: the biggest fish in this tank is the {{jdbc}} 
> module, which currently ships with full hiveserver server side + all of the 
> ql codes + the whole metastore (including the jpa persistence libs) - this 
> makes the jdbc driver a really fat jar...
> These changes will also reduce the hive binary distribution size; introducing 
> service-client has reduced it by 20% alone.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16567) NPE when reading Parquet file when getting old timestamp configuration

2017-05-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993811#comment-15993811
 ] 

Matt McCline edited comment on HIVE-16567 at 5/2/17 9:35 PM:
-

Yes (by guesswork)


was (Author: mmccline):
Yes.

> NPE when reading Parquet file when getting old timestamp configuration
> --
>
> Key: HIVE-16567
> URL: https://issues.apache.org/jira/browse/HIVE-16567
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Priority: Blocker
> Attachments: HIVE-16567.01.patch
>
>
> In branch-1.2, the file 
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> throws an NPE on line 148:
> {code}
>  boolean skipConversion = 
> Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
> {code}
> when the metadata reference is null.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16567) NPE when reading Parquet file when getting old timestamp configuration

2017-05-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993811#comment-15993811
 ] 

Matt McCline commented on HIVE-16567:
-

Yes.

> NPE when reading Parquet file when getting old timestamp configuration
> --
>
> Key: HIVE-16567
> URL: https://issues.apache.org/jira/browse/HIVE-16567
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Priority: Blocker
> Attachments: HIVE-16567.01.patch
>
>
> In branch-1.2, the file 
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> throws an NPE on line 148:
> {code}
>  boolean skipConversion = 
> Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
> {code}
> when the metadata reference is null.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15396:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.0.0, 2.3.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL|
> | Storage Desc Params: 

[jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993805#comment-15993805
 ] 

Sahil Takiar commented on HIVE-15396:
-

I think 3.0 is good for this. Thanks for all the help reviewing this patch!

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.0.0, 2.3.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL 

[jira] [Commented] (HIVE-16567) NPE when reading Parquet file when getting old timestamp configuration

2017-05-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993797#comment-15993797
 ] 

Thejas M Nair commented on HIVE-16567:
--

[~mmccline]
Is the assumption that if metadata is null, the data is not produced by hive 
and hence it should skip the conversion ?

( HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.defaultBoolVal is 
the default value in HiveConf.java and hence always true, as long as 
HiveConf.java default remains same).


> NPE when reading Parquet file when getting old timestamp configuration
> --
>
> Key: HIVE-16567
> URL: https://issues.apache.org/jira/browse/HIVE-16567
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Priority: Blocker
> Attachments: HIVE-16567.01.patch
>
>
> In branch-1.2, the file 
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> throws an NPE on line 148:
> {code}
>  boolean skipConversion = 
> Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
> {code}
> when the metadata reference is null.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993793#comment-15993793
 ] 

Pengcheng Xiong commented on HIVE-15396:


do you want this in branch-2 or 2.3? In my opinion, this is not a bug but an 
improvement, thus I think 3.0 is enough.

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.0.0, 2.3.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []

[jira] [Updated] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15396:
---
Issue Type: Improvement  (was: Bug)

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.0.0, 2.3.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL|
> | Storage Desc Params:  | NULL  

[jira] [Updated] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15396:
---
Affects Version/s: 2.3.0
   2.0.0

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.0.0, 2.3.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL|
> | Storage Desc Params:  | NULL

[jira] [Updated] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15396:
---
Fix Version/s: 3.0.0

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL|
> | Storage Desc Params:  | NULL
>| NULL 

[jira] [Updated] (HIVE-16500) Remove parser references from PrivilegeType

2017-05-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16500:

Attachment: HIVE-16500.1.patch

> Remove parser references from PrivilegeType
> ---
>
> Key: HIVE-16500
> URL: https://issues.apache.org/jira/browse/HIVE-16500
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Server Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16500.1.patch, HIVE-16500.1.patch
>
>
> the authorization uses {{PrivilegeType}}, but that shouldn't depend on parser 
> tokens
> https://github.com/apache/hive/blob/ff67cdda1c538dc65087878eeba3e165cf3230f4/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/PrivilegeType.java#L31



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
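The decoupling described in HIVE-16500 can be sketched as follows: the authorization enum resolves privileges by name, so the parser module can keep its own token-to-name mapping and the enum no longer references parser tokens. All names below (`PrivilegeKind`, `fromName`) are illustrative, not Hive's actual `PrivilegeType` API.

```java
// Hypothetical sketch: a parser-free authorization enum. The parser module
// would own its own token -> privilege-name mapping separately.
enum PrivilegeKind {
    ALL, ALTER_DATA, ALTER_METADATA, CREATE, DROP, SELECT, INSERT, DELETE, UNKNOWN;

    // Lookup by name keeps this enum free of grammar-token constants.
    static PrivilegeKind fromName(String name) {
        try {
            return valueOf(name.toUpperCase());
        } catch (IllegalArgumentException e) {
            return UNKNOWN;
        }
    }
}

public class PrivilegeKindDemo {
    public static void main(String[] args) {
        System.out.println(PrivilegeKind.fromName("select")); // SELECT
        System.out.println(PrivilegeKind.fromName("bogus"));  // UNKNOWN
    }
}
```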


[jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993789#comment-15993789
 ] 

Pengcheng Xiong commented on HIVE-15396:


pushed to master. thanks for your hard work!

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL|
> | Storage Desc Params:  | NULL

[jira] [Commented] (HIVE-16355) Service: embedded mode should only be available if service is loaded onto the classpath

2017-05-02 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993786#comment-15993786
 ] 

Zoltan Haindrich commented on HIVE-16355:
-

[~vgumashta], [~thejas]: could you please take a look at these changes?

> Service: embedded mode should only be available if service is loaded onto the 
> classpath
> ---
>
> Key: HIVE-16355
> URL: https://issues.apache.org/jira/browse/HIVE-16355
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Server Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16355.1.patch, HIVE-16355.2.patch, 
> HIVE-16355.2.patch, HIVE-16355.3.patch
>
>
> I would like to relax the hard reference to 
> {{EmbeddedThriftBinaryCLIService}} so that it is only used when the 
> {{service}} module is loaded onto the classpath.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
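The "only if on the classpath" idea from HIVE-16355 amounts to probing for the class reflectively instead of referencing it directly, so a client without the service module still links. The helper below is an illustrative sketch, not Hive's code; only the probed class name comes from the issue.

```java
// Sketch: detect whether a class (e.g. EmbeddedThriftBinaryCLIService) is
// present without creating a hard compile-time reference to it.
public class EmbeddedServiceProbe {
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: just resolve the class, don't run static init
            Class.forName(className, false, EmbeddedServiceProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isOnClasspath("java.lang.String")); // true
    }
}
```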


[jira] [Commented] (HIVE-16416) Service: move constants out from HiveAuthFactory

2017-05-02 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993785#comment-15993785
 ] 

Zoltan Haindrich commented on HIVE-16416:
-

[~vgumashta], [~thejas]: could you please take a look at these changes?

> Service: move constants out from HiveAuthFactory
> 
>
> Key: HIVE-16416
> URL: https://issues.apache.org/jira/browse/HIVE-16416
> Project: Hive
>  Issue Type: Sub-task
>  Components: Server Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16416.1.patch
>
>
> It took me a while to notice that it is only some constants which keep 
> pulling in this class :)
> It contains a tricky dependency on the whole ql module; but in client mode 
> that part is totally unused - moving the constants out of it enables the 
> client to operate without the factory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993784#comment-15993784
 ] 

Pengcheng Xiong commented on HIVE-15396:


good thanks!

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []  
>| NULL|
> | Storage Desc Params:  | NULL
>| NULL   

[jira] [Updated] (HIVE-16562) Issues with nullif / fetch task

2017-05-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16562:

Attachment: HIVE-16562.1.patch

It seemed redundant to check and optionally cast arg[0]'s type, but it turns 
out that can't be left out, since the lazy types are a bit different.

patch#1: add missing conversion for arg0

> Issues with nullif / fetch task
> ---
>
> Key: HIVE-16562
> URL: https://issues.apache.org/jira/browse/HIVE-16562
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16562.1.patch
>
>
> HIVE-13555 adds support for nullif. I'm encountering issues with nullif on 
> master (3.0.0-SNAPSHOT rdac3786d86462e4d08d62d23115e6b7a3e534f5d)
> Cluster-side jobs work fine but client-side ones don't.
> Consider these two tables:
> e011_02:
> Columns c1 = float, c2 = double
> 1.0   1.0
> 1.5   1.5
> 2.0   2.0
> test:
> Columns c1 = int, c2 = int
> Data:
> 1 1
> 2 2
> And this query:
> select nullif(c1, c2) from e011_02;
> With e011_02 I get:
> {code}
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> NULLIF(c1,c2)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   ... 13 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.LazyFloat cannot be cast to 
> org.apache.hadoop.io.FloatWritable
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector.get(WritableFloatObjectInspector.java:36)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.comparePrimitiveObjects(PrimitiveObjectInspectorUtils.java:412)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFNullif.evaluate(GenericUDFNullif.java:93)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   ... 18 more
> {code}
> With 
> select nullif(c1, c2) from test;
> I get:
> {code}
> 2017-05-01T03:32:19,905 ERROR [cbaf5380-5b06-4531-aeb9-524c62314a46 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at 

[jira] [Updated] (HIVE-16562) Issues with nullif / fetch task

2017-05-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16562:

Status: Patch Available  (was: Open)

> Issues with nullif / fetch task
> ---
>
> Key: HIVE-16562
> URL: https://issues.apache.org/jira/browse/HIVE-16562
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16562.1.patch
>
>
> HIVE-13555 adds support for nullif. I'm encountering issues with nullif on 
> master (3.0.0-SNAPSHOT rdac3786d86462e4d08d62d23115e6b7a3e534f5d)
> Cluster-side jobs work fine but client-side ones don't.
> Consider these two tables:
> e011_02:
> Columns c1 = float, c2 = double
> 1.0   1.0
> 1.5   1.5
> 2.0   2.0
> test:
> Columns c1 = int, c2 = int
> Data:
> 1 1
> 2 2
> And this query:
> select nullif(c1, c2) from e011_02;
> With e011_02 I get:
> {code}
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> NULLIF(c1,c2)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   ... 13 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.LazyFloat cannot be cast to 
> org.apache.hadoop.io.FloatWritable
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector.get(WritableFloatObjectInspector.java:36)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.comparePrimitiveObjects(PrimitiveObjectInspectorUtils.java:412)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFNullif.evaluate(GenericUDFNullif.java:93)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   ... 18 more
> {code}
> With 
> select nullif(c1, c2) from test;
> I get:
> {code}
> 2017-05-01T03:32:19,905 ERROR [cbaf5380-5b06-4531-aeb9-524c62314a46 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating NULLIF(c1,c2)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2177)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> 

[jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993772#comment-15993772
 ] 

Sahil Takiar commented on HIVE-15396:
-

[~pxiong] test failures are unrelated:

* HIVE-16569 - TestAccumuloCliDriver.testCliDriver[accumulo_index]
* HIVE-15169 - 
TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL|
> | # Storage Information | NULL
>| NULL|
> | SerDe Library:| 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL 
>|
> | InputFormat:  | org.apache.hadoop.mapred.TextInputFormat
>| NULL|
> | OutputFormat: | 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL 
>|
> | Compressed:   | No  
>| NULL|
> | Num Buckets:  | -1  
>| NULL|
> | Bucket Columns:   | []  
>| NULL|
> | Sort Columns: | []

[jira] [Updated] (HIVE-13583) E061-14: Search Conditions

2017-05-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13583:

Status: Patch Available  (was: Open)

> E061-14: Search Conditions
> --
>
> Key: HIVE-13583
> URL: https://issues.apache.org/jira/browse/HIVE-13583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13583.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Support for various forms of search conditions are mandatory in the SQL 
> standard. For example, " is not true;" Hive should support those 
> forms mandated by the standard.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
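The search conditions mandated by E061-14 follow SQL's three-valued logic, where `x IS TRUE` is false (not NULL) for a NULL operand and `x IS NOT TRUE` is its complement. A minimal sketch of that semantics, using a Java `Boolean` null to model SQL NULL (this mirrors the standard's truth tables, not Hive's internal UDF implementation):

```java
// Three-valued logic for "x IS [NOT] TRUE"; null models SQL NULL.
public class SearchConditionDemo {
    // "x IS TRUE" yields false, not NULL, when x is NULL.
    static boolean isTrue(Boolean x) {
        return x != null && x;
    }

    // "x IS NOT TRUE" is the complement, so it is true for NULL.
    static boolean isNotTrue(Boolean x) {
        return !isTrue(x);
    }

    public static void main(String[] args) {
        System.out.println(isTrue(null));     // false
        System.out.println(isNotTrue(null));  // true
        System.out.println(isNotTrue(false)); // true
    }
}
```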


[jira] [Updated] (HIVE-13583) E061-14: Search Conditions

2017-05-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13583:

Attachment: HIVE-13583.1.patch

patch#1)

* removed: TOK_ISNULL / TOK_ISNOTNULL references...and mappings which looked 
redundant
* added {{is (not) [true|false]}} functions/etc
* I've left out vectorization

> E061-14: Search Conditions
> --
>
> Key: HIVE-13583
> URL: https://issues.apache.org/jira/browse/HIVE-13583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13583.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Support for various forms of search conditions are mandatory in the SQL 
> standard. For example, " is not true;" Hive should support those 
> forms mandated by the standard.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16568) Support complex types in external LLAP InputFormat

2017-05-02 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-16568:
-


> Support complex types in external LLAP InputFormat
> --
>
> Key: HIVE-16568
> URL: https://issues.apache.org/jira/browse/HIVE-16568
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> Currently just supports primitive types



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-05-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993755#comment-15993755
 ] 

Pengcheng Xiong commented on HIVE-16047:


Glad that [~spena] also copied the information from the mailing list here for 
the authors. I am OK with both solutions, i.e., reverting the patch or keeping 
it. Please discuss and come to an agreement. Thanks.

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
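The fix in HIVE-16047 boils down to a guard: don't attempt the KeyProvider lookup at all unless an encryption key provider URI is actually configured, which avoids the noisy "Could not find uri" path when encryption is off. The helper below is an assumption-laden sketch; only the config key name comes from the log message in the issue.

```java
// Sketch: check the provider URI before ever touching the KeyProvider cache.
public class KeyProviderGuard {
    // Config key cited in the HS2 log message from the issue.
    static final String KEY = "dfs.encryption.key.provider.uri";

    // True only when a non-empty provider URI is configured; callers skip
    // the KeyProvider lookup entirely otherwise.
    static boolean encryptionEnabled(String providerUri) {
        return providerUri != null && !providerUri.isEmpty();
    }
}
```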


[jira] [Commented] (HIVE-15396) Basic Stats are not collected when for managed tables with LOCATION specified

2017-05-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993736#comment-15993736
 ] 

Hive QA commented on HIVE-15396:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12866017/HIVE-15396.8.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10634 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=155)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5002/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5002/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5002/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12866017 - PreCommit-HIVE-Build

> Basic Stats are not collected when for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch, 
> HIVE-15396.6.patch, HIVE-15396.7.patch, HIVE-15396.8.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +---++-+
> |   col_name| data_type   
>|   comment   |
> +---++-+
> | # col_name| data_type   
>| comment |
> |   | NULL
>| NULL|
> | col   | int 
>| |
> |   | NULL
>| NULL|
> | # Detailed Table Information  | NULL
>| NULL|
> | Database: | default 
>| NULL|
> | Owner:| anonymous   
>| NULL|
> | CreateTime:   | Wed Mar 22 18:09:19 PDT 2017
>| NULL|
> | LastAccessTime:   | UNKNOWN 
>| NULL|
> | Retention:| 0   
>| NULL|
> | Location: | file:/warehouse/hdfs_1 | NULL   
>  |
> | Table Type:   | MANAGED_TABLE   
>| NULL|
> | Table Parameters: | NULL
>| NULL|
> |   | COLUMN_STATS_ACCURATE   
>| {\"BASIC_STATS\":\"true\"}  |
> |   | numFiles
>| 0   |
> |   | numRows 
>| 0   |
> |   | rawDataSize 
>| 0   |
> |   | totalSize   
>| 0   |
> |   | transient_lastDdlTime   
>| 1490231359  |
> |   | NULL
>| NULL   

[jira] [Updated] (HIVE-16550) Semijoin Hints should be able to skip the optimization if needed.

2017-05-02 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16550:
--
Attachment: HIVE-16550.2.patch

Implemented review comments.

> Semijoin Hints should be able to skip the optimization if needed.
> -
>
> Key: HIVE-16550
> URL: https://issues.apache.org/jira/browse/HIVE-16550
> Project: Hive
>  Issue Type: Improvement
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-16550.1.patch, HIVE-16550.2.patch
>
>
> Currently semijoin hints are designed to enforce a particular semijoin; 
> however, it should also be possible to skip the optimization altogether in a 
> query using hints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16534) Add capability to tell aborted transactions apart from open transactions in ValidTxnList

2017-05-02 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-16534:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed patch 7 to master. Thanks Eugene for the review.

> Add capability to tell aborted transactions apart from open transactions in 
> ValidTxnList
> 
>
> Key: HIVE-16534
> URL: https://issues.apache.org/jira/browse/HIVE-16534
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 3.0.0
>
> Attachments: HIVE-16534.1.patch, HIVE-16534.2.patch, 
> HIVE-16534.3.patch, HIVE-16534.4.patch, HIVE-16534.5.patch, 
> HIVE-16534.6.patch, HIVE-16534.7.patch
>
>
> Currently in ValidReadTxnList, open transactions and aborted transactions are 
> stored together in one array. That makes it impossible to extract just 
> aborted transactions or open transactions.
> For ValidCompactorTxnList this is fine, since we only store aborted 
> transactions but no open transactions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
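The capability HIVE-16534 adds can be sketched by pairing the existing sorted exception array with a bit set that marks which exceptions are aborted, so either the aborted or the open subset can be extracted on its own. Class and method names below are illustrative stand-ins, not the real `ValidReadTxnList` API.

```java
import java.util.BitSet;

// Sketch: exceptions[] holds txn ids that are not valid; abortedBits marks
// which of those are aborted (the rest are open).
public class TxnListSketch {
    final long[] exceptions;
    final BitSet abortedBits;

    TxnListSketch(long[] exceptions, BitSet abortedBits) {
        this.exceptions = exceptions;
        this.abortedBits = abortedBits;
    }

    // Convenience factory: mark the given indices of exceptions as aborted.
    static TxnListSketch of(long[] exceptions, int... abortedIdx) {
        BitSet bits = new BitSet();
        for (int i : abortedIdx) {
            bits.set(i);
        }
        return new TxnListSketch(exceptions, bits);
    }

    boolean isAborted(long txnId) {
        for (int i = 0; i < exceptions.length; i++) {
            if (exceptions[i] == txnId) {
                return abortedBits.get(i);
            }
        }
        return false;
    }

    boolean isOpen(long txnId) {
        for (int i = 0; i < exceptions.length; i++) {
            if (exceptions[i] == txnId) {
                return !abortedBits.get(i);
            }
        }
        return false;
    }
}
```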


[jira] [Updated] (HIVE-16567) NPE when reading Parquet file when getting old timestamp configuration

2017-05-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16567:

Attachment: HIVE-16567.01.patch

> NPE when reading Parquet file when getting old timestamp configuration
> --
>
> Key: HIVE-16567
> URL: https://issues.apache.org/jira/browse/HIVE-16567
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Priority: Blocker
> Attachments: HIVE-16567.01.patch
>
>
> In branch-1.2, the file 
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> throws an NPE on line 148:
> {code}
>  boolean skipConversion = 
> Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
> {code}
> when the metadata reference is null.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
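The NPE above fires because `metadata.get(...)` is invoked on a null map; the natural fix is to guard the map before reading the skip-conversion flag and fall back to `false`. A minimal illustration, assuming the config key string (the helper and parameter names are illustrative, not the ETypeConverter code):

```java
import java.util.Map;

// Sketch: null-safe variant of the flag lookup that NPE'd in ETypeConverter.
public class SkipConversionGuard {
    // Assumed varname of HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.
    static final String FLAG = "hive.parquet.timestamp.skip.conversion";

    // Boolean.parseBoolean(null) is false, so a missing key is also safe;
    // the guard handles the metadata map itself being null.
    static boolean skipConversion(Map<String, String> metadata) {
        return metadata != null && Boolean.parseBoolean(metadata.get(FLAG));
    }
}
```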

