[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142481#comment-15142481
 ] 

Hive QA commented on HIVE-12963:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787156/HIVE-12963.4.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9758 tests executed
*Failed tests:*
{noformat}
TestCliDriver-ppd_union.q-udf_var_samp.q-custom_input_output_format.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_limit_extrastep
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_extrastep
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_limit_extrastep
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown_extrastep
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_pushdown
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6940/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6940/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6940/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787156 - PreCommit-HIVE-TRUNK-Build

> LIMIT statement with SORT BY creates additional MR job with hardcoded only 
> one reducer
> --
>
> Key: HIVE-12963
> URL: https://issues.apache.org/jira/browse/HIVE-12963
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.0.0, 1.2.1, 0.13
> Reporter: Alina Abramova
> Assignee: Alina Abramova
> Attachments: HIVE-12963.1.patch, HIVE-12963.2.patch, 
> HIVE-12963.3.patch, HIVE-12963.4.patch
>
>
> I execute the query:
> hive> select age from test1 sort by age.age  limit 10;  
> Total jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
> When the table has a large number of rows, the last stage of the job takes a 
> long time. I think we could let the user choose the number of reducers for the 
> last job, or skip the extra MR job altogether.
> I observed the same behavior with this query:
> hive> create table new_test as select age from test1 group by age.age limit 10;
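
A rough way to picture the two jobs described above is a per-reducer sort followed by a single merge step that applies the LIMIT. The sketch below is only a toy model in Python, not Hive code; the function names local_top_n and global_limit and the sample partitions are assumptions made for illustration.

```python
import heapq
from itertools import islice


def local_top_n(partition, n):
    """Job 1 (per reducer): sort one partition and keep at most n rows."""
    return sorted(partition)[:n]


def global_limit(sorted_runs, n):
    """Job 2 (single task): merge the sorted runs and apply LIMIT n."""
    return list(islice(heapq.merge(*sorted_runs), n))


# Hypothetical data split across three reducers.
partitions = [[42, 7, 19], [3, 88, 21], [15, 1, 64]]
runs = [local_top_n(p, 2) for p in partitions]
result = global_limit(runs, 2)
print(result)
```

Because each run is already cut down to n rows, the final merge sees at most n rows per reducer. That is one reason a single final task can be acceptable when n is small, while the cost of the first job still grows with the input size.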



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-02-11 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-12749:

Attachment: HIVE-12749.2.patch

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.0.0, 1.2.0
> Reporter: Oleksiy Sayankin
> Assignee: Oleksiy Sayankin
> Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch
>
>
> h2. STEP 1. Create and upload test data
> Execute on the command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2 and #3 I expected '000126' because the original type of x is 
> STRING in the stest table.
> Note that setting hive.optimize.constant.propagation=false works around the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555
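
The reported symptom can be modelled in a few lines of Python. This is a toy illustration, not Hive's optimizer code: it assumes the filter compares numerically and that the unsafe rewrite substitutes the filter constant for the projected column. The function names are hypothetical.

```python
# Values of the STRING column x from stest.data.
rows = ["000126", "000126", "000126", "000474", "000468", "000272"]
CONSTANT = 126


def matches(x):
    # The predicate compares numerically, so '000126' matches 126.
    return int(x) == CONSTANT


def select_with_naive_propagation(values):
    # Unsafe rewrite: the projected x is replaced by the filter constant,
    # which loses the original string formatting.
    return [str(CONSTANT) for x in values if matches(x)]


def select_without_propagation(values):
    # Safe behaviour: the stored string value is returned unchanged.
    return [x for x in values if matches(x)]


print(select_with_naive_propagation(rows))
print(select_without_propagation(rows))
```

The two functions select the same rows but differ in what they emit, which mirrors the ACTUAL and EXPECTED results above.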





[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144185#comment-15144185
 ] 

Hive QA commented on HIVE-13039:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787599/HIVE-13039.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9762 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.io.parquet.TestParquetRecordReaderWrapper.testBuilder
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression3
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression5
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6952/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6952/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6952/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787599 - PreCommit-HIVE-TRUNK-Build

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 1.2.1, 2.0.0
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch, HIVE-13039.2.patch
>
>
> BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No rows are returned in a cluster.
> If you turn off hive.optimize.ppd, one row is returned.
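
The difference between the expected and the observed behaviour can be shown with two small predicates. The Python sketch below only models the symptom (inclusive versus exclusive bounds); it makes no claim about where in the Parquet pushdown path the bounds go wrong, and the function names are hypothetical.

```python
def between_inclusive(value, low, high):
    # SQL semantics: x BETWEEN low AND high means low <= x AND x <= high.
    return low <= value <= high


def between_exclusive(value, low, high):
    # Behaviour matching the report: both bounds are treated as strict.
    return low < value < high


# The single row inserted in the reproduction above: (key, ldate).
rows = [(1, "2016-02-03")]
low = high = "2016-02-03"

expected = [r for r in rows if between_inclusive(r[1], low, high)]
observed = [r for r in rows if between_exclusive(r[1], low, high)]
print(expected, observed)
```

When both bounds are equal, strict comparison can never match, which is why this reproduction returns nothing under the faulty behaviour.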





[jira] [Commented] (HIVE-13033) SPDO unnecessarily duplicates columns in key & value of mapper output

2016-02-11 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144208#comment-15144208
 ] 

Ashutosh Chauhan commented on HIVE-13033:
-

[~prasanth_j] The patch is ready for review. The failures are golden file 
updates for the Tez CLI driver, matching its CLI driver counterpart.

> SPDO unnecessarily duplicates columns in key & value of mapper output
> -
>
> Key: HIVE-13033
> URL: https://issues.apache.org/jira/browse/HIVE-13033
> Project: Hive
> Issue Type: Improvement
> Components: Logical Optimizer
> Affects Versions: 2.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-13033.1.patch
>
>






[jira] [Commented] (HIVE-13033) SPDO unnecessarily duplicates columns in key & value of mapper output

2016-02-11 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144231#comment-15144231
 ] 

Prasanth Jayachandran commented on HIVE-13033:
--

LGTM, +1

> SPDO unnecessarily duplicates columns in key & value of mapper output
> -
>
> Key: HIVE-13033
> URL: https://issues.apache.org/jira/browse/HIVE-13033
> Project: Hive
> Issue Type: Improvement
> Components: Logical Optimizer
> Affects Versions: 2.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-13033.1.patch
>
>






[jira] [Updated] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-02-11 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-12749:

Assignee: Aleksey Vovchenko  (was: Oleksiy Sayankin)

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.0.0, 1.2.0
> Reporter: Oleksiy Sayankin
> Assignee: Aleksey Vovchenko
> Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch
>
>
> h2. STEP 1. Create and upload test data
> Execute on the command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2 and #3 I expected '000126' because the original type of x is 
> STRING in the stest table.
> Note that setting hive.optimize.constant.propagation=false works around the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555





[jira] [Commented] (HIVE-13028) Remove javadoc plugin from webhcat

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142816#comment-15142816
 ] 

Hive QA commented on HIVE-13028:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787195/HIVE-13028.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9753 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHadoopVersion
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getPigVersion
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getStatus
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.invalidPath
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6942/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6942/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6942/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787195 - PreCommit-HIVE-TRUNK-Build

> Remove javadoc plugin from webhcat
> --
>
> Key: HIVE-13028
> URL: https://issues.apache.org/jira/browse/HIVE-13028
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-13028.patch
>
>
> Webhcat has about 3 million javadoc errors that nobody, presumably, cares 
> about. It also has its very own javadoc section in the pom that causes mvn 
> deploy to fail with 3 million javadoc errors, even when maven.javadoc.skip is 
> true. 





[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142854#comment-15142854
 ] 

Yongzhi Chen commented on HIVE-13039:
-

[~spena], could you review the change? Thanks

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 1.2.1, 2.0.0
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch
>
>
> BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No rows are returned in a cluster.
> If you turn off hive.optimize.ppd, one row is returned.





[jira] [Updated] (HIVE-13043) Reload function has no impact to function registry

2016-02-11 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-13043:
---
Attachment: HIVE-13043.1.patch

Attached a simple fix so that "reload function" is handled as a SQL operation 
similar to create/drop function, instead of a reload command.

> Reload function has no impact to function registry
> --
>
> Key: HIVE-13043
> URL: https://issues.apache.org/jira/browse/HIVE-13043
> Project: Hive
> Issue Type: Bug
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Attachments: HIVE-13043.1.patch
>
>
> With HIVE-2573, users should run "reload function" to refresh the cached 
> function registry. However, "reload function" has no effect at all. We need 
> to fix this. Otherwise, HS2 needs to be restarted to see new global functions.





[jira] [Comment Edited] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143395#comment-15143395
 ] 

Sergey Shelukhin edited comment on HIVE-12749 at 2/11/16 8:01 PM:
--

This seems to disable constant propagation pretty much everywhere in the q 
files.
1) It seems like it should be safe to compare string with string. E.g. in 
input_part4 "WHERE x.ds = '2008-04-08'" stopped triggering the optimization, 
even though ds is string.
2) At least the q files that purport to test constant propagation 
(constprog-something, cbo_const, etc.) need to be updated to change the types 
so it still triggers.


was (Author: sershe):
This seems to disable constant propagation pretty much everywhere in the q 
files.
1) It seems like it should be safe to compare string with string. E.g. in 
input_part4 WHERE x.ds = '2008-04-08' stopped being propagated, even though ds 
is string.
2) At least the q files that purport to test constant propagation 
(constprog-something, cbo_const, etc.) need to be updated to change the types 
so it still triggers.

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.0.0, 1.2.0
> Reporter: Oleksiy Sayankin
> Assignee: Oleksiy Sayankin
> Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch
>
>
> h2. STEP 1. Create and upload test data
> Execute on the command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2 and #3 I expected '000126' because the original type of x is 
> STRING in the stest table.
> Note that setting hive.optimize.constant.propagation=false works around the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555





[jira] [Commented] (HIVE-8214) Release 0.13.1 missing hwi-war file

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143357#comment-15143357
 ] 

Hive QA commented on HIVE-8214:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12680806/HIVE-8214.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9769 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6945/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6945/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6945/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12680806 - PreCommit-HIVE-TRUNK-Build

> Release 0.13.1 missing hwi-war file
> ---
>
> Key: HIVE-8214
> URL: https://issues.apache.org/jira/browse/HIVE-8214
> Project: Hive
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 0.13.1
> Reporter: Naimdjon Takhirov
> Priority: Minor
> Labels: HIVE-8214.1.patch, branch-0.14, trunk
> Attachments: HIVE-8214.1.patch, HIVE-8214.2.patch
>
>
> Starting Hive with the --service hwi option:
> $opt/hive/latest: hive --service hwi
> ls: /opt/hive/latest/lib/hive-hwi-*.war: No such file or directory
> 14/09/22 11:43:46 INFO hwi.HWIServer: HWI is starting up
> 14/09/22 11:43:46 INFO mortbay.log: Logging to 
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via 
> org.mortbay.log.Slf4jLog
> 14/09/22 11:43:46 INFO mortbay.log: jetty-6.1.26
> 14/09/22 11:43:47 INFO mortbay.log: Started SocketConnector@0.0.0.0:
> When navigating to localhost:, it just shows the directory index. Looking 
> at the distribution, the war file is missing in the lib directory.





[jira] [Commented] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143391#comment-15143391
 ] 

Hive QA commented on HIVE-10468:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787494/HIVE-10468.9.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 42 tests executed
*Failed tests:*
{noformat}
Test failed: mysql/upgrade-1.2.0-to-2.0.0.mysql.sql
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/110/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/110/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-110/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Test failed: mysql/upgrade-1.2.0-to-2.0.0.mysql.sql
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787494 - PreCommit-HIVE-METASTORE-Test

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.2.patch, 
> HIVE-10468.3.patch, HIVE-10468.4.patch, HIVE-10468.5.patch, 
> HIVE-10468.6.patch, HIVE-10468.7.patch, HIVE-10468.9.patch, 
> HIVE-10468.9.patch, HIVE-10468.9.patch, HIVE-10468.9.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> there are no 64-bit Debian packages for oracle-xe, the apt-get install fails 
> on the AWS systems.





[jira] [Updated] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13039:

Attachment: HIVE-13039.2.patch

Thanks [~spena] for reviewing the code. I have attached patch 2 with the new tests.

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 1.2.1, 2.0.0
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch, HIVE-13039.2.patch
>
>
> BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No rows are returned in a cluster.
> If you turn off hive.optimize.ppd, one row is returned.





[jira] [Commented] (HIVE-12998) ORC FileDump.printJsonData() does not close RecordReader

2016-02-11 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143443#comment-15143443
 ] 

Gopal V commented on HIVE-12998:


Pushed to master, signed-off

> ORC FileDump.printJsonData() does not close RecordReader
> 
>
> Key: HIVE-12998
> URL: https://issues.apache.org/jira/browse/HIVE-12998
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Reporter: Jason Dere
> Assignee: Jason Dere
> Fix For: 2.1.0
>
> Attachments: HIVE-12998.1.patch, HIVE-12998.2.patch
>
>
> This causes TestFileDump to fail on Windows, because the test ORC file does 
> not get deleted between tests.
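
The general pattern behind this kind of fix is to guarantee that the reader is closed on every exit path. The Python sketch below illustrates that pattern with a stub; StubRecordReader and print_json_data are hypothetical stand-ins, not the ORC classes, and the actual patch is Java (where try-with-resources or a finally block plays the same role).

```python
from contextlib import closing


class StubRecordReader:
    """Hypothetical stand-in for a reader that holds an open file handle."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.closed = False

    def __iter__(self):
        return iter(self._rows)

    def close(self):
        self.closed = True


def print_json_data(reader):
    # closing() calls reader.close() on exit, even if iteration raises,
    # so the underlying file can be deleted afterwards.
    with closing(reader):
        return [repr(row) for row in reader]


reader = StubRecordReader([1, 2])
lines = print_json_data(reader)
print(lines, reader.closed)
```

On platforms that refuse to delete files with open handles, a missed close() like the one described above shows up as leftover test files rather than as an explicit error.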





[jira] [Updated] (HIVE-13044) Enable TLS encryption to HMS backend database

2016-02-11 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13044:

Description: When a database such as MySQL enables TLS/SSL encryption, we 
should provide configuration properties, like the ones for HS2, to enable 
it. Right now, I think it can be enabled through javaopts and the connection 
URL.  (was: When the database like mysql enables TLS/SSL encryption, we should 
provide some configuration properties like the ones to HS2 to enable that. )

> Enable TLS encryption to HMS backend database
> -
>
> Key: HIVE-13044
> URL: https://issues.apache.org/jira/browse/HIVE-13044
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 2.1.0
> Reporter: Aihua Xu
> Assignee: Aihua Xu
>
> When a database such as MySQL enables TLS/SSL encryption, we should provide 
> configuration properties, like the ones for HS2, to enable it. Right now, 
> I think it can be enabled through javaopts and the connection URL.





[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler

2016-02-11 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143336#comment-15143336
 ] 

Alan Gates commented on HIVE-13013:
---

This will take me a bit to comb through completely, as the changes are pretty 
in-depth. One piece of early feedback: it would be useful to distinguish the 
new lock() and unlock() functions with names like derbyLock() or localLock(), 
since there are multiple types of locks floating around here.

> Further Improve concurrency in TxnHandler
> -
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13013.patch
>
>
> There are still a few operations in TxnHandler that run at Serializable 
> isolation.
> Most or all of them can be dropped to READ_COMMITTED now that we have SELECT 
> ... FOR UPDATE support.  This will reduce number of deadlocks in the DBs.





[jira] [Commented] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143395#comment-15143395
 ] 

Sergey Shelukhin commented on HIVE-12749:
-

This seems to disable constant propagation pretty much everywhere in the q 
files.
1) It seems like it should be safe to compare string with string. E.g. in 
input_part4 WHERE x.ds = '2008-04-08' stopped being propagated, even though ds 
is string.
2) At least the q files that purport to test constant propagation 
(constprog-something, cbo_const, etc.) need to be updated to change the types 
so it still triggers.

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
> Attachments: HIVE-12749.1.patch, HIVE-12749.2.patch
>
>
> h2. STEP 1. Create and upload test data
> Execute in command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2 and #3 I expected '000126' because the original type of x in the 
> stest table is STRING.
> Note, setting hive.optimize.constant.propagation=false fixes the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555
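
The expected behavior in steps 2 and 3 can be modelled outside Hive. The sketch below is plain Python, not Hive code: `select_x` and `select_x_folded` are hypothetical helpers, and the rows are the contents of stest.data. The first helper uses the cast only to evaluate the predicate and returns the stored string; the second mimics the reported result, where the constant ends up in the projection.

```python
# Rows from stest.data as (x, y) string pairs.
ROWS = [
    ("000126", "000777"),
    ("000126", "000778"),
    ("000126", "000779"),
    ("000474", "000888"),
    ("000468", "000889"),
    ("000272", "000880"),
]


def select_x(rows, value):
    """Model of: select x from stest where cast(x as int) = value.

    The cast is applied only inside the predicate; the projected
    column keeps its original STRING form.
    """
    return [x for x, _ in rows if int(x) == value]


def select_x_folded(rows, value):
    """Model of the reported result: the constant replaces the column,
    so the projection emits the literal instead of the stored string."""
    return [str(value) for x, _ in rows if int(x) == value]


print(select_x(ROWS, 126))
print(select_x_folded(ROWS, 126))
```

The difference between the two helpers is only in what gets projected, which is why the row count is the same in the expected and actual results above.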





[jira] [Updated] (HIVE-12998) ORC FileDump.printJsonData() does not close RecordReader

2016-02-11 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12998:
---
Fix Version/s: 2.1.0

> ORC FileDump.printJsonData() does not close RecordReader
> 
>
> Key: HIVE-12998
> URL: https://issues.apache.org/jira/browse/HIVE-12998
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 2.1.0
>
> Attachments: HIVE-12998.1.patch, HIVE-12998.2.patch
>
>
> This causes TestFileDump to fail on Windows, because the test ORC file does 
> not get deleted between tests.





[jira] [Updated] (HIVE-13017) Child process of HiveServer2 fails to get delegation token from non default FileSystem

2016-02-11 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13017:

Attachment: HIVE-13017.3.patch

Updated,  mvn checkstyle:checkstyle-aggregate shows no issues now.

> Child process of HiveServer2 fails to get delegation token from non default 
> FileSystem
> --
>
> Key: HIVE-13017
> URL: https://issues.apache.org/jira/browse/HIVE-13017
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Affects Versions: 1.2.1
> Environment: Secure 
>Reporter: Takahiko Saito
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13017.2.patch, HIVE-13017.3.patch, HIVE-13017.patch
>
>
> The following query fails when the Azure file system is used as the default file 
> system and HDFS is used for intermediate data.
> {noformat}
> >>>  create temporary table s10k stored as orc as select * from studenttab10k;
> >>>  create temporary table v10k as select * from votertab10k;
> >>>  select registration 
> from s10k s join v10k v 
> on (s.name = v.name) join studentparttab30k p 
> on (p.name = v.name) 
> where s.age < 25 and v.age < 25 and p.age < 25;
> ERROR : Execution failed with exit status: 2
> ERROR : Obtaining error information
> ERROR : 
> Task failed!
> Task ID:
>   Stage-5
> Logs:
> ERROR : /var/log/hive/hiveServer2.log
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask (state=08S01,code=2)
> Aborting command set because "force" is false and command failed: "select 
> registration 
> from s10k s join v10k v 
> on (s.name = v.name) join studentparttab30k p 
> on (p.name = v.name) 
> where s.age < 25 and v.age < 25 and p.age < 25;"
> Closing: 0: 
> jdbc:hive2://zk2-hs21-h.hdinsight.net:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@hdinsight.net;transportMode=http;httpPath=cliservice
> hiveServer2.log shows:
> 2016-02-02 18:04:34,182 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,199 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,212 INFO  [HiveServer2-HttpHandler-Pool: Thread-55]: 
> thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(127)) - Could not 
> validate cookie sent, will try to generate a new cookie
> 2016-02-02 18:04:34,213 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> ql.Driver (Driver.java:checkConcurrency(168)) - Concurrency mode is disabled, 
> not creating a lock manager
> 2016-02-02 18:04:34,219 INFO  [HiveServer2-HttpHandler-Pool: Thread-55]: 
> thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(352)) - 
> Failed to authenticate with http/_HOST kerberos principal, trying with 
> hive/_HOST kerberos principal
> 2016-02-02 18:04:34,219 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,225 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> ql.Driver (Driver.java:execute(1390)) - Setting caller context to query id 
> hive_20160202180429_76ab-64d6-4c89-88b0-6355cc5acbd0
> 2016-02-02 18:04:34,226 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> ql.Driver (Driver.java:execute(1393)) - Starting 
> command(queryId=hive_20160202180429_76ab-64d6-4c89-88b0-6355cc5acbd0): 
> select registration
> from s10k s join v10k v
> on (s.name = v.name) join studentparttab30k p
> on (p.name = v.name)
> where s.age < 25 and v.age < 25 and p.age < 25
> 2016-02-02 18:04:34,228 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> hooks.ATSHook (ATSHook.java:<init>(90)) - Created ATS Hook
> 2016-02-02 18:04:34,229 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=PreHook.org.apache.hadoop.hive.ql.hooks.ATSHook 
> from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,237 INFO  [HiveServer2-HttpHandler-Pool: Thread-55]: 
> thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(169)) - Cookie added 
> for clientUserName hrt_qa
> 2016-02-02 18:04:34,238 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - </PERFLOG method=PreHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1454436274229 
> end=1454436274238 duration=9 from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,239 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) -  

[jira] [Commented] (HIVE-13043) Reload function has no impact to function registry

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143305#comment-15143305
 ] 

Sergey Shelukhin commented on HIVE-13043:
-

I remember seeing a reload path in FunctionTask(?) that didn't look like it 
could trigger. Does this make the command go to that path? 

> Reload function has no impact to function registry
> --
>
> Key: HIVE-13043
> URL: https://issues.apache.org/jira/browse/HIVE-13043
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-13043.1.patch
>
>
> With HIVE-2573, users should run "reload function" to refresh cached function 
> registry. However, "reload function" has no impact at all. We need to fix 
> this. Otherwise, HS2 needs to be restarted to see new global functions.





[jira] [Commented] (HIVE-13043) Reload function has no impact to function registry

2016-02-11 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143412#comment-15143412
 ] 

Jimmy Xiang commented on HIVE-13043:


Yes, this makes the command go to the FunctionTask path and trigger the 
Hive::reloadFunction call.

> Reload function has no impact to function registry
> --
>
> Key: HIVE-13043
> URL: https://issues.apache.org/jira/browse/HIVE-13043
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-13043.1.patch
>
>
> With HIVE-2573, users should run "reload function" to refresh cached function 
> registry. However, "reload function" has no impact at all. We need to fix 
> this. Otherwise, HS2 needs to be restarted to see new global functions.





[jira] [Commented] (HIVE-13041) Backport to branch-1 HIVE-9862 Vectorized execution corrupts timestamp values

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142490#comment-15142490
 ] 

Hive QA commented on HIVE-13041:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787430/HIVE-13041.2-branch1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BRANCH_1-Build/24/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BRANCH_1-Build/24/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-BRANCH_1-Build-24/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[WARNING] 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/tmp/conf
 [copy] Copying 11 files to 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
spark-client ---
[INFO] Compiling 5 source files to 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/test-classes
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/spark-client-1.3.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/target/spark-client-1.3.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/1.3.0-SNAPSHOT/spark-client-1.3.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-branch1-source/spark-client/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/1.3.0-SNAPSHOT/spark-client-1.3.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 1.3.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-github-branch1-source/ql/target
[INFO] Deleting /data/hive-ptest/working/apache-github-branch1-source/ql 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-exec ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-branch1-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-branch1-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-branch1-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-branch1-source/ql/src/gen/protobuf/gen-java
 added.
[INFO] Source directory: 

[jira] [Commented] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142622#comment-15142622
 ] 

Hive QA commented on HIVE-10632:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787175/HIVE-10632.3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9770 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveAuthorizerShowFilters.org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveAuthorizerShowFilters
org.apache.hadoop.hive.ql.txn.compactor.TestCleaner.droppedPartition
org.apache.hadoop.hive.ql.txn.compactor.TestCleaner.droppedTable
org.apache.hadoop.hive.ql.txn.compactor.TestCleaner2.droppedPartition
org.apache.hadoop.hive.ql.txn.compactor.TestCleaner2.droppedTable
org.apache.hadoop.hive.ql.txn.compactor.TestWorker.droppedPartition
org.apache.hadoop.hive.ql.txn.compactor.TestWorker.droppedTable
org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.droppedPartition
org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.droppedTable
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.testPigPopulation
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testConnectionMismatch
org.apache.hive.jdbc.TestSSL.testInvalidConfig
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithURL
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6941/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6941/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6941/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787175 - PreCommit-HIVE-TRUNK-Build

> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before 
> compaction.
> ---
>
> Key: HIVE-10632
> URL: https://issues.apache.org/jira/browse/HIVE-10632
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-10632.1.patch, HIVE-10632.2.patch, 
> HIVE-10632.3.patch
>
>
> The compaction process will clean up entries in  TXNS, 
> COMPLETED_TXN_COMPONENTS, TXN_COMPONENTS.  If the table/partition is dropped 
> before compaction is complete there will be data left in these tables.  Need 
> to investigate if there are other situations where this may happen and 
> address it.
> see HIVE-10595 for additional info





[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing

2016-02-11 Thread Akshay Goyal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142658#comment-15142658
 ] 

Akshay Goyal commented on HIVE-4570:


[~cwsteinbach] Review request: https://reviews.apache.org/r/42134/

> More information to user on GetOperationStatus in Hive Server2 when query is 
> still executing
> 
>
> Key: HIVE-4570
> URL: https://issues.apache.org/jira/browse/HIVE-4570
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Akshay Goyal
> Attachments: HIVE-4570.01.patch, HIVE-4570.02.patch, 
> HIVE-4570.03.patch
>
>
> Currently in Hive Server2, when the query is still executing only the status 
> is set as STILL_EXECUTING. 
> This issue is to give more information to the user such as progress and 
> running job handles, if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12994) Implement support for NULLS FIRST/NULLS LAST

2016-02-11 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12994:
---
Attachment: HIVE-12994.04.patch

Rebasing patch.

> Implement support for NULLS FIRST/NULLS LAST
> 
>
> Key: HIVE-12994
> URL: https://issues.apache.org/jira/browse/HIVE-12994
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Metastore, Parser, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12994.01.patch, HIVE-12994.02.patch, 
> HIVE-12994.03.patch, HIVE-12994.04.patch, HIVE-12994.patch
>
>
> From SQL:2003, the NULLS FIRST and NULLS LAST options can be used to 
> determine whether nulls appear before or after non-null data values when the 
> ORDER BY clause is used.
> The SQL standard does not specify the default behavior. Currently in Hive, 
> null values sort as if lower than any non-null value; that is, NULLS FIRST is 
> the default for ASC order, and NULLS LAST for DESC order.
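
The ordering rules described above can be sketched in a few lines. This is a Python model of the semantics only, not Hive's implementation; `order_by` and its parameters are hypothetical names, and `None` stands in for SQL NULL. When `nulls_first` is left unset, the sketch applies the default stated in the description: NULLS FIRST for ascending order and NULLS LAST for descending order.

```python
def order_by(values, descending=False, nulls_first=None):
    """Sort values, placing None (NULL) before or after the non-null values.

    If nulls_first is None, fall back to the default described above:
    NULL sorts as if lower than any non-null value, which gives
    NULLS FIRST for ASC and NULLS LAST for DESC.
    """
    if nulls_first is None:
        nulls_first = not descending
    non_null = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return nulls + non_null if nulls_first else non_null + nulls


data = [3, None, 1]
print(order_by(data))                                     # ASC, default null placement
print(order_by(data, descending=True))                    # DESC, default null placement
print(order_by(data, nulls_first=False))                  # ASC NULLS LAST
print(order_by(data, descending=True, nulls_first=True))  # DESC NULLS FIRST
```

Keeping the null placement separate from the sort direction is what lets the explicit NULLS FIRST / NULLS LAST options override the default in either direction.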





[jira] [Updated] (HIVE-11509) getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE column

2016-02-11 Thread David Zanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Zanter updated HIVE-11509:

Affects Version/s: 1.2.1

> getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE 
> column
> -
>
> Key: HIVE-11509
> URL: https://issues.apache.org/jira/browse/HIVE-11509
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.0.0, 1.1.0, 1.2.1
>Reporter: reena upadhyay
>Priority: Critical
>
> I am executing a simple select query on a table with a uniontype column. Hive 
> already supports creating tables with the uniontype data type. Now when I 
> try to fetch the column type information using the getColumnTypeName(i+1) 
> method of ResultSetMetaData with hive-jdbc version 1.0.0, I get the exception 
> below:
> 2015-08-10 16:00:07 ERROR ExecuteStatementOperation:114 - 
> java.sql.SQLException: Unrecognized column type: UNIONTYPE
>   at 
> org.apache.hive.jdbc.JdbcColumn.getColumnTypeName(JdbcColumn.java:185)
>   at 
> org.apache.hive.jdbc.HiveResultSetMetaData.getColumnTypeName(HiveResultSetMetaData.java:78)
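
The failure mode in the stack trace, a type-name lookup that has no entry for one type, can be illustrated with a small model. This is a Python sketch under stated assumptions, not the hive-jdbc code: `TYPE_NAMES` and `column_type_name` are hypothetical, and the table lists only a handful of types for illustration.

```python
# Hypothetical lookup table; NOT the actual hive-jdbc mapping.
TYPE_NAMES = {
    "STRING": "string",
    "INT": "int",
    "ARRAY": "array",
    "MAP": "map",
    "STRUCT": "struct",
    "UNIONTYPE": "uniontype",  # the kind of entry whose absence produces the error
}


def column_type_name(hive_type):
    """Return the display name for a type, or fail like the reported error."""
    key = hive_type.strip().upper()
    if key not in TYPE_NAMES:
        raise ValueError("Unrecognized column type: " + hive_type)
    return TYPE_NAMES[key]


print(column_type_name("uniontype"))
```

With the `UNIONTYPE` entry removed from the table, the same call raises the "Unrecognized column type" error shown in the report.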





[jira] [Commented] (HIVE-11509) getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE column

2016-02-11 Thread David Zanter (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142758#comment-15142758
 ] 

David Zanter commented on HIVE-11509:
-

And also Hive 1.2.1 (hortonworks 2.3)

> getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE 
> column
> -
>
> Key: HIVE-11509
> URL: https://issues.apache.org/jira/browse/HIVE-11509
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.0.0, 1.1.0, 1.2.1
>Reporter: reena upadhyay
>Priority: Critical
>
> I am executing a simple select query on a table with a uniontype column. Hive 
> already supports creating tables with the uniontype data type. Now when I 
> try to fetch the column type information using the getColumnTypeName(i+1) 
> method of ResultSetMetaData with hive-jdbc version 1.0.0, I get the exception 
> below:
> 2015-08-10 16:00:07 ERROR ExecuteStatementOperation:114 - 
> java.sql.SQLException: Unrecognized column type: UNIONTYPE
>   at 
> org.apache.hive.jdbc.JdbcColumn.getColumnTypeName(JdbcColumn.java:185)
>   at 
> org.apache.hive.jdbc.HiveResultSetMetaData.getColumnTypeName(HiveResultSetMetaData.java:78)





[jira] [Updated] (HIVE-11509) getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE column

2016-02-11 Thread David Zanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Zanter updated HIVE-11509:

Affects Version/s: 1.1.0

> getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE 
> column
> -
>
> Key: HIVE-11509
> URL: https://issues.apache.org/jira/browse/HIVE-11509
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.0.0, 1.1.0
>Reporter: reena upadhyay
>Priority: Critical
>
> I am executing a simple select query on a table with a uniontype column. Hive 
> already supports creating tables with the uniontype data type. Now when I 
> try to fetch the column type information using the getColumnTypeName(i+1) 
> method of ResultSetMetaData with hive-jdbc version 1.0.0, I get the exception 
> below:
> 2015-08-10 16:00:07 ERROR ExecuteStatementOperation:114 - 
> java.sql.SQLException: Unrecognized column type: UNIONTYPE
>   at 
> org.apache.hive.jdbc.JdbcColumn.getColumnTypeName(JdbcColumn.java:185)
>   at 
> org.apache.hive.jdbc.HiveResultSetMetaData.getColumnTypeName(HiveResultSetMetaData.java:78)





[jira] [Commented] (HIVE-11509) getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE column

2016-02-11 Thread David Zanter (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142732#comment-15142732
 ] 

David Zanter commented on HIVE-11509:
-

I'm also able to reproduce this on Hive 1.1.0 (cloudera 5.4)

> getColumnTypeName method of ResultSetMetaData does not works for UNIONTYPE 
> column
> -
>
> Key: HIVE-11509
> URL: https://issues.apache.org/jira/browse/HIVE-11509
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.0.0, 1.1.0
>Reporter: reena upadhyay
>Priority: Critical
>
> I am executing a simple select query on a table with a uniontype column. Hive 
> already supports creating tables with the uniontype data type. Now when I 
> try to fetch the column type information using the getColumnTypeName(i+1) 
> method of ResultSetMetaData with hive-jdbc version 1.0.0, I get the exception 
> below:
> 2015-08-10 16:00:07 ERROR ExecuteStatementOperation:114 - 
> java.sql.SQLException: Unrecognized column type: UNIONTYPE
>   at 
> org.apache.hive.jdbc.JdbcColumn.getColumnTypeName(JdbcColumn.java:185)
>   at 
> org.apache.hive.jdbc.HiveResultSetMetaData.getColumnTypeName(HiveResultSetMetaData.java:78)





[jira] [Commented] (HIVE-13013) Further Improve concurrency in TxnHandler

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142966#comment-15142966
 ] 

Hive QA commented on HIVE-13013:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787206/HIVE-13013.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9769 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestTxnHandlerNegative.testBadConnection
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6943/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6943/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6943/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787206 - PreCommit-HIVE-TRUNK-Build

> Further Improve concurrency in TxnHandler
> -
>
> Key: HIVE-13013
> URL: https://issues.apache.org/jira/browse/HIVE-13013
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13013.patch
>
>
> There are still a few operations in TxnHandler that run at Serializable 
> isolation.
> Most or all of them can be dropped to READ_COMMITTED now that we have SELECT 
> ... FOR UPDATE support.  This will reduce the number of deadlocks in the DBs.





[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142910#comment-15142910
 ] 

Sergio Peña commented on HIVE-13039:


Thanks [~ychena]. The patch looks good. 
Could you add some test cases to {{TestParquetFilterPredicate}}  as well? It 
would be great if you can include different examples, like {{BETWEEN 5 and 1}}, 
{{BETWEEN 1 and 5}}, {{BETWEEN 1 and 1}} to see how the Filter predicate would 
be.

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch
>
>
> BETWEEN becomes exclusive in parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.
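
The inclusive semantics that the query relies on, and the three cases suggested in the review comment, can be written down as a small check. This is a Python sketch of the predicate only; `between` and `between_exclusive` are hypothetical helpers and say nothing about how the Parquet filter is actually built.

```python
def between(value, low, high):
    """SQL BETWEEN: inclusive on both ends, i.e. low <= value AND value <= high."""
    return low <= value <= high


def between_exclusive(value, low, high):
    """Model of the reported behavior, where both bounds act as exclusive."""
    return low < value < high


ldate = "2016-02-03"
print(between(ldate, "2016-02-03", "2016-02-03"))            # the row should match
print(between_exclusive(ldate, "2016-02-03", "2016-02-03"))  # the row is dropped

# The cases from the review comment:
print(between(1, 1, 5), between(5, 1, 5))   # BETWEEN 1 AND 5 includes both bounds
print(between(1, 1, 1))                     # BETWEEN 1 AND 1 matches exactly 1
print(any(between(v, 5, 1) for v in range(0, 7)))  # BETWEEN 5 AND 1 matches nothing
```

The `BETWEEN 1 and 1` case is the one that separates inclusive from exclusive bounds most directly, since an exclusive comparison can never match it.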





[jira] [Commented] (HIVE-13004) Remove encryption shims

2016-02-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142930#comment-15142930
 ] 

Sergio Peña commented on HIVE-13004:


[~ashutoshc] I think we should leave some shim support for encryption, as HDFS 
introduced encryption in 2.6.0. If a user runs HDFS 2.5, then Hive will fail 
because the encryption API does not exist there.

> Remove encryption shims
> ---
>
> Key: HIVE-13004
> URL: https://issues.apache.org/jira/browse/HIVE-13004
> Project: Hive
>  Issue Type: Task
>  Components: Encryption
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13004.patch
>
>
> It has served its purpose. Now that we don't support hadoop-1, it's no longer 
> needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12064) prevent transactional=false

2016-02-11 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12064:
-
Target Version/s: 1.3.0, 2.1.0

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-12064.2.patch, HIVE-12064.3.patch, HIVE-12064.patch
>
>
> Currently a tblproperty transactional=true must be set to make a table behave 
> in an ACID-compliant way.
> This is misleading in that it seems like changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table is different 
> from that of plain tables. So changing this property may cause wrong data to 
> be returned.
> We should prevent transactional=false.
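
The check requested above could look roughly like the following. This is only a sketch under stated assumptions, not the attached patch: the class and method names are hypothetical, and both maps are assumed to hold the complete table properties before and after the ALTER.

```java
import java.util.Map;

// Hypothetical helper; not part of Hive's metastore code.
final class TransactionalPropertyCheck {
    private static final String KEY = "transactional";

    /**
     * Rejects an alter that would turn an ACID table into a non-ACID one.
     * Both maps are assumed to be the full property sets (old and resulting).
     */
    static void validateAlter(Map<String, String> oldProps, Map<String, String> newProps) {
        boolean wasAcid = "true".equalsIgnoreCase(oldProps.get(KEY));
        boolean staysAcid = "true".equalsIgnoreCase(newProps.get(KEY));
        if (wasAcid && !staysAcid) {
            throw new IllegalArgumentException(
                "transactional=true cannot be unset or changed once a table is ACID");
        }
    }
}
```

Non-ACID tables pass through unchanged, so only the true-to-false (or true-to-unset) transition is blocked.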





[jira] [Updated] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-11 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12935:
-
Attachment: HIVE-12935.3.patch

Patch rebased. Also restored UUID for worker identity.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch
>
>
> The existing YARN registry service for cluster membership has to depend on 
> refresh intervals to get the list of instances/daemons that are running in 
> the cluster. A better approach would be to replace it with a ZooKeeper-based 
> registry service so that custom listeners can be added to update the health 
> of daemons in the cluster.





[jira] [Commented] (HIVE-12965) Insert overwrite local directory should perserve the overwritten directory permission

2016-02-11 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143573#comment-15143573
 ] 

Chaoyu Tang commented on HIVE-12965:


[~xuefuz] I uploaded a patch in RB which fixed the test failure. Could you take 
a look? Thanks

> Insert overwrite local directory should perserve the overwritten directory 
> permission
> -
>
> Key: HIVE-12965
> URL: https://issues.apache.org/jira/browse/HIVE-12965
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-12965.1.patch, HIVE-12965.2.patch, 
> HIVE-12965.3.patch, HIVE-12965.patch
>
>
> In Hive, "insert overwrite local directory" first deletes the overwritten 
> directory if it exists, recreates a new one, then copies the files from the 
> src directory to the new local directory. This process sometimes changes the 
> permissions of the to-be-overwritten local directory, so that some 
> applications can no longer access its content.
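
One way to keep the permissions across the delete-and-recreate step is sketched below with plain java.nio on a POSIX file system. It is an illustration only, not the attached patch; the class and method names are hypothetical, and the copy of the new files is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; assumes a POSIX file system.
final class OverwritePreservingPermissions {
    /** Deletes and recreates target, restoring the permissions it had before. */
    static void overwriteLocalDirectory(Path target) throws IOException {
        Set<PosixFilePermission> saved = null;
        if (Files.exists(target)) {
            // Remember the permissions before the directory is removed.
            saved = Files.getPosixFilePermissions(target);
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(target)) {
                // Reverse order puts children before their parents.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
        }
        Files.createDirectories(target);
        if (saved != null) {
            // Set explicitly after creation so the process umask cannot narrow them.
            Files.setPosixFilePermissions(target, saved);
        }
    }
}
```

Between the recreate and the restore there is a short window in which the directory carries the default permissions; whether that matters depends on the deployment.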





[jira] [Commented] (HIVE-12935) LLAP: Replace Yarn registry with Zookeeper registry

2016-02-11 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143729#comment-15143729
 ] 

Gopal V commented on HIVE-12935:


Tested this on the cluster for a bit of scale testing and it flexes up/down 
fine.

I'm chaining this after HIVE-12967, so the LlapConfiguration class disappears 
(but easy to fix).

+1 after the UUID change, but I'll rebase after the config changes go in.

[~sseth] left comments on RB on chaining the listeners for the registry, the 
thread-safety for callbacks & possibly about race conditions when an instance 
dies, while a task is still scheduled on it.

The API chaining & splitting the conf out of HS2 (hive.llap.zookeeper.*) can be 
new JIRAs, but the race conditions and thread safety need a closer look before 
this goes in tomorrow.

> LLAP: Replace Yarn registry with Zookeeper registry
> ---
>
> Key: HIVE-12935
> URL: https://issues.apache.org/jira/browse/HIVE-12935
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: 12935.1.patch, HIVE-12935.2.patch, HIVE-12935.3.patch
>
>
> The existing YARN registry service for cluster membership has to depend on 
> refresh intervals to get the list of instances/daemons that are running in 
> the cluster. A better approach would be to replace it with a ZooKeeper-based 
> registry service so that custom listeners can be added to update the health 
> of daemons in the cluster.





[jira] [Commented] (HIVE-12558) LLAP: output QueryFragmentCounters somewhere

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143840#comment-15143840
 ] 

Sergey Shelukhin commented on HIVE-12558:
-

+1

> LLAP: output QueryFragmentCounters somewhere
> 
>
> Key: HIVE-12558
> URL: https://issues.apache.org/jira/browse/HIVE-12558
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12558.1.patch, HIVE-12558.2.patch, 
> HIVE-12558.wip.patch, sample-output.png
>
>
> Right now, LLAP logs counters for every fragment; most of them are IO related 
> and could be very useful, they also include table names so that things like 
> cache hit ratio, etc., could be calculated for every table.
> We need to output them to some metrics system (preserving the breakdown by 
> table, possibly also adding query ID or even stage) so that they'd be usable 
> without grep/sed/awk.





[jira] [Commented] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2016-02-11 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143743#comment-15143743
 ] 

Yongzhi Chen commented on HIVE-9534:


It seems that your method needs a global sort first, and then groups the values 
to get the right answer. That is not what I had in mind: each node sorts and 
groups its own values, then merges them (which requires combining arrays). Does 
your method do a global sort first?

> incorrect result set for query that projects a windowed aggregate
> -
>
> Key: HIVE-9534
> URL: https://issues.apache.org/jira/browse/HIVE-9534
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Reporter: N Campbell
>Assignee: Aihua Xu
> Attachments: HIVE-9534.1.patch, HIVE-9534.2.patch, HIVE-9534.3.patch, 
> HIVE-9534.4.patch
>
>
> Result set returned by Hive has one row instead of 5
> {code}
> select avg(distinct tsint.csint) over () from tsint 
> create table  if not exists TSINT (RNUM int , CSINT smallint)
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS TEXTFILE;
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}
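
For reference, the expected semantics of the query can be sketched outside Hive: with an empty OVER () clause the window is the whole table, so every input row receives the same average of the distinct non-null values. The class and method names below are hypothetical, and the sketch ignores how Hive distributes the work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of avg(distinct x) over (); not Hive code.
final class WindowedAvgDistinct {
    /** Returns one output value per input row; NULLs are ignored by the aggregate. */
    static List<Double> avgDistinctOverAll(List<Integer> column) {
        Set<Integer> distinct = new LinkedHashSet<>();
        for (Integer v : column) {
            if (v != null) {
                distinct.add(v);
            }
        }
        Double avg = null; // stays NULL when there are no non-null values
        if (!distinct.isEmpty()) {
            long sum = 0;
            for (int v : distinct) {
                sum += v;
            }
            avg = (double) sum / distinct.size();
        }
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < column.size(); i++) {
            out.add(avg); // the window covers all rows, so each row gets the same value
        }
        return out;
    }

    public static void main(String[] args) {
        // CSINT values from the report: NULL, -1, 0, 1, 10
        List<Integer> csint = Arrays.asList(null, -1, 0, 1, 10);
        System.out.println(avgDistinctOverAll(csint)); // five rows, each 2.5
    }
}
```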





[jira] [Commented] (HIVE-13017) Child process of HiveServer2 fails to get delegation token from non default FileSystem

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143677#comment-15143677
 ] 

Hive QA commented on HIVE-13017:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787513/HIVE-13017.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9775 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6947/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6947/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6947/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787513 - PreCommit-HIVE-TRUNK-Build

> Child process of HiveServer2 fails to get delegation token from non default 
> FileSystem
> --
>
> Key: HIVE-13017
> URL: https://issues.apache.org/jira/browse/HIVE-13017
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Affects Versions: 1.2.1
> Environment: Secure 
>Reporter: Takahiko Saito
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13017.2.patch, HIVE-13017.3.patch, HIVE-13017.patch
>
>
> The following query fails, when Azure Filesystem is used as default file 
> system, and HDFS is used for intermediate data.
> {noformat}
> >>>  create temporary table s10k stored as orc as select * from studenttab10k;
> >>>  create temporary table v10k as select * from votertab10k;
> >>>  select registration 
> from s10k s join v10k v 
> on (s.name = v.name) join studentparttab30k p 
> on (p.name = v.name) 
> where s.age < 25 and v.age < 25 and p.age < 25;
> ERROR : Execution failed with exit status: 2
> ERROR : Obtaining error information
> ERROR : 
> Task failed!
> Task ID:
>   Stage-5
> Logs:
> ERROR : /var/log/hive/hiveServer2.log
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask (state=08S01,code=2)
> Aborting command set because "force" is false and command failed: "select 
> registration 
> from s10k s join v10k v 
> on (s.name = v.name) join studentparttab30k p 
> on (p.name = v.name) 
> where s.age < 25 and v.age < 25 and p.age < 25;"
> Closing: 0: 
> jdbc:hive2://zk2-hs21-h.hdinsight.net:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@hdinsight.net;transportMode=http;httpPath=cliservice
> hiveServer2.log shows:
> 2016-02-02 18:04:34,182 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,199 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,212 INFO  [HiveServer2-HttpHandler-Pool: Thread-55]: 
> thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(127)) - Could not 
> validate cookie sent, will try to generate a new cookie
> 2016-02-02 18:04:34,213 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> ql.Driver (Driver.java:checkConcurrency(168)) - Concurrency mode is disabled, 
> not creating a lock manager
> 2016-02-02 18:04:34,219 INFO  [HiveServer2-HttpHandler-Pool: Thread-55]: 
> thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(352)) - 
> Failed to authenticate with http/_HOST kerberos principal, trying with 
> hive/_HOST kerberos principal
> 2016-02-02 18:04:34,219 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
> 2016-02-02 18:04:34,225 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> ql.Driver (Driver.java:execute(1390)) - Setting caller context to query id 
> hive_20160202180429_76ab-64d6-4c89-88b0-6355cc5acbd0
> 2016-02-02 18:04:34,226 INFO  [HiveServer2-Background-Pool: Thread-517]: 
> ql.Driver (Driver.java:execute(1393)) - Starting 
> command(queryId=hive_20160202180429_76ab-64d6-4c89-88b0-6355cc5acbd0): 
> select registration
> from s10k s 

[jira] [Updated] (HIVE-12543) Disable Hive ConstantPropagate optimizer when CBO has optimized the plan

2016-02-11 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12543:
---
Attachment: HIVE-12543.06.patch

New patch solves issues with DynamicPartitionPruning (unnecessary execution). 
Triggering new QA.

> Disable Hive ConstantPropagate optimizer when CBO has optimized the plan
> 
>
> Key: HIVE-12543
> URL: https://issues.apache.org/jira/browse/HIVE-12543
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12543.01.patch, HIVE-12543.02.patch, 
> HIVE-12543.03.patch, HIVE-12543.04.patch, HIVE-12543.05.patch, 
> HIVE-12543.06.patch, HIVE-12543.patch
>
>






[jira] [Resolved] (HIVE-12163) LLAP: Tez counters for LLAP 2

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-12163.
---
Resolution: Duplicate

> LLAP: Tez counters for LLAP 2
> -
>
> Key: HIVE-12163
> URL: https://issues.apache.org/jira/browse/HIVE-12163
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>
> Some counters, such as cache hit ratio for a fragment, are not propagated.





[jira] [Updated] (HIVE-12965) Insert overwrite local directory should perserve the overwritten directory permission

2016-02-11 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12965:
---
Attachment: HIVE-12965.3.patch

Fix the patch for the test failures.

> Insert overwrite local directory should perserve the overwritten directory 
> permission
> -
>
> Key: HIVE-12965
> URL: https://issues.apache.org/jira/browse/HIVE-12965
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-12965.1.patch, HIVE-12965.2.patch, 
> HIVE-12965.3.patch, HIVE-12965.patch
>
>
> In Hive, "insert overwrite local directory" first deletes the overwritten 
> directory if it exists, recreates a new one, then copies the files from the 
> src directory to the new local directory. This process sometimes changes the 
> permissions of the to-be-overwritten local directory, so that some 
> applications can no longer access its content.





[jira] [Comment Edited] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143569#comment-15143569
 ] 

Sergey Shelukhin edited comment on HIVE-4897 at 2/11/16 9:56 PM:
-

I think the simplest solution is actually to have a 2pc-like protocol, where 
first the operation will get a token from metastore, and then send this token 
with a create request. The tokens can be stored in MS memory and are unique (an 
incrementing number), they are GCed after a long period (an hour?). For 
generality, each operation with the token could have a client-maintained 
sequence number that is not incremented on retries; to start, just one expected 
operation per token could be permitted - e.g. createSomething. Success (or 
error) would be recorded with the token in MS.
If a retry comes with the same token(+seq num in case of multiple ops), it 
would be a no-op, or the original error would be re-thrown.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.

This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.


was (Author: sershe):
I think the simplest solution is actually to have a 2pc-like protocol, where 
first the operation will get a token from metastore. The tokens can be stored 
in memory and are unique (an incrementing number), they are GCed after a long 
period (an hour?). Then each operation with the token would have a 
client-maintained sequence number that is not incremented on retries (for 
generality; or just one expected operation to start, which would be good enough 
for this case) would record success for the token.
If a retry comes with the same token(+seq num in case of multiple ops), it 
would be a no-op.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.
This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.

> Hive should handle AlreadyExists on retries when creating tables/partitions
> ---
>
> Key: HIVE-4897
> URL: https://issues.apache.org/jira/browse/HIVE-4897
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Aihua Xu
> Attachments: HIVE-4897.patch, hive-snippet.log
>
>
> Creating new tables/partitions may fail with an AlreadyExistsException if 
> there is an error part way through the creation and the HMS tries again 
> without properly cleaning up or checking if this is a retry.
> While partitioning a new table via a script on distributed hive (MetaStore on 
> the same machine) there was a long timeout and then:
> {code}
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> AlreadyExistsException(message:Partition already exists:Partition( ...
> {code}
> I am assuming this is due to retry. Perhaps already-exists on retry could be 
> handled better.
> A similar error occurred while creating a table through Impala, which issued 
> a single createTable call that failed with an AlreadyExistsException. See the 
> logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
> attached hive-snippet.log





[jira] [Comment Edited] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143569#comment-15143569
 ] 

Sergey Shelukhin edited comment on HIVE-4897 at 2/11/16 9:59 PM:
-

I think the simplest solution is actually to have a 2pc-like protocol, where 
first the operation will get a token from metastore, and then send this token 
with a create request. The tokens can be stored in MS memory and are unique (an 
incrementing number), they are GCed after a long period (an hour?). For 
generality, each operation with the token could have a client-maintained 
per-operation sequence number that is not incremented on retries; to start, 
just one expected operation per token could be permitted - e.g. 
createSomething. Success (or error) would be recorded with the token in MS.
If a retry comes with the same token(+seq num in case of multiple ops), it 
would be a no-op, or the original error would be re-thrown.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.

This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. Oh yeah, it will be configurable, of course :) On 
by default. [~thejas] should we do that? Opinions? Related to the timeout issue 
we were discussing.
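
A minimal sketch of the in-memory, single-operation variant proposed above, assuming one create-style operation per token and no returned result. Everything here (class name, method names, TTL handling) is hypothetical and not existing metastore API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names exist in the Hive metastore.
final class RetryTokenRegistry {
    /** Outcome recorded for a token; a null error means the operation succeeded. */
    private static final class Outcome {
        final RuntimeException error;
        final long recordedAtMillis;

        Outcome(RuntimeException error, long recordedAtMillis) {
            this.error = error;
            this.recordedAtMillis = recordedAtMillis;
        }
    }

    private final AtomicLong nextToken = new AtomicLong();
    private final Map<Long, Outcome> outcomes = new ConcurrentHashMap<>();
    private final long ttlMillis;

    RetryTokenRegistry(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Step 1: the client asks for a unique token (an incrementing number). */
    long issueToken() {
        return nextToken.incrementAndGet();
    }

    /** Step 2: run the operation at most once per token; a retry replays the outcome. */
    synchronized void runOnce(long token, Runnable createOp) {
        Outcome previous = outcomes.get(token);
        if (previous != null) {
            if (previous.error != null) {
                throw previous.error; // re-throw the original error
            }
            return; // already succeeded, so the retry is a no-op
        }
        try {
            createOp.run();
            outcomes.put(token, new Outcome(null, System.currentTimeMillis()));
        } catch (RuntimeException e) {
            outcomes.put(token, new Outcome(e, System.currentTimeMillis()));
            throw e;
        }
    }

    /** GC step: forget outcomes older than the configured period. */
    synchronized void expire(long nowMillis) {
        outcomes.values().removeIf(o -> nowMillis - o.recordedAtMillis > ttlMillis);
    }
}
```

Serializing runOnce with a single lock keeps the sketch short; a retry that arrives while the first attempt is still running simply waits and then sees the recorded outcome.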


was (Author: sershe):
I think the simplest solution is actually to have a 2pc-like protocol, where 
first the operation will get a token from metastore, and then send this token 
with a create request. The tokens can be stored in MS memory and are unique (an 
incrementing number), they are GCed after a long period (an hour?). For 
generality, each operation with the token could have a client-maintained 
per-operation sequence number that is not incremented on retries; to start, 
just one expected operation per token could be permitted - e.g. 
createSomething. Success (or error) would be recorded with the token in MS.
If a retry comes with the same token(+seq num in case of multiple ops), it 
would be a no-op, or the original error would be re-thrown.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.

This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.

> Hive should handle AlreadyExists on retries when creating tables/partitions
> ---
>
> Key: HIVE-4897
> URL: https://issues.apache.org/jira/browse/HIVE-4897
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Aihua Xu
> Attachments: HIVE-4897.patch, hive-snippet.log
>
>
> Creating new tables/partitions may fail with an AlreadyExistsException if 
> there is an error part way through the creation and the HMS tries again 
> without properly cleaning up or checking if this is a retry.
> While partitioning a new table via a script on distributed hive (MetaStore on 
> the same machine) there was a long timeout and then:
> {code}
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> AlreadyExistsException(message:Partition already exists:Partition( ...
> {code}
> I am assuming this is due to retry. Perhaps already-exists on retry could be 
> handled better.
> A similar error occurred while creating a table through Impala, which issued 
> a single createTable call that failed with an AlreadyExistsException. See the 
> logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
> attached hive-snippet.log





[jira] [Updated] (HIVE-10236) LLAP: Certain errors are not reported to the AM when a fragment fails

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10236:
--
Assignee: (was: Siddharth Seth)

> LLAP: Certain errors are not reported to the AM when a fragment fails
> -
>
> Key: HIVE-10236
> URL: https://issues.apache.org/jira/browse/HIVE-10236
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
> Fix For: llap
>
>






[jira] [Commented] (HIVE-13021) GenericUDAFEvaluator.isEstimable(agg) always returns false

2016-02-11 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143672#comment-15143672
 ] 

Prasanth Jayachandran commented on HIVE-13021:
--

LGTM, +1

> GenericUDAFEvaluator.isEstimable(agg) always returns false
> --
>
> Key: HIVE-13021
> URL: https://issues.apache.org/jira/browse/HIVE-13021
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Sergey Zadoroshnyak
>Assignee: Gopal V
>Priority: Critical
>  Labels: Performance
> Attachments: HIVE-13021.1.patch
>
>
> GenericUDAFEvaluator.isEstimable(agg) always returns false, because the 
> annotation AggregationType has the default RetentionPolicy.CLASS and is not 
> retained by the VM at run time.
> As a result, the estimate method will never be executed.
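
The retention behaviour described above can be reproduced in isolation. In this sketch the two annotations and the buffer classes are stand-ins invented for illustration (they are not Hive's AggregationType or its buffers); only the java.lang.annotation API is real.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-in types for illustration only; not Hive classes.
final class RetentionDemo {
    // No @Retention given: defaults to RetentionPolicy.CLASS, invisible to reflection.
    @interface ClassRetained {
        boolean estimable() default false;
    }

    @Retention(RetentionPolicy.RUNTIME)
    @interface RuntimeRetained {
        boolean estimable() default false;
    }

    @ClassRetained(estimable = true)
    static final class BufferA {}

    @RuntimeRetained(estimable = true)
    static final class BufferB {}

    static boolean isEstimableClassRetained(Object agg) {
        ClassRetained a = agg.getClass().getAnnotation(ClassRetained.class);
        return a != null && a.estimable(); // a is always null at run time
    }

    static boolean isEstimableRuntimeRetained(Object agg) {
        RuntimeRetained a = agg.getClass().getAnnotation(RuntimeRetained.class);
        return a != null && a.estimable();
    }

    public static void main(String[] args) {
        System.out.println(isEstimableClassRetained(new BufferA()));   // false
        System.out.println(isEstimableRuntimeRetained(new BufferB())); // true
    }
}
```

Declaring the annotation with RetentionPolicy.RUNTIME is what makes the reflective lookup succeed.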



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12981) ThriftCLIService uses incompatible getShortName() implementation

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143681#comment-15143681
 ] 

Hive QA commented on HIVE-12981:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787304/HIVE-12981.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6948/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6948/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6948/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6948/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at f7c9f55 HIVE-12998: ORC FileDump.printJsonData() does not close 
RecordReader (Jason Dere, via Gopal V)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at f7c9f55 HIVE-12998: ORC FileDump.printJsonData() does not close 
RecordReader (Jason Dere, via Gopal V)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787304 - PreCommit-HIVE-TRUNK-Build

> ThriftCLIService uses incompatible getShortName() implementation
> 
>
> Key: HIVE-12981
> URL: https://issues.apache.org/jira/browse/HIVE-12981
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Authorization, CLI, Security
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Bolke de Bruin
>Assignee: Thejas M Nair
>Priority: Critical
>  Labels: kerberos
> Attachments: 0001-HIVE-12981-Use-KerberosName.patch, HIVE-12981.patch
>
>
> ThriftCLIService has a local implementation of getShortName() that assumes a 
> short name is always the part before "@" and "/". This is not always the case, 
> as Kerberos rules (from Hadoop's KerberosName) might actually transform a 
> name to something else.
> Considering a pending change to getShortName() (#HADOOP-12751) and the normal 
> use of KerberosName in other parts of Hive, it only seems logical to use the 
> standard implementation.
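
To illustrate the mismatch described above: a short name cut at the first "/" or "@" can differ from what a site's mapping rules produce. The sketch below does not use Hadoop's KerberosName or its rule syntax; the lookup table is a deliberately simplified, hypothetical stand-in for such rules, and the principal is made up.

```java
import java.util.Map;

// Hypothetical illustration; neither method is Hive or Hadoop API.
final class ShortNameDemo {
    /** The assumption criticized above: keep everything before the first '/' or '@'. */
    static String naiveShortName(String principal) {
        int cut = principal.length();
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash >= 0) {
            cut = Math.min(cut, slash);
        }
        if (at >= 0) {
            cut = Math.min(cut, at);
        }
        return principal.substring(0, cut);
    }

    /** Simplified stand-in for rule-based mapping: explicit entries win over the naive cut. */
    static String ruleBasedShortName(String principal, Map<String, String> rules) {
        String mapped = rules.get(principal);
        return mapped != null ? mapped : naiveShortName(principal);
    }
}
```

With a rule that maps alice/admin@EXAMPLE.COM to alice_admin, the naive method still yields alice, so the two code paths would disagree about which user is acting.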





[jira] [Resolved] (HIVE-12626) Publish some of the LLAP cache counters via Tez

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-12626.
---
Resolution: Duplicate

> Publish some of the LLAP cache counters via Tez
> ---
>
> Key: HIVE-12626
> URL: https://issues.apache.org/jira/browse/HIVE-12626
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>
> To make them available via the final DAG details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13043) Reload function has no impact to function registry

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143474#comment-15143474
 ] 

Sergey Shelukhin commented on HIVE-13043:
-

+1 pending tests

> Reload function has no impact to function registry
> --
>
> Key: HIVE-13043
> URL: https://issues.apache.org/jira/browse/HIVE-13043
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-13043.1.patch
>
>
> With HIVE-2573, users should run "reload function" to refresh cached function 
> registry. However, "reload function" has no impact at all. We need to fix 
> this. Otherwise, HS2 needs to be restarted to see new global functions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13000) Hive returns useless parsing error

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143543#comment-15143543
 ] 

Hive QA commented on HIVE-13000:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787260/HIVE-13000.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9775 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_subquery_notexists_implicit_gby
org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_nonkey_groupby
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6946/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6946/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6946/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787260 - PreCommit-HIVE-TRUNK-Build

> Hive returns useless parsing error 
> ---
>
> Key: HIVE-13000
> URL: https://issues.apache.org/jira/browse/HIVE-13000
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 1.0.0, 1.2.1
>Reporter: Alina Abramova
>Assignee: Alina Abramova
>Priority: Minor
> Attachments: HIVE-13000.1.patch, HIVE-13000.2.patch, 
> HIVE-13000.3.patch
>
>
> When I run a query like this one, I receive an unclear exception:
> hive> SELECT record FROM ctest GROUP BY record.instance_id;
> FAILED: SemanticException Error in parsing 
> It would be clearer if the error were:
> hive> SELECT record FROM ctest GROUP BY record.instance_id;
> FAILED: SemanticException  Expression not in GROUP BY key record





[jira] [Resolved] (HIVE-10029) LLAP: Scheduling of work from different queries within the daemon

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-10029.
---
Resolution: Done

Done elsewhere.

> LLAP: Scheduling of work from different queries within the daemon
> -
>
> Key: HIVE-10029
> URL: https://issues.apache.org/jira/browse/HIVE-10029
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
> Fix For: llap
>
>
> The current implementation is a simple queue - whichever query wins the race 
> to submit work to a daemon will execute first.
> A policy around this may be useful - potentially a fair share, or a first 
> query in gets all slots approach.
> Also, priority associated with work within a query should be considered.





[jira] [Updated] (HIVE-12064) prevent transactional=false

2016-02-11 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12064:
-
Attachment: HIVE-12064.4.patch

patch 4 for test: fixed many unit test failures

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-12064.2.patch, HIVE-12064.3.patch, 
> HIVE-12064.4.patch, HIVE-12064.patch
>
>
> Currently a tblproperty transactional=true must be set to make a table behave 
> in an ACID-compliant way.
> This is misleading in that it seems like changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table is different 
> from that of plain tables. So changing this property may cause wrong data to 
> be returned.
> Should prevent transactional=false.





[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143569#comment-15143569
 ] 

Sergey Shelukhin commented on HIVE-4897:


I think the simplest solution is actually to have a 2pc-like protocol, where 
the operation first gets a token from the metastore. The tokens can be stored 
in memory and are unique (an incrementing number); they are GCed after a long 
period (an hour?). Then each operation with the token would have a 
client-maintained sequence number that is not incremented on retries (for 
generality; to start, just one expected operation per token would be good 
enough for this case), and success would be recorded for the token.
If a retry comes with the same token (+seq num in case of multiple ops), it 
would be a no-op.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.
This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.

> Hive should handle AlreadyExists on retries when creating tables/partitions
> ---
>
> Key: HIVE-4897
> URL: https://issues.apache.org/jira/browse/HIVE-4897
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Aihua Xu
> Attachments: HIVE-4897.patch, hive-snippet.log
>
>
> Creating new tables/partitions may fail with an AlreadyExistsException if 
> there is an error part way through the creation and the HMS tries again 
> without properly cleaning up or checking if this is a retry.
> While partitioning a new table via a script on distributed hive (MetaStore on 
> the same machine) there was a long timeout and then:
> {code}
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> AlreadyExistsException(message:Partition already exists:Partition( ...
> {code}
> I am assuming this is due to retry. Perhaps already-exists on retry could be 
> handled better.
> A similar error occurred while creating a table through Impala, which issued 
> a single createTable call that failed with an AlreadyExistsException. See the 
> logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
> attached hive-snippet.log





[jira] [Resolved] (HIVE-10026) LLAP: AM should get notifications on daemons going down or restarting

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-10026.
---
Resolution: Duplicate

> LLAP: AM should get notifications on daemons going down or restarting
> -
>
> Key: HIVE-10026
> URL: https://issues.apache.org/jira/browse/HIVE-10026
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
> Fix For: llap
>
>
> There's lost state otherwise, which can cause queries to hang.





[jira] [Comment Edited] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143569#comment-15143569
 ] 

Sergey Shelukhin edited comment on HIVE-4897 at 2/11/16 9:58 PM:
-

I think the simplest solution is actually to have a 2pc-like protocol, where 
the operation first gets a token from the metastore and then sends this token 
with a create request. The tokens can be stored in MS memory and are unique (an 
incrementing number); they are GCed after a long period (an hour?). For 
generality, each operation with the token could have a client-maintained 
per-operation sequence number that is not incremented on retries; to start, 
just one expected operation per token could be permitted - e.g. 
createSomething. Success (or error) would be recorded with the token in MS.
If a retry comes with the same token (+seq num in case of multiple ops), it 
would be a no-op, or the original error would be re-thrown.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.

This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.
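
For illustration only, the first-draft variant described above (in-memory, single-operation tokens, no result returned) could look roughly like the sketch below. Every name in it (OperationTokens, newToken, runOnce, expire) is hypothetical and is not part of the metastore API; the coarse lock is an assumption made to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in the Hive metastore API.
class OperationTokens {
    // Outcome recorded for a token; error == null means the operation succeeded.
    private static final class Outcome {
        final RuntimeException error;
        Outcome(RuntimeException error) { this.error = error; }
    }

    private long nextToken = 0;
    private final Map<Long, Long> issuedAtMillis = new HashMap<>();
    private final Map<Long, Outcome> outcomes = new HashMap<>();

    // Step 1: the client asks for a token before starting the operation.
    synchronized long newToken(long nowMillis) {
        long token = ++nextToken;
        issuedAtMillis.put(token, nowMillis);
        return token;
    }

    // Step 2: the client sends the token with the request and reuses it on retries.
    // The coarse lock makes a retry wait for an attempt that is still in flight.
    synchronized void runOnce(long token, Runnable operation) {
        if (!issuedAtMillis.containsKey(token)) {
            throw new IllegalArgumentException("unknown or expired token: " + token);
        }
        Outcome previous = outcomes.get(token);
        if (previous != null) {
            if (previous.error != null) {
                throw previous.error; // re-throw the original error
            }
            return; // already succeeded, so the retry is a no-op
        }
        try {
            operation.run();
            outcomes.put(token, new Outcome(null));
        } catch (RuntimeException e) {
            outcomes.put(token, new Outcome(e));
            throw e;
        }
    }

    // Drop tokens older than maxAgeMillis (the comment suggests about an hour).
    synchronized void expire(long nowMillis, long maxAgeMillis) {
        issuedAtMillis.entrySet().removeIf(e -> {
            boolean old = nowMillis - e.getValue() > maxAgeMillis;
            if (old) {
                outcomes.remove(e.getKey());
            }
            return old;
        });
    }

    public static void main(String[] args) {
        OperationTokens tokens = new OperationTokens();
        int[] creates = {0};
        long token = tokens.newToken(System.currentTimeMillis());
        tokens.runOnce(token, () -> creates[0]++); // first attempt runs the create
        tokens.runOnce(token, () -> creates[0]++); // retry with the same token is skipped
        System.out.println("creates executed: " + creates[0]); // prints "creates executed: 1"
    }
}
```

Because the outcome is keyed by the token alone, nothing here compares the created objects, which is the point made above about object comparison being error-prone.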


was (Author: sershe):
I think the simplest solution is actually to have a 2pc-like protocol, where 
first the operation will get a token from metastore, and then send this token 
with a create request. The tokens can be stored in MS memory and are unique (an 
incrementing number), they are GCed after a long period (an hour?). For 
generality, each operation with the token could have a client-maintained 
sequence number that is not incremented on retries; to start, just one expected 
operation per token could be permitted - e.g. createSomething. Success (or 
error) would be recorded with the token in MS.
If a retry comes with the same token(+seq num in case of multiple ops), it 
would be a no-op, or the original error would be re-thrown.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.

This approach is hard to use for ops that return result because it's not clear 
what result is to be returned, unless the result of original operation is 
saved, which is a PITA. We can either throw a special exception (succeeded, but 
cannot return the result), or return the latest state of the object for some 
ops; create* do not return any result so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.

> Hive should handle AlreadyExists on retries when creating tables/partitions
> ---
>
> Key: HIVE-4897
> URL: https://issues.apache.org/jira/browse/HIVE-4897
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Aihua Xu
> Attachments: HIVE-4897.patch, hive-snippet.log
>
>
> Creating new tables/partitions may fail with an AlreadyExistsException if 
> there is an error part way through the creation and the HMS tries again 
> without properly cleaning up or checking if this is a retry.
> While partitioning a new table via a script on distributed hive (MetaStore on 
> the same machine) there was a long timeout and then:
> {code}
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> AlreadyExistsException(message:Partition already exists:Partition( ...
> {code}
> I am assuming this is due to retry. Perhaps already-exists on retry could be 
> handled better.
> A similar error occurred while creating a table through Impala, which issued 
> a single createTable call that failed with an AlreadyExistsException. See the 
> logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
> attached hive-snippet.log





[jira] [Resolved] (HIVE-10188) LLAP: Failures from TaskRunnerCallable should be reported

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-10188.
---
Resolution: Done

> LLAP: Failures from TaskRunnerCallable should be reported
> -
>
> Key: HIVE-10188
> URL: https://issues.apache.org/jira/browse/HIVE-10188
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: llap
>
>






[jira] [Updated] (HIVE-12988) Improve dynamic partition loading IV

2016-02-11 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12988:

Attachment: HIVE-12988.4.patch

> Improve dynamic partition loading IV
> 
>
> Key: HIVE-12988
> URL: https://issues.apache.org/jira/browse/HIVE-12988
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12988.2.patch, HIVE-12988.2.patch, 
> HIVE-12988.3.patch, HIVE-12988.4.patch, HIVE-12988.patch
>
>
> Parallelize copyFiles()





[jira] [Commented] (HIVE-12965) Insert overwrite local directory should perserve the overwritten directory permission

2016-02-11 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143619#comment-15143619
 ] 

Xuefu Zhang commented on HIVE-12965:


+1. Thanks for fixing the tests.

> Insert overwrite local directory should perserve the overwritten directory 
> permission
> -
>
> Key: HIVE-12965
> URL: https://issues.apache.org/jira/browse/HIVE-12965
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-12965.1.patch, HIVE-12965.2.patch, 
> HIVE-12965.3.patch, HIVE-12965.patch
>
>
> In Hive, "insert overwrite local directory" first deletes the overwritten 
> directory if it exists, recreates it, and then copies the files from the src 
> directory to the new local directory. This process sometimes changes the 
> permissions of the to-be-overwritten local directory, so some applications 
> can no longer access its content.
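
One way to picture the behaviour the title asks for is the minimal sketch below, written against java.nio.file rather than the Hadoop FileSystem API that Hive uses, so it is not the patch itself. It assumes a POSIX file system, and the class and method names (OverwriteKeepingPermissions, recreate, demo) are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; assumes a POSIX file system and is not Hive code.
class OverwriteKeepingPermissions {
    // Delete dir (if present), recreate it empty, and restore its old permissions.
    static void recreate(Path dir) throws IOException {
        Set<PosixFilePermission> saved = null;
        if (Files.exists(dir)) {
            saved = Files.getPosixFilePermissions(dir); // remember before deleting
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
        }
        Files.createDirectories(dir); // new directory gets default permissions
        if (saved != null) {
            Files.setPosixFilePermissions(dir, saved); // put the old ones back
        }
    }

    // Runs the scenario on a temp directory and returns the final permission string.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("overwrite-demo");
            Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxr-x---"));
            Files.createFile(dir.resolve("old-data"));
            recreate(dir);
            String result = PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
            Files.delete(dir);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without the final setPosixFilePermissions call the recreated directory would simply take whatever the process defaults give it, which is the kind of change the description complains about.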





[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143475#comment-15143475
 ] 

Sergio Peña commented on HIVE-13039:


Thanks. Those tests are good.
Btw, {{testFilterFloatColumns}} is failing:

{noformat}
Expected :and(and(not(eq(a, null)), not(and(lt(a, 20.3), not(lteq(a, 10.2), 
not(or(or(eq(b, 1), eq(b, 2)), eq(b, 3
Actual   :and(and(not(eq(a, null)), not(and(lteq(a, 20.3), not(lt(a, 10.2), 
not(or(or(eq(b, 1), eq(b, 2)), eq(b, 3
{noformat}

It's related to your change. I did not notice that we're using a {{between}} 
call there as well.

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch, HIVE-13039.2.patch
>
>
> BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.
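
The two predicate shapes in the Expected/Actual output above can be compared with a small standalone sketch. It uses java.util.function.Predicate as a stand-in, not the Parquet or Hive filter API, and the helper names (lt, lteq, betweenInclusive, betweenExclusive) are invented for the example.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for the filter predicates; not the Parquet filter API.
class BetweenDemo {
    static <T extends Comparable<T>> Predicate<T> lt(T bound) {
        return v -> v.compareTo(bound) < 0;
    }

    static <T extends Comparable<T>> Predicate<T> lteq(T bound) {
        return v -> v.compareTo(bound) <= 0;
    }

    // Inclusive shape: lteq(v, hi) AND NOT lt(v, lo), i.e. lo <= v <= hi.
    static <T extends Comparable<T>> Predicate<T> betweenInclusive(T lo, T hi) {
        return lteq(hi).and(lt(lo).negate());
    }

    // Exclusive shape: lt(v, hi) AND NOT lteq(v, lo), i.e. lo < v < hi.
    static <T extends Comparable<T>> Predicate<T> betweenExclusive(T lo, T hi) {
        return lt(hi).and(lteq(lo).negate());
    }

    public static void main(String[] args) {
        String d = "2016-02-03";
        // Same bounds as the repro query: ldate between '2016-02-03' and '2016-02-03'.
        System.out.println(betweenInclusive(d, d).test(d)); // true: the row matches
        System.out.println(betweenExclusive(d, d).test(d)); // false: the row is dropped
    }
}
```

With equal lower and upper bounds the exclusive shape can never match, which lines up with the repro returning no rows when pushdown is on.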





[jira] [Resolved] (HIVE-9901) LLAP: Use service discovery in the scheduler to determine running daemons

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-9901.
--
Resolution: Done

Implemented elsewhere.

> LLAP: Use service discovery in the scheduler to determine running daemons
> -
>
> Key: HIVE-9901
> URL: https://issues.apache.org/jira/browse/HIVE-9901
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
> Fix For: llap
>
>






[jira] [Updated] (HIVE-13021) GenericUDAFEvaluator.isEstimable(agg) always returns false

2016-02-11 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-13021:
---
Attachment: HIVE-13021.1.patch

> GenericUDAFEvaluator.isEstimable(agg) always returns false
> --
>
> Key: HIVE-13021
> URL: https://issues.apache.org/jira/browse/HIVE-13021
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Sergey Zadoroshnyak
>Assignee: Gopal V
>Priority: Critical
>  Labels: Performance
> Attachments: HIVE-13021.1.patch
>
>
> GenericUDAFEvaluator.isEstimable(agg) always returns false, because the 
> annotation AggregationType has the default RetentionPolicy.CLASS and is not 
> retained by the VM at run time.
> As a result, the estimate method will never be executed.
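
The retention behaviour described here is easy to reproduce outside Hive. In the sketch below the annotations and buffer classes (ClassRetained, RuntimeRetained, BufferA, BufferB) are hypothetical stand-ins for AggregationType and an aggregation buffer, and the two lookup methods only imitate what an isEstimable-style check might do.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical stand-ins; the real Hive annotation is AggregationType.
class RetentionDemo {
    // No @Retention given, so this defaults to RetentionPolicy.CLASS:
    // it is written to the class file but not visible through reflection.
    @interface ClassRetained {
        boolean estimable() default false;
    }

    @Retention(RetentionPolicy.RUNTIME)
    @interface RuntimeRetained {
        boolean estimable() default false;
    }

    @ClassRetained(estimable = true)
    static class BufferA {}

    @RuntimeRetained(estimable = true)
    static class BufferB {}

    static boolean isEstimableClassRetained(Object agg) {
        ClassRetained a = agg.getClass().getAnnotation(ClassRetained.class);
        return a != null && a.estimable(); // a is always null here
    }

    static boolean isEstimableRuntimeRetained(Object agg) {
        RuntimeRetained a = agg.getClass().getAnnotation(RuntimeRetained.class);
        return a != null && a.estimable();
    }

    public static void main(String[] args) {
        System.out.println(isEstimableClassRetained(new BufferA()));   // false
        System.out.println(isEstimableRuntimeRetained(new BufferB())); // true
    }
}
```

Declaring the annotation with @Retention(RetentionPolicy.RUNTIME) is the standard way to make it visible to getAnnotation; whether the attached patch does exactly that is not stated in this message.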



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9901) LLAP: Use service discovery in the scheduler to determine running daemons

2016-02-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-9901:
-
Assignee: (was: Siddharth Seth)

> LLAP: Use service discovery in the scheduler to determine running daemons
> -
>
> Key: HIVE-9901
> URL: https://issues.apache.org/jira/browse/HIVE-9901
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
> Fix For: llap
>
>






[jira] [Commented] (HIVE-13033) SPDO unnecessarily duplicates columns in key & value of mapper output

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143893#comment-15143893
 ] 

Hive QA commented on HIVE-13033:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787325/HIVE-13033.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9760 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6949/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6949/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6949/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787325 - PreCommit-HIVE-TRUNK-Build

> SPDO unnecessarily duplicates columns in key & value of mapper output
> -
>
> Key: HIVE-13033
> URL: https://issues.apache.org/jira/browse/HIVE-13033
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13033.1.patch
>
>






[jira] [Updated] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13039:

Attachment: (was: HIVE-13039.2.patch)

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch
>
>
> BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.





[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-02-11 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12049:

Attachment: HIVE-12049.5.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate- to high-concurrency scenarios, this can 
> result in significant CPU and memory wastage. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift formatted row batches to the output file. Using the 
> pluggable property of the {{hive.query.result.fileformat}}, we can set it to 
> use SequenceFile and write a batch of thrift formatted rows as a value blob. 
> The FetchTask can now simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and since it is already 
> formatted in the way it expects, it can continue building the ResultSet the 
> way it does in the current implementation.





[jira] [Updated] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

2016-02-11 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13039:

Attachment: HIVE-13039.2.patch

My fault, re-attach patch 2 with the test fix. 

> BETWEEN predicate is not functioning correctly with predicate pushdown on 
> Parquet table
> ---
>
> Key: HIVE-13039
> URL: https://issues.apache.org/jira/browse/HIVE-13039
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13039.1.patch, HIVE-13039.2.patch
>
>
> BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as 
> it is by default in newer Hive versions). To reproduce (in a cluster, not a 
> local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
>  PARTITIONED BY (
>  lyear string )
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.





[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-02-11 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11675:

Attachment: HIVE-11675.05.patch

Tests, with some refactoring to make them possible. Corrupt IDs should also be 
restored in the PPD case, I guess; I will file a separate JIRA. Might be better 
to queue the updates right inside metastore; unlike with plain get, metastore 
detects if they are corrupted.

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings





[jira] [Commented] (HIVE-12558) LLAP: output QueryFragmentCounters somewhere

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143970#comment-15143970
 ] 

Sergey Shelukhin commented on HIVE-12558:
-

Btw, allocation and alloc-used are not super useful user-facing counters. They 
basically show how much memory was wasted due to allocation granularity... 
timings might be more useful. Can be fixed on commit or later.

> LLAP: output QueryFragmentCounters somewhere
> 
>
> Key: HIVE-12558
> URL: https://issues.apache.org/jira/browse/HIVE-12558
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12558.1.patch, HIVE-12558.2.patch, 
> HIVE-12558.wip.patch, sample-output.png
>
>
> Right now, LLAP logs counters for every fragment; most of them are IO related 
> and could be very useful, they also include table names so that things like 
> cache hit ratio, etc., could be calculated for every table.
> We need to output them to some metrics system (preserving the breakdown by 
> table, possibly also adding query ID or even stage) so that they'd be usable 
> without grep/sed/awk.





[jira] [Updated] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.

2016-02-11 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10632:
-
Attachment: HIVE-10632.4.patch

patch 4 for test: fixed many UT failures

> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before 
> compaction.
> ---
>
> Key: HIVE-10632
> URL: https://issues.apache.org/jira/browse/HIVE-10632
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-10632.1.patch, HIVE-10632.2.patch, 
> HIVE-10632.3.patch, HIVE-10632.4.patch
>
>
> The compaction process will clean up entries in  TXNS, 
> COMPLETED_TXN_COMPONENTS, TXN_COMPONENTS.  If the table/partition is dropped 
> before compaction is complete there will be data left in these tables.  Need 
> to investigate if there are other situations where this may happen and 
> address it.
> see HIVE-10595 for additional info





[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation to HBaseStore

2016-02-11 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143904#comment-15143904
 ] 

Pengcheng Xiong commented on HIVE-12960:


The patch passed QA with a minor golden file update issue. [~ashutoshc], could 
you please take a look? Thanks.

> Migrate Column Stats Extrapolation to HBaseStore
> 
>
> Key: HIVE-12960
> URL: https://issues.apache.org/jira/browse/HIVE-12960
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12960.01.patch
>
>






[jira] [Updated] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2016-02-11 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-9534:
---
Attachment: HIVE-13039.2.patch

My fault, re-attach patch2 with the test fix. 

> incorrect result set for query that projects a windowed aggregate
> -
>
> Key: HIVE-9534
> URL: https://issues.apache.org/jira/browse/HIVE-9534
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Reporter: N Campbell
>Assignee: Aihua Xu
> Attachments: HIVE-9534.1.patch, HIVE-9534.2.patch, HIVE-9534.3.patch, 
> HIVE-9534.4.patch
>
>
> Result set returned by Hive has one row instead of 5
> {code}
> select avg(distinct tsint.csint) over () from tsint 
> create table  if not exists TSINT (RNUM int , CSINT smallint)
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS TEXTFILE;
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}
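
For reference, the expected result can be sketched in plain Java. This is only an illustration under the usual SQL assumptions (an empty OVER () window spans all rows and emits one value per input row, and AVG ignores NULLs); the class and method names are hypothetical and are not Hive code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of avg(distinct x) over (); not Hive code. */
final class AvgDistinctOverSketch {
    /** Returns one value per input row: the average of the distinct non-null inputs. */
    static List<Double> avgDistinctOverAll(List<Integer> column) {
        Set<Integer> distinct = new LinkedHashSet<>();
        for (Integer value : column) {
            if (value != null) {
                distinct.add(value); // NULLs do not take part in the aggregate
            }
        }
        Double avg = null; // stays NULL when there are no non-null inputs
        if (!distinct.isEmpty()) {
            double sum = 0;
            for (int value : distinct) {
                sum += value;
            }
            avg = sum / distinct.size();
        }
        // A window function does not collapse rows: emit the same value for every row.
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < column.size(); i++) {
            out.add(avg);
        }
        return out;
    }

    public static void main(String[] args) {
        // CSINT values from the sample data in the description (\N is NULL).
        System.out.println(avgDistinctOverAll(Arrays.<Integer>asList(null, -1, 0, 1, 10)));
    }
}
```

With the five sample rows this yields five output rows, which is the row count the description expects.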





[jira] [Updated] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2016-02-11 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-9534:
---
Attachment: (was: HIVE-13039.2.patch)

> incorrect result set for query that projects a windowed aggregate
> -
>
> Key: HIVE-9534
> URL: https://issues.apache.org/jira/browse/HIVE-9534
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Reporter: N Campbell
>Assignee: Aihua Xu
> Attachments: HIVE-9534.1.patch, HIVE-9534.2.patch, HIVE-9534.3.patch, 
> HIVE-9534.4.patch
>
>
> Result set returned by Hive has one row instead of 5
> {code}
> select avg(distinct tsint.csint) over () from tsint 
> create table  if not exists TSINT (RNUM int , CSINT smallint)
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS TEXTFILE;
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}





[jira] [Commented] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2016-02-11 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143953#comment-15143953
 ] 

Yongzhi Chen commented on HIVE-9534:


As the Avg(distinct) is calculated after global sort, the fix is a good one.
+1

> incorrect result set for query that projects a windowed aggregate
> -
>
> Key: HIVE-9534
> URL: https://issues.apache.org/jira/browse/HIVE-9534
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Reporter: N Campbell
>Assignee: Aihua Xu
> Attachments: HIVE-9534.1.patch, HIVE-9534.2.patch, HIVE-9534.3.patch, 
> HIVE-9534.4.patch
>
>
> Result set returned by Hive has one row instead of 5
> {code}
> select avg(distinct tsint.csint) over () from tsint 
> create table  if not exists TSINT (RNUM int , CSINT smallint)
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS TEXTFILE;
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}





[jira] [Updated] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority

2016-02-11 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-13046:
---
Attachment: HIVE-13046.1.patch

Uploaded one-line patch.

> DependencyResolver should not lowercase the dependency URI's authority
> --
>
> Key: HIVE-13046
> URL: https://issues.apache.org/jira/browse/HIVE-13046
> Project: Hive
>  Issue Type: Bug
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-13046.1.patch
>
>
> When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, 
> Hive will lowercase it to {{1.2.3-snapshot}} due to:
> {code:title=DependencyResolver.java}
> String[] authorityTokens = authority.toLowerCase().split(":");
> {code}
> We should not call {{.toLowerCase()}}.
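
A minimal sketch of the difference, assuming the authority has the form org:module:version. The helper class below is hypothetical and is not the actual DependencyResolver code; the example authority is made up.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not the actual DependencyResolver code. */
final class AuthorityTokens {
    /** Splits an authority such as "org:module:version" without changing its case. */
    static String[] split(String authority) {
        return authority.split(":");
    }

    public static void main(String[] args) {
        String authority = "org.example:my-lib:1.2.3-SNAPSHOT"; // made-up coordinates
        // Lowercasing before the split loses the case of the version qualifier.
        System.out.println(Arrays.toString(authority.toLowerCase().split(":")));
        // Splitting the authority as-is preserves it.
        System.out.println(Arrays.toString(split(authority)));
    }
}
```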





[jira] [Updated] (HIVE-13047) Disabling Web UI leads to NullPointerException

2016-02-11 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13047:
---
Attachment: HIVE-13047.patch

> Disabling Web UI leads to NullPointerException
> --
>
> Key: HIVE-13047
> URL: https://issues.apache.org/jira/browse/HIVE-13047
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-13047.patch
>
>
> Disabling the Web UI or its historical query display feature can lead to a 
> NullPointerException since {{historicSqlOperations}} is uninitialized.
> For example, if hive.server2.webui.port is set to 0:
> {code}
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.hive.service.cli.operation.OperationManager.removeOperation(OperationManager.java:212)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.closeOperation(OperationManager.java:240)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.closeOperation(HiveSessionImpl.java:727)
>   at 
> org.apache.hive.service.cli.CLIService.closeOperation(CLIService.java:408)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseOperation(ThriftCLIService.java:664)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1513)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1498)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
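
One way to avoid this kind of failure is a null guard around the history structure. The sketch below is an assumption-laden illustration, not the attached patch: the class, its constructor flag, and the use of a plain list are all hypothetical; only the field name historicSqlOperations comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the operation manager; names are illustrative only. */
final class OperationTracker {
    // Left null when the Web UI (or its historical query display) is disabled.
    private final List<String> historicSqlOperations;

    OperationTracker(boolean webUiEnabled) {
        this.historicSqlOperations = webUiEnabled ? new ArrayList<String>() : null;
    }

    /** Records a closed operation if history is tracked; returns whether it was recorded. */
    boolean removeOperation(String operationId) {
        if (historicSqlOperations == null) {
            return false; // history disabled: nothing to record, and no NullPointerException
        }
        historicSqlOperations.add(operationId);
        return true;
    }

    public static void main(String[] args) {
        System.out.println(new OperationTracker(false).removeOperation("op-1"));
        System.out.println(new OperationTracker(true).removeOperation("op-1"));
    }
}
```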





[jira] [Commented] (HIVE-13036) Split hive.root.logger separately to make it compatible with log4j1.x (for remaining services)

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143992#comment-15143992
 ] 

Hive QA commented on HIVE-13036:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787333/HIVE-13036.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9714 tests executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6950/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6950/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6950/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787333 - PreCommit-HIVE-TRUNK-Build

> Split hive.root.logger separately to make it compatible with log4j1.x (for 
> remaining services)
> --
>
> Key: HIVE-13036
> URL: https://issues.apache.org/jira/browse/HIVE-13036
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13036.1.patch
>
>
> Similar to HIVE-12402 but for HS2 and metastore this time.





[jira] [Updated] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority

2016-02-11 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-13046:
---
Description: 
When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, Hive 
will lowercase it to {{1.2.3-snapshot}} due to:

{code:title=DependencyResolver.java}
String[] authorityTokens = authority.toLowerCase().split(":");
{code}

We should not call {{.toLowerCase()}}.

RB: https://reviews.apache.org/r/43513

  was:
When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, Hive 
will lowercase it to {{1.2.3-snapshot}} due to:

{code:title=DependencyResolver.java}
String[] authorityTokens = authority.toLowerCase().split(":");
{code}

We should not call {{.toLowerCase()}}.


> DependencyResolver should not lowercase the dependency URI's authority
> --
>
> Key: HIVE-13046
> URL: https://issues.apache.org/jira/browse/HIVE-13046
> Project: Hive
>  Issue Type: Bug
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-13046.1.patch
>
>
> When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, 
> Hive will lowercase it to {{1.2.3-snapshot}} due to:
> {code:title=DependencyResolver.java}
> String[] authorityTokens = authority.toLowerCase().split(":");
> {code}
> We should not call {{.toLowerCase()}}.
> RB: https://reviews.apache.org/r/43513





[jira] [Updated] (HIVE-13045) move guava dependency back to 14 after HIVE-12952

2016-02-11 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13045:
---
Attachment: HIVE-13045.patch

> move guava dependency back to 14 after HIVE-12952
> -
>
> Key: HIVE-13045
> URL: https://issues.apache.org/jira/browse/HIVE-13045
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-13045.patch
>
>
> HIVE-12952 removed the usage of EvictingQueue, so we don't need to raise the 
> dependency to Guava 15 at this point. Let's avoid version-related conflicts 
> with clients where we can.





[jira] [Commented] (HIVE-12918) LLAP should never create embedded metastore when localizing functions

2016-02-11 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143068#comment-15143068
 ] 

Sushanth Sowmyan commented on HIVE-12918:
-

+1 on .03.patch with one minor note: you have the following in the patch, which 
you probably didn't want:

{noformat}
diff --git common/src/java/org/apache/hadoop/hive/conf/HiveConf.java common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
index 73e6c21..90a15ba 100644
--- common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
+++ common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
@@ -2535,7 +2535,7 @@ public void setSparkConfigUpdated(boolean isSparkConfigUpdated) {
 LLAP_DAEMON_COMMUNICATOR_NUM_THREADS("hive.llap.daemon.communicator.num.threads", 10,
   "Number of threads to use in LLAP task communicator in Tez AM.",
   "llap.daemon.communicator.num.threads"),
-LLAP_DAEMON_ALLOW_PERMANENT_FNS("hive.llap.daemon.allow.permanent.fns", true,
+LLAP_DAEMON_ALLOW_PERMANENT_FNS("hive.llap.daemon.allow.permanent.fns", true, // TODO#: moo
 "Whether LLAP daemon should localize the resources for permanent UDFs."),
 LLAP_TASK_SCHEDULER_NODE_REENABLE_MIN_TIMEOUT_MS(
   "hive.llap.task.scheduler.node.reenable.min.timeout.ms", "200ms",
{noformat}

> LLAP should never create embedded metastore when localizing functions
> -
>
> Key: HIVE-12918
> URL: https://issues.apache.org/jira/browse/HIVE-12918
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12918.01.patch, HIVE-12918.02.patch, 
> HIVE-12918.03.patch, HIVE-12918.patch
>
>
> {code}
> 16/01/24 21:29:02 INFO service.AbstractService: Service LlapDaemon failed in 
> state INITED; cause: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1552)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3110)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3130)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3355)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.FunctionLocalizer.startLocalizeAllFunctions(FunctionLocalizer.java:88)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.serviceInit(LlapDaemon.java:244)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.main(LlapDaemon.java:323)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1550)
>   ... 10 more
> Caused by: java.lang.NoClassDefFoundError: org/datanucleus/NucleusContext
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getClass(MetaStoreUtils.java:1517)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:61)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:568)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:533)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:595)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:387)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5935)
>   at 
> 

[jira] [Commented] (HIVE-13042) OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against old versions of ORC files

2016-02-11 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143143#comment-15143143
 ] 

Prasanth Jayachandran commented on HIVE-13042:
--

+1

> OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against 
> old versions of ORC files
> --
>
> Key: HIVE-13042
> URL: https://issues.apache.org/jira/browse/HIVE-13042
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-13042.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0
> at java.util.Collections$EmptyList.get(Collections.java:3212)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getFileVersion(ReaderImpl.java:194)
> at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:289)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:261)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:127)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> \cc [~prasanth_j], [~sseth]
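
The trace shows get(0) being called on an empty version list. A defensive lookup could look like the sketch below. It assumes that a file without a recorded version should be treated as the oldest format (0.11); that assumption, the class name, and the string return type are all hypothetical, and this is not the actual ReaderImpl code or the attached patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a defensive version lookup; not the actual ReaderImpl code. */
final class FileVersions {
    /** Formats the version list read from a file footer as "major.minor". */
    static String describe(List<Integer> versionList) {
        if (versionList == null || versionList.isEmpty()) {
            // Assumption: files written before versions were recorded are 0.11 files.
            return "0.11";
        }
        int major = versionList.get(0);
        int minor = versionList.size() > 1 ? versionList.get(1) : 0;
        return major + "." + minor;
    }

    public static void main(String[] args) {
        System.out.println(describe(Collections.<Integer>emptyList()));
        System.out.println(describe(Arrays.asList(0, 12)));
    }
}
```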





[jira] [Commented] (HIVE-13042) OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against old versions of ORC files

2016-02-11 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143153#comment-15143153
 ] 

Prasanth Jayachandran commented on HIVE-13042:
--

[~rajesh.balamohan] What version of ORC file are you trying to read?

> OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against 
> old versions of ORC files
> --
>
> Key: HIVE-13042
> URL: https://issues.apache.org/jira/browse/HIVE-13042
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-13042.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0
> at java.util.Collections$EmptyList.get(Collections.java:3212)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getFileVersion(ReaderImpl.java:194)
> at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:289)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:261)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:127)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> \cc [~prasanth_j], [~sseth]





[jira] [Commented] (HIVE-13004) Remove encryption shims

2016-02-11 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143159#comment-15143159
 ] 

Ashutosh Chauhan commented on HIVE-13004:
-

This patch is targeted for master. By the time it's released via Hive 2.1 
(likely in the second half of this year) and users upgrade to 2.1, it's more 
than likely that they will be on Hadoop 2.6.0, which was released in 2014. So, 
by the time this hits user clusters, the Hive version will be 2 years ahead of 
the first Hadoop version supporting encryption. If a user hasn't upgraded 
Hadoop past 2.5 and wants to use Hive 2.1 on it, they will likely hit a lot 
more issues because of the version mismatch between Hive and Hadoop. What do 
you think?

> Remove encryption shims
> ---
>
> Key: HIVE-13004
> URL: https://issues.apache.org/jira/browse/HIVE-13004
> Project: Hive
>  Issue Type: Task
>  Components: Encryption
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13004.patch
>
>
> It has served its purpose. Now that we don't support hadoop-1, it's no longer 
> needed.





[jira] [Commented] (HIVE-13042) OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against old versions of ORC files

2016-02-11 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143160#comment-15143160
 ] 

Rajesh Balamohan commented on HIVE-13042:
-

An old ORC file that was generated with Hive 0.11.

> OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against 
> old versions of ORC files
> --
>
> Key: HIVE-13042
> URL: https://issues.apache.org/jira/browse/HIVE-13042
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-13042.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0
> at java.util.Collections$EmptyList.get(Collections.java:3212)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getFileVersion(ReaderImpl.java:194)
> at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:289)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:261)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:127)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> \cc [~prasanth_j], [~sseth]





[jira] [Commented] (HIVE-13042) OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against old versions of ORC files

2016-02-11 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143166#comment-15143166
 ] 

Prasanth Jayachandran commented on HIVE-13042:
--

Ok. Thanks!

> OrcFiledump runs into an ArrayIndexOutOfBoundsException when running against 
> old versions of ORC files
> --
>
> Key: HIVE-13042
> URL: https://issues.apache.org/jira/browse/HIVE-13042
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-13042.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0
> at java.util.Collections$EmptyList.get(Collections.java:3212)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getFileVersion(ReaderImpl.java:194)
> at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:289)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:261)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:127)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> \cc [~prasanth_j], [~sseth]





[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation to HBaseStore

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143170#comment-15143170
 ] 

Hive QA commented on HIVE-12960:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787221/HIVE-12960.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9785 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCacheWithBitVector.allPartitions
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6944/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6944/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6944/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787221 - PreCommit-HIVE-TRUNK-Build

> Migrate Column Stats Extrapolation to HBaseStore
> 
>
> Key: HIVE-12960
> URL: https://issues.apache.org/jira/browse/HIVE-12960
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12960.01.patch
>
>






[jira] [Updated] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing

2016-02-11 Thread Akshay Goyal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay Goyal updated HIVE-4570:
---
Attachment: HIVE-4570.03.patch

> More information to user on GetOperationStatus in Hive Server2 when query is 
> still executing
> 
>
> Key: HIVE-4570
> URL: https://issues.apache.org/jira/browse/HIVE-4570
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Akshay Goyal
> Attachments: HIVE-4570.01.patch, HIVE-4570.02.patch, 
> HIVE-4570.03.patch
>
>
> Currently in Hive Server2, when the query is still executing only the status 
> is set as STILL_EXECUTING. 
> This issue is to give more information to the user such as progress and 
> running job handles, if possible.





[jira] [Updated] (HIVE-10468) Create scripts to do metastore upgrade tests on jenkins for Oracle DB.

2016-02-11 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10468:
-
Attachment: HIVE-10468.9.patch

I ran the ptest2 build on the pre-commit build machine that has been failing 
with a JDK mismatch error, and it works fine on the node where Jenkins ran the 
job. Hoping it's a random failure and re-uploading the patch.

> Create scripts to do metastore upgrade tests on jenkins for Oracle DB.
> --
>
> Key: HIVE-10468
> URL: https://issues.apache.org/jira/browse/HIVE-10468
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-10468.1.patch, HIVE-10468.2.patch, 
> HIVE-10468.3.patch, HIVE-10468.4.patch, HIVE-10468.5.patch, 
> HIVE-10468.6.patch, HIVE-10468.7.patch, HIVE-10468.9.patch, 
> HIVE-10468.9.patch, HIVE-10468.9.patch, HIVE-10468.9.patch, HIVE-10468.patch
>
>
> This JIRA is to isolate the work specific to Oracle DB in HIVE-10239. Because 
> of the absence of 64-bit Debian packages for oracle-xe, the apt-get install 
> fails on the AWS systems.





[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143204#comment-15143204
 ] 

Sergey Shelukhin commented on HIVE-12963:
-

[~ashutoshc] can you comment? I am not very familiar with this code. Do we have 
a good test for this?

> LIMIT statement with SORT BY creates additional MR job with hardcoded only 
> one reducer
> --
>
> Key: HIVE-12963
> URL: https://issues.apache.org/jira/browse/HIVE-12963
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.1, 0.13
>Reporter: Alina Abramova
>Assignee: Alina Abramova
> Attachments: HIVE-12963.1.patch, HIVE-12963.2.patch, 
> HIVE-12963.3.patch, HIVE-12963.4.patch
>
>
> I execute the query:
> hive> select age from test1 sort by age.age  limit 10;  
> Total jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
> When I have a large number of rows, the last stage of the job takes a 
> long time. I think we could allow the user to choose the number of reducers 
> for the last job, or avoid the extra MR job.
> I observed the same behavior with the query:
> hive> create table new_test as select age from test1 group by age.age  limit 
> 10;
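
To make the two stages concrete, here is a rough Java model of the plan described above. It assumes that each reducer of the first job sorts its own rows and keeps at most LIMIT of them, and that the single-reducer second job re-sorts the combined outputs and applies the limit again; the class and method names are hypothetical and do not correspond to Hive code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of SORT BY ... LIMIT n as two stages; not Hive code. */
final class SortByLimitSketch {
    /** One reducer of the first job: sort its rows and keep at most `limit` of them. */
    static List<Integer> localTopN(List<Integer> rows, int limit) {
        List<Integer> sorted = new ArrayList<>(rows);
        Collections.sort(sorted);
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    /** The single-reducer second job: combine the per-reducer outputs and limit once more. */
    static List<Integer> finalLimit(List<List<Integer>> perReducerRows, int limit) {
        List<Integer> merged = new ArrayList<>();
        for (List<Integer> rows : perReducerRows) {
            merged.addAll(localTopN(rows, limit));
        }
        return localTopN(merged, limit);
    }

    public static void main(String[] args) {
        List<List<Integer>> perReducerRows = Arrays.asList(
                Arrays.asList(5, 1, 9),
                Arrays.asList(7, 2, 3));
        System.out.println(finalLimit(perReducerRows, 2));
    }
}
```

In this model the last stage receives at most LIMIT rows from each first-stage reducer, so its input size depends on the limit and the number of reducers rather than on the table size.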





[jira] [Commented] (HIVE-12441) Driver.acquireLocksAndOpenTxn() should only call recordValidTxns() when needed

2016-02-11 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143220#comment-15143220
 ] 

Wei Zheng commented on HIVE-12441:
--

[~daijy] Can you help commit the patches to master and branch-1? Thanks.

> Driver.acquireLocksAndOpenTxn() should only call recordValidTxns() when needed
> --
>
> Key: HIVE-12441
> URL: https://issues.apache.org/jira/browse/HIVE-12441
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12441.1.patch, HIVE-12441.2.patch, 
> HIVE-12441.branch-1.patch
>
>
> recordValidTxns() is only needed if ACID tables are part of the query.  
> Otherwise it's just overhead.





[jira] [Updated] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2016-02-11 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11866:
---
Fix Version/s: 1.3.0

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-11866.2.patch, HIVE-11866.3.patch, 
> HIVE-11866.4.patch, HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using an 
> LDAP server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic use cases.





[jira] [Commented] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2016-02-11 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143243#comment-15143243
 ] 

Chaoyu Tang commented on HIVE-11866:


Committed to 1.3.0 as well.

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-11866.2.patch, HIVE-11866.3.patch, 
> HIVE-11866.4.patch, HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using an 
> LDAP server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic use cases.


