[jira] [Updated] (HIVE-12827) Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification

2016-01-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12827:
---
Attachment: HIVE-12827.2.patch

with test-cases

> Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign 
> needs explicit isNull[offset] modification
> ---
>
> Key: HIVE-12827
> URL: https://issues.apache.org/jira/browse/HIVE-12827
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12827.2.patch
>
>
> Some scenarios set Double.NaN instead of isNull=true, but not all types are 
> consistent.
> Examples of code paths that leave isNull un-set for valid values:
> {code}
>   private class FloatReader extends AbstractDoubleReader {
>     FloatReader(int columnIndex) {
>       super(columnIndex);
>     }
>     @Override
>     void apply(VectorizedRowBatch batch, int batchIndex) throws IOException {
>       DoubleColumnVector colVector = (DoubleColumnVector) batch.cols[columnIndex];
>       if (deserializeRead.readCheckNull()) {
>         VectorizedBatchUtil.setNullColIsNullValue(colVector, batchIndex);
>       } else {
>         float value = deserializeRead.readFloat();
>         colVector.vector[batchIndex] = (double) value;
>       }
>     }
>   }
> {code}
> {code}
>   private class DoubleCopyRow extends CopyRow {
>     DoubleCopyRow(int inColumnIndex, int outColumnIndex) {
>       super(inColumnIndex, outColumnIndex);
>     }
>     @Override
>     void copy(VectorizedRowBatch inBatch, int inBatchIndex, VectorizedRowBatch outBatch, int outBatchIndex) {
>       DoubleColumnVector inColVector = (DoubleColumnVector) inBatch.cols[inColumnIndex];
>       DoubleColumnVector outColVector = (DoubleColumnVector) outBatch.cols[outColumnIndex];
>       if (inColVector.isRepeating) {
>         if (inColVector.noNulls || !inColVector.isNull[0]) {
>           outColVector.vector[outBatchIndex] = inColVector.vector[0];
>         } else {
>           VectorizedBatchUtil.setNullColIsNullValue(outColVector, outBatchIndex);
>         }
>       } else {
>         if (inColVector.noNulls || !inColVector.isNull[inBatchIndex]) {
>           outColVector.vector[outBatchIndex] = inColVector.vector[inBatchIndex];
>         } else {
>           VectorizedBatchUtil.setNullColIsNullValue(outColVector, outBatchIndex);
>         }
>       }
>     }
>   }
> {code}
> {code}
>   private static abstract class VectorDoubleColumnAssign extends VectorColumnAssignVectorBase {
>     protected void assignDouble(double value, int destIndex) {
>       outCol.vector[destIndex] = value;
>     }
>   }
> {code}
> The pattern to imitate would be the earlier code from VectorizedBatchUtil:
> {code}
> case DOUBLE: {
>   DoubleColumnVector dcv = (DoubleColumnVector) batch.cols[offset + colIndex];
>   if (writableCol != null) {
> dcv.vector[rowIndex] = ((DoubleWritable) writableCol).get();
> dcv.isNull[rowIndex] = false;
>   } else {
> dcv.vector[rowIndex] = Double.NaN;
> setNullColIsNullValue(dcv, rowIndex);
>   }
> }
>   break;
> {code}
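The fix, then, is to mirror that explicit `dcv.isNull[rowIndex] = false;` assignment on every valid-value path. A minimal runnable sketch of the pattern (class and method names here are stand-ins for illustration, not the actual Hive classes):

```java
// Minimal stand-in for Hive's DoubleColumnVector; just enough state to
// demonstrate the null-handling pattern (not the real class).
class DoubleColumnVector {
    double[] vector = new double[1024];
    boolean[] isNull = new boolean[1024];
    boolean noNulls = true;
}

public class AssignSketch {
    // Hypothetical assign helper: the non-null branch must clear
    // isNull[batchIndex] explicitly, since a recycled batch may carry a
    // stale true entry from a previous row.
    static void assignDouble(DoubleColumnVector col, int batchIndex, Double value) {
        if (value == null) {
            col.vector[batchIndex] = Double.NaN;
            col.isNull[batchIndex] = true;
            col.noNulls = false;
        } else {
            col.vector[batchIndex] = value;
            col.isNull[batchIndex] = false; // the step missing in the snippets above
        }
    }

    public static void main(String[] args) {
        DoubleColumnVector col = new DoubleColumnVector();
        col.isNull[0] = true; // stale flag left over from a recycled batch
        assignDouble(col, 0, 1.5);
        System.out.println(col.isNull[0]); // false: the valid value is visible
    }
}
```

The stale-flag setup in `main` is the point: vectorized batches are reused, so a previous row's isNull=true survives unless the assign path clears it.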



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2016-01-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101539#comment-15101539
 ] 

Lefty Leverenz commented on HIVE-11634:
---

Doc note:  Adding TODOC2.0 because this adds 
*hive.optimize.partition.columns.separate* to HiveConf.java, so the wiki needs 
to be updated.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

This also removes *hive.optimize.point.lookup.extract*, which was added by 
HIVE-11573 in the same release.

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, 
> HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, 
> HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, 
> HIVE-11634.995.patch, HIVE-11634.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all partitions of table pcr_t1 appear 
> in the filter predicate, whereas partition (ds='2000-04-10') could be pruned.
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) 
> is used by the partition pruner to prune partitions that otherwise would not 
> be pruned.
> This is an extension of the idea presented in HIVE-11573.
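The rewrite above can be sketched as a tiny pruning simulation (illustrative only; the class and `prune` helper are made up for this sketch, not Hive code):

```java
import java.util.*;

public class PrunerSketch {
    // Project the partition column (ds) out of the struct IN-list and keep
    // only matching partitions; this mirrors the synthesized predicate
    // struct(ds) IN (struct('2000-04-08'), struct('2000-04-09')).
    static List<String> prune(List<String[]> inList, List<String> partitions) {
        Set<String> wantedDs = new HashSet<>();
        for (String[] s : inList) wantedDs.add(s[0]); // s[0] is ds, s[1] is key
        List<String> kept = new ArrayList<>();
        for (String ds : partitions) if (wantedDs.contains(ds)) kept.add(ds);
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> inList = Arrays.asList(
            new String[]{"2000-04-08", "1"},
            new String[]{"2000-04-09", "2"});
        List<String> parts = Arrays.asList("2000-04-08", "2000-04-09", "2000-04-10");
        System.out.println(prune(inList, parts)); // [2000-04-08, 2000-04-09]
    }
}
```

The non-partition column (key) still has to be checked row by row, which is why the original struct predicate is kept alongside the synthesized one.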





[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2016-01-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11634:
--
Labels: TODOC2.0  (was: )

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, 
> HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, 
> HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, 
> HIVE-11634.995.patch, HIVE-11634.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all partitions of table pcr_t1 appear 
> in the filter predicate, whereas partition (ds='2000-04-10') could be pruned.
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) 
> is used by the partition pruner to prune partitions that otherwise would not 
> be pruned.
> This is an extension of the idea presented in HIVE-11573.





[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101547#comment-15101547
 ] 

Hive QA commented on HIVE-12657:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782394/HIVE-12657.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9989 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-mapreduce2.q-load_dyn_part3.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_resolution
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_resolution
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6632/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6632/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6632/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782394 - PreCommit-HIVE-TRUNK-Build

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12657.patch
>
>
> Encountered this issue when analysing test failures of HIVE-12609. 
> selectDistinctStar.q produces the following diff when run with java version 
> "1.7.0_55" vs java version "1.8.0_60":
> {code}
> < 128   val_128 128 
> ---
> > 128   128 val_128
> 1770c1770
> < 224   val_224 224 
> ---
> > 224   224 val_224
> 1776c1776
> < 369   val_369 369 
> ---
> > 369   369 val_369
> 1799,1810c1799,1810
> < 146   val_146 146 val_146 146 val_146 2008-04-08  11
> < 150   val_150 150 val_150 150 val_150 2008-04-08  11
> < 213   val_213 213 val_213 213 val_213 2008-04-08  11
> < 238   val_238 238 val_238 238 val_238 2008-04-08  11
> < 255   val_255 255 val_255 255 val_255 2008-04-08  11
> < 273   val_273 273 val_273 273 val_273 2008-04-08  11
> < 278   val_278 278 val_278 278 val_278 2008-04-08  11
> < 311   val_311 311 val_311 311 val_311 2008-04-08  11
> < 401   val_401 401 val_401 401 val_401 2008-04-08  11
> < 406   val_406 406 val_406 406 val_406 2008-04-08  11
> < 66val_66  66  val_66  66  val_66  2008-04-08  11
> < 98val_98  98  val_98  98  val_98  2008-04-08  11
> ---
> > 146   val_146 2008-04-08  11  146 val_146 146 val_146
> > 150   val_150 2008-04-08  11  150 val_150 150 val_150
> > 213   val_213 2008-04-08  11  213 val_213 213 val_213
> > 238   val_238 2008-04-08  11  238 val_238 238 val_238
> > 255   val_255 2008-04-08  11  255 val_255 255 val_255
> > 273   val_273 2008-04-08  11  273 val_273 273 val_273
> > 278   val_278 2008-04-08  11  278 val_278 278 val_278
> > 311   val_311 2008-04-08  11  311 val_311 311 val_311
> > 401   val_401 2008-04-08  11  401 val_401 401 val_401
> > 406   val_406 2008-04-08  11  406 val_406 406 val_406
> > 66val_66  2008-04-08  11  66  val_66  66  val_66
> > 98val_98  2008-04-08  11  98  val_98  98  val_98
> 4212c4212
> {code}
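One common cause of JDK-version-dependent output ordering (offered here as a hypothesis, not something established in this thread) is iterating a hash-ordered collection, whose iteration order changed between JDK 7 and JDK 8. An order-preserving structure makes such output deterministic:

```java
import java.util.*;

public class OrderSketch {
    public static void main(String[] args) {
        // HashMap iteration order is unspecified and has changed across JDK
        // releases; LinkedHashMap preserves insertion order on any JDK.
        Map<String, Integer> ordered = new LinkedHashMap<>();
        ordered.put("128", 1);
        ordered.put("224", 2);
        ordered.put("369", 3);
        System.out.println(ordered.keySet()); // [128, 224, 369]
    }
}
```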





[jira] [Updated] (HIVE-12875) Verify sem.getInputs() and sem.getOutputs()

2016-01-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-12875:

Attachment: HIVE-12875.patch

> Verify sem.getInputs() and sem.getOutputs()
> ---
>
> Key: HIVE-12875
> URL: https://issues.apache.org/jira/browse/HIVE-12875
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12875.patch
>
>
> For every partition entity object present in sem.getInputs() and 
> sem.getOutputs(), we must verify the appropriate Table in the list of 
> Entities.





[jira] [Commented] (HIVE-12724) ACID: Major compaction fails to include the original bucket files into MR job

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101655#comment-15101655
 ] 

Hive QA commented on HIVE-12724:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782399/HIVE-12724.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6633/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6633/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6633/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782399 - PreCommit-HIVE-TRUNK-Build

> ACID: Major compaction fails to include the original bucket files into MR job
> -
>
> Key: HIVE-12724
> URL: https://issues.apache.org/jira/browse/HIVE-12724
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Blocker
> Attachments: HIVE-12724.1.patch, HIVE-12724.2.patch, 
> HIVE-12724.3.patch, HIVE-12724.4.patch, HIVE-12724.ADDENDUM.1.patch, 
> HIVE-12724.branch-1.2.patch, HIVE-12724.branch-1.patch
>
>
> How the problem happens:
> * Create a non-ACID table
> * Before non-ACID to ACID table conversion, we inserted row one
> * After non-ACID to ACID table conversion, we inserted row two
> * Both rows can be retrieved before MAJOR compaction
> * After MAJOR compaction, row one is lost
> {code}
> hive> USE acidtest;
> OK
> Time taken: 0.77 seconds
> hive> CREATE TABLE t1 (nationkey INT, name STRING, regionkey INT, comment 
> STRING)
> > CLUSTERED BY (regionkey) INTO 2 BUCKETS
> > STORED AS ORC;
> OK
> Time taken: 0.179 seconds
> hive> DESC FORMATTED t1;
> OK
> # col_name            data_type           comment
> nationkey int
> name  string
> regionkey int
> comment   string
> # Detailed Table Information
> Database: acidtest
> Owner:wzheng
> CreateTime:   Mon Dec 14 15:50:40 PST 2015
> LastAccessTime:   UNKNOWN
> Retention:0
> Location: file:/Users/wzheng/hivetmp/warehouse/acidtest.db/t1
> Table Type:   MANAGED_TABLE
> Table Parameters:
>   transient_lastDdlTime   1450137040
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat:  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed:   No
> Num Buckets:  2
> Bucket Columns:   [regionkey]
> Sort Columns: []
> Storage Desc Params:
>   serialization.format1
> Time taken: 0.198 seconds, Fetched: 28 row(s)
> hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db;
> Found 1 items
> drwxr-xr-x   - wzheng staff 68 2015-12-14 15:50 
> /Users/wzheng/hivetmp/warehouse/acidtest.db/t1
> hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db/t1;
> hive> INSERT INTO TABLE t1 VALUES (1, 'USA', 1, 'united states');
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. tez, 
> spark) or using Hive 1.X releases.
> Query ID = wzheng_20151214155028_630098c6-605f-4e7e-a797-6b49fb48360d
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 2
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reduc

[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101732#comment-15101732
 ] 

Hive QA commented on HIVE-12661:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782403/HIVE-12661.11.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 151 failed/errored test(s), 10021 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_full
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial_ndv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fouter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join34
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_offset_limit_global_optimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_allchildsarenull
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regexp_extract
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.tes

[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats

2016-01-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101749#comment-15101749
 ] 

Matt McCline commented on HIVE-12878:
-

Rehydrated an old patch from last year.  Unclear how it interacts with recent 
ORC Schema Evolution.

> Support Vectorization for TEXTFILE and other formats
> 
>
> Key: HIVE-12878
> URL: https://issues.apache.org/jira/browse/HIVE-12878
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Support vectorizing when the input format is TEXTFILE and other formats for 
> better Map Vertex performance.





[jira] [Updated] (HIVE-12611) Make sure spark.yarn.queue is effective and takes the value from mapreduce.job.queuename if given [Spark Branch]

2016-01-15 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-12611:
--
Attachment: HIVE-12611.1-spark.patch

Tried the patch locally and it worked for me.
{{spark.yarn.queue}} has precedence when it's set.

> Make sure spark.yarn.queue is effective and takes the value from 
> mapreduce.job.queuename if given [Spark Branch]
> 
>
> Key: HIVE-12611
> URL: https://issues.apache.org/jira/browse/HIVE-12611
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-12611.1-spark.patch
>
>
> Hive users sometimes specify a job queue name for submitted MR jobs. For 
> Spark, the property name is spark.yarn.queue. We need to make sure the user 
> is able to submit Spark jobs to the given queue. If the user specifies the 
> MR property, Hive on Spark should honor it as well for backward 
> compatibility.
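The intended precedence can be sketched as a simple fallback (an illustration of the behavior described above and in the comment that spark.yarn.queue wins when set; `resolveQueue` is a made-up helper, not Hive code):

```java
public class QueueSketch {
    // Hypothetical resolution: the Spark-side property takes precedence when
    // set; otherwise the MR-style property is carried over for backward
    // compatibility.
    static String resolveQueue(String sparkYarnQueue, String mrJobQueuename) {
        if (sparkYarnQueue != null && !sparkYarnQueue.isEmpty()) {
            return sparkYarnQueue;
        }
        return mrJobQueuename;
    }

    public static void main(String[] args) {
        System.out.println(resolveQueue("etl", "batch")); // etl
        System.out.println(resolveQueue(null, "batch"));  // batch
    }
}
```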





[jira] [Updated] (HIVE-12878) Support Vectorization for TEXTFILE and other formats

2016-01-15 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12878:

Attachment: HIVE-HIVE-12878.01.patch

Save Work-In-Progress.  By no means complete or working right.

> Support Vectorization for TEXTFILE and other formats
> 
>
> Key: HIVE-12878
> URL: https://issues.apache.org/jira/browse/HIVE-12878
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-HIVE-12878.01.patch
>
>
> Support vectorizing when the input format is TEXTFILE and other formats for 
> better Map Vertex performance.





[jira] [Assigned] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2016-01-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-9534:
--

Assignee: Aihua Xu  (was: Chaoyu Tang)

> incorrect result set for query that projects a windowed aggregate
> -
>
> Key: HIVE-9534
> URL: https://issues.apache.org/jira/browse/HIVE-9534
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: N Campbell
>Assignee: Aihua Xu
>
> Result set returned by Hive has one row instead of 5
> {code}
> select avg(distinct tsint.csint) over () from tsint;
>
> create table if not exists TSINT (RNUM int, CSINT smallint)
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE;
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}
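The expected semantics: avg(distinct ...) over () computes a single average over the distinct non-null values (-1, 0, 1, 10, giving 2.5) and repeats it for all 5 input rows. A sketch of that expectation (illustrative Java, not Hive's implementation; the class and method names are made up):

```java
import java.util.*;

public class WindowSketch {
    // avg(distinct x) over () : one value computed from the distinct
    // non-null inputs, then repeated for every input row, so the result
    // set has as many rows as the input (5 here, not 1).
    static double[] avgDistinctOverAll(Integer[] col) {
        Set<Integer> distinct = new LinkedHashSet<>();
        for (Integer v : col) if (v != null) distinct.add(v);
        double sum = 0;
        for (int v : distinct) sum += v;
        double avg = sum / distinct.size();
        double[] out = new double[col.length];
        Arrays.fill(out, avg);
        return out;
    }

    public static void main(String[] args) {
        // csint column from the sample data; null is the \N row
        Integer[] csint = {null, -1, 0, 1, 10};
        System.out.println(Arrays.toString(avgDistinctOverAll(csint)));
        // [2.5, 2.5, 2.5, 2.5, 2.5]
    }
}
```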





[jira] [Commented] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2016-01-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101811#comment-15101811
 ] 

Aihua Xu commented on HIVE-9534:


Talked to Chaoyu. I will take a look at the issue.

> incorrect result set for query that projects a windowed aggregate
> -
>
> Key: HIVE-9534
> URL: https://issues.apache.org/jira/browse/HIVE-9534
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: N Campbell
>Assignee: Aihua Xu
>
> Result set returned by Hive has one row instead of 5
> {code}
> select avg(distinct tsint.csint) over () from tsint;
>
> create table if not exists TSINT (RNUM int, CSINT smallint)
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE;
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}





[jira] [Commented] (HIVE-12820) Remove the check if carriage return and new line are used for separator or escape character

2016-01-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101834#comment-15101834
 ] 

Aihua Xu commented on HIVE-12820:
-

Discussed with Chaoyu. Actually it's fine in this case.

When the file is in the old format and escape=\\, \\\r will be unescaped 
properly to \r by the old logic.

The only case that could introduce an issue is when the characters 'r' or 'n' 
are used as delimiters: the old format escapes them to \\r or \\n, and they 
will be incorrectly unescaped to \r or \n. But that seems to be an extreme 
case. I will note it in the release notes.
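The ambiguity in that extreme case can be sketched as follows (a hypothetical unescape loop modeled on the description above, not the actual LazySimpleSerDe code): the escaped pair backslash + 'r' always decodes to a carriage return, so a literal 'r' that a writer escaped because it served as the delimiter cannot round-trip.

```java
public class EscapeSketch {
    // Hypothetical old-style unescape: a backslash makes the next character
    // literal, except that \r and \n decode to carriage return and newline.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                if (next == 'r') out.append('\r');
                else if (next == 'n') out.append('\n');
                else out.append(next);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A real carriage return escaped as \r decodes correctly...
        System.out.println(unescape("a\\rb").equals("a\rb"));  // true
        // ...but a literal 'r' escaped because 'r' was the delimiter produces
        // the same bytes, so it also decodes to a carriage return:
        System.out.println(unescape("ca\\rd").equals("card")); // false
    }
}
```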

> Remove the check if carriage return and new line are used for separator or 
> escape character
> ---
>
> Key: HIVE-12820
> URL: https://issues.apache.org/jira/browse/HIVE-12820
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12820.2.patch, HIVE-12820.patch
>
>
> The change in HIVE-11785 doesn't allow \r or \n to be used as the separator 
> or escape character, which may break existing tables that use, e.g., \r as 
> the separator or escape character.
> This case can actually be supported regardless of whether 
> SERIALIZATION_ESCAPE_CRLF is set.





[jira] [Commented] (HIVE-12611) Make sure spark.yarn.queue is effective and takes the value from mapreduce.job.queuename if given [Spark Branch]

2016-01-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101852#comment-15101852
 ] 

Xuefu Zhang commented on HIVE-12611:


+1

> Make sure spark.yarn.queue is effective and takes the value from 
> mapreduce.job.queuename if given [Spark Branch]
> 
>
> Key: HIVE-12611
> URL: https://issues.apache.org/jira/browse/HIVE-12611
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-12611.1-spark.patch
>
>
> Hive users sometimes specify a job queue name for submitted MR jobs. For 
> Spark, the property name is spark.yarn.queue. We need to make sure the user 
> is able to submit Spark jobs to the given queue. If the user specifies the 
> MR property, Hive on Spark should honor it as well for backward 
> compatibility.





[jira] [Commented] (HIVE-12039) Fix TestSSL#testSSLVersion

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101970#comment-15101970
 ] 

Hive QA commented on HIVE-12039:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782405/HIVE-12039.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10004 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6635/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6635/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6635/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782405 - PreCommit-HIVE-TRUNK-Build

> Fix TestSSL#testSSLVersion 
> ---
>
> Key: HIVE-12039
> URL: https://issues.apache.org/jira/browse/HIVE-12039
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12039.1.patch, HIVE-12039.2.patch
>
>
> Looks like it's only run on Linux and has been failing since HIVE-11720.





[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102151#comment-15102151
 ] 

Hive QA commented on HIVE-12366:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782409/HIVE-12366.14.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10021 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6637/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6637/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6637/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782409 - PreCommit-HIVE-TRUNK-Build

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, 
> HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, 
> HIVE-12366.2.patch, HIVE-12366.3.patch, HIVE-12366.4.patch, 
> HIVE-12366.5.patch, HIVE-12366.6.patch, HIVE-12366.7.patch, 
> HIVE-12366.8.patch, HIVE-12366.9.patch
>
>
> Currently there is a gap between the time locks acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause query fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.
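The gap described above disappears if the first heartbeat is scheduled with zero initial delay at lock-acquisition time. A minimal, self-contained sketch of that idea (hypothetical `HeartbeaterSketch` class and method names, not the actual DbTxnManager/Heartbeater API):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class HeartbeaterSketch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> task;
    volatile int beats = 0;  // counts heartbeats sent (stands in for lock renewal)

    // Called as part of lock acquisition: an initial delay of 0 means the first
    // heartbeat fires immediately, so the locks cannot time out before it arrives.
    void startHeartbeat(long intervalMs) {
        task = scheduler.scheduleAtFixedRate(() -> beats++, 0, intervalMs,
                TimeUnit.MILLISECONDS);
    }

    void stopHeartbeat() {
        if (task != null) {
            task.cancel(false);
        }
        scheduler.shutdown();
    }

    // Helper so callers can wait without handling InterruptedException.
    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```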





[jira] [Updated] (HIVE-12879) RowResolver of Semijoin not updated in CalcitePlanner

2016-01-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12879:
---
Attachment: HIVE-12879.patch

> RowResolver of Semijoin not updated in CalcitePlanner
> -
>
> Key: HIVE-12879
> URL: https://issues.apache.org/jira/browse/HIVE-12879
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12879.patch
>
>
> When we generate a Calcite plan, we might need to cast the columns referenced 
> by equality conditions in a Semijoin, because Hive works with a more relaxed 
> data type system.
> To cast these columns, we introduce Project operators over the Semijoin 
> inputs. However, these columns were not included in the RowResolver of the 
> Semijoin operator (I guess because they couldn't be referenced beyond the 
> Semijoin). Yet if a Project operator with a windowing function is generated 
> above the Semijoin, the RR for that Project is taken from the operator 
> below, resulting in a mismatch.
> The following query can be used to reproduce the problem (with CBO on):
> {noformat}
> CREATE TABLE table_1 (int_col_1 INT, decimal3003_col_2 DECIMAL(30, 3), 
> timestamp_col_3 TIMESTAMP, decimal0101_col_4 DECIMAL(1, 1), double_col_5 
> DOUBLE, boolean_col_6 BOOLEAN, timestamp_col_7 TIMESTAMP, varchar0098_col_8 
> VARCHAR(98), int_col_9 INT, timestamp_col_10 TIMESTAMP, decimal0903_col_11 
> DECIMAL(9, 3), int_col_12 INT, bigint_col_13 BIGINT, boolean_col_14 BOOLEAN, 
> char0254_col_15 CHAR(254), boolean_col_16 BOOLEAN, smallint_col_17 SMALLINT, 
> float_col_18 FLOAT, decimal2608_col_19 DECIMAL(26, 8), varchar0216_col_20 
> VARCHAR(216), string_col_21 STRING, timestamp_col_22 TIMESTAMP, double_col_23 
> DOUBLE, smallint_col_24 SMALLINT, float_col_25 FLOAT, decimal2016_col_26 
> DECIMAL(20, 16), string_col_27 STRING, decimal0202_col_28 DECIMAL(2, 2), 
> boolean_col_29 BOOLEAN, decimal2020_col_30 DECIMAL(20, 20), float_col_31 
> FLOAT, boolean_col_32 BOOLEAN, varchar0148_col_33 VARCHAR(148), 
> decimal2121_col_34 DECIMAL(21, 21), timestamp_col_35 TIMESTAMP, float_col_36 
> FLOAT, float_col_37 FLOAT, string_col_38 STRING, decimal3420_col_39 
> DECIMAL(34, 20), smallint_col_40 SMALLINT, decimal1408_col_41 DECIMAL(14, 8), 
> string_col_42 STRING, decimal0902_col_43 DECIMAL(9, 2), varchar0204_col_44 
> VARCHAR(204), float_col_45 FLOAT, tinyint_col_46 TINYINT, double_col_47 
> DOUBLE, timestamp_col_48 TIMESTAMP, double_col_49 DOUBLE, timestamp_col_50 
> TIMESTAMP, decimal0704_col_51 DECIMAL(7, 4), int_col_52 INT, double_col_53 
> DOUBLE, int_col_54 INT, timestamp_col_55 TIMESTAMP, decimal0505_col_56 
> DECIMAL(5, 5), char0155_col_57 CHAR(155), double_col_58 DOUBLE, 
> timestamp_col_59 TIMESTAMP, double_col_60 DOUBLE, float_col_61 FLOAT, 
> char0249_col_62 CHAR(249), float_col_63 FLOAT, smallint_col_64 SMALLINT, 
> decimal1309_col_65 DECIMAL(13, 9), timestamp_col_66 TIMESTAMP, boolean_col_67 
> BOOLEAN, tinyint_col_68 TINYINT, tinyint_col_69 TINYINT, double_col_70 
> DOUBLE, bigint_col_71 BIGINT, boolean_col_72 BOOLEAN, float_col_73 FLOAT, 
> char0222_col_74 CHAR(222), boolean_col_75 BOOLEAN, string_col_76 STRING, 
> decimal2612_col_77 DECIMAL(26, 12), bigint_col_78 BIGINT, char0128_col_79 
> CHAR(128), tinyint_col_80 TINYINT, boolean_col_81 BOOLEAN, int_col_82 INT, 
> boolean_col_83 BOOLEAN, decimal2622_col_84 DECIMAL(26, 22), boolean_col_85 
> BOOLEAN, boolean_col_86 BOOLEAN, decimal0907_col_87 DECIMAL(9, 7))
> STORED AS orc;
> CREATE TABLE table_18 (float_col_1 FLOAT, double_col_2 DOUBLE, 
> decimal2518_col_3 DECIMAL(25, 18), boolean_col_4 BOOLEAN, bigint_col_5 
> BIGINT, boolean_col_6 BOOLEAN, boolean_col_7 BOOLEAN, char0035_col_8 
> CHAR(35), decimal2709_col_9 DECIMAL(27, 9), timestamp_col_10 TIMESTAMP, 
> bigint_col_11 BIGINT, decimal3604_col_12 DECIMAL(36, 4), string_col_13 
> STRING, timestamp_col_14 TIMESTAMP, timestamp_col_15 TIMESTAMP, 
> decimal1911_col_16 DECIMAL(19, 11), boolean_col_17 BOOLEAN, tinyint_col_18 
> TINYINT, timestamp_col_19 TIMESTAMP, timestamp_col_20 TIMESTAMP, 
> tinyint_col_21 TINYINT, float_col_22 FLOAT, timestamp_col_23 TIMESTAMP)
> STORED AS orc;
> explain
> SELECT
> COALESCE(498,
>   LEAD(COALESCE(-973, -684, 515)) OVER (
> PARTITION BY (t2.tinyint_col_21 + t1.smallint_col_24)
> ORDER BY (t2.tinyint_col_21 + t1.smallint_col_24),
> FLOOR(t1.double_col_60) DESC),
>   524) AS int_col
> FROM table_1 t1 INNER JOIN table_18 t2
> ON (((t2.tinyint_col_18) = (t1.bigint_col_13))
> AND ((t2.decimal2709_col_9) = (t1.decimal1309_col_65)))
> AND ((t2.tinyint_col_21) = (t1.tinyint_col_46))
> WHERE (t2.tinyint_col_21) IN (
> SELECT COALESCE(-92, -994) AS int_col_3
>  

[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2016-01-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102193#comment-15102193
 ] 

Eugene Koifman commented on HIVE-12366:
---

[~wzheng] one more comment on (DbTxnManager)
{noformat}
+  @Override
+  public void releaseLocks(List hiveLocks) throws LockException {
+    HiveLockManager lockManager = this.getLockManager();
+    lockManager.releaseLocks(hiveLocks);
+    stopHeartbeat();
+  }
{noformat}

I think it should look like this:
1. No need to init the LM, as it should already be there.
2. Stop the heartbeat first, to prevent possible heartbeating after release 
(same as the commit/rollback methods).
{noformat}
  public void releaseLocks(List hiveLocks) throws LockException {
    if (lockMgr != null) {
      stopHeartbeat();
      lockMgr.releaseLocks(hiveLocks);
    }
  }
{noformat}
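The ordering in point 2 can be illustrated with a tiny sketch that records which call happens first (hypothetical class and method bodies; the real DbTxnManager delegates to lockMgr.releaseLocks and stopHeartbeat as quoted above):

```java
import java.util.ArrayList;
import java.util.List;

class LockReleaseOrderSketch {
    final List<String> events = new ArrayList<>();

    void stopHeartbeat() { events.add("stopHeartbeat"); }
    void releaseLocks() { events.add("releaseLocks"); }

    // Mirrors the suggested ordering: stop the heartbeat before releasing the
    // locks, so no heartbeat can run against locks that are already gone.
    void releaseLocksSafely() {
        stopHeartbeat();
        releaseLocks();
    }
}
```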

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, 
> HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, 
> HIVE-12366.2.patch, HIVE-12366.3.patch, HIVE-12366.4.patch, 
> HIVE-12366.5.patch, HIVE-12366.6.patch, HIVE-12366.7.patch, 
> HIVE-12366.8.patch, HIVE-12366.9.patch
>
>
> Currently there is a gap between the time locks acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause query fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.





[jira] [Updated] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2

2016-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12446:
--
Attachment: HIVE-12446.combined.1.patch

Re-uploading the patch for Jenkins.

> Tracking jira for changes required for move to Tez 0.8.2
> 
>
> Key: HIVE-12446
> URL: https://issues.apache.org/jira/browse/HIVE-12446
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
> Attachments: HIVE-12446.combined.1.patch, HIVE-12446.combined.1.txt
>
>






[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2016-01-15 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102274#comment-15102274
 ] 

Wei Zheng commented on HIVE-12366:
--

Thanks for the further review. I think you're probably right. I updated the 
patch accordingly.

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, 
> HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, 
> HIVE-12366.2.patch, HIVE-12366.3.patch, HIVE-12366.4.patch, 
> HIVE-12366.5.patch, HIVE-12366.6.patch, HIVE-12366.7.patch, 
> HIVE-12366.8.patch, HIVE-12366.9.patch
>
>
> Currently there is a gap between the time locks acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause query fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.





[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2016-01-15 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.15.patch

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, 
> HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, 
> HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, 
> HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, 
> HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch
>
>
> Currently there is a gap between the time locks acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause query fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.





[jira] [Issue Comment Deleted] (HIVE-12828) Update Spark version to 1.6

2016-01-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12828:
---
Comment: was deleted

(was: 

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782359/HIVE-12828.2-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9866 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1030/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1030/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1030/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782359 - PreCommit-HIVE-SPARK-Build)

> Update Spark version to 1.6
> ---
>
> Key: HIVE-12828
> URL: https://issues.apache.org/jira/browse/HIVE-12828
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-12828.1-spark.patch, HIVE-12828.2-spark.patch, 
> HIVE-12828.2-spark.patch, HIVE-12828.2-spark.patch, HIVE-12828.2-spark.patch, 
> mem.patch
>
>






[jira] [Commented] (HIVE-12827) Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification

2016-01-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102304#comment-15102304
 ] 

Sergey Shelukhin commented on HIVE-12827:
-

+1. Did the fill() on the main path get committed, and does it need to be 
removed with this patch?

> Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign 
> needs explicit isNull[offset] modification
> ---
>
> Key: HIVE-12827
> URL: https://issues.apache.org/jira/browse/HIVE-12827
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12827.2.patch
>
>
> Some scenarios do set Double.NaN instead of isNull=true, but all types aren't 
> consistent.
> Examples of un-set isNull for the valid values are 
> {code}
>   private class FloatReader extends AbstractDoubleReader {
> FloatReader(int columnIndex) {
>   super(columnIndex);
> }
> @Override
> void apply(VectorizedRowBatch batch, int batchIndex) throws IOException {
>   DoubleColumnVector colVector = (DoubleColumnVector) 
> batch.cols[columnIndex];
>   if (deserializeRead.readCheckNull()) {
> VectorizedBatchUtil.setNullColIsNullValue(colVector, batchIndex);
>   } else {
> float value = deserializeRead.readFloat();
> colVector.vector[batchIndex] = (double) value;
>   }
> }
>   }
> {code}
> {code}
>   private class DoubleCopyRow extends CopyRow {
> DoubleCopyRow(int inColumnIndex, int outColumnIndex) {
>   super(inColumnIndex, outColumnIndex);
> }
> @Override
> void copy(VectorizedRowBatch inBatch, int inBatchIndex, 
> VectorizedRowBatch outBatch, int outBatchIndex) {
>   DoubleColumnVector inColVector = (DoubleColumnVector) 
> inBatch.cols[inColumnIndex];
>   DoubleColumnVector outColVector = (DoubleColumnVector) 
> outBatch.cols[outColumnIndex];
>   if (inColVector.isRepeating) {
> if (inColVector.noNulls || !inColVector.isNull[0]) {
>   outColVector.vector[outBatchIndex] = inColVector.vector[0];
> } else {
>   VectorizedBatchUtil.setNullColIsNullValue(outColVector, 
> outBatchIndex);
> }
>   } else {
> if (inColVector.noNulls || !inColVector.isNull[inBatchIndex]) {
>   outColVector.vector[outBatchIndex] = 
> inColVector.vector[inBatchIndex];
> } else {
>   VectorizedBatchUtil.setNullColIsNullValue(outColVector, 
> outBatchIndex);
> }
>   }
> }
>   }
> {code}
> {code}
>  private static abstract class VectorDoubleColumnAssign
> extends VectorColumnAssignVectorBase {
> protected void assignDouble(double value, int destIndex) {
>   outCol.vector[destIndex] = value;
> }
>   }
> {code}
> The pattern to imitate would be the earlier code from VectorBatchUtil
> {code}
> case DOUBLE: {
>   DoubleColumnVector dcv = (DoubleColumnVector) batch.cols[offset + 
> colIndex];
>   if (writableCol != null) {
> dcv.vector[rowIndex] = ((DoubleWritable) writableCol).get();
> dcv.isNull[rowIndex] = false;
>   } else {
> dcv.vector[rowIndex] = Double.NaN;
> setNullColIsNullValue(dcv, rowIndex);
>   }
> }
>   break;
> {code}
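The pattern above amounts to: every assignment of a valid value must also clear the corresponding isNull slot, since reused scratch columns may carry stale flags. A simplified stand-in for DoubleColumnVector demonstrating the buggy vs. fixed assign (hypothetical `MiniDoubleColumnVector` class, not the real org.apache.hadoop.hive.ql.exec.vector API):

```java
class MiniDoubleColumnVector {
    final double[] vector;
    final boolean[] isNull;
    boolean noNulls = false;  // once any null is seen, readers must consult isNull

    MiniDoubleColumnVector(int size) {
        vector = new double[size];
        isNull = new boolean[size];
    }

    // Buggy assign: writes the value but leaves a stale isNull flag behind,
    // so a reused scratch column can silently report the value as null.
    void assignBuggy(int i, double value) {
        vector[i] = value;
    }

    // Fixed assign: follows the VectorizedBatchUtil pattern quoted above by
    // explicitly clearing isNull alongside the value write.
    void assignFixed(int i, double value) {
        vector[i] = value;
        isNull[i] = false;
    }
}
```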





[jira] [Commented] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102332#comment-15102332
 ] 

Hive QA commented on HIVE-12805:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782414/HIVE-12805.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6638/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6638/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6638/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782414 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver skewjoin.q failure
> -
>
> Key: HIVE-12805
> URL: https://issues.apache.org/jira/browse/HIVE-12805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch
>
>
> Set hive.cbo.returnpath.hiveop=true
> {code}
> FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ 
> sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key))
> {code}
> The stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471)
> {code}
> Same error happens in auto_sortmerge_join_6.q.out for 
> {code}
> select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on 
> h.value = a.value
> {code}





[jira] [Commented] (HIVE-12827) Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification

2016-01-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102353#comment-15102353
 ] 

Gopal V commented on HIVE-12827:


The fill after every operation is unnecessary; the theory about the other patch 
was that some UDF in that query wasn't handling the hasNoNulls flag.

That is not true. The original issue was that a scratch column feeding the 
COALESCE() is reused for the Join output columns, and setting a column value via 
FloatReader does not set isNull[batchIndex] = false (so the Filter on 
cr_return_amount then removes those rows).

CBO rewrites the left outer join into an inner join, pushing the filter below 
the join, so there is no more TS-FIL-MJ-FIL as the FIL migrates to the broadcast 
side.

Here's my simplified example, which I used to narrow down the issue to 
FloatTreeReader.

{code}
set hive.cbo.enable=false;
set hive.vectorized.execution.reducesink.new.enabled=false;
set hive.vectorized.execution.mapjoin.native.enabled=true;
set hive.vectorized.execution.reduce.enabled=false;
set hive.vectorized.execution.reduce.groupby.enabled=false;

use testing;

create table if not exists cs stored as orc as select IF (cs_item_sk
 IN (
1365 ,
2243 ,
2445 ,
3259 ,
3267 ,
4027 ,
5263 ,
6003 ,
8371 ,
9593 ,
10383,
10763,
11351,
12359,
12887,
13449,
16501,
16547
), cs_item_sk, 0) as cs_item_sk
, cs_order_number, cs_net_paid, cs_quantity from 
tpcds_bin_partitioned_orc_200.catalog_sales
where
 true
 and cs_sold_date_sk = 2452245
 and cs_net_profit > 1
 and cs_net_paid > 0
 and cs_quantity > 0
 and cs_item_sk between 1365 and 16547
;

create table if not exists cr as select cr_return_amount, cr_item_sk, 
cr_order_number from tpcds_bin_partitioned_orc_200.catalog_returns where 
cr_returned_date_sk between 2452351 and 2452400
and cr_item_sk
 IN (
1365 ,
2243 ,
2445 ,
3259 ,
3267 ,
4027 ,
5263 ,
6003 ,
8371 ,
9593 ,
10383,
10763,
11351,
12359,
12887,
13449,
16501,
16547
)
order by cr_item_sk
;

select * from
(select cs.cs_item_sk as item,
  coalesce(cr.cr_return_amount,0) as return_amount
 ,coalesce(cs.cs_net_paid,0) as net_paid
-- (cast(sum(coalesce(cr.cr_return_amount,0)) as double)/
--  cast(sum(coalesce(cs.cs_net_paid,0)) as double)) as currency_ratio
 from cs -- catalog_sales cs
 left outer join cr -- catalog_returns cr
 on cs.cs_order_number = cr.cr_order_number
 and cs.cs_item_sk = cr.cr_item_sk
 where cr.cr_return_amount > 1
 and cs.cs_quantity > 0
-- group by cs.cs_item_sk 
) x;
{code}

> Vectorization: VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign 
> needs explicit isNull[offset] modification
> ---
>
> Key: HIVE-12827
> URL: https://issues.apache.org/jira/browse/HIVE-12827
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12827.2.patch
>
>
> Some scenarios do set Double.NaN instead of isNull=true, but all types aren't 
> consistent.
> Examples of un-set isNull for the valid values are 
> {code}
>   private class FloatReader extends AbstractDoubleReader {
> FloatReader(int columnIndex) {
>   super(columnIndex);
> }
> @Override
> void apply(VectorizedRowBatch batch, int batchIndex) throws IOException {
>   DoubleColumnVector colVector = (DoubleColumnVector) 
> batch.cols[columnIndex];
>   if (deserializeRead.readCheckNull()) {
> VectorizedBatchUtil.setNullColIsNullValue(colVector, batchIndex);
>   } else {
> float value = deserializeRead.readFloat();
> colVector.vector[batchIndex] = (double) value;
>   }
> }
>   }
> {code}
> {code}
>   private class DoubleCopyRow extends CopyRow {
> DoubleCopyRow(int inColumnIndex, int outColumnIndex) {
>   super(inColumnIndex, outColumnIndex);
> }
> @Override
> void copy(VectorizedRowBatch inBatch, int inBatchIndex, 
> VectorizedRowBatch outBatch, int outBatchIndex) {
>   DoubleColumnVector inColVector = (DoubleColumnVector) 
> inBatch.cols[inColumnIndex];
>   DoubleColumnVector outColVector = (DoubleColumnVector) 
> outBatch.cols[outColumnIndex];
>   if (inColVector.isRepeating) {
> if (inColVector.noNulls || !inColVector.isNull[0]) {
>   outColVector.vector[outBatchIndex] = inColVector.vector[0];
> } else {
>   VectorizedBatchUtil.setNullColIsNullValue(outColVector, 
> outBatchIndex);
> }
>   } else {
> if (inColVector.noNulls || !inColVector.isNull[inBatchIndex]) {
>   outColVector.vector[outBatchIndex] = 
> inColVector.vector[inBatchIndex];
> } else {
>   VectorizedBatchUtil.setNullColIsNullValue(outColVector, 
> outBatchIndex);
> }
>   }
> }
>   }
> {code}
> {code}
>  private static abstract class VectorDoubleCo

[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union

2016-01-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102373#comment-15102373
 ] 

Pengcheng Xiong commented on HIVE-12863:


As per [~jpullokkaran]'s request, including [~ashutoshc] and [~jcamachorodriguez].

> fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
> -
>
> Key: HIVE-12863
> URL: https://issues.apache.org/jira/browse/HIVE-12863
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12863.01.patch
>
>






[jira] [Commented] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure

2016-01-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102398#comment-15102398
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12805:
--

The failures are unrelated to the change.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver skewjoin.q failure
> -
>
> Key: HIVE-12805
> URL: https://issues.apache.org/jira/browse/HIVE-12805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch
>
>
> Set hive.cbo.returnpath.hiveop=true
> {code}
> FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ 
> sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key))
> {code}
> The stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471)
> {code}
> Same error happens in auto_sortmerge_join_6.q.out for 
> {code}
> select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on 
> h.value = a.value
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12808) Logical PPD: Push filter clauses through PTF(Windowing) into TS

2016-01-15 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-12808:
--
Attachment: HIVE-12808.02.patch

> Logical PPD: Push filter clauses through PTF(Windowing) into TS
> ---
>
> Key: HIVE-12808
> URL: https://issues.apache.org/jira/browse/HIVE-12808
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-12808.01.patch, HIVE-12808.02.patch
>
>
> Simplified repro case of [HCC 
> #8880|https://community.hortonworks.com/questions/8880/hive-on-tez-pushdown-predicate-doesnt-work-in-part.html],
>  with the slow query showing the push-down miss. 
> And the manually rewritten query to indicate the expected one.
> Part of the problem could be the window range not being split apart for PPD, 
> but the FIL is not pushed down even if the rownum filter is removed.
> {code}
> create temporary table positions (regionid string, id bigint, deviceid 
> string, ts string);
> insert into positions values('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 
> 1422792010, '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-01'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-01'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-02'),
> ('1d6a0be1-6366-4692-9597-ebd5cd0f01d1', 1422792010, 
> '6c5d1a30-2331-448b-a726-a380d6b3a432', '2016-01-02');
> -- slow query
> explain
> WITH t1 AS 
> ( 
>  SELECT   *, 
>   Row_number() over ( PARTITION BY regionid, id, deviceid 
> ORDER BY ts DESC) AS rownos
>  FROM positions ), 
> latestposition as ( 
>SELECT * 
>FROM   t1 
>WHERE  rownos = 1) 
> SELECT * 
> FROM   latestposition 
> WHERE  regionid='1d6a0be1-6366-4692-9597-ebd5cd0f01d1' 
> ANDid=1422792010 
> ANDdeviceid='6c5d1a30-2331-448b-a726-a380d6b3a432';
> -- fast query
> explain
> WITH t1 AS 
> ( 
>  SELECT   *, 
>   Row_number() over ( PARTITION BY regionid, id, deviceid 
> ORDER BY ts DESC) AS rownos
>  FROM positions 
>  WHERE  regionid='1d6a0be1-6366-4692-9597-ebd5cd0f01d1' 
>  ANDid=1422792010 
>  ANDdeviceid='6c5d1a30-2331-448b-a726-a380d6b3a432'
> ),latestposition as ( 
>SELECT * 
>FROM   t1 
>WHERE  rownos = 1) 
> SELECT * 
> FROM   latestposition 
> ;
> {code}
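The equivalence the two plans above rely on can be sketched in Python (a hypothetical simplification, not Hive code; row layout and function names are invented): a filter on the PARTITION BY keys commutes with row_number(), which is why the predicate could safely be pushed below the windowing operator.

```python
# Sketch: filtering on PARTITION BY columns before or after row_number()
# yields the same rows, because the filter only drops whole partitions.
from itertools import groupby
from operator import itemgetter

def row_number_over(rows, part_keys, order_key, desc=True):
    """row_number() OVER (PARTITION BY part_keys ORDER BY order_key DESC)."""
    keyfn = itemgetter(*part_keys)
    out = []
    for _, grp in groupby(sorted(rows, key=keyfn), key=keyfn):
        ranked = sorted(grp, key=itemgetter(order_key), reverse=desc)
        out.extend({**r, 'rownos': i + 1} for i, r in enumerate(ranked))
    return out

rows = [
    {'regionid': 'r1', 'id': 1, 'ts': '2016-01-01'},
    {'regionid': 'r1', 'id': 1, 'ts': '2016-01-02'},
    {'regionid': 'r2', 'id': 2, 'ts': '2016-01-01'},
]
pred = lambda r: r['regionid'] == 'r1' and r['id'] == 1

# "slow" plan: window over everything, filter afterwards
slow = [r for r in row_number_over(rows, ('regionid', 'id'), 'ts') if pred(r)]
# "fast" plan: filter first (predicate pushed down), then window
fast = row_number_over([r for r in rows if pred(r)], ('regionid', 'id'), 'ts')

assert sorted(slow, key=repr) == sorted(fast, key=repr)
```

This only holds because the predicate references partition keys; a filter on the ORDER BY column or on rownos would change which rows rank first and could not be pushed the same way.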



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12880) spark-assembly causes Hive class version problems

2016-01-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102423#comment-15102423
 ] 

Sergey Shelukhin commented on HIVE-12880:
-

[~xuefuz] fyi

> spark-assembly causes Hive class version problems
> -
>
> Key: HIVE-12880
> URL: https://issues.apache.org/jira/browse/HIVE-12880
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> It looks like spark-assembly contains versions of Hive classes (e.g. 
> HiveConf), and these sometimes (always?) come from older versions of Hive.
> We've seen problems where depending on classpath changes, NoSuchField errors 
> are thrown for recently added configs because the HiveConf class comes from 
> spark-assembly.
> Would making sure spark-assembly comes last in the classpath solve the 
> problem?
> Otherwise, can we depend on something that does not package Hive classes?
> Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; 
> I am assuming this issue can also affect Hive-on-Spark).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12880) spark-assembly causes Hive class version problems

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12880:

Reporter: Hui Zheng  (was: Sergey Shelukhin)

> spark-assembly causes Hive class version problems
> -
>
> Key: HIVE-12880
> URL: https://issues.apache.org/jira/browse/HIVE-12880
> Project: Hive
>  Issue Type: Bug
>Reporter: Hui Zheng
>
> It looks like spark-assembly contains versions of Hive classes (e.g. 
> HiveConf), and these sometimes (always?) come from older versions of Hive.
> We've seen problems where depending on classpath perturbations, NoSuchField 
> errors may be thrown for recently added ConfVars because the HiveConf class 
> comes from spark-assembly.
> Would making sure spark-assembly comes last in the classpath solve the 
> problem?
> Otherwise, can we depend on something that does not package older Hive 
> classes?
> Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; 
> I am assuming this issue can also affect Hive-on-Spark).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12880) spark-assembly causes Hive class version problems

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12880:

Description: 
It looks like spark-assembly contains versions of Hive classes (e.g. HiveConf), 
and these sometimes (always?) come from older versions of Hive.
We've seen problems where depending on classpath perturbations, NoSuchField 
errors may be thrown for recently added ConfVars because the HiveConf class 
comes from spark-assembly.

Would making sure spark-assembly comes last in the classpath solve the problem?
Otherwise, can we depend on something that does not package older Hive classes?

Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; I 
am assuming this issue can also affect Hive-on-Spark).

  was:
It looks like spark-assembly contains versions of Hive classes (e.g. HiveConf), 
and these sometimes (always?) come from older versions of Hive.
We've seen problems where depending on classpath changes, NoSuchField errors 
are thrown for recently added configs because the HiveConf class comes from 
spark-assembly.

Would making sure spark-assembly comes last in the classpath solve the problem?
Otherwise, can we depend on something that does not package Hive classes?

Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; I 
am assuming this issue can also affect Hive-on-Spark).


> spark-assembly causes Hive class version problems
> -
>
> Key: HIVE-12880
> URL: https://issues.apache.org/jira/browse/HIVE-12880
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> It looks like spark-assembly contains versions of Hive classes (e.g. 
> HiveConf), and these sometimes (always?) come from older versions of Hive.
> We've seen problems where depending on classpath perturbations, NoSuchField 
> errors may be thrown for recently added ConfVars because the HiveConf class 
> comes from spark-assembly.
> Would making sure spark-assembly comes last in the classpath solve the 
> problem?
> Otherwise, can we depend on something that does not package older Hive 
> classes?
> Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; 
> I am assuming this issue can also affect Hive-on-Spark).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12851) Add slider security setting support to LLAP packager

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-12851:
---

Assignee: Sergey Shelukhin

> Add slider security setting support to LLAP packager
> 
>
> Key: HIVE-12851
> URL: https://issues.apache.org/jira/browse/HIVE-12851
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> {noformat}
> "slider.hdfs.keytab.dir": "...",
> "slider.am.login.keytab.name": "...",
> "slider.keytab.principal.name": "..."
> {noformat}
> should be emitted into appConfig.json for Slider AM. Right now, they have to 
> be added manually on a secure cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12826) Vectorization: VectorUDAF* suspect isNull checks

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102462#comment-15102462
 ] 

Hive QA commented on HIVE-12826:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782435/HIVE-12826.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6639/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6639/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6639/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782435 - PreCommit-HIVE-TRUNK-Build

> Vectorization: VectorUDAF* suspect isNull checks
> 
>
> Key: HIVE-12826
> URL: https://issues.apache.org/jira/browse/HIVE-12826
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 1.3.0, 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12826.1.patch
>
>
> for isRepeating=true, checking isNull[selected[i]] might return incorrect 
> results (without a heavy array fill of isNull).
> VectorUDAFSum/Min/Max/Avg and SumDecimal impls need to be reviewed for this 
> pattern.
> {code}
> private void iterateHasNullsRepeatingSelectionWithAggregationSelection(
>   VectorAggregationBufferRow[] aggregationBufferSets,
>   int aggregateIndex,
>value,
>   int batchSize,
>   int[] selection,
>   boolean[] isNull) {
>   
>   for (int i=0; i < batchSize; ++i) {
> if (!isNull[selection[i]]) {
>   Aggregation myagg = getCurrentAggregationBuffer(
> aggregationBufferSets, 
> aggregateIndex,
> i);
>   myagg.sumValue(value);
> }
>   }
> }
> {code}
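The hazard described above can be sketched in Python (a hypothetical simplification of the vectorized batch layout; the function names are invented): when isRepeating is true, only slot 0 of the column vector is guaranteed to be populated, so indexing isNull through the selection vector may read stale entries.

```python
# Sketch: why isNull[selection[i]] is suspect for isRepeating=true.
# In a repeating column vector only index 0 is authoritative; the rest
# of isNull[] may hold stale values from an earlier batch.

def sum_repeating_suspect(value, is_null, selection):
    """Mimics the suspect loop: consults isNull at each selected index."""
    total = 0
    for sel in selection:
        if not is_null[sel]:          # stale slots silently drop rows
            total += value
    return total

def sum_repeating_correct(value, is_null, selection):
    """For isRepeating=true, only isNull[0] should be consulted."""
    if is_null[0]:
        return 0
    return value * len(selection)

# A repeating, non-null batch whose isNull[] still carries stale True entries:
is_null = [False, True, True, False]   # only index 0 is meaningful
selection = [1, 2, 3]

print(sum_repeating_suspect(10, is_null, selection))  # 10 -- rows dropped
print(sum_repeating_correct(10, is_null, selection))  # 30
```

The alternative fix hinted at in the description, a heavy fill of isNull[] whenever isRepeating is set, gives the same answer at the cost of an array write per batch.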



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12851) Add slider security setting support to LLAP packager

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12851:

Attachment: HIVE-12851.patch

The patch. I still need to test it; hopefully later today.

> Add slider security setting support to LLAP packager
> 
>
> Key: HIVE-12851
> URL: https://issues.apache.org/jira/browse/HIVE-12851
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12851.patch
>
>
> {noformat}
> "slider.hdfs.keytab.dir": "...",
> "slider.am.login.keytab.name": "...",
> "slider.keytab.principal.name": "..."
> {noformat}
> should be emitted into appConfig.json for Slider AM. Right now, they have to 
> be added manually on a secure cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12657:

Attachment: HIVE-12657.01.patch

Another q-file update.

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12657.01.patch, HIVE-12657.patch
>
>
> Encountered this issue when analysing test failures of HIVE-12609. 
> selectDistinctStar.q produces the following diff when I ran with java version 
> "1.7.0_55" and java version "1.8.0_60"
> {code}
> < 128   val_128 128 
> ---
> > 128   128 val_128
> 1770c1770
> < 224   val_224 224 
> ---
> > 224   224 val_224
> 1776c1776
> < 369   val_369 369 
> ---
> > 369   369 val_369
> 1799,1810c1799,1810
> < 146   val_146 146 val_146 146 val_146 2008-04-08  11
> < 150   val_150 150 val_150 150 val_150 2008-04-08  11
> < 213   val_213 213 val_213 213 val_213 2008-04-08  11
> < 238   val_238 238 val_238 238 val_238 2008-04-08  11
> < 255   val_255 255 val_255 255 val_255 2008-04-08  11
> < 273   val_273 273 val_273 273 val_273 2008-04-08  11
> < 278   val_278 278 val_278 278 val_278 2008-04-08  11
> < 311   val_311 311 val_311 311 val_311 2008-04-08  11
> < 401   val_401 401 val_401 401 val_401 2008-04-08  11
> < 406   val_406 406 val_406 406 val_406 2008-04-08  11
> < 66val_66  66  val_66  66  val_66  2008-04-08  11
> < 98val_98  98  val_98  98  val_98  2008-04-08  11
> ---
> > 146   val_146 2008-04-08  11  146 val_146 146 val_146
> > 150   val_150 2008-04-08  11  150 val_150 150 val_150
> > 213   val_213 2008-04-08  11  213 val_213 213 val_213
> > 238   val_238 2008-04-08  11  238 val_238 238 val_238
> > 255   val_255 2008-04-08  11  255 val_255 255 val_255
> > 273   val_273 2008-04-08  11  273 val_273 273 val_273
> > 278   val_278 2008-04-08  11  278 val_278 278 val_278
> > 311   val_311 2008-04-08  11  311 val_311 311 val_311
> > 401   val_401 2008-04-08  11  401 val_401 401 val_401
> > 406   val_406 2008-04-08  11  406 val_406 406 val_406
> > 66val_66  2008-04-08  11  66  val_66  66  val_66
> > 98val_98  2008-04-08  11  98  val_98  98  val_98
> 4212c4212
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2016-01-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12661:
---
Attachment: HIVE-12661.12.patch

> StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
> ---
>
> Key: HIVE-12661
> URL: https://issues.apache.org/jira/browse/HIVE-12661
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, 
> HIVE-12661.03.patch, HIVE-12661.04.patch, HIVE-12661.05.patch, 
> HIVE-12661.06.patch, HIVE-12661.07.patch, HIVE-12661.08.patch, 
> HIVE-12661.09.patch, HIVE-12661.10.patch, HIVE-12661.11.patch, 
> HIVE-12661.12.patch
>
>
> PROBLEM:
> Hive stats are autogathered properly till an 'analyze table [tablename] 
> compute statistics for columns' is run. Then it does not auto-update the 
> stats till the command is run again. Repro:
> {code}
> set hive.stats.autogather=true; 
> set hive.stats.atomic=false ; 
> set hive.stats.collect.rawdatasize=true ; 
> set hive.stats.collect.scancols=false ; 
> set hive.stats.collect.tablekeys=false ; 
> set hive.stats.fetch.column.stats=true; 
> set hive.stats.fetch.partition.stats=true ; 
> set hive.stats.reliable=false ; 
> set hive.compute.query.using.stats=true; 
> CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 
> 'orc.compress'='NONE') ; 
> insert into calendar values (2010), (2011), (2012); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> ++--+ 
> select max(year) from calendar; 
> | 2012 | 
> insert into calendar values (2013); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> | 2013 | 
> ++--+ 
> select max(year) from calendar; 
> | 2013 | 
> insert into calendar values (2014); 
> select max(year) from calendar; 
> | 2014 |
> analyze table calendar compute statistics for columns;
> insert into calendar values (2015);
> select max(year) from calendar;
> | 2014 |
> insert into calendar values (2016), (2017), (2018);
> select max(year) from calendar;
> | 2014  |
> analyze table calendar compute statistics for columns;
> select max(year) from calendar;
> | 2018  |
> {code}
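The behaviour in the repro above, where answers are served from stale column stats until a fresh ANALYZE runs, can be sketched as follows (a hypothetical toy model, not Hive code; class and method names are invented):

```python
# Sketch of answering max() from cached column stats: once an explicit
# ANALYZE runs, later inserts stop refreshing the cached max, so
# stats-backed answers go stale until ANALYZE is run again.

class StatsBackedTable:
    def __init__(self):
        self.rows = []
        self.stats_max = None
        self.auto_refresh = True    # autogather keeps stats fresh at first

    def insert(self, *values):
        self.rows.extend(values)
        if self.auto_refresh:
            self.stats_max = max(self.rows)

    def analyze(self):
        self.stats_max = max(self.rows)
        self.auto_refresh = False   # the bug: inserts no longer update stats

    def select_max(self):
        # hive.compute.query.using.stats=true answers from stats, not data
        return self.stats_max

t = StatsBackedTable()
t.insert(2010, 2011, 2012)
t.insert(2013)
t.insert(2014)
print(t.select_max())   # 2014 -- autogather still fresh
t.analyze()
t.insert(2015)
print(t.select_max())   # 2014 -- stale: the data already holds 2015
t.insert(2016, 2017, 2018)
print(t.select_max())   # 2014 -- still stale
t.analyze()
print(t.select_max())   # 2018 -- fresh again after re-analyzing
```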



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2016-01-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102488#comment-15102488
 ] 

Pengcheng Xiong commented on HIVE-12661:


More golden file changes.

> StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
> ---
>
> Key: HIVE-12661
> URL: https://issues.apache.org/jira/browse/HIVE-12661
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, 
> HIVE-12661.03.patch, HIVE-12661.04.patch, HIVE-12661.05.patch, 
> HIVE-12661.06.patch, HIVE-12661.07.patch, HIVE-12661.08.patch, 
> HIVE-12661.09.patch, HIVE-12661.10.patch, HIVE-12661.11.patch, 
> HIVE-12661.12.patch
>
>
> PROBLEM:
> Hive stats are autogathered properly till an 'analyze table [tablename] 
> compute statistics for columns' is run. Then it does not auto-update the 
> stats till the command is run again. Repro:
> {code}
> set hive.stats.autogather=true; 
> set hive.stats.atomic=false ; 
> set hive.stats.collect.rawdatasize=true ; 
> set hive.stats.collect.scancols=false ; 
> set hive.stats.collect.tablekeys=false ; 
> set hive.stats.fetch.column.stats=true; 
> set hive.stats.fetch.partition.stats=true ; 
> set hive.stats.reliable=false ; 
> set hive.compute.query.using.stats=true; 
> CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 
> 'orc.compress'='NONE') ; 
> insert into calendar values (2010), (2011), (2012); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> ++--+ 
> select max(year) from calendar; 
> | 2012 | 
> insert into calendar values (2013); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> | 2013 | 
> ++--+ 
> select max(year) from calendar; 
> | 2013 | 
> insert into calendar values (2014); 
> select max(year) from calendar; 
> | 2014 |
> analyze table calendar compute statistics for columns;
> insert into calendar values (2015);
> select max(year) from calendar;
> | 2014 |
> insert into calendar values (2016), (2017), (2018);
> select max(year) from calendar;
> | 2014  |
> analyze table calendar compute statistics for columns;
> select max(year) from calendar;
> | 2018  |
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12880) spark-assembly causes Hive class version problems

2016-01-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102500#comment-15102500
 ] 

Xuefu Zhang commented on HIVE-12880:


I just checked my spark-assembly.jar, and it doesn't contain any Hive classes. 
I'm curious whether the observation comes from a particular Spark installation. We 
recommend that users build their own spark-assembly, excluding the Hive classes.

> spark-assembly causes Hive class version problems
> -
>
> Key: HIVE-12880
> URL: https://issues.apache.org/jira/browse/HIVE-12880
> Project: Hive
>  Issue Type: Bug
>Reporter: Hui Zheng
>
> It looks like spark-assembly contains versions of Hive classes (e.g. 
> HiveConf), and these sometimes (always?) come from older versions of Hive.
> We've seen problems where depending on classpath perturbations, NoSuchField 
> errors may be thrown for recently added ConfVars because the HiveConf class 
> comes from spark-assembly.
> Would making sure spark-assembly comes last in the classpath solve the 
> problem?
> Otherwise, can we depend on something that does not package older Hive 
> classes?
> Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; 
> I am assuming this issue can also affect Hive-on-Spark).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12881) Count() function over partitions doesn't work properly with ORDER BY

2016-01-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12881:

Assignee: (was: Aihua Xu)

> Count() function over partitions doesn't work properly with ORDER BY 
> -
>
> Key: HIVE-12881
> URL: https://issues.apache.org/jira/browse/HIVE-12881
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>
> The following query doesn't seem to return the correct result.
> {noformat}
> create table test (empno string, deptno string, level string, manager string);
> insert into test values ('1', '2', 'B', 'Else'); 
> insert into test values ('1', '2', 'B', 'Else');
> insert into test values ('2', '2', 'B', 'Other');
> select  count( manager) over (partition by deptno, level order by manager) 
> from test; 
> {noformat}
> It  returns 
> {noformat}
> 2
> 2
> 3
> {noformat}
> Without ORDER BY, it returns correct result
> {noformat}
> 3
> 3
> 3
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12881) Count() function over partitions doesn't work properly with ORDER BY

2016-01-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102514#comment-15102514
 ] 

Aihua Xu commented on HIVE-12881:
-

It seems we intentionally interpret a windowing spec that has ORDER BY but no 
explicit start and end as UNBOUNDED PRECEDING to CURRENT ROW, matching at least 
SQL Server and Oracle. Not an issue.
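The default-frame semantics explained above reproduce the numbers from this issue. A Python sketch (an assumed simplification; function names are invented) of count() over the default RANGE frame versus the whole partition:

```python
# Sketch: with ORDER BY and no explicit frame, the frame defaults to
# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so each row's count
# includes every row whose order key sorts at or before its own (peers
# with an equal key are included together).

def count_over_ordered(values):
    """count(x) OVER (PARTITION BY ... ORDER BY x) with the default frame."""
    ordered = sorted(values)
    return [sum(1 for w in ordered if w <= v) for v in ordered]

def count_over_unordered(values):
    """count(x) OVER (PARTITION BY ...) -- whole partition, no ORDER BY."""
    return [len(values)] * len(values)

managers = ['Else', 'Else', 'Other']
print(count_over_ordered(managers))    # [2, 2, 3]
print(count_over_unordered(managers))  # [3, 3, 3]
```

The two 'Else' rows are peers, so both see a frame of 2; the 'Other' row's frame covers the whole partition, giving 3, exactly the output the reporter observed.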

> Count() function over partitions doesn't work properly with ORDER BY 
> -
>
> Key: HIVE-12881
> URL: https://issues.apache.org/jira/browse/HIVE-12881
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> The following query doesn't seem to return the correct result.
> {noformat}
> create table test (empno string, deptno string, level string, manager string);
> insert into test values ('1', '2', 'B', 'Else'); 
> insert into test values ('1', '2', 'B', 'Else');
> insert into test values ('2', '2', 'B', 'Other');
> select  count( manager) over (partition by deptno, level order by manager) 
> from test; 
> {noformat}
> It  returns 
> {noformat}
> 2
> 2
> 3
> {noformat}
> Without ORDER BY, it returns correct result
> {noformat}
> 3
> 3
> 3
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12881) Count() function over partitions doesn't work properly with ORDER BY

2016-01-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu resolved HIVE-12881.
-
Resolution: Not A Problem

> Count() function over partitions doesn't work properly with ORDER BY 
> -
>
> Key: HIVE-12881
> URL: https://issues.apache.org/jira/browse/HIVE-12881
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> The following query doesn't seem to return the correct result.
> {noformat}
> create table test (empno string, deptno string, level string, manager string);
> insert into test values ('1', '2', 'B', 'Else'); 
> insert into test values ('1', '2', 'B', 'Else');
> insert into test values ('2', '2', 'B', 'Other');
> select  count( manager) over (partition by deptno, level order by manager) 
> from test; 
> {noformat}
> It  returns 
> {noformat}
> 2
> 2
> 3
> {noformat}
> Without ORDER BY, it returns correct result
> {noformat}
> 3
> 3
> 3
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12826) Vectorization: VectorUDAF* suspect isNull checks

2016-01-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102528#comment-15102528
 ] 

Gopal V commented on HIVE-12826:


No new failures with this test - all failed tests have failed previously 
without this patch.

> Vectorization: VectorUDAF* suspect isNull checks
> 
>
> Key: HIVE-12826
> URL: https://issues.apache.org/jira/browse/HIVE-12826
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 1.3.0, 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12826.1.patch
>
>
> for isRepeating=true, checking isNull[selected[i]] might return incorrect 
> results (without a heavy array fill of isNull).
> VectorUDAFSum/Min/Max/Avg and SumDecimal impls need to be reviewed for this 
> pattern.
> {code}
> private void iterateHasNullsRepeatingSelectionWithAggregationSelection(
>   VectorAggregationBufferRow[] aggregationBufferSets,
>   int aggregateIndex,
>value,
>   int batchSize,
>   int[] selection,
>   boolean[] isNull) {
>   
>   for (int i=0; i < batchSize; ++i) {
> if (!isNull[selection[i]]) {
>   Aggregation myagg = getCurrentAggregationBuffer(
> aggregationBufferSets, 
> aggregateIndex,
> i);
>   myagg.sumValue(value);
> }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2016-01-15 Thread sai chaithanya (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sai chaithanya reassigned HIVE-11055:
-

Assignee: sai chaithanya  (was: Dmitry Tolpeko)

> HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
> ---
>
> Key: HIVE-11055
> URL: https://issues.apache.org/jira/browse/HIVE-11055
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: sai chaithanya
> Fix For: 2.0.0
>
> Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch, 
> HIVE-11055.3.patch, HIVE-11055.4.patch, hplsql-site.xml
>
>
> There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
> (actually any SQL-on-Hadoop implementation and any JDBC source).
> Alan Gates offered to contribute it to Hive under HPL/SQL name 
> (org.apache.hive.hplsql package). This JIRA is to create a patch to 
> contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12880) spark-assembly causes Hive class version problems

2016-01-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102565#comment-15102565
 ] 

Sergey Shelukhin commented on HIVE-12880:
-

Hmm, let me see why this one contains Hive classes.

> spark-assembly causes Hive class version problems
> -
>
> Key: HIVE-12880
> URL: https://issues.apache.org/jira/browse/HIVE-12880
> Project: Hive
>  Issue Type: Bug
>Reporter: Hui Zheng
>
> It looks like spark-assembly contains versions of Hive classes (e.g. 
> HiveConf), and these sometimes (always?) come from older versions of Hive.
> We've seen problems where depending on classpath perturbations, NoSuchField 
> errors may be thrown for recently added ConfVars because the HiveConf class 
> comes from spark-assembly.
> Would making sure spark-assembly comes last in the classpath solve the 
> problem?
> Otherwise, can we depend on something that does not package older Hive 
> classes?
> Currently, HIVE-12179 provides a workaround (in non-Spark use case, at least; 
> I am assuming this issue can also affect Hive-on-Spark).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102578#comment-15102578
 ] 

Pengcheng Xiong commented on HIVE-12882:


cc'ing [~prasanth_j]

> Automatically choose to use noscan for stats collection
> ---
>
> Key: HIVE-12882
> URL: https://issues.apache.org/jira/browse/HIVE-12882
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>
> noscan is leveraging the file system to derive the #rows and rawDataSize. 
> According to [~ashutoshc], it now only works with RC and ORC file type. We 
> would like Hive to automatically choose to use noscan or scan based on the 
> file system when stats task starts or when user issues the same query 
> "Analyze "



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102588#comment-15102588
 ] 

Prasanth Jayachandran commented on HIVE-12882:
--

ORC tables/partitions always use noscan by default. In case of ORC, noscan, 
partialscan and fullscan are all the same.



[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102599#comment-15102599
 ] 

Pengcheng Xiong commented on HIVE-12882:


[~prasanth_j] that is great. Then, if it happens automatically without the customer 
being aware of it, shall we drop the "no scan" syntax in the parser?



[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102603#comment-15102603
 ] 

Prasanth Jayachandran commented on HIVE-12882:
--

RCFile still uses three modes with different semantics: noscan, partialscan, and the 
default full scan. Dropping the syntax in the parser will break RCFile behaviour. As 
far as ORC is concerned, it really doesn't matter what the user specifies. 



[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102611#comment-15102611
 ] 

Pengcheng Xiong commented on HIVE-12882:


If noscan is much faster than partial scan and full scan, and they all have the same 
accuracy, why should we keep partialscan and the default full scan for RCFile?



[jira] [Updated] (HIVE-12809) Vectorization: fast-path for coalesce if input.noNulls = true

2016-01-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12809:
---
Attachment: HIVE-12809.1.patch

> Vectorization: fast-path for coalesce if input.noNulls = true
> -
>
> Key: HIVE-12809
> URL: https://issues.apache.org/jira/browse/HIVE-12809
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12809.1.patch
>
>
> Coalesce can skip processing the other columns if all the input columns are 
> non-null, possibly retaining isRepeating=true.
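The fast path described above can be sketched as follows. This is an illustrative model, not the real VectorCoalesce code: `ColumnStub` is a stand-in for Hive's `ColumnVector`, and the method names are invented for the example. The idea is that when the first input column reports `noNulls`, COALESCE can return that column as-is (retaining its `isRepeating` flag) and never evaluate the remaining inputs.

```java
/** Illustrative sketch (not Hive's actual VectorCoalesce) of the fast path:
 *  if the first input column has no nulls, COALESCE is just that column. */
public class CoalesceFastPath {

  /** Stand-in for a simplified ColumnVector. */
  public static final class ColumnStub {
    public boolean noNulls = true;
    public boolean isRepeating = false;
    public boolean[] isNull;
    public long[] vector;

    public ColumnStub(long[] values, boolean[] nulls) {
      vector = values;
      isNull = nulls;
      for (boolean n : nulls) {
        if (n) { noNulls = false; break; }
      }
    }
  }

  public static ColumnStub coalesce(ColumnStub[] cols, int numRows) {
    if (cols[0].noNulls) {
      // Fast path: first column has no nulls, so it IS the result.
      // Other columns are never touched; isRepeating is retained.
      return cols[0];
    }
    // General path: per row, take the first non-null value.
    long[] out = new long[numRows];
    boolean[] outNull = new boolean[numRows];
    for (int i = 0; i < numRows; i++) {
      outNull[i] = true;
      for (ColumnStub c : cols) {
        if (!c.isNull[i]) {
          out[i] = c.vector[i];
          outNull[i] = false;
          break;
        }
      }
    }
    return new ColumnStub(out, outNull);
  }
}
```

The payoff is that the fast path is O(1) per batch instead of O(rows × columns).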





[jira] [Resolved] (HIVE-12854) LLAP: register permanent UDFs in the executors to make them usable, from localized jars

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-12854.
-
Resolution: Duplicate

This is much simpler than I thought it would be because FunctionRegistry isn't 
actually needed on LLAP side, and also requires a different interface; so I 
will just roll this into the previous patch.

> LLAP: register permanent UDFs in the executors to make them usable, from 
> localized jars
> ---
>
> Key: HIVE-12854
> URL: https://issues.apache.org/jira/browse/HIVE-12854
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>






[jira] [Updated] (HIVE-12853) LLAP: localize permanent UDF jars to daemon and add them to classloader

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12853:

Summary: LLAP: localize permanent UDF jars to daemon and add them to 
classloader  (was: LLAP: localize permanent UDF jars to daemon)

> LLAP: localize permanent UDF jars to daemon and add them to classloader
> ---
>
> Key: HIVE-12853
> URL: https://issues.apache.org/jira/browse/HIVE-12853
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12853.patch
>
>






[jira] [Updated] (HIVE-12809) Vectorization: fast-path for coalesce if input.noNulls = true

2016-01-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12809:
---
Attachment: HIVE-12809.2.patch

Attach the right patch with the isNull[0] change.

> Vectorization: fast-path for coalesce if input.noNulls = true
> -
>
> Key: HIVE-12809
> URL: https://issues.apache.org/jira/browse/HIVE-12809
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12809.1.patch, HIVE-12809.2.patch
>
>
> Coalesce can skip processing other columns, if all the input columns are 
> non-null.
> Possibly retaining, isRepeating=true.





[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102630#comment-15102630
 ] 

Prasanth Jayachandran commented on HIVE-12882:
--

RCFile supports the following modes, and each of these is progressively slower:
noscan - does not read files; derives total file count and total file size using 
HDFS APIs (fastest)
partialscan - partially reads files to get metadata such as the row count (fast)
fullscan - reads row-by-row to compute the raw data size (slow)

In the case of ORC, all of these can be retrieved from the ORC file footer, so there 
is only one mode, which is fast. 

RCFile needs all 3 modes. 
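The three modes can be modeled with a small sketch. This is not Hive's actual StatsTask; `FileStub` is an invented stand-in that just records which numbers each mode is able to reach for one RCFile: the size is visible to noscan from filesystem metadata, the row count needs a partial scan of per-file metadata, and raw data size needs a full row-by-row read.

```java
import java.util.List;

/** Illustrative model of the three RCFile stats-collection modes (not Hive code). */
public class StatsModes {

  /** Stand-in for one RCFile's reachable metadata. */
  public static final class FileStub {
    final long size;        // visible to noscan via filesystem metadata
    final long rowCount;    // requires partialscan (per-file metadata)
    final long rawDataSize; // requires fullscan (row-by-row read)

    public FileStub(long size, long rowCount, long rawDataSize) {
      this.size = size;
      this.rowCount = rowCount;
      this.rawDataSize = rawDataSize;
    }
  }

  /** noscan: file count and total size only; no file contents read (fastest). */
  public static long[] noScan(List<FileStub> files) {
    long totalSize = 0;
    for (FileStub f : files) totalSize += f.size;
    return new long[] { files.size(), totalSize };
  }

  /** partialscan: additionally read per-file metadata to sum row counts (fast). */
  public static long partialScan(List<FileStub> files) {
    long rows = 0;
    for (FileStub f : files) rows += f.rowCount;
    return rows;
  }

  /** fullscan: read rows to compute raw data size (slow). */
  public static long fullScan(List<FileStub> files) {
    long raw = 0;
    for (FileStub f : files) raw += f.rawDataSize;
    return raw;
  }
}
```

In ORC's case all three numbers sit in the file footer, which is why the modes collapse into a single cheap one.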





[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102636#comment-15102636
 ] 

Pengcheng Xiong commented on HIVE-12882:


Why does RCFile need all 3 modes? This is something that I do not understand. :) 
They achieve the same purpose, and noscan obviously outperforms the other two 
without any downside. Then why do we need to keep the partialscan and fullscan 
modes? Thanks.



[jira] [Updated] (HIVE-12853) LLAP: localize permanent UDF jars to daemon and add them to classloader

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12853:

Attachment: HIVE-12853.01.patch

This also adds the class to the classloader. I was able to run a query with a 
permanent UDF on LLAP.

> LLAP: localize permanent UDF jars to daemon and add them to classloader
> ---
>
> Key: HIVE-12853
> URL: https://issues.apache.org/jira/browse/HIVE-12853
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12853.01.patch, HIVE-12853.patch
>
>






[jira] [Commented] (HIVE-12851) Add slider security setting support to LLAP packager

2016-01-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102643#comment-15102643
 ] 

Gopal V commented on HIVE-12851:


-1 on the use of argparse - it is famously a Python 3 module.

The default centos6 python version is 2.6, which is what we expect at a normal 
install location today.

> Add slider security setting support to LLAP packager
> 
>
> Key: HIVE-12851
> URL: https://issues.apache.org/jira/browse/HIVE-12851
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12851.patch
>
>
> {noformat}
> "slider.hdfs.keytab.dir": "...",
> "slider.am.login.keytab.name": "...",
> "slider.keytab.principal.name": "..."
> {noformat}
> should be emitted into appConfig.json for Slider AM. Right now, they have to 
> be added manually on a secure cluster.





[jira] [Commented] (HIVE-12882) Automatically choose to use noscan for stats collection

2016-01-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102644#comment-15102644
 ] 

Prasanth Jayachandran commented on HIVE-12882:
--

noscan cannot get the row count or raw data size. Partial scan cannot get the raw 
data size. If you need all the basic stats, then full scan is the only way to go, 
which is slow. 



[jira] [Commented] (HIVE-9774) Print yarn application id to console [Spark Branch]

2016-01-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102652#comment-15102652
 ] 

Xuefu Zhang commented on HIVE-9774:
---

[~lirui], could you take a look at this to see if this is something you can 
help with? [~chinnalalam], I'm assuming you're not working on this.

> Print yarn application id to console [Spark Branch]
> ---
>
> Key: HIVE-9774
> URL: https://issues.apache.org/jira/browse/HIVE-9774
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Chinna Rao Lalam
>
> Oozie would like to use beeline to capture the yarn application id of apps so 
> that if a workflow is canceled, the job can be cancelled. When running under 
> MR we print the job id but under spark we do not.





[jira] [Comment Edited] (HIVE-9774) Print yarn application id to console [Spark Branch]

2016-01-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102652#comment-15102652
 ] 

Xuefu Zhang edited comment on HIVE-9774 at 1/15/16 11:07 PM:
-

[~lirui], could you take a look at this to see if this is something you can 
help with? [~chinnalalam], I'm assuming you're not working on this. Let me know 
if otherwise.


was (Author: xuefuz):
[~lirui], could you take a look at this to see if this is something you can 
help with? [~chinnalalam], I'm assuming you're not working on this.



[jira] [Commented] (HIVE-12758) Parallel compilation: Operator::resetId() is not thread-safe

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102668#comment-15102668
 ] 

Hive QA commented on HIVE-12758:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782437/HIVE-12758.03.patch

{color:green}SUCCESS:{color} +1 due to 13 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6640/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6640/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6640/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782437 - PreCommit-HIVE-TRUNK-Build

> Parallel compilation: Operator::resetId() is not thread-safe
> 
>
> Key: HIVE-12758
> URL: https://issues.apache.org/jira/browse/HIVE-12758
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12758.01.patch, HIVE-12758.02.patch, 
> HIVE-12758.03.patch, HIVE-12758.03.patch, HIVE-12758.patch
>
>
> {code}
>   private static AtomicInteger seqId;
> ...
>   public Operator() {
> this(String.valueOf(seqId.getAndIncrement()));
>   }
>   public static void resetId() {
> seqId.set(0);
>   }
> {code}
> Potential race-condition.
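The hazard in the quoted snippet can be seen with a small sketch (illustrative, not the committed fix): `seqId` itself is atomic, so `getAndIncrement()` is safe, but `resetId()` rewinding the shared counter means a compilation running in parallel can be handed operator ids that collide with ids already given out. One safe alternative, assumed here for illustration, is to never reset the shared counter so ids stay globally unique.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the race described above (not Hive's Operator class). */
public class OperatorIds {
  private static final AtomicInteger seqId = new AtomicInteger(0);

  /** Thread-safe: each call returns a distinct, monotonically increasing id. */
  public static int nextId() {
    return seqId.getAndIncrement();
  }

  /** Unsafe under parallel compilation: ids handed out after this call
   *  collide with ids handed out before it. */
  public static void unsafeReset() {
    seqId.set(0);
  }
}
```

With two compiler threads, one calling `unsafeReset()` mid-flight makes the other thread reuse ids 0, 1, 2, ... that its own plan already contains.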





[jira] [Commented] (HIVE-12851) Add slider security setting support to LLAP packager

2016-01-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102671#comment-15102671
 ] 

Sergey Shelukhin commented on HIVE-12851:
-

argparse is supported for Python 2.3 and later... I think it's reasonable to 
expect users to install it if missing. It's also a standalone module, so it can 
be copied if missing. When I was testing it, it was present on Python 2.6.6 on 
RHEL 6.6 and CentOS 6.6 (2 separate clusters). 



[jira] [Commented] (HIVE-12758) Parallel compilation: Operator::resetId() is not thread-safe

2016-01-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102674#comment-15102674
 ] 

Sergey Shelukhin commented on HIVE-12758:
-

Remaining test failures are unrelated.



[jira] [Commented] (HIVE-12758) Parallel compilation: Operator::resetId() is not thread-safe

2016-01-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102684#comment-15102684
 ] 

Gopal V commented on HIVE-12758:


LGTM - +1.

The patch seems to have also fixed stats issues in {{subquery_multiinsert.q}}



[jira] [Commented] (HIVE-12851) Add slider security setting support to LLAP packager

2016-01-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102717#comment-15102717
 ] 

Gopal V commented on HIVE-12851:


QE automation uses it, which might be why you have it.



[jira] [Updated] (HIVE-12853) LLAP: localize permanent UDF jars to daemon and add them to classloader

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12853:

Attachment: HIVE-12853.02.patch

Tiny fix for the tests

> LLAP: localize permanent UDF jars to daemon and add them to classloader
> ---
>
> Key: HIVE-12853
> URL: https://issues.apache.org/jira/browse/HIVE-12853
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12853.01.patch, HIVE-12853.02.patch, 
> HIVE-12853.patch
>
>






[jira] [Assigned] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist

2016-01-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-12855:
---

Assignee: Sergey Shelukhin

> LLAP: add checks when resolving UDFs to enforce whitelist
> -
>
> Key: HIVE-12855
> URL: https://issues.apache.org/jira/browse/HIVE-12855
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Currently, adding a temporary UDF and calling LLAP with it (bypassing the 
> LlapDecider check, I did it by just modifying the source) only fails because 
> the class could not be found. If the UDF was accessible to LLAP, it would 
> execute. Inside the daemon, UDF instantiation should fail for custom UDFs 
> (and only succeed for whitelisted custom UDFs, once that is implemented).





[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union

2016-01-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102864#comment-15102864
 ] 

Ashutosh Chauhan commented on HIVE-12863:
-

I think the lowercasing needs to happen in HBaseStore. That will make it consistent 
with ObjectStore, which lowercases column names before storing them. But more 
importantly, doing it server side (as opposed to in the client, as is done 
currently) will ensure that all clients of the metastore see consistent behavior, 
with column names stored as lower case in the metadata storage.
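The server-side normalization suggested above amounts to something like the sketch below. The class and method names are illustrative, not HBaseStore's actual API; the point is simply that lowercasing once at the store boundary gives every client the same stored form.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Minimal sketch of server-side column-name normalization (names invented,
 *  not the real HBaseStore API). */
public class ColumnNameNormalizer {

  public static List<String> normalize(List<String> columnNames) {
    return columnNames.stream()
        // Locale.ROOT avoids locale-dependent surprises (e.g. the Turkish dotless i).
        .map(name -> name.toLowerCase(Locale.ROOT))
        .collect(Collectors.toList());
  }
}
```

Calling this inside the store, rather than in each client, is what guarantees the consistent behavior described above.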

> fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
> -
>
> Key: HIVE-12863
> URL: https://issues.apache.org/jira/browse/HIVE-12863
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12863.01.patch
>
>






[jira] [Commented] (HIVE-12611) Make sure spark.yarn.queue is effective and takes the value from mapreduce.job.queuename if given [Spark Branch]

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102878#comment-15102878
 ] 

Hive QA commented on HIVE-12611:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782532/HIVE-12611.1-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9866 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1032/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1032/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1032/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782532 - PreCommit-HIVE-SPARK-Build

> Make sure spark.yarn.queue is effective and takes the value from 
> mapreduce.job.queuename if given [Spark Branch]
> 
>
> Key: HIVE-12611
> URL: https://issues.apache.org/jira/browse/HIVE-12611
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Attachments: HIVE-12611.1-spark.patch
>
>
> Hive users sometimes specify a job queue name for the submitted MR jobs. 
> For Spark, the property name is spark.yarn.queue. We need to make sure that 
> the user is able to submit Spark jobs to the given queue. If the user 
> specifies the MR property, then Hive on Spark should honor that as well to 
> remain backward compatible.
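The fallback described in this issue can be sketched as a simple resolution rule. This is a hypothetical helper for illustration, not the actual Hive-on-Spark code: prefer an explicitly set `spark.yarn.queue`, otherwise fall back to `mapreduce.job.queuename`.

```java
import java.util.Map;

/** Illustrative sketch of the queue-name fallback (not actual Hive code). */
public class QueueResolver {

  public static String resolveQueue(Map<String, String> conf) {
    String sparkQueue = conf.get("spark.yarn.queue");
    if (sparkQueue != null && !sparkQueue.isEmpty()) {
      return sparkQueue;               // explicit Spark setting wins
    }
    // Backward compatibility: honor the MR property if it was given.
    // May return null, in which case YARN's default queue applies.
    return conf.get("mapreduce.job.queuename");
  }
}
```

A user who only ever set the MR property keeps their queue placement when switching the execution engine to Spark.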





[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102897#comment-15102897
 ] 

Hive QA commented on HIVE-12863:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782456/HIVE-12863.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6641/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6641/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6641/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782456 - PreCommit-HIVE-TRUNK-Build



[jira] [Updated] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union

2016-01-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12863:
---
Attachment: HIVE-12863.02.patch

> fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
> -
>
> Key: HIVE-12863
> URL: https://issues.apache.org/jira/browse/HIVE-12863
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch
>
>






[jira] [Commented] (HIVE-12611) Make sure spark.yarn.queue is effective and takes the value from mapreduce.job.queuename if given [Spark Branch]

2016-01-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102946#comment-15102946
 ] 

Xuefu Zhang commented on HIVE-12611:


It also looks like the problem with parquet was fixed, based on the result. We 
had to go to the test machine to clean up the /thirdparty tarball.



[jira] [Updated] (HIVE-12851) Add slider security setting support to LLAP packager

2016-01-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12851:
---
Attachment: HIVE-12851.2.patch

Fix the --chaosmonkey option and add "s" suffix to the template formats.

> Add slider security setting support to LLAP packager
> 
>
> Key: HIVE-12851
> URL: https://issues.apache.org/jira/browse/HIVE-12851
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12851.2.patch, HIVE-12851.patch
>
>
> {noformat}
> "slider.hdfs.keytab.dir": "...",
> "slider.am.login.keytab.name": "...",
> "slider.keytab.principal.name": "..."
> {noformat}
> should be emitted into appConfig.json for Slider AM. Right now, they have to 
> be added manually on a secure cluster.





[jira] [Commented] (HIVE-12877) Hive use index for queries will lose some data if the Query file is compressed.

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102955#comment-15102955
 ] 

Hive QA commented on HIVE-12877:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782464/HIVE-12877.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6642/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6642/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6642/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782464 - PreCommit-HIVE-TRUNK-Build

> Hive use index for queries will lose some data if the Query file is 
> compressed.
> ---
>
> Key: HIVE-12877
> URL: https://issues.apache.org/jira/browse/HIVE-12877
> Project: Hive
>  Issue Type: Bug
>  Components: Indexing
>Affects Versions: 1.2.1
> Environment: This problem exists in all Hive versions, no matter the 
> platform
>Reporter: yangfang
> Attachments: HIVE-12877.patch
>
>
> When a file is compressed, Hive builds the index using the extracted 
> (uncompressed) file length. But when MapReduce divides the data into 
> splits, Hive compares the on-disk file length against the extracted length 
> recorded in the index; for compressed files the two never match, so the 
> file is filtered out and the query loses data.
> I modified the source code so the Hive index can be used when the files 
> are compressed; please test it.





[jira] [Commented] (HIVE-12839) Upgrade Hive to Calcite 1.6

2016-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102987#comment-15102987
 ] 

Hive QA commented on HIVE-12839:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782465/HIVE-12839.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10019 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6643/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6643/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6643/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782465 - PreCommit-HIVE-TRUNK-Build

> Upgrade Hive to Calcite 1.6
> ---
>
> Key: HIVE-12839
> URL: https://issues.apache.org/jira/browse/HIVE-12839
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12839.01.patch, HIVE-12839.02.patch
>
>
> CLEAR LIBRARY CACHE
> Upgrade Hive to Calcite 1.6.0-incubating.


