Re: Review Request 40500: HIVE-12338 Add webui to HiveServer2

2015-11-19 Thread Mohit Sabharwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40500/#review107283
---


LGTM. This patch only includes SQLOperations. Are we planning to add metadata 
operations as well (so that we can also capture JDBC clients and Hue usage)?


common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 1856)


Th -> The



service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
(line 65)


Any reason we only want SQLOperations?


- Mohit Sabharwal


On Nov. 19, 2015, 8:53 p.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40500/
> ---
> 
> (Updated Nov. 19, 2015, 8:53 p.m.)
> 
> 
> Review request for hive, Szehon Ho and Xuefu Zhang.
> 
> 
> Bugs: HIVE-12338
> https://issues.apache.org/jira/browse/HIVE-12338
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Added web UI to HS2. The UI is similar to those for other Hadoop components.
> The default web UI port is set to 10002, which is configurable. It can be 
> disabled. Currently it shows active sessions and queries. It can also access 
> locals, metrics, and configuration.
> 
> 
> Diffs
> -
> 
>   common/pom.xml cd14581 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2185f85 
>   common/src/java/org/apache/hive/http/AdminAuthorizedServlet.java 
> PRE-CREATION 
>   common/src/java/org/apache/hive/http/ConfServlet.java PRE-CREATION 
>   common/src/java/org/apache/hive/http/HttpServer.java PRE-CREATION 
>   common/src/java/org/apache/hive/http/JMXJsonServlet.java PRE-CREATION 
>   pom.xml c6df4a5 
>   service/pom.xml afa52cf 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> d13415e 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> b0bd351 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> 8b42265 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> 1ab5652 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 2d784f0 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> d11cf3d 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java b30b6a2 
>   service/src/resources/hive-webapps/hiveserver2/hiveserver2.jsp PRE-CREATION 
>   service/src/resources/hive-webapps/hiveserver2/index.html PRE-CREATION 
>   service/src/resources/hive-webapps/static/css/bootstrap-theme.min.css 
> PRE-CREATION 
>   service/src/resources/hive-webapps/static/css/bootstrap.min.css 
> PRE-CREATION 
>   service/src/resources/hive-webapps/static/css/hive.css PRE-CREATION 
>   
> service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.eot
>  PRE-CREATION 
>   
> service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.svg
>  PRE-CREATION 
>   
> service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.ttf
>  PRE-CREATION 
>   
> service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.woff
>  PRE-CREATION 
>   service/src/resources/hive-webapps/static/hive_logo.jpeg PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/40500/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Created] (HIVE-12477) CBO: Left Semijoins are incompatible with a cross-product

2015-11-19 Thread Gopal V (JIRA)
Gopal V created HIVE-12477:
--

 Summary: CBO: Left Semijoins are incompatible with a cross-product
 Key: HIVE-12477
 URL: https://issues.apache.org/jira/browse/HIVE-12477
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez


With HIVE-12017 in place, a few queries generate left semi-joins without a key.

This is an invalid plan and can be reproduced by running:

{code}
explain logical select count(1) from store_sales where ss_sold_date_sk in 
(select d_date_sk from date_dim where d_date_sk = 1);

LOGICAL PLAN:  
$hdt$_0:$hdt$_0:$hdt$_0:store_sales
  TableScan (TS_0)
alias: store_sales
filterExpr: (ss_sold_date_sk = 1) (type: boolean)
Filter Operator (FIL_20)
  predicate: (ss_sold_date_sk = 1) (type: boolean)
  Select Operator (SEL_2)
Reduce Output Operator (RS_9)
  sort order: 
  Join Operator (JOIN_11)
condition map:
 Left Semi Join 0 to 1
keys:
  0 
  1 
Group By Operator (GBY_14)
  aggregations: count(1)
  mode: hash
{code}

Without CBO:

{code}
sq_1:date_dim
  TableScan (TS_1)
alias: date_dim
filterExpr: ((1) IN (RS[6]) and (d_date_sk = 1)) (type: boolean)
Filter Operator (FIL_21)
  predicate: ((1) IN (RS[6]) and (d_date_sk = 1)) (type: boolean)
  Select Operator (SEL_3)
expressions: 1 (type: int)
outputColumnNames: _col0
Group By Operator (GBY_5)
  keys: _col0 (type: int)
  mode: hash
  outputColumnNames: _col0
  Reduce Output Operator (RS_8)
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Join Operator (JOIN_9)
  condition map:
   Left Semi Join 0 to 1
  keys:
0 ss_sold_date_sk (type: int)
1 _col0 (type: int)
  Group By Operator (GBY_12)
aggregations: count(1)
mode: hash
{code}
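
To illustrate the difference between the two plans above, here is a minimal Java sketch of a keyed versus a keyless left semi join. It is not Hive code: `SemiJoinSketch` and both method names are hypothetical, and the keyless variant only models the assumption that a join with an empty key list has nothing to compare, so every left row matches as soon as the right side is non-empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class SemiJoinSketch {
    /** Keyed left semi join: keep each left row whose key appears on the right. */
    static <L, R, K> List<L> leftSemiJoin(List<L> left, List<R> right,
                                          Function<L, K> leftKey,
                                          Function<R, K> rightKey) {
        Set<K> rightKeys = new HashSet<>();
        for (R r : right) {
            rightKeys.add(rightKey.apply(r));
        }
        List<L> out = new ArrayList<>();
        for (L l : left) {
            if (rightKeys.contains(leftKey.apply(l))) {
                out.add(l);
            }
        }
        return out;
    }

    /** Keyless variant (assumption): any right row matches every left row. */
    static <L, R> List<L> keylessLeftSemiJoin(List<L> left, List<R> right) {
        return right.isEmpty() ? new ArrayList<L>() : new ArrayList<L>(left);
    }

    public static void main(String[] args) {
        List<Integer> ssSoldDateSk = Arrays.asList(1, 1, 2, 3);
        List<Integer> dDateSk = Arrays.asList(1);
        // Keyed: only rows whose key exists on the right survive.
        System.out.println(leftSemiJoin(ssSoldDateSk, dDateSk,
                Function.identity(), Function.identity()));
        // Keyless: behaves like an existence check over a cross product.
        System.out.println(keylessLeftSemiJoin(ssSoldDateSk, dDateSk));
    }
}
```

In the keyless form the join no longer filters by column value at all, which is why the plan depends entirely on the pushed-down filter being correct.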



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12476) Metastore NPE on Oracle with Direct SQL

2015-11-19 Thread Jason Dere (JIRA)
Jason Dere created HIVE-12476:
-

 Summary: Metastore NPE on Oracle with Direct SQL
 Key: HIVE-12476
 URL: https://issues.apache.org/jira/browse/HIVE-12476
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Jason Dere
Assignee: Jason Dere


Stack trace looks very similar to HIVE-8485. I believe the metastore's Direct 
SQL mode requires additional fixes similar to HIVE-8485, around the 
Partition/StorageDescriptorSerDe parameters.





[jira] [Created] (HIVE-12475) Parquet schema evolution within array<struct<>> doesn't work

2015-11-19 Thread Mohammad Kamrul Islam (JIRA)
Mohammad Kamrul Islam created HIVE-12475:


 Summary: Parquet schema evolution within array<struct<>> doesn't 
work
 Key: HIVE-12475
 URL: https://issues.apache.org/jira/browse/HIVE-12475
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.1.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam


If we create a table with type array<struct<>>, and later add a field to the 
struct, we get the following exception.

The following SQL statements would recreate the error:

{quote}
CREATE TABLE pq_test (f1 array>) STORED AS  PARQUET;
INSERT INTO TABLE pq_test select array(named_struct("c1",1,"c2",2)) FROM tmp 
LIMIT 2;

SELECT * from pq_test;

ALTER TABLE pq_test REPLACE COLUMNS (f1 
array>); //* cc
SELECT * from pq_test;
{quote}

Exception:
{quote}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
at 
org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldData(ArrayWritableObjectInspector.java:142)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:363)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:316)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:199)
at 
org.apache.hadoop.hive.serde2.DelimitedJSONSerDe.serializeField(DelimitedJSONSerDe.java:61)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:236)
at 
org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
at 
org.apache.hadoop.hive.ql.exec.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:71)
at 
org.apache.hadoop.hive.ql.exec.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:40)
at 
org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:89)
{quote}
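
The trace suggests that a row written with two struct fields is being indexed with the third field of the new schema. A minimal Java sketch of a bounds-checked lookup follows. It is an illustration only, not the actual `ArrayWritableObjectInspector` code or a proposed patch; the class name and the field name `c3` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class StructFieldSketch {
    /**
     * Returns the value at fieldIndex, or null when the stored row was
     * written with fewer fields than the current table schema declares.
     */
    static Object getStructFieldData(Object[] storedRow, int fieldIndex) {
        if (storedRow == null || fieldIndex < 0 || fieldIndex >= storedRow.length) {
            return null; // field was added after this row was written
        }
        return storedRow[fieldIndex];
    }

    public static void main(String[] args) {
        Object[] oldRow = {1, 2}; // written with two struct fields
        // "c3" stands in for the field added by the ALTER TABLE (hypothetical name).
        List<String> newSchema = Arrays.asList("c1", "c2", "c3");
        for (int i = 0; i < newSchema.size(); i++) {
            System.out.println(newSchema.get(i) + " = " + getStructFieldData(oldRow, i));
        }
    }
}
```

Returning null for a missing trailing field matches how added columns usually read for old data, but whether that is the right behaviour inside a nested struct is for the fix to decide.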





Re: [VOTE] Hive 2.0 release plan

2015-11-19 Thread Sergey Shelukhin
Hmm. I looked at the JIRAs targeting the release and it looks like there’s
a large number of features still pending.
I am going to postpone creating the branch to next week.
I am also going to unassign JIRAs from the release at that time.

On 15/11/16, 18:09, "Sergey Shelukhin"  wrote:

>With 8 binding +1s and 0 -1s the vote passes.
>The release activities will now proceed according to the plan. I will look
>at the features that are targeted at 2.0 release and create the branch
>~EOW balancing  the waiting for large commits and avoiding too much delay.
>
>On 15/11/16, 10:32, "Sergey Shelukhin"  wrote:
>
>>Including the user list.
>>
>>On 15/11/13, 17:54, "Lefty Leverenz"  wrote:
>>
>>>The Hive bylaws require this to be submitted on the user@hive mailing
>>>list
>>>(even though users don't get to vote).  See Release Plan in Actions
>>>
>>>.
>>>
>>>-- Lefty
>>>
>>>On Fri, Nov 13, 2015 at 7:33 PM, Thejas Nair 
>>>wrote:
>>>
 +1

 On Fri, Nov 13, 2015 at 2:26 PM, Vaibhav Gumashta
  wrote:
 > +1
 >
 > Thanks,
 > --Vaibhav
 >
 >
 >
 >
 >
 > On Fri, Nov 13, 2015 at 2:24 PM -0800, "Tristram de Lyones" <
 delyo...@gmail.com> wrote:
 >
 > +1
 >
 > On Fri, Nov 13, 2015 at 1:38 PM, Sergey Shelukhin <
 ser...@hortonworks.com>
 > wrote:
 >
 >> Hi.
 >> With no strong objections on DISCUSS thread, some issues raised and
 >> addressed, and a reminder from Carl about the bylaws for the release
 >> process, I propose we release the first version of Hive 2 (2.0), and
 >> nominate myself as release manager.
 >> The goal is to have the first release of Hive with aggressive set of new
 >> features, some of which are ready to use and some are at experimental
 >> stage and will be developed in future Hive 2 releases, in line with the
 >> Hive-1-Hive-2 split discussion.
 >> If the vote passes, the timeline to create a branch should be around the
 >> end of next week (to minimize merging in the wake of the release), and the
 >> timeline to release would be around the end of November, depending on the
 >> issues found during the RC cutting process, as usual.
 >>
 >> Please vote:
 >> +1 proceed with the release plan
 >> +-0 don’t care
 >> -1 don’t proceed with the release plan, for such and such reasons
 >>
 >> The vote will run for 3 days.
 >>
 >>

>>
>



Re: Review Request 40467: HIVE-12075 analyze for file metadata

2015-11-19 Thread Alan Gates

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40467/#review107267
---



metastore/if/hive_metastore.thrift (line 781)


You allow it to request caching for one partition or all.  Are there cases 
where you'd want to cache some but not all partitions?  Should partName be a 
list instead?



metastore/src/java/org/apache/hadoop/hive/metastore/FileMetadataHandler.java 
(line 45)


I agree with this not depending on HBaseReadWrite.  Conceptually, 
ObjectStore could choose to implement file metadata caching.  There's nothing 
HBase specific about it.  And HBaseReadWrite was not intended to be used 
outside of the metastore/hbase package.



metastore/src/java/org/apache/hadoop/hive/metastore/FileMetadataManager.java 
(line 111)


Is this going to work with ACID?  There's an extra level of directories 
there for base and delta.



metastore/src/java/org/apache/hadoop/hive/metastore/PartitionExpressionProxy.java
 (line 73)


Why did you make these methods ORC specific?  That doesn't seem appropriate 
at this level.  What's to keep Parquet or another format from supporting file 
metadata?



metastore/src/java/org/apache/hadoop/hive/metastore/PartitionExpressionProxy.java
 (line 81)


PartitionExpressionForMetastore implements PartitionExpressionProxy.  But I 
don't see any changes for that class in this patch.  Did it just fall out when 
you removed the generated code?



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseReadWrite.java 
(line 2149)


Nitpick:  data is already plural, so "metadatas" is weird.  I know we never 
talk about metadatum, but still...



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseReadWrite.java 
(line 2150)


I don't understand what's in the addedCols and addedVals arrays.


- Alan Gates


On Nov. 19, 2015, 2:37 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40467/
> ---
> 
> (Updated Nov. 19, 2015, 2:37 a.m.)
> 
> 
> Review request for hive, Alan Gates and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2185f85 
>   itests/src/test/resources/testconfiguration.properties a33e720 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
>  1d0fdf0 
>   metastore/if/hive_metastore.thrift bb754f1 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/FileMetadataHandler.java 
> 7c3525a 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/FileMetadataManager.java 
> PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> a835f6a 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
> c5e7a5f 
>   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
> aa96f77 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 02cbd76 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> 803c6e7 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/PartitionExpressionProxy.java
>  ed59829 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 5b36b03 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/filemeta/OrcFileMetadataHandler.java
>  14189da 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseReadWrite.java 
> 2fb3e8f 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 98e6c75 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  9a1d159 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  8dde0af 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/MockPartitionExpressionForMetastore.java
>  d72bf76 
>   metastore/src/test/org/apache/hadoop/hive/metastore/TestObjectStore.java 
> 9089d1c 
>   metastore/src/test/org/apache/hadoop/hive/metastore/hbase/MockUtils.java 
> 983129a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 9ab3e98 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 488d923 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java
>  f9978b4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/Ana

[jira] [Created] (HIVE-12474) ORDER BY should handle column refs in parantheses

2015-11-19 Thread Aaron Tokhy (JIRA)
Aaron Tokhy created HIVE-12474:
--

 Summary: ORDER BY should handle column refs in parantheses
 Key: HIVE-12474
 URL: https://issues.apache.org/jira/browse/HIVE-12474
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 1.2.1, 1.0.0
Reporter: Aaron Tokhy
Assignee: Pengcheng Xiong
Priority: Minor


CREATE TABLE test(a INT, b INT, c INT)
COMMENT 'This is a test table';

hive>
select lead(c) over (order by (a,b)) from test limit 10;
FAILED: ParseException line 1:31 missing ) at ',' near ')'
line 1:34 missing EOF at ')' near ')'

hive>
select lead(c) over (order by a,b) from test limit 10;

- Works as expected.

It appears that 'cluster by'/'sort by'/'distribute by'/'partition by' allow 
this:
https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g#L129

For example, this syntax is still valid:
select lead(c) over (sort by (a,b)) from test limit 10;





[jira] [Created] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-11-19 Thread Gopal V (JIRA)
Gopal V created HIVE-12473:
--

 Summary: DPP: UDFs on the partition column side does not evaluate 
correctly
 Key: HIVE-12473
 URL: https://issues.apache.org/jira/browse/HIVE-12473
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.1, 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V


Related to HIVE-12462

{code}
$hdt$_0:$hdt$_1:a
  TableScan (TS_2)
alias: a
filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) IN 
(RS[6])) (type: boolean)
{code}

This ends up being evaluated as {{year(cast(dt as int))}} because the pruner 
only checks the final type of the expression, not the column type.

{code}
ObjectInspector oi =

PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
.getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));

Converter converter =
ObjectInspectorConverters.getConverter(
PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
{code}
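
A small Java sketch of why the conversion order matters. It is not Hive code: the class and method names are hypothetical, the partition value `2015-11-19` is made up for illustration, and it assumes that a failed string-to-int cast yields null (Hive's real cast rules are more nuanced).

```java
import java.time.LocalDate;

public class PruningTypeSketch {
    /**
     * Mimics the reported behaviour: the raw partition string is converted
     * straight to the expression's final type (int) before year() is applied.
     */
    static Integer yearViaFinalType(String rawPartitionValue) {
        try {
            // cast(dt as int); assumption: a failed cast behaves like null
            Integer.parseInt(rawPartitionValue);
        } catch (NumberFormatException e) {
            return null;
        }
        return null; // year() of a plain int is not meaningful in this sketch
    }

    /** Converts to the column type (date) first, then evaluates year(). */
    static Integer yearViaColumnType(String rawPartitionValue) {
        LocalDate dt = LocalDate.parse(rawPartitionValue);
        return dt.getYear();
    }

    public static void main(String[] args) {
        String partitionValue = "2015-11-19"; // hypothetical value of dt
        System.out.println("via final type:  " + yearViaFinalType(partitionValue));
        System.out.println("via column type: " + yearViaColumnType(partitionValue));
    }
}
```

The point of the sketch is only that the converter should target the partition column's type and leave the final type to the UDF evaluation.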





[jira] [Created] (HIVE-12472) Add test case for HIVE-10592

2015-11-19 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-12472:


 Summary: Add test case for HIVE-10592
 Key: HIVE-12472
 URL: https://issues.apache.org/jira/browse/HIVE-12472
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HIVE-10592 has a fix for the following NPE issue (the table should have all 
values null in its timestamp and date columns):
{code:title=query}
set hive.optimize.index.filter=true;
select count(*) from orctable where timestamp_col is null;
select count(*) from orctable where date_col is null;
{code}
{code:title=exception}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.io.orc.ColumnStatisticsImpl$TimestampStatisticsImpl.getMinimum(ColumnStatisticsImpl.java:845)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.getMin(RecordReaderImpl.java:308)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:332)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:710)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:751)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:777)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1235)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1117)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
... 26 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
killedTasks:1, Vertex vertex_1446768202865_0008_5_00 [Map 1] killed/failed due 
to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
killedVertices:0
{code}
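
For a test case it may help to keep in mind what the failing path looks like. The sketch below is a hedged Java illustration of null-safe min/max handling when a row group contains only nulls. It is not the ORC `RecordReaderImpl` code or the HIVE-10592 fix; the class and method names are hypothetical.

```java
public class NullSafeStatsSketch {
    /**
     * Decides whether a row group could contain a row where col = literal.
     * min and max are null when every value in the row group was null.
     */
    static boolean mayMatchEquals(Long min, Long max, long literal) {
        if (min == null || max == null) {
            return false; // no non-null values, so equality can never hold
        }
        return min <= literal && literal <= max;
    }

    /** Decides whether a row group could contain a row where col IS NULL. */
    static boolean mayMatchIsNull(boolean hasNull) {
        return hasNull;
    }

    public static void main(String[] args) {
        // All-null row group: unboxing min or max without the guard would throw an NPE.
        System.out.println(mayMatchEquals(null, null, 5L));
        System.out.println(mayMatchIsNull(true));
        // Ordinary row group with values in [1, 10].
        System.out.println(mayMatchEquals(1L, 10L, 5L));
    }
}
```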





Re: Review Request 40359: HIVE-11110 Cost Based Optimizer improvements

2015-11-19 Thread John Pullokkaran


> On Nov. 17, 2015, 12:08 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/rand_partitionpruner3.q.out, lines 
> > 280-282
> > 
> >
> > Incorrect result.

The test case checks whether the partition pruning logic takes non-deterministic 
functions out of expressions.
If you disable both CBO and PPD you will get a different result than with 
CBO=false, PPD=true.

This is because random(1)<0.1 is non-deterministic.
With this patch, CBO first performs partition pruning and then applies column 
pruning.
Column pruning introduces a select on top of the TS, below the filter. Hence the 
difference in result.

This is not a real issue
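
A rough Java sketch of the effect, assuming the function behaves like a seeded generator that hands out one draw per evaluated row (the class and method names are hypothetical, and `java.util.Random` is only a stand-in for Hive's UDF). The draw a given row receives depends on how many rows were evaluated before it, so moving the filter relative to pruning changes which rows pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class SeededRandFilterSketch {
    /** Keeps rows whose seeded random draw is below the threshold, one draw per row. */
    static List<String> filterByRand(List<String> rows, long seed, double threshold) {
        Random rand = new Random(seed);
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (rand.nextDouble() < threshold) {
                kept.add(row);
            }
        }
        return kept;
    }

    /** Returns the draw received by the row at the given zero-based position. */
    static double drawForPosition(long seed, int position) {
        Random rand = new Random(seed);
        double draw = 0.0;
        for (int i = 0; i <= position; i++) {
            draw = rand.nextDouble();
        }
        return draw;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("p1", "p2", "p3", "p4");
        // Without pruning, p3 is the third row evaluated.
        System.out.println("p3 draw, no pruning:    " + drawForPosition(1L, 2));
        // If p1 and p2 are pruned first, p3 becomes the first row evaluated.
        System.out.println("p3 draw, after pruning: " + drawForPosition(1L, 0));
        System.out.println(filterByRand(all, 1L, 0.5));
        System.out.println(filterByRand(all.subList(2, 4), 1L, 0.5));
    }
}
```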


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40359/#review106759
---


On Nov. 16, 2015, 6:54 p.m., John Pullokkaran wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40359/
> ---
> 
> (Updated Nov. 16, 2015, 6:54 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Hari Sankar Sivarama Subramaniyan, 
> and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-11110
> https://issues.apache.org/jira/browse/HIVE-11110
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-11110 Introduces not null filters, improves filter selectivity 
> estimation, Streamlines pre-join order optimizations
> 
> 
> Diffs
> -
> 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out d044c7e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
> e1b60b0 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
> cce3588 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/TraitsUtil.java 
> be28828 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveProject.java
>  4b7887a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePreFilteringRule.java
>  82d9600 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/FilterSelectivityEstimator.java
>  b52779c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/SqlFunctionConverter.java
>  a17fb94 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
>  a8ff158 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java de67b54 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java e291a48 
>   ql/src/test/queries/clientpositive/special_character_in_tabnames_1.q 
> 7867ae1 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/sortmerge_mapjoin_mismatch_1.q.out 
> b2a7d89 
>   ql/src/test/results/clientpositive/allcolref_in_udf.q.out 216b037 
>   ql/src/test/results/clientpositive/ambiguous_col.q.out 7f04e89 
>   ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out 
> 20ccda5 
>   ql/src/test/results/clientpositive/annotate_stats_join.q.out ee05e6e 
>   ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out aa380b2 
>   ql/src/test/results/clientpositive/auto_join1.q.out 48ad641 
>   ql/src/test/results/clientpositive/auto_join10.q.out fa6f62d 
>   ql/src/test/results/clientpositive/auto_join12.q.out 7d8db0a 
>   ql/src/test/results/clientpositive/auto_join13.q.out 952dbf8 
>   ql/src/test/results/clientpositive/auto_join15.q.out 8e4b24c 
>   ql/src/test/results/clientpositive/auto_join16.q.out 1bad0f9 
>   ql/src/test/results/clientpositive/auto_join17.q.out e85cae8 
>   ql/src/test/results/clientpositive/auto_join19.q.out 8a57cb0 
>   ql/src/test/results/clientpositive/auto_join2.q.out abfc611 
>   ql/src/test/results/clientpositive/auto_join22.q.out bdee886 
>   ql/src/test/results/clientpositive/auto_join24.q.out 5b57303 
>   ql/src/test/results/clientpositive/auto_join26.q.out 94ab76f 
>   ql/src/test/results/clientpositive/auto_join3.q.out d015449 
>   ql/src/test/results/clientpositive/auto_join30.q.out 5437b7f 
>   ql/src/test/results/clientpositive/auto_join33.q.out 0dcd91d 
>   ql/src/test/results/clientpositive/auto_join4.q.out dbbee56 
>   ql/src/test/results/clientpositive/auto_join5.q.out 3209d07 
>   ql/src/test/results/clientpositive/auto_join8.q.out 2ca26aa 
>   ql/src/test/results/clientpositive/auto_join9.q.out 13dd5de 
>   ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 
> f42b45e 
>   ql/src/test/results/clientpositive/auto_join_stats.q.out d75d6c4 
>   ql/src/test/results/clientpositive/auto_join_stats2.q.out a0aefa3 
>   ql/src/test/results/clientpositive/auto_join_without_localtask.q.out 
> 3d0067b 
>   ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out

[jira] [Created] (HIVE-12471) Secure HS2 web UI with SSL and kerberos

2015-11-19 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HIVE-12471:
--

 Summary: Secure HS2 web UI with SSL and kerberos
 Key: HIVE-12471
 URL: https://issues.apache.org/jira/browse/HIVE-12471
 Project: Hive
  Issue Type: Sub-task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang








Review Request 40500: HIVE-12338 Add webui to HiveServer2

2015-11-19 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40500/
---

Review request for hive, Szehon Ho and Xuefu Zhang.


Bugs: HIVE-12338
https://issues.apache.org/jira/browse/HIVE-12338


Repository: hive-git


Description
---

Added web UI to HS2. The UI is similar to those for other Hadoop components.
The default web UI port is set to 10002, which is configurable. It can be 
disabled. Currently it shows active sessions and queries. It can also access 
locals, metrics, and configuration.
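
As a rough illustration of the port behaviour described above, here is a self-contained Java sketch that resolves the port from a `java.util.Properties` object. It does not use HiveConf. The property name `hive.server2.webui.port` and the rule that a port of 0 or below disables the UI are assumptions not stated in this message, and the class and method names are hypothetical.

```java
import java.util.Properties;

public class WebUiPortSketch {
    // Assumed property name; the description above only gives the default value.
    static final String PORT_KEY = "hive.server2.webui.port";
    static final int DEFAULT_PORT = 10002;

    /** Reads the configured port, falling back to the default of 10002. */
    static int resolvePort(Properties conf) {
        String raw = conf.getProperty(PORT_KEY, String.valueOf(DEFAULT_PORT));
        return Integer.parseInt(raw.trim());
    }

    /** Assumption: a port of 0 or below means the web UI is turned off. */
    static boolean isEnabled(Properties conf) {
        return resolvePort(conf) > 0;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(resolvePort(conf) + " enabled=" + isEnabled(conf));
        conf.setProperty(PORT_KEY, "0");
        System.out.println(resolvePort(conf) + " enabled=" + isEnabled(conf));
    }
}
```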


Diffs
-

  common/pom.xml cd14581 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2185f85 
  common/src/java/org/apache/hive/http/AdminAuthorizedServlet.java PRE-CREATION 
  common/src/java/org/apache/hive/http/ConfServlet.java PRE-CREATION 
  common/src/java/org/apache/hive/http/HttpServer.java PRE-CREATION 
  common/src/java/org/apache/hive/http/JMXJsonServlet.java PRE-CREATION 
  pom.xml c6df4a5 
  service/pom.xml afa52cf 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java d13415e 
  service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
b0bd351 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
8b42265 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
1ab5652 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
2d784f0 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
d11cf3d 
  service/src/java/org/apache/hive/service/server/HiveServer2.java b30b6a2 
  service/src/resources/hive-webapps/hiveserver2/hiveserver2.jsp PRE-CREATION 
  service/src/resources/hive-webapps/hiveserver2/index.html PRE-CREATION 
  service/src/resources/hive-webapps/static/css/bootstrap-theme.min.css 
PRE-CREATION 
  service/src/resources/hive-webapps/static/css/bootstrap.min.css PRE-CREATION 
  service/src/resources/hive-webapps/static/css/hive.css PRE-CREATION 
  
service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.eot
 PRE-CREATION 
  
service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.svg
 PRE-CREATION 
  
service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.ttf
 PRE-CREATION 
  
service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.woff
 PRE-CREATION 
  service/src/resources/hive-webapps/static/hive_logo.jpeg PRE-CREATION 

Diff: https://reviews.apache.org/r/40500/diff/


Testing
---


Thanks,

Jimmy Xiang



Re: Review Request 40055: HIVE-12017

2015-11-19 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40055/
---

(Updated Nov. 19, 2015, 5:51 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Bugs: HIVE-12017
https://issues.apache.org/jira/browse/HIVE-12017


Repository: hive-git


Description
---

HIVE-12017


Diffs (updated)
-

  hbase-handler/src/test/results/positive/hbase_queries.q.out 
d044c7ed3874acaf521d83bdddfa02276bf71cb3 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java 
b4e7d47134357bc1e25af8642373ffb9babc015b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateProjectMergeRule.java
 53f04ee72d8a614a602ada688f89d1febd467689 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/SqlFunctionConverter.java
 a17fb9498557fc95f273240c1484d69f514fcad0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 
de67b54a2c6cfd9bc4413ebf7f715e54c61b966f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
5323a7df342fa7b7f7b7f457d3d99d3407ed51a6 
  ql/src/test/queries/clientpositive/mergejoin.q 
7550e09ba33182415b126fb8e0002028d4c9a8ee 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 
623c2e85a84919b41735913c3da32514f5d3ff22 
  ql/src/test/results/clientnegative/join_nonexistent_part.q.out 
391dd0592611d7af8484c52efde3a50fb7dfa44d 
  ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out 
aa380b20efee11a0a3a4c7acaeb9482444c1d3ce 
  ql/src/test/results/clientpositive/archive_excludeHadoop20.q.out 
c2b98727d21f4990ae7496a0a8fa9ac16598f4c0 
  ql/src/test/results/clientpositive/archive_multi.q.out 
0ad29d122153bd4adf4d19064188b0c4f94e05ab 
  ql/src/test/results/clientpositive/auto_join1.q.out 
48ad641788a6adfad5f7e4fcdfef3d67eac70a4e 
  ql/src/test/results/clientpositive/auto_join10.q.out 
fa6f62d18abbf517c4e49ac3fa9da190c23a119f 
  ql/src/test/results/clientpositive/auto_join11.q.out 
851920b9dce7d9fb8d105ef81404f3f67166ad15 
  ql/src/test/results/clientpositive/auto_join14.q.out 
47e1724ab18ac322a83f687fab37ea44c4fdf78a 
  ql/src/test/results/clientpositive/auto_join24.q.out 
5b573033d317e3e7dbf70f9b6ef253b35ac7c140 
  ql/src/test/results/clientpositive/auto_join26.q.out 
94ab76f750a2ce51a645012dcd5beb43b560445a 
  ql/src/test/results/clientpositive/auto_join32.q.out 
161ab6b377a644e62a94d69aa9d3bba02b8045e6 
  ql/src/test/results/clientpositive/auto_join_filters.q.out 
a6720d908f4c5a354cb4f3234f8c288249d35d2d 
  ql/src/test/results/clientpositive/auto_join_nulls.q.out 
4416f3e921a3590223658eb6b0e15c317733a7e2 
  ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 
f42b45e635ca5e271d48ea6bc48c8b0e45ac67d1 
  ql/src/test/results/clientpositive/auto_join_stats.q.out 
d75d6c42eba366905afb4e6e171402c50581ba05 
  ql/src/test/results/clientpositive/auto_join_stats2.q.out 
a0aefa3de8aa07ab7f4a634fcc22b29ba621a6c5 
  ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out 
1dc9cd07cddb5bce3b2369c1776b690bb239e050 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 
f1aadef724d6f10ca4a710a3d11382e2f01ca1e5 
  ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 
fb1e6568de332e930e7836e09aef142f7f66eb17 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 
5dad0fb366d4e1fc21a9a7ba034d60c942e8664e 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 
b1ba1483e1ab83c3f7ea71fddf5247bfc5dbde0b 
  ql/src/test/results/clientpositive/auto_sortmerge_join_14.q.out 
33c56fdc6d6f01377dd78e77b99c229ff437d802 
  ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out 
460e5b1b0f60c213f3a14172482a9a8f8e85454d 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 
a7a5faa8f8cc29ff53328e6db598cc0acf4cb68e 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 
dfb1a16529bb09de8eb154976386ae39b76420c8 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 
013bc07b6ef804d57632ab2840628e7903f3cb47 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 
d751e7052a14f2ca308699c3a52beb30f989d0a0 
  ql/src/test/results/clientpositive/auto_sortmerge_join_6.q.out 
853f6413ad519840c535e32ec680b6eedd13f457 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 
e2d797ba8d5083c2400df86aacaa94e22f71d809 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 
e3bb51d6ef8d69112acb75688c5eabefc628cab7 
  ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out 
bbfa75608deef0a72df8104b6836105b296f7d29 
  ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 
870ecdd37c3e4833dc1b182e98e6e4018f719fbe 
  ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 
33f5c46c2adf43a1e13f8af305da6c585998141a 
  ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 
067d12875a762d98126994f05090d4c4e4cd 
  ql/s

[jira] [Created] (HIVE-12470) Allow splits to provide custom consistent locations, instead of being tied to data locality

2015-11-19 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12470:
-

 Summary: Allow splits to provide custom consistent locations, 
instead of being tied to data locality
 Key: HIVE-12470
 URL: https://issues.apache.org/jira/browse/HIVE-12470
 Project: Hive
  Issue Type: Improvement
  Components: llap
Reporter: Siddharth Seth
Assignee: Siddharth Seth


LLAP instances may not run on the same nodes as HDFS, or may run on a subset of 
the cluster.
Using split locations based on FileSystem locality is not very useful in such 
cases - since that guarantees not getting any locality.
Allow a split to map to a specific location - so that there's a chance of 
getting cache locality across different queries.
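One way to picture a "consistent location" is a deterministic mapping from a split to a host. The sketch below is only an illustration, not the approach taken in Hive: the class name, the host names, and the use of a CRC32 hash with a simple modulo are all assumptions made for the example. A real implementation might prefer a consistent-hash ring so that fewer splits move when the host list changes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

public class ConsistentSplitLocation {
    // Hypothetical helper: picks one host for a split from a fixed, ordered host list.
    // The same (path, start offset) maps to the same host as long as the list is unchanged.
    static String locationFor(String path, long startOffset, List<String> hosts) {
        CRC32 crc = new CRC32();
        crc.update((path + ":" + startOffset).getBytes(StandardCharsets.UTF_8));
        // CRC32.getValue() is non-negative, so the modulo is a valid list index.
        int index = (int) (crc.getValue() % hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        // Host names are made up for the example.
        List<String> hosts = Arrays.asList("llap-node-1", "llap-node-2", "llap-node-3");
        String first = locationFor("/warehouse/t/part-00000", 0L, hosts);
        String second = locationFor("/warehouse/t/part-00000", 0L, hosts);
        // Two lookups for the same split agree, which is what gives cache locality
        // a chance across different queries.
        System.out.println(first + " " + first.equals(second));
    }
}
```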



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2 to address vulnerability

2015-11-19 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-12469:
-

 Summary: Bump Commons-Collections dependency from 3.2.1 to 3.2.2 
to address vulnerability
 Key: HIVE-12469
 URL: https://issues.apache.org/jira/browse/HIVE-12469
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Blocker








[jira] [Created] (HIVE-12468) Support multiple subquery expressions in WHERE clause

2015-11-19 Thread Jeremy Beard (JIRA)
Jeremy Beard created HIVE-12468:
---

 Summary: Support multiple subquery expressions in WHERE clause
 Key: HIVE-12468
 URL: https://issues.apache.org/jira/browse/HIVE-12468
 Project: Hive
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: Jeremy Beard
Assignee: Alan Gates


HIVE-784 introduced uncorrelated subqueries in the WHERE clause. The design 
document on that JIRA includes restriction 8.m "We allow only 1 SubQuery 
expression per Query". This restriction should be removed.





[jira] [Created] (HIVE-12467) Add number of dynamic partitions to error message

2015-11-19 Thread Lars Francke (JIRA)
Lars Francke created HIVE-12467:
---

 Summary: Add number of dynamic partitions to error message
 Key: HIVE-12467
 URL: https://issues.apache.org/jira/browse/HIVE-12467
 Project: Hive
  Issue Type: Improvement
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor


Currently when using dynamic partition insert we get an error message saying 
that the client tried to create too many dynamic partitions ("Maximum was set 
to"). I'll extend the error message to specify the number of dynamic partitions 
which can be helpful for debugging.
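As a rough illustration of the proposed change, the message could carry both the configured limit and the observed count. This is a minimal sketch: the class, the method, and the message wording are hypothetical and are not taken from the Hive code base.

```java
public class DynamicPartitionError {
    // Hypothetical helper: reports the observed partition count next to the limit.
    static String tooManyPartitions(int maxAllowed, int created) {
        return "Too many dynamic partitions: tried to create " + created
            + ", maximum was set to " + maxAllowed + ".";
    }

    public static void main(String[] args) {
        // Example values only.
        System.out.println(tooManyPartitions(1000, 1342));
    }
}
```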





Build failed in Jenkins: HIVE-TRUNK-JAVA8 #135

2015-11-19 Thread hiveqa
See 


Changes:

[rui.li] HIVE-11180: Enable native vectorized map join for spark [Spark Branch] 
(Rui reviewed by Xuefu)

[Xuefu Zhang] HIVE-11466: HIVE-10166 generates more data on hive.log causing 
Jenkins to fill all the disk (Reviewed by Prasanth)

[Chao Sun] HIVE-9139: Clean up GenSparkProcContext.clonedReduceSinks and 
related code [Spark Branch] (Chao Sun, reviewed by Xuefu Zhang)

[rui.li] HIVE-12091: Merge file doesn't work for ORC table when running on 
Spark. [Spark Branch] (Rui reviewed by Xuefu)

[rui.li] HIVE-11473: Upgrade Spark dependency to 1.5 [Spark Branch] (Rui 
reviewed by Xuefu)

[Xuefu Zhang] HIVE-12283: Fix test failures after HIVE-11844 [Spark Branch] 
(Rui via Xuefu)

[Xuefu Zhang] HIVE-12284: Merge master to Spark branch 10/28/2015 [Spark 
Branch] update some test result (Reviewed by Chao)

[rui.li] HIVE-12229: Custom script in query cannot be executed in yarn-cluster 
mode [Spark Branch] (Rui reviewed by Xuefu)

[Sergio Pena] HIVE-12330: Fix precommit Spark test part2 (Sergio Pena, reviewd 
by Szehon Ho)

[Aihua Xu] HIVE-12196 NPE when converting bad timestamp value (Aihua Xu, 
reviewed by Chaoyu Tang)

[Yongzhi Chen] HIVE-12378: Exception on HBaseSerDe.serialize binary field 
(Yongzhi Chen, reviewed by Jimmy Xiang)

[Aihua Xu] HIVE-11488: Add sessionId and queryId info to HS2 log (Aihua Xu, 
reviewed by Szehon Ho)

[ekoifman] HIVE-11948 Investigate TxnHandler and CompactionTxnHandler to see 
where we improve concurrency(Eugene Koifman, reviewed by Alan Gates)

[Szehon Ho] HIVE-12271 : Add metrics around HS2 query execution and job 
submission for Hive (Szehon, reviewed by Jimmy Xiang)

[jpullokk] Bug: HIVE-12384 - Union Operator may produce incorrect result on TEZ 
(Laljo John Pullokkaran reviewed by Sergey Shelukhin, Ashutosh Chauhan)

[sershe] HIVE-11777 : implement an option to have single ETL strategy for 
multiple directories (Sergey Shelukhin, reviewed by Prasanth Jayachandran)

[jpullokk] Bug: HIVE-12384.1 - Union Operator may produce incorrect result on 
TEZ; missed test config update (Laljo John Pullokkaran reviewed by Sergey 
Shelukhin, Ashutosh Chauhan)

[omalley] HIVE-12054. Create vectorized ORC write method. (omalley reviewed by 
prasanthj)

[mmccline] HIVE-11981: ORC Schema Evolution Issues (Vectorized, ACID, and 
Non-Vectorized) (Matt McCline, reviewed by Prasanth J)

[Szehon Ho] HIVE-12388 : GetTables cannot get external tables when TABLE type 
argument is given (Navis and Szehon, via Aihua)

[sseth] HIVE-12430. Remove remaining reference to the hadoop-2 profile. 
(Siddharth Seth, reviewed by Sergey Shelukhin)

[Xuefu Zhang] Revert "HIVE-12330: Fix precommit Spark test part2 (Sergio Pena, 
reviewd by Szehon Ho)"

[daijy] HIVE-11422: Join a ACID table with non-ACID table fail with MR

--
[...truncated 310 lines...]
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestPrepPhase.testExecute.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestPhase.testRsyncFromLocalToRemoteInstancesWithFailureOne.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestReportingPhase.testExecute.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/ExtendedAssert.java
A    ptest2/src/test/java/org/apache/hive/ptest/execution/AbstractTestPhase.java
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.testBatch.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestPhase.testExecHostsWithFailure.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestHostExecutor.testParallelFailsOnExec.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.testPassingQFileTest.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestCleanupPhase.testExecute.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestHostExecutor.testIsolatedFailsOnRsyncUnknown.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestLocalCommand.java
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestHostExecutor.testBasic.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.testPrepNone.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.testFailingQFile.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/execution/MockLocalCommandFactory.java
A    ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.testAlternativeTestJVM.approved.txt
A    ptest2/src/test/java/org/apache/hive/ptest/api
A    ptest2/src/test/java/org/apache/hive/ptest/api/server
A    ptest2/src/test/java/org/apache/hive/ptest/api/server/TestTestExecutor.java
A    ptest2/src/test/java/org/apache/hive/ptest/api/server/TestTestLogger.java
A    ptest2/src/test/resources
A    ptest2/src/test/resources/test

[jira] [Created] (HIVE-12466) SparkCounter not initialized error

2015-11-19 Thread Rui Li (JIRA)
Rui Li created HIVE-12466:
-

 Summary: SparkCounter not initialized error
 Key: HIVE-12466
 URL: https://issues.apache.org/jira/browse/HIVE-12466
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Rui Li
Assignee: Xuefu Zhang


During a query, the following error appears many times in the executor's log:
{noformat}
03:47:28.759 [Executor task launch worker-0] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] has 
not initialized before.
03:47:28.762 [Executor task launch worker-1] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] has 
not initialized before.
03:47:30.707 [Executor task launch worker-1] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
03:47:33.385 [Executor task launch worker-1] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
RECORDS_OUT_1_default.test_table] has not initialized before.
03:47:33.388 [Executor task launch worker-0] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
RECORDS_OUT_1_default.test_table] has not initialized before.
03:47:33.495 [Executor task launch worker-0] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
RECORDS_OUT_1_default.test_table] has not initialized before.
03:47:35.141 [Executor task launch worker-1] ERROR 
org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
RECORDS_OUT_1_default.test_table] has not initialized before.

...
{noformat}





[jira] [Created] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged

2015-11-19 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-12465:
--

 Summary: Hive might produce wrong results when (outer) joins are 
merged
 Key: HIVE-12465
 URL: https://issues.apache.org/jira/browse/HIVE-12465
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Consider the following query:

{noformat}
select * from
  (select * from tab where tab.key = 0)a
full outer join
  (select * from tab_part where tab_part.key = 98)b
join
  tab_part c
on a.key = b.key and b.key = c.key;
{noformat}

Hive should execute the full outer join operation (without ON clause) and then 
the join operation (ON a.key = b.key and b.key = c.key). Instead, it merges 
both joins, generating the following plan:

{noformat}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: tab
filterExpr: (key = 0) (type: boolean)
Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE 
Column stats: NONE
Filter Operator
  predicate: (key = 0) (type: boolean)
  Statistics: Num rows: 121 Data size: 11374 Basic stats: COMPLETE 
Column stats: NONE
  Select Operator
expressions: 0 (type: int), value (type: string), ds (type: 
string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 121 Data size: 11374 Basic stats: 
COMPLETE Column stats: NONE
Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
  Statistics: Num rows: 121 Data size: 11374 Basic stats: 
COMPLETE Column stats: NONE
  value expressions: _col1 (type: string), _col2 (type: string)
  TableScan
alias: tab_part
filterExpr: (key = 98) (type: boolean)
Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
Column stats: NONE
Filter Operator
  predicate: (key = 98) (type: boolean)
  Statistics: Num rows: 250 Data size: 23500 Basic stats: COMPLETE 
Column stats: NONE
  Select Operator
expressions: 98 (type: int), value (type: string), ds (type: 
string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 250 Data size: 23500 Basic stats: 
COMPLETE Column stats: NONE
Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
  Statistics: Num rows: 250 Data size: 23500 Basic stats: 
COMPLETE Column stats: NONE
  value expressions: _col1 (type: string), _col2 (type: string)
  TableScan
alias: c
Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
Column stats: NONE
Reduce Output Operator
  key expressions: key (type: int)
  sort order: +
  Map-reduce partition columns: key (type: int)
  Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
Column stats: NONE
  value expressions: value (type: string), ds (type: string)
  Reduce Operator Tree:
Join Operator
  condition map:
   Outer Join 0 to 1
   Inner Join 1 to 2
  keys:
0 _col0 (type: int)
1 _col0 (type: int)
2 key (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
_col7, _col8
  Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE 
Column stats: NONE
  File Output Operator
compressed: false
Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE 
Column stats: NONE
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
Fetch Operator
  limit: -1
  Processor Tree:
ListSink
{noformat}

That plan corresponds to the following query, which differs from the 
original one:
{noformat}
select * from
  (select * from tab where tab.key = 0)a
full outer join
  (select * from tab_part where tab_part.key = 98)b
on a.key = b.key
join
  tab_part c
on b.key = c.key;
{noformat}

It seems to be a problem in the recognition of join operations that can be 
merged into a single multijoin operator.
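The difference between the two readings can be checked on toy data. The sketch below is not Hive code: it models each table as a list of integer keys, treats a full outer join without an ON clause as matching every pair, and uses made-up keys for tab_part. Under those assumptions the query as written returns no rows, while the merged form returns one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinMergeSketch {
    // Full outer join on key lists. With crossJoin=true every pair matches (no ON clause);
    // otherwise pairs match on key equality. A null entry marks a missing side.
    static List<Integer[]> fullOuter(List<Integer> a, List<Integer> b, boolean crossJoin) {
        List<Integer[]> out = new ArrayList<>();
        boolean[] bMatched = new boolean[b.size()];
        for (Integer x : a) {
            boolean matched = false;
            for (int j = 0; j < b.size(); j++) {
                if (crossJoin || x.equals(b.get(j))) {
                    out.add(new Integer[] {x, b.get(j)});
                    matched = true;
                    bMatched[j] = true;
                }
            }
            if (!matched) {
                out.add(new Integer[] {x, null});
            }
        }
        for (int j = 0; j < b.size(); j++) {
            if (!bMatched[j]) {
                out.add(new Integer[] {null, b.get(j)});
            }
        }
        return out;
    }

    // Query as written: full outer join without ON, then inner join on a = b and b = c.
    static int rowsAsWritten(List<Integer> a, List<Integer> b, List<Integer> c) {
        int rows = 0;
        for (Integer[] ab : fullOuter(a, b, true)) {
            for (Integer k : c) {
                if (ab[0] != null && ab[0].equals(ab[1]) && ab[1].equals(k)) {
                    rows++;
                }
            }
        }
        return rows;
    }

    // Merged plan: full outer join on a = b, then inner join on b = c.
    static int rowsAsMerged(List<Integer> a, List<Integer> b, List<Integer> c) {
        int rows = 0;
        for (Integer[] ab : fullOuter(a, b, false)) {
            for (Integer k : c) {
                if (ab[1] != null && ab[1].equals(k)) {
                    rows++;
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(0);      // rows of tab with key = 0
        List<Integer> b = Arrays.asList(98);     // rows of tab_part with key = 98
        List<Integer> c = Arrays.asList(98, 5);  // hypothetical keys in tab_part
        System.out.println(rowsAsWritten(a, b, c)); // prints 0
        System.out.println(rowsAsMerged(a, b, c));  // prints 1
    }
}
```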





[jira] [Created] (HIVE-12464) Inconsistent behavior between MapReduce and Spark engine on bucketed mapjoin

2015-11-19 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12464:


 Summary: Inconsistent behavior between MapReduce and Spark engine 
on bucketed mapjoin
 Key: HIVE-12464
 URL: https://issues.apache.org/jira/browse/HIVE-12464
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Spark
Affects Versions: 1.2.1
Reporter: Nemon Lou


Steps to reproduce:
1. Prepare the tables and data:
{noformat}
create table if not exists lxw_test(imei string,sndaid string,data_time string)
CLUSTERED BY(imei) SORTED BY(imei) INTO 10 BUCKETS;
create table if not exists lxw_test1(imei string,sndaid string,data_time string)
CLUSTERED BY(imei) SORTED BY(imei) INTO 5 BUCKETS;
set hive.enforce.bucketing = true;
set hive.enforce.sorting = true;
insert overwrite table lxw_test
values(1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5),(6,6,6),(7,7,7),(8,8,8),(9,9,9),(10,10,10);
insert overwrite table lxw_test1
values 
(1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5),(6,6,6),(7,7,7),(8,8,8),(9,9,9),(10,10,10);
set hive.enforce.bucketing;
insert into table lxw_test1 select * from lxw_test;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
{noformat}
2. The following SQL succeeds:
{noformat}
set hive.execution.engine=mr;
select  count(1) 
from lxw_test1 a 
join lxw_test b 
on a.imei = b.imei ;
{noformat}
3. This one fails:
{noformat}
set hive.execution.engine=spark;
select  count(1) 
from lxw_test1 a 
join lxw_test b 
on a.imei = b.imei ;
{noformat}
On Spark, the query returns this error:
{noformat}
Error: Error while compiling statement: FAILED: SemanticException [Error 
10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of 
buckets for table lxw_test1 is 5, whereas the number of files is 10 
(state=42000,code=10141)
{noformat}
After setting hive.ignore.mapjoin.hint=false and using a mapjoin hint, the 
MapReduce engine returns the same error.
{noformat}
set hive.execution.engine=mr;
set hive.ignore.mapjoin.hint=false;
explain
select /*+ mapjoin(b) */ count(1) 
from lxw_test1 a 
join lxw_test b 
on a.imei = b.imei ;
{noformat}






[jira] [Created] (HIVE-12463) VectorMapJoinFastKeyStore has Array OOB errors

2015-11-19 Thread Gopal V (JIRA)
Gopal V created HIVE-12463:
--

 Summary: VectorMapJoinFastKeyStore has Array OOB errors
 Key: HIVE-12463
 URL: https://issues.apache.org/jira/browse/HIVE-12463
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 2.0.0
Reporter: Gopal V
Assignee: Gopal V


When combining keys of different sizes, hashtable probes occasionally fail 
with the following error:

{code}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 162046429
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastKeyStore.equalKey(VectorMapJoinFastKeyStore.java:150)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.findReadSlot(VectorMapJoinFastBytesHashTable.java:191)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.lookup(VectorMapJoinFastBytesHashMap.java:76)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerMultiKeyOperator.process(VectorMapJoinInnerMultiKeyOperator.java:300)
... 26 more
{code}

{code}
// Our reading is positioned to the key.
writeBuffers.getByteSegmentRefToCurrent(byteSegmentRef, keyLength, readPos);

byte[] currentBytes = byteSegmentRef.getBytes();
int currentStart = (int) byteSegmentRef.getOffset();

for (int i = 0; i < keyLength; i++) {
  if (currentBytes[currentStart + i] != keyBytes[keyStart + i]) {
// LOG.debug("VectorMapJoinFastKeyStore equalKey no match on bytes");
return false;
  }
}
{code}

This needs an identical fix, matching the existing buffer-boundary handling:

{code}
// Rare case of buffer boundary. Unfortunately we'd have to copy some bytes.
byte[] bytes = new byte[length];
int destOffset = 0;
while (destOffset < length) {
  ponderNextBufferToRead(readPos);
{code}
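To make the failure mode concrete, here is a minimal sketch of a key comparison that tolerates a key spanning two buffers. It does not use Hive's WriteBuffers class: the list of fixed-size byte arrays, the class name, and the method signature are assumptions for illustration. It also resolves each byte's buffer individually instead of copying bytes as the snippet above does, which is a different technique chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentedKeyCompare {
    // Compares keyLength bytes stored at absolute position readPos in a chain of
    // fixed-size buffers against keyBytes[keyStart..]. Each byte is located by its
    // own buffer index and offset, so a key that crosses a buffer boundary never
    // indexes past the end of a single array.
    static boolean equalKey(List<byte[]> buffers, int bufferSize, long readPos,
                            byte[] keyBytes, int keyStart, int keyLength) {
        for (int i = 0; i < keyLength; i++) {
            long pos = readPos + i;
            byte[] buffer = buffers.get((int) (pos / bufferSize));
            if (buffer[(int) (pos % bufferSize)] != keyBytes[keyStart + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<byte[]> buffers = new ArrayList<>();
        buffers.add(new byte[] {1, 2, 3, 4});
        buffers.add(new byte[] {5, 6, 7, 8});
        // The stored key starts at position 2 and spans both buffers.
        byte[] key = {3, 4, 5, 6};
        System.out.println(equalKey(buffers, 4, 2L, key, 0, 4)); // prints true
    }
}
```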


