Re: [VOTE] Apache Hive 3.0.0 Release Candidate 1

2018-05-18 Thread Jesus Camacho Rodriguez
+1

Downloaded, checked hashes, checked release notes, ran rat checks, built from 
sources.

Thanks,
-Jesús


On 5/18/18, 5:17 PM, "Gunther Hagleitner" wrote:

+1

- Verified signature & checksum
- Built from source and ran tests
- Verified libraries and binaries

Thanks,
Gunther.

From: Prasanth Jayachandran 
Sent: Friday, May 18, 2018 2:25 PM
To: dev@hive.apache.org
Subject: Re: [VOTE] Apache Hive 3.0.0 Release Candidate 1

+1
- Verified signature, checksum
- Ran rat check
- Built from src
- Ran a few unit tests

Thanks
Prasanth

> On May 18, 2018, at 1:46 PM, Vineet Garg wrote:
>
> Apache Hive 3.0.0 Release Candidate 1 is available here:
>
> http://people.apache.org/~vgarg/apache-hive-3.0.0-rc-1/
>
>
> Tag: https://github.com/apache/hive/tree/release-3.0.0-rc1
>
>
> My public key is available at https://pgp.mit.edu (Lookup using ‘vgarg’).
>
> Voting will conclude in 72 hours.
>
> Hive PMC Members: Please test and vote.
>
> Thanks.








Review Request 67229: HIVE-19614: GenericUDTFGetSplits does not honor ORDER BY

2018-05-18 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67229/
---

Review request for hive and Jason Dere.


Bugs: HIVE-19614
https://issues.apache.org/jira/browse/HIVE-19614


Repository: hive-git


Description
---

HIVE-19614: GenericUDTFGetSplits does not honor ORDER BY


Diffs
---

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
e74a18853ced722a8a5d7f5b8e80015e461e4a0b 


Diff: https://reviews.apache.org/r/67229/diff/1/


Testing
---


Thanks,

Prasanth_J



[jira] [Created] (HIVE-19616) Enable TestAutoPurge test

2018-05-18 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19616:
--

 Summary: Enable TestAutoPurge test
 Key: HIVE-19616
 URL: https://issues.apache.org/jira/browse/HIVE-19616
 Project: Hive
  Issue Type: Test
  Components: Test
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez


Disabled by HIVE-19589.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache Hive 3.0.0 Release Candidate 1

2018-05-18 Thread Gunther Hagleitner
+1

- Verified signature & checksum
- Built from source and ran tests
- Verified libraries and binaries

Thanks,
Gunther.

From: Prasanth Jayachandran 
Sent: Friday, May 18, 2018 2:25 PM
To: dev@hive.apache.org
Subject: Re: [VOTE] Apache Hive 3.0.0 Release Candidate 1

+1
- Verified signature, checksum
- Ran rat check
- Built from src
- Ran a few unit tests

Thanks
Prasanth

> On May 18, 2018, at 1:46 PM, Vineet Garg wrote:
>
> Apache Hive 3.0.0 Release Candidate 1 is available here:
>
> http://people.apache.org/~vgarg/apache-hive-3.0.0-rc-1/
>
>
> Tag: https://github.com/apache/hive/tree/release-3.0.0-rc1
>
>
> My public key is available at https://pgp.mit.edu (Lookup using ‘vgarg’).
>
> Voting will conclude in 72 hours.
>
> Hive PMC Members: Please test and vote.
>
> Thanks.





[jira] [Created] (HIVE-19615) Proper handling of is null and not is null predicate when pushed to Druid

2018-05-18 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-19615:
-

 Summary: Proper handling of is null and not is null predicate when 
pushed to Druid
 Key: HIVE-19615
 URL: https://issues.apache.org/jira/browse/HIVE-19615
 Project: Hive
  Issue Type: Bug
Reporter: slim bouguerra
Assignee: slim bouguerra
 Fix For: 3.0.0


Recent development in Druid introduced new null-handling semantics 
[here|https://github.com/b-slim/druid/commit/219e77aeac9b07dc20dd9ab2dd537f3f17498346].

Based on those changes, we need to honor push-down of expressions with IS 
NULL / IS NOT NULL predicates.
The proposed fix overrides the mapping of Calcite functions to Druid expressions to 
match the correct semantics.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19614) GenericUDTFGetSplits does not honor ORDER BY

2018-05-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19614:


 Summary: GenericUDTFGetSplits does not honor ORDER BY
 Key: HIVE-19614
 URL: https://issues.apache.org/jira/browse/HIVE-19614
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


GenericUDTFGetSplits handles ORDER BY by writing the results to a temp table. 
However, running SELECT * on that temp table may create more than one split, 
which loses the original ordering. 
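
To illustrate the problem, here is a deliberately simplified model in plain Java. It is not Hive code: the class and method names are hypothetical, and the "splits" are just in-memory lists. It assumes each split keeps its rows in order but that nothing fixes the order in which splits are consumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model (an assumption, not the Hive implementation).
final class SplitOrderSketch {

    /** Concatenates the splits in the order they happen to be consumed. */
    static List<Integer> readSplits(List<List<Integer>> splits) {
        List<Integer> rows = new ArrayList<>();
        for (List<Integer> split : splits) {
            rows.addAll(split);
        }
        return rows;
    }

    /** Returns true if the rows are in non-decreasing order. */
    static boolean isSorted(List<Integer> rows) {
        for (int i = 1; i < rows.size(); i++) {
            if (rows.get(i - 1) > rows.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The sorted result, stored as two splits of the temp table.
        List<Integer> first = Arrays.asList(1, 2, 3);
        List<Integer> second = Arrays.asList(4, 5, 6);

        // Consumed in write order: ordering survives.
        System.out.println(isSorted(readSplits(Arrays.asList(first, second)))); // true
        // Consumed in the opposite order: ordering is lost.
        System.out.println(isSorted(readSplits(Arrays.asList(second, first)))); // false
    }
}
```

With a single split there is only one possible read order, which is why the issue only shows up once the temp table produces more than one split.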



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 67222: HIVE-19613: GenericUDTFGetSplits should handle fetch task with temp table rewrite

2018-05-18 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67222/
---

Review request for hive and Jason Dere.


Bugs: HIVE-19613
https://issues.apache.org/jira/browse/HIVE-19613


Repository: hive-git


Description
---

HIVE-19613: GenericUDTFGetSplits should handle fetch task with temp table 
rewrite


Diffs
---

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
e74a18853ced722a8a5d7f5b8e80015e461e4a0b 


Diff: https://reviews.apache.org/r/67222/diff/1/


Testing
---


Thanks,

Prasanth_J



[jira] [Created] (HIVE-19613) GenericUDTFGetSplits should handle fetch task with temp table rewrite

2018-05-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19613:


 Summary: GenericUDTFGetSplits should handle fetch task with temp 
table rewrite
 Key: HIVE-19613
 URL: https://issues.apache.org/jira/browse/HIVE-19613
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
 Environment: GenericUDTFGetSplits fails for fetch-task-only queries. 
Fetch-task-only queries can be handled the same way as queries with more than 
one task, by using temp tables. 
{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Was expecting a single TezTask.
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:262)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:201)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:984)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:930)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:917)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:984)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:930)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:492)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:484)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:145)
    ... 16 more
{code}
Reporter: Eric Wohlstadter
Assignee: Prasanth Jayachandran






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache Hive 3.0.0 Release Candidate 1

2018-05-18 Thread Prasanth Jayachandran
+1
- Verified signature, checksum
- Ran rat check
- Built from src
- Ran a few unit tests

Thanks
Prasanth

> On May 18, 2018, at 1:46 PM, Vineet Garg wrote:
> 
> Apache Hive 3.0.0 Release Candidate 1 is available here:
> 
> http://people.apache.org/~vgarg/apache-hive-3.0.0-rc-1/
> 
> 
> Tag: https://github.com/apache/hive/tree/release-3.0.0-rc1
> 
> 
> My public key is available at https://pgp.mit.edu (Lookup using ‘vgarg’).
> 
> Voting will conclude in 72 hours.
> 
> Hive PMC Members: Please test and vote.
> 
> Thanks.



[jira] [Created] (HIVE-19612) Add option to mask lineage in q files

2018-05-18 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19612:
--

 Summary: Add option to mask lineage in q files
 Key: HIVE-19612
 URL: https://issues.apache.org/jira/browse/HIVE-19612
 Project: Hive
  Issue Type: Test
  Components: Test, Testing Infrastructure
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[VOTE] Apache Hive 3.0.0 Release Candidate 1

2018-05-18 Thread Vineet Garg
Apache Hive 3.0.0 Release Candidate 1 is available here:

http://people.apache.org/~vgarg/apache-hive-3.0.0-rc-1/


Tag: https://github.com/apache/hive/tree/release-3.0.0-rc1


My public key is available at https://pgp.mit.edu (Lookup using ‘vgarg’).

Voting will conclude in 72 hours.

Hive PMC Members: Please test and vote.

Thanks.
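
For voters who want to script the checksum part of the verification, a minimal Java sketch follows. It is only an illustration: the class name is hypothetical, and SHA-256 is an assumption, since this thread does not say which digest files were published with the RC. It does not cover signature (PGP) verification.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the digest algorithm (SHA-256) is an assumption.
final class ChecksumSketch {

    /** Streams the input through SHA-256 and returns the lowercase hex digest. */
    static String sha256Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the downloaded artifact, args[1]: published hex digest
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String actual = sha256Hex(in);
            System.out.println(actual.equalsIgnoreCase(args[1]) ? "checksum OK" : "checksum MISMATCH");
        }
    }
}
```

The digest is computed by streaming, so the full tarball never has to fit in memory.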


[jira] [Created] (HIVE-19611) reenable BHIF test on SparkonYarn if needed

2018-05-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19611:
---

 Summary: reenable BHIF test on SparkonYarn if needed
 Key: HIVE-19611
 URL: https://issues.apache.org/jira/browse/HIVE-19611
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sahil Takiar


See HIVE-19608. Fails occasionally, looks like it's caused by an OOM in some 
task.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19610) reeenable union_stats test if needed

2018-05-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19610:
---

 Summary: reeenable union_stats test if needed
 Key: HIVE-19610
 URL: https://issues.apache.org/jira/browse/HIVE-19610
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Ashutosh Chauhan


There's some flaky stats diff. See HIVE-19608 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 67220: HIVE-12192

2018-05-18 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67220/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-12192
https://issues.apache.org/jira/browse/HIVE-12192


Repository: hive-git


Description
---

HIVE-12192


Diffs
---

  
accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/mr/TestHiveAccumuloTypes.java
 926f5720ac6ff0cfaf3858f2eba4e9e4ef37a889 
  common/src/java/org/apache/hadoop/hive/common/type/Date.java PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/type/TimestampTZUtil.java 
90ffddba0dd6c608857ee645023532de86c731b8 
  common/src/java/org/apache/hadoop/hive/common/type/TimestampUtils.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
b12a7a4d4071badde6a7a6d7d41ae1e1413268ac 
  common/src/java/org/apache/hive/common/util/DateParser.java 
949fdbafcfd599e47889297955ed1e39b4f747b5 
  common/src/java/org/apache/hive/common/util/TimestampParser.java 
f674b5d30bd8addde1dad70dba3553651c0bea60 
  
common/src/test/org/apache/hadoop/hive/common/type/TestHiveDecimalOrcSerializationUtils.java
 72dce4deaaef819a436aa0ac6bdf3f526ff5aba5 
  common/src/test/org/apache/hadoop/hive/common/type/TestTimestampTZ.java 
5a3f0481bc23e807f1f5e7b12cb9e5422b7f0e4a 
  common/src/test/org/apache/hive/common/util/TestDateParser.java 
0553b3d38718b4b4fabc808a607a98e3aa00efd4 
  common/src/test/org/apache/hive/common/util/TestTimestampParser.java 
c982af65c6228bf515aa06c2b586a7651e100ccc 
  data/files/alltypesorc3xcols e48487328bcac6872ce7cc652b5abed2529d538b 
  data/files/orc_split_elim.orc cd145d343104983b4b09603c251ee749e5f82cc7 
  druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDe.java 
5f7657975a2e664ab7dda59fd7be8b06e5fe0c92 
  druid-handler/src/test/org/apache/hadoop/hive/druid/serde/TestDruidSerDe.java 
e45de0f93f741b10d2f49cd7e36260f55c77edec 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseRowSerializer.java 
bc4e1466f513bc1d009f6ce64ce24e1cae592e44 
  
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseInputFormatUtil.java
 05cc30a62137733e7e94cb8d61c79503ec4f16e2 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 
d94dbe8d8aadc78cfc9b60799165f378dcc8486b 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/DataType.java 
6dcee4024ba16da30fb55a8c07a2439292324115 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/JsonSerDe.java 
114c205c83cefa23827af4072d0dc1dc4056d83c 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/ReaderWriter.java 
cb1c459afbd6a42e245728cbd782ea9261b4e85e 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/HCatBaseStorer.java
 ec620d2fe06ebe14cf6ce03bbcfce877e34651a9 
  
itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/ColumnVectorGenUtil.java
 d80b6d43fe44b19401a79d5190935bbabb9d522e 
  
llap-common/src/test/org/apache/hadoop/hive/llap/io/TestChunkedInputStream.java 
77559e1e589a67cad1bbc90ff190ffd5ec1db74f 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcEncodedDataConsumer.java
 feccb878b79915c3442d514a58995a85a4b2dcd0 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
 4033b379defbef2ed952bee7e4f737149a6c5a9d 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java
 1cfe92978a029ef68aa1af6eeda5a5aacbad255e 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ConsumerFileMetadata.java
 bf139c071ccee07be3094ab64b9efafcb95c7b7c 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java
 0012afb3efa1fa67c8c897711ab102c833f1f0a0 
  pom.xml ce3da37d2b4a33819fe62c32589c417ad5dd17fd 
  ql/pom.xml fedb5f1f80e8aab570a9c6a63272f9327199eb02 
  ql/src/gen/vectorization/ExpressionTemplates/DTIColumnCompareScalar.txt 
0d3ee2b74c9cdaeb4c07fc384a43603e543d5371 
  ql/src/gen/vectorization/ExpressionTemplates/DTIScalarCompareColumn.txt 
be5f641291e169822e8682aa0e1373d276cf50d7 
  
ql/src/gen/vectorization/ExpressionTemplates/DateColumnArithmeticIntervalYearMonthColumn.txt
 32dd6ed69f55ed2632447cdc6dedd73df9e9be33 
  
ql/src/gen/vectorization/ExpressionTemplates/DateColumnArithmeticIntervalYearMonthScalar.txt
 94c0c5c86fd3a71aa308d3c153b40db29a8bebb2 
  
ql/src/gen/vectorization/ExpressionTemplates/DateColumnArithmeticTimestampColumn.txt
 96c525d2b534da76ceff37f3ed293cad80c3fcaf 
  
ql/src/gen/vectorization/ExpressionTemplates/DateColumnArithmeticTimestampScalar.txt
 fb22992657ca9cec333487082a65ca6273a8cc47 
  
ql/src/gen/vectorization/ExpressionTemplates/DateScalarArithmeticIntervalYearMonthColumn.txt
 0c8ec9c161203513cfa84aa51151e5e187dc2739 
  
ql/src/gen/vectorization/ExpressionTemplates/DateScalarArithmeticTimestampColumn.txt
 

[jira] [Created] (HIVE-19609) pointless callstacks in the logs as usual

2018-05-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19609:
---

 Summary: pointless callstacks in the logs as usual
 Key: HIVE-19609
 URL: https://issues.apache.org/jira/browse/HIVE-19609
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-19609.patch





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19608) disable flaky tests 2

2018-05-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19608:
---

 Summary: disable flaky tests 2
 Key: HIVE-19608
 URL: https://issues.apache.org/jira/browse/HIVE-19608
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


union_stats
{noformat}
java.lang.AssertionError: 
Client Execution succeeded but contained differences (error code = 1) after 
executing union_stats.q 
362a363
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
364a366,367
>   numRows 1000
>   rawDataSize 10624
{noformat}
Every few runs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 67186: HIVE-19585: Add UNKNOWN to PrincipalType

2018-05-18 Thread Arjun Mishra via Review Board


> On May 17, 2018, 8:24 p.m., Sergio Pena wrote:
> > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/PrincipalType.java
> > Lines 18 (patched)
> > 
> >
> > This file is auto-generated by Thrift. To add a new field, you need to 
> > edit the hive_metastore.thrift file and generate the new thrift files.
> > 
> > Btw, this might add a behavior to all the authorization commands like: 
> > ALTER TABLE ... SET OWNER UNKNOWN 
> > 
> > Do we want to support that? Do we need the UNKNOWN on the PrincipalType?

Good point about the Thrift file. 
Do we need to worry about adding new behavior? UNKNOWN is just the default type 
when no principal type is needed. 
We need UNKNOWN on PrincipalType to match 
HivePrincipal.HivePrincipalType.


- Arjun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67186/#review203369
---


On May 17, 2018, 3:19 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67186/
> ---
> 
> (Updated May 17, 2018, 3:19 p.m.)
> 
> 
> Review request for hive and Sergio Pena.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> We need to add the UNKNOWN type to PrincipalType to match 
> HivePrincipal.HivePrincipalType.UNKNOWN
> 
> 
> Diffs
> ---
> 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/PrincipalType.java
>  82eb8fd700 
> 
> 
> Diff: https://reviews.apache.org/r/67186/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



ptest tmp directories and test flakiness

2018-05-18 Thread Sergey Shelukhin
Hi.

We have many test failures due to flakiness on ptest machines; it looks like
the tmp directory is deleted while tests are running:

2018-05-18T10:24:44,991 WARN [Thread-3915] mapred.LocalJobRunner: job_local632888732_0106
java.io.FileNotFoundException: File file:/tmp/hadoop/mapred/staging/hiveptest632888732/.staging/job_local632888732_0106/job.splitmetainfo does not exist
…

Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/tmp/temp1540619121/tmp-2080326801

…
etc.

1) Can we have the tmp directory NOT cleaned up while tests are running? I
wonder if it’s easy to nuke it between runs.

2) Otherwise we need to weed out all the tests that use tmp and make them
not use it. I’m not sure about the best way to do this… hadoop/mapred
seems to come from mapreduce.jobtracker.staging.root.dir and
hadoop.tmp.dir, but at least after some time looking I cannot find where
we set hadoop.tmp.dir to /tmp/hadoop, and it also doesn’t match the
default value, which includes the username.
Where the other one comes from, I’m not sure at all.
I wonder if it’s viable to deny the ptest user access to tmp temporarily, then
see what fails at the earliest possible point? 



[jira] [Created] (HIVE-19607) Pushing Aggregates on Top of Aggregates

2018-05-18 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-19607:
-

 Summary: Pushing Aggregates on Top of Aggregates
 Key: HIVE-19607
 URL: https://issues.apache.org/jira/browse/HIVE-19607
 Project: Hive
  Issue Type: Sub-task
Reporter: slim bouguerra
 Fix For: 3.1.0


This plan shows an instance where the count aggregates can be pushed to Druid, 
which would eliminate the last-stage reducer.

{code}
+PREHOOK: query: EXPLAIN select count(DISTINCT cstring2), sum(cdouble) FROM druid_table
+PREHOOK: type: QUERY
+POSTHOOK: query: EXPLAIN select count(DISTINCT cstring2), sum(cdouble) FROM druid_table
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+#### A masked pattern was here ####
+      Edges:
+        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1
+            Map Operator Tree:
+                TableScan
+                  alias: druid_table
+                  properties:
+                    druid.fieldNames cstring2,$f1
+                    druid.fieldTypes string,double
+                    druid.query.json {"queryType":"groupBy","dataSource":"default.druid_table","granularity":"all","dimensions":[{"type":"default","dimension":"cstring2","outputName":"cstring2","outputType":"STRING"}],"limitSpec":{"type":"default"},"aggregations":[{"type":"doubleSum","name":"$f1","fieldName":"cdouble"}],"intervals":["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]}
+                    druid.query.type groupBy
+                  Statistics: Num rows: 9173 Data size: 1673472 Basic stats: COMPLETE Column stats: NONE
+                  Select Operator
+                    expressions: cstring2 (type: string), $f1 (type: double)
+                    outputColumnNames: cstring2, $f1
+                    Statistics: Num rows: 9173 Data size: 1673472 Basic stats: COMPLETE Column stats: NONE
+                    Group By Operator
+                      aggregations: count(cstring2), sum($f1)
+                      mode: hash
+                      outputColumnNames: _col0, _col1
+                      Statistics: Num rows: 1 Data size: 208 Basic stats: COMPLETE Column stats: NONE
+                      Reduce Output Operator
+                        sort order:
+                        Statistics: Num rows: 1 Data size: 208 Basic stats: COMPLETE Column stats: NONE
+                        value expressions: _col0 (type: bigint), _col1 (type: double)
+        Reducer 2
+            Reduce Operator Tree:
+              Group By Operator
+                aggregations: count(VALUE._col0), sum(VALUE._col1)
+                mode: mergepartial
+                outputColumnNames: _col0, _col1
+                Statistics: Num rows: 1 Data size: 208 Basic stats: COMPLETE Column stats: NONE
+                File Output Operator
+                  compressed: false
+                  Statistics: Num rows: 1 Data size: 208 Basic stats: COMPLETE Column stats: NONE
+                  table:
+                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19606) Straggler thread in HS2 for rename directory operation stuck in loop causing performance issue and cluster slowdown

2018-05-18 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-19606:
-

 Summary: Straggler thread in HS2 for rename directory operation 
stuck in loop causing performance issue and cluster slowdown
 Key: HIVE-19606
 URL: https://issues.apache.org/jira/browse/HIVE-19606
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.0.0
Reporter: Eugene Koifman






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Apache Hive 3.0.0 Release Candidate 0

2018-05-18 Thread Vineet Garg
Working on fixing this. Look for a new email to vote soon.

> On May 17, 2018, at 10:52 PM, Prasanth Jayachandran wrote:
> 
> Could you please increment the RC version when you roll a new RC? That makes it easy 
> to verify the latest RC version.
> Also, the email in your public key seems to have a typo: 
> vg...@apche.org.
> 
> Thanks
> Prasanth
> 
> On May 17, 2018, at 9:33 PM, Vineet Garg wrote:
> 
> standalone-metastore/pom.xml’s dependency on storage-api wasn’t updated in 
> this RC. I have fixed that and I am working on creating a new RC. I’ll send 
> another email for voting soon.
> 
> On May 15, 2018, at 5:56 PM, Vineet Garg wrote:
> 
> Apache Hive 3.0.0 Release Candidate 0 is available here:
> 
> http://people.apache.org/~vgarg/apache-hive-3.0.0-rc-0
> 
> 
> Tag: https://github.com/apache/hive/tree/release-3.0.0-rc0
> 
> 
> Voting will conclude in 72 hours.
> 
> Hive PMC Members: Please test and vote.
> 
> Thanks.
> 
> 



[jira] [Created] (HIVE-19605) TAB_COL_STATS table has no index on db/table name

2018-05-18 Thread Todd Lipcon (JIRA)
Todd Lipcon created HIVE-19605:
--

 Summary: TAB_COL_STATS table has no index on db/table name
 Key: HIVE-19605
 URL: https://issues.apache.org/jira/browse/HIVE-19605
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Todd Lipcon


The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, TABLE_NAME). 
The getTableColumnStatistics call queries based on this tuple. This makes those 
queries take a significant amount of time in large metastores since they do a 
full table scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19604) Incorrect Handling of Boolean in DruidSerde

2018-05-18 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19604:
---

 Summary: Incorrect Handling of Boolean in DruidSerde
 Key: HIVE-19604
 URL: https://issues.apache.org/jira/browse/HIVE-19604
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Results of boolean expressions from Druid are expressed as numeric 1 or 0. 
When reading the results in DruidSerde, both 1 and 0 are converted to String 
and then passed to Boolean.valueOf(stringForm). Because Boolean.valueOf returns 
true only for the string "true" (ignoring case), the boolean is always read as 
false.
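
The behavior is easy to reproduce in isolation. The sketch below is not the DruidSerde code and not the committed fix; parseDruidBoolean is a hypothetical helper that shows one way to accept numeric 1/0 in addition to "true"/"false" text.

```java
// Hypothetical sketch; names here do not come from the Hive code base.
final class DruidBooleanSketch {

    /** Treats non-zero numbers and the strings "1" / "true" as true. */
    static boolean parseDruidBoolean(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue() != 0;
        }
        String text = String.valueOf(value).trim();
        return "1".equals(text) || Boolean.parseBoolean(text);
    }

    public static void main(String[] args) {
        // Boolean.valueOf(String) is true only for "true" (case-insensitive).
        System.out.println(Boolean.valueOf(String.valueOf(1))); // false
        System.out.println(Boolean.valueOf(String.valueOf(0))); // false

        // The hypothetical helper distinguishes the two values.
        System.out.println(parseDruidBoolean(1)); // true
        System.out.println(parseDruidBoolean(0)); // false
    }
}
```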



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19603) Decrease batch size of TestMinimrCliDriver

2018-05-18 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-19603:
---

 Summary: Decrease batch size of TestMinimrCliDriver
 Key: HIVE-19603
 URL: https://issues.apache.org/jira/browse/HIVE-19603
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Sahil Takiar
Assignee: Sahil Takiar


We have seen a lot of flakiness with {{TestMinimrCliDriver}}: it keeps 
timing out. I checked a recent Hive QA run, and running the following tests 
locally takes my machine 1 hour:

{code}
mvn -B test -Dtest.groups= -Dtest=TestMinimrCliDriver -Dminimr.query.files=infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,scriptfile1.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q
{code}

On ptest, the timeout is 40 minutes. I suggest we decrease the batch size from 
10 to 5.
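
As a rough back-of-the-envelope check of the numbers above, here is a small Java sketch. It rests on two loud assumptions that the report does not confirm: every query file costs about the same, and the ptest machines run at roughly the speed of the reporter's machine. The class and method names are hypothetical.

```java
// Rough estimate only; assumes uniform per-test cost and comparable hardware.
final class BatchSizeEstimate {

    /** Scales an observed batch runtime linearly to another batch size. */
    static double estimatedMinutes(int batchSize, double observedMinutes, int observedBatchSize) {
        return observedMinutes / observedBatchSize * batchSize;
    }

    /** Largest batch size whose estimated runtime fits within the timeout. */
    static int largestBatchWithin(double timeoutMinutes, double observedMinutes, int observedBatchSize) {
        double minutesPerTest = observedMinutes / observedBatchSize;
        return (int) Math.floor(timeoutMinutes / minutesPerTest);
    }

    public static void main(String[] args) {
        // 10 tests took 60 minutes locally, so about 6 minutes per test.
        System.out.println(estimatedMinutes(5, 60.0, 10));    // 30.0
        System.out.println(largestBatchWithin(40.0, 60.0, 10)); // 6
    }
}
```

Under those assumptions a batch of 5 would take about 30 minutes, leaving some headroom under the 40-minute timeout, whereas a batch of 6 would sit right at the edge.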



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hive pull request #351: Branch 1.2

2018-05-18 Thread ey1984
GitHub user ey1984 opened a pull request:

https://github.com/apache/hive/pull/351

Branch 1.2

Hello,

I'm using your hive-jdbc (1.2.1) as a dependency for 2 applications deployed 
into 2 Docker containers (Java code and Python code).
Both containers run behind a proxy when I deploy to the 
qualification and production environments.
For Java, I set -Dhttp.proxyHost=MyProxyHost -Dhttp.proxyPort=MyProxyPort.
For Python, I set HTTP_PROXY=http://myproxyhost:myproxyport.

After deploying and launching, a timeout occurs when I try to reach the 
Hive server. I call Hive with the URL jdbc://hive

So after debugging your source code, I added some code (this PR, as requested) 
to read the system proxy settings (from the OS environment or the JVM 
configuration), and it works fine for both containers.
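
The general idea (JVM system properties first, environment variable as a fallback) can be sketched in plain Java as below. This is not the code from the pull request and uses no hive-jdbc classes; ProxyLookupSketch and HostPort are hypothetical names. It assumes HTTP_PROXY holds a URL with a numeric port and uses 80 when no port is given.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch of a proxy lookup; not taken from the pull request.
final class ProxyLookupSketch {

    /** Simple host/port holder used only by this sketch. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Prefers http.proxyHost/http.proxyPort, then falls back to HTTP_PROXY. */
    static Optional<HostPort> resolve(Properties props, Map<String, String> env) {
        String host = props.getProperty("http.proxyHost");
        if (host != null && !host.isEmpty()) {
            int port = Integer.parseInt(props.getProperty("http.proxyPort", "80"));
            return Optional.of(new HostPort(host, port));
        }
        String raw = env.get("HTTP_PROXY");
        if (raw != null && !raw.isEmpty()) {
            URI uri = URI.create(raw);
            if (uri.getHost() != null) {
                int port = uri.getPort() == -1 ? 80 : uri.getPort();
                return Optional.of(new HostPort(uri.getHost(), port));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<HostPort> proxy = resolve(System.getProperties(), System.getenv());
        System.out.println(proxy.map(p -> p.host + ":" + p.port).orElse("no proxy configured"));
    }
}
```

Passing the properties and environment in as parameters keeps the lookup testable without changing the real JVM or OS settings.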

Is this fix acceptable, or is there another way to reach the Hive 
server through a proxy?

Python: I use JayDeBeApi. Java: only 
DriverManager.getConnection("jdbc://hive2")

Thanks a lot

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ey1984/hive branch-1.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/351.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #351


commit 2900b5c687f7a2e11bf1b56ddc17aa271e557ff2
Author: ey1984 
Date:   2018-04-30T21:31:39Z

Get proxy system

commit b07aa6a4d228a40c9b45ff5d99a7104a5dbb8443
Author: ey1984 
Date:   2018-04-30T21:54:28Z

merge conflict caused by PR Hive-Proxy-System

commit 3962b9b3b54f5e07854331d989911b645ef344bd
Author: Ekrem YILMAZ 
Date:   2018-05-14T12:06:21Z

Get Proxy Settings By UseSystemProperties

Get Proxy Settings by UseSystemProperties and keep getEnv

commit 2e98af05005161eb33c255022d0487ddd808dfb0
Author: Ekrem YILMAZ 
Date:   2018-05-14T12:09:36Z

Merge branch 'HIVE-Proxy-System' into branch-1.2

commit 59860ca463eec571421d875ef1ed32f545ecfef9
Author: Ekrem YILMAZ 
Date:   2018-05-14T12:10:08Z

Merge with Hive-Proxy-System

commit 913c8c8f620c62316d89d7b9187bfcad3b700f16
Author: Ekrem YILMAZ 
Date:   2018-05-14T12:11:47Z

Merge with Hive-Proxy-System




---


Re: Pull Request for Proxy Settings

2018-05-18 Thread Peter Vary
Hi Ekrem,

This is the documented way of contributing to Hive:
https://cwiki.apache.org/confluence/display/Hive/HowToContribute 


Could you please provide your contribution according to that?

Thanks,
Peter

> On May 14, 2018, at 2:25 PM, Ekrem YILMAZ wrote:
> 
> Hi guys,
> 
> I created a pull request for getting proxy settings from env or system
> properties.
> All descriptions are available on the pull request.
> 
> https://github.com/apache/hive/pull/338
> 
> Do you know when this pull request will be handled, or whether I have missed
> some steps in the pull request process?
> 
> Thanks a lot.
> 
> Ekrem