[discuss] Fixing DecimalColumnVector cache misses

2016-11-03 Thread Gopal Vijayaraghavan
Hi,

(x-posted for discussion)

Hive's storage-api + ORC vector readers have a cache miss built into them for the 
case of Decimal readers. 

With LLAP, two distinct cache misses are basically dragging Decimal performance 
down.

DecimalColumnVector -> HiveDecimalWritable -> HiveDecimal(BigInteger) -> new 
BigDecimal()

The writable is entirely overhead and so are the BigInteger -> BigDecimal 
conversions, particularly since the HiveDecimal type is not boxed unlike a 
"long".

Modifying the writable involves a fresh allocation of a HiveDecimal, which 
makes the object reference a rather unsightly cache miss (this is TPC-H Q1).
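
For anyone who wants the shape of the problem in code, here is a minimal sketch. It is not the storage-api code: BoxedDecimalColumn and LongDecimalColumn are hypothetical names, and the second layout assumes every value in the column shares one scale and fits in a long.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical sketch; these are not the real storage-api classes. */
class DecimalLayoutSketch {

    /** One heap object per row: every read follows a reference, every write allocates. */
    static final class BoxedDecimalColumn {
        final BigDecimal[] vector;

        BoxedDecimalColumn(int size) {
            vector = new BigDecimal[size];
        }

        void set(int row, long unscaled, int scale) {
            // A fresh BigInteger and a fresh BigDecimal for every value written.
            vector[row] = new BigDecimal(BigInteger.valueOf(unscaled), scale);
        }

        BigDecimal sum() {
            BigDecimal total = BigDecimal.ZERO;
            for (BigDecimal v : vector) {
                total = total.add(v);
            }
            return total;
        }
    }

    /** Assumed alternative: unscaled values stored inline in a primitive array. */
    static final class LongDecimalColumn {
        final long[] unscaled;
        final int scale;

        LongDecimalColumn(int size, int scale) {
            this.unscaled = new long[size];
            this.scale = scale;
        }

        void set(int row, long value) {
            unscaled[row] = value; // no allocation, no pointer to chase
        }

        BigDecimal sum() {
            long total = 0;
            for (long v : unscaled) {
                total = Math.addExact(total, v); // throws on overflow instead of wrapping
            }
            return BigDecimal.valueOf(total, scale);
        }
    }

    public static void main(String[] args) {
        BoxedDecimalColumn boxed = new BoxedDecimalColumn(2);
        boxed.set(0, 12345L, 2);
        boxed.set(1, 100L, 2);

        LongDecimalColumn flat = new LongDecimalColumn(2, 2);
        flat.set(0, 12345L);
        flat.set(1, 100L);

        System.out.println(boxed.sum() + " " + flat.sum());
    }
}
```

Both columns hold the same values; the difference is only where the bytes live. The boxed layout scatters them across the heap, while the primitive array keeps them contiguous.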



Changing this in hive/storage-api will produce a chicken-egg scenario between 
hive/storage-api -> orc -> hive/ql/exec/vectorization, across projects.

I'm conflicted on how to change DecimalColumnVector one-shot without breaking 
things (if possible, remove BigInteger allocations in the read-path as a 
possible optimization). 

Suggestions/discuss?

Cheers,
Gopal




[jira] [Created] (HIVE-15124) Fix OrcInputFormat to use reader's schema for include boolean array

2016-11-03 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-15124:


 Summary: Fix OrcInputFormat to use reader's schema for include 
boolean array
 Key: HIVE-15124
 URL: https://issues.apache.org/jira/browse/HIVE-15124
 Project: Hive
  Issue Type: Bug
  Components: ORC
Affects Versions: 2.1.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, the OrcInputFormat uses the file's schema rather than the reader's 
schema. This means that SchemaEvolution fails with an 
ArrayIndexOutOfBoundsException if a partition has a different schema than the 
table.
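
A minimal sketch of the failure mode, assuming flat schemas and a mask in which index 0 stands for the root struct. The class and method names (IncludeArraySketch, buildInclude, isIncluded) are hypothetical and only model the indexing problem; they are not ORC's or OrcInputFormat's real code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of an include mask built from the wrong schema. */
class IncludeArraySketch {

    /** Builds a per-column include mask; index 0 stands for the root struct. */
    static boolean[] buildInclude(List<String> schemaColumns, List<String> wanted) {
        boolean[] include = new boolean[schemaColumns.size() + 1];
        include[0] = true;
        for (int i = 0; i < schemaColumns.size(); i++) {
            include[i + 1] = wanted.contains(schemaColumns.get(i));
        }
        return include;
    }

    /** Looks up a reader-schema column id in the mask. */
    static boolean isIncluded(boolean[] include, int readerColumnId) {
        return include[readerColumnId];
    }

    public static void main(String[] args) {
        List<String> fileSchema = Arrays.asList("id", "name");           // older partition
        List<String> readerSchema = Arrays.asList("id", "name", "city"); // current table
        List<String> wanted = Arrays.asList("id", "city");

        boolean[] fromFile = buildInclude(fileSchema, wanted);
        boolean[] fromReader = buildInclude(readerSchema, wanted);

        int cityId = readerSchema.indexOf("city") + 1;
        System.out.println("reader-schema mask: " + isIncluded(fromReader, cityId));
        try {
            isIncluded(fromFile, cityId);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("file-schema mask is too short: " + e.getMessage());
        }
    }
}
```

Sizing the mask from the reader's schema keeps every reader column id in range, whatever the partition's file happens to contain.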





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15123) LLAP UI: The UI should work even if the cache is disabled

2016-11-03 Thread Gopal V (JIRA)
Gopal V created HIVE-15123:
--

 Summary: LLAP UI: The UI should work even if the cache is disabled
 Key: HIVE-15123
 URL: https://issues.apache.org/jira/browse/HIVE-15123
 Project: Hive
  Issue Type: Bug
  Components: llap, Web UI
Affects Versions: 2.1.0, 2.2.0
Reporter: Gopal V
Assignee: Gopal V


{code}
metrics.js:82 Uncaught TypeError: Cannot read property 'CacheCapacityTotal' of undefined
{code}





[jira] [Created] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-11-03 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-15122:
-

 Summary: Hive: Upcasting types should not obscure stats 
(min/max/ndv)
 Key: HIVE-15122
 URL: https://issues.apache.org/jira/browse/HIVE-15122
 Project: Hive
  Issue Type: Bug
Reporter: Siddharth Seth


A UDFToLong breaks PK/FK inferences and triggers mis-estimation of joins in 
LLAP.
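
Why the stats need not be lost: widening an int to a long is order-preserving and one-to-one, so min, max and NDV of the cast column are the same as those of the source column. A small sketch of that property follows; the class and helper names are hypothetical and this is not Hive's stats code.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: stats are returned as {min, max, ndv}. */
class UpcastStatsSketch {

    /** Stats computed directly on the int column, before any cast. */
    static long[] statsOfInts(int[] values) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        Set<Integer> distinct = new HashSet<>();
        for (int v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            distinct.add(v);
        }
        return new long[] {min, max, distinct.size()};
    }

    /** Models a UDFToLong-style upcast: every int is widened to a long. */
    static long[] widen(int[] values) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i];
        }
        return out;
    }

    /** Stats computed on the column after the upcast. */
    static long[] statsOfLongs(long[] values) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        Set<Long> distinct = new HashSet<>();
        for (long v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            distinct.add(v);
        }
        return new long[] {min, max, distinct.size()};
    }

    public static void main(String[] args) {
        int[] keys = {7, -3, 7, 42};
        long[] before = statsOfInts(keys);
        long[] after = statsOfLongs(widen(keys));
        System.out.println(java.util.Arrays.equals(before, after));
    }
}
```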

Snippet from the bad plan.
{code}
STAGE PLANS:
  Stage: Stage-1
    Tez
      DagId: hive_20161031222730_a700058f-78eb-40d6-a67d-43add60a50e2:6
      Edges:
        Map 2 <- Map 1 (BROADCAST_EDGE)
        Map 3 <- Map 2 (BROADCAST_EDGE)
        Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE), Map 7 (CUSTOM_SIMPLE_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
        Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
        Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
      DagName:
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: supplier
                  filterExpr: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
                  Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
                    Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: s_suppkey (type: bigint), s_nationkey (type: bigint)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator

Re: Snapshot builds are not deployed

2016-11-03 Thread Gopal Vijayaraghavan
Hi,

> the last update was on October 13.

You are right - the Jenkins job hasn't run in 3 weeks.

https://builds.apache.org/view/H-L/view/Hive/job/Hive-trunk/

Cc'd @spena, maybe he can help. 
 
Cheers,
Gopal




[jira] [Created] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-03 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-15121:
---

 Summary: Last MR job in Hive should be able to write to a 
different scratch directory
 Key: HIVE-15121
 URL: https://issues.apache.org/jira/browse/HIVE-15121
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Sahil Takiar


Hive should be able to configure all intermediate MR jobs to write to HDFS, but 
the final MR job to write to S3.

This will be useful for implementing parallel renames on S3. The idea is that 
for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
final job writes to S3. Writing to HDFS should be faster than writing to S3, so 
it makes more sense to write intermediate data to HDFS.





[jira] [Created] (HIVE-15120) Storage based auth: allow option to enforce write checks for external tables

2016-11-03 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-15120:


 Summary: Storage based auth: allow option to enforce write checks 
for external tables
 Key: HIVE-15120
 URL: https://issues.apache.org/jira/browse/HIVE-15120
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair


Under storage based authorization, we don't require write permissions on the table 
directory for external table create/drop.
This is because external table contents are often populated from outside of 
hive and are not written into from hive, so write access is not needed. Also, 
we can't require write permissions to drop a table if we don't require them for 
creation (users who created them should be able to drop them).

However, this difference in behavior of external tables is not well documented, 
so users are surprised to learn that drop table can be done by any user 
who has read access to the directory. At that point changing the large number 
of scripts that use external tables is hard. 
It would be good to have a user config option to have external tables 
treated the same as managed tables.
The option should be off by default, so that the behavior is backward 
compatible by default.






Re: Review Request 53328: Support for standard ROLLUP syntax

2016-11-03 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53328/#review154773
---


Ship it!




Ship It!

- Jesús Camacho Rodríguez


On Nov. 3, 2016, 5:18 p.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53328/
> ---
> 
> (Updated Nov. 3, 2016, 5:18 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-15119
> https://issues.apache.org/jira/browse/HIVE-15119
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Standard ROLLUP and CUBE syntax is GROUP BY ROLLUP/CUBE (expression list)... 
> but HIVE allows GROUP BY <expression list> WITH ROLLUP/CUBE syntax. We would 
> like HIVE to support standard ROLLUP/CUBE syntax to allow out of the box 
> support for TPCDS queries, i.e. without rewriting them.
> 
> This patch includes an update to the grammar to allow ROLLUP and CUBE in the 
> following syntax:
> 
> SELECT ... GROUP BY ROLLUP (expr1, expr2)
> SELECT ... GROUP BY CUBE (expr1, expr2, ...)
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 13e2d17 
>   ql/src/test/queries/clientpositive/annotate_stats_groupby.q 854e401 
>   ql/src/test/queries/clientpositive/cbo_rp_annotate_stats_groupby.q 3159fc7 
>   ql/src/test/queries/clientpositive/cte_1.q 2956339 
>   ql/src/test/queries/clientpositive/groupby_cube1.q bfa13ee 
>   ql/src/test/queries/clientpositive/groupby_cube_multi_gby.q 80022bb 
>   ql/src/test/queries/clientpositive/groupby_grouping_id1.q de4a7c3 
>   ql/src/test/queries/clientpositive/groupby_grouping_id2.q 5c05aad 
>   ql/src/test/queries/clientpositive/groupby_grouping_sets1.q 804dfb3 
>   ql/src/test/queries/clientpositive/groupby_grouping_sets2.q 824942c 
>   ql/src/test/queries/clientpositive/groupby_grouping_sets3.q 7077377 
>   ql/src/test/queries/clientpositive/groupby_grouping_sets4.q 06e5e1a 
>   ql/src/test/queries/clientpositive/groupby_grouping_sets5.q 6a09c88 
>   ql/src/test/queries/clientpositive/groupby_rollup1.q 23cac80 
>   ql/src/test/queries/clientpositive/infer_bucket_sort_grouping_operators.q 
> 928f6fb 
>   ql/src/test/queries/clientpositive/limit_pushdown2.q 637b5b0 
>   ql/src/test/queries/clientpositive/vector_grouping_sets.q 09ba6b6 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out f6971a0 
>   ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 
> f5b4375 
>   ql/src/test/results/clientpositive/cte_1.q.out 61fd1af 
>   ql/src/test/results/clientpositive/groupby_cube1.q.out b9cfeb2 
>   ql/src/test/results/clientpositive/groupby_cube_multi_gby.q.out 992fd2d 
>   ql/src/test/results/clientpositive/groupby_grouping_id1.q.out 136edeb 
>   ql/src/test/results/clientpositive/groupby_grouping_sets1.q.out 5b70906 
>   ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out f00bb5b 
>   ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out 5c69907 
>   ql/src/test/results/clientpositive/groupby_grouping_sets4.q.out b7e9329 
>   ql/src/test/results/clientpositive/groupby_grouping_sets5.q.out f175778 
>   ql/src/test/results/clientpositive/groupby_rollup1.q.out 54e1a0d 
>   
> ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out 
> ebfce60 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out 2f68674 
>   ql/src/test/results/clientpositive/llap/cte_1.q.out e309ce8 
>   ql/src/test/results/clientpositive/llap/groupby_grouping_id2.q.out 544a7ae 
>   ql/src/test/results/clientpositive/llap/vector_grouping_sets.q.out 8e55ce3 
>   ql/src/test/results/clientpositive/vector_grouping_sets.q.out 4207c19 
> 
> Diff: https://reviews.apache.org/r/53328/diff/
> 
> 
> Testing
> ---
> 
> Updated existing tests to use new ROLLUP and CUBE syntax in addition to 
> non-standard syntax.
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



Re: Review Request 53328: Support for standard ROLLUP syntax

2016-11-03 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53328/
---

(Updated Nov. 3, 2016, 5:18 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Changes
---

Accidentally uploaded incomplete patch before. Uploading complete patch


Bugs: HIVE-15119
https://issues.apache.org/jira/browse/HIVE-15119


Repository: hive-git


Description
---

Standard ROLLUP and CUBE syntax is GROUP BY ROLLUP/CUBE (expression list)... 
but HIVE allows GROUP BY <expression list> WITH ROLLUP/CUBE syntax. We would 
like HIVE to support standard ROLLUP/CUBE syntax to allow out of the box 
support for TPCDS queries, i.e. without rewriting them.

This patch includes an update to the grammar to allow ROLLUP and CUBE in the 
following syntax:

SELECT ... GROUP BY ROLLUP (expr1, expr2)
SELECT ... GROUP BY CUBE (expr1, expr2, ...)


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 13e2d17 
  ql/src/test/queries/clientpositive/annotate_stats_groupby.q 854e401 
  ql/src/test/queries/clientpositive/cbo_rp_annotate_stats_groupby.q 3159fc7 
  ql/src/test/queries/clientpositive/cte_1.q 2956339 
  ql/src/test/queries/clientpositive/groupby_cube1.q bfa13ee 
  ql/src/test/queries/clientpositive/groupby_cube_multi_gby.q 80022bb 
  ql/src/test/queries/clientpositive/groupby_grouping_id1.q de4a7c3 
  ql/src/test/queries/clientpositive/groupby_grouping_id2.q 5c05aad 
  ql/src/test/queries/clientpositive/groupby_grouping_sets1.q 804dfb3 
  ql/src/test/queries/clientpositive/groupby_grouping_sets2.q 824942c 
  ql/src/test/queries/clientpositive/groupby_grouping_sets3.q 7077377 
  ql/src/test/queries/clientpositive/groupby_grouping_sets4.q 06e5e1a 
  ql/src/test/queries/clientpositive/groupby_grouping_sets5.q 6a09c88 
  ql/src/test/queries/clientpositive/groupby_rollup1.q 23cac80 
  ql/src/test/queries/clientpositive/infer_bucket_sort_grouping_operators.q 
928f6fb 
  ql/src/test/queries/clientpositive/limit_pushdown2.q 637b5b0 
  ql/src/test/queries/clientpositive/vector_grouping_sets.q 09ba6b6 
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out f6971a0 
  ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 
f5b4375 
  ql/src/test/results/clientpositive/cte_1.q.out 61fd1af 
  ql/src/test/results/clientpositive/groupby_cube1.q.out b9cfeb2 
  ql/src/test/results/clientpositive/groupby_cube_multi_gby.q.out 992fd2d 
  ql/src/test/results/clientpositive/groupby_grouping_id1.q.out 136edeb 
  ql/src/test/results/clientpositive/groupby_grouping_sets1.q.out 5b70906 
  ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out f00bb5b 
  ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out 5c69907 
  ql/src/test/results/clientpositive/groupby_grouping_sets4.q.out b7e9329 
  ql/src/test/results/clientpositive/groupby_grouping_sets5.q.out f175778 
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 54e1a0d 
  ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out 
ebfce60 
  ql/src/test/results/clientpositive/limit_pushdown2.q.out 2f68674 
  ql/src/test/results/clientpositive/llap/cte_1.q.out e309ce8 
  ql/src/test/results/clientpositive/llap/groupby_grouping_id2.q.out 544a7ae 
  ql/src/test/results/clientpositive/llap/vector_grouping_sets.q.out 8e55ce3 
  ql/src/test/results/clientpositive/vector_grouping_sets.q.out 4207c19 

Diff: https://reviews.apache.org/r/53328/diff/


Testing
---

Updated existing tests to use new ROLLUP and CUBE syntax in addition to 
non-standard syntax.


Thanks,

Vineet Garg



Re: Review Request 53328: Support for standard ROLLUP syntax

2016-11-03 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53328/
---

(Updated Nov. 3, 2016, 5:16 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-15119
https://issues.apache.org/jira/browse/HIVE-15119


Repository: hive-git


Description (updated)
---

Standard ROLLUP and CUBE syntax is GROUP BY ROLLUP/CUBE (expression list)... 
but HIVE allows GROUP BY <expression list> WITH ROLLUP/CUBE syntax. We would 
like HIVE to support standard ROLLUP/CUBE syntax to allow out of the box 
support for TPCDS queries, i.e. without rewriting them.

This patch includes an update to the grammar to allow ROLLUP and CUBE in the 
following syntax:

SELECT ... GROUP BY ROLLUP (expr1, expr2)
SELECT ... GROUP BY CUBE (expr1, expr2, ...)


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 13e2d17 
  ql/src/test/queries/clientpositive/annotate_stats_groupby.q 854e401 
  ql/src/test/queries/clientpositive/cbo_rp_annotate_stats_groupby.q 3159fc7 
  ql/src/test/queries/clientpositive/cte_1.q 2956339 
  ql/src/test/queries/clientpositive/groupby_grouping_id1.q de4a7c3 
  ql/src/test/queries/clientpositive/groupby_grouping_id2.q 5c05aad 
  ql/src/test/queries/clientpositive/groupby_rollup1.q 23cac80 
  ql/src/test/queries/clientpositive/infer_bucket_sort_grouping_operators.q 
928f6fb 
  ql/src/test/queries/clientpositive/limit_pushdown2.q 637b5b0 
  ql/src/test/queries/clientpositive/vector_grouping_sets.q 09ba6b6 
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out f6971a0 
  ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 
f5b4375 
  ql/src/test/results/clientpositive/cte_1.q.out 61fd1af 
  ql/src/test/results/clientpositive/groupby_grouping_id1.q.out 136edeb 
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 54e1a0d 
  ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out 
ebfce60 
  ql/src/test/results/clientpositive/limit_pushdown2.q.out 2f68674 
  ql/src/test/results/clientpositive/llap/cte_1.q.out e309ce8 
  ql/src/test/results/clientpositive/llap/groupby_grouping_id2.q.out 544a7ae 
  ql/src/test/results/clientpositive/llap/vector_grouping_sets.q.out 8e55ce3 
  ql/src/test/results/clientpositive/vector_grouping_sets.q.out 4207c19 

Diff: https://reviews.apache.org/r/53328/diff/


Testing (updated)
---

Updated existing tests to use new ROLLUP and CUBE syntax in addition to 
non-standard syntax.


Thanks,

Vineet Garg



Re: Review Request 53328: Support for standard ROLLUP syntax

2016-11-03 Thread Vineet Garg


> On Nov. 1, 2016, 12:07 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g, line 60
> > 
> >
> > Since sets is not used in that syntax, maybe it is easier to create a 
> > parser rule that rewrites
> > GROUP BY (e1, e2, e3) WITH ROLLUP into ROLLUP(e1, e2, e3)
> > and 
> > GROUP BY (e1, e2, e3) WITH CUBE into CUBE(e1, e2, e3)
> > 
> > Then the rule with the old syntax will kick in.
> > 
> > The advantage with this approach is that we will keep a single rule 
> > that actually generates the syntax that SemanticAnalyzer receives.
> > 
> > What do you think?

I agree this would be a better approach, but I am unable to figure out how to 
write a new rule in such a way.


- Vineet


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53328/#review154349
---


On Oct. 31, 2016, 11:27 p.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53328/
> ---
> 
> (Updated Oct. 31, 2016, 11:27 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Standard ROLLUP syntax is GROUP BY ROLLUP (expression list)... but HIVE 
> allows GROUP BY <expression list> WITH ROLLUP syntax. We would like HIVE to 
> support standard ROLLUP syntax to allow out of the box support for TPCDS 
> queries, i.e. without rewriting them.
> 
> This patch includes an update to the grammar to allow ROLLUP in the following 
> syntax:
> 
> SELECT ... GROUP BY ROLLUP (expr1, expr2)
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 13e2d17 
>   ql/src/test/queries/clientpositive/annotate_stats_groupby.q 854e401 
>   ql/src/test/queries/clientpositive/cbo_rp_annotate_stats_groupby.q 3159fc7 
>   ql/src/test/queries/clientpositive/cte_1.q 2956339 
>   ql/src/test/queries/clientpositive/groupby_grouping_id1.q de4a7c3 
>   ql/src/test/queries/clientpositive/groupby_grouping_id2.q 5c05aad 
>   ql/src/test/queries/clientpositive/groupby_rollup1.q 23cac80 
>   ql/src/test/queries/clientpositive/infer_bucket_sort_grouping_operators.q 
> 928f6fb 
>   ql/src/test/queries/clientpositive/limit_pushdown2.q 637b5b0 
>   ql/src/test/queries/clientpositive/vector_grouping_sets.q 09ba6b6 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out f6971a0 
>   ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 
> f5b4375 
>   ql/src/test/results/clientpositive/cte_1.q.out 61fd1af 
>   ql/src/test/results/clientpositive/groupby_grouping_id1.q.out 136edeb 
>   ql/src/test/results/clientpositive/groupby_rollup1.q.out 54e1a0d 
>   
> ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out 
> ebfce60 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out 2f68674 
>   ql/src/test/results/clientpositive/llap/cte_1.q.out e309ce8 
>   ql/src/test/results/clientpositive/llap/groupby_grouping_id2.q.out 544a7ae 
>   ql/src/test/results/clientpositive/llap/vector_grouping_sets.q.out 8e55ce3 
>   ql/src/test/results/clientpositive/vector_grouping_sets.q.out 4207c19 
> 
> Diff: https://reviews.apache.org/r/53328/diff/
> 
> 
> Testing
> ---
> 
> Updated existing tests to use new ROLLUP syntax in addition to non-standard 
> syntax.
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



[jira] [Created] (HIVE-15119) Support standard syntax for ROLLUP & CUBE

2016-11-03 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-15119:
--

 Summary: Support standard syntax for ROLLUP & CUBE
 Key: HIVE-15119
 URL: https://issues.apache.org/jira/browse/HIVE-15119
 Project: Hive
  Issue Type: Task
  Components: Parser, SQL
Reporter: Vineet Garg
Assignee: Vineet Garg


Standard ROLLUP and CUBE syntax is GROUP BY ROLLUP (expression list)... and 
GROUP BY CUBE (expression list) respectively. 
Currently HIVE only allows GROUP BY <expression list> WITH ROLLUP/CUBE syntax.
 
 We would like HIVE to support standard ROLLUP/CUBE syntax.
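
For reference, both spellings ask for the same grouping sets: ROLLUP over n expressions yields the n+1 prefixes of the list, and CUBE yields all 2^n subsets. The sketch below only illustrates those semantics; GroupingSetsSketch is a hypothetical name and is unrelated to Hive's parser or SemanticAnalyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of ROLLUP/CUBE grouping-set expansion. */
class GroupingSetsSketch {

    /** ROLLUP(e1..en) expands to the prefixes (e1..en), (e1..en-1), ..., (). */
    static List<List<String>> rollup(List<String> exprs) {
        List<List<String>> sets = new ArrayList<>();
        for (int n = exprs.size(); n >= 0; n--) {
            sets.add(new ArrayList<>(exprs.subList(0, n)));
        }
        return sets;
    }

    /** CUBE(e1..en) expands to every subset of the expressions. */
    static List<List<String>> cube(List<String> exprs) {
        List<List<String>> sets = new ArrayList<>();
        int n = exprs.size();
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            List<String> set = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                // The highest bit of the mask corresponds to the first expression.
                if ((mask & (1 << (n - 1 - i))) != 0) {
                    set.add(exprs.get(i));
                }
            }
            sets.add(set);
        }
        return sets;
    }

    public static void main(String[] args) {
        List<String> exprs = Arrays.asList("a", "b");
        System.out.println(rollup(exprs)); // [[a, b], [a], []]
        System.out.println(cube(exprs));   // [[a, b], [a], [b], []]
    }
}
```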





Re: Review Request 53257: HIVE-14960: Improve the stability of TestNotificationListener

2016-11-03 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53257/#review154743
---


Ship it!




Ship It!

- Aihua Xu


On Nov. 3, 2016, 3:30 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53257/
> ---
> 
> (Updated Nov. 3, 2016, 3:30 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, Peter Vary, and Barna Zsombor 
> Klara.
> 
> 
> Bugs: HIVE-14960
> https://issues.apache.org/jira/browse/HIVE-14960
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The TestNotificationListener test fails occasionally. This happens when the 
> testAMQListener method completes and the received messages are checked 
> before the last "DROP_TABLE" message has been processed and added to the 
> "actualMessages" list by the onMessage method. 
> As a solution I used a CountDownLatch whose count is decreased by 1 each time a 
> message is processed. The "testAMQListener" method waits for the latch to 
> reach zero, or for a maximum time limit, before completing.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/server-extensions/src/test/java/org/apache/hive/hcatalog/listener/TestNotificationListener.java
>  9e03da4 
> 
> Diff: https://reviews.apache.org/r/53257/diff/
> 
> 
> Testing
> ---
> 
> The change affects only a unit test.
> Ran the test many times locally.
> Added random sleeps to simulate the delay of the message processing.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 53257: HIVE-14960: Improve the stability of TestNotificationListener

2016-11-03 Thread Marta Kuczora

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53257/
---

(Updated Nov. 3, 2016, 3:30 p.m.)


Review request for hive, Aihua Xu, Chaoyu Tang, Peter Vary, and Barna Zsombor 
Klara.


Bugs: HIVE-14960
https://issues.apache.org/jira/browse/HIVE-14960


Repository: hive-git


Description
---

The TestNotificationListener test fails occasionally. This happens when the 
testAMQListener method completes and the received messages are checked 
before the last "DROP_TABLE" message has been processed and added to the 
"actualMessages" list by the onMessage method. 
As a solution I used a CountDownLatch whose count is decreased by 1 each time a 
message is processed. The "testAMQListener" method waits for the latch to 
reach zero, or for a maximum time limit, before completing.
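
The pattern described above can be sketched roughly as follows. This is not the patch itself: LatchedListenerSketch, awaitMessages and runDemo are hypothetical names, the message names other than "DROP_TABLE" are placeholders, and the 5 second timeout is an arbitrary choice for the demo.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a listener whose consumer waits on a CountDownLatch. */
class LatchedListenerSketch {
    private final List<String> actualMessages = new CopyOnWriteArrayList<>();
    private final CountDownLatch latch;

    LatchedListenerSketch(int expectedMessages) {
        latch = new CountDownLatch(expectedMessages);
    }

    /** Stand-in for onMessage: record the message, then count down. */
    void onMessage(String message) {
        try {
            actualMessages.add(message);
        } finally {
            latch.countDown();
        }
    }

    /** Waits until every expected message has arrived or the timeout expires. */
    List<String> awaitMessages(long timeout, TimeUnit unit) throws InterruptedException {
        if (!latch.await(timeout, unit)) {
            throw new IllegalStateException("timed out; received " + actualMessages);
        }
        return actualMessages;
    }

    /** Delivers three messages from another thread and waits for all of them. */
    static List<String> runDemo() {
        List<String> expected = Arrays.asList("CREATE_TABLE", "ADD_PARTITION", "DROP_TABLE");
        LatchedListenerSketch listener = new LatchedListenerSketch(expected.size());
        Thread producer = new Thread(() -> {
            for (String message : expected) {
                try {
                    Thread.sleep(20); // simulate slow message processing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                listener.onMessage(message);
            }
        });
        producer.start();
        try {
            return listener.awaitMessages(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Counting down in a finally block means a failure while handling a message still releases the waiting thread, instead of leaving it to sit through the full timeout.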


Diffs (updated)
-

  
hcatalog/server-extensions/src/test/java/org/apache/hive/hcatalog/listener/TestNotificationListener.java
 9e03da4 

Diff: https://reviews.apache.org/r/53257/diff/


Testing
---

The change affects only a unit test.
Ran the test many times locally.
Added random sleeps to simulate the delay of the message processing.


Thanks,

Marta Kuczora



Re: Review Request 53257: HIVE-14960: Improve the stability of TestNotificationListener

2016-11-03 Thread Marta Kuczora


> On Oct. 28, 2016, 11:31 a.m., Barna Zsombor Klara wrote:
> > +1.
> > Thanks for the patch.

Thanks a lot for the review.


> On Oct. 28, 2016, 11:31 a.m., Barna Zsombor Klara wrote:
> > hcatalog/server-extensions/src/test/java/org/apache/hive/hcatalog/listener/TestNotificationListener.java,
> >  line 112
> > 
> >
> > nit: I would raise this up to the instance level and change the 
> > CountDownLatch constructor argument to the size of the list.

It's a very good idea. I fixed it.


> On Oct. 28, 2016, 11:31 a.m., Barna Zsombor Klara wrote:
> > hcatalog/server-extensions/src/test/java/org/apache/hive/hcatalog/listener/TestNotificationListener.java,
> >  line 253
> > 
> >
> > I would put this into a finally block. I know that the timeout will 
> > prevent the test from hanging indefinitely, but let's not wait 30s unless 
> > necessary.

You're right about this. I fixed it.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53257/#review154114
---


On Oct. 28, 2016, 10:08 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53257/
> ---
> 
> (Updated Oct. 28, 2016, 10:08 a.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, Peter Vary, and Barna Zsombor 
> Klara.
> 
> 
> Bugs: HIVE-14960
> https://issues.apache.org/jira/browse/HIVE-14960
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The TestNotificationListener test fails occasionally. This happens when the 
> testAMQListener method completes and the received messages are checked 
> before the last "DROP_TABLE" message has been processed and added to the 
> "actualMessages" list by the onMessage method. 
> As a solution I used a CountDownLatch whose count is decreased by 1 each time a 
> message is processed. The "testAMQListener" method waits for the latch to 
> reach zero, or for a maximum time limit, before completing.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/server-extensions/src/test/java/org/apache/hive/hcatalog/listener/TestNotificationListener.java
>  9e03da4 
> 
> Diff: https://reviews.apache.org/r/53257/diff/
> 
> 
> Testing
> ---
> 
> The change affects only a unit test.
> Ran the test many times locally.
> Added random sleeps to simulate the delay of the message processing.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



[jira] [Created] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-11-03 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15118:
---

 Summary: Remove unused 'COLUMNS' table from derby schema
 Key: HIVE-15118
 URL: https://issues.apache.org/jira/browse/HIVE-15118
 Project: Hive
  Issue Type: Improvement
  Components: Database/Schema
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


The COLUMNS table is no longer used. Other databases have already removed it; 
remove it from derby as well.





[jira] [Created] (HIVE-15117) Partition filters are not pushed down with lateral view and undeterministic UDF

2016-11-03 Thread LongShangRen (JIRA)
LongShangRen created HIVE-15117:
---

 Summary: Partition filters are not pushed down with lateral view 
and undeterministic UDF
 Key: HIVE-15117
 URL: https://issues.apache.org/jira/browse/HIVE-15117
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: LongShangRen
 Fix For: 1.2.1


SQL with a lateral view doesn't push down the partition column filter as expected.

Here is how it can be reproduced.

1. *create test table*

{quote}
create table test_lateral_view (id bigint, json_cont string) partitioned by (vt string);
{quote}

2. *explain below sql*

{quote}

select   *
from test_lateral_view a
lateral view json_tuple(json_cont, 'iids', 'indexs') b as iids,indexs
where a.vt = '2016-10-27'
and rand()>0.5;
{quote}

here is my result:

{quote}

STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: a
          Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
          Lateral View Forward
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: id (type: bigint), json_cont (type: string), vt (type: string)
              outputColumnNames: id, json_cont, vt
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Lateral View Join Operator
                outputColumnNames: _col0, _col1, _col2, _col6, _col7
                Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Filter Operator
                  {color:red}
                  predicate: ((_col2 = '2016-10-27') and (rand() > 0.5)) (type: boolean)
                  {color}
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: _col0 (type: bigint), _col1 (type: string), '2016-10-27' (type: string), _col6 (type: string), _col7 (type: string)
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    ListSink
            Select Operator
              expressions: json_cont (type: string), 'iids' (type: string), 'indexs' (type: string)
              outputColumnNames: _col0, _col1, _col2
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              UDTF Operator
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                function name: json_tuple
                Lateral View Join Operator
                  outputColumnNames: _col0, _col1, _col2, _col6, _col7
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Filter Operator
                    predicate: ((_col2 = '2016-10-27') and (rand() > 0.5)) (type: boolean)
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    Select Operator
                      expressions: _col0 (type: bigint), _col1 (type: string), '2016-10-27' (type: string), _col6 (type: string), _col7 (type: string)
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4
                      Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                      ListSink
{quote}

As you can see, the partition column is in the Filter Operator, which means this 
SQL will scan the whole table.








[jira] [Created] (HIVE-15116) Flaky test: TestMiniLlapLocalCliDriver.testCliDriver.join_acid_non_acid

2016-11-03 Thread Barna Zsombor Klara (JIRA)
Barna Zsombor Klara created HIVE-15116:
--

 Summary: Flaky test: 
TestMiniLlapLocalCliDriver.testCliDriver.join_acid_non_acid
 Key: HIVE-15116
 URL: https://issues.apache.org/jira/browse/HIVE-15116
 Project: Hive
  Issue Type: Sub-task
Reporter: Barna Zsombor Klara


{code}
Running: diff -a 
/home/hiveptest/54.193.134.5-hiveptest-0/apache-github-source-source/itests/qtest/target/qfile-results/clientpositive/join_acid_non_acid.q.out
 
/home/hiveptest/54.193.134.5-hiveptest-0/apache-github-source-source/ql/src/test/results/clientpositive/llap/join_acid_non_acid.q.out
73d72
< 1 a
74a74
> 1 a
{code}

Seems to be a white space difference.
The test failed in the following pre-commit runs:
https://builds.apache.org/job/PreCommit-HIVE-Build/1932/testReport/
https://builds.apache.org/job/PreCommit-HIVE-Build/1931/testReport/
https://builds.apache.org/job/PreCommit-HIVE-Build/1930/testReport/





[jira] [Created] (HIVE-15115) Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]

2016-11-03 Thread Barna Zsombor Klara (JIRA)
Barna Zsombor Klara created HIVE-15115:
--

 Summary: Flaky test: 
TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 Key: HIVE-15115
 URL: https://issues.apache.org/jira/browse/HIVE-15115
 Project: Hive
  Issue Type: Sub-task
Reporter: Barna Zsombor Klara


This test was identified as flaky before; it seems to have turned flaky again.
Earlier Jira:
[HIVE-14976|https://issues.apache.org/jira/browse/HIVE-14976]
New flaky runs:
https://builds.apache.org/job/PreCommit-HIVE-Build/1931/testReport
https://builds.apache.org/job/PreCommit-HIVE-Build/1930/testReport


