svn commit: r1034236 - in /websites/staging/hive/trunk/content: ./ people.html

2018-08-20 Thread buildbot
Author: buildbot
Date: Mon Aug 20 22:09:38 2018
New Revision: 1034236

Log:
Staging update by buildbot for hive

Modified:
websites/staging/hive/trunk/content/   (props changed)
websites/staging/hive/trunk/content/people.html

Propchange: websites/staging/hive/trunk/content/
--
--- cms:source-revision (original)
+++ cms:source-revision Mon Aug 20 22:09:38 2018
@@ -1 +1 @@
-1837105
+1838505

Modified: websites/staging/hive/trunk/content/people.html
==
--- websites/staging/hive/trunk/content/people.html (original)
+++ websites/staging/hive/trunk/content/people.html Mon Aug 20 22:09:38 2018
@@ -452,7 +452,7 @@ tr:nth-child(2n+1) {
 
 xuefu 
 Xuefu Zhang 
- 
+<a href="https://www.alibaba.com/">Alibaba Inc</a>
  
 
 




svn commit: r1838505 - /hive/cms/trunk/content/people.mdtext

2018-08-20 Thread xuefu
Author: xuefu
Date: Mon Aug 20 22:09:32 2018
New Revision: 1838505

URL: http://svn.apache.org/viewvc?rev=1838505&view=rev
Log:
Update Xuefu's org in the committer list

Modified:
hive/cms/trunk/content/people.mdtext

Modified: hive/cms/trunk/content/people.mdtext
URL: 
http://svn.apache.org/viewvc/hive/cms/trunk/content/people.mdtext?rev=1838505&r1=1838504&r2=1838505&view=diff
==
--- hive/cms/trunk/content/people.mdtext (original)
+++ hive/cms/trunk/content/people.mdtext Mon Aug 20 22:09:32 2018
@@ -334,7 +334,7 @@ tr:nth-child(2n+1) {
 
 xuefu 
 Xuefu Zhang 
- 
+<a href="https://www.alibaba.com/">Alibaba Inc</a>
 
 
 




[1/2] hive git commit: HIVE-20366: TPC-DS query78 stats estimates are off for is null filter(Vineet Garg, reviewed by Ashutosh Chauhan)

2018-08-20 Thread vgarg
Repository: hive
Updated Branches:
  refs/heads/master f28036137 -> 20baf490c


http://git-wip-us.apache.org/repos/asf/hive/blob/20baf490/ql/src/test/results/clientpositive/llap/subquery_notin.q.out
--
diff --git a/ql/src/test/results/clientpositive/llap/subquery_notin.q.out b/ql/src/test/results/clientpositive/llap/subquery_notin.q.out
index 70501f9..390cdf0 100644
--- a/ql/src/test/results/clientpositive/llap/subquery_notin.q.out
+++ b/ql/src/test/results/clientpositive/llap/subquery_notin.q.out
@@ -105,10 +105,10 @@ STAGE PLANS:
   0 _col0 (type: string)
   1 _col0 (type: string)
 outputColumnNames: _col0, _col1, _col2, _col3, _col5
-Statistics: Num rows: 500 Data size: 97528 Basic stats: COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 500 Data size: 97716 Basic stats: COMPLETE Column stats: COMPLETE
 Filter Operator
   predicate: ((_col2 = 0L) or (_col5 is null and _col0 is not null and (_col3 >= _col2))) (type: boolean)
-  Statistics: Num rows: 500 Data size: 97528 Basic stats: COMPLETE Column stats: COMPLETE
+  Statistics: Num rows: 500 Data size: 97716 Basic stats: COMPLETE Column stats: COMPLETE
   Select Operator
 expressions: _col0 (type: string), _col1 (type: string)
 outputColumnNames: _col0, _col1
@@ -373,12 +373,12 @@ STAGE PLANS:
   0 _col1 (type: string)
   1 _col0 (type: string)
 outputColumnNames: _col0, _col1, _col2, _col4, _col5
-Statistics: Num rows: 26 Data size: 6134 Basic stats: COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 26 Data size: 5814 Basic stats: COMPLETE Column stats: COMPLETE
 Reduce Output Operator
   key expressions: _col0 (type: string), _col1 (type: string)
   sort order: ++
   Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
-  Statistics: Num rows: 26 Data size: 6134 Basic stats: COMPLETE Column stats: COMPLETE
+  Statistics: Num rows: 26 Data size: 5814 Basic stats: COMPLETE Column stats: COMPLETE
   value expressions: _col2 (type: int), _col4 (type: bigint), _col5 (type: bigint)
 Reducer 3 
 Execution mode: llap
@@ -390,10 +390,10 @@ STAGE PLANS:
   0 _col0 (type: string), _col1 (type: string)
   1 _col0 (type: string), _col1 (type: string)
 outputColumnNames: _col0, _col1, _col2, _col4, _col5, _col8
-Statistics: Num rows: 26 Data size: 6154 Basic stats: COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 26 Data size: 5834 Basic stats: COMPLETE Column stats: COMPLETE
 Filter Operator
   predicate: CASE WHEN ((_col4 = 0L)) THEN (true) WHEN (_col4 is null) THEN (true) WHEN (_col8 is not null) THEN (false) WHEN (_col0 is null) THEN (null) WHEN ((_col5 < _col4)) THEN (false) ELSE (true) END (type: boolean)
-  Statistics: Num rows: 13 Data size: 3087 Basic stats: COMPLETE Column stats: COMPLETE
+  Statistics: Num rows: 13 Data size: 2927 Basic stats: COMPLETE Column stats: COMPLETE
   Select Operator
 expressions: _col1 (type: string), _col0 (type: string), _col2 (type: int)
 outputColumnNames: _col0, _col1, _col2
@@ -932,10 +932,10 @@ STAGE PLANS:
   0 _col1 (type: string)
   1 _col0 (type: string)
 outputColumnNames: _col0, _col1, _col2, _col4
-Statistics: Num rows: 26 Data size: 5966 Basic stats: COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 26 Data size: 5806 Basic stats: COMPLETE Column stats: COMPLETE
 Filter Operator
   predicate: (sq_count_check(CASE WHEN (_col4 is null) THEN (0) ELSE (_col4) END, true) > 0) (type: boolean)
-  Statistics: Num rows: 8 Data size: 1840 Basic stats: COMPLETE Column stats: COMPLETE
+  Statistics: Num rows: 8 Data size: 1792 Basic stats: COMPLETE Column stats: COMPLETE
   Select Operator
 expressions: _col0 (type: string), _col1 (type: string), _col2 (type: int)
 outputColumnNames: _col0, _col1, _col2
@@ -973,10 +973,10 @@ STAGE PLANS:
   0 _col1 (type: string), _col2 (type: int)
   1 _col1 (type: string), _col0 (type: int)
 outputColumnNames: _col0, _col1, _col2, _col4, _col5, _col8
-Statistics: Num rows: 8 Data size: 1932 Basic stats: COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 8 Data size: 1944 Basic 

[2/2] hive git commit: HIVE-20366: TPC-DS query78 stats estimates are off for is null filter(Vineet Garg, reviewed by Ashutosh Chauhan)

2018-08-20 Thread vgarg
HIVE-20366: TPC-DS query78 stats estimates are off for is null filter(Vineet 
Garg, reviewed by Ashutosh Chauhan)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/20baf490
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/20baf490
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/20baf490

Branch: refs/heads/master
Commit: 20baf490cbc109b17da6b6b1c0fd64ca32314e6f
Parents: f280361
Author: Vineet Garg 
Authored: Mon Aug 20 13:25:32 2018 -0700
Committer: Vineet Garg 
Committed: Mon Aug 20 13:25:32 2018 -0700

--
 .../stats/annotation/StatsRulesProcFactory.java |  87 ---
 .../clientpositive/annotate_stats_join.q.out    |   4 +-
 .../llap/bucket_map_join_tez2.q.out             |  16 +-
 .../clientpositive/llap/check_constraint.q.out  |   2 +-
 .../llap/correlationoptimizer1.q.out            |  24 +--
 .../clientpositive/llap/explainuser_1.q.out     |  24 +--
 .../llap/insert_into_default_keyword.q.out      |   2 +-
 .../results/clientpositive/llap/join46.q.out    |  36 ++---
 .../llap/join_emit_interval.q.out               |   4 +-
 .../results/clientpositive/llap/mapjoin46.q.out |  36 ++---
 .../llap/mapjoin_emit_interval.q.out            |   4 +-
 .../clientpositive/llap/subquery_in.q.out       |  28 ++--
 .../clientpositive/llap/subquery_multi.q.out    |  10 +-
 .../clientpositive/llap/subquery_notin.q.out    | 150 +--
 .../clientpositive/llap/subquery_scalar.q.out   |   8 +-
 .../clientpositive/llap/subquery_select.q.out   |  46 +++---
 .../clientpositive/llap/tez_join_tests.q.out    |  12 +-
 .../clientpositive/llap/tez_joins_explain.q.out |  12 +-
 .../clientpositive/llap/unionDistinct_1.q.out   |  18 +--
 .../clientpositive/llap/vector_coalesce_3.q.out |   2 +-
 .../llap/vector_groupby_mapjoin.q.out           |   4 +-
 .../llap/vector_outer_join0.q.out               |   8 +-
 .../clientpositive/llap/vectorized_join46.q.out |  24 +--
 .../spark/annotate_stats_join.q.out             |   4 +-
 .../spark/spark_explainuser_1.q.out             |  24 +--
 25 files changed, 317 insertions(+), 272 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/20baf490/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
--
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
index 7682791..9cd6812 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
@@ -1696,12 +1696,17 @@ public class StatsRulesProcFactory {
 }
 
 List<Long> distinctVals = Lists.newArrayList();
+
+// these ndvs are later used to compute unmatched rows and num of nulls for outer joins
+List<Long> ndvsUnmatched= Lists.newArrayList();
 long denom = 1;
+long denomUnmatched = 1;
 if (inferredRowCount == -1) {
   // failed to infer PK-FK relationship for row count estimation fall-back on default logic
   // compute denominator  max(V(R,y1), V(S,y1)) * max(V(R,y2), V(S,y2))
   // in case of multi-attribute join
   List<Long> perAttrDVs = Lists.newArrayList();
+  // go over each predicate
   for (int idx = 0; idx < numAttr; idx++) {
 for (Integer i : joinKeys.keySet()) {
   String col = joinKeys.get(i).get(idx);
@@ -1711,19 +1716,27 @@ public class StatsRulesProcFactory {
   }
 }
 distinctVals.add(getDenominator(perAttrDVs));
+ndvsUnmatched.add(getDenominatorForUnmatchedRows(perAttrDVs));
 perAttrDVs.clear();
   }
 
   if (numAttr > 1 && conf.getBoolVar(HiveConf.ConfVars.HIVE_STATS_CORRELATED_MULTI_KEY_JOINS)) {
 denom = Collections.max(distinctVals);
+denomUnmatched = denom - ndvsUnmatched.get(distinctVals.indexOf(denom));
   } else if (numAttr > numParent) {
 // To avoid denominator getting larger and aggressively reducing
 // number of rows, we will ease out denominator.
 denom = StatsUtils.addWithExpDecay(distinctVals);
+denomUnmatched = denom - StatsUtils.addWithExpDecay(ndvsUnmatched);
   } else {
 for (Long l : distinctVals) {
   denom = StatsUtils.safeMult(denom, l);
 }
+long tempDenom = 1;
+for (Long l : ndvsUnmatched) {
+  tempDenom = StatsUtils.safeMult(tempDenom, l);
+}
+denomUnmatched = denom - tempDenom;
   }
 }
 
@@ 
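
Note on the hunk above: the in-code comment describes the fall-back denominator for a multi-attribute join as max(V(R,y1), V(S,y1)) * max(V(R,y2), V(S,y2)), and the correlated-keys branch takes the largest per-attribute value instead. The following is only a minimal, self-contained sketch of those two cases, not the Hive implementation. The class name and every method in it are hypothetical; saturatingMult is an assumed stand-in for StatsUtils.safeMult, whose exact behaviour is not shown in this message. The exponential-decay branch and getDenominatorForUnmatchedRows are left out because the diff does not show their bodies. NDV lists are assumed to be non-empty and non-negative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class JoinDenominatorSketch {

  // Hypothetical stand-in for an overflow-safe multiply:
  // returns Long.MAX_VALUE instead of wrapping around (non-negative inputs assumed).
  static long saturatingMult(long a, long b) {
    try {
      return Math.multiplyExact(a, b);
    } catch (ArithmeticException e) {
      return Long.MAX_VALUE;
    }
  }

  // For one join attribute, take the largest NDV across the joined relations,
  // i.e. max(V(R,y), V(S,y)) from the comment in the diff.
  static long perAttributeMax(List<Long> ndvs) {
    return Collections.max(ndvs);
  }

  // Independent keys: multiply the per-attribute maxima together.
  static long productDenominator(List<List<Long>> ndvsPerAttribute) {
    long denom = 1;
    for (List<Long> ndvs : ndvsPerAttribute) {
      denom = saturatingMult(denom, perAttributeMax(ndvs));
    }
    return denom;
  }

  // Correlated keys: use only the largest per-attribute maximum (never below 1 here).
  static long correlatedDenominator(List<List<Long>> ndvsPerAttribute) {
    long denom = 1;
    for (List<Long> ndvs : ndvsPerAttribute) {
      denom = Math.max(denom, perAttributeMax(ndvs));
    }
    return denom;
  }

  public static void main(String[] args) {
    // Two join attributes; each inner list holds the NDV of that attribute in R and in S.
    List<List<Long>> ndvs = Arrays.asList(
        Arrays.asList(10L, 25L),
        Arrays.asList(4L, 2L));
    System.out.println(productDenominator(ndvs));    // 25 * 4
    System.out.println(correlatedDenominator(ndvs)); // max(25, 4)
  }
}
```

A larger denominator yields a smaller estimated join cardinality, which is why the correlated branch avoids multiplying every attribute's NDV.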

hive git commit: HIVE-20402: itest needs explicit dependency on hbase-common jar (Mike Drob, reviewed by Naveen Gangam)

2018-08-20 Thread ngangam
Repository: hive
Updated Branches:
  refs/heads/master 66f97da9d -> f28036137


HIVE-20402: itest needs explicit dependency on hbase-common jar (Mike Drob, 
reviewed by Naveen Gangam)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/f2803613
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/f2803613
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/f2803613

Branch: refs/heads/master
Commit: f280361374c6219d8734d5972c740d6d6c3fb7ef
Parents: 66f97da
Author: Naveen Gangam 
Authored: Mon Aug 20 11:14:09 2018 -0400
Committer: Naveen Gangam 
Committed: Mon Aug 20 11:14:09 2018 -0400

--
 itests/util/pom.xml | 6 ++
 1 file changed, 6 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/f2803613/itests/util/pom.xml
--
diff --git a/itests/util/pom.xml b/itests/util/pom.xml
index 9a36446..607fd47 100644
--- a/itests/util/pom.xml
+++ b/itests/util/pom.xml
@@ -217,6 +217,12 @@
 
 
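       <groupId>org.apache.hbase</groupId>
+      <artifactId>hbase-common</artifactId>
+      <version>${hbase.version}</version>
+      <classifier>tests</classifier>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hbase</groupId>
       <artifactId>hbase-server</artifactId>
       <version>${hbase.version}</version>
       <classifier>tests</classifier>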