[GitHub] [hive] maheshk114 opened a new pull request #569: HIVE-21446 : Hive Server going OOM during hive external table replications

2019-03-13 Thread GitBox
maheshk114 opened a new pull request #569: HIVE-21446 : Hive Server going OOM 
during hive external table replications
URL: https://github.com/apache/hive/pull/569
 
 
   …


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (HIVE-21446) Hive Server going OOM during hive external table replications

2019-03-13 Thread mahesh kumar behera (JIRA)
mahesh kumar behera created HIVE-21446:
--

 Summary: Hive Server going OOM during hive external table 
replications
 Key: HIVE-21446
 URL: https://issues.apache.org/jira/browse/HIVE-21446
 Project: Hive
  Issue Type: Bug
  Components: repl
Affects Versions: 4.0.0
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera
 Fix For: 4.0.0


The file system objects opened using proxy users are not closed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [hive] maheshk114 closed pull request #541: HIVE-21197 : Hive Replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled

2019-03-13 Thread GitBox
maheshk114 closed pull request #541: HIVE-21197 : Hive Replication can add 
duplicate data during migration to a target with hive.strict.managed.tables 
enabled
URL: https://github.com/apache/hive/pull/541
 
 
   




[GitHub] [hive] maheshk114 closed pull request #549: HIVE-21314 : Hive Replication not retaining the owner in the replicated table

2019-03-13 Thread GitBox
maheshk114 closed pull request #549: HIVE-21314 : Hive Replication not 
retaining the owner in the replicated table
URL: https://github.com/apache/hive/pull/549
 
 
   




[GitHub] [hive] maheshk114 closed pull request #558: HIVE-21325 : Hive external table replication failed with Permission denied issue.

2019-03-13 Thread GitBox
maheshk114 closed pull request #558: HIVE-21325 : Hive external table 
replication failed with Permission denied issue.
URL: https://github.com/apache/hive/pull/558
 
 
   




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265396797
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
 
 Review comment:
   It's just one lambda versus the cost of instantiating two lists per pair 
(plus, IMO, the code is clearer). In any case, if you think it is not worth 
changing, that's OK.




[GitHub] [hive] vineetgarg02 commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
vineetgarg02 commented on a change in pull request #567: HIVE-21382: Group by 
keys reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265396039
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
 
 Review comment:
   @jcamachor 
   
   > And in L408 you could just use loop (or a lambda would be nicer) to add 
the left side from the pairs in the list
   
   But wouldn't that add an extra loop?




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265392422
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
 
 Review comment:
   Where do you need additional lists? When you loop at L424, you are accessing 
both values. And at L408 you could just use a loop (or a lambda, which would be 
nicer) to add the left side from the pairs in the list. Or am I missing 
something?
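
   A minimal sketch of the list-of-pairs layout discussed in this comment, 
for illustration only. It is not code from the PR: `LineageLayouts`, 
`asPairs`, `baseColumns` and `groupByKeys` are hypothetical names, and 
`java.util.Map.Entry` stands in for Calcite's `Pair` so that the example 
compiles with the JDK alone.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration; not part of HiveRelFieldTrimmer.
final class LineageLayouts {

  // One (baseColumn, groupByKey) entry per backtracked key, instead of two parallel lists.
  static List<Map.Entry<Integer, Integer>> asPairs(int[] baseCols, int[] gbKeys) {
    List<Map.Entry<Integer, Integer>> pairs = new ArrayList<>();
    for (int i = 0; i < baseCols.length; i++) {
      pairs.add(new SimpleImmutableEntry<>(baseCols[i], gbKeys[i]));
    }
    return pairs;
  }

  // The "left side" of every pair: the base-table column indexes.
  static List<Integer> baseColumns(List<Map.Entry<Integer, Integer>> pairs) {
    return pairs.stream().map(Map.Entry::getKey).collect(Collectors.toList());
  }

  // The "right side" of every pair: the corresponding group-by key positions.
  static List<Integer> groupByKeys(List<Map.Entry<Integer, Integer>> pairs) {
    return pairs.stream().map(Map.Entry::getValue).collect(Collectors.toList());
  }
}
```

   The trade-off raised in the thread is visible here: the pair layout keeps 
each base column next to its group-by key, at the price of one extra pass 
whenever a plain list of either side is needed.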




[jira] [Created] (HIVE-21445) Support range check for DECIMAL type in stats annotation

2019-03-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21445:
--

 Summary: Support range check for DECIMAL type in stats annotation
 Key: HIVE-21445
 URL: https://issues.apache.org/jira/browse/HIVE-21445
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer, Statistics
Affects Versions: 4.0.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[GitHub] [hive] vineetgarg02 commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
vineetgarg02 commented on a change in pull request #567: HIVE-21382: Group by 
keys reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265391015
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+  if(mapGBKeysLineage.containsKey(baseTable)) {
+List baseCol = mapGBKeysLineage.get(baseTable).left;
+baseCol.add(tblRef.getIndex());
+List gbKey = mapGBKeysLineage.get(baseTable).right;
+gbKey.add(key);
+  } else {
+List baseCol = new ArrayList<>();
+baseCol.add(tblRef.getIndex());
+List gbKey = new ArrayList<>();
+gbKey.add(key);
+mapGBKeysLineage.put(baseTable, Pair.of(baseCol, gbKey));
+  }
+} else if(RexUtil.isDeterministic(expr)){
+  // even though we weren't able to backtrack this key it could still 
be candidate for removal
+  // if rest of the columns contain pk/unique
+  TableRefFinder finder = new TableRefFinder();
+  expr.accept(finder);
+  Set tableRefs = finder.getTableRefs();
+  if(tableRefs.size() == 1) {
+RexTableInputRef tblRef = tableRefs.iterator().next();
+Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+if(candidateKeys.containsKey(baseTable)) {
+  List candidateGBKeys = candidateKeys.get(baseTable);
+  candidateGBKeys.add(key);
+} else {
+  List candidateGBKeys =  new ArrayList<>();
+  candidateGBKeys.add(key);
+  candidateKeys.put(baseTable, candidateGBKeys);
+}
+  }
+}
   }
 }
-if(currentKey == null || currentKey.isEmpty()) {
-  return originalGroupSet;
-}
 
 // we want to delete all columns in original GB set except the key
 ImmutableBitSet.Builder builder = ImmutableBitSet.builder();
 
-// we have established that this gb set contains keys and it is safe to 
remove rest of the columns
-for(int i=0; i, Pair, 
List>> entry:mapGBKeysLineage.entrySet()) {
+  RelOptHiveTable tbl = (RelOptHiveTable)entry.getKey().left;
+  List backtrackedGBList = entry.getValue().left;
+  List gbKeys = 

[GitHub] [hive] vineetgarg02 commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
vineetgarg02 commented on a change in pull request #567: HIVE-21382: Group by 
keys reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265390982
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
 
 Review comment:
   This has to be kept this way because later, for a given table, we need a 
list of all the backtracked keys and a list of all the corresponding GB keys. 
If we keep a list of pairs, we will have to build those lists from it later.
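
   One plausible way the two parallel lists could be consumed, sketched 
under assumptions. The hunk above is truncated before that code, so this is 
not the PR's implementation. `KeyReductionSketch` and `reduce` are 
hypothetical names, `java.util.BitSet` stands in for `ImmutableBitSet`, and 
the `fieldsUsed` check mentioned in the method comment is left out.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration; not part of HiveRelFieldTrimmer.
final class KeyReductionSketch {

  // baseCols.get(i) is the base-table column that group-by position gbKeys.get(i) was
  // backtracked to. Returns the group-by positions to keep: those belonging to the first
  // table key fully covered by the backtracked columns, or every position if none is covered.
  static List<Integer> reduce(List<Integer> baseCols, List<Integer> gbKeys,
      List<BitSet> tableKeys) {
    BitSet backtracked = new BitSet();
    for (int col : baseCols) {
      backtracked.set(col);
    }
    for (BitSet key : tableKeys) {
      if (key.isEmpty()) {
        continue;
      }
      // The key is covered when none of its columns is missing from the group-by.
      BitSet missing = (BitSet) key.clone();
      missing.andNot(backtracked);
      if (missing.isEmpty()) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < baseCols.size(); i++) {
          if (key.get(baseCols.get(i))) {
            kept.add(gbKeys.get(i));
          }
        }
        return kept;
      }
    }
    return new ArrayList<>(gbKeys);
  }
}
```

   With two index-aligned lists, mapping a matched key back to group-by 
positions is a single positional pass, which is the convenience this comment 
argues for.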




[jira] [Created] (HIVE-21444) Additional tests for materialized view rewriting

2019-03-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21444:
--

 Summary: Additional tests for materialized view rewriting
 Key: HIVE-21444
 URL: https://issues.apache.org/jira/browse/HIVE-21444
 Project: Hive
  Issue Type: Test
  Components: CBO, Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez








[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265281278
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+  if(mapGBKeysLineage.containsKey(baseTable)) {
+List baseCol = mapGBKeysLineage.get(baseTable).left;
+baseCol.add(tblRef.getIndex());
+List gbKey = mapGBKeysLineage.get(baseTable).right;
+gbKey.add(key);
+  } else {
+List baseCol = new ArrayList<>();
+baseCol.add(tblRef.getIndex());
+List gbKey = new ArrayList<>();
+gbKey.add(key);
+mapGBKeysLineage.put(baseTable, Pair.of(baseCol, gbKey));
+  }
+} else if(RexUtil.isDeterministic(expr)){
+  // even though we weren't able to backtrack this key it could still 
be candidate for removal
+  // if rest of the columns contain pk/unique
+  TableRefFinder finder = new TableRefFinder();
+  expr.accept(finder);
+  Set tableRefs = finder.getTableRefs();
+  if(tableRefs.size() == 1) {
+RexTableInputRef tblRef = tableRefs.iterator().next();
+Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+if(candidateKeys.containsKey(baseTable)) {
+  List candidateGBKeys = candidateKeys.get(baseTable);
+  candidateGBKeys.add(key);
+} else {
+  List candidateGBKeys =  new ArrayList<>();
+  candidateGBKeys.add(key);
+  candidateKeys.put(baseTable, candidateGBKeys);
+}
+  }
+}
   }
 }
-if(currentKey == null || currentKey.isEmpty()) {
-  return originalGroupSet;
-}
 
 // we want to delete all columns in original GB set except the key
 ImmutableBitSet.Builder builder = ImmutableBitSet.builder();
 
-// we have established that this gb set contains keys and it is safe to 
remove rest of the columns
-for(int i=0; i, Pair, 
List>> entry:mapGBKeysLineage.entrySet()) {
+  RelOptHiveTable tbl = (RelOptHiveTable)entry.getKey().left;
+  List backtrackedGBList = entry.getValue().left;
+  List gbKeys = 

[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265254408
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
 
 Review comment:
   ```ImmutableBitSet``` implements ```Iterable<Integer>```, hence 
we can just write ```for (int key : originalGroupSet)```.
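
   A small self-contained sketch of the suggestion, for illustration only. 
`SimpleBitSet` is a hypothetical stand-in for Calcite's `ImmutableBitSet` 
(built on `java.util.BitSet` so that it runs without Calcite), and 
`GroupSetIteration` is likewise a made-up name. It compares the explicit 
iterator loop from the hunk with the enhanced `for` loop.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for ImmutableBitSet: iterates the set bits in ascending order.
final class SimpleBitSet implements Iterable<Integer> {
  private final BitSet bits = new BitSet();

  static SimpleBitSet of(int... indexes) {
    SimpleBitSet set = new SimpleBitSet();
    for (int index : indexes) {
      set.bits.set(index);
    }
    return set;
  }

  @Override
  public Iterator<Integer> iterator() {
    return new Iterator<Integer>() {
      private int next = bits.nextSetBit(0);

      @Override
      public boolean hasNext() {
        return next >= 0;
      }

      @Override
      public Integer next() {
        if (next < 0) {
          throw new NoSuchElementException();
        }
        int current = next;
        next = bits.nextSetBit(next + 1);
        return current;
      }
    };
  }
}

final class GroupSetIteration {

  // Shape of the loop in the hunk: explicit Iterator plus boxed Integer.
  static List<Integer> withExplicitIterator(SimpleBitSet groupSet) {
    List<Integer> keys = new ArrayList<>();
    Iterator<Integer> iterator = groupSet.iterator();
    while (iterator.hasNext()) {
      Integer key = iterator.next();
      keys.add(key);
    }
    return keys;
  }

  // Shape suggested in the review: enhanced for loop with an unboxed int.
  static List<Integer> withForEach(SimpleBitSet groupSet) {
    List<Integer> keys = new ArrayList<>();
    for (int key : groupSet) {
      keys.add(key);
    }
    return keys;
  }
}
```

   Both methods visit the same keys in the same order. The enhanced loop 
also removes the need for `key.intValue()` where the index is passed on.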


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265274235
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+  if(mapGBKeysLineage.containsKey(baseTable)) {
+List baseCol = mapGBKeysLineage.get(baseTable).left;
+baseCol.add(tblRef.getIndex());
+List gbKey = mapGBKeysLineage.get(baseTable).right;
+gbKey.add(key);
+  } else {
+List baseCol = new ArrayList<>();
+baseCol.add(tblRef.getIndex());
+List gbKey = new ArrayList<>();
+gbKey.add(key);
+mapGBKeysLineage.put(baseTable, Pair.of(baseCol, gbKey));
+  }
+} else if(RexUtil.isDeterministic(expr)){
+  // even though we weren't able to backtrack this key it could still 
be candidate for removal
+  // if rest of the columns contain pk/unique
+  TableRefFinder finder = new TableRefFinder();
+  expr.accept(finder);
+  Set tableRefs = finder.getTableRefs();
+  if(tableRefs.size() == 1) {
+RexTableInputRef tblRef = tableRefs.iterator().next();
+Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+if(candidateKeys.containsKey(baseTable)) {
+  List candidateGBKeys = candidateKeys.get(baseTable);
+  candidateGBKeys.add(key);
+} else {
+  List candidateGBKeys =  new ArrayList<>();
+  candidateGBKeys.add(key);
+  candidateKeys.put(baseTable, candidateGBKeys);
+}
+  }
+}
   }
 }
-if(currentKey == null || currentKey.isEmpty()) {
-  return originalGroupSet;
-}
 
 // we want to delete all columns in original GB set except the key
 ImmutableBitSet.Builder builder = ImmutableBitSet.builder();
 
-// we have established that this gb set contains keys and it is safe to 
remove rest of the columns
-for(int i=0; i, Pair, 
List>> entry:mapGBKeysLineage.entrySet()) {
+  RelOptHiveTable tbl = (RelOptHiveTable)entry.getKey().left;
+  List backtrackedGBList = entry.getValue().left;
+  List gbKeys = 

[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265277319
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+  if(mapGBKeysLineage.containsKey(baseTable)) {
+List baseCol = mapGBKeysLineage.get(baseTable).left;
+baseCol.add(tblRef.getIndex());
+List gbKey = mapGBKeysLineage.get(baseTable).right;
+gbKey.add(key);
+  } else {
+List baseCol = new ArrayList<>();
+baseCol.add(tblRef.getIndex());
+List gbKey = new ArrayList<>();
+gbKey.add(key);
+mapGBKeysLineage.put(baseTable, Pair.of(baseCol, gbKey));
+  }
+} else if(RexUtil.isDeterministic(expr)){
+  // even though we weren't able to backtrack this key it could still 
be candidate for removal
+  // if rest of the columns contain pk/unique
+  TableRefFinder finder = new TableRefFinder();
+  expr.accept(finder);
+  Set tableRefs = finder.getTableRefs();
+  if(tableRefs.size() == 1) {
+RexTableInputRef tblRef = tableRefs.iterator().next();
+Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+if(candidateKeys.containsKey(baseTable)) {
+  List candidateGBKeys = candidateKeys.get(baseTable);
+  candidateGBKeys.add(key);
+} else {
+  List candidateGBKeys =  new ArrayList<>();
+  candidateGBKeys.add(key);
+  candidateKeys.put(baseTable, candidateGBKeys);
+}
+  }
+}
   }
 }
-if(currentKey == null || currentKey.isEmpty()) {
-  return originalGroupSet;
-}
 
 // we want to delete all columns in original GB set except the key
 ImmutableBitSet.Builder builder = ImmutableBitSet.builder();
 
-// we have established that this gb set contains keys and it is safe to 
remove rest of the columns
-for(int i=0; i, Pair, 
List>> entry:mapGBKeysLineage.entrySet()) {
+  RelOptHiveTable tbl = (RelOptHiveTable)entry.getKey().left;
+  List backtrackedGBList = entry.getValue().left;
+  List gbKeys = 

[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265249031
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+  if(mapGBKeysLineage.containsKey(baseTable)) {
+List baseCol = mapGBKeysLineage.get(baseTable).left;
+baseCol.add(tblRef.getIndex());
+List gbKey = mapGBKeysLineage.get(baseTable).right;
+gbKey.add(key);
+  } else {
+List baseCol = new ArrayList<>();
+baseCol.add(tblRef.getIndex());
+List gbKey = new ArrayList<>();
+gbKey.add(key);
+mapGBKeysLineage.put(baseTable, Pair.of(baseCol, gbKey));
+  }
+} else if(RexUtil.isDeterministic(expr)){
+  // even though we weren't able to backtrack this key it could still 
be candidate for removal
+  // if rest of the columns contain pk/unique
+  TableRefFinder finder = new TableRefFinder();
 
 Review comment:
   We should use ```RexUtil.gatherTableReferences``` for the same purpose so we do 
not need to introduce ```TableRefFinder```.




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265261615
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
 
 Review comment:
   Please add a comment explaining what this map contains (especially the 
value).




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265277655
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
 
 Review comment:
   Same as above.




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265274642
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+  if(mapGBKeysLineage.containsKey(baseTable)) {
+List baseCol = mapGBKeysLineage.get(baseTable).left;
+baseCol.add(tblRef.getIndex());
+List gbKey = mapGBKeysLineage.get(baseTable).right;
+gbKey.add(key);
+  } else {
+List baseCol = new ArrayList<>();
+baseCol.add(tblRef.getIndex());
+List gbKey = new ArrayList<>();
+gbKey.add(key);
+mapGBKeysLineage.put(baseTable, Pair.of(baseCol, gbKey));
+  }
+} else if(RexUtil.isDeterministic(expr)){
+  // even though we weren't able to backtrack this key it could still 
be candidate for removal
+  // if rest of the columns contain pk/unique
+  TableRefFinder finder = new TableRefFinder();
+  expr.accept(finder);
+  Set tableRefs = finder.getTableRefs();
+  if(tableRefs.size() == 1) {
+RexTableInputRef tblRef = tableRefs.iterator().next();
+Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
+if(candidateKeys.containsKey(baseTable)) {
+  List candidateGBKeys = candidateKeys.get(baseTable);
+  candidateGBKeys.add(key);
+} else {
+  List candidateGBKeys =  new ArrayList<>();
+  candidateGBKeys.add(key);
+  candidateKeys.put(baseTable, candidateGBKeys);
+}
+  }
+}
   }
 }
-if(currentKey == null || currentKey.isEmpty()) {
-  return originalGroupSet;
-}
 
 // we want to delete all columns in original GB set except the key
 ImmutableBitSet.Builder builder = ImmutableBitSet.builder();
 
-// we have established that this gb set contains keys and it is safe to 
remove rest of the columns
-for(int i=0; i, Pair, 
List>> entry:mapGBKeysLineage.entrySet()) {
+  RelOptHiveTable tbl = (RelOptHiveTable)entry.getKey().left;
+  List backtrackedGBList = entry.getValue().left;
+  List gbKeys = 

[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265279555
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
 
 Review comment:
   ```List>``` seems more intuitive/readable than holding 
two lists in the value?
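
A rough sketch of the alternative shape this comment appears to suggest. The type parameters in the comment were lost in the archive, so the pairing below is an assumption: each list element is taken to be a (base-table column index, group-by key index) pair. Tables are represented by plain strings and `Map.Entry` stands in for a pair type; `GroupByLineageDemo` and `buildLineage` are hypothetical names used only for illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GroupByLineageDemo {

  // For each table (a String here, as a stand-in), keep one list of
  // (base column index, group-by key index) pairs instead of two parallel lists.
  static Map<String, List<Map.Entry<Integer, Integer>>> buildLineage(
      String[] tables, int[] baseColumns, int[] groupByKeys) {
    Map<String, List<Map.Entry<Integer, Integer>>> lineage = new HashMap<>();
    for (int i = 0; i < tables.length; i++) {
      Map.Entry<Integer, Integer> pair =
          new SimpleImmutableEntry<Integer, Integer>(baseColumns[i], groupByKeys[i]);
      // computeIfAbsent replaces the containsKey/else branches of the patch.
      lineage.computeIfAbsent(tables[i], t -> new ArrayList<>()).add(pair);
    }
    return lineage;
  }

  public static void main(String[] args) {
    Map<String, List<Map.Entry<Integer, Integer>>> lineage = buildLineage(
        new String[] {"t1", "t2", "t1"},
        new int[] {0, 3, 1},
        new int[] {4, 5, 6});
    System.out.println(lineage);
  }
}
```

Keeping each column index next to its group-by key means the two values cannot drift out of step, which is the risk with two parallel lists held in one value.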




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265260514
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
 
 Review comment:
   Unless I am mistaken, we should pass ```aggregate.getInput()``` to the method, 
since ```inputRef``` references the input.




[GitHub] [hive] jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys reduction optimization - keys are not reduced in query23

2019-03-13 Thread GitBox
jcamachor commented on a change in pull request #567: HIVE-21382: Group by keys 
reduction optimization - keys are not reduced in query23
URL: https://github.com/apache/hive/pull/567#discussion_r265252070
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 ##
 @@ -315,48 +322,128 @@ private boolean isRexLiteral(final RexNode rexNode) {
   }
 
 
+  private static class TableRefFinder extends RexVisitorImpl {
+private Set tableRefs = null;
+TableRefFinder() {
+  super(true);
+  this.tableRefs = new HashSet<>();
+}
+
+public Set getTableRefs() {
+  return this.tableRefs;
+}
+
+@Override
+public Void visitTableInputRef(RexTableInputRef ref) {
+  this.tableRefs.add(ref);
+  return null;
+}
+  }
+
   // Given a groupset this tries to find out if the cardinality of the 
grouping columns could have changed
   // because if not and it consist of keys (unique + not null OR pk), we can 
safely remove rest of the columns
   // if those are columns are not being used further up
   private ImmutableBitSet generateGroupSetIfCardinalitySame(final Aggregate 
aggregate,
 final ImmutableBitSet 
originalGroupSet, final ImmutableBitSet fieldsUsed) {
-Pair> tabToOrgCol = 
HiveRelOptUtil.getColumnOriginSet(aggregate.getInput(),
-   
  originalGroupSet);
-if(tabToOrgCol == null) {
-  return originalGroupSet;
-}
-RelOptHiveTable tbl = (RelOptHiveTable)tabToOrgCol.left;
-List backtrackedGBList = tabToOrgCol.right;
-ImmutableBitSet backtrackedGBSet = 
ImmutableBitSet.builder().addAll(backtrackedGBList).build();
 
-List allKeys = tbl.getNonNullableKeys();
-ImmutableBitSet currentKey = null;
-for(ImmutableBitSet key:allKeys) {
-  if(backtrackedGBSet.contains(key)) {
-// only if grouping sets consist of keys
-currentKey = key;
-break;
+RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
+RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery();
+
+Iterator iterator = originalGroupSet.iterator();
+Map, Pair, List>> 
mapGBKeysLineage= new HashMap<>();
+
+Map, List> candidateKeys = new 
HashMap<>();
+
+while(iterator.hasNext()) {
+  Integer key = iterator.next();
+  RexNode inputRef = rexBuilder.makeInputRef(aggregate.getInput(), 
key.intValue());
+  Set exprLineage = mq.getExpressionLineage(aggregate, inputRef);
+  if(exprLineage != null && exprLineage.size() == 1){
+RexNode expr = exprLineage.iterator().next();
+if(expr instanceof RexTableInputRef) {
+  RexTableInputRef tblRef = (RexTableInputRef)expr;
+  Pair baseTable = 
Pair.of(tblRef.getTableRef().getTable(), 
tblRef.getTableRef().getEntityNumber());
 
 Review comment:
   We do not need to use ```Pair```; we can call ```getTableRef()``` and use 
```RelTableRef``` directly as a key.
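
A small sketch of why a table reference can serve as the map key on its own. `TableRef` below is a hypothetical stand-in, not Calcite's `RelTableRef`; the sketch assumes the real class has value-based `equals` and `hashCode` over the table's qualified name and entity number, which is what the suggestion relies on. `TableRefKeyDemo` and `groupKeysByTable` are illustrative names only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class TableRefKeyDemo {

  // Hypothetical value object: one occurrence of a table in a plan.
  static final class TableRef {
    private final String qualifiedName;
    private final int entityNumber;

    TableRef(String qualifiedName, int entityNumber) {
      this.qualifiedName = qualifiedName;
      this.entityNumber = entityNumber;
    }

    @Override
    public boolean equals(Object other) {
      if (this == other) {
        return true;
      }
      if (!(other instanceof TableRef)) {
        return false;
      }
      TableRef that = (TableRef) other;
      return entityNumber == that.entityNumber
          && qualifiedName.equals(that.qualifiedName);
    }

    @Override
    public int hashCode() {
      return Objects.hash(qualifiedName, entityNumber);
    }

    @Override
    public String toString() {
      return qualifiedName + "#" + entityNumber;
    }
  }

  // Groups group-by key indexes by the table occurrence they trace back to,
  // using the reference itself as the key rather than a (table, number) pair.
  static Map<TableRef, List<Integer>> groupKeysByTable(TableRef[] refs, int[] keys) {
    Map<TableRef, List<Integer>> byTable = new HashMap<>();
    for (int i = 0; i < refs.length; i++) {
      byTable.computeIfAbsent(refs[i], r -> new ArrayList<>()).add(keys[i]);
    }
    return byTable;
  }

  public static void main(String[] args) {
    TableRef[] refs = {
        new TableRef("db.t1", 0), new TableRef("db.t1", 0), new TableRef("db.t1", 1)
    };
    System.out.println(groupKeysByTable(refs, new int[] {2, 3, 4}));
  }
}
```

Two distinct `TableRef` instances with the same name and entity number land in the same bucket, while a second occurrence of the same table (a different entity number) gets its own entry.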




[jira] [Created] (HIVE-21443) Better usability for SHOW COMPACTIONS

2019-03-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HIVE-21443:
--

 Summary: Better usability for SHOW COMPACTIONS
 Key: HIVE-21443
 URL: https://issues.apache.org/jira/browse/HIVE-21443
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Reporter: Todd Lipcon


Currently on a test cluster the output of 'SHOW COMPACTIONS' has 117k rows. 
This makes it basically useless to work with.

For better usability, we should support syntax like 'SHOW COMPACTIONS IN 
' or maybe 'SHOW COMPACTIONS ON ' (particular syntax to be 
chosen for consistency with other operations I suppose).

Alternatively (or maybe in addition), it would be nice to expose 
the same data in a queryable table (e.g. in information_schema or a system 
namespace) so that I could do things like: SELECT dbname, state, count(*) from 
compactions group by 1,2;



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21442) Malformed ORC created by Hive Streaming

2019-03-13 Thread Gideon Korir (JIRA)
Gideon Korir created HIVE-21442:
---

 Summary: Malformed ORC created by Hive Streaming
 Key: HIVE-21442
 URL: https://issues.apache.org/jira/browse/HIVE-21442
 Project: Hive
  Issue Type: Bug
Reporter: Gideon Korir








Re: Review Request 70193: HIVE-15406 Consider vectorizing the new 'trunc' function

2019-03-13 Thread Laszlo Bodor

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70193/
---

(Updated March 13, 2019, 2:39 p.m.)


Review request for hive and Teddy Choi.


Repository: hive-git


Description
---

HIVE-15406 Consider vectorizing the new 'trunc' function


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromDate.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromString.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromTimestamp.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDecimal.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDecimalNoScale.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncFloat.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncFloatNoScale.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java 
7a7d13ef41 
  ql/src/test/queries/clientpositive/vector_udf_trunc.q PRE-CREATION 
  ql/src/test/results/clientpositive/vector_udf_trunc.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/70193/diff/2/

Changes: https://reviews.apache.org/r/70193/diff/1-2/


Testing
---

added vector_udf_trunc.q for testing


Thanks,

Laszlo Bodor



[jira] [Created] (HIVE-21441) TestCliDriver#groupby_ppr fails

2019-03-13 Thread Laszlo Bodor (JIRA)
Laszlo Bodor created HIVE-21441:
---

 Summary: TestCliDriver#groupby_ppr fails
 Key: HIVE-21441
 URL: https://issues.apache.org/jira/browse/HIVE-21441
 Project: Hive
  Issue Type: Bug
Reporter: Laszlo Bodor








[jira] [Created] (HIVE-21440) Fix test_teradatabinaryfile to not run into stackoverflows

2019-03-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-21440:
---

 Summary: Fix test_teradatabinaryfile to not run into stackoverflows
 Key: HIVE-21440
 URL: https://issues.apache.org/jira/browse/HIVE-21440
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
 Attachments: teradata.samle.junit.xml

This test seems to be failing in recent runs; a closer look suggests that 
it may be a Kryo-related stack overflow.

{code}
Caused by: java.lang.IllegalArgumentException: Unable to create serializer "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for class: org.apache.hadoop.hive.ql.io.TeradataBinaryFileOutputFormat
    at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:67)
    at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:380)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:364)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:74)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:490)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:97)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:517)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.write(DefaultSerializers.java:321)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.write(DefaultSerializers.java:314)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:606)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:87)
    ... 104 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedConstructorAccessor101.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:54)
    ... 115 more
Caused by: java.lang.StackOverflowError
    at java.util.HashMap.hash(HashMap.java:338)
    at java.util.HashMap.get(HashMap.java:556)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
    at org.apache.hive.com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
{code}
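
The repeated `Generics.getConcreteClass` frame in the trace above points to unbounded recursion. One way such a trace can arise is a lookup that delegates to a parent scope when the parent chain contains a cycle. This is an assumption for illustration, not a diagnosis of the Kryo code. The sketch below uses a hypothetical `ScopeSketch` class (not Kryo's) to show the failure and a cycle-safe iterative lookup:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a generics scope that delegates unresolved
// type variables to a parent scope. This is not Kryo's actual class.
final class ScopeSketch {
  private final Map<String, Class<?>> typeVarToClass = new HashMap<>();
  private ScopeSketch parent;

  void setParent(ScopeSketch parent) {
    this.parent = parent;
  }

  void bind(String typeVar, Class<?> type) {
    typeVarToClass.put(typeVar, type);
  }

  // Recursive lookup: terminates only if the parent chain is acyclic.
  Class<?> getConcreteClass(String typeVar) {
    Class<?> type = typeVarToClass.get(typeVar);
    if (type == null && parent != null) {
      return parent.getConcreteClass(typeVar);
    }
    return type;
  }

  // Iterative lookup that stops as soon as a scope is visited twice.
  Class<?> getConcreteClassGuarded(String typeVar) {
    Set<ScopeSketch> seen = Collections.newSetFromMap(new IdentityHashMap<>());
    for (ScopeSketch s = this; s != null && seen.add(s); s = s.parent) {
      Class<?> type = s.typeVarToClass.get(typeVar);
      if (type != null) {
        return type;
      }
    }
    return null;
  }
}
```

With two scopes that name each other as parent, the recursive lookup of an unbound variable overflows the stack, while the guarded version returns `null`.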





[jira] [Created] (HIVE-21439) Provide an option to reduce lookup overhead for bucketed tables

2019-03-13 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-21439:
---

 Summary: Provide an option to reduce lookup overhead for bucketed 
tables
 Key: HIVE-21439
 URL: https://issues.apache.org/jira/browse/HIVE-21439
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan


If a table is bucketed, `OpTraitsRulesProcFactory::TableScanRule` ends up 
verifying whether each partition has the same number of files as the number of 
buckets in the table. 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java#L185

For large tables, this turns out to be a very time-consuming operation. It would 
be good to have an option to bypass this check when needed.
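
A rough sketch of what such an option could look like, under stated assumptions: the property name `example.bucket.filecount.check.skip` is hypothetical and is not an existing Hive setting, and the file counter is passed in as a function so the example stays self-contained. When the flag is set, the per-partition file listing is skipped and the table metadata is trusted.

```java
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

final class BucketCheckSketch {
  // Hypothetical property name, used only for this illustration.
  static final String SKIP_CHECK_KEY = "example.bucket.filecount.check.skip";

  // Returns true when bucketing traits may be used for the scan.
  static boolean bucketingUsable(Map<String, String> conf,
                                 List<String> partitionPaths,
                                 int numBuckets,
                                 ToIntFunction<String> fileCounter) {
    if (Boolean.parseBoolean(conf.getOrDefault(SKIP_CHECK_KEY, "false"))) {
      // Bypass: no file system calls are made; metadata is trusted.
      return true;
    }
    for (String path : partitionPaths) {
      // One listing per partition; this is the costly part on large tables.
      if (fileCounter.applyAsInt(path) != numBuckets) {
        return false;
      }
    }
    return true;
  }
}
```

The trade-off is that with the check bypassed, a partition whose file count does not match the bucket count would go undetected, so such a flag would presumably default to off.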





[jira] [Created] (HIVE-21438) HiveStreamingConnection.toString doesn't print transaction batch

2019-03-13 Thread JIRA
Stig Rohde Døssing created HIVE-21438:
-

 Summary: HiveStreamingConnection.toString doesn't print 
transaction batch
 Key: HIVE-21438
 URL: https://issues.apache.org/jira/browse/HIVE-21438
 Project: Hive
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 3.1.1
Reporter: Stig Rohde Døssing


HiveStreamingConnection.toString doesn't contain the current transaction state. 
In hive-hcatalog-streaming, the transaction batch was exposed to the user, 
which allowed the application to log e.g. transaction id and state when errors 
occur. Some exceptions from TransactionBatch contain the current transaction 
id, but many don't.

It would be nice if HiveStreamingConnection.toString also included the 
currentTransactionBatch.toString.
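
A minimal sketch of the requested behaviour, using hypothetical stand-in classes (`ConnectionSketch`, `TxnBatchSketch`) rather than the real Hive streaming classes. The field names are assumptions. The connection's `toString` simply appends the current batch, so log lines carry the transaction id and state:

```java
// Hypothetical stand-in for a transaction batch.
final class TxnBatchSketch {
  private final long currentTxnId;
  private final String state;

  TxnBatchSketch(long currentTxnId, String state) {
    this.currentTxnId = currentTxnId;
    this.state = state;
  }

  @Override
  public String toString() {
    return "TxnBatch{txnId=" + currentTxnId + ", state=" + state + "}";
  }
}

// Hypothetical stand-in for the streaming connection.
final class ConnectionSketch {
  private final String database;
  private final String table;
  // Stays null until a transaction has been started.
  private TxnBatchSketch currentTransactionBatch;

  ConnectionSketch(String database, String table) {
    this.database = database;
    this.table = table;
  }

  void beginTransaction(long txnId) {
    currentTransactionBatch = new TxnBatchSketch(txnId, "OPEN");
  }

  @Override
  public String toString() {
    // String concatenation renders a null batch as "null".
    return "Connection{database=" + database
        + ", table=" + table
        + ", currentTransactionBatch=" + currentTransactionBatch + "}";
  }
}
```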





[jira] [Created] (HIVE-21437) Vectorization: Decimal division with integer columns

2019-03-13 Thread Gopal V (JIRA)
Gopal V created HIVE-21437:
--

 Summary: Vectorization: Decimal division with integer columns
 Key: HIVE-21437
 URL: https://issues.apache.org/jira/browse/HIVE-21437
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Gopal V


Vectorizer fails for

{code}
CREATE temporary TABLE `catalog_Sales`(
  `cs_quantity` int, 
  `cs_wholesale_cost` decimal(7,2), 
  `cs_list_price` decimal(7,2), 
  `cs_sales_price` decimal(7,2), 
  `cs_ext_discount_amt` decimal(7,2), 
  `cs_ext_sales_price` decimal(7,2), 
  `cs_ext_wholesale_cost` decimal(7,2), 
  `cs_ext_list_price` decimal(7,2), 
  `cs_ext_tax` decimal(7,2), 
  `cs_coupon_amt` decimal(7,2), 
  `cs_ext_ship_cost` decimal(7,2), 
  `cs_net_paid` decimal(7,2), 
  `cs_net_paid_inc_tax` decimal(7,2), 
  `cs_net_paid_inc_ship` decimal(7,2), 
  `cs_net_paid_inc_ship_tax` decimal(7,2), 
  `cs_net_profit` decimal(7,2))
 ;

explain vectorization detail select max((((cs_ext_list_price - 
cs_ext_wholesale_cost) - cs_ext_discount_amt) + cs_ext_sales_price) / 2) from 
catalog_sales;
{code}

{code}
'Map Vectorization:'
'enabled: true'
'enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true'
'inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
'notVectorizedReason: SELECT operator: Could not instantiate 
DecimalColDivideDecimalScalar with arguments arguments: [21, 20, 22], argument 
classes: [Integer, Integer, Integer], exception: 
java.lang.IllegalArgumentException: java.lang.ClassCastException@63b56be0 stack 
trace: sun.reflect.GeneratedConstructorAccessor.newInstance(Unknown 
Source), 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45),
 java.lang.reflect.Constructor.newInstance(Constructor.java:423), 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.instantiateExpression(VectorizationContext.java:2088),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4662),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4602),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.vectorizeSelectOperator(Vectorizer.java:4584),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperator(Vectorizer.java:5171),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChild(Vectorizer.java:923),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChildren(Vectorizer.java:809),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperatorTree(Vectorizer.java:776),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.access$2400(Vectorizer.java:240),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2038),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:1990),
 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(Vectorizer.java:1963),
 ...'
'vectorized: false'
{code}





[GitHub] [hive] ShubhamChaurasia opened a new pull request #568: HIVE-21435: LlapBaseInputFormat should get task number from conf

2019-03-13 Thread GitBox
ShubhamChaurasia opened a new pull request #568: HIVE-21435: 
LlapBaseInputFormat should get task number from conf
URL: https://github.com/apache/hive/pull/568
 
 
   LlapBaseInputFormat should get task number from TASK_ATTEMPT_ID conf if 
present, while building SubmitWorkRequestProto
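
A hedged sketch of the idea in the description above, not the code from the pull request. It assumes the attempt id is stored under the key `mapreduce.task.attempt.id` and has the usual `attempt_<cluster>_<job>_<type>_<task>_<attempt>` shape; both are assumptions to verify against the Hadoop version in use. The configuration is modelled as a plain map, and the class and method names are hypothetical. If the key is absent or malformed, a caller-supplied fallback is returned.

```java
import java.util.Map;

final class TaskNumberSketch {
  // Assumed configuration key for the task attempt id.
  static final String TASK_ATTEMPT_ID_KEY = "mapreduce.task.attempt.id";

  // Returns the task number from the attempt id in conf, or the fallback
  // when the id is missing or does not have the expected shape.
  static int taskNumber(Map<String, String> conf, int fallback) {
    String attempt = conf.get(TASK_ATTEMPT_ID_KEY);
    if (attempt == null) {
      return fallback;
    }
    // Expected shape: attempt_<cluster>_<job>_<type>_<task>_<attempt>
    String[] parts = attempt.split("_");
    if (parts.length != 6 || !"attempt".equals(parts[0])) {
      return fallback;
    }
    try {
      return Integer.parseInt(parts[4]);
    } catch (NumberFormatException e) {
      return fallback;
    }
  }
}
```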

