QiangCai commented on a change in pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#discussion_r437812719



##########
File path: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
##########
@@ -222,49 +228,103 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks rawBlockletColumnChunks,
       }
     }
     BitSetGroup bitSetGroup = new BitSetGroup(pageNumbers);
-    for (int i = 0; i < pageNumbers; i++) {
-      BitSet set = new BitSet(numberOfRows[i]);
-      RowIntf row = new RowImpl();
-      BitSet prvBitset = null;
-      // if bitset pipe line is enabled then use rowid from previous bitset
-      // otherwise use older flow
-      if (!useBitsetPipeLine ||
-          null == rawBlockletColumnChunks.getBitSetGroup() ||
-          null == bitSetGroup.getBitSet(i) ||
-          rawBlockletColumnChunks.getBitSetGroup().getBitSet(i).isEmpty()) {
+    if (isDimensionPresentInCurrentBlock.length == 1 && isDimensionPresentInCurrentBlock[0]

Review comment:
       1. It would be better to add a new Expression, such as ArrayContainsExpression (see the sketch below).
       2. How about also considering the BitSetPipeLine for this filter, so the row ids from the previous filter's bitset can be reused?
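For point 1, a minimal self-contained Scala sketch of the semantics such an expression would have to carry; RowLike and ArrayContainsExpression are hypothetical names, not existing CarbonData APIs (the real class would presumably live beside the other Expression classes in the Java core module):

```scala
// Illustrative only: RowLike and ArrayContainsExpression are hypothetical
// stand-ins, not existing CarbonData APIs.
trait RowLike {
  def getArray(ordinal: Int): Seq[Any]
}

final case class ArrayContainsExpression(ordinal: Int, literal: Any) {
  // True when any element of the array column in this row equals the
  // literal, which is the semantics an array_contains() push-down must keep.
  def evaluate(row: RowLike): Boolean =
    row.getArray(ordinal).exists(_ == literal)
}

object ArrayContainsDemo extends App {
  val row = new RowLike {
    def getArray(ordinal: Int): Seq[Any] = Seq("a", "b", "c")
  }
  println(ArrayContainsExpression(0, "b").evaluate(row)) // prints: true
}
```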

##########
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala
##########
@@ -679,18 +681,20 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
     // In case of ComplexType dataTypes no filters should be pushed down. IsNotNull is being
     // explicitly added by spark and pushed. That also has to be handled and pushed back to
     // Spark for handling.
-    val predicatesWithoutComplex = predicates.filter(predicate =>
+    // allow array_contains() push down
+    val filteredPredicates = predicates.filter(predicate =>

Review comment:
       Use '{' instead of '(' here, since the closure body spans multiple lines (see the sketch below).
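A small illustration of the suggested style, with placeholder predicate logic:

```scala
object BraceStyleDemo extends App {
  val predicates = Seq("i IS NOT NULL", "array_contains(arr, 'x')", "")

  // Scala convention: braces when the lambda body spans multiple lines ...
  val filteredPredicates = predicates.filter { predicate =>
    // placeholder for the real complex-type checks in CarbonLateDecodeStrategy
    predicate.nonEmpty && !predicate.endsWith("IS NOT NULL")
  }

  // ... parentheses when it fits on one line.
  val nonEmpty = predicates.filter(_.nonEmpty)

  println(filteredPredicates) // List(array_contains(arr, 'x'))
}
```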

##########
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala
##########
@@ -517,7 +518,8 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
       val supportBatch =
         supportBatchedDataSource(relation.relation.sqlContext,
           updateRequestedColumns) && extraRdd.getOrElse((null, true))._2
-      if (!vectorPushRowFilters && !supportBatch && !implicitExisted) {
+      if (!vectorPushRowFilters && !supportBatch && !implicitExisted && filterSet.nonEmpty &&

Review comment:
       Why does this condition need to change?

##########
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/CarbonFilters.scala
##########
@@ -152,13 +152,25 @@ object CarbonFilters {
     }
 
     def getCarbonExpression(name: String) = {

Review comment:
       In the 'createFilter' method, convert the CarbonArrayContains filter to an ArrayContainsExpression (see the sketch below).
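A self-contained sketch of the conversion being requested; every name here is a hypothetical stand-in for what the PR would add (getCarbonExpression mirrors the helper visible in the hunk above):

```scala
// Hypothetical stand-ins that only illustrate the shape of the mapping.
sealed trait CarbonExpr
final case class ColumnExpr(name: String) extends CarbonExpr
final case class LiteralExpr(value: Any) extends CarbonExpr
final case class ArrayContainsExpression(column: CarbonExpr,
    literal: CarbonExpr) extends CarbonExpr

final case class CarbonArrayContains(attribute: String, value: Any)

object CreateFilterSketch extends App {
  def getCarbonExpression(name: String): CarbonExpr = ColumnExpr(name)
  def getCarbonLiteralExpression(name: String, value: Any): CarbonExpr =
    LiteralExpr(value)

  // The case the review asks for inside 'createFilter':
  def createFilter(filter: Any): Option[CarbonExpr] = filter match {
    case CarbonArrayContains(name, value) =>
      Some(ArrayContainsExpression(
        getCarbonExpression(name),
        getCarbonLiteralExpression(name, value)))
    case _ => None
  }

  println(createFilter(CarbonArrayContains("tags", "spark")))
  // Some(ArrayContainsExpression(ColumnExpr(tags),LiteralExpr(spark)))
}
```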

##########
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala
##########
@@ -865,6 +869,27 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy {
         Some(CarbonContainsWith(c))
       case c@Literal(v, t) if (v == null) =>
         Some(FalseExpr())
+      case c@ArrayContains(a: Attribute, Literal(v, t)) =>
+        a.dataType match {
+          case arrayType: ArrayType =>
+            arrayType.elementType match {
+              case StringType => Some(sources.EqualTo(a.name, v))

Review comment:
       How about using a new filter, CarbonArrayContains, instead of mapping to sources.EqualTo? That would keep the contains semantics explicit during push-down (see the sketch below).
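A sketch of what that could look like, following the style of the existing CarbonContainsWith case above. CarbonArrayContains and its fields are assumptions; only the Spark types used (sources.Filter, ArrayContains, Attribute, Literal, ArrayType) are real APIs:

```scala
import org.apache.spark.sql.catalyst.expressions.{ArrayContains, Attribute, Expression, Literal}
import org.apache.spark.sql.sources
import org.apache.spark.sql.types.ArrayType

// Hypothetical Spark-side filter that keeps array-contains semantics explicit
// instead of collapsing them into sources.EqualTo.
case class CarbonArrayContains(attribute: String, value: Any) extends sources.Filter {
  override def references: Array[String] = Array(attribute)
}

object TranslateSketch {
  // Sketch of the match arm that would replace the EqualTo mapping above.
  def translateArrayContains(predicate: Expression): Option[sources.Filter] =
    predicate match {
      case ArrayContains(a: Attribute, Literal(v, _)) if a.dataType.isInstanceOf[ArrayType] =>
        Some(CarbonArrayContains(a.name, v))
      case _ => None
    }
}
```

A dedicated filter would also leave room to reject unsupported element types during translation rather than at execution time.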




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
