[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


Indhumathi27 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440609387



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java
##
@@ -188,6 +189,14 @@ private static FilterExecuter createFilterExecuterTree(
   return new FalseFilterExecutor();
 case ROWLEVEL:
 default:
+  if (filterExpressionResolverTree.getFilterExpression() instanceof 
UnknownExpression) {
+FilterExecuter filterExecuter =
+((UnknownExpression) 
filterExpressionResolverTree.getFilterExpression())
+.getFilterExecuter(filterExpressionResolverTree, 
segmentProperties);
+if (filterExecuter != null) {

Review comment:
   If in case, filterExecuter is null, we need to return 
RowLevelFilterExecuterImpl right?. If filterExecuter never comes null, can 
check and remove this check

##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##
@@ -134,8 +133,31 @@ class GeoTest extends QueryTest with BeforeAndAfterAll 
with BeforeAndAfterEach {
 Row(116285807, 40084087)))
   }
 
-  test("test polygon query on table partitioned by timevalue column")
-  {
+  test("test block pruning for polygon query") {
+createTable()
+sql(s"insert into $table1 select 157542840,116285807,40084087")
+sql(s"insert into $table1 select 157542840,116372142,40129503")
+sql(s"insert into $table1 select 157542840,116187332,39979316")
+sql(s"insert into $table1 select 157542840,116337069,39951887")
+sql(s"insert into $table1 select 157542840,116359102,40154684")
+sql(s"insert into $table1 select 157542840,116736367,39970323")
+sql(s"insert into $table1 select 157542840,116362699,39942444")
+sql(s"insert into $table1 select 157542840,116325378,39963129")
+sql(s"insert into $table1 select 157542840,116302895,39930753")
+sql(s"insert into $table1 select 157542840,116288955,3101")
+val df = sql(s"select longitude, latitude from $table1 where 
IN_POLYGON('116.321011 " +
+ s"40.123503, 116.137676 39.947911, 116.560993 39.935276, 
116.321011 40.123503')")
+assert(df.rdd.getNumPartitions == 6)
+checkAnswer(df,

Review comment:
   Can you move this checkAnswer to a new method with param as dataframe. 
Looks like this check is same in all test cases





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-644536457


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3155/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-644536233


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1431/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#issuecomment-644391643


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3154/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#issuecomment-644391323


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1430/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


VenuReddy2103 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440394383



##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/expression/PolygonExpression.java
##
@@ -46,15 +51,16 @@
   private CustomIndex> instance;
   private List ranges = new ArrayList();
   private ColumnExpression column;
-  private ExpressionResult trueExpRes;
-  private ExpressionResult falseExpRes;
+  private static final ExpressionResult trueExpRes =

Review comment:
   Have replied in the immediate below comment. Please let me know if you 
have any other opinion.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


VenuReddy2103 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440386464



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java
##
@@ -188,6 +189,14 @@ private static FilterExecuter createFilterExecuterTree(
   return new FalseFilterExecutor();
 case ROWLEVEL:
 default:
+  if (filterExpressionResolverTree.getFilterExpression() instanceof 
UnknownExpression) {

Review comment:
   In the initial version of the feature PR, polygon expression and 
executor type were defined explitily for polygon filter in carbondata-core 
module itself. But Ravi's review comment suggested the use of 
`UnknownExpression`(similar to `SparkUnknownExpression`) and have 
carbondata-geo depends on carbondata-core. But not the other way round.  
   Currently `UnknownExpression` always uses `RowLevelFilterExecuterImpl` and 
applies filter for each row without pruning(i.e., no block, blocklet and page 
pruning). So have enhanced the `UnknownExpression` with pruning ability.

##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/expression/PolygonExpression.java
##
@@ -46,15 +51,16 @@
   private CustomIndex> instance;
   private List ranges = new ArrayList();
   private ColumnExpression column;
-  private ExpressionResult trueExpRes;
-  private ExpressionResult falseExpRes;
+  private static final ExpressionResult trueExpRes =

Review comment:
   Have replied for it the immediate below comment. Please let me know if 
you have any other opinion.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


VenuReddy2103 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440386464



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java
##
@@ -188,6 +189,14 @@ private static FilterExecuter createFilterExecuterTree(
   return new FalseFilterExecutor();
 case ROWLEVEL:
 default:
+  if (filterExpressionResolverTree.getFilterExpression() instanceof 
UnknownExpression) {

Review comment:
   In the initial version of the feature PR, polygon expression and 
executor type were defined explitily for polygon filter in carbondata-core 
module itself. But Ravi's review comment suggested the use of 
`UnknownExpression`(similar to `SparkUnknownExpression`) and have 
carbondata-geo depends on carbondata-core. But not the other way round. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


VenuReddy2103 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440378220



##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/filter/executor/PolygonFilterExecutorImpl.java
##
@@ -0,0 +1,84 @@
+package org.apache.carbondata.geo.scan.filter.executor;
+
+import java.util.BitSet;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.GenericQueryType;
+import 
org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl;
+import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
+import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.MeasureColumnResolvedFilterInfo;
+import org.apache.carbondata.core.scan.processor.RawBlockletColumnChunks;
+import org.apache.carbondata.core.util.DataTypeUtil;
+import org.apache.carbondata.geo.scan.expression.PolygonExpression;
+
+import org.apache.log4j.Logger;
+
+public class PolygonFilterExecutorImpl extends RowLevelFilterExecuterImpl {
+  public PolygonFilterExecutorImpl(List 
dimColEvaluatorInfoList,
+  List msrColEvalutorInfoList, Expression 
exp,
+  AbsoluteTableIdentifier tableIdentifier, SegmentProperties 
segmentProperties,
+  Map complexDimensionInfoMap) {
+super(dimColEvaluatorInfoList, msrColEvalutorInfoList, exp, 
tableIdentifier, segmentProperties,
+complexDimensionInfoMap);
+  }
+
+  private int getNearestRangeIndex(List ranges, long searchForNumber) {
+Long[] range;
+int low = 0, mid = 0, high = ranges.size() - 1;
+while (low <= high) {
+  mid = low + ((high - low) / 2);
+  range = ranges.get(mid);
+  if (searchForNumber >= range[0]) {
+if (searchForNumber <= range[1]) {
+  // Return the range index if the number is between min and max 
values of the range
+  return mid;
+} else {
+  // Number is bigger than this range's min and max. Search on the 
right side of the range
+  low = mid + 1;
+}
+  } else {
+// Number is smaller than this range's min and max. Search on the left 
side of the range
+high = mid - 1;
+  }
+}
+return mid;
+  }
+
+  private boolean isScanRequired(byte[] maxValue, byte[] minValue) {

Review comment:
   Added method header and comments to code





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


VenuReddy2103 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440377390



##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/filter/executor/PolygonFilterExecutorImpl.java
##
@@ -0,0 +1,84 @@
+package org.apache.carbondata.geo.scan.filter.executor;

Review comment:
   Added





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-15 Thread GitBox


VenuReddy2103 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r440377512



##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/filter/executor/PolygonFilterExecutorImpl.java
##
@@ -0,0 +1,84 @@
+package org.apache.carbondata.geo.scan.filter.executor;
+
+import java.util.BitSet;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.GenericQueryType;
+import 
org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl;
+import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
+import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.MeasureColumnResolvedFilterInfo;
+import org.apache.carbondata.core.scan.processor.RawBlockletColumnChunks;
+import org.apache.carbondata.core.util.DataTypeUtil;
+import org.apache.carbondata.geo.scan.expression.PolygonExpression;
+
+import org.apache.log4j.Logger;
+
+public class PolygonFilterExecutorImpl extends RowLevelFilterExecuterImpl {
+  public PolygonFilterExecutorImpl(List 
dimColEvaluatorInfoList,

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-644291267


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1429/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-644286332


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3153/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-644098574


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1427/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-644097156


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3151/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3786: [CARBONDATA-3842] Fix incorrect results on mv with limit (Missed code during mv refcatory)

2020-06-15 Thread GitBox


Indhumathi27 commented on a change in pull request #3786:
URL: https://github.com/apache/carbondata/pull/3786#discussion_r440035757



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestAllOperationsOnMV.scala
##
@@ -610,10 +610,10 @@ class TestAllOperationsOnMV extends QueryTest with 
BeforeAndAfterEach {
 sql("drop materialized view if exists mv1")
 sql("create materialized view mv1  as select a.name,a.price from maintable 
a")
 var dataFrame = sql("select a.name,a.price from maintable a limit 1")
-assert(dataFrame.count() == 1)

Review comment:
   Here, count API is not applying collect rows on SPARK PHYSICAL PLAN. 
When i checked it, plan was on main table





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3786: [CARBONDATA-3842] Fix incorrect results on mv with limit (Missed code during mv refcatory)

2020-06-15 Thread GitBox


Indhumathi27 commented on a change in pull request #3786:
URL: https://github.com/apache/carbondata/pull/3786#discussion_r440035757



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestAllOperationsOnMV.scala
##
@@ -610,10 +610,10 @@ class TestAllOperationsOnMV extends QueryTest with 
BeforeAndAfterEach {
 sql("drop materialized view if exists mv1")
 sql("create materialized view mv1  as select a.name,a.price from maintable 
a")
 var dataFrame = sql("select a.name,a.price from maintable a limit 1")
-assert(dataFrame.count() == 1)

Review comment:
   Here, count API is not applying collect rows on SPARK PLAN. When i 
checked it, plan was on main table





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3786: [CARBONDATA-3842] Fix incorrect results on mv with limit (Missed code during mv refcatory)

2020-06-15 Thread GitBox


Indhumathi27 commented on a change in pull request #3786:
URL: https://github.com/apache/carbondata/pull/3786#discussion_r440032092



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestAllOperationsOnMV.scala
##
@@ -610,10 +610,10 @@ class TestAllOperationsOnMV extends QueryTest with 
BeforeAndAfterEach {
 sql("drop materialized view if exists mv1")
 sql("create materialized view mv1  as select a.name,a.price from maintable 
a")
 var dataFrame = sql("select a.name,a.price from maintable a limit 1")
-assert(dataFrame.count() == 1)

Review comment:
   In this Case, count was returning 1 only, but df.show() is giving all 
results. In cluster also, limit query gives all rows. Because of this, testcase 
didnt catch this issue. That's why changed to df.collect.length





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3786: [CARBONDATA-3842] Fix incorrect results on mv with limit (Missed code during mv refcatory)

2020-06-15 Thread GitBox


akashrn5 commented on a change in pull request #3786:
URL: https://github.com/apache/carbondata/pull/3786#discussion_r440014735



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestAllOperationsOnMV.scala
##
@@ -610,10 +610,10 @@ class TestAllOperationsOnMV extends QueryTest with 
BeforeAndAfterEach {
 sql("drop materialized view if exists mv1")
 sql("create materialized view mv1  as select a.name,a.price from maintable 
a")
 var dataFrame = sql("select a.name,a.price from maintable a limit 1")
-assert(dataFrame.count() == 1)

Review comment:
   why this change? because count is also an action, it should also give 1





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on pull request #3786: [CARBONDATA-3842] Fix incorrect results on mv with limit (Missed code during mv refcatory)

2020-06-15 Thread GitBox


akashrn5 commented on pull request #3786:
URL: https://github.com/apache/carbondata/pull/3786#issuecomment-643985798


   @Indhumathi27 why the test cases didnt catch this. If we do not have test 
cases, please add a test case for the limit case.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.merge.index.

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3776:
URL: https://github.com/apache/carbondata/pull/3776#issuecomment-643972857


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3150/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.merge.index.

2020-06-15 Thread GitBox


CarbonDataQA1 commented on pull request #3776:
URL: https://github.com/apache/carbondata/pull/3776#issuecomment-643972110


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1426/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org