[GitHub] [carbondata] kunal642 commented on a change in pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#discussion_r485382685



##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##
@@ -0,0 +1,376 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.io.FilenameFilter;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.Field;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
+
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.RecordWriter;
+
+public class CarbonIUD {
+
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForDelete;
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForUpdate;
+  private final Map<String, Map<String, String>> updateColumnToValueMapping;
+
+  private CarbonIUD() {
+    filterColumnToValueMappingForDelete = new HashMap<>();
+    filterColumnToValueMappingForUpdate = new HashMap<>();
+    updateColumnToValueMapping = new HashMap<>();
+  }
+
+  /**
+   * @return CarbonIUD object
+   */
+  public static CarbonIUD getInstance() {
+    return new CarbonIUD();
+  }
+
+  /**
+   * @param path   is the table path on which delete is performed
+   * @param column is the columnName on which records have to be deleted
+   * @param value  of column on which the records have to be deleted
+   * @return CarbonIUD object
+   */
+  public CarbonIUD delete(String path, String column, String value) {
+    prepareDelete(path, column, value, filterColumnToValueMappingForDelete);
+    return this;
+  }
+
+  /**
+   * This method deletes the rows at given path by applying the filterExpression
+   *
+   * @param path is the table path on which delete is performed
+   * @param filterExpression is the expression to delete the records
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public void delete(String path, Expression filterExpression)
+      throws IOException, InterruptedException {
+    CarbonReader reader = CarbonReader.builder(path)
+        .projection(new String[] { CarbonCommonConstants.CARBON_IMPLICIT_COLUMN_TUPLEID })
+        .filter(filterExpression).build();
+
+    RecordWriter<NullWritable, ObjectArrayWritable> deleteDeltaWriter =
+        CarbonTableOutputFormat.getDeleteDeltaRecordWriter(path);
+    ObjectArrayWritable writable = new ObjectArrayWritable();
+    while (reader.hasNext()) {
+      Object[] row = (Object[]) reader.readNextRow();
+      writable.set(row);
+      deleteDeltaWriter.write(NullWritable.get(), writable);
+    }
+    deleteDeltaWriter.close(null);
+    reader.close();
+  }
+
+  /**
+   * Calling this method will start the execution of delete process
+   *
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public void closeDelete() throws IOException, InterruptedException {
+    for (Map.Entry<String, Map<String, Set<String>>> path :
+        this.filterColumnToValueMappingForDelete.entrySet()) {
+      deleteExecution(path.getKey());
+    }
+  }
+
+  /**
+   * @param path  is the table path on which update is performed
+   * @param column    is the columnName on which records have to be updated
+   * @param value     of column on which the records have to be updated
+   * @param updColumn is the name of updatedColumn
+   * @param updValue  is the value of updatedColumn
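For context, a minimal sketch of how the fluent delete API in the quoted diff could be driven. This is an assumption-laden example, not part of the PR: the table path and filter value are hypothetical, and it requires the CarbonData SDK on the classpath.

```java
import org.apache.carbondata.sdk.file.CarbonIUD;

public class CarbonIUDDeleteSketch {
  public static void main(String[] args) throws Exception {
    String tablePath = "/tmp/carbon_table";  // hypothetical table path

    // delete(...) only records the (column, value) filter per path;
    // closeDelete() then executes the queued deletes for every path.
    CarbonIUD.getInstance()
        .delete(tablePath, "name", "karan")
        .closeDelete();
  }
}
```

As the quoted code shows, the work happens in closeDelete(), which iterates filterColumnToValueMappingForDelete and writes delete-delta files via the table output format.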

[GitHub] [carbondata] ShreelekhyaG closed pull request #3909: [CARBONDATA-3972] Date/timestamp compatability between hive and carbon

2020-09-09 Thread GitBox


ShreelekhyaG closed pull request #3909:
URL: https://github.com/apache/carbondata/pull/3909


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689403708


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2275/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689408106


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4014/
   







[GitHub] [carbondata] vikramahuja1001 opened a new pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


vikramahuja1001 opened a new pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689476467


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2276/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689476907


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4015/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689507261


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2278/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689507898


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4017/
   







[GitHub] [carbondata] Karan-c980 commented on a change in pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


Karan-c980 commented on a change in pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#discussion_r485548849



##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##
@@ -0,0 +1,376 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.io.FilenameFilter;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.Field;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
+
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.RecordWriter;
+
+public class CarbonIUD {
+
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForDelete;
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForUpdate;
+  private final Map<String, Map<String, String>> updateColumnToValueMapping;
+
+  private CarbonIUD() {
+    filterColumnToValueMappingForDelete = new HashMap<>();
+    filterColumnToValueMappingForUpdate = new HashMap<>();
+    updateColumnToValueMapping = new HashMap<>();
+  }
+
+  /**
+   * @return CarbonIUD object
+   */
+  public static CarbonIUD getInstance() {
+    return new CarbonIUD();
+  }
+
+  /**
+   * @param path   is the table path on which delete is performed
+   * @param column is the columnName on which records have to be deleted
+   * @param value  of column on which the records have to be deleted
+   * @return CarbonIUD object
+   */
+  public CarbonIUD delete(String path, String column, String value) {
+    prepareDelete(path, column, value, filterColumnToValueMappingForDelete);
+    return this;
+  }
+
+  /**
+   * This method deletes the rows at given path by applying the filterExpression
+   *
+   * @param path is the table path on which delete is performed
+   * @param filterExpression is the expression to delete the records
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public void delete(String path, Expression filterExpression)
+      throws IOException, InterruptedException {
+    CarbonReader reader = CarbonReader.builder(path)
+        .projection(new String[] { CarbonCommonConstants.CARBON_IMPLICIT_COLUMN_TUPLEID })
+        .filter(filterExpression).build();
+
+    RecordWriter<NullWritable, ObjectArrayWritable> deleteDeltaWriter =
+        CarbonTableOutputFormat.getDeleteDeltaRecordWriter(path);
+    ObjectArrayWritable writable = new ObjectArrayWritable();
+    while (reader.hasNext()) {
+      Object[] row = (Object[]) reader.readNextRow();
+      writable.set(row);
+      deleteDeltaWriter.write(NullWritable.get(), writable);
+    }
+    deleteDeltaWriter.close(null);
+    reader.close();
+  }
+
+  /**
+   * Calling this method will start the execution of delete process
+   *
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public void closeDelete() throws IOException, InterruptedException {
+    for (Map.Entry<String, Map<String, Set<String>>> path :
+        this.filterColumnToValueMappingForDelete.entrySet()) {
+      deleteExecution(path.getKey());
+    }
+  }
+
+  /**
+   * @param path  is the table path on which update is performed
+   * @param column    is the columnName on which records have to be updated
+   * @param value     of column on which the records have to be updated
+   * @param updColumn is the name of updatedColumn
+   * @param updValue  is the value of updatedColumn

[GitHub] [carbondata] Karan-c980 commented on a change in pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


Karan-c980 commented on a change in pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#discussion_r485548941



##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##
@@ -0,0 +1,376 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.io.FilenameFilter;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.Field;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
+
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.RecordWriter;
+
+public class CarbonIUD {
+
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForDelete;
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForUpdate;
+  private final Map<String, Map<String, String>> updateColumnToValueMapping;
+
+  private CarbonIUD() {
+    filterColumnToValueMappingForDelete = new HashMap<>();
+    filterColumnToValueMappingForUpdate = new HashMap<>();
+    updateColumnToValueMapping = new HashMap<>();
+  }
+
+  /**
+   * @return CarbonIUD object
+   */
+  public static CarbonIUD getInstance() {

Review comment:
   Done

##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##
@@ -0,0 +1,376 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.io.FilenameFilter;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.Field;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
+
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.RecordWriter;

[GitHub] [carbondata] Karan-c980 commented on pull request #3876: TestingCI

2020-09-09 Thread GitBox


Karan-c980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-689550512


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689551682


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4016/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689552007


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2277/
   







[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485626615



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataModule.java
##
@@ -21,6 +21,8 @@
 
 import static java.util.Objects.requireNonNull;
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.CarbonProperties;

Review comment:
   Why this change is required?









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485628731



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataModule.java
##
@@ -21,6 +21,8 @@
 
 import static java.util.Objects.requireNonNull;
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.CarbonProperties;

Review comment:
   prestodb compile was broken by #3885. They copied the code but not the
import statement, so I fixed it. This is mentioned in the PR description as well.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485628731



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataModule.java
##
@@ -21,6 +21,8 @@
 
 import static java.util.Objects.requireNonNull;
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.CarbonProperties;

Review comment:
   prestodb compile was broken by #3885. They copied code from prestosql but
not the import statement, so I fixed it. This is mentioned in the PR description
as well.









[GitHub] [carbondata] akashrn5 commented on pull request #3911: [CARBONDATA-3793]Fix update and delete issue when multiple partition columns are present and clean files issue

2020-09-09 Thread GitBox


akashrn5 commented on pull request #3911:
URL: https://github.com/apache/carbondata/pull/3911#issuecomment-689577551


   retest this please







[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485630147



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##
@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+    List<PartitionSpec> filteredPartitions = new ArrayList<>();
+    if (layout.getPartitionColumns().size() > 0 && layout.getPartitions().isPresent()) {
+      List<String> colNames =
+          layout.getPartitionColumns().stream().map(x -> ((HiveColumnHandle) x).getName())
+              .collect(Collectors.toList());
+      for (HivePartition partition : layout.getPartitions().get()) {
+        filteredPartitions.add(new PartitionSpec(colNames,

Review comment:
   According to this code, all the PartitionSpecs will have all the column
names! Is this correct?
   The column names in a PartitionSpec should be per partition, right?









[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485634681



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/readers/ComplexTypeStreamReader.java
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import java.util.Optional;
+
+import com.facebook.presto.spi.block.ArrayBlock;
+import com.facebook.presto.spi.block.RowBlock;
+import com.facebook.presto.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import com.facebook.presto.spi.block.Block;
+import com.facebook.presto.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+import org.apache.carbondata.presto.ColumnarVectorWrapperDirect;
+
+/**
+ * Class to read the complex type Stream [array/struct/map]
+ */
+
+public class ComplexTypeStreamReader extends CarbonColumnVectorImpl

Review comment:
   This change is too big and not related to partition pruning; please
raise a separate PR for this.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485635431



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##
@@ -117,6 +122,16 @@ public ConnectorSplitSource 
getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+    List<PartitionSpec> filteredPartitions = new ArrayList<>();
+    if (layout.getPartitionColumns().size() > 0 && layout.getPartitions().isPresent()) {
+      List<String> colNames =
+          layout.getPartitionColumns().stream().map(x -> ((HiveColumnHandle) x).getName())
+              .collect(Collectors.toList());
+      for (HivePartition partition : layout.getPartitions().get()) {
+        filteredPartitions.add(new PartitionSpec(colNames,

Review comment:
   You mean all the partition columns, right? Not all the table columns?
   
   Ideally they use partitionPath. I have tested it and it didn't impact. Let me
check again.









[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485636199



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##
@@ -117,6 +122,16 @@ public ConnectorSplitSource 
getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+List<PartitionSpec> filteredPartitions = new ArrayList<>();
+if (layout.getPartitionColumns().size() > 0 && layout.getPartitions().isPresent()) {
+  List<String> colNames =
+      layout.getPartitionColumns().stream().map(x -> ((HiveColumnHandle) x).getName())
+          .collect(Collectors.toList());
+  for (HivePartition partition : layout.getPartitions().get()) {
+    filteredPartitions.add(new PartitionSpec(colNames,

Review comment:
   Yeah, all partition columns. One PartitionSpec should have only the columns 
it is created on, not all columns.
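
   The point above (a PartitionSpec carries only the columns it was created on, 
never the full table schema) can be sketched with a self-contained stand-in. 
Note the PartitionSpec class below is a hypothetical simplification of 
carbondata's org.apache.carbondata.core.indexstore.PartitionSpec, and the 
partition maps stand in for layout.getPartitions(); names and values are 
illustrative assumptions, not the real API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionSpecSketch {

  // Hypothetical stand-in for carbondata's PartitionSpec: it holds only the
  // partition columns and their values, never the full table schema.
  static class PartitionSpec {
    final List<String> columns;
    final List<String> values;

    PartitionSpec(List<String> columns, List<String> values) {
      this.columns = columns;
      this.values = values;
    }
  }

  public static void main(String[] args) {
    // Table partitioned on (country, year); other data columns are irrelevant here.
    List<String> partitionColumns = Arrays.asList("country", "year");

    // Stand-in for the per-query partitions a layout would provide
    List<Map<String, String>> partitions = Arrays.asList(
        Map.of("country", "IN", "year", "2020"),
        Map.of("country", "US", "year", "2019"));

    // Build one spec per partition, using only the partition column names
    List<PartitionSpec> filteredPartitions = partitions.stream()
        .map(p -> new PartitionSpec(partitionColumns,
            partitionColumns.stream().map(p::get).collect(Collectors.toList())))
        .collect(Collectors.toList());

    System.out.println(filteredPartitions.size());         // 2
    System.out.println(filteredPartitions.get(0).columns); // [country, year]
    System.out.println(filteredPartitions.get(0).values);  // [IN, 2020]
  }
}
```

   Each spec stays as small as the partitioning itself, which is what makes the 
pruning cheap regardless of how wide the table is.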





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485636502



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/readers/ComplexTypeStreamReader.java
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import java.util.Optional;
+
+import com.facebook.presto.spi.block.ArrayBlock;
+import com.facebook.presto.spi.block.RowBlock;
+import com.facebook.presto.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import com.facebook.presto.spi.block.Block;
+import com.facebook.presto.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+import org.apache.carbondata.presto.ColumnarVectorWrapperDirect;
+
+/**
+ * Class to read the complex type Stream [array/struct/map]
+ */
+
+public class ComplexTypeStreamReader extends CarbonColumnVectorImpl

Review comment:
   Then I have to raise the partition changes only for prestosql, because the 
prestodb profile is not compiling. Or I can fix the prestodb compilation first; 
you can merge it, and I will raise a separate PR for the partition changes. 
What do you suggest?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485638458



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/readers/ComplexTypeStreamReader.java
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import java.util.Optional;
+
+import com.facebook.presto.spi.block.ArrayBlock;
+import com.facebook.presto.spi.block.RowBlock;
+import com.facebook.presto.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import com.facebook.presto.spi.block.Block;
+import com.facebook.presto.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+import org.apache.carbondata.presto.ColumnarVectorWrapperDirect;
+
+/**
+ * Class to read the complex type Stream [array/struct/map]
+ */
+
+public class ComplexTypeStreamReader extends CarbonColumnVectorImpl

Review comment:
   Is this a compilation fix? If not, then keep the compilation changes in 
this PR; ComplexTypeStreamReader looks to be a bug fix!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485640004



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##
@@ -117,6 +122,16 @@ public ConnectorSplitSource 
getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+List<PartitionSpec> filteredPartitions = new ArrayList<>();
+if (layout.getPartitionColumns().size() > 0 && layout.getPartitions().isPresent()) {
+  List<String> colNames =
+      layout.getPartitionColumns().stream().map(x -> ((HiveColumnHandle) x).getName())
+          .collect(Collectors.toList());
+  for (HivePartition partition : layout.getPartitions().get()) {
+    filteredPartitions.add(new PartitionSpec(colNames,

Review comment:
   I think HiveTableHandle will give the matching columns for the partitions 
only, because this is derived per query. Let me retest and get back to you





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485642687



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/readers/ComplexTypeStreamReader.java
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import java.util.Optional;
+
+import com.facebook.presto.spi.block.ArrayBlock;
+import com.facebook.presto.spi.block.RowBlock;
+import com.facebook.presto.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import com.facebook.presto.spi.block.Block;
+import com.facebook.presto.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+import org.apache.carbondata.presto.ColumnarVectorWrapperDirect;
+
+/**
+ * Class to read the complex type Stream [array/struct/map]
+ */
+
+public class ComplexTypeStreamReader extends CarbonColumnVectorImpl

Review comment:
   Not a bug fix; you need to know how prestodb and prestosql are divided: 
there are some common files and some individual files, and this class is an 
individual class.
   
   There is a ComplexTypeStreamReader in prestosql as well with the same code, 
but there the import packages are from prestosql; here, the import packages 
are from prestodb





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485651041



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/readers/ComplexTypeStreamReader.java
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Objects;
+import java.util.Optional;
+
+import com.facebook.presto.spi.block.ArrayBlock;
+import com.facebook.presto.spi.block.RowBlock;
+import com.facebook.presto.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import com.facebook.presto.spi.block.Block;
+import com.facebook.presto.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+import org.apache.carbondata.presto.ColumnarVectorWrapperDirect;
+
+/**
+ * Class to read the complex type Stream [array/struct/map]
+ */
+
+public class ComplexTypeStreamReader extends CarbonColumnVectorImpl

Review comment:
   ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r485651703



##
File path: 
integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataModule.java
##
@@ -21,6 +21,8 @@
 
 import static java.util.Objects.requireNonNull;
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.CarbonProperties;

Review comment:
   ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3787: [WIP][CARBONDATA-3923] support global sort for SI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3787:
URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689598077


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4018/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3787: [WIP][CARBONDATA-3923] support global sort for SI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3787:
URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689598895


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2279/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689609867


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4019/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


Indhumathi27 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r485655081



##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -1106,19 +1107,39 @@ public static void cleanSegments(CarbonTable table, 
List partitio
*/
   public static void deleteSegment(String tablePath, Segment segment,
   List<PartitionSpec> partitionSpecs,
-  SegmentUpdateStatusManager updateStatusManager) throws Exception {
+  SegmentUpdateStatusManager updateStatusManager, String tableName, String DatabaseName)
+  throws Exception {
 SegmentFileStore fileStore = new SegmentFileStore(tablePath, segment.getSegmentFileName());
 List<String> indexOrMergeFiles = fileStore.readIndexFiles(SegmentStatus.SUCCESS, true,
 FileFactory.getConfiguration());
 Map<String, List<String>> indexFilesMap = fileStore.getIndexFilesMap();
 for (Map.Entry<String, List<String>> entry : indexFilesMap.entrySet()) {
+  // If the file to be deleted is a carbondata file, copy that file to the trash folder.
+  if (entry.getKey().endsWith(".carbondata")) {

Review comment:
   Can use CarbonCommonConstants.FACT_FILE_EXT instead of the ".carbondata" literal
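
   A minimal sketch of this suggestion, with the constant's value mirrored 
locally for a runnable example (the real constant lives in 
org.apache.carbondata.core.constants.CarbonCommonConstants and should be 
referenced directly in the actual code; the file names below are made up):

```java
public class ExtensionCheckSketch {

  // Local mirror of CarbonCommonConstants.FACT_FILE_EXT for this sketch only;
  // the real code should reference the shared constant instead.
  static final String FACT_FILE_EXT = ".carbondata";

  static boolean isFactFile(String fileName) {
    // Using the named constant keeps every extension check tied to one definition
    return fileName.endsWith(FACT_FILE_EXT);
  }

  public static void main(String[] args) {
    System.out.println(isFactFile("part-0-123.carbondata")); // true
    System.out.println(isFactFile("0-123.carbonindex"));     // false
  }
}
```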

##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -1106,19 +1107,39 @@ public static void cleanSegments(CarbonTable table, 
List partitio
*/
   public static void deleteSegment(String tablePath, Segment segment,
   List partitionSpecs,
-  SegmentUpdateStatusManager updateStatusManager) throws Exception {
+  SegmentUpdateStatusManager updateStatusManager, String tableName, String 
DatabaseName)

Review comment:
   ```suggestion
  SegmentUpdateStatusManager updateStatusManager, String tableName, String databaseName)
   ```

##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.permission.{FsAction, FsPermission}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, 
LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{SegmentStatus, 
SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object CleanFilesUtil {
+  private val LOGGER = 
LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true and cleans garbage
+   * segments (MARKED_FOR_DELETE state) if forceTableClean is false
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable Object  in case of 
force clean
+   * @param forceTableClean: if true, force clean will delete all data;
+   *   if false, it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive Partitions  details
+   */
+  def cleanFiles(
+  dbName: String,

Review comment:
   please check and format the code

##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -1106,19 +1107,39 @@ public static void cleanSegments(CarbonTable table, 
List partitio
*/
   public static void deleteSegment(String tablePath, Segment segment,
  

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689616290


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2281/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-689623460


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2282/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-689623335


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4020/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files

2020-09-09 Thread GitBox


ajantha-bhat commented on pull request #3819:
URL: https://github.com/apache/carbondata/pull/3819#issuecomment-689648572


   Merging on behalf of @xubo245 
   
   I didn't get time to fully review. If any refactoring is required, it can 
be handled in a new PR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] asfgit closed pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files

2020-09-09 Thread GitBox


asfgit closed pull request #3819:
URL: https://github.com/apache/carbondata/pull/3819


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (CARBONDATA-3855) Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.

2020-09-09 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-3855:
-
Fix Version/s: 2.1.0

> Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.
> --
>
> Key: CARBONDATA-3855
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3855
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Nihal kumar ojha
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: CarbonData SDK support load from file.pdf
>
>  Time Spent: 34h 20m
>  Remaining Estimate: 0h
>
> Please find the solution document attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] kunal642 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


kunal642 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689663325


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] asfgit closed pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-09-09 Thread GitBox


asfgit closed pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (CARBONDATA-3865) Implement delete and update feature in carbondata SDK.

2020-09-09 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3865.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Implement delete and update feature in carbondata SDK.
> --
>
> Key: CARBONDATA-3865
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3865
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Karanpreet Singh
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: Implement delete and update feature in carbondata 
> SDK.pdf, Implement delete and update feature in carbondata SDK_V2.pdf
>
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Please find the design document attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #1351: Add maven-antrun-plugin into build profiles so that class files of different profiles do not clash when genrating coverage report

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #1351:
URL: https://github.com/apache/carbondata/pull/1351#issuecomment-689669748


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4024/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689682451


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2284/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689683149


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4022/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3787: [WIP][CARBONDATA-3923] support global sort for SI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3787:
URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689693694


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4023/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3787: [WIP][CARBONDATA-3923] support global sort for SI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3787:
URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689696689


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2285/
   







[GitHub] [carbondata] Karan-c980 commented on pull request #3876: TestingCI

2020-09-09 Thread GitBox


Karan-c980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-689721230


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [WIP] Added Hive test for read and local dictionary support.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-689764177


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2286/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [WIP] Added Hive test for read and local dictionary support.

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-689767522


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4025/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-689784286


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2287/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-689784831


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4026/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689856094


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4028/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689858430


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2289/
   







[GitHub] [carbondata] QiangCai commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


QiangCai commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689946875


   Please describe the details of the strategy.







[GitHub] [carbondata] QiangCai commented on pull request #3902: [CARBONDATA-3961] reorder filter expression based on storage ordinal

2020-09-09 Thread GitBox


QiangCai commented on pull request #3902:
URL: https://github.com/apache/carbondata/pull/3902#issuecomment-689947452


   please rebase code







[GitHub] [carbondata] QiangCai commented on a change in pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning

2020-09-09 Thread GitBox


QiangCai commented on a change in pull request #3908:
URL: https://github.com/apache/carbondata/pull/3908#discussion_r486038703



##
File path: integration/spark/src/main/scala/org/apache/spark/util/PartitionCacheManger.scala
##
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util
+
+import java.net.URI
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.log4j.Logger
+import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTablePartition}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.cache.{Cache, Cacheable, CarbonLRUCache}
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.index.Segment
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, SegmentFileStore}
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, SegmentStatusManager}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object PartitionCacheManager extends Cache[PartitionCacheKey,
+  java.util.List[CatalogTablePartition]] {
+
+  private val CACHE = new CarbonLRUCache(
+    CarbonCommonConstants.CARBON_PARTITION_MAX_DRIVER_LRU_CACHE_SIZE,
+    CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT)
+
+  val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getName)
+
+  def get(identifier: PartitionCacheKey): java.util.List[CatalogTablePartition] = {
+    LOGGER.info("Reading partition values from store")
+    // read the tableStatus file to get valid and invalid segments
+    val validInvalidSegments = new SegmentStatusManager(AbsoluteTableIdentifier.from(
+      identifier.tablePath, null, null, identifier.tableId))
+      .getValidAndInvalidSegments
+    val cacheablePartitionSpecs = validInvalidSegments.getValidSegments.asScala.map { segment =>
+      val segmentFileName = segment.getSegmentFileName
+      val segmentFilePath = FileFactory.getCarbonFile(
+        CarbonTablePath.getSegmentFilePath(identifier.tablePath, segmentFileName))
+      // read the last modified time
+      val segmentFileModifiedTime = segmentFilePath.getLastModifiedTime
+      val existingCache = CACHE.get(identifier.tableId)
+      if (existingCache != null) {
+        val segmentCache = CACHE.get(identifier.tableId).asInstanceOf[CacheablePartitionSpec]
+          .partitionSpecs.get(segment.getSegmentNo)
+        segmentCache match {
+          case Some(c) =>
+            // check if cache needs to be updated
+            if (segmentFileModifiedTime > c._2) {
+              (segment.getSegmentNo, (readPartition(identifier,
+                segmentFilePath.getAbsolutePath), segmentFileModifiedTime))
+            } else {
+              (segment.getSegmentNo, c)
+            }
+          case None =>
+            (segment.getSegmentNo, (readPartition(identifier,
+              segmentFilePath.getAbsolutePath), segmentFileModifiedTime))
+        }
+      } else {
+        // read the partitions if not available in cache.
+        (segment.getSegmentNo, (readPartition(identifier,
+          segmentFilePath.getAbsolutePath), segmentFileModifiedTime))
+      }
+    }.toMap
+    // remove all invalid segment entries from cache
+    val finalCache = cacheablePartitionSpecs --
+      validInvalidSegments.getInvalidSegments.asScala.map(_.getSegmentNo)
+    val cacheObject = CacheablePartitionSpec(finalCache)
+    if (finalCache.nonEmpty) {
+      // remove the existing cache as new cache values may be added.
+      // CarbonLRUCache does not allow cache updation until time is expired.
+      // TODO: Need to fix!!
+      CACHE.remove(identifier.tableId)
+      CACHE.put(identifier.tableId,
+        cacheObject,
+        cacheObject.getMemorySize,
+        identifier.expirationTime)
+    }
+    finalCache.values.flatMap(_._1).toList.asJava
+  }
+
+  override def getAll(keys: util.List[PartitionCacheKey]):
+  util.List[util.List[CatalogTablePartition]] = {
+    keys.asScala.toList.map(get).asJava
+  }
+
+  override def getIfPre
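
The quoted `get` method follows a refresh-by-timestamp pattern: for each valid segment it compares the segment file's last-modified time against the cached entry, re-reads only stale segments, and then removes and re-inserts the whole table entry because CarbonLRUCache does not allow in-place updates. A minimal, self-contained Java sketch of that pattern (class and method names here are hypothetical stand-ins, not CarbonData APIs):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the per-segment partition cache: value = (partitions, mtime).
public class SegmentPartitionCache {
  public static final class Entry {
    public final String partitions;   // stands in for the parsed partition specs
    public final long modifiedTime;   // segment file mtime observed when the entry was read
    public Entry(String partitions, long modifiedTime) {
      this.partitions = partitions;
      this.modifiedTime = modifiedTime;
    }
  }

  private final Map<String, Entry> cache = new HashMap<>();

  // Re-read the segment only when its file is newer than what we cached.
  public Entry get(String segmentNo, long fileModifiedTime,
      Function<String, String> readPartition) {
    Entry cached = cache.get(segmentNo);
    if (cached == null || fileModifiedTime > cached.modifiedTime) {
      cached = new Entry(readPartition.apply(segmentNo), fileModifiedTime);
      // A cache that forbids in-place update forces remove-then-put, as in the PR.
      cache.remove(segmentNo);
      cache.put(segmentNo, cached);
    }
    return cached;
  }
}
```

The point of the pattern is that `readPartition` (the expensive segment-file read) runs only on a cache miss or when the file's mtime advances.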

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r486048290



##
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##
@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+List<PartitionSpec> filteredPartitions = new ArrayList<>();
+if (layout.getPartitionColumns().size() > 0 && layout.getPartitions().isPresent()) {
+  List<String> colNames =
+      layout.getPartitionColumns().stream().map(x -> ((HiveColumnHandle) x).getName())
+          .collect(Collectors.toList());
+  for (HivePartition partition : layout.getPartitions().get()) {
+    filteredPartitions.add(new PartitionSpec(colNames,

Review comment:
   I have verified; this scenario works. No need to modify any code.
   









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition pruning performance in presto carbon integration

2020-09-09 Thread GitBox


ajantha-bhat commented on a change in pull request #3913:
URL: https://github.com/apache/carbondata/pull/3913#discussion_r486048413



##
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java
##
@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+List<PartitionSpec> filteredPartitions = new ArrayList<>();
+if (layout.getPartitionColumns().size() > 0 && layout.getPartitions().isPresent()) {
+  List<String> colNames =
+      layout.getPartitionColumns().stream().map(x -> ((HiveColumnHandle) x).getName())
+          .collect(Collectors.toList());
+  for (HivePartition partition : layout.getPartitions().get()) {
+    filteredPartitions.add(new PartitionSpec(colNames,

Review comment:
   ```
   presto:redods> select dtm,hh from dw_log_ubt_partition_carbon_neww;
        dtm     |     hh
   -------------+------------
    part_dtm_01 | part_hh_01
    part_dtm_01 | part_hh_01
    part_dtm_01 | part_hh_02
    part_dtm_20 | part_hh_21
    part_dtm_01 | part_hh_03
    part_dtm_21 | NULL
   (6 rows)

   Query 20200910_035416_00017_wv9qh, FINISHED, 3 nodes
   Splits: 22 total, 22 done (100.00%)
   0:01 [6 rows, 176B] [9 rows/s, 282B/s]

   presto:redods> select dtm,hh from dw_log_ubt_partition_carbon_neww where (dtm = 'part_dtm_01' and hh = 'part_hh_03') or dtm='part_dtm_21';
        dtm     |     hh
   -------------+------------
    part_dtm_01 | part_hh_03
    part_dtm_21 | NULL
   (2 rows)

   Query 20200910_035548_00018_wv9qh, FINISHED, 3 nodes
   Splits: 18 total, 18 done (100.00%)
   0:01 [2 rows, 0B] [1 rows/s, 0B/s]
   ```
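
The two queries above demonstrate the effect of partition pruning: with the partition predicate pushed down, Presto schedules 18 splits instead of 22. Conceptually, the pruning step keeps only the partition specs that satisfy the pushed-down predicate before splits are generated. A minimal self-contained Java sketch of that idea (the class and method names are hypothetical, not the Presto or CarbonData API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: each partition is modeled as a column-name -> value map,
// mirroring how the PR builds filteredPartitions from layout.getPartitions().
public class PartitionPruner {
  // Keep only the partitions whose values satisfy the pushed-down predicate.
  public static List<Map<String, String>> prune(
      List<Map<String, String>> partitions,
      Predicate<Map<String, String>> predicate) {
    return partitions.stream().filter(predicate).collect(Collectors.toList());
  }
}
```

Only the surviving partitions are handed to split generation, which is why the filtered query reports fewer splits.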









[GitHub] [carbondata] kunal642 commented on a change in pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning

2020-09-09 Thread GitBox


kunal642 commented on a change in pull request #3908:
URL: https://github.com/apache/carbondata/pull/3908#discussion_r486057999



##
File path: integration/spark/src/main/scala/org/apache/spark/util/PartitionCacheManger.scala
##
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util
+
+import java.net.URI
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.log4j.Logger
+import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTablePartition}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.cache.{Cache, Cacheable, CarbonLRUCache}
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.index.Segment
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, SegmentFileStore}
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, SegmentStatusManager}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object PartitionCacheManager extends Cache[PartitionCacheKey,
+  java.util.List[CatalogTablePartition]] {
+
+  private val CACHE = new CarbonLRUCache(
+    CarbonCommonConstants.CARBON_PARTITION_MAX_DRIVER_LRU_CACHE_SIZE,
+    CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT)
+
+  val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getName)
+
+  def get(identifier: PartitionCacheKey): java.util.List[CatalogTablePartition] = {
+    LOGGER.info("Reading partition values from store")
+    // read the tableStatus file to get valid and invalid segments
+    val validInvalidSegments = new SegmentStatusManager(AbsoluteTableIdentifier.from(
+      identifier.tablePath, null, null, identifier.tableId))
+      .getValidAndInvalidSegments
+    val cacheablePartitionSpecs = validInvalidSegments.getValidSegments.asScala.map { segment =>
+      val segmentFileName = segment.getSegmentFileName
+      val segmentFilePath = FileFactory.getCarbonFile(
+        CarbonTablePath.getSegmentFilePath(identifier.tablePath, segmentFileName))
+      // read the last modified time
+      val segmentFileModifiedTime = segmentFilePath.getLastModifiedTime
+      val existingCache = CACHE.get(identifier.tableId)
+      if (existingCache != null) {
+        val segmentCache = CACHE.get(identifier.tableId).asInstanceOf[CacheablePartitionSpec]
+          .partitionSpecs.get(segment.getSegmentNo)
+        segmentCache match {
+          case Some(c) =>
+            // check if cache needs to be updated
+            if (segmentFileModifiedTime > c._2) {
+              (segment.getSegmentNo, (readPartition(identifier,
+                segmentFilePath.getAbsolutePath), segmentFileModifiedTime))
+            } else {
+              (segment.getSegmentNo, c)
+            }
+          case None =>
+            (segment.getSegmentNo, (readPartition(identifier,
+              segmentFilePath.getAbsolutePath), segmentFileModifiedTime))
+        }
+      } else {
+        // read the partitions if not available in cache.
+        (segment.getSegmentNo, (readPartition(identifier,
+          segmentFilePath.getAbsolutePath), segmentFileModifiedTime))
+      }
+    }.toMap
+    // remove all invalid segment entries from cache
+    val finalCache = cacheablePartitionSpecs --
+      validInvalidSegments.getInvalidSegments.asScala.map(_.getSegmentNo)
+    val cacheObject = CacheablePartitionSpec(finalCache)
+    if (finalCache.nonEmpty) {
+      // remove the existing cache as new cache values may be added.
+      // CarbonLRUCache does not allow cache updation until time is expired.
+      // TODO: Need to fix!!
+      CACHE.remove(identifier.tableId)
+      CACHE.put(identifier.tableId,
+        cacheObject,
+        cacheObject.getMemorySize,
+        identifier.expirationTime)
+    }
+    finalCache.values.flatMap(_._1).toList.asJava
+  }
+
+  override def getAll(keys: util.List[PartitionCacheKey]):
+  util.List[util.List[CatalogTablePartition]] = {
+    keys.asScala.toList.map(get).asJava
+  }
+
+  override def getIfPre

[GitHub] [carbondata] vikramahuja1001 commented on pull request #3917: [WIP] clean files refactor

2020-09-09 Thread GitBox


vikramahuja1001 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-689983615


   retest this please







[jira] [Created] (CARBONDATA-3978) Adding support for trash folder, where all the carbondata files will go after deletion of a segment, and clean files refactoring

2020-09-09 Thread Vikram Ahuja (Jira)
Vikram Ahuja created CARBONDATA-3978:


 Summary: Adding support for trash folder, where all the carbondata files will go after deletion of a segment, and clean files refactoring
 Key: CARBONDATA-3978
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3978
 Project: CarbonData
  Issue Type: Improvement
Reporter: Vikram Ahuja








[GitHub] [carbondata] ajantha-bhat commented on pull request #3787: [CARBONDATA-3923] support global sort for SI

2020-09-09 Thread GitBox


ajantha-bhat commented on pull request #3787:
URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689992523


   @QiangCai, @akashrn5, @kunal642: please start review. Complex-type test cases and some more test cases can be added; I will add them in 2 hours.







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3908:
URL: https://github.com/apache/carbondata/pull/3908#issuecomment-690020844


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2290/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3902: [CARBONDATA-3961] reorder filter expression based on storage ordinal

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3902:
URL: https://github.com/apache/carbondata/pull/3902#issuecomment-690025455


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4030/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3908:
URL: https://github.com/apache/carbondata/pull/3908#issuecomment-690026266


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4029/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3902: [CARBONDATA-3961] reorder filter expression based on storage ordinal

2020-09-09 Thread GitBox


CarbonDataQA1 commented on pull request #3902:
URL: https://github.com/apache/carbondata/pull/3902#issuecomment-690030669


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2291/
   


