[GitHub] [carbondata] ajantha-bhat opened a new pull request #3924: [WIP] Allow SI creation on first dimension column

2020-09-13 Thread GitBox


ajantha-bhat opened a new pull request #3924:
URL: https://github.com/apache/carbondata/pull/3924


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 commented on pull request #3911: [CARBONDATA-3793]Fix update and delete issue when multiple partition columns are present and clean files issue

2020-09-13 Thread GitBox


kunal642 commented on pull request #3911:
URL: https://github.com/apache/carbondata/pull/3911#issuecomment-691806899


   retest this please







[jira] [Created] (CARBONDATA-3985) Optimize the segment-timestamp file clean up

2020-09-13 Thread suwen (Jira)
suwen created CARBONDATA-3985:
-

 Summary: Optimize the segment-timestamp file clean up
 Key: CARBONDATA-3985
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3985
 Project: CarbonData
  Issue Type: Improvement
  Components: core, spark-integration
Reporter: suwen


During a data update, CarbonProjectForUpdateCommand checks the status of each 
segment after the delete delta files have been generated. If a segment's status 
is not successful, all the segment directories are traversed to clean up the 
.carbondata, .carbonindex and .deletedelta files that carry the update 
timestamp.

If the partition directory contains a great many segments, this traversal is 
very time-consuming.

Only the files in the segment directories involved in this update actually 
need to be cleaned up.

Proposed change: while generating the delete delta, record the segment paths 
involved in this update. Then, in checkAndUpdateStatusFiles(), if a segment's 
status is found to be not successful, clean up directly from the recorded 
segment path list instead of searching all the segment directories.
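
The idea can be illustrated with a minimal Java sketch. This is not CarbonData 
code: the class TouchedSegmentCleaner, its methods, and the assumption that a 
stale file can be recognised by its extension plus the update timestamp in its 
name are all hypothetical, chosen only to show the "record, then clean only 
what was recorded" pattern.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of CarbonData. */
class TouchedSegmentCleaner {
  // Segment directories written to by the current update (no duplicates).
  private final Set<Path> touchedSegmentDirs = new LinkedHashSet<>();

  /** Call while delete delta files are generated, once per affected segment. */
  void recordSegment(Path segmentDir) {
    touchedSegmentDirs.add(segmentDir);
  }

  /**
   * Deletes files stamped with the given timestamp, looking only inside the
   * recorded directories. Returns the number of files deleted.
   */
  int cleanUp(String timestamp) throws IOException {
    int deleted = 0;
    for (Path dir : touchedSegmentDirs) {
      if (!Files.isDirectory(dir)) {
        continue;
      }
      // Collect first, delete afterwards, so the listing is not modified mid-scan.
      List<Path> stale = new ArrayList<>();
      try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
        for (Path file : files) {
          if (isStale(file.getFileName().toString(), timestamp)) {
            stale.add(file);
          }
        }
      }
      for (Path file : stale) {
        Files.delete(file);
        deleted++;
      }
    }
    return deleted;
  }

  /** Assumption: stale files have one of the three extensions and embed the timestamp. */
  static boolean isStale(String fileName, String timestamp) {
    boolean knownType = fileName.endsWith(".carbondata")
        || fileName.endsWith(".carbonindex")
        || fileName.endsWith(".deletedelta");
    return knownType && fileName.contains(timestamp);
  }
}
```

With this shape, the cost of the clean-up depends on the number of segments 
the update touched rather than on the total number of segment directories.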



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] BrooksLI commented on a change in pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be copied to

2020-09-13 Thread GitBox


BrooksLI commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r486292381



##
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -1106,19 +1107,39 @@ public static void cleanSegments(CarbonTable table, List partitio
*/
   public static void deleteSegment(String tablePath, Segment segment,
   List<PartitionSpec> partitionSpecs,
-  SegmentUpdateStatusManager updateStatusManager) throws Exception {
+  SegmentUpdateStatusManager updateStatusManager, String tableName, String DatabaseName)
+  throws Exception {
 SegmentFileStore fileStore = new SegmentFileStore(tablePath, segment.getSegmentFileName());
 List<String> indexOrMergeFiles = fileStore.readIndexFiles(SegmentStatus.SUCCESS, true,
 FileFactory.getConfiguration());
 Map<String, List<String>> indexFilesMap = fileStore.getIndexFilesMap();
 for (Map.Entry<String, List<String>> entry : indexFilesMap.entrySet()) {
+  // If the file to be deleted is a carbondata file, copy that file to the trash folder.
+  if (entry.getKey().endsWith(".carbondata")) {

Review comment:
   Can we extract the method rather than duplicate it?









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] Compaction on table having range column after altering datatype from string to long string fails.

2020-09-13 Thread GitBox


CarbonDataQA1 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-691721507


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2317/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] Compaction on table having range column after altering datatype from string to long string fails.

2020-09-13 Thread GitBox


CarbonDataQA1 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-691721064


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4055/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-13 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-691717134


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2316/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-13 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-691716955


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4054/
   







[GitHub] [carbondata] Karan980 opened a new pull request #3923: [CARBONDATA-3984] Compaction on table having range column after altering datatype from string to long string fails.

2020-09-13 Thread GitBox


Karan980 opened a new pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923


### Why is this PR needed?
   When the data type of a String column that is also configured as the range 
column in the table properties is altered to a long string column, compaction 
on the table fails with the following error:
   **VARCHAR not supported for the filter expression;** 
   

### What changes were proposed in this PR?
   Added condition for VARCHAR datatype in all conditional expressions.
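
   As a rough illustration of what "adding a condition for VARCHAR" can look 
like, here is a minimal Java sketch. It is not the code changed by this PR: 
the class RangeFilterTypes, the enum ColumnType and both methods are 
hypothetical, and the only point is that a branch which handles STRING also 
needs to accept VARCHAR once the column has been altered to a long string.

```java
/** Hypothetical illustration, not CarbonData code. */
class RangeFilterTypes {
  /** Stand-in for the column data types relevant here. */
  enum ColumnType { STRING, VARCHAR, INT, LONG }

  /** Assumed old behaviour: VARCHAR falls through to the unsupported branch. */
  static Object toFilterValueOld(ColumnType type, String raw) {
    switch (type) {
      case STRING:
        return raw;
      case INT:
        return Integer.valueOf(raw);
      case LONG:
        return Long.valueOf(raw);
      default:
        throw new UnsupportedOperationException(
            type + " not supported for the filter expression");
    }
  }

  /** Sketch of the fix: VARCHAR is handled exactly like STRING. */
  static Object toFilterValue(ColumnType type, String raw) {
    switch (type) {
      case STRING:
      case VARCHAR:
        return raw;
      case INT:
        return Integer.valueOf(raw);
      case LONG:
        return Long.valueOf(raw);
      default:
        throw new UnsupportedOperationException(
            type + " not supported for the filter expression");
    }
  }
}
```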
   
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- No
   
   
   







[jira] [Created] (CARBONDATA-3984) compaction on table having range column after altering data type from string to long string fails.

2020-09-13 Thread Karan (Jira)
Karan created CARBONDATA-3984:
-

 Summary: compaction on table having range column after altering 
data type from string to long string fails.
 Key: CARBONDATA-3984
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3984
 Project: CarbonData
  Issue Type: Bug
  Components: core, spark-integration
Affects Versions: 2.0.0
Reporter: Karan


When the data type of a String column that is also configured as the range 
column in the table properties is altered to a long string column, compaction 
on the table fails with the following error:

 

VARCHAR not supported for the filter expression;
  at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
  at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:227)
  at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:104)
  at org.apache.carbondata.spark.rdd.CarbonRDD.compute





[GitHub] [carbondata] Karan-c980 commented on pull request #3876: TestingCI

2020-09-13 Thread GitBox


Karan-c980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-691702741


   retest this please







[jira] [Resolved] (CARBONDATA-3982) Use Partition instead of Span to split legacy and non-legacy segments for executor distribution in indexserver

2020-09-13 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3982.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Use Partition instead of Span to split legacy and non-legacy segments for 
> executor distribution in indexserver 
> ---
>
> Key: CARBONDATA-3982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3982
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[GitHub] [carbondata] asfgit closed pull request #3918: [CARBONDATA-3982] Use Partition instead of Span to split legacy and non-legacy segments for executor distribution in indexserver

2020-09-13 Thread GitBox


asfgit closed pull request #3918:
URL: https://github.com/apache/carbondata/pull/3918


   







[GitHub] [carbondata] kunal642 commented on pull request #3918: [CARBONDATA-3982] Use Partition instead of Span to split legacy and non-legacy segments for executor distribution in indexserver

2020-09-13 Thread GitBox


kunal642 commented on pull request #3918:
URL: https://github.com/apache/carbondata/pull/3918#issuecomment-691624940


   LGTM


