[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

2021-02-02 Thread GitBox


ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569160839



##
File path: 
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
 sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
   s"'format'='carbon')")
 sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-assert(sql("show segments for table maintable1_si").collect().length ==
-  sql("show segments for table maintable1").collect().length)
+assert(sql("show segments for table maintable1_si").collect().length == 2)
+assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
   Disabled the SI table after alter add load, and added a check in the test 
cases to verify it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

2021-02-02 Thread GitBox


ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569160528



##
File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##
@@ -221,7 +223,13 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
   indexUniqueId = in.readUTF();
 }
 String filePath = getPath();
-if (filePath.startsWith(File.separator)) {
+boolean isLocalFile = FileFactory.getCarbonFile(filePath) instanceof LocalCarbonFile;
+// If it is external segment path, table path need not be appended to filePath
+// Example filepath: hdfs://hacluster/opt/newsegmentpath/
+// filePath value would start with hdfs:// or s3:// . If it is local
+// ubuntu storage, it starts with File separator, so check if given path exists or not.
+if ((!isLocalFile && filePath.startsWith(File.separator)) || (isLocalFile && !FileFactory

Review comment:
   ok done
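The check under review decides whether the table path must be prepended to a blocklet's file path. As a language-neutral sketch of that decision (illustrative helper names, not the actual CarbonData API; the real check uses `FileFactory` and `LocalCarbonFile`):

```python
def needs_table_path_prefix(file_path, is_local_file, path_exists):
    """Sketch of the deserializeFields path check.

    External segment paths arrive as full URIs (hdfs://..., s3://...) and
    need no table-path prefix. On local storage a path beginning with the
    file separator is ambiguous - it may be an absolute external-segment
    path or a table-relative path - so existence on disk decides.
    """
    if not is_local_file:
        # Non-local store: only a separator-prefixed (table-relative) path
        # still needs the table path prepended.
        return file_path.startswith("/")
    # Local store: prepend only when the path does not resolve on its own.
    return not path_exists(file_path)

# An HDFS URI needs no prefix; a local table-relative path that does not
# exist on its own does.
assert not needs_table_path_prefix(
    "hdfs://hacluster/opt/newsegmentpath/part-0.carbondata", False, lambda p: True)
assert needs_table_path_prefix(
    "/Fact/Part0/Segment_0/part-0.carbondata", True, lambda p: False)
```

This mirrors the two-sided condition in the hunk: non-local paths are tested by prefix, local paths by existence.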









[GitHub] [carbondata] vikramahuja1001 commented on pull request #4072: [CARBONDATA-4110] Support clean files dry run operation and show statistics after clean files operation

2021-02-02 Thread GitBox


vikramahuja1001 commented on pull request #4072:
URL: https://github.com/apache/carbondata/pull/4072#issuecomment-772241927


   @QiangCai @ajantha-bhat @akashrn5 please review







[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI.

2021-02-02 Thread GitBox


ajantha-bhat edited a comment on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771551095


   The title of the PR can be more specific, like "**Data mismatch issue in SI 
global sort merge scenario**".
   @Karan980: please consider this point from next time. I have changed it 
while merging now.







[GitHub] [carbondata] ajantha-bhat commented on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI.

2021-02-02 Thread GitBox


ajantha-bhat commented on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771548605











[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4076: [CARBONDATA-4107] Added related MV tables Map to fact table and added lock while touchMDTFile

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4076:
URL: https://github.com/apache/carbondata/pull/4076#issuecomment-771646152


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5412/
   







[GitHub] [carbondata] asfgit closed pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI global sort merge scenario.

2021-02-02 Thread GitBox


asfgit closed pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083


   







[GitHub] [carbondata] kunal642 commented on a change in pull request #4070: [CARBONDATA-4082] Fix alter table add segment query on adding a segment having delete delta files.

2021-02-02 Thread GitBox


kunal642 commented on a change in pull request #4070:
URL: https://github.com/apache/carbondata/pull/4070#discussion_r568530696



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
##
@@ -294,6 +297,49 @@ case class CarbonAddLoadCommand(
   OperationListenerBus.getInstance().fireEvent(loadTablePreStatusUpdateEvent, operationContext)
 }
 
+val deltaFiles = FileFactory.getCarbonFile(segmentPath).listFiles()
+  .filter(_.getName.endsWith(CarbonCommonConstants.DELETE_DELTA_FILE_EXT))
+if (deltaFiles.length > 0) {
+  val blockNameToDeltaFilesMap =
+    collection.mutable.Map[String, collection.mutable.ListBuffer[(CarbonFile, String)]]()
+  deltaFiles.foreach { deltaFile =>
+    val tmpDeltaFilePath = deltaFile.getAbsolutePath
+      .replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+        CarbonCommonConstants.FILE_SEPARATOR)
+    val deltaFilePathElements = tmpDeltaFilePath.split(CarbonCommonConstants.FILE_SEPARATOR)
+    if (deltaFilePathElements != null && deltaFilePathElements.nonEmpty) {
+      val deltaFileName = deltaFilePathElements(deltaFilePathElements.length - 1)
+      val blockName = CarbonTablePath.DataFileUtil
+        .getBlockNameFromDeleteDeltaFile(deltaFileName)
+      if (blockNameToDeltaFilesMap.contains(blockName)) {
+        blockNameToDeltaFilesMap(blockName) += ((deltaFile, deltaFileName))
+      } else {
+        val deltaFileList = new ListBuffer[(CarbonFile, String)]()
+        deltaFileList += ((deltaFile, deltaFileName))
+        blockNameToDeltaFilesMap.put(blockName, deltaFileList)
+      }
+    }
+  }
+  val segmentUpdateDetails = new util.ArrayList[SegmentUpdateDetails]()
+  val columnCompressor = CompressorFactory.getInstance.getCompressor.getName
+  blockNameToDeltaFilesMap.foreach { entry =>
+    val segmentUpdateDetail = new SegmentUpdateDetails()
+    segmentUpdateDetail.setBlockName(entry._1)
+    segmentUpdateDetail.setActualBlockName(
+      entry._1 + CarbonCommonConstants.POINT + columnCompressor +
+        CarbonCommonConstants.FACT_FILE_EXT)
+    segmentUpdateDetail.setSegmentName(model.getSegmentId)
+    setMinMaxDeltaStampAndDeletedRowCount(entry._2, segmentUpdateDetail)
+    segmentUpdateDetails.add(segmentUpdateDetail)
+  }
+  val timestamp = System.currentTimeMillis().toString
+  val segmentDetails = new util.HashSet[Segment]()
+  segmentDetails.add(model.getSegment)
+  CarbonUpdateUtil.updateSegmentStatus(segmentUpdateDetails, carbonTable, timestamp, false)

Review comment:
   Can we pass a check like forceWrite in updateSegmentStatus to avoid 
validating the segment against the tablestatus file? This flag would be true in 
the add load command when a delete delta is present. That way you can avoid 
writing twice.
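The suggested control flow - a force flag that skips the tablestatus validation so the update details are written in one pass - can be sketched as follows (hypothetical names and structure, not the real CarbonUpdateUtil signature):

```python
def update_segment_status(details, valid_segments, force_write=False):
    """Hypothetical sketch of the suggested forceWrite flag.

    Without force_write, details for segments not yet recorded in the
    tablestatus file are filtered out; with force_write (the add-load
    case, where the new segment is not in tablestatus yet), validation
    is skipped and everything is written in a single pass.
    """
    if not force_write:
        details = [d for d in details if d["segment"] in valid_segments]
    # ... persist `details` to the update/tablestatus files here ...
    return details

# Segment "2" is not yet in tablestatus during add load; force_write keeps it.
pending = [{"segment": "2", "block": "part-0"}]
assert update_segment_status(pending, valid_segments=set()) == []
assert update_segment_status(pending, valid_segments=set(), force_write=True) == pending
```

The point of the flag is purely to avoid the double write the reviewer mentions, not to change what ends up in the status file.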

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
##
@@ -369,5 +426,64 @@ case class CarbonAddLoadCommand(
 }
   }
 
+  /**
+   * If there is more than one delete delta file present for a block, this method
+   * picks the delta file with the highest timestamp, because the default threshold
+   * for horizontal compaction is 1. It is assumed that the threshold for horizontal
+   * compaction is not changed from the default value, so there will always be only
+   * one valid delete delta file present for a block. It also sets the number of
+   * deleted rows for a segment.
+   */
+  def setValidDeltaFileAndDeletedRowCount(
+      deleteDeltaFiles : ListBuffer[(CarbonFile, String)],
+      segmentUpdateDetails : SegmentUpdateDetails
+  ) : Unit = {
+    var maxDeltaStamp : Long = -1
+    var deletedRowsCount : Long = 0
+    var validDeltaFile : CarbonFile = null
+    deleteDeltaFiles.foreach { deltaFile =>
+      val currentFileTimestamp = CarbonTablePath.DataFileUtil
+        .getTimeStampFromDeleteDeltaFile(deltaFile._2)
+      if (currentFileTimestamp.toLong > maxDeltaStamp) {
+        maxDeltaStamp = currentFileTimestamp.toLong
+        validDeltaFile = deltaFile._1
+      }
+    }
+    val blockDetails =
+      new CarbonDeleteDeltaFileReaderImpl(validDeltaFile.getAbsolutePath).readJson()
+    blockDetails.getBlockletDetails.asScala.foreach { blocklet =>
+      deletedRowsCount = deletedRowsCount + blocklet.getDeletedRows.size()
+    }
+    segmentUpdateDetails.setDeleteDeltaStartTimestamp(maxDeltaStamp.toString)
+    segmentUpdateDetails.setDeleteDeltaEndTimestamp(maxDeltaStamp.toString)
+    segmentUpdateDetails.setDeletedRowsInBlock(deletedRowsCount.toString)
+  }
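The per-block selection this method performs - keep only the delete delta file with the highest timestamp, then total the deleted rows across blocklets - can be sketched compactly (illustrative names, under the same single-valid-file assumption stated in the method's comment):

```python
def pick_valid_delta_file(delta_files):
    """delta_files: list of (file_name, timestamp) pairs for one block.

    Returns the file with the highest timestamp, mirroring the assumption
    that horizontal compaction leaves exactly one valid delta file per block.
    """
    return max(delta_files, key=lambda f: f[1])[0]

def count_deleted_rows(blocklet_details):
    """blocklet_details: per-blocklet lists of deleted row ids; returns the total."""
    return sum(len(rows) for rows in blocklet_details)

files = [("part-0-100.deletedelta", 100), ("part-0-250.deletedelta", 250)]
assert pick_valid_delta_file(files) == "part-0-250.deletedelta"
assert count_deleted_rows([[1, 5, 9], [2]]) == 4
```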
+
+  /**
+   * As horizontal compaction is not supported for SDK segments, all delta files are valid.
+   */
+  def readAllDeltaFiles(
+      deleteDeltaFiles : ListBuffer[(CarbonFile, String)],
+      segmentUpdateDetails 

[GitHub] [carbondata] nihal0107 edited a comment on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


nihal0107 edited a comment on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771464559


   > > If we enable the property `ENABLE_AUTO_LOAD_MERGE` then which segment id 
are we planning to show, the segment generated after compaction or before 
compaction? Better to add a test case for that scenario also.
   > 
   > If we enable 'AUTO_LOAD_MERGE', then we return and show the segment id 
from before compaction, since the user would focus on his load operation. A 
test case has been added. Please review.
   
   We are showing the segment id because it helps when we need to query a 
specific segment. But if we show the segment id from before compaction and 
then query that specific segment, the operation will fail. Better to take the 
community's opinion on this.
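The concern can be illustrated with a small sketch of segment naming after auto load merge (illustrative logic, not the real store code; CarbonData names a merged segment `<first>.<level>`, e.g. loads 0-3 become "0.1"):

```python
def queryable_segments(loaded_ids, auto_merge_threshold=4):
    """Illustrative sketch: once auto load merge compacts the first
    `auto_merge_threshold` loads, their ids are replaced by one merged
    segment named "<first>.1", so the ids handed back at load time are
    no longer valid targets for segment-scoped queries.
    """
    if len(loaded_ids) < auto_merge_threshold:
        return set(loaded_ids)
    merged = f"{loaded_ids[0]}.1"
    return {merged} | set(loaded_ids[auto_merge_threshold:])

# The id "0" returned to the user at load time is gone after auto merge.
segs = queryable_segments(["0", "1", "2", "3"])
assert segs == {"0.1"}
assert "0" not in segs
```

This is exactly the failure mode described above: the returned pre-compaction id no longer names a queryable segment.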







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


ajantha-bhat commented on a change in pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#discussion_r568528970



##
File path: 
integration/spark/src/test/scala/org/apache/spark/util/CarbonCommandSuite.scala
##
@@ -82,6 +83,43 @@ class CarbonCommandSuite extends QueryTest with BeforeAndAfterAll {
""".stripMargin)
   }
 
+  protected def createTestTable(tableName: String): Unit = {
+sql(
+  s"""

Review comment:
   Instead of adding a new testcase, add a validation for the segment id in 
one of the existing testcases for **loading, insert into, partition table 
loading, partition table insert into**.
   

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -276,7 +280,15 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
 }
 throw ex
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {
+  Seq(Row(loadResultForReturn.getLoadName))
+} else {
+  rowsForReturn

Review comment:
   Why are you returning the number of rows instead of the segment id here? 
When will the code enter here when the load is successful?
   Can you add some comments?

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -276,7 +280,15 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
 }
 throw ex
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {

Review comment:
   I think the code is not formatted; we follow a space after `if` and around `!=`.

##
File path: 
integration/spark/src/test/scala/org/apache/spark/util/CarbonCommandSuite.scala
##
@@ -100,6 +138,56 @@ class CarbonCommandSuite extends QueryTest with BeforeAndAfterAll {
   private lazy val location = CarbonProperties.getStorePath()


+  test("Return segment ID after load and insert") {
+    val tableName = "test_table"
+    val inputTableName = "csv_table"
+    val inputPath = s"$resourcesPath/data_alltypes.csv"
+    dropTable(tableName)
+    dropTable(inputTableName)
+    createAndLoadInputTable(inputTableName, inputPath)
+    createTestTable(tableName)
+    checkAnswer(sql(
+      s"""
+         | INSERT INTO TABLE $tableName
+         | SELECT shortField, intField, bigintField, doubleField, stringField,
+         | from_unixtime(unix_timestamp(timestampField,'/M/dd')) timestampField, decimalField,
+         | cast(to_date(from_unixtime(unix_timestamp(dateField,'/M/dd'))) as date), charField
+         | FROM $inputTableName
+       """.stripMargin), Seq(Row("0")))
+    checkAnswer(sql(
+      s"LOAD DATA LOCAL INPATH '$inputPath'" +
+      s" INTO TABLE $tableName" +
+      " OPTIONS('FILEHEADER'=" +
+      "'shortField,intField,bigintField,doubleField,stringField," +
+      "timestampField,decimalField,dateField,charField')"), Seq(Row("1")))

Review comment:
   Is it possible to return text like "Successfully loaded to segment id : 1" 
instead of returning just "1"?

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##
@@ -191,7 +196,15 @@ case class CarbonLoadDataCommand(databaseNameOp: Option[String],
   throw ex
 }
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {

Review comment:
   @QiangCai, @ydvpankaj99: why is our checkstyle not catching these format issues?


##
File path: 
integration/spark/src/test/scala/org/apache/spark/util/CarbonCommandSuite.scala
##
@@ -100,6 +138,56 @@ class CarbonCommandSuite extends QueryTest with BeforeAndAfterAll {
   private lazy val location = CarbonProperties.getStorePath()


+  test("Return segment ID after load and insert") {
+    val tableName = "test_table"
+    val inputTableName = "csv_table"
+    val inputPath = s"$resourcesPath/data_alltypes.csv"
+    dropTable(tableName)
+    dropTable(inputTableName)
+    createAndLoadInputTable(inputTableName, inputPath)
+    createTestTable(tableName)
+    checkAnswer(sql(
+      s"""
+         | INSERT INTO TABLE $tableName
+         | SELECT shortField, intField, bigintField, doubleField, stringField,
+         | from_unixtime(unix_timestamp(timestampField,'/M/dd')) timestampField, decimalField,
+  

[GitHub] [carbondata] ajantha-bhat commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


ajantha-bhat commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771560715











[GitHub] [carbondata] asfgit closed pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


asfgit closed pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084


   







[GitHub] [carbondata] Karan980 commented on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI global sort merge scenario.

2021-02-02 Thread GitBox


Karan980 commented on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771551639


   > The title of the PR can be more specific like "**Data mismatch issue in SI global sort merge scenario**"
   
   Done







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4076: [CARBONDATA-4107] Added related MV tables Map to fact table and added lock while touchMDTFile

2021-02-02 Thread GitBox


akashrn5 commented on a change in pull request #4076:
URL: https://github.com/apache/carbondata/pull/4076#discussion_r568501151



##
File path: docs/mv-guide.md
##
@@ -241,6 +242,10 @@ The current information includes:
 | Refresh Mode          | FULL / INCREMENTAL refresh to MV                     |
 | Refresh Trigger Mode  | ON_COMMIT / ON_MANUAL refresh to MV provided by user |
 | Properties            | Table properties of the materialized view            |
+
+**NOTE**: For materialized views created before
+[CARBONDATA-4107](https://issues.apache.org/jira/browse/CARBONDATA-4107) issue fix, run
+refresh mv command to add mv name to fact table property and to enable it.

Review comment:
   also, please add what happens if the user doesn't run refresh









[GitHub] [carbondata] ShreelekhyaG commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


ShreelekhyaG commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771481950


   retest this please







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

2021-02-02 Thread GitBox


akashrn5 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r568509426



##
File path: 
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
 sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
   s"'format'='carbon')")
 sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-assert(sql("show segments for table maintable1_si").collect().length ==
-  sql("show segments for table maintable1").collect().length)
+assert(sql("show segments for table maintable1_si").collect().length == 2)
+assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
   also have an assert checking that the SI table is disabled and that the 
query doesn't hit SI

##
File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##
@@ -221,7 +223,13 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
   indexUniqueId = in.readUTF();
 }
 String filePath = getPath();
-if (filePath.startsWith(File.separator)) {
+boolean isLocalFile = FileFactory.getCarbonFile(filePath) instanceof LocalCarbonFile;
+// If it is external segment path, table path need not be appended to filePath
+// Example filepath: hdfs://hacluster/opt/newsegmentpath/
+// filePath value would start with hdfs:// or s3:// . If it is local
+// ubuntu storage, it starts with File separator, so check if given path exists or not.
+if ((!isLocalFile && filePath.startsWith(File.separator)) || (isLocalFile && !FileFactory

Review comment:
   the comment is not clear, please rewrite it with a better example and 
scenarios













[GitHub] [carbondata] areyouokfreejoe closed pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


areyouokfreejoe closed pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086


   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771459205











[GitHub] [carbondata] kunal642 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


kunal642 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771412881











[GitHub] [carbondata] areyouokfreejoe commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


areyouokfreejoe commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771456273











[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771318129











[GitHub] [carbondata] ydvpankaj99 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


ydvpankaj99 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771448246


   retest this please







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771387700











[GitHub] [carbondata] nihal0107 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


nihal0107 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771385539











[GitHub] [carbondata] nihal0107 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


nihal0107 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771354445











[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-771881699


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3652/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-771878716


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5413/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4076: [CARBONDATA-4107] Added related MV tables Map to fact table and added lock while touchMDTFile

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4076:
URL: https://github.com/apache/carbondata/pull/4076#issuecomment-771647871


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3651/
   











[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771619559


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3650/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771616575


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5409/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771610225


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5410/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771609869


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3649/
   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


ajantha-bhat commented on a change in pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#discussion_r568524277



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##
@@ -191,7 +196,15 @@ case class CarbonLoadDataCommand(databaseNameOp: 
Option[String],
   throw ex
 }
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {

Review comment:
   @QiangCai, @ydvpankaj99 : why is our checkstyle not catching these formatting issues?









[GitHub] [carbondata] kunal642 commented on a change in pull request #4070: [CARBONDATA-4082] Fix alter table add segment query on adding a segment having delete delta files.

2021-02-02 Thread GitBox


kunal642 commented on a change in pull request #4070:
URL: https://github.com/apache/carbondata/pull/4070#discussion_r568534198



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
##
@@ -294,6 +297,49 @@ case class CarbonAddLoadCommand(
   
OperationListenerBus.getInstance().fireEvent(loadTablePreStatusUpdateEvent, 
operationContext)
 }
 
+val deltaFiles = FileFactory.getCarbonFile(segmentPath).listFiles()

Review comment:
   Better to use CarbonFileFilter to list only the delete delta files
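   A minimal sketch of this suggestion, assuming CarbonData's `CarbonFileFilter` callback interface and a `listFiles(CarbonFileFilter)` overload on `CarbonFile`; this is illustrative and not taken from the PR itself:

   ```scala
   // Sketch only: list just the delete delta files instead of listing
   // everything and filtering afterwards. Assumes the CarbonData core
   // file abstraction (CarbonFile, CarbonFileFilter, FileFactory).
   val deltaFiles = FileFactory.getCarbonFile(segmentPath).listFiles(
     new CarbonFileFilter {
       override def accept(file: CarbonFile): Boolean =
         file.getName.endsWith(CarbonCommonConstants.DELETE_DELTA_FILE_EXT)
     })
   ```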









[GitHub] [carbondata] kunal642 commented on a change in pull request #4070: [CARBONDATA-4082] Fix alter table add segment query on adding a segment having delete delta files.

2021-02-02 Thread GitBox


kunal642 commented on a change in pull request #4070:
URL: https://github.com/apache/carbondata/pull/4070#discussion_r568531096



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
##
@@ -369,5 +426,64 @@ case class CarbonAddLoadCommand(
 }
   }
 
+  /**
+   * If there are more than one deleteDelta File present  for a block. Then 
This method
+   * will pick the deltaFile with highest timestamp, because the default 
threshold for horizontal
+   * compaction is 1. It is assumed that threshold for horizontal compaction 
is not changed from
+   * default value. So there will always be only one valid delete delta file 
present for a block.
+   * It also sets the number of deleted rows for a segment.
+   */
+  def setValidDeltaFileAndDeletedRowCount(
+  deleteDeltaFiles : ListBuffer[(CarbonFile, String)],
+  segmentUpdateDetails : SegmentUpdateDetails
+  ) : Unit = {
+var maxDeltaStamp : Long = -1
+var deletedRowsCount : Long = 0
+var validDeltaFile : CarbonFile = null
+deleteDeltaFiles.foreach { deltaFile =>
+  val currentFileTimestamp = CarbonTablePath.DataFileUtil
+.getTimeStampFromDeleteDeltaFile(deltaFile._2)
+  if (currentFileTimestamp.toLong > maxDeltaStamp) {
+maxDeltaStamp = currentFileTimestamp.toLong
+validDeltaFile = deltaFile._1
+  }
+}
+val blockDetails =
+  new 
CarbonDeleteDeltaFileReaderImpl(validDeltaFile.getAbsolutePath).readJson()
+blockDetails.getBlockletDetails.asScala.foreach { blocklet =>
+  deletedRowsCount = deletedRowsCount + blocklet.getDeletedRows.size()
+}
+segmentUpdateDetails.setDeleteDeltaStartTimestamp(maxDeltaStamp.toString)
+segmentUpdateDetails.setDeleteDeltaEndTimestamp(maxDeltaStamp.toString)
+segmentUpdateDetails.setDeletedRowsInBlock(deletedRowsCount.toString)
+  }
+
+  /**
+   * As horizontal compaction not supported for SDK segments. So all delta 
files are valid
+   */
+  def readAllDeltaFiles(
+  deleteDeltaFiles : ListBuffer[(CarbonFile, String)],
+  segmentUpdateDetails : SegmentUpdateDetails
+  ) : Unit = {

Review comment:
   please fix this formatting: move it to the line above. Check other code for 
the same as well.









[GitHub] [carbondata] kunal642 commented on a change in pull request #4070: [CARBONDATA-4082] Fix alter table add segment query on adding a segment having delete delta files.

2021-02-02 Thread GitBox


kunal642 commented on a change in pull request #4070:
URL: https://github.com/apache/carbondata/pull/4070#discussion_r568530696



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
##
@@ -294,6 +297,49 @@ case class CarbonAddLoadCommand(
   
OperationListenerBus.getInstance().fireEvent(loadTablePreStatusUpdateEvent, 
operationContext)
 }
 
+val deltaFiles = FileFactory.getCarbonFile(segmentPath).listFiles()
+  .filter(_.getName.endsWith(CarbonCommonConstants.DELETE_DELTA_FILE_EXT))
+if (deltaFiles.length > 0) {
+  val blockNameToDeltaFilesMap =
+collection.mutable.Map[String, 
collection.mutable.ListBuffer[(CarbonFile, String)]]()
+  deltaFiles.foreach { deltaFile =>
+val tmpDeltaFilePath = deltaFile.getAbsolutePath
+  .replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+CarbonCommonConstants.FILE_SEPARATOR)
+val deltaFilePathElements = 
tmpDeltaFilePath.split(CarbonCommonConstants.FILE_SEPARATOR)
+if (deltaFilePathElements != null && deltaFilePathElements.nonEmpty) {
+  val deltaFileName = 
deltaFilePathElements(deltaFilePathElements.length - 1)
+  val blockName = CarbonTablePath.DataFileUtil
+.getBlockNameFromDeleteDeltaFile(deltaFileName)
+  if (blockNameToDeltaFilesMap.contains(blockName)) {
+blockNameToDeltaFilesMap(blockName) += ((deltaFile, deltaFileName))
+  } else {
+val deltaFileList = new ListBuffer[(CarbonFile, String)]()
+deltaFileList += ((deltaFile, deltaFileName))
+blockNameToDeltaFilesMap.put(blockName, deltaFileList)
+  }
+}
+  }
+  val segmentUpdateDetails = new util.ArrayList[SegmentUpdateDetails]()
+  val columnCompressor = 
CompressorFactory.getInstance.getCompressor.getName
+  blockNameToDeltaFilesMap.foreach { entry =>
+val segmentUpdateDetail = new SegmentUpdateDetails()
+segmentUpdateDetail.setBlockName(entry._1)
+segmentUpdateDetail.setActualBlockName(
+  entry._1 + CarbonCommonConstants.POINT + columnCompressor +
+CarbonCommonConstants.FACT_FILE_EXT)
+segmentUpdateDetail.setSegmentName(model.getSegmentId)
+setMinMaxDeltaStampAndDeletedRowCount(entry._2, segmentUpdateDetail)
+segmentUpdateDetails.add(segmentUpdateDetail)
+  }
+  val timestamp = System.currentTimeMillis().toString
+  val segmentDetails = new util.HashSet[Segment]()
+  segmentDetails.add(model.getSegment)
+  CarbonUpdateUtil.updateSegmentStatus(segmentUpdateDetails, carbonTable, 
timestamp, false)

Review comment:
   can we pass a check like forceWrite in updateSegmentStatus to avoid 
validating the segment against the tablestatus file? This flag would be true in 
the add load command when delete delta files are present. This way you can avoid 
writing twice.
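   A hedged sketch of what that could look like; `forceWrite` is a hypothetical parameter name suggested by the reviewer, not an existing argument of `updateSegmentStatus`:

   ```scala
   // Sketch only: a hypothetical forceWrite flag that skips validating the
   // segment against the tablestatus file, so the status is written once.
   CarbonUpdateUtil.updateSegmentStatus(
     segmentUpdateDetails, carbonTable, timestamp, false, /* forceWrite = */ true)
   ```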









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


ajantha-bhat commented on a change in pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#discussion_r568528970



##
File path: 
integration/spark/src/test/scala/org/apache/spark/util/CarbonCommandSuite.scala
##
@@ -82,6 +83,43 @@ class CarbonCommandSuite extends QueryTest with 
BeforeAndAfterAll {
""".stripMargin)
   }
 
+  protected def createTestTable(tableName: String): Unit = {
+sql(
+  s"""

Review comment:
   Instead of adding new testcase ,
   In one of the existing testcases of **loading, insert into, partition table 
loading, partition table insert into**, just add a validation for segment id
   

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -276,7 +280,15 @@ case class CarbonInsertIntoCommand(databaseNameOp: 
Option[String],
 }
 throw ex
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {
+  Seq(Row(loadResultForReturn.getLoadName))
+} else {
+  rowsForReturn

Review comment:
   why are you returning the number of rows instead of the segment ID here? When 
will the code reach this branch when the load succeeds?
   can you add a comment?

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -276,7 +280,15 @@ case class CarbonInsertIntoCommand(databaseNameOp: 
Option[String],
 }
 throw ex
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {

Review comment:
   I think the code is not formatted; 
   we follow a space after `if` and around `!=`.

##
File path: 
integration/spark/src/test/scala/org/apache/spark/util/CarbonCommandSuite.scala
##
@@ -100,6 +138,56 @@ class CarbonCommandSuite extends QueryTest with 
BeforeAndAfterAll {
   private lazy val location = CarbonProperties.getStorePath()
 
 
+  test("Return segment ID after load and insert") {
+val tableName = "test_table"
+val inputTableName = "csv_table"
+val inputPath = s"$resourcesPath/data_alltypes.csv"
+dropTable(tableName)
+dropTable(inputTableName)
+createAndLoadInputTable(inputTableName, inputPath)
+createTestTable(tableName)
+checkAnswer(sql(
+  s"""
+ | INSERT INTO TABLE $tableName
+ | SELECT shortField, intField, bigintField, doubleField, stringField,
+ | from_unixtime(unix_timestamp(timestampField,'/M/dd')) 
timestampField, decimalField,
+ | cast(to_date(from_unixtime(unix_timestamp(dateField,'/M/dd'))) 
as date), charField
+ | FROM $inputTableName
+   """.stripMargin), Seq(Row("0")))
+checkAnswer(sql(
+  s"LOAD DATA LOCAL INPATH '$inputPath'" +
+  s" INTO TABLE $tableName" +
+  " OPTIONS('FILEHEADER'=" +
+  "'shortField,intField,bigintField,doubleField,stringField," +
+  "timestampField,decimalField,dateField,charField')"), Seq(Row("1")))

Review comment:
   Is it possible to return text like "Successfully loaded to segment id : 1" 
instead of returning just "1"?
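   A hedged sketch of the suggested return value, reusing `loadResultForReturn` from the diff above; the exact message text is illustrative only:

   ```scala
   // Sketch only: wrap the segment id in a descriptive message.
   Seq(Row(s"Successfully loaded to segment id : ${loadResultForReturn.getLoadName}"))
   ```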

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##
@@ -191,7 +196,15 @@ case class CarbonLoadDataCommand(databaseNameOp: 
Option[String],
   throw ex
 }
 }
-Seq.empty
+if(loadResultForReturn!=null && loadResultForReturn.getLoadName!=null) {

Review comment:
   @QiangCai, @ydvpankaj99  : why our checkstyle is not catching this 
format issues ?

##
File path: 
integration/spark/src/test/scala/org/apache/spark/util/CarbonCommandSuite.scala
##
@@ -100,6 +138,56 @@ class CarbonCommandSuite extends QueryTest with 
BeforeAndAfterAll {
   private lazy val location = CarbonProperties.getStorePath()
 
 
+  test("Return segment ID after load and insert") {
+val tableName = "test_table"
+val inputTableName = "csv_table"
+val inputPath = s"$resourcesPath/data_alltypes.csv"
+dropTable(tableName)
+dropTable(inputTableName)
+createAndLoadInputTable(inputTableName, inputPath)
+createTestTable(tableName)
+checkAnswer(sql(
+  s"""
+ | INSERT INTO TABLE $tableName
+ | SELECT shortField, intField, bigintField, doubleField, stringField,
+ | from_unixtime(unix_timestamp(timestampField,'/M/dd')) 
timestampField, decimalField,
+  

[GitHub] [carbondata] akashrn5 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

2021-02-02 Thread GitBox


akashrn5 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r568509426



##
File path: 
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with 
BeforeAndAfterAll {
 sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath 
}', " +
 s"'format'='carbon')")
 sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-assert(sql("show segments for table maintable1_si").collect().length ==
-   sql("show segments for table maintable1").collect().length)
+assert(sql("show segments for table maintable1_si").collect().length == 2)
+assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
   also add an assert checking that the SI table is disabled and that the query 
doesn't hit SI
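   One possible shape for that assertion (a sketch; the literal `'some_value'` and the plan-string check are illustrative ways to verify the SI table is not used, not code from this PR):

   ```scala
   // Sketch only: verify the filter query does not use the SI table
   // after alter table add segment disables it.
   val df = sql("select * from maintable1 where c = 'some_value'")
   assert(!df.queryExecution.executedPlan.toString().contains("maintable1_si"))
   ```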

##
File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##
@@ -221,7 +223,13 @@ public void deserializeFields(DataInput in, String[] 
locations, String tablePath
   indexUniqueId = in.readUTF();
 }
 String filePath = getPath();
-if (filePath.startsWith(File.separator)) {
+boolean isLocalFile = FileFactory.getCarbonFile(filePath) instanceof 
LocalCarbonFile;
+// If it is external segment path, table path need not be appended to 
filePath
+// Example filepath: hdfs://hacluster/opt/newsegmentpath/
+// filePath value would start with hdfs:// or s3:// . If it is local
+// ubuntu storage, it starts with File separator, so check if given path 
exists or not.
+if ((!isLocalFile && filePath.startsWith(File.separator)) || (isLocalFile 
&& !FileFactory

Review comment:
   the comment is not clear; please rewrite it with a better example and the 
scenarios.









[GitHub] [carbondata] ajantha-bhat commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


ajantha-bhat commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771560807


   retest this please







[GitHub] [carbondata] ajantha-bhat commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


ajantha-bhat commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771560715


   @nihal0107 : In my view, showing the original single segment before 
compaction is ok, because this is the load command, not the compaction command.
   So when the load command finishes we can return the segment ID that was loaded, 
and the user can run show segments before querying that segment to see whether it 
has undergone compaction.
   







[jira] [Resolved] (CARBONDATA-4113) Partition query results invalid when carbon.read.partition.hive.direct is disabled

2021-02-02 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-4113.
--
Fix Version/s: 2.1.1
   Resolution: Fixed

> Partition query results invalid when carbon.read.partition.hive.direct is 
> disabled
> --
>
> Key: CARBONDATA-4113
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4113
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.1
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> set 'carbon.read.partition.hive.direct' to false.
> queries to execute:
> create table partition_cache(a string) partitioned by(b int) stored as 
> carbondata
> insert into partition_cache select 'k',1;
> insert into partition_cache select 'k',1;
> insert into partition_cache select 'k',2;
> insert into partition_cache select 'k',2;
> alter table partition_cache compact 'minor';
> select * from partition_cache; => no results



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


asfgit closed pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084


   







[GitHub] [carbondata] kunal642 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


kunal642 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771559408


   LGTM







[jira] [Resolved] (CARBONDATA-4112) Data mismatch issue in SI

2021-02-02 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-4112.
--
Fix Version/s: (was: 2.1.0)
   2.2.0
   Resolution: Fixed

> Data mismatch issue in SI
> -
>
> Key: CARBONDATA-4112
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4112
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0
>Reporter: Karan
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When data files of a SI segment are merged. It gives more number of rows in 
> SI table than main table





[GitHub] [carbondata] asfgit closed pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI global sort merge scenario.

2021-02-02 Thread GitBox


asfgit closed pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083


   







[GitHub] [carbondata] Karan980 commented on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI global sort merge scenario.

2021-02-02 Thread GitBox


Karan980 commented on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771551639


   > The title of the PR can be more specific like "**Data mismatch issue in SI 
global sort merge scenario**"
   
   Done







[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI.

2021-02-02 Thread GitBox


ajantha-bhat edited a comment on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771551095


   The title of the PR can be more specific like "**Data mismatch issue in SI 
global sort merge scenario**"
   @Karan980 : Please keep this in mind next time. I have changed it while 
merging.







[GitHub] [carbondata] ajantha-bhat commented on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI.

2021-02-02 Thread GitBox


ajantha-bhat commented on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771551095


   The title of the PR can be more specific like "**Data mismatch issue in SI 
global sort merge scenario**"







[GitHub] [carbondata] ajantha-bhat commented on pull request #4083: [CARBONDATA-4112] Data mismatch issue in SI.

2021-02-02 Thread GitBox


ajantha-bhat commented on pull request #4083:
URL: https://github.com/apache/carbondata/pull/4083#issuecomment-771548605


   LGTM







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771547238


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3647/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771547087


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5407/
   







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4076: [CARBONDATA-4107] Added related MV tables Map to fact table and added lock while touchMDTFile

2021-02-02 Thread GitBox


akashrn5 commented on a change in pull request #4076:
URL: https://github.com/apache/carbondata/pull/4076#discussion_r568501151



##
File path: docs/mv-guide.md
##
@@ -241,6 +242,10 @@ The current information includes:
  | Refresh Mode  | FULL / INCREMENTAL refresh to MV
  |
  | Refresh Trigger Mode  | ON_COMMIT / ON_MANUAL refresh to MV provided by 
user |
  | Properties  | Table properties of the materialized view 
  |
+
+**NOTE**: For materialized views created
+before 
[CARBONDATA-4107](https://issues.apache.org/jira/browse/CARBONDATA-4107) issue 
fix, run
+refresh mv command to add mv name to fact table property and to enable it.

Review comment:
   also please add what happens if the user doesn't run the refresh command









[jira] [Created] (CARBONDATA-4117) Test cg index query with Index server fails with NPE

2021-02-02 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4117:
-

 Summary: Test cg index query with Index server fails with NPE
 Key: CARBONDATA-4117
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4117
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Test queries to execute:


spark-sql> CREATE TABLE index_test_cg(id INT, name STRING, city STRING, age 
INT) STORED AS carbondata TBLPROPERTIES('SORT_COLUMNS'='city,name', 
'SORT_SCOPE'='LOCAL_SORT');

spark-sql> create index cgindex on table index_test_cg (name) as 
'org.apache.carbondata.spark.testsuite.index.CGIndexFactory';

LOAD DATA LOCAL INPATH '$file2' INTO TABLE index_test_cg 
OPTIONS('header'='false')

spark-sql> select * from index_test_cg where name='n502670';
2021-01-29 15:09:25,881 | ERROR | main | Exception occurred while getting 
splits using index server. Initiating Fallback to embedded mode | 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:454)
java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy69.getSplits(Unknown Source)
at 
org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:85)
at 
org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59)
at 
org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769)
at 
org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58)
at 
org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:307)
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:443)
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:555)
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:500)
at 
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:357)
at 
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:205)
at 
org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:159)
at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:68)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2299)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:989)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:384)
at org.apache.spark.rdd.RDD.collect(RDD.scala:988)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:345)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:372)
at 
org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:127)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver$$anonfun$run$1.apply(SparkSQLDriver.scala:66)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver$$anonfun$run$1.apply(SparkSQLDriver.scala:66)
at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:95)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:86)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:789)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:63)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:383)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:277)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDrive

[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771518487


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5405/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771518388


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3646/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771515497


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3645/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771513934


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5406/
   







[GitHub] [carbondata] ShreelekhyaG commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


ShreelekhyaG commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771481950


   retest this please







[GitHub] [carbondata] nihal0107 edited a comment on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


nihal0107 edited a comment on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771464559


   > > If we enable the property `ENABLE_AUTO_LOAD_MERGE` then which segment id 
are we planning to show, the segment generated after compaction or before 
compaction? Better to add a test case for that scenario also.
   > 
   > If we enable 'AUTO_LOAD_MERGE', then we return and show the segment id 
before compaction since the user would focus on his load operation. Test case 
has been added. Please review.
   
   We are showing the segment id because this feature will be helpful if we 
need to query a specific segment. But if we show the segment id from before 
compaction and then query on that segment, the operation will fail. Better to 
take the community's opinion on this.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771465576


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5402/
   







[GitHub] [carbondata] nihal0107 commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


nihal0107 commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771464559


   > > If we enable the property `ENABLE_AUTO_LOAD_MERGE` then which segment id 
are we planning to show, the segment generated after compaction or before 
compaction? Better to add a test case for that scenario also.
   > 
   > If we enable 'AUTO_LOAD_MERGE', then we return and show the segment id 
before compaction since the user would focus on his load operation. Test case 
has been added. Please review.
   
   We are showing the segment id because this feature will be helpful if we 
need to query a specific segment. But if we show the segment id from before 
compaction and then query on that segment, the operation will fail.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4084: [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled.

2021-02-02 Thread GitBox


CarbonDataQA2 commented on pull request #4084:
URL: https://github.com/apache/carbondata/pull/4084#issuecomment-771459205


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3642/
   







[jira] [Created] (CARBONDATA-4116) Concurrent Data Loading Issue

2021-02-02 Thread suyash yadav (Jira)
suyash yadav created CARBONDATA-4116:


 Summary: Concurrent Data Loading Issue
 Key: CARBONDATA-4116
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4116
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Affects Versions: 1.6.1, 2.0.1
 Environment: Apache carbondata 2.0.1
Reporter: suyash yadav


Even though Carbon claims it can support concurrent data loading together with 
table compaction, in fact it cannot. We faced a data-inconsistency issue in 
Carbon 1.6.1 when loading data concurrently into a table while compaction was 
running. That is why we implemented table locking for the load data, compact, 
and clean files commands. All of this is due to concurrent manipulation of the 
table's metadata file, i.e. .tablestatus.

 

We are facing an issue with concurrent data loading together with compaction: 
it leads to data inconsistency. What is the way to fix this, given that we want 
concurrent loading together with compaction?
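The table-locking workaround described above can be sketched with the JDK's `FileLock`. This is only an illustrative example under stated assumptions — CarbonData ships its own locking abstraction in `org.apache.carbondata.core.locks`, and the file name below is hypothetical — showing how an exclusive lock file can serialize read-modify-write updates of a shared metadata file such as tablestatus:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: serialize tablestatus updates across processes
// by taking an OS-level exclusive lock on a sibling lock file.
public class TableStatusLockSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for <tablePath>/Metadata/tablestatus.lock
        Path lockFile = Files.createTempFile("tablestatus", ".lock");
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.WRITE)) {
            // lock() blocks until no other process holds the lock,
            // so only one writer at a time enters the critical section.
            try (FileLock lock = channel.lock()) {
                // Critical section: read, modify, and atomically rewrite
                // the tablestatus file here.
                System.out.println("holding lock: " + lock.isValid());
            } // lock released here
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }
}
```

The same pattern would be taken by each of the load, compact, and clean files commands before touching tablestatus, which prevents the lost-update inconsistency described in the report at the cost of serializing those operations.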



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] areyouokfreejoe commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


areyouokfreejoe commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771458076


   > If we enable the property `ENABLE_AUTO_LOAD_MERGE` then which segment id 
are we planning to show, the segment generated after compaction or before 
compaction? Better to add a test case for that scenario also.
   
   







[GitHub] [carbondata] areyouokfreejoe closed pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


areyouokfreejoe closed pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086


   







[GitHub] [carbondata] areyouokfreejoe commented on pull request #4086: [CARBONDATA-4115] Successful load and insert will return segment ID

2021-02-02 Thread GitBox


areyouokfreejoe commented on pull request #4086:
URL: https://github.com/apache/carbondata/pull/4086#issuecomment-771456273


   > If we enable the property `ENABLE_AUTO_LOAD_MERGE` then which segment id 
are we planning to show, the segment generated after compaction or before 
compaction? Better to add a test case for that scenario also.
   
   If we enable 'AUTO_LOAD_MERGE',then we return and show the segment id before 
compaction since the user would focus on his load operation. Test case has been 
added. Please review.







[GitHub] [carbondata] ydvpankaj99 commented on pull request #4071: [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

2021-02-02 Thread GitBox


ydvpankaj99 commented on pull request #4071:
URL: https://github.com/apache/carbondata/pull/4071#issuecomment-771448246


   retest this please


