[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583995968
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1908/
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583981872
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/206/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583971855
 
 
   Build Failed  with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/205/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583967776
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1906/
   




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376874268
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/DirectDictDimensionIndexCodec.java
 ##
 @@ -46,18 +47,33 @@ public String getName() {
   public ColumnPageEncoder createEncoder(Map parameter) {
 return new IndexStorageEncoder() {
   @Override
-  void encodeIndexStorage(ColumnPage inputPage) {
-BlockIndexerStorage indexStorage;
-byte[][] data = inputPage.getByteArrayPage();
+  void encodeIndexStorage(ColumnPage input) {
+BlockIndexerStorage indexStorage;
+boolean isDictionary = input.isLocalDictGeneratedPage();
+
+// if need to build invertIndex or RLE, the columnpage should to be 
organized in Row,
+// in the other words, we get the data of columnpage as an array, in 
which each element
+// presenets a row. But if no need to build both invertIndex and RLE, 
it will increase
+// extra overhead, considering data in columnpage was already stored 
as flattened data,
+// and the compression is also on flattened  data, to organized data 
in ROW is actually
+// increase the overheadof "Expand" and "Flatten" with on invertIndex 
and RLE.
+// Overall, isFlatted presents do we flatten the data? if need to 
build invertIndex or RLE,
+// isFlattened is set to ture, otherwise, isFlattened is set to false.
+boolean isFlattened = !isInvertedIndex && !isDictionary;
+
+// when isFlattened is true, data[0] is the flattened data of the 
columnpage.
+// when isFlattened is false, data[i] is the ith row of the columnpage.
+ByteBuffer[] data = input.getByteBufferArrayPage(isFlattened);
 if (isInvertedIndex) {
-  indexStorage = new BlockIndexerStorageForShort(data, false, false, 
isSort);
+  indexStorage = new BlockIndexerStorageForShort(data, isDictionary, 
!isDictionary, isSort);
 
 Review comment:
   same as above




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376873977
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/DictDimensionIndexCodec.java
 ##
 @@ -46,18 +47,33 @@ public String getName() {
   public ColumnPageEncoder createEncoder(Map parameter) {
 return new IndexStorageEncoder() {
   @Override
-  void encodeIndexStorage(ColumnPage inputPage) {
-BlockIndexerStorage indexStorage;
-byte[][] data = inputPage.getByteArrayPage();
+  void encodeIndexStorage(ColumnPage input) {
+BlockIndexerStorage indexStorage;
+boolean isDictionary = input.isLocalDictGeneratedPage();
+
+// if need to build invertIndex or RLE, the columnpage should to be 
organized in Row,
+// in the other words, we get the data of columnpage as an array, in 
which each element
+// presenets a row. But if no need to build both invertIndex and RLE, 
it will increase
+// extra overhead, considering data in columnpage was already stored 
as flattened data,
+// and the compression is also on flattened  data, to organized data 
in ROW is actually
+// increase the overheadof "Expand" and "Flatten" with on invertIndex 
and RLE.
+// Overall, isFlatted presents do we flatten the data? if need to 
build invertIndex or RLE,
+// isFlattened is set to ture, otherwise, isFlattened is set to false.
+boolean isFlattened = !isInvertedIndex && !isDictionary;
+
+// when isFlattened is true, data[0] is the flattened data of the 
columnpage.
+// when isFlattened is false, data[i] is the ith row of the columnpage.
+ByteBuffer[] data = input.getByteBufferArrayPage(isFlattened);
 if (isInvertedIndex) {
-  indexStorage = new BlockIndexerStorageForShort(data, true, false, 
isSort);
+  indexStorage = new BlockIndexerStorageForShort(data, isDictionary, 
!isDictionary, isSort);
 
 Review comment:
   It is not good practice to pass the same flag with complementary values. 
Either hardcode the values or introduce a new variable.
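
   A minimal sketch of the "introduce new variable" option from the comment 
above. StorageSketch and its parameter names (applyRle, noDictionary, sorted) 
are hypothetical stand-ins, not the real BlockIndexerStorageForShort signature; 
the point is only that each argument gets its own named value at the call site.

```java
// Hypothetical stand-in for the storage class discussed in the review.
// The parameter names are assumptions made for illustration only.
final class StorageSketch {
  final boolean applyRle;
  final boolean noDictionary;
  final boolean sorted;

  StorageSketch(boolean applyRle, boolean noDictionary, boolean sorted) {
    this.applyRle = applyRle;
    this.noDictionary = noDictionary;
    this.sorted = sorted;
  }

  // Derive each argument once, under its own name, instead of passing
  // "isDictionary, !isDictionary" inline at the call site.
  static StorageSketch forPage(boolean isDictionary, boolean isSort) {
    boolean applyRle = isDictionary;
    boolean noDictionary = !isDictionary;
    return new StorageSketch(applyRle, noDictionary, isSort);
  }
}
```

   With named locals, a reader can see what each positional boolean means 
without opening the constructor.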




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376873364
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/ComplexDimensionIndexCodec.java
 ##
 @@ -46,12 +47,13 @@ public ColumnPageEncoder createEncoder(Map 
parameter) {
 return new IndexStorageEncoder() {
   @Override
   void encodeIndexStorage(ColumnPage inputPage) {
-BlockIndexerStorage indexStorage =
-new BlockIndexerStorageForShort(inputPage.getByteArrayPage(), 
false, false, false);
-byte[] flattened = ByteUtil.flatten(indexStorage.getDataPage());
+BlockIndexerStorage indexStorage =
+new 
BlockIndexerStorageForShort(inputPage.getByteBufferArrayPage(false),
+false, false, false);
+ByteBuffer flattened = ByteUtil.flatten(indexStorage.getDataPage());
 Compressor compressor = CompressorFactory.getInstance().getCompressor(
 inputPage.getColumnCompressorName());
-byte[] compressed = compressor.compressByte(flattened);
+byte[] compressed = 
ByteUtil.byteBufferToBytes(compressor.compressByte(flattened));
 
 Review comment:
   I think we converted the 2D byte array to ByteBuffer in order to support 
compression directly on the byte buffer.
   But now we have to convert back to byte[] again. Is it possible to keep the 
ByteBuffer itself?
   
   Compression GC pressure may have been reduced, but decompression GC pressure 
may have increased. Please compare performance and memory usage before and 
after the change.
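
   To illustrate where the extra copy comes from, here is a small sketch using 
only the standard java.nio.ByteBuffer API. BufferBytes.toBytes is a 
hypothetical helper, not CarbonData's ByteUtil.byteBufferToBytes; it assumes 
the caller accepts a shared array when the buffer is heap-backed and exactly 
covers that array.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, written for illustration only.
final class BufferBytes {
  static byte[] toBytes(ByteBuffer buf) {
    // A heap buffer that spans its whole backing array can hand the array
    // back as-is, so no new allocation is needed.
    if (buf.hasArray() && buf.arrayOffset() == 0 && buf.position() == 0
        && buf.remaining() == buf.array().length) {
      return buf.array();
    }
    // Direct (off-heap) or partial buffers need a fresh heap array.
    // duplicate() keeps the caller's position and limit unchanged.
    byte[] out = new byte[buf.remaining()];
    buf.duplicate().get(out);
    return out;
  }
}
```

   A direct buffer always takes the copying branch, which is the allocation 
the comment above asks to measure.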




[GitHub] [carbondata] kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table

2020-02-09 Thread GitBox
kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed 
performance issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#discussion_r376872898
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
 ##
 @@ -35,6 +35,7 @@
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.constants.CarbonLoadOptionConstants;
 import org.apache.carbondata.core.constants.SortScopeOptions;
+import org.apache.carbondata.core.datamap.DataMapLevel;
 
 Review comment:
   done




[GitHub] [carbondata] kunal642 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table

2020-02-09 Thread GitBox
kunal642 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue 
for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583960836
 
 
   @jackylk Please review. Fixed the comments.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376872535
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java
 ##
 @@ -289,6 +298,15 @@ public BigDecimal getDecimal(int rowId) {
 return data;
   }
 
+  @Override
+  public ByteBuffer[] getByteBufferArrayPage(boolean isFlattened) {
 
 Review comment:
   The change was only for off-heap data, right? So I would expect only the 
unsafe pages to change. Why were the safe column pages changed as well?




[GitHub] [carbondata] jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583959274
 
 
   @QiangCai ok, I will verify on cluster, and with old table




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376871815
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java
 ##
 @@ -747,6 +759,16 @@ public long getPageLengthInBytes() throws IOException {
*/
   public byte[] compress(Compressor compressor) throws IOException {
 DataType dataType = columnPageEncoderMeta.getStoreDataType();
+
+// if the columnpage is isUnsafeEnabled and the Datatype is primitive.
+// we try to compress the data in offheap directly, avoiding a copy from 
offheap to heap
+if (isUnsafeEnabled() && (dataType == DataTypes.BOOLEAN || dataType == BYTE
+|| dataType == SHORT || dataType == DataTypes.SHORT_INT || dataType == 
INT
+|| dataType == LONG || dataType == FLOAT || dataType == DOUBLE
+|| DataTypes.isDecimal(dataType))) {
 
 Review comment:
   Is Decimal supported?
   
   Below I see that getByteBufferArrayPage is unsupported in DecimalColumnPage.




[GitHub] [carbondata] jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583959274
 
 
   @QiangCai ok, I will verify on cluster, and with old table created before 
this PR




[GitHub] [carbondata] jackylk commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
jackylk commented on issue #3606: [CARBONDATA-3681] Change default compressor 
to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583959274
 
 
   @QiangCai ok, I will verify on cluster




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376869338
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/columnar/BlockIndexerStorageForNoInvertedIndexForShort.java
 ##
 @@ -79,12 +82,8 @@ private void rleEncodeOnData(List actualDataList) {
 }
   }
 
-  private byte[][] convertToDataPage(List list) {
-byte[][] shortArray = new byte[list.size()][];
-for (int i = 0; i < shortArray.length; i++) {
-  shortArray[i] = list.get(i);
-}
-return shortArray;
+  private ByteBuffer[] convertToDataPage(List list) {
 
 Review comment:
   We should avoid the redundant conversion: use ByteBuffer[] directly 
everywhere and don't convert to a list.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo

2020-02-09 Thread GitBox
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] 
Support compress offheap data in columnpage directly, avoding a copy of data 
from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376868515
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/columnar/BlockIndexerStorageForNoInvertedIndexForShort.java
 ##
 @@ -17,52 +17,55 @@
 
 package org.apache.carbondata.core.datastore.columnar;
 
+import java.nio.ByteBuffer;
 import java.util.ArrayList;
+import java.util.Arrays;
 import java.util.List;
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.util.ByteUtil;
 
 /**
  * Below class will be used to for no inverted index
  */
-public class BlockIndexerStorageForNoInvertedIndexForShort extends 
BlockIndexerStorage {
+public class BlockIndexerStorageForNoInvertedIndexForShort
+extends BlockIndexerStorage {
 
   /**
* column data
*/
-  private byte[][] dataPage;
+  private ByteBuffer[] dataPage;
 
   private short[] dataRlePage;
 
-  public BlockIndexerStorageForNoInvertedIndexForShort(byte[][] dataPage, 
boolean applyRLE) {
+  public BlockIndexerStorageForNoInvertedIndexForShort(ByteBuffer[] dataPage, 
boolean applyRLE) {
 this.dataPage = dataPage;
 if (applyRLE) {
-  List actualDataList = new ArrayList<>();
-  for (int i = 0; i < dataPage.length; i++) {
-actualDataList.add(dataPage[i]);
-  }
+  List actualDataList = Arrays.asList(dataPage);
 
 Review comment:
   **Can we skip converting the array to a list?** 
   Can we change it to use the array directly? Since it is a one-dimensional 
array now, we can remove the list and use the array directly in the methods 
below.
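
   A rough sketch of what working on the array directly could look like. 
ArrayRle.runLengths is a hypothetical helper, not the PR's rleEncodeOnData; it 
only counts runs of equal adjacent entries, relying on ByteBuffer.equals 
comparing the remaining bytes of two buffers.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, written for illustration only.
final class ArrayRle {
  // Returns the length of each run of consecutive equal entries,
  // iterating over the array itself with no intermediate List.
  static int[] runLengths(ByteBuffer[] page) {
    int[] runs = new int[page.length];
    int count = 0;
    for (int i = 0; i < page.length; i++) {
      if (i > 0 && page[i].equals(page[i - 1])) {
        runs[count - 1]++; // same value as the previous row: extend the run
      } else {
        runs[count++] = 1; // new value: start a new run
      }
    }
    return Arrays.copyOf(runs, count);
  }
}
```

   Indexing the array avoids both the wrapper list and the later conversion 
back to an array.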




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583955161
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/204/
   




[GitHub] [carbondata] QiangCai closed pull request #3379: [CARBONDATA-3546] Delete duplicate data between segments

2020-02-09 Thread GitBox
QiangCai closed pull request #3379: [CARBONDATA-3546] Delete duplicate data 
between segments
URL: https://github.com/apache/carbondata/pull/3379
 
 
   




[GitHub] [carbondata] QiangCai closed pull request #3403: [CARBONDATA-3547] Delete duplicate data during GLOBAL_SORT compaction

2020-02-09 Thread GitBox
QiangCai closed pull request #3403: [CARBONDATA-3547] Delete duplicate data 
during GLOBAL_SORT compaction
URL: https://github.com/apache/carbondata/pull/3403
 
 
   




[GitHub] [carbondata] QiangCai closed pull request #3211: [WIP] Support configuring Java version

2020-02-09 Thread GitBox
QiangCai closed pull request #3211: [WIP] Support configuring Java version
URL: https://github.com/apache/carbondata/pull/3211
 
 
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583935130
 
 
   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1905/
   




[GitHub] [carbondata] QiangCai commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
QiangCai commented on issue #3606: [CARBONDATA-3681] Change default compressor 
to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583934608
 
 
   It would be better to do further testing on a cluster, not only on a local machine.




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376834155
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java
 ##
 @@ -38,11 +37,6 @@
*/
   private int blockletSize;
 
-  /**
-   * list of page metadata
-   */
-  private List pageMetadataList;
 
 Review comment:
   This means the start/end key will be removed from the loading flow.




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376831843
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
 ##
 @@ -135,25 +135,6 @@ DataMapExprWrapper chooseDataMap(DataMapLevel level, 
FilterResolverIntf resolver
 return null;
   }
 
-  /**
-   * Get all datamaps of the table for clearing purpose
-   */
-  public DataMapExprWrapper getAllDataMapsForClear(CarbonTable carbonTable)
 
 Review comment:
   Why remove this method?




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376835460
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/schema/table/column/CarbonDimension.java
 ##
 @@ -93,15 +86,7 @@ public int getKeyOrdinal() {
 return keyOrdinal;
   }
 
-  /**
-   * @return the complexTypeOrdinal
-   */
-  public int getComplexTypeOrdinal() {
-return complexTypeOrdinal;
-  }
-
   public void setComplexTypeOridnal(int complexTypeOrdinal) {
 
 Review comment:
   Why not remove it?




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376832770
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java
 ##
 @@ -640,15 +377,91 @@ public int getNumberOfSortColumns() {
 return numberOfSortColumns;
   }
 
-  public int getNumberOfNoDictSortColumns() {
-return numberOfNoDictSortColumns;
+  public int getLastDimensionColOrdinal() {
+return lastDimensionColOrdinal;
+  }
+
+  public int getNumberOfColumns() {
+return numberOfColumnsAfterFlatten;
   }
 
-  public int getNumberOfDictSortColumns() {
-return this.numberOfSortColumns - this.numberOfNoDictSortColumns;
+  public int getNumberOfDictDimensions() {
+return numberOfDictDimensions;
   }
 
-  public int getLastDimensionColOrdinal() {
-return lastDimensionColOrdinal;
+  public int getNumberOfSimpleDimensions() {
 
 Review comment:
   primitiveDimension




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376848072
 
 

 ##
 File path: 
integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala
 ##
 @@ -111,8 +111,6 @@ case class CarbonMergerMapping(
 validSegments: Array[Segment],
 tableId: String,
 campactionType: CompactionType,
-// maxSegmentColCardinality is Cardinality of last segment of compaction
-var maxSegmentColCardinality: Array[Int],
 // maxSegmentColumnSchemaList is list of column schema of last segment of 
compaction
 var maxSegmentColumnSchemaList: List[ColumnSchema],
 
 Review comment:
   use CarbonTable directly




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376833731
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
 ##
 @@ -74,31 +72,10 @@
*/
   private Segment segment;
 
-  /**
-   * id of the Blocklet.
-   */
-  private String blockletId;
 
 Review comment:
   Why remove the blocklet info?




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839981
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java
 ##
 @@ -756,4 +756,45 @@ public static long toLongLittleEndian(byte[] bytes, int 
offset) {
 ((long) bytes[offset + 3] & 0xff) << 24) | (((long) bytes[offset + 2] 
& 0xff) << 16) | (
 ((long) bytes[offset + 1] & 0xff) << 8) | (((long) bytes[offset] & 
0xff)));
   }
+
+  public static byte[] convertDateToBytes(int date) {
+return ByteUtil.toBytes(date);
+  }
+
+  public static byte[] convertDateToBytes(long[] date) {
+byte[] output = new byte[date.length * 4];
+for (int i = 0; i < date.length; i++) {
+  System.arraycopy(ByteUtil.toBytes(date[i]), 0, output, i * 4, 4);
+}
+return output;
+  }
+
+  public static int convertBytesToDate(byte[] date) {
+return ByteUtil.toInt(date, 0);
+  }
+
+  public static int convertBytesToDate(byte[] date, int offset) {
+return ByteUtil.toInt(date, offset);
+  }
+
+  public static int dateBytesSize() {
+return 4;
+  }
+
+  public static int[] convertBytesToDateIntArray(byte[] input) {
 
 Review comment:
   How about convertDateBytesToInts?
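
   To make the naming discussion concrete, here is a minimal, self-contained sketch of a date round trip at 4 bytes per value. The class name `DateBytesSketch` and both method names are hypothetical, and the plain big-endian layout from `ByteBuffer` is an assumption; the PR's `ByteUtil.toBytes` / `ByteUtil.toInt` may encode values differently.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of CarbonData: illustrates a fixed-width
// (4 bytes per value) encoding of DATE values stored as ints.
final class DateBytesSketch {
  // Assumed width of one encoded date, mirroring dateBytesSize() in the diff.
  static final int DATE_BYTES = 4;

  // Encode each int date as 4 big-endian bytes (ByteBuffer's default order).
  static byte[] datesToBytes(int[] dates) {
    ByteBuffer buffer = ByteBuffer.allocate(dates.length * DATE_BYTES);
    for (int date : dates) {
      buffer.putInt(date);
    }
    return buffer.array();
  }

  // Decode a byte array produced by datesToBytes back into int dates.
  static int[] dateBytesToInts(byte[] input) {
    if (input.length % DATE_BYTES != 0) {
      throw new IllegalArgumentException("length must be a multiple of " + DATE_BYTES);
    }
    int[] dates = new int[input.length / DATE_BYTES];
    ByteBuffer buffer = ByteBuffer.wrap(input);
    for (int i = 0; i < dates.length; i++) {
      dates[i] = buffer.getInt();
    }
    return dates;
  }
}
```

   Naming the decoder after what it returns (ints) rather than the logical type (date) keeps the single-value and array variants easy to tell apart.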




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376847677
 
 

 ##
 File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
 ##
 @@ -578,15 +562,11 @@ class CarbonMergerRDD[K, V](
   }
 }
 val updatedMaxSegmentColumnList = new util.ArrayList[ColumnSchema]()
-// update cardinality and column schema list according to master schema
-val cardinality = CarbonCompactionUtil
 
 Review comment:
   If there is no need to update the cardinality, there is also no need to update the column schema.




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839201
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java
 ##
 @@ -217,7 +217,8 @@ public BlockletScannedResult call() throws Exception {
 nextRead.set(true);
 futureIo = readNextBlockletAsync();
   }
-  return blockletScanner.scanBlocklet(rawBlockletColumnChunks);
+  BlockletScannedResult result = blockletScanner.scanBlocklet(rawBlockletColumnChunks);
 
 Review comment:
   This change is not required.




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376833358
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java
 ##
 @@ -640,15 +377,91 @@ public int getNumberOfSortColumns() {
 return numberOfSortColumns;
   }
 
-  public int getNumberOfNoDictSortColumns() {
-return numberOfNoDictSortColumns;
+  public int getLastDimensionColOrdinal() {
+return lastDimensionColOrdinal;
+  }
+
+  public int getNumberOfColumns() {
+return numberOfColumnsAfterFlatten;
   }
 
-  public int getNumberOfDictSortColumns() {
-return this.numberOfSortColumns - this.numberOfNoDictSortColumns;
+  public int getNumberOfDictDimensions() {
+return numberOfDictDimensions;
   }
 
-  public int getLastDimensionColOrdinal() {
-return lastDimensionColOrdinal;
+  public int getNumberOfSimpleDimensions() {
+return numberOfDictDimensions + numberOfNoDictionaryDimension;
+  }
+
+  public int getNumberOfComplexDimensions() {
+return complexDimensions.size();
+  }
+
+  public int getNumberOfMeasures() {
+return measures.size();
+  }
+
+  /**
+   * Return column value length in byte for all dimension columns in the table
+   * for dimension it is -1 (for DATE it is 4),
+   */
+  public int[] createDimColumnValueLength() {
+int[] length = new int[dimensions.size()];
+int index = 0;
+for (CarbonDimension dimension : dimensions) {
+  DataType dataType = dimension.getDataType();
+  if (dataType == DataTypes.DATE) {
+length[index] = 4;
+  } else {
+length[index] = -1;
+  }
+  index++;
+}
+return length;
+  }
+
+  /**
+   * Return column value length in byte for all columns in the table
+   * for dimension and complex column it is -1 (for DATE it is 4),
+   * for measure is 8 (for decimal is -1)
+   */
+  public int[] createColumnValueLength() {
+int[] length = new int[numberOfColumnsAfterFlatten];
+int index = 0;
+for (CarbonDimension dimension : dimensions) {
+  DataType dataType = dimension.getDataType();
+  if (dataType == DataTypes.DATE) {
+length[index] = 4;
+  } else {
+length[index] = -1;
+  }
+  index++;
+}
+for (CarbonDimension complexDimension : complexDimensions) {
+  int depth = getNumColumnsAfterFlatten(complexDimension);
+  for (int i = 0; i < depth; i++) {
+length[index++] = -1;
+  }
+}
+for (CarbonMeasure measure : measures) {
+  DataType dataType = measure.getDataType();
+  if (DataTypes.isDecimal(dataType)) {
+length[index++] = -1;
+  } else {
+length[index++] = 8;
 
 Review comment:
   Why is the length of the other measures 8?
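
   For reference, the convention in the hunk above can be restated in isolation: -1 marks a variable-length value, DATE dimensions take 4 bytes, and every non-decimal measure is given 8 bytes. The sketch below only mirrors that logic; the `Kind` enum and the class name are hypothetical stand-ins for CarbonData's column types, and it does not answer whether 8 is right for every measure type.

```java
import java.util.List;

// Hypothetical restatement of the value-length convention in the diff.
final class ValueLengthSketch {
  // Simplified stand-in for dimension/measure data types (an assumption).
  enum Kind { DATE_DIMENSION, OTHER_DIMENSION, DECIMAL_MEASURE, OTHER_MEASURE }

  // Marker used by the diff for values without a fixed byte width.
  static final int VARIABLE_LENGTH = -1;

  static int[] valueLengths(List<Kind> columns) {
    int[] length = new int[columns.size()];
    for (int i = 0; i < length.length; i++) {
      switch (columns.get(i)) {
        case DATE_DIMENSION:
          length[i] = 4; // DATE is stored as a 4-byte int in the diff
          break;
        case OTHER_MEASURE:
          length[i] = 8; // the value the review comment questions
          break;
        default:
          length[i] = VARIABLE_LENGTH; // other dimensions and decimal measures
      }
    }
    return length;
  }
}
```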




[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path

2020-02-09 Thread GitBox
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove 
MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839786
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java
 ##
 @@ -756,4 +756,45 @@ public static long toLongLittleEndian(byte[] bytes, int offset) {
 ((long) bytes[offset + 3] & 0xff) << 24) | (((long) bytes[offset + 2] & 0xff) << 16) | (
 ((long) bytes[offset + 1] & 0xff) << 8) | (((long) bytes[offset] & 0xff)));
   }
+
+  public static byte[] convertDateToBytes(int date) {
+return ByteUtil.toBytes(date);
+  }
+
+  public static byte[] convertDateToBytes(long[] date) {
+byte[] output = new byte[date.length * 4];
+for (int i = 0; i < date.length; i++) {
+  System.arraycopy(ByteUtil.toBytes(date[i]), 0, output, i * 4, 4);
+}
+return output;
+  }
+
+  public static int convertBytesToDate(byte[] date) {
 
 Review comment:
   How about convertDateBytesToInt?




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583925032
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/203/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583883191
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1904/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583883007
 
 
   Build Failed  with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/202/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583877338
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1903/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583871265
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/201/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583867782
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1902/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583860628
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/200/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583851132
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1901/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance 
issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583850872
 
 
   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1900/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583848464
 
 
   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1899/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583845230
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/199/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance 
issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583845113
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/198/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583844201
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1895/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583842184
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/197/
   




[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap 
data in columnpage directly, avoding a copy of data from offhead to heap before 
compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840206
 
 
   retest this please




[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap 
data in columnpage directly, avoding a copy of data from offhead to heap before 
compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840176
 
 
   retest this please




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840188
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1898/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840028
 
 
   Build Failed  with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/196/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839686
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1897/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839556
 
 
   Build Failed  with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/195/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839247
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1896/
   




[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap 
data in columnpage directly, avoding a copy of data from offhead to heap before 
compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839095
 
 
   retest this please




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default 
compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583838300
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/193/
   




[GitHub] [carbondata] jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change 
default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376777589
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
 ##
 @@ -285,17 +286,39 @@ public static String getSegmentPath(String tablePath, String segmentId) {
   }
 
   /**
-   * Gets data file name only with out path
-   *
-   * @param filePartNo  data file part number
-   * @param taskNo  task identifier
-   * @param factUpdateTimeStamp unique identifier to identify an update
-   * @return gets data file name only with out path
+   * Gets data file name only, without parent path
*/
   public static String getCarbonDataFileName(Integer filePartNo, String taskNo, int bucketNumber,
-  int batchNo, String factUpdateTimeStamp, String segmentNo) {
-return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
-+ bucketNumber + "-" + segmentNo + "-" + factUpdateTimeStamp + CARBON_DATA_EXT;
+  int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
+Objects.requireNonNull(filePartNo);
+Objects.requireNonNull(taskNo);
+Objects.requireNonNull(factUpdateTimeStamp);
+Objects.requireNonNull(compressor);
+
+// Starting from CarbonData 2.0, the data file name pattern is:
+// partNo-taskNo-batchNo-bucketNo-segmentNo-timestamp.compressor.carbondata
+// For example:
+// part-0-0_batchno0-0-0-1580982686749.zstd.carbondata
+//
+// If the compressor name is missing, the file is compressed by snappy, which is
+// the default compressor in CarbonData 1.x
+
+return new StringBuffer().append(DATA_PART_PREFIX)
 
 Review comment:
   I changed it to StringBuilder; this link
   (https://stackoverflow.com/questions/47605/string-concatenation-concat-vs-operator)
   suggests StringBuilder is more efficient.
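
   A minimal sketch of the file-name construction with StringBuilder, following the pattern described in the code comment of the hunk above. The class name `DataFileNameSketch` and the three constant values are assumptions chosen to reproduce the example name in that comment; the actual constants live in CarbonTablePath and may differ.

```java
import java.util.Objects;

// Hypothetical stand-alone version of the name builder discussed above.
final class DataFileNameSketch {
  // Assumed constant values, picked so the output matches the example
  // "part-0-0_batchno0-0-0-1580982686749.zstd.carbondata".
  static final String DATA_PART_PREFIX = "part-";
  static final String BATCH_PREFIX = "_batchno";
  static final String CARBON_DATA_EXT = ".carbondata";

  static String dataFileName(Integer filePartNo, String taskNo, int bucketNumber,
      int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
    Objects.requireNonNull(filePartNo);
    Objects.requireNonNull(taskNo);
    Objects.requireNonNull(factUpdateTimeStamp);
    Objects.requireNonNull(compressor);

    // StringBuilder is unsynchronized; a method-local builder never needs
    // the locking that StringBuffer performs on every append.
    return new StringBuilder()
        .append(DATA_PART_PREFIX).append(filePartNo)
        .append('-').append(taskNo)
        .append(BATCH_PREFIX).append(batchNo)
        .append('-').append(bucketNumber)
        .append('-').append(segmentNo)
        .append('-').append(factUpdateTimeStamp)
        .append('.').append(compressor)
        .append(CARBON_DATA_EXT)
        .toString();
  }
}
```

   Because the builder is local to the method, no other thread can observe it, so the synchronization in StringBuffer adds cost without adding safety here.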




[GitHub] [carbondata] jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change 
default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376776975
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/readcommitter/LatestFilesReadCommittedScope.java
 ##
 @@ -163,7 +163,7 @@ public SegmentRefreshInfo getCommittedSegmentRefreshInfo(Segment segment, Update
 return segmentRefreshInfo;
   }
 
-  private String getSegmentID(String carbonIndexFileName, String indexFilePath) {
+  private String getTimestamp(String carbonIndexFileName, String indexFilePath) {
 
 Review comment:
   I changed it back.




[GitHub] [carbondata] jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change 
default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376776809
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ##
 @@ -1083,7 +1083,7 @@ private CarbonCommonConstants() {
* The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty.
* Specially, empty means that Carbondata will not compress the sort temp files.
*/
-  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "SNAPPY";
+  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "zstd";
 
 Review comment:
   fixed




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583835072
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1893/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance 
issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583834562
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1894/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance 
issue for drop table
URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583829617
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/192/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress 
offheap data in columnpage directly, avoding a copy of data from offhead to 
heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583829549
 
 
   Build Success with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/191/
   




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583827775
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1892/
   




[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

2020-02-09 Thread GitBox
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap 
data in columnpage directly, avoding a copy of data from offhead to heap before 
compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583827606
 
 
   retest this please




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-02-09 Thread GitBox
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583827649
 
 
   Build Failed  with Spark 2.4.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/190/
   




[GitHub] [carbondata] niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change 
default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376770522
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
 ##
 @@ -285,17 +286,39 @@ public static String getSegmentPath(String tablePath, String segmentId) {
   }
 
   /**
-   * Gets data file name only with out path
-   *
-   * @param filePartNo  data file part number
-   * @param taskNo  task identifier
-   * @param factUpdateTimeStamp unique identifier to identify an update
-   * @return gets data file name only with out path
+   * Gets data file name only, without parent path
*/
   public static String getCarbonDataFileName(Integer filePartNo, String taskNo, int bucketNumber,
-  int batchNo, String factUpdateTimeStamp, String segmentNo) {
-return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
-+ bucketNumber + "-" + segmentNo + "-" + factUpdateTimeStamp + CARBON_DATA_EXT;
+  int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
+Objects.requireNonNull(filePartNo);
+Objects.requireNonNull(taskNo);
+Objects.requireNonNull(factUpdateTimeStamp);
+Objects.requireNonNull(compressor);
+
+// Start from CarbonData 2.0, the data file name patten is:
+// partNo-taskNo-batchNo-bucketNo-segmentNo-timestamp.compressor.carbondata
+// For example:
+// part-0-0_batchno0-0-0-1580982686749.zstd.carbondata
+//
+// If the compressor name is missing, the file is compressed by snappy, which is
+// the default compressor in CarbonData 1.x
+
+return new StringBuffer().append(DATA_PART_PREFIX)
 
 Review comment:
   There is no need to use StringBuffer to build the string; plain string 
concatenation will do.
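
A minimal sketch of the concatenation form the comment asks for, not the code that was merged. The class name `DataFileNameSketch`, the helper `compressorOf`, and the three constant values are assumptions for illustration; the constant values are inferred from the example file name in the diff comment rather than copied from `CarbonTablePath`. The parsing helper also assumes that no other part of the file name contains a dot.

```java
import java.util.Objects;

// Hypothetical stand-in for CarbonTablePath, for illustration only.
final class DataFileNameSketch {
  // Assumed values, inferred from "part-0-0_batchno0-0-0-1580982686749.zstd.carbondata".
  static final String DATA_PART_PREFIX = "part-";
  static final String BATCH_PREFIX = "_batchno";
  static final String CARBON_DATA_EXT = ".carbondata";
  // Per the diff comment, a name without a compressor part means snappy (1.x default).
  static final String LEGACY_COMPRESSOR = "snappy";

  private DataFileNameSketch() { }

  // Builds partNo-taskNo-batchNo-bucketNo-segmentNo-timestamp.compressor.carbondata
  // with plain concatenation instead of StringBuffer.
  static String getCarbonDataFileName(Integer filePartNo, String taskNo, int bucketNumber,
      int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
    Objects.requireNonNull(filePartNo);
    Objects.requireNonNull(taskNo);
    Objects.requireNonNull(factUpdateTimeStamp);
    Objects.requireNonNull(compressor);
    return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
        + bucketNumber + "-" + segmentNo + "-" + factUpdateTimeStamp + "." + compressor
        + CARBON_DATA_EXT;
  }

  // Hypothetical helper: reads the compressor back out of a data file name,
  // falling back to snappy when the name has no compressor part.
  static String compressorOf(String dataFileName) {
    String stem = dataFileName.endsWith(CARBON_DATA_EXT)
        ? dataFileName.substring(0, dataFileName.length() - CARBON_DATA_EXT.length())
        : dataFileName;
    int dot = stem.lastIndexOf('.');
    return dot < 0 ? LEGACY_COMPRESSOR : stem.substring(dot + 1);
  }
}
```

A single concatenation expression like this is compiled by javac without any StringBuffer, so it also avoids StringBuffer's synchronized methods.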




[GitHub] [carbondata] niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change 
default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376770113
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/readcommitter/LatestFilesReadCommittedScope.java
 ##
 @@ -163,7 +163,7 @@ public SegmentRefreshInfo getCommittedSegmentRefreshInfo(Segment segment, Update
 return segmentRefreshInfo;
   }
 
-  private String getSegmentID(String carbonIndexFileName, String indexFilePath) {
+  private String getTimestamp(String carbonIndexFileName, String indexFilePath) {
 
 Review comment:
   Why change the method name to getTimestamp?




[GitHub] [carbondata] niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd

2020-02-09 Thread GitBox
niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change 
default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376769368
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ##
 @@ -1083,7 +1083,7 @@ private CarbonCommonConstants() {
* The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty.
* Specially, empty means that Carbondata will not compress the sort temp files.
*/
-  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "SNAPPY";
+  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "zstd";
 
 Review comment:
   ```suggestion
 public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "ZSTD";
   ```
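
The suggestion keeps the default in the same upper-case spelling as the documented option values. Whether CarbonData compares these names case-sensitively is not shown in this thread. As a hedged sketch only, a case-insensitive check against the documented values could look like the following; the class `SortTempCompressorSketch` and the method `normalize` are hypothetical names, not CarbonData APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class SortTempCompressorSketch {
  // Values taken from the Javadoc in the diff; the empty string means "do not compress".
  static final Set<String> ALLOWED =
      new HashSet<>(Arrays.asList("SNAPPY", "GZIP", "BZIP2", "LZ4", "ZSTD", ""));

  private SortTempCompressorSketch() { }

  // Returns the upper-case form of the configured name,
  // or throws if it is not one of the documented values.
  static String normalize(String configured) {
    Objects.requireNonNull(configured);
    String name = configured.trim().toUpperCase(Locale.ROOT);
    if (!ALLOWED.contains(name)) {
      throw new IllegalArgumentException("Unsupported sort temp compressor: " + configured);
    }
    return name;
  }
}
```

With a check of this kind, "zstd" and "ZSTD" would be treated alike; without one, only the spelling the code compares against would match.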

