[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583995968 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1908/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583981872 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/206/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583971855 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/205/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583967776 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1906/
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376874268 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/DirectDictDimensionIndexCodec.java ## @@ -46,18 +47,33 @@ public String getName() { public ColumnPageEncoder createEncoder(Map parameter) { return new IndexStorageEncoder() { @Override - void encodeIndexStorage(ColumnPage inputPage) { -BlockIndexerStorage indexStorage; -byte[][] data = inputPage.getByteArrayPage(); + void encodeIndexStorage(ColumnPage input) { +BlockIndexerStorage indexStorage; +boolean isDictionary = input.isLocalDictGeneratedPage(); + +// if need to build invertIndex or RLE, the columnpage should to be organized in Row, +// in the other words, we get the data of columnpage as an array, in which each element +// presenets a row. But if no need to build both invertIndex and RLE, it will increase +// extra overhead, considering data in columnpage was already stored as flattened data, +// and the compression is also on flattened data, to organized data in ROW is actually +// increase the overheadof "Expand" and "Flatten" with on invertIndex and RLE. +// Overall, isFlatted presents do we flatten the data? if need to build invertIndex or RLE, +// isFlattened is set to ture, otherwise, isFlattened is set to false. +boolean isFlattened = !isInvertedIndex && !isDictionary; + +// when isFlattened is true, data[0] is the flattened data of the columnpage. +// when isFlattened is false, data[i] is the ith row of the columnpage. 
+ByteBuffer[] data = input.getByteBufferArrayPage(isFlattened); if (isInvertedIndex) { - indexStorage = new BlockIndexerStorageForShort(data, false, false, isSort); + indexStorage = new BlockIndexerStorageForShort(data, isDictionary, !isDictionary, isSort); Review comment: same as above
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376873977 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/DictDimensionIndexCodec.java ## @@ -46,18 +47,33 @@ public String getName() { public ColumnPageEncoder createEncoder(Map parameter) { return new IndexStorageEncoder() { @Override - void encodeIndexStorage(ColumnPage inputPage) { -BlockIndexerStorage indexStorage; -byte[][] data = inputPage.getByteArrayPage(); + void encodeIndexStorage(ColumnPage input) { +BlockIndexerStorage indexStorage; +boolean isDictionary = input.isLocalDictGeneratedPage(); + +// if need to build invertIndex or RLE, the columnpage should to be organized in Row, +// in the other words, we get the data of columnpage as an array, in which each element +// presenets a row. But if no need to build both invertIndex and RLE, it will increase +// extra overhead, considering data in columnpage was already stored as flattened data, +// and the compression is also on flattened data, to organized data in ROW is actually +// increase the overheadof "Expand" and "Flatten" with on invertIndex and RLE. +// Overall, isFlatted presents do we flatten the data? if need to build invertIndex or RLE, +// isFlattened is set to ture, otherwise, isFlattened is set to false. +boolean isFlattened = !isInvertedIndex && !isDictionary; + +// when isFlattened is true, data[0] is the flattened data of the columnpage. +// when isFlattened is false, data[i] is the ith row of the columnpage. 
+ByteBuffer[] data = input.getByteBufferArrayPage(isFlattened); if (isInvertedIndex) { - indexStorage = new BlockIndexerStorageForShort(data, true, false, isSort); + indexStorage = new BlockIndexerStorageForShort(data, isDictionary, !isDictionary, isSort); Review comment: Passing the same flag with complementary values is not good practice. Either hardcode the values or introduce a new variable.
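One way to act on the comment about `isDictionary, !isDictionary` is to derive each flag once under its own name. This is only a minimal sketch, not the PR's code: the class `EncodeFlags`, its fields `rleOnData` and `isNoDictionary`, and the meaning assigned to each boolean constructor argument are all assumptions made for illustration.

```java
// Hypothetical sketch: names and parameter meanings are assumptions,
// not CarbonData's actual API.
final class EncodeFlags {
  final boolean rleOnData;      // assumed meaning of the second argument
  final boolean isNoDictionary; // assumed meaning of the third argument

  private EncodeFlags(boolean rleOnData, boolean isNoDictionary) {
    this.rleOnData = rleOnData;
    this.isNoDictionary = isNoDictionary;
  }

  // Derive both flags in one place, each under its own name, so that call
  // sites never have to pass `x, !x` and the relationship is documented once.
  static EncodeFlags forPage(boolean isLocalDictGeneratedPage) {
    boolean rleOnData = isLocalDictGeneratedPage;
    boolean isNoDictionary = !isLocalDictGeneratedPage;
    return new EncodeFlags(rleOnData, isNoDictionary);
  }

  public static void main(String[] args) {
    EncodeFlags flags = EncodeFlags.forPage(true);
    System.out.println(flags.rleOnData + " " + flags.isNoDictionary);
  }
}
```

A call site would then read something like `new BlockIndexerStorageForShort(data, flags.rleOnData, flags.isNoDictionary, isSort)`, which says what each argument means instead of repeating the same variable.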
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376873364 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/ComplexDimensionIndexCodec.java ## @@ -46,12 +47,13 @@ public ColumnPageEncoder createEncoder(Map parameter) { return new IndexStorageEncoder() { @Override void encodeIndexStorage(ColumnPage inputPage) { -BlockIndexerStorage indexStorage = -new BlockIndexerStorageForShort(inputPage.getByteArrayPage(), false, false, false); -byte[] flattened = ByteUtil.flatten(indexStorage.getDataPage()); +BlockIndexerStorage indexStorage = +new BlockIndexerStorageForShort(inputPage.getByteBufferArrayPage(false), +false, false, false); +ByteBuffer flattened = ByteUtil.flatten(indexStorage.getDataPage()); Compressor compressor = CompressorFactory.getInstance().getCompressor( inputPage.getColumnCompressorName()); -byte[] compressed = compressor.compressByte(flattened); +byte[] compressed = ByteUtil.byteBufferToBytes(compressor.compressByte(flattened)); Review comment: I think we converted the 2D byte array to a ByteBuffer to support compression directly on the byte buffer, but now we have to convert back to byte[] again. Is it possible to keep the ByteBuffer itself? Compression GC might have been reduced, but decompression GC might have increased now. Please compare performance and memory usage before and after.
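The cost raised in the comment above comes from turning a `ByteBuffer` back into a `byte[]`. The sketch below shows when that conversion has to copy and when it can be avoided, using only the standard `java.nio.ByteBuffer` API. `BufferBytes.toBytes` is a hypothetical helper written for illustration; it is not CarbonData's `ByteUtil.byteBufferToBytes`, whose behavior is not shown in this thread.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not CarbonData's ByteUtil.
final class BufferBytes {
  // Returns the readable bytes of `buf`. A heap buffer that exactly wraps its
  // backing array is returned without copying; any other buffer (for example
  // a direct, off-heap one) has to be copied onto the heap.
  static byte[] toBytes(ByteBuffer buf) {
    if (buf.hasArray() && buf.arrayOffset() == 0 && buf.position() == 0
        && buf.remaining() == buf.array().length) {
      return buf.array(); // zero-copy: the caller shares the backing array
    }
    byte[] out = new byte[buf.remaining()];
    buf.duplicate().get(out); // duplicate() leaves the caller's position untouched
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer direct = ByteBuffer.allocateDirect(4);
    direct.put(new byte[] {1, 2, 3, 4});
    direct.flip();
    System.out.println(Arrays.toString(toBytes(direct)));
  }
}
```

For a direct buffer the copy is unavoidable once a `byte[]` is required, which is why keeping the result as a `ByteBuffer` through to the writer, as the reviewer suggests, is the only way to remove it entirely.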
[GitHub] [carbondata] kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table
kunal642 commented on a change in pull request #3601: [CARBONDATA-3677] Fixed performance issue for drop table URL: https://github.com/apache/carbondata/pull/3601#discussion_r376872898 ## File path: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ## @@ -35,6 +35,7 @@ import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.constants.CarbonLoadOptionConstants; import org.apache.carbondata.core.constants.SortScopeOptions; +import org.apache.carbondata.core.datamap.DataMapLevel; Review comment: done
[GitHub] [carbondata] kunal642 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
kunal642 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583960836 @jackylk Please review. Fixed the comments.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376872535 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ## @@ -289,6 +298,15 @@ public BigDecimal getDecimal(int rowId) { return data; } + @Override + public ByteBuffer[] getByteBufferArrayPage(boolean isFlattened) { Review comment: The change was only for off-heap data, right? So I would expect only the unsafe pages to change. Why were the safe column pages changed as well?
[GitHub] [carbondata] jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583959274 @QiangCai ok, I will verify on cluster, and with old table
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376871815 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java ## @@ -747,6 +759,16 @@ public long getPageLengthInBytes() throws IOException { */ public byte[] compress(Compressor compressor) throws IOException { DataType dataType = columnPageEncoderMeta.getStoreDataType(); + +// if the columnpage is isUnsafeEnabled and the Datatype is primitive. +// we try to compress the data in offheap directly, avoiding a copy from offheap to heap +if (isUnsafeEnabled() && (dataType == DataTypes.BOOLEAN || dataType == BYTE +|| dataType == SHORT || dataType == DataTypes.SHORT_INT || dataType == INT +|| dataType == LONG || dataType == FLOAT || dataType == DOUBLE +|| DataTypes.isDecimal(dataType))) { Review comment: Is Decimal supported? Below I see that getByteBufferArrayPage is unsupported in DecimalColumnPage.
[GitHub] [carbondata] jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
jackylk edited a comment on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583959274 @QiangCai ok, I will verify on cluster, and with old table created before this PR
[GitHub] [carbondata] jackylk commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
jackylk commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583959274 @QiangCai ok, I will verify on cluster
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376869338 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/BlockIndexerStorageForNoInvertedIndexForShort.java ## @@ -79,12 +82,8 @@ private void rleEncodeOnData(List actualDataList) { } } - private byte[][] convertToDataPage(List list) { -byte[][] shortArray = new byte[list.size()][]; -for (int i = 0; i < shortArray.length; i++) { - shortArray[i] = list.get(i); -} -return shortArray; + private ByteBuffer[] convertToDataPage(List list) { Review comment: We should avoid the redundant conversion: use ByteBuffer[] directly everywhere and don't convert to a list.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap befo
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#discussion_r376868515 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/BlockIndexerStorageForNoInvertedIndexForShort.java ## @@ -17,52 +17,55 @@ package org.apache.carbondata.core.datastore.columnar; +import java.nio.ByteBuffer; import java.util.ArrayList; +import java.util.Arrays; import java.util.List; import org.apache.carbondata.core.constants.CarbonCommonConstants; -import org.apache.carbondata.core.util.ByteUtil; /** * Below class will be used to for no inverted index */ -public class BlockIndexerStorageForNoInvertedIndexForShort extends BlockIndexerStorage { +public class BlockIndexerStorageForNoInvertedIndexForShort +extends BlockIndexerStorage { /** * column data */ - private byte[][] dataPage; + private ByteBuffer[] dataPage; private short[] dataRlePage; - public BlockIndexerStorageForNoInvertedIndexForShort(byte[][] dataPage, boolean applyRLE) { + public BlockIndexerStorageForNoInvertedIndexForShort(ByteBuffer[] dataPage, boolean applyRLE) { this.dataPage = dataPage; if (applyRLE) { - List actualDataList = new ArrayList<>(); - for (int i = 0; i < dataPage.length; i++) { -actualDataList.add(dataPage[i]); - } + List actualDataList = Arrays.asList(dataPage); Review comment: **Can we skip converting the array to a list?** Since it is a one-dimensional array now, we can drop the list and use the array directly in the methods below.
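The suggestion to work on the array directly can be illustrated with a run-length pass that never builds a `List`. This is a simplified sketch under stated assumptions: `ArrayRle.runLengths` is a hypothetical name, and it only counts runs of equal consecutive rows; it does not reproduce the layout of `dataRlePage` or the logic of `rleEncodeOnData` in the PR.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch: run-length counting straight over ByteBuffer[], no List.
final class ArrayRle {
  // Returns the length of each run of consecutive equal rows.
  // ByteBuffer.equals compares the remaining bytes of both buffers.
  static int[] runLengths(ByteBuffer[] rows) {
    int[] runs = new int[rows.length];
    int runCount = 0;
    for (int i = 0; i < rows.length; i++) {
      if (i > 0 && rows[i].equals(rows[i - 1])) {
        runs[runCount - 1]++; // extend the current run
      } else {
        runs[runCount++] = 1; // start a new run
      }
    }
    return Arrays.copyOf(runs, runCount);
  }

  public static void main(String[] args) {
    ByteBuffer[] rows = {
        ByteBuffer.wrap(new byte[] {1}),
        ByteBuffer.wrap(new byte[] {1}),
        ByteBuffer.wrap(new byte[] {2})
    };
    System.out.println(Arrays.toString(runLengths(rows)));
  }
}
```

Indexing the array in place avoids both the wrapper created by `Arrays.asList` and the later conversion back in `convertToDataPage`.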
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583955161 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/204/
[GitHub] [carbondata] QiangCai closed pull request #3379: [CARBONDATA-3546] Delete duplicate data between segments
QiangCai closed pull request #3379: [CARBONDATA-3546] Delete duplicate data between segments URL: https://github.com/apache/carbondata/pull/3379
[GitHub] [carbondata] QiangCai closed pull request #3403: [CARBONDATA-3547] Delete duplicate data during GLOBAL_SORT compaction
QiangCai closed pull request #3403: [CARBONDATA-3547] Delete duplicate data during GLOBAL_SORT compaction URL: https://github.com/apache/carbondata/pull/3403
[GitHub] [carbondata] QiangCai closed pull request #3211: [WIP] Support configuring Java version
QiangCai closed pull request #3211: [WIP] Support configuring Java version URL: https://github.com/apache/carbondata/pull/3211
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583935130 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1905/
[GitHub] [carbondata] QiangCai commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
QiangCai commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583934608 It would be better to do further testing on a cluster, not only on a local machine.
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376834155 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java ## @@ -38,11 +37,6 @@ */ private int blockletSize; - /** - * list of page metadata - */ - private List pageMetadataList; Review comment: This means the start/end key will be removed from the loading flow.
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376831843 ## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java ## @@ -135,25 +135,6 @@ DataMapExprWrapper chooseDataMap(DataMapLevel level, FilterResolverIntf resolver return null; } - /** - * Get all datamaps of the table for clearing purpose - */ - public DataMapExprWrapper getAllDataMapsForClear(CarbonTable carbonTable) Review comment: why remove this method?
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376835460 ## File path: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/column/CarbonDimension.java ## @@ -93,15 +86,7 @@ public int getKeyOrdinal() { return keyOrdinal; } - /** - * @return the complexTypeOrdinal - */ - public int getComplexTypeOrdinal() { -return complexTypeOrdinal; - } - public void setComplexTypeOridnal(int complexTypeOrdinal) { Review comment: why not remove it ?
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376832770 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java ## @@ -640,15 +377,91 @@ public int getNumberOfSortColumns() { return numberOfSortColumns; } - public int getNumberOfNoDictSortColumns() { -return numberOfNoDictSortColumns; + public int getLastDimensionColOrdinal() { +return lastDimensionColOrdinal; + } + + public int getNumberOfColumns() { +return numberOfColumnsAfterFlatten; } - public int getNumberOfDictSortColumns() { -return this.numberOfSortColumns - this.numberOfNoDictSortColumns; + public int getNumberOfDictDimensions() { +return numberOfDictDimensions; } - public int getLastDimensionColOrdinal() { -return lastDimensionColOrdinal; + public int getNumberOfSimpleDimensions() { Review comment: primitiveDimension
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376848072 ## File path: integration/spark-common/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchemaCommon.scala ## @@ -111,8 +111,6 @@ case class CarbonMergerMapping( validSegments: Array[Segment], tableId: String, campactionType: CompactionType, -// maxSegmentColCardinality is Cardinality of last segment of compaction -var maxSegmentColCardinality: Array[Int], // maxSegmentColumnSchemaList is list of column schema of last segment of compaction var maxSegmentColumnSchemaList: List[ColumnSchema], Review comment: use CarbonTable directly
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376833731 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java ## @@ -74,31 +72,10 @@ */ private Segment segment; - /** - * id of the Blocklet. - */ - private String blockletId; Review comment: Why remove the blocklet info?
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839981 ## File path: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java ## @@ -756,4 +756,45 @@ public static long toLongLittleEndian(byte[] bytes, int offset) { ((long) bytes[offset + 3] & 0xff) << 24) | (((long) bytes[offset + 2] & 0xff) << 16) | ( ((long) bytes[offset + 1] & 0xff) << 8) | (((long) bytes[offset] & 0xff))); } + + public static byte[] convertDateToBytes(int date) { +return ByteUtil.toBytes(date); + } + + public static byte[] convertDateToBytes(long[] date) { +byte[] output = new byte[date.length * 4]; +for (int i = 0; i < date.length; i++) { + System.arraycopy(ByteUtil.toBytes(date[i]), 0, output, i * 4, 4); +} +return output; + } + + public static int convertBytesToDate(byte[] date) { +return ByteUtil.toInt(date, 0); + } + + public static int convertBytesToDate(byte[] date, int offset) { +return ByteUtil.toInt(date, offset); + } + + public static int dateBytesSize() { +return 4; + } + + public static int[] convertBytesToDateIntArray(byte[] input) { Review comment: convertDateBytesToInts
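The helpers in this diff store each DATE value in a fixed 4 bytes. A round trip of that layout can be sketched with the standard `java.nio.ByteBuffer` API. The class `DateBytes` and its method names are hypothetical, and the sketch assumes big-endian `int` values (the `ByteBuffer` default); the byte order actually used by `ByteUtil.toBytes` is not shown in this thread.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch, not CarbonData's ByteUtil: fixed 4-byte big-endian ints.
final class DateBytes {
  static final int DATE_SIZE = 4;

  // Packs each date into 4 consecutive bytes.
  static byte[] toBytes(int[] dates) {
    ByteBuffer buf = ByteBuffer.allocate(dates.length * DATE_SIZE); // big-endian by default
    for (int date : dates) {
      buf.putInt(date);
    }
    return buf.array();
  }

  // Unpacks a byte array produced by toBytes back into dates.
  static int[] toInts(byte[] bytes) {
    if (bytes.length % DATE_SIZE != 0) {
      throw new IllegalArgumentException("length must be a multiple of " + DATE_SIZE);
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    int[] dates = new int[bytes.length / DATE_SIZE];
    for (int i = 0; i < dates.length; i++) {
      dates[i] = buf.getInt();
    }
    return dates;
  }

  public static void main(String[] args) {
    int[] dates = {0, 18300, -1};
    System.out.println(Arrays.toString(toInts(toBytes(dates))));
  }
}
```

Naming the pair symmetrically, as the reviewer's `convertDateBytesToInts` suggestion does, makes it easier to see that the two directions agree on size and byte order.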
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path URL: https://github.com/apache/carbondata/pull/3598#discussion_r376847677 ## File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala ## @@ -578,15 +562,11 @@ class CarbonMergerRDD[K, V]( } } val updatedMaxSegmentColumnList = new util.ArrayList[ColumnSchema]() -// update cardinality and column schema list according to master schema -val cardinality = CarbonCompactionUtil Review comment: If there is no need to update the cardinality, there is also no need to update the column schema.
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839201

## File path: core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java ##

@@ -217,7 +217,8 @@ public BlockletScannedResult call() throws Exception {
         nextRead.set(true);
         futureIo = readNextBlockletAsync();
       }
-      return blockletScanner.scanBlocklet(rawBlockletColumnChunks);
+      BlockletScannedResult result = blockletScanner.scanBlocklet(rawBlockletColumnChunks);

Review comment:
This change is not required.
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376833358

## File path: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java ##

@@ -640,15 +377,91 @@ public int getNumberOfSortColumns() {
     return numberOfSortColumns;
   }

-  public int getNumberOfNoDictSortColumns() {
-    return numberOfNoDictSortColumns;
+  public int getLastDimensionColOrdinal() {
+    return lastDimensionColOrdinal;
+  }
+
+  public int getNumberOfColumns() {
+    return numberOfColumnsAfterFlatten;
   }

-  public int getNumberOfDictSortColumns() {
-    return this.numberOfSortColumns - this.numberOfNoDictSortColumns;
+  public int getNumberOfDictDimensions() {
+    return numberOfDictDimensions;
   }

-  public int getLastDimensionColOrdinal() {
-    return lastDimensionColOrdinal;
+  public int getNumberOfSimpleDimensions() {
+    return numberOfDictDimensions + numberOfNoDictionaryDimension;
+  }
+
+  public int getNumberOfComplexDimensions() {
+    return complexDimensions.size();
+  }
+
+  public int getNumberOfMeasures() {
+    return measures.size();
+  }
+
+  /**
+   * Return column value length in byte for all dimension columns in the table
+   * for dimension it is -1 (for DATE it is 4),
+   */
+  public int[] createDimColumnValueLength() {
+    int[] length = new int[dimensions.size()];
+    int index = 0;
+    for (CarbonDimension dimension : dimensions) {
+      DataType dataType = dimension.getDataType();
+      if (dataType == DataTypes.DATE) {
+        length[index] = 4;
+      } else {
+        length[index] = -1;
+      }
+      index++;
+    }
+    return length;
+  }
+
+  /**
+   * Return column value length in byte for all columns in the table
+   * for dimension and complex column it is -1 (for DATE it is 4),
+   * for measure is 8 (for decimal is -1)
+   */
+  public int[] createColumnValueLength() {
+    int[] length = new int[numberOfColumnsAfterFlatten];
+    int index = 0;
+    for (CarbonDimension dimension : dimensions) {
+      DataType dataType = dimension.getDataType();
+      if (dataType == DataTypes.DATE) {
+        length[index] = 4;
+      } else {
+        length[index] = -1;
+      }
+      index++;
+    }
+    for (CarbonDimension complexDimension : complexDimensions) {
+      int depth = getNumColumnsAfterFlatten(complexDimension);
+      for (int i = 0; i < depth; i++) {
+        length[index++] = -1;
+      }
+    }
+    for (CarbonMeasure measure : measures) {
+      DataType dataType = measure.getDataType();
+      if (DataTypes.isDecimal(dataType)) {
+        length[index++] = -1;
+      } else {
+        length[index++] = 8;

Review comment:
Why is the length of the other measures 8?
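To make the layout that createColumnValueLength() produces easier to see, here is a self-contained sketch of the same rule with simplified inputs. The class ColumnValueLengthSketch, the Type enum, and the flattenedComplexColumns parameter are hypothetical stand-ins for CarbonDimension, CarbonMeasure and getNumColumnsAfterFlatten; the values 4, 8 and -1 are copied from the Javadoc in the diff and are not an answer to the reviewer's question about why non-decimal measures use 8.

```java
import java.util.List;

/** Hypothetical, simplified model of the column-length array built in the diff. */
final class ColumnValueLengthSketch {
  /** Stand-in for CarbonData's DataType; only the cases the diff distinguishes matter. */
  enum Type { STRING, DATE, INT, LONG, DOUBLE, DECIMAL }

  /** Marker used in the diff for a column without a fixed byte length. */
  static final int VARIABLE_LENGTH = -1;

  private ColumnValueLengthSketch() { }

  /** Order follows the diff: dimensions, then flattened complex columns, then measures. */
  static int[] createColumnValueLength(
      List<Type> dimensions, int flattenedComplexColumns, List<Type> measures) {
    int[] length = new int[dimensions.size() + flattenedComplexColumns + measures.size()];
    int index = 0;
    for (Type dimension : dimensions) {
      // DATE is the only dimension type given a fixed width (4 bytes).
      length[index++] = dimension == Type.DATE ? 4 : VARIABLE_LENGTH;
    }
    for (int i = 0; i < flattenedComplexColumns; i++) {
      length[index++] = VARIABLE_LENGTH;
    }
    for (Type measure : measures) {
      // Per the diff's Javadoc: decimal is variable, every other measure is 8.
      length[index++] = measure == Type.DECIMAL ? VARIABLE_LENGTH : 8;
    }
    return length;
  }
}
```

For two dimensions (STRING, DATE), two flattened complex columns, and two measures (LONG, DECIMAL), the array has six entries in that order.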
[GitHub] [carbondata] QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
QiangCai commented on a change in pull request #3598: [CARBONDATA-3684] Remove MDK and cardinality in write path
URL: https://github.com/apache/carbondata/pull/3598#discussion_r376839786

## File path: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java ##

@@ -756,4 +756,45 @@ public static long toLongLittleEndian(byte[] bytes, int offset) {
         ((long) bytes[offset + 3] & 0xff) << 24) | (((long) bytes[offset + 2] & 0xff) << 16) | (
         ((long) bytes[offset + 1] & 0xff) << 8) | (((long) bytes[offset] & 0xff)));
   }
+
+  public static byte[] convertDateToBytes(int date) {
+    return ByteUtil.toBytes(date);
+  }
+
+  public static byte[] convertDateToBytes(long[] date) {
+    byte[] output = new byte[date.length * 4];
+    for (int i = 0; i < date.length; i++) {
+      System.arraycopy(ByteUtil.toBytes(date[i]), 0, output, i * 4, 4);
+    }
+    return output;
+  }
+
+  public static int convertBytesToDate(byte[] date) {

Review comment:
How about convertDateBytesToInt?
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583925032 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/203/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583883191 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1904/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583883007 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/202/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583877338 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1903/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583871265 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/201/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583867782 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1902/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583860628 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/200/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583851132 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1901/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583850872 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1900/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583848464 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1899/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583845230 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/199/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583845113 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/198/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583844201 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1895/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583842184 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/197/
[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840206 retest this please
[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840176 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840188 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1898/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840028 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/196/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839686 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1897/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839556 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/195/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839247 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1896/
[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839095 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681] Change default compressor to zstd URL: https://github.com/apache/carbondata/pull/3606#issuecomment-583838300 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/193/
[GitHub] [carbondata] jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376777589

## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ##

@@ -285,17 +286,39 @@ public static String getSegmentPath(String tablePath, String segmentId) {
   }

   /**
-   * Gets data file name only with out path
-   *
-   * @param filePartNo data file part number
-   * @param taskNo task identifier
-   * @param factUpdateTimeStamp unique identifier to identify an update
-   * @return gets data file name only with out path
+   * Gets data file name only, without parent path
    */
   public static String getCarbonDataFileName(Integer filePartNo, String taskNo, int bucketNumber,
-      int batchNo, String factUpdateTimeStamp, String segmentNo) {
-    return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
-        + bucketNumber + "-" + segmentNo + "-" + factUpdateTimeStamp + CARBON_DATA_EXT;
+      int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
+    Objects.requireNonNull(filePartNo);
+    Objects.requireNonNull(taskNo);
+    Objects.requireNonNull(factUpdateTimeStamp);
+    Objects.requireNonNull(compressor);
+
+    // Start from CarbonData 2.0, the data file name patten is:
+    // partNo-taskNo-batchNo-bucketNo-segmentNo-timestamp.compressor.carbondata
+    // For example:
+    // part-0-0_batchno0-0-0-1580982686749.zstd.carbondata
+    //
+    // If the compressor name is missing, the file is compressed by snappy, which is
+    // the default compressor in CarbonData 1.x
+
+    return new StringBuffer().append(DATA_PART_PREFIX)

Review comment:
I changed it to StringBuilder; this link (https://stackoverflow.com/questions/47605/string-concatenation-concat-vs-operator) suggests StringBuilder is more efficient.
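For readers following the StringBuffer / StringBuilder / concatenation exchange, the sketch below builds the 2.0-style file name both ways so the results can be compared. It is only an illustration: the class DataFileNameSketch is hypothetical, and the constant values "part-", "_batchno" and ".carbondata" are assumptions inferred from the example name in the diff comment rather than read from CarbonTablePath.

```java
/** Hypothetical sketch of the data file name pattern described in the diff comment. */
final class DataFileNameSketch {
  // Assumed values, inferred from "part-0-0_batchno0-0-0-1580982686749.zstd.carbondata".
  static final String DATA_PART_PREFIX = "part-";
  static final String BATCH_PREFIX = "_batchno";
  static final String CARBON_DATA_EXT = ".carbondata";

  private DataFileNameSketch() { }

  /** Builds the name with an explicit StringBuilder chain. */
  static String withBuilder(int filePartNo, String taskNo, int bucketNumber, int batchNo,
      String timestamp, String segmentNo, String compressor) {
    return new StringBuilder()
        .append(DATA_PART_PREFIX).append(filePartNo).append('-')
        .append(taskNo).append(BATCH_PREFIX).append(batchNo).append('-')
        .append(bucketNumber).append('-')
        .append(segmentNo).append('-')
        .append(timestamp).append('.').append(compressor)
        .append(CARBON_DATA_EXT)
        .toString();
  }

  /** Builds the same name with the + operator in a single expression. */
  static String withConcat(int filePartNo, String taskNo, int bucketNumber, int batchNo,
      String timestamp, String segmentNo, String compressor) {
    return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
        + bucketNumber + "-" + segmentNo + "-" + timestamp + "." + compressor + CARBON_DATA_EXT;
  }
}
```

For a single expression like withConcat, javac already lowers the + operator to a StringBuilder chain (or, from Java 9, to an invokedynamic call), so the two forms should behave much the same here; StringBuffer differs mainly in being synchronized, which a method-local builder does not need.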
[GitHub] [carbondata] jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376776975

## File path: core/src/main/java/org/apache/carbondata/core/readcommitter/LatestFilesReadCommittedScope.java ##

@@ -163,7 +163,7 @@ public SegmentRefreshInfo getCommittedSegmentRefreshInfo(Segment segment, Update
     return segmentRefreshInfo;
   }

-  private String getSegmentID(String carbonIndexFileName, String indexFilePath) {
+  private String getTimestamp(String carbonIndexFileName, String indexFilePath) {

Review comment:
I changed it back.
[GitHub] [carbondata] jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
jackylk commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376776809

## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ##

@@ -1083,7 +1083,7 @@ private CarbonCommonConstants() {
    * The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty.
    * Specially, empty means that Carbondata will not compress the sort temp files.
    */
-  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "SNAPPY";
+  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "zstd";

Review comment:
Fixed.
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583835072 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1893/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583834562 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1894/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table
CarbonDataQA1 commented on issue #3601: [CARBONDATA-3677] Fixed performance issue for drop table URL: https://github.com/apache/carbondata/pull/3601#issuecomment-583829617 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/192/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583829549 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/191/
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583827775 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1892/
[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed. URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583827606 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive
CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive URL: https://github.com/apache/carbondata/pull/3583#issuecomment-583827649 Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/190/
[GitHub] [carbondata] niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376770522

## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ##

@@ -285,17 +286,39 @@ public static String getSegmentPath(String tablePath, String segmentId) {
   }

   /**
-   * Gets data file name only with out path
-   *
-   * @param filePartNo          data file part number
-   * @param taskNo              task identifier
-   * @param factUpdateTimeStamp unique identifier to identify an update
-   * @return gets data file name only with out path
+   * Gets data file name only, without parent path
    */
   public static String getCarbonDataFileName(Integer filePartNo, String taskNo, int bucketNumber,
-      int batchNo, String factUpdateTimeStamp, String segmentNo) {
-    return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
-        + bucketNumber + "-" + segmentNo + "-" + factUpdateTimeStamp + CARBON_DATA_EXT;
+      int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
+    Objects.requireNonNull(filePartNo);
+    Objects.requireNonNull(taskNo);
+    Objects.requireNonNull(factUpdateTimeStamp);
+    Objects.requireNonNull(compressor);
+
+    // Start from CarbonData 2.0, the data file name patten is:
+    // partNo-taskNo-batchNo-bucketNo-segmentNo-timestamp.compressor.carbondata
+    // For example:
+    // part-0-0_batchno0-0-0-1580982686749.zstd.carbondata
+    //
+    // If the compressor name is missing, the file is compressed by snappy, which is
+    // the default compressor in CarbonData 1.x
+
+    return new StringBuffer().append(DATA_PART_PREFIX)

Review comment:
There is no need to use StringBuffer to build the string here; plain string concatenation is fine.
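To illustrate the review comment above, here is a minimal sketch of the same file-name construction written with plain string concatenation. It is a standalone, hypothetical class rather than the actual `CarbonTablePath` code; the three constant values are assumptions inferred from the example file name quoted in the diff, and the way the compressor is placed before the extension follows the pattern described in the diff's comment.

```java
import java.util.Objects;

// Hypothetical standalone sketch; this is not the real CarbonTablePath class.
final class DataFileNameSketch {
  // Assumed values, inferred from the example
  // "part-0-0_batchno0-0-0-1580982686749.zstd.carbondata" in the diff.
  static final String DATA_PART_PREFIX = "part-";
  static final String BATCH_PREFIX = "_batchno";
  static final String CARBON_DATA_EXT = ".carbondata";

  // Builds partNo-taskNo-batchNo-bucketNo-segmentNo-timestamp.compressor.carbondata
  // using ordinary '+' concatenation instead of an explicit StringBuffer.
  static String getCarbonDataFileName(Integer filePartNo, String taskNo, int bucketNumber,
      int batchNo, String factUpdateTimeStamp, String segmentNo, String compressor) {
    Objects.requireNonNull(filePartNo);
    Objects.requireNonNull(taskNo);
    Objects.requireNonNull(factUpdateTimeStamp);
    Objects.requireNonNull(compressor);

    return DATA_PART_PREFIX + filePartNo + "-" + taskNo + BATCH_PREFIX + batchNo + "-"
        + bucketNumber + "-" + segmentNo + "-" + factUpdateTimeStamp
        + "." + compressor + CARBON_DATA_EXT;
  }

  public static void main(String[] args) {
    // Prints part-0-0_batchno0-0-0-1580982686749.zstd.carbondata
    System.out.println(getCarbonDataFileName(0, "0", 0, 0, "1580982686749", "0", "zstd"));
  }
}
```

For a single concatenation expression like this one, the Java compiler already produces efficient code on its own, so an explicit `StringBuffer` (which additionally synchronizes every call) adds nothing.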
[GitHub] [carbondata] niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376770113

## File path: core/src/main/java/org/apache/carbondata/core/readcommitter/LatestFilesReadCommittedScope.java ##

@@ -163,7 +163,7 @@ public SegmentRefreshInfo getCommittedSegmentRefreshInfo(Segment segment, Update
     return segmentRefreshInfo;
   }

-  private String getSegmentID(String carbonIndexFileName, String indexFilePath) {
+  private String getTimestamp(String carbonIndexFileName, String indexFilePath) {

Review comment:
Why was the method renamed to getTimestamp?
[GitHub] [carbondata] niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
niuge01 commented on a change in pull request #3606: [CARBONDATA-3681] Change default compressor to zstd
URL: https://github.com/apache/carbondata/pull/3606#discussion_r376769368

## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ##

@@ -1083,7 +1083,7 @@ private CarbonCommonConstants() {
    * The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty.
    * Specially, empty means that Carbondata will not compress the sort temp files.
    */
-  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "SNAPPY";
+  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "zstd";

Review comment:
```suggestion
  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "ZSTD";
```
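The suggestion above keeps the default in the same upper-case form as the documented option values. As a rough illustration of why consistent casing matters, the sketch below normalizes a configured name before checking it against that documented list. The class, the method name, and the fall-back-to-default behaviour are all hypothetical and are not taken from CarbonData; only the list of option values comes from the Javadoc quoted in the diff.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; not CarbonData's actual property handling.
final class SortTempCompressorSketch {
  static final String DEFAULT = "ZSTD";

  // Option values as listed in the quoted Javadoc; "" means "do not compress".
  static final List<String> SUPPORTED =
      Arrays.asList("SNAPPY", "GZIP", "BZIP2", "LZ4", "ZSTD", "");

  // Upper-cases the configured value with a fixed locale so that "zstd" and
  // "ZSTD" are treated alike. Falling back to the default for unknown or
  // missing values is an assumption made for this sketch.
  static String normalize(String configured) {
    if (configured == null) {
      return DEFAULT;
    }
    String upper = configured.trim().toUpperCase(Locale.ROOT);
    return SUPPORTED.contains(upper) ? upper : DEFAULT;
  }
}
```

If the consuming code compared names case-sensitively instead, a lower-case default such as "zstd" would not match any documented value, which is one reason to keep the constant in the documented form.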