[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711693802 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2751/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711690989 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4505/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-71106 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4504/
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.
ajantha-bhat commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r507464156

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java ##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
     }
     String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
     TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+    // taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf
+    // and since presto does not use the MR framework for execution, the mapred.task.id will be
+    // null, so prepare a new ID.
+    if (taskAttemptID == null) {
+      SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+      String jobTrackerId = formatter.format(new Date());
+      taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
Concurrent insert may use same taskAttemptID. Can you use a UUID as taskAttemptID or check how ORC writer is doing?

## File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java ##
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to write the page data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+      LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private final Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean isCommitDone;
+
+  public CarbonDataFileWriter(Path outPutPath, List inputColumnNames, Properties properties,
+      JobCon
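To illustrate the concern in the review comment on MapredCarbonOutputFormat.java above: the "MMddHHmm" pattern has minute resolution, so two inserts started within the same minute would format to the same job tracker identifier. The sketch below is plain Java with no Hadoop dependency. The class and method names (TaskIdCollisionSketch, timestampId, uuidId) are hypothetical, and whether a UUID-derived string is acceptable as the identifier passed to TaskAttemptID is an assumption that is not verified here.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.UUID;

public class TaskIdCollisionSketch {

  // Same pattern as in the diff: month, day, hour, minute. No seconds.
  static String timestampId(Date now) {
    return new SimpleDateFormat("MMddHHmm").format(now);
  }

  // Hypothetical alternative suggested in the review: a random UUID,
  // with hyphens stripped so the identifier is a single token.
  static String uuidId() {
    return UUID.randomUUID().toString().replace("-", "");
  }

  public static void main(String[] args) {
    Date first = new Date(0L);
    Date second = new Date(1_000L); // one second later, same minute

    // Prints whether the two timestamp-based identifiers are equal.
    System.out.println(timestampId(first).equals(timestampId(second)));

    // Prints whether two independently generated UUID identifiers are equal.
    System.out.println(uuidId().equals(uuidId()));
  }
}
```

The timestamp variant depends only on the wall clock, so concurrent writers cannot be told apart by it; a random identifier does not have that dependency.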
[GitHub] [carbondata] nihal0107 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
nihal0107 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711567187 retest this please.
[GitHub] [carbondata] ajantha-bhat commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
ajantha-bhat commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-711561462 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711515303 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2746/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711500236 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4500/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…
CarbonDataQA1 commented on pull request #3981: URL: https://github.com/apache/carbondata/pull/3981#issuecomment-711489234 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2745/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…
CarbonDataQA1 commented on pull request #3981: URL: https://github.com/apache/carbondata/pull/3981#issuecomment-711487305 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4499/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711484706 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4502/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711484392 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2748/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711479603 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4501/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711479262 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2747/
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r507380824

## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ##
@@ -1246,8 +1209,22 @@ public static boolean isHorizontalCompactionEnabled() {
     // set the update status.
     segmentUpdateStatusManager.setUpdateStatusDetails(segmentUpdateDetails);
-    CarbonFile[] deleteDeltaFiles =
-        segmentUpdateStatusManager.getDeleteDeltaFilesList(new Segment(seg), blockName);
+    // only when SegmentUpdateDetails contain the specified block
+    // will the method getDeleteDeltaFilesList be executed
+    List<String> blockNameList = segmentUpdateStatusManager.getBlockNameFromSegment(seg);
+    Map<String, List<CarbonFile>> blockAndDeleteDeltaFilesMap = new HashMap<>();
+    CarbonFile[] deleteDeltaFiles = null;
+    if (blockNameList.contains(blockName)) {
+      blockAndDeleteDeltaFilesMap =
+          segmentUpdateStatusManager.getDeleteDeltaFilesList(new Segment(seg));
+    }
+    if (blockAndDeleteDeltaFilesMap.containsKey(blockName)) {

Review comment:
Done
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r507380663

## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ##
@@ -1246,8 +1209,22 @@ public static boolean isHorizontalCompactionEnabled() {
     // set the update status.
     segmentUpdateStatusManager.setUpdateStatusDetails(segmentUpdateDetails);
-    CarbonFile[] deleteDeltaFiles =
-        segmentUpdateStatusManager.getDeleteDeltaFilesList(new Segment(seg), blockName);
+    // only when SegmentUpdateDetails contain the specified block
+    // will the method getDeleteDeltaFilesList be executed
+    List<String> blockNameList = segmentUpdateStatusManager.getBlockNameFromSegment(seg);
+    Map<String, List<CarbonFile>> blockAndDeleteDeltaFilesMap = new HashMap<>();
+    CarbonFile[] deleteDeltaFiles = null;
+    if (blockNameList.contains(blockName)) {

Review comment:
Done. Combined the two judgements.
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r507380251

## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ##
@@ -1246,8 +1209,22 @@ public static boolean isHorizontalCompactionEnabled() {
     // set the update status.
     segmentUpdateStatusManager.setUpdateStatusDetails(segmentUpdateDetails);
-    CarbonFile[] deleteDeltaFiles =
-        segmentUpdateStatusManager.getDeleteDeltaFilesList(new Segment(seg), blockName);
+    // only when SegmentUpdateDetails contain the specified block
+    // will the method getDeleteDeltaFilesList be executed
+    List<String> blockNameList = segmentUpdateStatusManager.getBlockNameFromSegment(seg);
+    Map<String, List<CarbonFile>> blockAndDeleteDeltaFilesMap = new HashMap<>();
+    CarbonFile[] deleteDeltaFiles = null;

Review comment:
Done
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r507380031

## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ##
@@ -1246,8 +1209,22 @@ public static boolean isHorizontalCompactionEnabled() {
     // set the update status.
     segmentUpdateStatusManager.setUpdateStatusDetails(segmentUpdateDetails);
-    CarbonFile[] deleteDeltaFiles =
-        segmentUpdateStatusManager.getDeleteDeltaFilesList(new Segment(seg), blockName);
+    // only when SegmentUpdateDetails contain the specified block
+    // will the method getDeleteDeltaFilesList be executed
+    List<String> blockNameList = segmentUpdateStatusManager.getBlockNameFromSegment(seg);
+    Map<String, List<CarbonFile>> blockAndDeleteDeltaFilesMap = new HashMap<>();
+    CarbonFile[] deleteDeltaFiles = null;
+    if (blockNameList.contains(blockName)) {
+      blockAndDeleteDeltaFilesMap =
+          segmentUpdateStatusManager.getDeleteDeltaFilesList(new Segment(seg));
+    }
+    if (blockAndDeleteDeltaFilesMap.containsKey(blockName)) {
+      List<CarbonFile> deleteDeltaFileList = blockAndDeleteDeltaFilesMap.get(blockName);
+      deleteDeltaFiles = deleteDeltaFileList.toArray(new CarbonFile[deleteDeltaFileList.size()]);
+    }
+
+    // CarbonFile[] deleteDeltaFiles =

Review comment:
Done
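The hunk under review in the CarbonDataMergerUtil.java comments above replaces a per-block call with one that returns a map keyed by block name, then looks the block up in that map. A minimal sketch of that lookup pattern follows, written as plain Java with String standing in for CarbonFile. The class and method names (DeleteDeltaLookupSketch, deleteDeltaFilesFor) are hypothetical, and folding the two checks into a single map lookup is an assumption about what "combined" could look like, not the code that was merged.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeleteDeltaLookupSketch {

  // Returns the files recorded for blockName as an array, or null when the
  // map has no entry for that block (mirroring the null default in the diff).
  static String[] deleteDeltaFilesFor(String blockName, Map<String, List<String>> filesByBlock) {
    List<String> files = filesByBlock.get(blockName);
    return files == null ? null : files.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // Stand-in for the map the diff obtains once per segment.
    Map<String, List<String>> filesByBlock = new HashMap<>();
    filesByBlock.put("block-a", Arrays.asList("first.deletedelta", "second.deletedelta"));

    // A block with delete delta files, then a block without any.
    System.out.println(Arrays.toString(deleteDeltaFilesFor("block-a", filesByBlock)));
    System.out.println(Arrays.toString(deleteDeltaFilesFor("block-b", filesByBlock)));
  }
}
```

Callers of such a helper still need to handle the null result for blocks that have no delete delta files.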
[GitHub] [carbondata] nihal0107 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
nihal0107 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711465736 retest this please.
[GitHub] [carbondata] QiangCai commented on pull request #3934: [WIP] Support Global Unique Id for SegmentNo
QiangCai commented on pull request #3934: URL: https://github.com/apache/carbondata/pull/3934#issuecomment-711460106 please close this PR and raise another PR to fix the listFiles issue.
[GitHub] [carbondata] marchpure commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…
marchpure commented on pull request #3981: URL: https://github.com/apache/carbondata/pull/3981#issuecomment-711459583 retest this please
[GitHub] [carbondata] marchpure commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
marchpure commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711459338 retest this please