[GitHub] [carbondata] Karan980 commented on a change in pull request #3974: [Carbondata-3999] Fix permission issue of indexServerTmp directory
Karan980 commented on a change in pull request #3974:
URL: https://github.com/apache/carbondata/pull/3974#discussion_r505237702

## File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java ##
@@ -3250,14 +3250,14 @@ public static String getIndexServerTempPath() {
   public static CarbonFile createTempFolderForIndexServer(String queryId) throws IOException {
     final String path = getIndexServerTempPath();
+    if (!FileFactory.isFileExist(path)) {
+      // Create the new index server temp directory if it does not exist
+      LOGGER.info("Creating Index Server temp folder:" + path);
+      FileFactory
+          .createDirectoryAndSetPermission(path,
+              new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL));

Review comment: Fixed

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
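The pattern in the hunk above (create the directory only if it is missing, then set its permissions explicitly) can be sketched with the JDK alone. This is a minimal illustration, not the CarbonData code: `TempDirSketch` and `ensureSharedTempDir` are hypothetical names, and `java.nio.file` stands in for the Hadoop-backed `FileFactory` and `FsPermission` calls used in the pull request.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

class TempDirSketch {
  // Hypothetical helper: create the directory if absent, then open it to all users.
  static Path ensureSharedTempDir(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      Files.createDirectories(dir);
      // Permissions are set in a separate step so the result does not depend
      // on the umask of the creating process. Skipped on non-POSIX file systems.
      boolean posix = FileSystems.getDefault().supportedFileAttributeViews().contains("posix");
      if (posix) {
        Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxrwxrwx"));
      }
    }
    return dir;
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("indexServerTmpDemo");
    Path dir = ensureSharedTempDir(base.resolve("indexservertmp"));
    System.out.println(dir + " is a directory: " + Files.isDirectory(dir));
  }
}
```

Calling the helper a second time on an existing directory leaves it untouched, which mirrors the existence check in the diff.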
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708958736 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4454/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-708960718 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2704/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI
CarbonDataQA1 commented on pull request #3948: URL: https://github.com/apache/carbondata/pull/3948#issuecomment-708963061 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2701/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI
CarbonDataQA1 commented on pull request #3948: URL: https://github.com/apache/carbondata/pull/3948#issuecomment-708965531 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4455/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata
CarbonDataQA1 commented on pull request #3917: URL: https://github.com/apache/carbondata/pull/3917#issuecomment-708966898 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4456/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708968372 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2700/
[GitHub] [carbondata] akkio-97 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command
akkio-97 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-708988105 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901]corrected the documentation
CarbonDataQA1 commented on pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#issuecomment-708992189 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4457/
[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
Karan980 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708993120 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata
CarbonDataQA1 commented on pull request #3917: URL: https://github.com/apache/carbondata/pull/3917#issuecomment-708995578 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2702/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901]corrected the documentation
CarbonDataQA1 commented on pull request #3980: URL: https://github.com/apache/carbondata/pull/3980#issuecomment-708997435 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2703/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
CarbonDataQA1 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-708998997 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4458/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
CarbonDataQA1 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-709021329 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2705/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-709030440 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4459/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command
CarbonDataQA1 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-709087907 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4460/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-709105593 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4461/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK
CarbonDataQA1 commented on pull request #3970: URL: https://github.com/apache/carbondata/pull/3970#issuecomment-709116721 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2707/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command
CarbonDataQA1 commented on pull request #3967: URL: https://github.com/apache/carbondata/pull/3967#issuecomment-709180579 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2706/
[GitHub] [carbondata] nihal0107 opened a new pull request #3985: [WIP]Fixed float variable to take 4 bytes in case of adaptive encoding
nihal0107 opened a new pull request #3985:
URL: https://github.com/apache/carbondata/pull/3985

### Why is this PR needed?
Currently, float values are stored as 8-byte long values.

### What changes were proposed in this PR?
Float values are now stored in 4 bytes.

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes
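The size difference this pull request describes can be illustrated with plain `java.nio.ByteBuffer`. This is a sketch under assumptions, not CarbonData's adaptive encoding: `FloatWidthSketch` and its methods are hypothetical names, and the 8-byte variant simply widens the float's bit pattern into a long-sized slot.

```java
import java.nio.ByteBuffer;

class FloatWidthSketch {
  // Wide form: the 32-bit pattern of the float is widened into an 8-byte long slot.
  static byte[] encodeWide(float value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(Float.floatToIntBits(value)).array();
  }

  // The low 32 bits of the stored long hold the original float bit pattern.
  static float decodeWide(byte[] bytes) {
    return Float.intBitsToFloat((int) ByteBuffer.wrap(bytes).getLong());
  }

  // Narrow form: the float is written directly in its native 4 bytes.
  static byte[] encodeNarrow(float value) {
    return ByteBuffer.allocate(Float.BYTES).putFloat(value).array();
  }

  static float decodeNarrow(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getFloat();
  }

  public static void main(String[] args) {
    float sample = -2.25f;
    System.out.println("wide bytes:   " + encodeWide(sample).length);
    System.out.println("narrow bytes: " + encodeNarrow(sample).length);
    System.out.println("round trip:   " + decodeNarrow(encodeNarrow(sample)));
  }
}
```

Both forms round-trip the value exactly; the narrow form halves the storage used per value.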
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [WIP]Fixed float variable to take 4 bytes in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709200276 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4463/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [WIP]Fixed float variable to take 4 bytes in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709204135 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2709/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
CarbonDataQA1 commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-709263365 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4462/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
CarbonDataQA1 commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-709265320 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2708/
[jira] [Created] (CARBONDATA-4034) Improve the time-comsuming of Horizontal Compaction for update
Jiayu Shen created CARBONDATA-4034:
--
Summary: Improve the time-comsuming of Horizontal Compaction for update
Key: CARBONDATA-4034
URL: https://issues.apache.org/jira/browse/CARBONDATA-4034
Project: CarbonData
Issue Type: Bug
Reporter: Jiayu Shen

In the update flow, horizontal compaction becomes significantly slower when the update touches many segments (or many blocks). The log below shows the time cost in one such case.

2020-10-10 09:38:10,466 | INFO | [OperationManager-Background-Pool-28] | Horizontal Update Compaction operation started for [ods_oms.oms_wh_outbound_order]
2020-10-10 09:50:25,718 | INFO | [OperationManager-Background-Pool-28] | Horizontal Update Compaction operation completed for [ods_oms.oms_wh_outbound_order].
2020-10-10 10:15:44,302 | INFO | [OperationManager-Background-Pool-28] | Horizontal Delete Compaction operation started for [ods_oms.oms_wh_outbound_order]
2020-10-10 10:15:54,874 | INFO | [OperationManager-Background-Pool-28] | Horizontal Delete Compaction operation completed for [ods_oms.oms_wh_outbound_order].

In this PR, we optimize the process between the second and third rows of the log (about 25 minutes, from 09:50:25 to 10:15:44) by optimizing the method _performDeleteDeltaCompaction_ in the horizontal compaction flow.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4034) Improve the time-consuming of Horizontal Compaction for update
[ https://issues.apache.org/jira/browse/CARBONDATA-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiayu Shen updated CARBONDATA-4034: --- Summary: Improve the time-consuming of Horizontal Compaction for update (was: Improve the time-comsuming of Horizontal Compaction for update) > Improve the time-consuming of Horizontal Compaction for update > -- > > Key: CARBONDATA-4034 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4034 > Project: CarbonData > Issue Type: Bug >Reporter: Jiayu Shen >Priority: Minor
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [WIP]Fixed float variable to take 4 bytes in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709317970 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4464/
[GitHub] [carbondata] shenjiayu17 opened a new pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 opened a new pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986

### Why is this PR needed?
The horizontal compaction flow is too slow when an update touches many segments (or many blocks), so this PR analyzes the flow and reduces the time it takes.

### What changes were proposed in this PR?
In performDeleteDeltaCompaction, optimize the method getSegListIUDCompactionQualified: combine the two traversals of segments that perform the same processing, and move the listFiles call outside the traversal of blocks.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No
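The idea of moving the file listing outside the per-block loop can be sketched as follows. This is a minimal illustration under assumptions, not the code from the pull request: the class and method names are hypothetical, file names are plain strings standing in for `CarbonFile` objects, the `<blockName>-<timestamp>.deletedelta` naming is assumed for the example, and a block is treated as qualified when it has at least `threshold` delete delta files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DeleteDeltaGroupingSketch {
  static final String SUFFIX = ".deletedelta"; // assumed file suffix for this example

  // Group the result of ONE segment listing by block name.
  static Map<String, List<String>> groupByBlock(List<String> segmentFileNames) {
    Map<String, List<String>> byBlock = new HashMap<>();
    for (String name : segmentFileNames) {
      int dash = name.lastIndexOf('-');
      if (!name.endsWith(SUFFIX) || dash < 0) {
        continue; // not a delete delta file under the assumed naming
      }
      String block = name.substring(0, dash);
      byBlock.computeIfAbsent(block, k -> new ArrayList<>()).add(name);
    }
    return byBlock;
  }

  // Each block is a map lookup instead of a fresh directory listing.
  static List<String> qualifiedBlocks(List<String> blockNames,
      List<String> segmentFileNames, int threshold) {
    Map<String, List<String>> byBlock = groupByBlock(segmentFileNames);
    List<String> qualified = new ArrayList<>();
    for (String block : blockNames) {
      List<String> deltas = byBlock.get(block);
      if (deltas != null && deltas.size() >= threshold) {
        qualified.add(block);
      }
    }
    return qualified;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList("part-0-1.deletedelta", "part-0-2.deletedelta",
        "part-1-1.deletedelta", "part-0.carbondata");
    System.out.println(qualifiedBlocks(Arrays.asList("part-0", "part-1"), files, 2));
  }
}
```

With N blocks in a segment, this replaces N listings with one listing plus N map lookups, which matters most when each listing is a remote file-system call.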
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [WIP]Fixed float variable to take 4 bytes in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709323198 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2710/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709323928 Can one of the admins verify this patch?
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-709379641 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2711/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-709387841 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4465/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
ajantha-bhat commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709454528 Add to whitelist
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709457343 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2712/
[GitHub] [carbondata] marchpure commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
marchpure commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r505670979 ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1210,6 +1198,39 @@ private static boolean checkDeleteDeltaFilesInSeg(Segment seg, return blockLists; } + private static List checkAndGetDeleteDeltaFilesInSeg(Segment seg, + SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) { + +List blockLists = new ArrayList<>(); + +Map> blockAndDeleteDeltaFilesMap = + segmentUpdateStatusManager.getDeleteDeltaFilesForSegment(seg); + +List blockNameList = + segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo()); + +Set uniqueBlocks = new HashSet(); +for (final String blockName : blockNameList) { + + List deleteDeltaFiles = blockAndDeleteDeltaFilesMap.get(blockName); + + if (null != deleteDeltaFiles) { +for (CarbonFile blocks : deleteDeltaFiles) { Review comment: if (delteDeltaFiles.size < threshold) continue ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1039,22 +1039,10 @@ private static boolean isSegmentValid(LoadMetadataDetails seg) { if (CompactionType.IUD_DELETE_DELTA == compactionTypeIUD) { int numberDeleteDeltaFilesThreshold = CarbonProperties.getInstance().getNoDeleteDeltaFilesThresholdForIUDCompaction(); - List deleteSegments = new ArrayList<>(); for (Segment seg : segments) { -if (checkDeleteDeltaFilesInSeg(seg, segmentUpdateStatusManager, Review comment: remove checkDeleteDeltaFilesInSeg function ## File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ## @@ -455,6 +455,51 @@ public boolean accept(CarbonFile pathName) { return null; } + public Map> getDeleteDeltaFilesForSegment(final Segment seg) { Review comment: remove getDeleteDeltaFilesList function ## File path: 
processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1210,6 +1198,39 @@ private static boolean checkDeleteDeltaFilesInSeg(Segment seg, return blockLists; } + private static List checkAndGetDeleteDeltaFilesInSeg(Segment seg, + SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) { + +List blockLists = new ArrayList<>(); + +Map> blockAndDeleteDeltaFilesMap = + segmentUpdateStatusManager.getDeleteDeltaFilesForSegment(seg); + +List blockNameList = + segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo()); + +Set uniqueBlocks = new HashSet(); +for (final String blockName : blockNameList) { + + List deleteDeltaFiles = blockAndDeleteDeltaFilesMap.get(blockName); + + if (null != deleteDeltaFiles) { +for (CarbonFile blocks : deleteDeltaFiles) { Review comment: blocks -> block ## File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ## @@ -455,6 +455,51 @@ public boolean accept(CarbonFile pathName) { return null; } + public Map> getDeleteDeltaFilesForSegment(final Segment seg) { +String segmentPath = CarbonTablePath.getSegmentPath( + identifier.getTablePath(), seg.getSegmentNo()); +CarbonFile segDir = FileFactory.getCarbonFile(segmentPath); +CarbonFile[] allDeleteDeltaFilesOfSegment = segDir.listFiles(new CarbonFileFilter() { + @Override + public boolean accept(CarbonFile pathName) { +String fileName = pathName.getName(); +return (pathName.getSize() > 0) && Review comment: getSize() will trigger one S3 IO. 
Remove getSize(). ## File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ## @@ -455,6 +455,51 @@ public boolean accept(CarbonFile pathName) { return null; } + public Map> getDeleteDeltaFilesForSegment(final Segment seg) { +String segmentPath = CarbonTablePath.getSegmentPath( + identifier.getTablePath(), seg.getSegmentNo()); Review comment: If SegmentUpdateDetails does not contain seg, we should return an empty result directly, which can save a lot of IO overhead. ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1210,6 +1198,39 @@ private static boolean checkDeleteDeltaFilesInSeg(Segment seg, return blockLists; } + private static List checkAndGetDeleteDeltaFilesInSeg(Segment seg, + SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) { + +List blockLists = new ArrayList<>(); + +Map> blockAndDeleteDeltaFilesMap = + segmentUpdateStatusManager.getDeleteDeltaFi
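The threshold check suggested in the review above (skip a block as soon as it has fewer delete delta files than the threshold) could look roughly like the following. This is only a minimal sketch under stated assumptions: it uses plain strings instead of `CarbonFile`, and the class and method names (`DeltaThresholdSketch`, `blocksAtOrAboveThreshold`) are hypothetical, not part of CarbonData.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DeltaThresholdSketch {

  // Hypothetical helper: returns the names of blocks whose number of delete
  // delta files is at least `threshold`. Blocks below the threshold are
  // skipped before any per-file work is done.
  static List<String> blocksAtOrAboveThreshold(
      Map<String, List<String>> deltaFilesByBlock, int threshold) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : deltaFilesByBlock.entrySet()) {
      List<String> deltaFiles = entry.getValue();
      // Mirrors the reviewer's "if (size < threshold) continue" suggestion.
      if (deltaFiles == null || deltaFiles.size() < threshold) {
        continue;
      }
      result.add(entry.getKey());
    }
    return result;
  }
}
```

Whether the comparison should be strict or inclusive depends on how the threshold property is defined in CarbonData; the sketch simply follows the wording of the review comment.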
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709459321 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4466/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] marchpure commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
marchpure commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709459684 Checkstyle fails. You can run the checkstyle check in your local environment with `mvn clean install -DskipTests`. Note: don't use `mvn clean package -DskipTests`.
[GitHub] [carbondata] marchpure commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
marchpure commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-709674010 retest this please
[GitHub] [carbondata] marchpure commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
marchpure commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709674242 add some logs and comments.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709688032 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4468/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
CarbonDataQA1 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709694022 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2714/
[GitHub] [carbondata] marchpure commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…
marchpure commented on pull request #3981: URL: https://github.com/apache/carbondata/pull/3981#issuecomment-709698383 retest this please
[GitHub] [carbondata] marchpure commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…
marchpure commented on pull request #3977: URL: https://github.com/apache/carbondata/pull/3977#issuecomment-709698300 retest this please
[GitHub] [carbondata] Zhangshunyu commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
Zhangshunyu commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709698583 Have you tested this optimization? Could you please share a before-and-after comparison for this change?
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
CarbonDataQA1 commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-709705817 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4469/
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r506020436 ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1039,22 +1039,10 @@ private static boolean isSegmentValid(LoadMetadataDetails seg) { if (CompactionType.IUD_DELETE_DELTA == compactionTypeIUD) { int numberDeleteDeltaFilesThreshold = CarbonProperties.getInstance().getNoDeleteDeltaFilesThresholdForIUDCompaction(); - List deleteSegments = new ArrayList<>(); for (Segment seg : segments) { -if (checkDeleteDeltaFilesInSeg(seg, segmentUpdateStatusManager, Review comment: Done
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r506020565 ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1210,6 +1198,39 @@ private static boolean checkDeleteDeltaFilesInSeg(Segment seg, return blockLists; } + private static List checkAndGetDeleteDeltaFilesInSeg(Segment seg, + SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) { + +List blockLists = new ArrayList<>(); + +Map> blockAndDeleteDeltaFilesMap = + segmentUpdateStatusManager.getDeleteDeltaFilesForSegment(seg); + +List blockNameList = + segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo()); + +Set uniqueBlocks = new HashSet(); +for (final String blockName : blockNameList) { + + List deleteDeltaFiles = blockAndDeleteDeltaFilesMap.get(blockName); + + if (null != deleteDeltaFiles) { +for (CarbonFile blocks : deleteDeltaFiles) { + String task = CarbonTablePath.DataFileUtil.getTaskNo(blocks.getName()); + String timestamp = + CarbonTablePath.DataFileUtil.getTimeStampFromDeleteDeltaFile(blocks.getName()); + String taskAndTimeStamp = task + "-" + timestamp; + uniqueBlocks.add(taskAndTimeStamp); Review comment: Done
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue
CarbonDataQA1 commented on pull request #3982: URL: https://github.com/apache/carbondata/pull/3982#issuecomment-709706349 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2715/
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r506020950 ## File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ## @@ -455,6 +455,51 @@ public boolean accept(CarbonFile pathName) { return null; } + public Map> getDeleteDeltaFilesForSegment(final Segment seg) { +String segmentPath = CarbonTablePath.getSegmentPath( + identifier.getTablePath(), seg.getSegmentNo()); +CarbonFile segDir = FileFactory.getCarbonFile(segmentPath); +CarbonFile[] allDeleteDeltaFilesOfSegment = segDir.listFiles(new CarbonFileFilter() { + @Override + public boolean accept(CarbonFile pathName) { +String fileName = pathName.getName(); +return (pathName.getSize() > 0) && Review comment: Done
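The review point addressed here was that calling `getSize()` inside the file filter costs one remote IO per file on S3. A minimal sketch of a name-only filter follows, assuming a hypothetical `RemoteFile` stand-in (not CarbonData's `CarbonFile`) that counts size lookups, and a caller-supplied suffix. The rest of the real filter condition is cut off in the diff above, so nothing here should be read as the actual predicate.

```java
import java.util.ArrayList;
import java.util.List;

final class NameOnlyFilterSketch {

  // Hypothetical stand-in for a remote file handle. Each getSize() call is
  // counted to represent one remote metadata request.
  static final class RemoteFile {
    static int sizeCalls = 0;
    private final String name;
    private final long bytes;

    RemoteFile(String name, long bytes) {
      this.name = name;
      this.bytes = bytes;
    }

    String getName() {
      return name;
    }

    long getSize() {
      sizeCalls++;
      return bytes;
    }
  }

  // Filters purely on the file name, so no size lookup is triggered.
  static List<RemoteFile> filterByName(List<RemoteFile> files, String suffix) {
    List<RemoteFile> accepted = new ArrayList<>();
    for (RemoteFile file : files) {
      if (file.getName().endsWith(suffix)) {
        accepted.add(file);
      }
    }
    return accepted;
  }
}
```

The trade-off is that empty files are no longer excluded by the filter itself; whether that is acceptable depends on how the caller handles zero-length delta files.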
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r506020781 ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1210,6 +1198,39 @@ private static boolean checkDeleteDeltaFilesInSeg(Segment seg, return blockLists; } + private static List checkAndGetDeleteDeltaFilesInSeg(Segment seg, + SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) { + +List blockLists = new ArrayList<>(); + +Map> blockAndDeleteDeltaFilesMap = + segmentUpdateStatusManager.getDeleteDeltaFilesForSegment(seg); + +List blockNameList = + segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo()); + +Set uniqueBlocks = new HashSet(); +for (final String blockName : blockNameList) { + + List deleteDeltaFiles = blockAndDeleteDeltaFilesMap.get(blockName); + + if (null != deleteDeltaFiles) { +for (CarbonFile blocks : deleteDeltaFiles) { Review comment: Done
[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
shenjiayu17 commented on a change in pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#discussion_r506020858 ## File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java ## @@ -1210,6 +1198,39 @@ private static boolean checkDeleteDeltaFilesInSeg(Segment seg, return blockLists; } + private static List checkAndGetDeleteDeltaFilesInSeg(Segment seg, + SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) { + +List blockLists = new ArrayList<>(); + +Map> blockAndDeleteDeltaFilesMap = + segmentUpdateStatusManager.getDeleteDeltaFilesForSegment(seg); + +List blockNameList = + segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo()); + +Set uniqueBlocks = new HashSet(); +for (final String blockName : blockNameList) { + + List deleteDeltaFiles = blockAndDeleteDeltaFilesMap.get(blockName); + + if (null != deleteDeltaFiles) { +for (CarbonFile blocks : deleteDeltaFiles) { Review comment: Done
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709707004 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4474/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709707683 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2720/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-709708185 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4470/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709711202 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4475/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709712170 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2721/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code
CarbonDataQA1 commented on pull request #3950: URL: https://github.com/apache/carbondata/pull/3950#issuecomment-709735909 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2716/
[GitHub] [carbondata] nihal0107 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
nihal0107 commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709745588 retest this please.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
CarbonDataQA1 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-709762201 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2717/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…
CarbonDataQA1 commented on pull request #3981: URL: https://github.com/apache/carbondata/pull/3981#issuecomment-709765901 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4472/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3981: [CARBONDATA-4031] Incorrect query result after Update/Delete and Inse…
CarbonDataQA1 commented on pull request #3981: URL: https://github.com/apache/carbondata/pull/3981#issuecomment-709770210 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2718/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…
CarbonDataQA1 commented on pull request #3977: URL: https://github.com/apache/carbondata/pull/3977#issuecomment-709772789 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2719/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3977: [CARBONDATA-4027] Fix the wrong modifiedtime of loading files in inse…
CarbonDataQA1 commented on pull request #3977: URL: https://github.com/apache/carbondata/pull/3977#issuecomment-709780358 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4473/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709785844 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4476/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI
ajantha-bhat commented on pull request #3948: URL: https://github.com/apache/carbondata/pull/3948#issuecomment-709786898 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
CarbonDataQA1 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-709794950 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4471/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update
CarbonDataQA1 commented on pull request #3986: URL: https://github.com/apache/carbondata/pull/3986#issuecomment-709805701 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2722/
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
ajantha-bhat commented on a change in pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#discussion_r506065671 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/statistics/PrimitivePageStatsCollector.java ## @@ -256,7 +256,7 @@ private int getDecimalCount(double value) { } private int getDecimalCount(float value) { -return getDecimalCount((double) value); +return getDecimalCount(Double.parseDouble(Float.toString(value))); Review comment: In `AdaptiveDeltaFloatingCodec.java`, please remove lines 323 to 328, as this code was added as a workaround for this issue. ``` } else if (pageDataType == DataTypes.LONG) { int size = pageSize * longSizeInBytes; for (int i = 0; i < size; i += longSizeInBytes) { vector.putDouble(rowId++, (max - ByteUtil.toLongLittleEndian(pageData, i)) / factor); } } ``` Similarly, remove lines 309 to 313 in `AdaptiveFloatingCodec`.
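To illustrate why the diff above replaces the `(double)` cast with a round trip through `Float.toString`: widening a float such as `1.1f` to a double exposes the float's binary rounding error as extra decimal digits, while parsing the float's shortest decimal string does not. The following is a small sketch, assuming finite inputs; `decimalCount` is a hypothetical stand-in written for this example and is not CarbonData's `getDecimalCount(double)`, whose body is not shown in this thread.

```java
import java.math.BigDecimal;

final class FloatDecimalCountDemo {

  // Hypothetical stand-in for getDecimalCount(double): counts the fractional
  // digits in the value's shortest decimal representation. Finite values only.
  static int decimalCount(double value) {
    BigDecimal decimal = new BigDecimal(Double.toString(value)).stripTrailingZeros();
    // A negative scale means there are no fractional digits at all.
    return Math.max(decimal.scale(), 0);
  }

  // Old behaviour in the diff: widen the float's binary value directly.
  static int countViaCast(float value) {
    return decimalCount((double) value);
  }

  // New behaviour in the diff: go through the float's decimal string first.
  static int countViaString(float value) {
    return decimalCount(Double.parseDouble(Float.toString(value)));
  }

  public static void main(String[] args) {
    float value = 1.1f;
    System.out.println("cast:   " + (double) value + " -> " + countViaCast(value));
    System.out.println("string: " + Double.parseDouble(Float.toString(value))
        + " -> " + countViaString(value));
  }
}
```

With the cast, `1.1f` reports many more decimal digits than the single digit the user wrote, which presumably pushes the adaptive encoding toward a wider target data type than necessary.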
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
nihal0107 commented on a change in pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#discussion_r506066673 ## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/statistics/PrimitivePageStatsCollector.java ## @@ -256,7 +256,7 @@ private int getDecimalCount(double value) { } private int getDecimalCount(float value) { -return getDecimalCount((double) value); +return getDecimalCount(Double.parseDouble(Float.toString(value))); Review comment: removed
[GitHub] [carbondata] ajantha-bhat commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding
ajantha-bhat commented on pull request #3985: URL: https://github.com/apache/carbondata/pull/3985#issuecomment-709810787 LGTM. Can merge once the build passes.
[GitHub] [carbondata] kunal642 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example
kunal642 commented on pull request #3914: URL: https://github.com/apache/carbondata/pull/3914#issuecomment-709835508 retest this please
[GitHub] [carbondata] kunal642 commented on a change in pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter
kunal642 commented on a change in pull request #3932: URL: https://github.com/apache/carbondata/pull/3932#discussion_r506085998 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/CarbonTakeOrderedAndProjectExec.scala ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.execution + +import org.apache.spark.rdd.RDD +import org.apache.spark.serializer.Serializer +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.{Attribute, NamedExpression, SortOrder, UnsafeProjection} +import org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering +import org.apache.spark.sql.catalyst.plans.physical.{Partitioning, SinglePartition} +import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec +import org.apache.spark.util.Utils + +// To skip the order at map task +case class CarbonTakeOrderedAndProjectExec( +limit: Int, +sortOrder: Seq[SortOrder], +projectList: Seq[NamedExpression], +child: SparkPlan, +skipMapOrder: Boolean = false, +readFromHead: Boolean = true) extends UnaryExecNode { Review comment: CarbonTakeOrderedAndProjectExec should extend TakeOrderedAndProjectExec, and unmodified methods should not be overridden
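One plausible reading of the `skipMapOrder` and `readFromHead` parameters in the diff above is that, when every partition is already sorted on the order-by column, the map side can skip its own sort and simply read the first `limit` rows. The sketch below illustrates that idea on plain integer lists. It is an assumption-laden illustration in Java, not the PR's Scala implementation, and `TakeOrderedSketch` and `takeOrdered` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TakeOrderedSketch {

  // Each inner list stands for one partition that is ASSUMED to be sorted in
  // ascending order already. `limit` must be non-negative.
  static List<Integer> takeOrdered(List<List<Integer>> sortedPartitions, int limit) {
    List<Integer> candidates = new ArrayList<>();
    for (List<Integer> partition : sortedPartitions) {
      // "Map side": no sort needed, just read the first `limit` rows.
      candidates.addAll(partition.subList(0, Math.min(limit, partition.size())));
    }
    // "Reduce side": a single sort over at most limit * partitions rows.
    Collections.sort(candidates);
    return new ArrayList<>(candidates.subList(0, Math.min(limit, candidates.size())));
  }
}
```

For a descending order-by on the same data, the map side would read from the tail of each partition instead, which may be what `readFromHead = false` is meant to express.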
[GitHub] [carbondata] kunal642 commented on a change in pull request #3932: [CARBONDATA-3994] Skip Order by for map task if it is a first sort column and use limit pushdown for array_contains filter
kunal642 commented on a change in pull request #3932: URL: https://github.com/apache/carbondata/pull/3932#discussion_r506086546 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala ## @@ -984,6 +988,80 @@ private[sql] class CarbonLateDecodeStrategy extends SparkStrategy { null)(sparkSession) } } + + object ExtractTakeOrderedAndProjectExec { + +def unapply(plan: LogicalPlan): Option[CarbonTakeOrderedAndProjectExec] = { + val allRelations = plan.collect { case logicalRelation: LogicalRelation => logicalRelation } + // push down order by limit to carbon map task, + // only when there are only one CarbonDatasourceHadoopRelation + if (allRelations.size != 1 || + allRelations.exists(x => !x.relation.isInstanceOf[CarbonDatasourceHadoopRelation])) { +return None + } + // check and Replace TakeOrderedAndProject with CarbonTakeOrderedAndProjectExec. + val relation = allRelations.head.relation.asInstanceOf[CarbonDatasourceHadoopRelation] + val sparkPlan = plan match { +case ReturnAnswer(rootPlan) => rootPlan match { + case Limit(IntegerLiteral(limit), Sort(order, true, child)) => +TakeOrderedAndProjectExec(limit, + order, + child.output, + planLater(pushLimit(limit, child))) + case Limit(IntegerLiteral(limit), Project(projectList, Sort(order, true, child))) => +TakeOrderedAndProjectExec(limit, order, projectList, planLater(pushLimit(limit, child))) Review comment: Instead of creating a TakeOrderedAndProjectExec first, we should directly prepare the CarbonTakeOrderedAndProjectExec instance based on the checks below. I feel this TakeOrderedAndProjectExec creation is unnecessary.