[GitHub] [hudi] codecov-commenter commented on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException
codecov-commenter commented on pull request #3261: URL: https://github.com/apache/hudi/pull/3261#issuecomment-878802290

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3261) Report
> Merging [#3261](https://codecov.io/gh/apache/hudi/pull/3261) (afe140f) into [master](https://codecov.io/gh/apache/hudi/commit/c8a2033c275e21a752893fc89311e1f6846f5a78) (c8a2033) will **increase** coverage by `3.42%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3261/graphs/tree.svg)](https://codecov.io/gh/apache/hudi/pull/3261)

```diff
@@              Coverage Diff              @@
##             master    #3261       +/-   ##
=============================================
+ Coverage     47.71%   51.13%     +3.42%
+ Complexity     5526      417      -5109
=============================================
  Files           934       67       -867
  Lines         41456     3049     -38407
  Branches       4167      330      -3837
=============================================
- Hits          19779     1559     -18220
+ Misses        19917     1350     -18567
+ Partials      1760      140      -1620
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `?` | |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `51.13% <ø> (-8.11%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3261) | Coverage Δ | |
|---|---|---|
| [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3261/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3261/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
| [...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3261/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh) | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
| [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/3261/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-71.82%)` | :arrow_down: |
| [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3261/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
| [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3261/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) |
[jira] [Commented] (HUDI-2150) Rename/Restructure configs for better modularity
[ https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379612#comment-17379612 ] Vinoth Chandar commented on HUDI-2150:

This should be renamed to be consistent with base file terminology:
{code:java}
public static final ConfigProperty PARQUET_SMALL_FILE_LIMIT_BYTES = ConfigProperty
    .key("hoodie.parquet.small.file.limit")
    .defaultValue(String.valueOf(104857600))
    .withDocumentation("Upsert uses this file size to compact new data onto existing files. "
        + "By default, treat any file <= 100MB as a small file.");
{code}

> Rename/Restructure configs for better modularity
> Key: HUDI-2150
> URL: https://issues.apache.org/jira/browse/HUDI-2150
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Code Cleanup
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>
> Given we have a framework now that can capture configs and even their alternatives well, it's time to clean things up.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
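A rename along these lines could keep the old key as a fallback via `withAlternatives`. The sketch below is illustrative only: the new key and constant name are assumptions, not a decision recorded in this issue.

```java
// Hypothetical rename sketch -- the new key and constant name are assumptions.
// withAlternatives keeps the current Parquet-specific key working as a fallback.
public static final ConfigProperty<String> BASE_FILE_SMALL_FILE_LIMIT_BYTES = ConfigProperty
    .key("hoodie.base.file.small.file.limit")              // assumed format-neutral key
    .defaultValue(String.valueOf(104857600))
    .withAlternatives("hoodie.parquet.small.file.limit")   // old key, kept for compatibility
    .withDocumentation("Upsert uses this file size to compact new data onto existing files. "
        + "By default, treat any file <= 100MB as a small file.");
```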
[jira] [Comment Edited] (HUDI-2150) Rename/Restructure configs for better modularity
[ https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379561#comment-17379561 ] Vinoth Chandar edited comment on HUDI-2150 at 7/13/21, 5:57 AM:

Cleaner-related configs are to be moved out of HoodieCompactionConfig into their own HoodieCleanConfig. Archival-related configs are to be moved out of HoodieCompactionConfig into their own HoodieArchivalConfig.

was (Author: vc): Cleaner related configs to be moved out of HoodieCompactionConfig into its own HoodieCleanConfig
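As a sketch of the proposed split, one moved property might look like this. The class layout is an assumption about the restructuring, not committed code; `hoodie.cleaner.commits.retained` is an existing Hudi key used here only as an example.

```java
// Illustrative sketch of an extracted HoodieCleanConfig; the class layout is
// an assumption about the proposed restructuring, not committed code.
public class HoodieCleanConfig extends HoodieConfig {
  public static final ConfigProperty<String> CLEANER_COMMITS_RETAINED = ConfigProperty
      .key("hoodie.cleaner.commits.retained")
      .defaultValue("10")
      .withDocumentation("Number of commits to retain when cleaning older file versions.");
}
```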
[jira] [Commented] (HUDI-2168) AccessControlException for anonymous user
[ https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379611#comment-17379611 ] ASF GitHub Bot commented on HUDI-2168: -- hudi-bot edited a comment on pull request #3264: URL: https://github.com/apache/hudi/pull/3264#issuecomment-878799938 ## CI report: * e8e5e310224eee469a19bcfe7af537154843c318 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=877) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > AccessControlException for anonymous user > - > > Key: HUDI-2168 > URL: https://issues.apache.org/jira/browse/HUDI-2168 > Project: Apache Hudi > Issue Type: Task > Components: Testing >Reporter: Vinay >Assignee: Vinay >Priority: Trivial > Labels: pull-request-available > > Users are facing the following exception while executing test case dependent > on starting Hive service > > {code:java} > Got exception: org.apache.hadoop.security.AccessControlException Permission > denied: user=anonymous, access=WRITE > {code} > This is specifically happening at the time of clearing Hive DB > {code:java} > client.updateHiveSQL("drop database if exists " + > hiveSyncConfig.databaseName); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1548) Fix documentation around schema evolution
[ https://issues.apache.org/jira/browse/HUDI-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379610#comment-17379610 ] ASF GitHub Bot commented on HUDI-1548: -- codope commented on pull request #3257: URL: https://github.com/apache/hudi/pull/3257#issuecomment-878800281 @vinothchandar @n3nash @nsivabalan Can you please review the doc?

> Fix documentation around schema evolution
> Key: HUDI-1548
> URL: https://issues.apache.org/jira/browse/HUDI-1548
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Docs
> Reporter: sivabalan narayanan
> Assignee: Nishith Agarwal
> Priority: Blocker
> Labels: pull-request-available, sev:high, user-support-issues
> Fix For: 0.9.0
>
> Clearly call out what kind of schema evolution is supported by hudi in the documentation.
> Context: https://github.com/apache/hudi/issues/2331
[GitHub] [hudi] codope commented on pull request #3257: [HUDI-1548] Add documentation for schema evolution
codope commented on pull request #3257: URL: https://github.com/apache/hudi/pull/3257#issuecomment-878800281 @vinothchandar @n3nash @nsivabalan Can you please review the doc?
[jira] [Commented] (HUDI-2168) AccessControlException for anonymous user
[ https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379609#comment-17379609 ] ASF GitHub Bot commented on HUDI-2168: -- hudi-bot commented on pull request #3264: URL: https://github.com/apache/hudi/pull/3264#issuecomment-878799938 ## CI report: * e8e5e310224eee469a19bcfe7af537154843c318 UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379608#comment-17379608 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)

> Build cluster plan and execute this plan at once for HoodieClusteringJob
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config.
> If users want to trigger a clustering job, they have to:
> 1. Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> 2. Copy the created clustering instant time from the log output.
> 3. Submit the HoodieClusteringJob again to execute the created clustering plan through the --instant-time config.
> The pain point is that triggering a clustering takes too many steps, and the instant time has to be copied and pasted from the log file manually, so the process cannot be automated.
>
> I just raised a PR to offer a new config named --mode (or -m for short):
> ||--mode||remarks||
> |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute that plan at once using HoodieClusteringJob.
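With the proposed flag, the three-step flow collapses into a single submission. A sketch of the invocation (jar path, properties file, base path, and table name are placeholders; only --mode and its values come from this issue):

```shell
# Sketch only: paths and table options below are placeholders.
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  /path/to/hudi-utilities-bundle.jar \
  --props /path/to/clustering.properties \
  --mode scheduleAndExecute \
  --base-path /tmp/hoodie/table \
  --table-name my_table
```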
[jira] [Commented] (HUDI-2168) AccessControlException for anonymous user
[ https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379607#comment-17379607 ] ASF GitHub Bot commented on HUDI-2168: -- veenaypatil opened a new pull request #3264: URL: https://github.com/apache/hudi/pull/3264 ## What is the purpose of the pull request To fix access control exception while running the test cases which involves starting the Hive service ## Brief change log Set config ``` config.setBoolean("dfs.permissions",false); ``` ## Verify this pull request This pull request is a trivial rework / code cleanup without any test coverage. - Verified the tests are running locally after this change ## Committer checklist - [X] Has a corresponding JIRA in PR title & commit - [X] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[jira] [Updated] (HUDI-2168) AccessControlException for anonymous user
[ https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-2168: Labels: pull-request-available (was: )
[GitHub] [hudi] veenaypatil opened a new pull request #3264: [HUDI-2168] Fix for AccessControlException for anonymous user
veenaypatil opened a new pull request #3264: URL: https://github.com/apache/hudi/pull/3264 ## What is the purpose of the pull request To fix access control exception while running the test cases which involves starting the Hive service ## Brief change log Set config ``` config.setBoolean("dfs.permissions",false); ``` ## Verify this pull request This pull request is a trivial rework / code cleanup without any test coverage. - Verified the tests are running locally after this change ## Committer checklist - [X] Has a corresponding JIRA in PR title & commit - [X] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
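In context, the fix amounts to relaxing HDFS permission checks in the test Hadoop configuration before the Hive service is started. Only the `setBoolean` line is from the PR; the surrounding setup is an assumed illustration, not the actual test code.

```java
// Sketch of the change in context; only the setBoolean line is from the PR.
Configuration config = new Configuration();
// Disable HDFS permission checks so the anonymous test user can WRITE,
// avoiding org.apache.hadoop.security.AccessControlException in tests.
config.setBoolean("dfs.permissions", false);
```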
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379600#comment-17379600 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876) * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN

> Add support to disable meta column to BulkInsert Row Writer path
> Key: HUDI-2161
> URL: https://issues.apache.org/jira/browse/HUDI-2161
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: sivabalan narayanan
> Priority: Major
> Labels: pull-request-available
>
> The objective here is to disable all meta columns so as to avoid storage cost. Also, some benefits could be seen in write latency with the row writer path, as no special handling is required at the RowCreateHandle layer.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation
hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876) * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379599#comment-17379599 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872) * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876) * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379596#comment-17379596 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872) * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379595#comment-17379595 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872) * e56bac615f087cec7817b846809c9f8fd0cc20a5 UNKNOWN
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379593#comment-17379593 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248

## CI report:

* f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)

> BucketAssignFunction NullPointerException
> -----------------------------------------
>
>                 Key: HUDI-2153
>                 URL: https://issues.apache.org/jira/browse/HUDI-2153
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: moran
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
> java.lang.NullPointerException
>     at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:198)
>     at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:159)
>     at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
>     at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:396)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>     at java.lang.Thread.run(Thread.java:748)
>
> The error occurs at line 197 of the BucketAssignFunction class
> (this.context.setCurrentKey(recordKey)).
> Why is this context null?
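The question at the end of the report — why `this.context` is null — is a wiring issue: `BucketAssignFunction` expects the operator hosting it to inject a context before any element is processed, and a generic `KeyedProcessOperator` never performs that injection. The following is a minimal, self-contained sketch of that pattern; the classes and the `KeyContext`/`setContext` shapes are simplified illustrations, not Hudi's or Flink's actual APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the custom context the function expects. */
interface KeyContext {
  void setCurrentKey(Object key);
}

/** Stand-in for BucketAssignFunction: its context must be injected by the operator. */
class AssignFunction {
  KeyContext context; // stays null unless an operator injects it

  void setContext(KeyContext context) {
    this.context = context;
  }

  void processRecord(String recordKey) {
    // Mirrors the reported line: NPE here if no operator ever injected the context.
    context.setCurrentKey(recordKey);
  }
}

/** Stand-in for a dedicated operator that wires the context at setup time. */
class AssignOperator {
  private final AssignFunction fn;
  final List<Object> keys = new ArrayList<>();

  AssignOperator(AssignFunction fn) {
    this.fn = fn;
    fn.setContext(keys::add); // the essential fix: inject before processing
  }

  void processElement(String recordKey) {
    fn.processRecord(recordKey);
  }
}

public class ContextNpeSketch {
  public static void main(String[] args) {
    // Generic-operator path: context never injected, so the NPE surfaces.
    AssignFunction bare = new AssignFunction();
    boolean npe = false;
    try {
      bare.processRecord("uuid-1");
    } catch (NullPointerException e) {
      npe = true;
    }
    System.out.println("NPE without injection: " + npe);

    // Dedicated-operator path: context injected, record processed cleanly.
    AssignOperator op = new AssignOperator(new AssignFunction());
    op.processElement("uuid-1");
    System.out.println("keys seen: " + op.keys);
  }
}
```

This is why swapping the operator that wraps the function (rather than changing the function itself) is enough to make the NPE disappear.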
[GitHub] [hudi] hudi-bot edited a comment on pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException
hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248

## CI report:

* f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379586#comment-17379586 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)

> Build cluster plan and execute this plan at once for HoodieClusteringJob
> ------------------------------------------------------------------------
>
>                 Key: HUDI-2164
>                 URL: https://issues.apache.org/jira/browse/HUDI-2164
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Yue Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering
> plan or execute a clustering plan through the --schedule or --instant-time config.
> If a user wants to trigger a clustering job, he has to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log info.
> # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config.
> The pain point is that there are too many steps when triggering a clustering,
> and the instant time has to be copied and pasted from the log file manually,
> so the process can't be automated.
>
> I just raised a PR to offer a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute
> that plan at once using HoodieClusteringJob.
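The three-step flow and the new --mode table above amount to a simple dispatch, where scheduleAndExecute chains the two existing steps so the instant time never has to be copied from the logs by hand. The sketch below is illustrative only; the class and method names are hypothetical, not Hudi's actual HoodieClusteringJob implementation.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

public class ClusteringModeSketch {
  // Stand-in for the timeline's monotonically increasing instant times.
  private final AtomicInteger clock = new AtomicInteger(100);

  /** Step 1: build a clustering plan and return its instant time. */
  String schedule() {
    return "instant-" + clock.incrementAndGet();
  }

  /** Step 2: execute a previously created plan at the given instant. */
  String execute(String instantTime) {
    if (instantTime == null) {
      throw new IllegalArgumentException("--instant-time is required in execute mode");
    }
    return "executed " + instantTime;
  }

  /** Dispatch mirroring the --mode table: execute | schedule | scheduleAndExecute. */
  String run(String mode, String instantTime) {
    switch (mode.toLowerCase(Locale.ROOT)) {
      case "schedule":
        return "scheduled " + schedule();
      case "execute":
        return execute(instantTime);
      case "scheduleandexecute":
        // The new mode: plan and run in one submission, no manual copy-paste.
        return execute(schedule());
      default:
        throw new IllegalArgumentException("unknown mode: " + mode);
    }
  }

  public static void main(String[] args) {
    ClusteringModeSketch job = new ClusteringModeSketch();
    System.out.println(job.run("schedule", null));
    System.out.println(job.run("scheduleAndExecute", null));
  }
}
```

The design point is that scheduleAndExecute is pure composition of the two existing modes, so no new clustering logic is needed, only the dispatch.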
[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
[jira] [Updated] (HUDI-2168) AccessControlException for anonymous user
[ https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay updated HUDI-2168:
------------------------
    Status: In Progress  (was: Open)

> AccessControlException for anonymous user
> -----------------------------------------
>
>                 Key: HUDI-2168
>                 URL: https://issues.apache.org/jira/browse/HUDI-2168
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Testing
>            Reporter: Vinay
>            Assignee: Vinay
>            Priority: Trivial
>
> Users are facing the following exception while executing test cases dependent
> on starting the Hive service:
>
> {code:java}
> Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=anonymous, access=WRITE
> {code}
> This is specifically happening at the time of clearing the Hive DB:
> {code:java}
> client.updateHiveSQL("drop database if exists " + hiveSyncConfig.databaseName);
> {code}
[jira] [Created] (HUDI-2168) AccessControlException for anonymous user
Vinay created HUDI-2168:
---------------------------
             Summary: AccessControlException for anonymous user
                 Key: HUDI-2168
                 URL: https://issues.apache.org/jira/browse/HUDI-2168
             Project: Apache Hudi
          Issue Type: Task
          Components: Testing
            Reporter: Vinay
            Assignee: Vinay
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379584#comment-17379584 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a UNKNOWN
[jira] [Commented] (HUDI-1985) Website re-design implementation
[ https://issues.apache.org/jira/browse/HUDI-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379583#comment-17379583 ]

Vinoth Govindarajan commented on HUDI-1985:
-------------------------------------------

Hi [~xushiyan], I have experience building websites, and I can volunteer to work on this re-design.

> Website re-design implementation
> --------------------------------
>
>                 Key: HUDI-1985
>                 URL: https://issues.apache.org/jira/browse/HUDI-1985
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Docs
>            Reporter: Raymond Xu
>            Priority: Blocker
>              Labels: documentation
>             Fix For: 0.9.0
>
> To provide better navigation and organization of the Hudi website's info, we
> have done a re-design of the web pages.
> Previous discussion: https://github.com/apache/hudi/issues/2905
>
> See the wireframe and final design at
> https://www.figma.com/file/tipod1JZRw7anZRWBI6sZh/Hudi.Apache?node-id=32%3A6
> (log in to Figma to comment).
> The design is ready for implementation.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a UNKNOWN
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379580#comment-17379580 ]

ASF GitHub Bot commented on HUDI-2161:
--------------------------------------

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931

## CI report:

* 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation
hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931

## CI report:

* 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379577#comment-17379577 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248

## CI report:

* f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException
hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248

## CI report:

* f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379576#comment-17379576 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

danny0405 commented on a change in pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#discussion_r668418303

## File path: hudi-flink/src/main/java/org/apache/hudi/streamer/HoodieFlinkStreamer.java
## @@ -109,7 +109,7 @@ public static void main(String[] args) throws Exception {
         .transform(
             "bucket_assigner",
             TypeInformation.of(HoodieRecord.class),
-            new KeyedProcessOperator<>(new BucketAssignFunction<>(conf)))
+            new BucketAssignOperator<>(new BucketAssignFunction<>(conf)))
         .setParallelism(conf.getInteger(FlinkOptions.BUCKET_ASSIGN_TASKS))

Review comment: Nice catch, can we fix the indentation? And there is another PR that is the same as this one — can we close that?
[GitHub] [hudi] danny0405 commented on a change in pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException
danny0405 commented on a change in pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#discussion_r668418303

## File path: hudi-flink/src/main/java/org/apache/hudi/streamer/HoodieFlinkStreamer.java
## @@ -109,7 +109,7 @@ public static void main(String[] args) throws Exception {
         .transform(
             "bucket_assigner",
             TypeInformation.of(HoodieRecord.class),
-            new KeyedProcessOperator<>(new BucketAssignFunction<>(conf)))
+            new BucketAssignOperator<>(new BucketAssignFunction<>(conf)))
         .setParallelism(conf.getInteger(FlinkOptions.BUCKET_ASSIGN_TASKS))

Review comment: Nice catch, can we fix the indentation? And there is another PR that is the same as this one — can we close that?
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379575#comment-17379575 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

hudi-bot commented on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248

## CI report:

* f1299ed52dcf90635d4f11fef040255cfda9f35b UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException
hudi-bot commented on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248

## CI report:

* f1299ed52dcf90635d4f11fef040255cfda9f35b UNKNOWN
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379572#comment-17379572 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

moranyuwen opened a new pull request #3263:
URL: https://github.com/apache/hudi/pull/3263

JIRA Issue: https://issues.apache.org/jira/browse/HUDI-2153
When HoodieFlinkStreamer is run to write data, the context in BucketAssignFunction is null when the function is loaded; this update resolves the null context.
[GitHub] [hudi] moranyuwen opened a new pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException
moranyuwen opened a new pull request #3263:
URL: https://github.com/apache/hudi/pull/3263

JIRA Issue: https://issues.apache.org/jira/browse/HUDI-2153
When HoodieFlinkStreamer is run to write data, the context in BucketAssignFunction is null when the function is loaded; this update resolves the null context.
[hudi] branch master updated: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250)
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/master by this push:
     new b0089b8  [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250)
b0089b8 is described below

commit b0089b894ad12da11fbd6a0fb08508c7adee68e6
Author: Sagar Sumit
AuthorDate: Tue Jul 13 09:54:40 2021 +0530

    [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250)
---
 .../java/org/apache/hudi/config/HoodieWriteConfig.java     |  3 ++-
 .../java/org/apache/hudi/config/TestHoodieWriteConfig.java | 14 ++++++++++++--
 2 files changed, 14 insertions(+), 3 deletions(-)

```diff
diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
index 20d2846..e2e295d 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
@@ -339,8 +339,9 @@ public class HoodieWriteConfig extends HoodieConfig {
       .withDocumentation("");
 
   public static final ConfigProperty EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION = ConfigProperty
-      .key(AVRO_SCHEMA + ".externalTransformation")
+      .key(AVRO_SCHEMA.key() + ".external.transformation")
       .defaultValue("false")
+      .withAlternatives(AVRO_SCHEMA.key() + ".externalTransformation")
       .withDocumentation("");
 
   private ConsistencyGuardConfig consistencyGuardConfig;
diff --git a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
index 7661e1d..89f7a97 100644
--- a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
@@ -23,6 +23,8 @@
 import org.apache.hudi.config.HoodieWriteConfig.Builder;
 import org.apache.hudi.index.HoodieIndex;
 import org.junit.jupiter.api.Test;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.ValueSource;
 
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
@@ -33,16 +35,23 @@
 import java.util.Map;
 import java.util.Properties;
 
 import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertTrue;
 
 public class TestHoodieWriteConfig {
 
-  @Test
-  public void testPropertyLoading() throws IOException {
+  @ParameterizedTest
+  @ValueSource(booleans = {true, false})
+  public void testPropertyLoading(boolean withAlternative) throws IOException {
     Builder builder = HoodieWriteConfig.newBuilder().withPath("/tmp");
     Map params = new HashMap<>(3);
     params.put(HoodieCompactionConfig.CLEANER_COMMITS_RETAINED_PROP.key(), "1");
     params.put(HoodieCompactionConfig.MAX_COMMITS_TO_KEEP_PROP.key(), "5");
     params.put(HoodieCompactionConfig.MIN_COMMITS_TO_KEEP_PROP.key(), "2");
+    if (withAlternative) {
+      params.put("hoodie.avro.schema.externalTransformation", "true");
+    } else {
+      params.put("hoodie.avro.schema.external.transformation", "true");
+    }
     ByteArrayOutputStream outStream = saveParamsIntoOutputStream(params);
     ByteArrayInputStream inputStream = new ByteArrayInputStream(outStream.toByteArray());
     try {
@@ -54,6 +63,7 @@
     HoodieWriteConfig config = builder.build();
     assertEquals(5, config.getMaxCommitsToKeep());
     assertEquals(2, config.getMinCommitsToKeep());
+    assertTrue(config.shouldUseExternalSchemaTransformation());
   }
 
   @Test
```
[GitHub] [hudi] nsivabalan merged pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
nsivabalan merged pull request #3250:
URL: https://github.com/apache/hudi/pull/3250

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379567#comment-17379567 ]

ASF GitHub Bot commented on HUDI-2161:
--------------------------------------

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931

## CI report:

* 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
* 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)

## Bot commands

@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

> Add support to disable meta column to BulkInsert Row Writer path
> ----------------------------------------------------------------
>
>                 Key: HUDI-2161
>                 URL: https://issues.apache.org/jira/browse/HUDI-2161
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: pull-request-available
>
> Objective here is to disable all meta columns so as to avoid storage cost.
> Also, some benefits could be seen in write latency with row writer path as no
> special handling is required at RowCreateHandle layer.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation
hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931

## CI report:

* 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
* 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379566#comment-17379566 ]

ASF GitHub Bot commented on HUDI-2161:
--------------------------------------

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931

## CI report:

* 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
* 8a212fd77769cbf7e248e971f66109381ba80f71 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation
hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931

## CI report:

* 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
* 8a212fd77769cbf7e248e971f66109381ba80f71 UNKNOWN
[jira] [Updated] (HUDI-2150) Rename/Restructure configs for better modularity
[ https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-2150:
---------------------------------
    Description:
Given we have a framework now, that can capture configs and even their alternatives well, time to clean things up.

  (was:
* Rename HoodieWriteConfig to HoodieClientConfig
* Move bunch of configs from CompactionConfig to StorageConfig
* Introduce new HoodieCleanConfig
* Should we consider lombok or something to automate the defaults/getters/setters
* Consistent name of properties/defaults
* Enforce bounds more strictly)

> Rename/Restructure configs for better modularity
> ------------------------------------------------
>
>                 Key: HUDI-2150
>                 URL: https://issues.apache.org/jira/browse/HUDI-2150
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Code Cleanup
>            Reporter: Vinoth Chandar
>            Assignee: Vinoth Chandar
>            Priority: Major
>
> Given we have a framework now, that can capture configs and even their
> alternatives well, time to clean things up.
[jira] [Commented] (HUDI-2150) Rename/Restructure configs for better modularity
[ https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379561#comment-17379561 ]

Vinoth Chandar commented on HUDI-2150:
--------------------------------------

Cleaner related configs to be moved out of HoodieCompactionConfig into its own HoodieCleanConfig
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379557#comment-17379557 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128

## CI report:

* afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)

> BucketAssignFunction NullPointerException
> -----------------------------------------
>
>                 Key: HUDI-2153
>                 URL: https://issues.apache.org/jira/browse/HUDI-2153
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: moran
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
> java.lang.NullPointerException
>     at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:198)
>     at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:159)
>     at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
>     at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:396)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>     at java.lang.Thread.run(Thread.java:748)
>
> The error is at line 197 of the BucketAssignFunction class
> (this.context.setCurrentKey(recordKey)).
> Why is this context null?
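The trace above dies inside `processRecord` because a field (`this.context`) is still null when the first record arrives. A generic defensive-initialization sketch, explicitly NOT the actual Hudi fix for HUDI-2153 (the `KeyedContext` interface and method names here are stand-ins for the Flink runtime context): fail fast with a descriptive error rather than a bare NPE when a collaborator that should be set before processing is missing.

```java
public class GuardSketch {

  // Stand-in for the keyed runtime context the operator relies on.
  interface KeyedContext {
    void setCurrentKey(String key);
  }

  private KeyedContext context; // expected to be injected before the first record

  void open(KeyedContext ctx) {
    this.context = ctx;
  }

  void processRecord(String recordKey) {
    if (context == null) {
      // A bare NPE at setCurrentKey gives no hint which record triggered it.
      throw new IllegalStateException("Keyed context not initialized; record key: " + recordKey);
    }
    context.setCurrentKey(recordKey);
  }

  public static void main(String[] args) {
    GuardSketch fn = new GuardSketch();
    try {
      fn.processRecord("uuid-1"); // context never set: descriptive failure
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
    fn.open(key -> System.out.println("current key = " + key));
    fn.processRecord("uuid-1"); // now succeeds
  }
}
```

The descriptive message is the point: it names the offending record instead of surfacing as an anonymous `NullPointerException` fourteen frames deep.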
[GitHub] [hudi] hudi-bot edited a comment on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException
hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128

## CI report:

* afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
[jira] [Created] (HUDI-2167) HoodieCompactionConfig get HoodieCleaningPolicy NullPointerException
tsianglei created HUDI-2167:
-------------------------------

             Summary: HoodieCompactionConfig get HoodieCleaningPolicy NullPointerException
                 Key: HUDI-2167
                 URL: https://issues.apache.org/jira/browse/HUDI-2167
             Project: Apache Hudi
          Issue Type: Bug
          Components: CLI, Flink Integration
            Reporter: tsianglei

Caused by: java.lang.NullPointerException: Name is null
    at java.lang.Enum.valueOf(Enum.java:236) ~[?:1.8.0_221]
    at org.apache.hudi.common.model.HoodieCleaningPolicy.valueOf(HoodieCleaningPolicy.java:24) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
    at org.apache.hudi.config.HoodieCompactionConfig$Builder.build(HoodieCompactionConfig.java:368) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
    at org.apache.hudi.util.StreamerUtil.getHoodieClientConfig(StreamerUtil.java:155) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
    at org.apache.hudi.util.StreamerUtil.createWriteClient(StreamerUtil.java:277) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.start(StreamWriteOperatorCoordinator.java:154) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
    at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder.start(OperatorCoordinatorHolder.java:189) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
    at org.apache.flink.runtime.scheduler.SchedulerBase.startAllOperatorCoordinators(SchedulerBase.java:1253) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
    at org.apache.flink.runtime.scheduler.SchedulerBase.startScheduling(SchedulerBase.java:624) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
    at org.apache.flink.runtime.jobmaster.JobMaster.startScheduling(JobMaster.java:1032) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
    at java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:705) ~[?:1.8.0_221]
    ... 27 more
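The "Name is null" message in this trace is the standard failure of `Enum.valueOf` when passed a null name, i.e. the cleaning-policy property was never set before the builder called `valueOf`. A minimal sketch of the failure mode and one possible guard (the enum constants below mirror real `HoodieCleaningPolicy` values, but the `parsePolicy` helper and its default are illustrative assumptions, not Hudi's actual fix):

```java
public class CleaningPolicySketch {

  // Mirrors two real HoodieCleaningPolicy constants for illustration.
  enum HoodieCleaningPolicy { KEEP_LATEST_COMMITS, KEEP_LATEST_FILE_VERSIONS }

  static HoodieCleaningPolicy parsePolicy(String name) {
    if (name == null) {
      // Enum.valueOf(null) throws NullPointerException("Name is null");
      // fall back to an explicit default instead.
      return HoodieCleaningPolicy.KEEP_LATEST_COMMITS;
    }
    return HoodieCleaningPolicy.valueOf(name);
  }

  public static void main(String[] args) {
    System.out.println(parsePolicy(null));                        // prints KEEP_LATEST_COMMITS
    System.out.println(parsePolicy("KEEP_LATEST_FILE_VERSIONS")); // prints KEEP_LATEST_FILE_VERSIONS
  }
}
```

The general lesson: any `Enum.valueOf(props.get(...))` call needs either a guaranteed default in the config layer or a null check at the call site.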
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379546#comment-17379546 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)

> Build cluster plan and execute this plan at once for HoodieClusteringJob
> ------------------------------------------------------------------------
>
>                 Key: HUDI-2164
>                 URL: https://issues.apache.org/jira/browse/HUDI-2164
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Yue Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to either build a
> clustering plan or execute one, through the --schedule or --instant-time config.
> To trigger a clustering run, a user has to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log output.
> # Submit the HoodieClusteringJob again to execute the created clustering plan through the --instant-time config.
>
> The pain point is that there are too many steps to trigger a clustering run,
> and the instant time has to be copied and pasted from the log file manually,
> so the process can't be automated.
>
> I just raised a PR that offers a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a clustering plan at a given instant, which means --instant-time is required here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan immediately.|
>
> Now users can use --mode scheduleAndExecute to build a clustering plan and
> execute that plan at once using HoodieClusteringJob.
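The three-step flow described above collapses into a single submission with the new mode. A hypothetical invocation sketch: the bundle jar name, properties file, base path, and table name below are placeholders, and only the `--mode` / `--schedule` / `--instant-time` options come from this issue; consult the HoodieClusteringJob help output for the authoritative flag list.

```sh
# Hypothetical: schedule a clustering plan and execute it in one run,
# instead of submitting twice and copying the instant time from the logs.
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  hudi-utilities-bundle.jar \
  --props clusteringjob.properties \
  --base-path /tmp/hoodie/my_table \
  --table-name my_table \
  --mode scheduleAndExecute
```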
[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379543#comment-17379543 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128

## CI report:

* afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException
hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128

## CI report:

* afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379542#comment-17379542 ]

ASF GitHub Bot commented on HUDI-2153:
--------------------------------------

hudi-bot commented on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128

## CI report:

* afe140f7b9169e5a6129a10a6a12f839658c7b08 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException
hudi-bot commented on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128

## CI report:

* afe140f7b9169e5a6129a10a6a12f839658c7b08 UNKNOWN
[GitHub] [hudi] izhangzhihao opened a new issue #3262: [SUPPORT] No successful commits under path
izhangzhihao opened a new issue #3262:
URL: https://github.com/apache/hudi/issues/3262

**To Reproduce**

Steps to reproduce the behavior (code: https://github.com/izhangzhihao/Real-time-Data-Warehouse/tree/hudi):

### Create table

```sql
CREATE TABLE accident_claims
(
    claim_id            BIGINT,
    claim_total         DOUBLE,
    claim_total_receipt VARCHAR(50),
    claim_currency      VARCHAR(3),
    member_id           INT,
    accident_date       DATE,
    accident_type       VARCHAR(20),
    accident_detail     VARCHAR(20),
    claim_date          DATE,
    claim_status        VARCHAR(10),
    ts_created          TIMESTAMP(3),
    ts_updated          TIMESTAMP(3),
    ds                  DATE,
    PRIMARY KEY (claim_id) NOT ENFORCED
) PARTITIONED BY (ds) WITH (
    'connector' = 'hudi',
    'path' = '/data/dwd/accident_claims',
    'table.type' = 'MERGE_ON_READ',
    'read.streaming.enabled' = 'true',
    'write.batch.size' = '1',
    'write.task.max.size' = '1',
    'write.tasks' = '1',
    'compaction.tasks' = '1',
    'compaction.delta_seconds' = '60',
    'write.precombine.field' = 'ts_updated',
    'read.tasks' = '1',
    'read.streaming.check-interval' = '5',
    'read.streaming.start-commit' = '20210712134429'
);
```

### Insert from CDC change stream

```sql
INSERT INTO dwd.accident_claims
SELECT claim_id,
       claim_total,
       claim_total_receipt,
       claim_currency,
       member_id,
       CAST(accident_date as DATE),
       accident_type,
       accident_detail,
       CAST(claim_date as DATE),
       claim_status,
       CAST(ts_created as TIMESTAMP),
       CAST(ts_updated as TIMESTAMP),
       CAST(SUBSTRING(claim_date, 0, 9) as DATE)
FROM datasource.accident_claims;
```

**Expected behavior**

```sql
SELECT * FROM accident_claims;
```

should return results, but got:

```
Flink SQL> SELECT * FROM accident_claims;
[ERROR] Could not execute SQL statement.
Reason: org.apache.hudi.exception.HoodieException: No successful commits under path /data/dwd/accident_claims
```

But the sample code works:

```sql
CREATE TABLE t1
(
    uuid VARCHAR(20), -- you can use 'PRIMARY KEY NOT ENFORCED' syntax to mark the field as record key
    name VARCHAR(10),
    age  INT,
    ts   TIMESTAMP(3),
    `partition` VARCHAR(20)
) PARTITIONED BY (`partition`) WITH (
    'connector' = 'hudi',
    'path' = '/data/t1',
    'write.tasks' = '1',       -- default is 4, requires more resources
    'compaction.tasks' = '1',  -- default is 10, requires more resources
    'table.type' = 'COPY_ON_WRITE', -- MERGE_ON_READ creates a merge-on-read table; default is COPY_ON_WRITE
    'read.tasks' = '1',        -- default is 4, requires more resources
    'read.streaming.enabled' = 'true', -- this option enables streaming read
    'read.streaming.start-commit' = '20210712134429', -- specifies the start commit instant time
    'read.streaming.check-interval' = '4' -- specifies the check interval for finding new source commits, default 60s
);

-- insert data using values
INSERT INTO t1 VALUES
    ('id1','Danny',23,TIMESTAMP '1970-01-01 00:00:01','par1'),
    ('id2','Stephen',33,TIMESTAMP '1970-01-01 00:00:02','par1'),
    ('id3','Julian',53,TIMESTAMP '1970-01-01 00:00:03','par2'),
    ('id4','Fabian',31,TIMESTAMP '1970-01-01 00:00:04','par2'),
    ('id5','Sophia',18,TIMESTAMP '1970-01-01 00:00:05','par3'),
    ('id6','Emma',20,TIMESTAMP '1970-01-01 00:00:06','par3'),
    ('id7','Bob',44,TIMESTAMP '1970-01-01 00:00:07','par4'),
    ('id8','Han',56,TIMESTAMP '1970-01-01 00:00:08','par4');

SELECT * FROM t1;
```

So I didn't get what's wrong here...

**Environment Description**

* Hudi version : 0.9.0-SNAPSHOT
* Flink version : 1.12.2
* Hive version : none
* Hadoop version : 2.8.3
* Storage (HDFS/S3/GCS..) : local file system
* Running on Docker? (yes/no) : yes

**Additional context**

![image](https://user-images.githubusercontent.com/12044174/125382900-20040c80-e3c9-11eb-8ab6-be9a7c3072f5.png)

Taskmanager log: [taskmanager.log.zip](https://github.com/apache/hudi/files/6805564/taskmanager.log.zip)
[jira] [Updated] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-2153: - Labels: pull-request-available (was: )

> BucketAssignFunction NullPointerException
> -----
>
> Key: HUDI-2153
> URL: https://issues.apache.org/jira/browse/HUDI-2153
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: moran
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
> java.lang.NullPointerException
>   at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:198)
>   at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:159)
>   at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
>   at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
>   at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
>   at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
>   at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:396)
>   at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>   at java.lang.Thread.run(Thread.java:748)
>
> The error occurs at line 197 of the BucketAssignFunction class
> (this.context.setCurrentKey(recordKey)).
> Why is this context null?
-- This message was sent by Atlassian Jira (v8.3.4#803005)
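The stack trace above points at `this.context.setCurrentKey(recordKey)` failing because `context` is null. As a rough illustration of the failure mode only (this is not the actual Hudi fix; the `Context` interface, constructor, and field names below are stand-ins), a guard that fails fast with a descriptive message makes the uninitialized-context condition explicit instead of surfacing as an opaque NPE:

```java
// Illustrative sketch only, not the actual Hudi fix. Models the guarded
// access pattern around the call that threw the NullPointerException.
public class BucketAssignSketch {

    interface Context {
        void setCurrentKey(String recordKey);
    }

    // In the real operator this is supplied by the runtime before records
    // arrive; here it is a constructor argument for illustration.
    private final Context context;

    public BucketAssignSketch(Context context) {
        this.context = context;
    }

    public void processRecord(String recordKey) {
        // Fail fast with a descriptive error instead of an opaque NPE when
        // the keyed context has not been initialized yet.
        if (context == null) {
            throw new IllegalStateException(
                "Keyed context not initialized before processRecord(" + recordKey + ")");
        }
        context.setCurrentKey(recordKey);
    }
}
```

In terms of the report, the NPE corresponds to `processRecord` running before the runtime has supplied a context.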
[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException
[ https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379541#comment-17379541 ] ASF GitHub Bot commented on HUDI-2153: -- moranyuwen opened a new pull request #3261: URL: https://github.com/apache/hudi/pull/3261

Running HoodieFlinkStreamer encounters an exception in the BucketAssignFunction class where the context is null.

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request
*(For example: This pull request adds quick-start document.)*

## Brief change log
*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request
*(Please pick either of the following options)*
This pull request is a trivial rework / code cleanup without any test coverage.
*(or)*
This pull request is already covered by existing tests, such as *(please describe tests)*.
*(or)*
This change added tests and can be verified as follows: *(example:)*
- *Added integration tests for end-to-end.*
- *Added HoodieClientWriteTest to verify the change.*
- *Manually verified the change by running a job locally.*

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379522#comment-17379522 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)

Bot commands: @hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Build cluster plan and execute this plan at once for HoodieClusteringJob
> -----
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config.
> If users want to trigger a clustering job, they have to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log info.
> # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config.
> The pain point is that there are too many steps when triggering a clustering, and the instant time has to be copied and pasted from the log file manually, so it can't be automated.
>
> I just raised a PR to offer a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a cluster plan at a given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute that plan at once using HoodieClusteringJob.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
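The three modes in the table above can be sketched as a simple dispatch. This is an illustrative stand-in, not the actual HoodieClusteringJob implementation; the class, the method names, and the hard-coded instant value are all assumptions:

```java
// Sketch of the described --mode dispatch. Names are stand-ins, not the
// real HoodieClusteringJob API.
import java.util.Optional;

public class ClusteringModeSketch {

    public static String run(String mode, Optional<String> instantTime) {
        switch (mode) {
            case "schedule":
                // Build a clustering plan and report its instant time.
                return "scheduled:" + scheduleClustering();
            case "execute":
                // Execute an existing plan; --instant-time is mandatory here.
                return "executed:" + executeClustering(instantTime.orElseThrow(
                    () -> new IllegalArgumentException("--instant-time is required in execute mode")));
            case "scheduleAndExecute":
                // Build a plan and execute it immediately, removing the manual
                // copy-paste of the instant time between two job submissions.
                return "executed:" + executeClustering(scheduleClustering());
            default:
                throw new IllegalArgumentException("Unknown --mode: " + mode);
        }
    }

    private static String scheduleClustering() {
        // Stand-in for plan creation; a real job derives this from the timeline.
        return "20210713000000";
    }

    private static String executeClustering(String instant) {
        return instant;
    }
}
```

The scheduleAndExecute branch is the whole point of the proposal: the instant time flows directly from scheduling to execution inside one job.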
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379520#comment-17379520 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878723946 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] codope commented on a change in pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
codope commented on a change in pull request #3250: URL: https://github.com/apache/hudi/pull/3250#discussion_r668367569

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
## @@ -339,7 +339,7 @@
       .withDocumentation("");
   public static final ConfigProperty EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION = ConfigProperty
-      .key(AVRO_SCHEMA + ".externalTransformation")
+      .key(AVRO_SCHEMA.key() + ".externalTransformation")

Review comment: Changed the config key to `hoodie.avro.schema.external.transformation` and also have `hoodie.avro.schema.externalTransformation` as an alternative for backwards compatibility.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
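The backwards-compatibility scheme in the review comment (a renamed primary key plus the old key kept as an alternative) can be modeled with a small stand-in class. This is not the real Hudi `ConfigProperty` API, just a sketch of the resolution order it describes:

```java
// Minimal stand-in for the back-compat pattern: a primary config key plus
// alternative (deprecated) keys that still resolve.
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class ConfigKeySketch {

    private final String key;
    private final List<String> altKeys;

    public ConfigKeySketch(String key, String... altKeys) {
        this.key = key;
        this.altKeys = Arrays.asList(altKeys);
    }

    // Look up the primary key first, then fall back to the alternatives so
    // configs written against the old key name keep working.
    public String resolve(Properties props) {
        String value = props.getProperty(key);
        if (value != null) {
            return value;
        }
        for (String alt : altKeys) {
            value = props.getProperty(alt);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

For the keys named in the comment, the primary would be `hoodie.avro.schema.external.transformation` with `hoodie.avro.schema.externalTransformation` as the alternative.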
[jira] [Commented] (HUDI-2151) Make performant out-of-box configs
[ https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379496#comment-17379496 ] Vinoth Chandar commented on HUDI-2151: -- These need a default value.
{code:java}
public static final ConfigProperty ZK_PORT_PROP = ConfigProperty
    .key(ZK_PORT_PROP_KEY)
    .noDefaultValue()
    .sinceVersion("0.8.0")
    .withDocumentation("Zookeeper port to connect to.");

public static final ConfigProperty ZK_LOCK_KEY_PROP = ConfigProperty
    .key(ZK_LOCK_KEY_PROP_KEY)
    .noDefaultValue()
    .sinceVersion("0.8.0")
    .withDocumentation("Key name under base_path at which to create a ZNode and acquire lock. "
        + "Final path on zk will look like base_path/lock_key. We recommend setting this to the table name");
{code}

> Make performant out-of-box configs
> -----
>
> Key: HUDI-2151
> URL: https://issues.apache.org/jira/browse/HUDI-2151
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Code Cleanup, Docs
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>
> We have quite a few configs which deliver better performance or usability, but guarded by flags.
> This is to identify them, change them, test (functionally, perf) and make them default.
> Need to ensure we also capture all the backwards compatibility issues that can arise.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
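Why `noDefaultValue()` matters can be sketched with a minimal lookup helper: a key with no value and no default only fails at the moment something reads it. The helper and key name below are stand-ins, not the Hudi API:

```java
// Sketch of default-value resolution for a config lookup, mirroring the
// defaultValue(...) builder step the comment asks for. Not the Hudi API.
import java.util.Optional;
import java.util.Properties;

public class DefaultValueSketch {

    public static String get(Properties props, String key, Optional<String> defaultValue) {
        String value = props.getProperty(key);
        if (value != null) {
            return value;
        }
        // With no default declared, an unset key surfaces as an error the
        // first time something reads it.
        return defaultValue.orElseThrow(() ->
            new IllegalStateException("No value set and no default for key: " + key));
    }
}
```

Declaring a sensible default (e.g. a standard port for the ZK property) turns that read-time error into a working out-of-box value.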
[jira] [Issue Comment Deleted] (HUDI-2151) Make performant out-of-box configs
[ https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2151: - Comment: was deleted (was: Is this correct? 5s?
{code:java}
public static final ConfigProperty LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP = ConfigProperty
    .key(LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY)
    .defaultValue(String.valueOf(5000L))
    .sinceVersion("0.8.0")
    .withDocumentation("Maximum amount of time to wait between retries by lock provider client. This bounds"
        + " the maximum delay from the exponential backoff.");
{code})
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2144) Offline clustering(independent sparkJob) will cause insert action losing data
[ https://issues.apache.org/jira/browse/HUDI-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379491#comment-17379491 ] ASF GitHub Bot commented on HUDI-2144: -- satishkotha merged pull request #3240: URL: https://github.com/apache/hudi/pull/3240 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Offline clustering(independent sparkJob) will cause insert action losing data
> -----
>
> Key: HUDI-2144
> URL: https://issues.apache.org/jira/browse/HUDI-2144
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2021-07-08-13-52-00-089.png
>
> For now we have two kinds of pipelines for Hudi using Spark:
> # A streaming job that inserts data into a specific partition.
> # An offline clustering Spark job (`org.apache.hudi.utilities.HoodieClusteringJob`) to optimize the file sizes that pipeline 1 created.
> But here is a bug we met that will lose data.
> These steps reproduce the problem stably:
> # Submit a Spark job to ingest data1 using insert mode.
> # Schedule a clustering plan using `org.apache.hudi.utilities.HoodieClusteringJob`.
> # Submit a Spark job again to ingest data2 using insert mode (ensure a new file slice is created in the same file group, which means small-file tuning for insert is working). Suppose this file group is called file group 1 and the new file slice is called file slice 2.
> # Execute the clustering job planned in step 2.
> # Query data1+data2 and you will find the new data is lost compared with common ingestion without clustering.
>
> !image-2021-07-08-13-52-00-089.png|width=922,height=728!
> Here is the root cause:
> When ingesting data using insert mode, Hudi will find small files and try to append new data to them, aiming to tune the data file size.
> [https://github.com/apache/hudi/blob/650c4455c600b0346fed8b5b6aa4cc0bf3452e8c/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java#L149] tries to filter small files in clustering, but it only works when users set `hoodie.clustering.inline` to true, which is not good enough when users use offline clustering.
> I just raised a PR to fix it, and tested it.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
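The fix direction discussed in this PR thread, filtering small files by the pending clustering plan itself rather than by an inline-clustering flag, can be sketched as follows. `SmallFile` here is a stand-in type and the method is illustrative, not Hudi's actual UpsertPartitioner code:

```java
// Illustrative sketch: exclude small files that belong to file groups with a
// pending clustering plan, keyed on the plan set rather than a config flag.
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SmallFileFilterSketch {

    public static class SmallFile {
        public final String fileGroupId;
        public SmallFile(String fileGroupId) { this.fileGroupId = fileGroupId; }
    }

    public static List<SmallFile> filterSmallFilesInClustering(
            Set<String> pendingClusteringFileGroupsId, List<SmallFile> smallFiles) {
        // Checking the pending-plan set directly also covers offline clustering
        // jobs, which do not set hoodie.clustering.inline on the ingest writer.
        if (!pendingClusteringFileGroupsId.isEmpty()) {
            return smallFiles.stream()
                .filter(f -> !pendingClusteringFileGroupsId.contains(f.fileGroupId))
                .collect(Collectors.toList());
        }
        return smallFiles;
    }
}
```

With the bug, an insert could append a new file slice to a file group that clustering was about to replace, and executing the plan then dropped that slice; excluding pending file groups keeps the two writers from touching the same file group.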
[hudi] branch master updated (ca440cc -> c8a2033)
This is an automated email from the ASF dual-hosted git repository. satish pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from ca440cc [HUDI-2107] Support Read Log Only MOR Table For Spark (#3193) add c8a2033 [HUDI-2144]Bug-Fix:Offline clustering(HoodieClusteringJob) will cause insert action losing data (#3240) No new revisions were added by this update. Summary of changes: .../table/action/commit/UpsertPartitioner.java | 2 +- .../table/action/commit/TestUpsertPartitioner.java | 45 +- .../hudi/common/testutils/ClusteringTestUtils.java | 54 ++ 3 files changed, 99 insertions(+), 2 deletions(-) create mode 100644 hudi-common/src/test/java/org/apache/hudi/common/testutils/ClusteringTestUtils.java
[jira] [Commented] (HUDI-2144) Offline clustering(independent sparkJob) will cause insert action losing data
[ https://issues.apache.org/jira/browse/HUDI-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379490#comment-17379490 ] ASF GitHub Bot commented on HUDI-2144: -- lw309637554 commented on a change in pull request #3240: URL: https://github.com/apache/hudi/pull/3240#discussion_r668356708

## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
## @@ -146,7 +146,7 @@ private int addUpdateBucket(String partitionPath, String fileIdHint) {
    * @return smallFiles not in clustering
    */
   private List filterSmallFilesInClustering(final Set pendingClusteringFileGroupsId, final List smallFiles) {
-    if (this.config.isClusteringEnabled()) {

Review comment: @satishkotha @zhangyue19921010 Using `if (!pendingClusteringFileGroupsId.isEmpty())` will improve ease of use. Another place needs the same modification; but will this bring a performance loss? @satishkotha
"private JavaRDD> clusteringHandleUpdate(JavaRDD> inputRecordsRDD) { if (config.isClusteringEnabled()) {"

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2151) Make performant out-of-box configs
[ https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379488#comment-17379488 ] Vinoth Chandar commented on HUDI-2151: -- Is this correct? 5s?
{code:java}
public static final ConfigProperty LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP = ConfigProperty
    .key(LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY)
    .defaultValue(String.valueOf(5000L))
    .sinceVersion("0.8.0")
    .withDocumentation("Maximum amount of time to wait between retries by lock provider client. This bounds"
        + " the maximum delay from the exponential backoff.");
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
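The quoted documentation describes a cap on exponential backoff: the wait between lock-acquire retries grows per attempt but never exceeds the configured maximum (5000 ms in the quoted default). A minimal sketch of that arithmetic, with assumed parameter names and an assumed initial wait; this is not Hudi's lock-retry code:

```java
// Sketch of bounded exponential backoff: initial * 2^retry, clamped to the
// configured maximum wait. Pure arithmetic, no Hudi types involved.
public class BackoffSketch {

    public static long waitMillis(long initialMs, int retry, long maxWaitMs) {
        // Cap the shift so the multiplication cannot overflow a long.
        long wait = initialMs << Math.min(retry, 30);
        return Math.min(wait, maxWaitMs);
    }
}
```

With an assumed 1000 ms initial wait and the quoted 5000 ms cap, the delays would run 1000, 2000, 4000, then stay pinned at 5000 ms, which is the "bounds the maximum delay" behavior the documentation states.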
[jira] [Commented] (HUDI-2144) Offline clustering(independent sparkJob) will cause insert action losing data
[ https://issues.apache.org/jira/browse/HUDI-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379487#comment-17379487 ] ASF GitHub Bot commented on HUDI-2144: -- lw309637554 commented on a change in pull request #3240: URL: https://github.com/apache/hudi/pull/3240#discussion_r668354266

## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
## @@ -146,7 +146,7 @@ private int addUpdateBucket(String partitionPath, String fileIdHint) {
    * @return smallFiles not in clustering
    */
   private List filterSmallFilesInClustering(final Set pendingClusteringFileGroupsId, final List smallFiles) {
-    if (this.config.isClusteringEnabled()) {

Review comment: @satishkotha @zhangyue19921010 At first we had two configs for clustering. Setting ASYNC_CLUSTERING_ENABLE_OPT_KEY will be ok.

public boolean isAsyncClusteringEnabled() {
  return Boolean.parseBoolean(props.getProperty(HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE_OPT_KEY));
}

public boolean isClusteringEnabled() {
  // TODO: future support async clustering
  return inlineClusteringEnabled() || isAsyncClusteringEnabled();
}

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379436#comment-17379436 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add support to disable meta column to BulkInsert Row Writer path > > > Key: HUDI-2161 > URL: https://issues.apache.org/jira/browse/HUDI-2161 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Priority: Major > Labels: pull-request-available > > Objective here is to disable all meta columns so as to avoid storage cost. > Also, some benefits could be seen in write latency with row writer path as no > special handling is required at RowCreateHandle layer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format
[ https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379432#comment-17379432 ] ASF GitHub Bot commented on HUDI-1828: -- codecov-commenter edited a comment on pull request #3237: URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Ensure All Tests Pass with ORC format > - > > Key: HUDI-1828 > URL: https://issues.apache.org/jira/browse/HUDI-1828 > Project: Apache Hudi > Issue Type: Sub-task > Components: Storage Management >Reporter: Teresa Kang >Priority: Major > Labels: pull-request-available > > Run all tests with HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT=ORC, ensure all > tests pass. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format
[ https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379429#comment-17379429 ] ASF GitHub Bot commented on HUDI-1828: -- codecov-commenter edited a comment on pull request #3237: URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#3237](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (2619292) into [master](https://codecov.io/gh/apache/hudi/commit/2b21ae1775aeb108a4b0e3f89889651a19f93b2f?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (2b21ae1) will **decrease** coverage by `20.04%`. > The diff coverage is `40.00%`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3237/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3237       +/-   ##
=============================================
- Coverage     47.57%   27.53%   -20.05%
+ Complexity     5481     1292     -4189
=============================================
  Files           924      385      -539
  Lines         41194    15218    -25976
  Branches       4133     1318     -2815
=============================================
- Hits          19599     4190    -15409
+ Misses        19853    10724     -9129
+ Partials       1742      304     -1438
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.93% <40.00%> (-13.65%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <ø> (+1.25%)` | :arrow_up: |

Flags with carried
forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...va/org/apache/hudi/io/storage/HoodieOrcWriter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL3N0b3JhZ2UvSG9vZGllT3JjV3JpdGVyLmphdmE=) | `0.00% <ø> (-71.88%)` | :arrow_down: | | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `94.64% <40.00%> (-5.36%)` | :arrow_down: | | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | |
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
[ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379423#comment-17379423 ] Nishith Agarwal commented on HUDI-2159: --- Thanks for the detailed analysis [~pwason]. I think it is definitely worth solving (1) for the 0.9.0 release. This is a legitimate situation that can surface, especially when users schedule ingestion at a lower frequency, since there are more chances of such collisions. For (2), since it is more of a perf degradation in case of failures, we can address this right after 0.9 by landing the timeline tailing based on completion time. > Supporting Clustering and Metadata Table together > - > > Key: HUDI-2159 > URL: https://issues.apache.org/jira/browse/HUDI-2159 > Project: Apache Hudi > Issue Type: Sub-task >Reporter: Prashant Wason >Assignee: Prashant Wason >Priority: Blocker > Fix For: 0.9.0 > > > I am testing clustering support for a metadata-enabled table and found a few > issues. > *Setup* > Pipeline 1: Ingestion pipeline with Metadata Table enabled. Runs every 30 > mins. > Pipeline 2: Clustering pipeline with long-running jobs (3-4 hours) > Pipeline 3: Another clustering pipeline with long-running jobs (3-4 hours) > > *Issue #1: Parallel commits on Metadata Table* > Assume the Clustering pipeline is completing T5.replacecommit and the ingestion > pipeline is completing T10.commit, and the Metadata Table will be synced at an instant. > Now both pipelines will call syncMetadataTable(), which will do the following: > # Find all un-synced instants from the dataset (T5, T6 ... T10) > # Read each instant and perform a deltacommit on the Metadata Table with the > same timestamp as the instant. > There is a chance that two processes perform a deltacommit at T5 on the > metadata table, and one will fail (instant file already exists). This raises an > exception that will be detected as a pipeline failure, leading to > false-positive alerts. 
> > *Issue #2: No archiving/rollback support for failed clustering operations* > If a clustering operation fails, it leaves a left-over > T5.replacecommit.inflight. There is no automated way to roll back or archive > these. Since clustering is a long-running operation in general and may be run > through multiple pipelines at the same time, automated rollback of left-over > inflights doesn't work, as we cannot be sure that the process is dead. > Metadata Table sync only works in completion order. So if > T5.replacecommit.inflight is left over, the Metadata Table will not sync beyond > T5, causing a large number of LogBlocks to pile up, which will have to be > merged in memory, leading to deteriorating performance. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
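The race in Issue #1 above — two pipelines both applying the same un-synced instant to the Metadata Table — can be modeled with a small sketch. This is a toy illustration of one possible fix direction (treating "instant already committed" as a benign no-op rather than a pipeline failure); the class and method names are hypothetical, not the actual Hudi API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the Issue #1 race: two pipelines both try to sync instant
// T5 to the metadata table. Instead of raising an exception when the
// instant file already exists, the losing caller observes a no-op.
// Names here are illustrative, not the real Hudi API.
class MetadataSyncSketch {
  // Stands in for the set of instant files already committed on the
  // metadata table timeline.
  private final Set<String> committedInstants = ConcurrentHashMap.newKeySet();

  /**
   * Returns true if this caller performed the commit, false if another
   * pipeline already synced the same instant (benign, not an error).
   */
  boolean syncInstant(String instantTime) {
    // Set.add() on a concurrent set is atomic: of two concurrent callers
    // for the same instant, exactly one returns true.
    return committedInstants.add(instantTime);
  }
}
```

Under this model, the second pipeline reaching T5 simply skips it instead of surfacing a false-positive failure alert.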
[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format
[ https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379422#comment-17379422 ] ASF GitHub Bot commented on HUDI-1828: -- codecov-commenter edited a comment on pull request #3237: URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#3237](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (2619292) into [master](https://codecov.io/gh/apache/hudi/commit/2b21ae1775aeb108a4b0e3f89889651a19f93b2f?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (2b21ae1) will **decrease** coverage by `31.65%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3237/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3237       +/-   ##
=============================================
- Coverage     47.57%   15.91%   -31.66%
+ Complexity     5481      493     -4988
=============================================
  Files           924      283      -641
  Lines         41194    11710    -29484
  Branches       4133      961     -3172
=============================================
- Hits          19599     1864    -17735
+ Misses        19853     9683    -10170
+ Partials       1742      163     -1579
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <ø> (+1.25%)` | :arrow_up: |

Flags with carried forward
coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...va/org/apache/hudi/io/storage/HoodieOrcWriter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL3N0b3JhZ2UvSG9vZGllT3JjV3JpdGVyLmphdmE=) | `0.00% <ø> (-71.88%)` | :arrow_down: | | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | 
`0.00% <0.00%> (-100.00%)` | :arrow_down: | | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | |
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379421#comment-17379421 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * f0dd67bb360fe3fd275264127d50a9feb881479a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=851) * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add support to disable meta column to BulkInsert Row Writer path > > > Key: HUDI-2161 > URL: https://issues.apache.org/jira/browse/HUDI-2161 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Priority: Major > Labels: pull-request-available > > Objective here is to disable all meta columns so as to avoid storage cost. > Also, some benefits could be seen in write latency with row writer path as no > special handling is required at RowCreateHandle layer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path
[ https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379419#comment-17379419 ] ASF GitHub Bot commented on HUDI-2161: -- hudi-bot edited a comment on pull request #3247: URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931 ## CI report: * f0dd67bb360fe3fd275264127d50a9feb881479a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=851) * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add support to disable meta column to BulkInsert Row Writer path > > > Key: HUDI-2161 > URL: https://issues.apache.org/jira/browse/HUDI-2161 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Priority: Major > Labels: pull-request-available > > Objective here is to disable all meta columns so as to avoid storage cost. > Also, some benefits could be seen in write latency with row writer path as no > special handling is required at RowCreateHandle layer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format
[ https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379410#comment-17379410 ] ASF GitHub Bot commented on HUDI-1828: -- hudi-bot edited a comment on pull request #3237: URL: https://github.com/apache/hudi/pull/3237#issuecomment-876059246 ## CI report: * 2619292015a49cb34ce484f5ecd2843e97000e52 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=867) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Ensure All Tests Pass with ORC format > - > > Key: HUDI-1828 > URL: https://issues.apache.org/jira/browse/HUDI-1828 > Project: Apache Hudi > Issue Type: Sub-task > Components: Storage Management >Reporter: Teresa Kang >Priority: Major > Labels: pull-request-available > > Run all tests with HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT=ORC, ensure all > tests pass. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
codecov-commenter edited a comment on pull request #3250: URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (01525c9) into [master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (ca440cc) will **increase** coverage by `0.00%`. > The diff coverage is `100.00%`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@            Coverage Diff            @@
##             master    #3250   +/-   ##
=========================================
  Coverage     47.71%   47.72%
- Complexity     5527     5528    +1
=========================================
  Files           934      934
  Lines         41456    41457    +1
  Branches       4167     4167
=========================================
+ Hits          19782    19785    +3
+ Misses        19916    19914    -2
  Partials       1758     1758
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.46% <100.00%> (+0.01%)` | :arrow_up: |
| hudicommon | `48.57% <ø> (-0.01%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.55% <ø> (ø)` | |
| hudisparkdatasource | `67.37% <ø> (+0.05%)` | :arrow_up: |
| hudisync | `54.51% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.26% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. 
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `43.07% <100.00%> (+0.26%)` | :arrow_up: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.90% <0.00%> (-0.79%)` | :arrow_down: |
| [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `29.60% <0.00%> (+1.60%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=continue_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=footer_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
> Last update [ca440cc...01525c9](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=lastupdated_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
> Read the [comment
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (01525c9) into [master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (ca440cc) will **decrease** coverage by `3.38%`.
> The diff coverage is `100.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3250      +/-   ##
============================================
- Coverage     47.71%   44.33%    -3.39%
+ Complexity     5527     4930     -597
============================================
  Files           934      860      -74
  Lines         41456    37415    -4041
  Branches       4167     3496     -671
============================================
- Hits          19782    16589    -3193
+ Misses        19916    19555     -361
+ Partials       1758     1271     -487
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.46% <100.00%> (+0.01%)` | :arrow_up: |
| hudicommon | `48.57% <ø> (-0.01%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.55% <ø> (ø)` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.26% <ø> (ø)` | |

Flags with carried forward coverage won't be shown.
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `43.07% <100.00%> (+0.26%)` | :arrow_up: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...main/java/org/apache/hudi/hive/HiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN5bmNDb25maWcuamF2YQ==) | `0.00% <0.00%> (-98.08%)` | :arrow_down: |
| [...he/hudi/hive/replication/GlobalHiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvcmVwbGljYXRpb24vR2xvYmFsSGl2ZVN5bmNDb25maWcuamF2YQ==) | `0.00% <0.00%> (-95.00%)` | :arrow_down: |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (01525c9) into [master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (ca440cc) will **decrease** coverage by `3.62%`.
> The diff coverage is `100.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3250      +/-   ##
============================================
- Coverage     47.71%   44.09%    -3.63%
+ Complexity     5527     4868     -659
============================================
  Files           934      854      -80
  Lines         41456    36964    -4492
  Branches       4167     3472     -695
============================================
- Hits          19782    16300    -3482
+ Misses        19916    19412     -504
+ Partials       1758     1252     -506
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.46% <100.00%> (+0.01%)` | :arrow_up: |
| hudicommon | `48.57% <ø> (-0.01%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.55% <ø> (ø)` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <ø> (ø)` | |

Flags with carried forward coverage won't be shown.
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `43.07% <100.00%> (+0.26%)` | :arrow_up: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...main/java/org/apache/hudi/hive/HiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN5bmNDb25maWcuamF2YQ==) | `0.00% <0.00%> (-98.08%)` | :arrow_down: |
| [...he/hudi/hive/replication/GlobalHiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvcmVwbGljYXRpb24vR2xvYmFsSGl2ZVN5bmNDb25maWcuamF2YQ==) | `0.00% <0.00%> (-95.00%)` | :arrow_down: |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (01525c9) into [master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (ca440cc) will **decrease** coverage by `17.68%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3250       +/-   ##
=============================================
- Coverage     47.71%   30.03%   -17.69%
+ Complexity     5527     1562     -3965
=============================================
  Files           934      421      -513
  Lines         41456    16986    -24470
  Branches       4167     1561     -2606
=============================================
- Hits          19782     5102    -14680
+ Misses        19916    11481     -8435
+ Partials       1758      403     -1355
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.93% <0.00%> (-13.52%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `51.55% <ø> (ø)` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <ø> (ø)` | |

Flags with carried forward coverage won't be shown.
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.81%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config
codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (01525c9) into [master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (ca440cc) will **decrease** coverage by `20.18%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3250       +/-   ##
=============================================
- Coverage     47.71%   27.53%   -20.19%
+ Complexity     5527     1291     -4236
=============================================
  Files           934      385      -549
  Lines         41456    15215    -26241
  Branches       4167     1316     -2851
=============================================
- Hits          19782     4189    -15593
+ Misses        19916    10723     -9193
+ Partials       1758      303     -1455
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.93% <0.00%> (-13.52%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <ø> (ø)` | |

Flags with carried forward coverage won't be shown.
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.81%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-1241) Generate config docs automatically
[ https://issues.apache.org/jira/browse/HUDI-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379394#comment-17379394 ]

ASF GitHub Bot commented on HUDI-1241:
--------------------------------------

hudi-bot edited a comment on pull request #3260:
URL: https://github.com/apache/hudi/pull/3260#issuecomment-878357134

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Generate config docs automatically
> ----------------------------------
>
>                 Key: HUDI-1241
>                 URL: https://issues.apache.org/jira/browse/HUDI-1241
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Code Cleanup
>            Reporter: sivabalan narayanan
>            Assignee: Sagar Sumit
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
> Now that we have `HoodieConfig` and `ConfigProperty`, can we write a small script that can build a certain branch or git-sha, use reflection to load up all the HoodieConfig classes and generate a .md file automatically for each ConfigProperty defined.
>
> We will then render the .md file thru the site, as always

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format
[ https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379395#comment-17379395 ]

ASF GitHub Bot commented on HUDI-1828:
--------------------------------------

hudi-bot edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876059246

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Ensure All Tests Pass with ORC format
> -------------------------------------
>
>                 Key: HUDI-1828
>                 URL: https://issues.apache.org/jira/browse/HUDI-1828
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Storage Management
>            Reporter: Teresa Kang
>            Priority: Major
>              Labels: pull-request-available
>
> Run all tests with HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT=ORC, ensure all tests pass.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-2063) Add Doc For Spark Sql Integrates With Hudi
[ https://issues.apache.org/jira/browse/HUDI-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379393#comment-17379393 ]

ASF GitHub Bot commented on HUDI-2063:
--------------------------------------

leesf commented on pull request #3140:
URL: https://github.com/apache/hudi/pull/3140#issuecomment-878335579

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add Doc For Spark Sql Integrates With Hudi
> ------------------------------------------
>
>                 Key: HUDI-2063
>                 URL: https://issues.apache.org/jira/browse/HUDI-2063
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Docs
>            Reporter: pengzhiwei
>            Assignee: pengzhiwei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)