[GitHub] [hudi] codecov-commenter commented on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException

2021-07-12 Thread GitBox


codecov-commenter commented on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878802290


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3261?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
   > Merging [#3261](https://codecov.io/gh/apache/hudi/pull/3261?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (afe140f) into [master](https://codecov.io/gh/apache/hudi/commit/c8a2033c275e21a752893fc89311e1f6846f5a78?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (c8a2033) will **increase** coverage by `3.42%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3261/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3261?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3261      +/-   ##
   
   + Coverage     47.71%   51.13%   +3.42%
   + Complexity     5526      417    -5109
   
     Files           934       67     -867
     Lines         41456     3049   -38407
     Branches       4167      330    -3837
   
   - Hits          19779     1559   -18220
   + Misses        19917     1350   -18567
   + Partials       1760      140    -1620
   ```
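
The headline percentages appear to be the ratio of hits to lines in each column. The short sketch below checks that arithmetic; the class and method names are made up for illustration, only the numbers come from the report, and the two-decimal rounding is an assumption about how the report formats values.

```java
import java.util.Locale;

public class CoverageCheck {
    // Coverage as a percentage: covered lines (hits) over total lines.
    static double coveragePercent(long hits, long lines) {
        return 100.0 * hits / lines;
    }

    // Format with two decimals (assumed to match the report's display).
    static String fmt(double pct) {
        return String.format(Locale.ROOT, "%.2f", pct);
    }

    public static void main(String[] args) {
        double master = coveragePercent(19779, 41456); // master column
        double pr = coveragePercent(1559, 3049);       // #3261 column
        System.out.println("master: " + fmt(master) + "%");
        System.out.println("#3261:  " + fmt(pr) + "%");
        System.out.println("delta:  " + fmt(pr - master) + "%");
    }
}
```

Note that the line count drops from 41456 to 3049, so the two percentages are computed over very different code bases. The apparent gain likely reflects which flags reported rather than a real change in test coverage.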
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `51.13% <ø> (-8.11%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3261?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3261/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3261/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==)
 | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | 
[...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3261/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh)
 | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | 
[...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/3261/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-71.82%)` | :arrow_down: |
   | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3261/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh)
 | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | 
[...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3261/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)
 | 

[jira] [Commented] (HUDI-2150) Rename/Restructure configs for better modularity

2021-07-12 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379612#comment-17379612
 ] 

Vinoth Chandar commented on HUDI-2150:
--

This should be renamed to be consistent with the base file terminology:

 
{code:java}
public static final ConfigProperty PARQUET_SMALL_FILE_LIMIT_BYTES = ConfigProperty
    .key("hoodie.parquet.small.file.limit")
    .defaultValue(String.valueOf(104857600))
    .withDocumentation("Upsert uses this file size to compact new data onto existing files. "
        + "By default, treat any file <= 100MB as a small file.");
{code}
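
One way a rename like this can stay backward compatible is to keep the old key as an alternative that is still honored on lookup. The following is a minimal sketch of that idea, not Hudi's actual ConfigProperty class: `SimpleConfigProperty`, `resolve`, and the new key name are all hypothetical, and this thread does not say whether the real change keeps the old key.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class RenamedConfigSketch {
    // Hypothetical stand-in for a config definition: a primary key, a default,
    // and older keys that should still be honored after a rename.
    static final class SimpleConfigProperty {
        final String key;
        final String defaultValue;
        final List<String> alternatives;

        SimpleConfigProperty(String key, String defaultValue, String... alternatives) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.alternatives = Arrays.asList(alternatives);
        }

        // Look up the primary key first, then each alternative, then the default.
        String resolve(Map<String, String> props) {
            if (props.containsKey(key)) {
                return props.get(key);
            }
            for (String alt : alternatives) {
                if (props.containsKey(alt)) {
                    return props.get(alt);
                }
            }
            return defaultValue;
        }
    }

    // The new key name is invented for illustration; the old key and the
    // default value are taken from the snippet quoted in the comment.
    static final SimpleConfigProperty SMALL_FILE_LIMIT = new SimpleConfigProperty(
        "hoodie.base.file.small.limit.example",
        String.valueOf(104857600),
        "hoodie.parquet.small.file.limit");

    public static void main(String[] args) {
        Map<String, String> legacy =
            Collections.singletonMap("hoodie.parquet.small.file.limit", "52428800");
        // A value set under the old key is still picked up.
        System.out.println(SMALL_FILE_LIMIT.resolve(legacy));
        // With nothing set, the default applies.
        System.out.println(SMALL_FILE_LIMIT.resolve(Collections.<String, String>emptyMap()));
    }
}
```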

> Rename/Restructure configs for better modularity
> 
>
> Key: HUDI-2150
> URL: https://issues.apache.org/jira/browse/HUDI-2150
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Code Cleanup
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> Given we have a framework now, that can capture configs and even their 
> alternatives well, time to clean things up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HUDI-2150) Rename/Restructure configs for better modularity

2021-07-12 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379561#comment-17379561
 ] 

Vinoth Chandar edited comment on HUDI-2150 at 7/13/21, 5:57 AM:


Cleaner related configs to be moved out of HoodieCompactionConfig into its own 
HoodieCleanConfig. 

 

Archival related configs to be moved out of HoodieCompactionConfig into its own 
HoodieArchivalConfig.


was (Author: vc):
Cleaner related configs to be moved out of HoodieCompactionConfig into its own 
HoodieCleanConfig

> Rename/Restructure configs for better modularity
> 
>
> Key: HUDI-2150
> URL: https://issues.apache.org/jira/browse/HUDI-2150
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Code Cleanup
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> Given we have a framework now, that can capture configs and even their 
> alternatives well, time to clean things up.





[jira] [Commented] (HUDI-2168) AccessControlException for anonymous user

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379611#comment-17379611
 ] 

ASF GitHub Bot commented on HUDI-2168:
--

hudi-bot edited a comment on pull request #3264:
URL: https://github.com/apache/hudi/pull/3264#issuecomment-878799938


   
   ## CI report:
   
   * e8e5e310224eee469a19bcfe7af537154843c318 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=877)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> AccessControlException for anonymous user
> -
>
> Key: HUDI-2168
> URL: https://issues.apache.org/jira/browse/HUDI-2168
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Testing
>Reporter: Vinay
>Assignee: Vinay
>Priority: Trivial
>  Labels: pull-request-available
>
> Users are facing the following exception while executing test cases that 
> depend on starting the Hive service:
>  
> {code:java}
> Got exception: org.apache.hadoop.security.AccessControlException Permission 
> denied: user=anonymous, access=WRITE
> {code}
> This is specifically happening at the time of clearing Hive DB
> {code:java}
> client.updateHiveSQL("drop database if exists " + 
> hiveSyncConfig.databaseName);
> {code}





[GitHub] [hudi] hudi-bot edited a comment on pull request #3264: [HUDI-2168] Fix for AccessControlException for anonymous user

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3264:
URL: https://github.com/apache/hudi/pull/3264#issuecomment-878799938


   
   ## CI report:
   
   * e8e5e310224eee469a19bcfe7af537154843c318 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=877)
 
   
   




[jira] [Commented] (HUDI-1548) Fix documentation around schema evolution

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379610#comment-17379610
 ] 

ASF GitHub Bot commented on HUDI-1548:
--

codope commented on pull request #3257:
URL: https://github.com/apache/hudi/pull/3257#issuecomment-878800281


   @vinothchandar @n3nash @nsivabalan Can you please review the doc? 




> Fix documentation around schema evolution 
> --
>
> Key: HUDI-1548
> URL: https://issues.apache.org/jira/browse/HUDI-1548
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Docs
>Reporter: sivabalan narayanan
>Assignee: Nishith Agarwal
>Priority: Blocker
>  Labels: ', pull-request-available, sev:high, user-support-issues
> Fix For: 0.9.0
>
>
> Clearly call out what kind of schema evolution is supported by Hudi in the 
> documentation.
> Context: https://github.com/apache/hudi/issues/2331





[GitHub] [hudi] codope commented on pull request #3257: [HUDI-1548] Add documentation for schema evolution

2021-07-12 Thread GitBox


codope commented on pull request #3257:
URL: https://github.com/apache/hudi/pull/3257#issuecomment-878800281


   @vinothchandar @n3nash @nsivabalan Can you please review the doc? 






[jira] [Commented] (HUDI-2168) AccessControlException for anonymous user

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379609#comment-17379609
 ] 

ASF GitHub Bot commented on HUDI-2168:
--

hudi-bot commented on pull request #3264:
URL: https://github.com/apache/hudi/pull/3264#issuecomment-878799938


   
   ## CI report:
   
   * e8e5e310224eee469a19bcfe7af537154843c318 UNKNOWN
   
   


> AccessControlException for anonymous user
> -
>
> Key: HUDI-2168
> URL: https://issues.apache.org/jira/browse/HUDI-2168
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Testing
>Reporter: Vinay
>Assignee: Vinay
>Priority: Trivial
>  Labels: pull-request-available
>
> Users are facing the following exception while executing test cases that 
> depend on starting the Hive service:
>  
> {code:java}
> Got exception: org.apache.hadoop.security.AccessControlException Permission 
> denied: user=anonymous, access=WRITE
> {code}
> This is specifically happening at the time of clearing Hive DB
> {code:java}
> client.updateHiveSQL("drop database if exists " + 
> hiveSyncConfig.databaseName);
> {code}





[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379608#comment-17379608
 ] 

ASF GitHub Bot commented on HUDI-2164:
--

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
 
   
   


> Build cluster plan and execute this plan at once for HoodieClusteringJob
> 
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, Hudi lets users submit a HoodieClusteringJob either to build a 
> clustering plan or to execute one, through the --schedule or 
> --instant-time config.
> To trigger a clustering job, a user has to:
>  # Submit a HoodieClusteringJob with the --schedule config to build a 
> clustering plan.
>  # Copy the instant time of the created clustering plan from the log output.
>  # Submit the HoodieClusteringJob again with the --instant-time config to 
> execute the created clustering plan.
> The pain point is that triggering clustering takes too many steps, and the 
> instant time has to be copied from the log file by hand, so the process 
> cannot be automated.
>  
> I just raised a PR that adds a new config named --mode (-m for short):
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant, which means 
> --instant-time is required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan 
> immediately.|
> Now users can use --mode scheduleAndExecute to build a clustering plan and 
> execute it at once using HoodieClusteringJob.
>  
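
The mode table above can be read as a small dispatch: parse the mode, validate its required arguments, then run one or two steps. The sketch below illustrates that reading only and is not the code from the PR. The class, enum, and method names are hypothetical, the step strings are placeholders for the real scheduling and execution calls, and case-insensitive matching of the mode value is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ClusteringModeSketch {
    // The three modes named in the issue description.
    enum Mode { EXECUTE, SCHEDULE, SCHEDULE_AND_EXECUTE }

    // Parse the --mode value; a missing value falls back to "execute",
    // which the description names as the default.
    static Mode parseMode(String raw) {
        if (raw == null || raw.isEmpty()) {
            return Mode.EXECUTE;
        }
        switch (raw.toLowerCase(Locale.ROOT)) {
            case "execute":
                return Mode.EXECUTE;
            case "schedule":
                return Mode.SCHEDULE;
            case "scheduleandexecute":
                return Mode.SCHEDULE_AND_EXECUTE;
            default:
                throw new IllegalArgumentException("Unknown mode: " + raw);
        }
    }

    // Return the steps a run would perform, as placeholder strings.
    static List<String> plan(Mode mode, String instantTime) {
        List<String> steps = new ArrayList<>();
        switch (mode) {
            case EXECUTE:
                // Executing an existing plan needs the instant it was scheduled at.
                if (instantTime == null) {
                    throw new IllegalArgumentException("--instant-time is required in execute mode");
                }
                steps.add("execute@" + instantTime);
                break;
            case SCHEDULE:
                steps.add("schedule");
                break;
            case SCHEDULE_AND_EXECUTE:
                // No instant time is passed in; the scheduled instant is reused.
                steps.add("schedule");
                steps.add("execute@<newly scheduled instant>");
                break;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(plan(parseMode("scheduleAndExecute"), null));
        System.out.println(plan(parseMode(null), "20210712000000"));
    }
}
```

The point of the combined mode in this sketch is that the instant time never leaves the process, which removes the manual copy step described in the issue.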





[GitHub] [hudi] hudi-bot commented on pull request #3264: [HUDI-2168] Fix for AccessControlException for anonymous user

2021-07-12 Thread GitBox


hudi-bot commented on pull request #3264:
URL: https://github.com/apache/hudi/pull/3264#issuecomment-878799938


   
   ## CI report:
   
   * e8e5e310224eee469a19bcfe7af537154843c318 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
 
   
   




[jira] [Commented] (HUDI-2168) AccessControlException for anonymous user

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379607#comment-17379607
 ] 

ASF GitHub Bot commented on HUDI-2168:
--

veenaypatil opened a new pull request #3264:
URL: https://github.com/apache/hudi/pull/3264


   ## What is the purpose of the pull request
   
   Fix the access control exception raised while running test cases that 
involve starting the Hive service.
   
   ## Brief change log
   
   Set config 
   ```
   config.setBoolean("dfs.permissions",false);
   ```
   
   ## Verify this pull request
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   - Verified the tests are running locally after this change
   
   
   ## Committer checklist
   
- [X] Has a corresponding JIRA in PR title & commit

- [X] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.




> AccessControlException for anonymous user
> -
>
> Key: HUDI-2168
> URL: https://issues.apache.org/jira/browse/HUDI-2168
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Testing
>Reporter: Vinay
>Assignee: Vinay
>Priority: Trivial
>
> Users are facing the following exception while executing test cases that 
> depend on starting the Hive service:
>  
> {code:java}
> Got exception: org.apache.hadoop.security.AccessControlException Permission 
> denied: user=anonymous, access=WRITE
> {code}
> This is specifically happening at the time of clearing Hive DB
> {code:java}
> client.updateHiveSQL("drop database if exists " + 
> hiveSyncConfig.databaseName);
> {code}





[jira] [Updated] (HUDI-2168) AccessControlException for anonymous user

2021-07-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-2168:
-
Labels: pull-request-available  (was: )

> AccessControlException for anonymous user
> -
>
> Key: HUDI-2168
> URL: https://issues.apache.org/jira/browse/HUDI-2168
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Testing
>Reporter: Vinay
>Assignee: Vinay
>Priority: Trivial
>  Labels: pull-request-available
>
> Users are facing the following exception while executing test cases that 
> depend on starting the Hive service:
>  
> {code:java}
> Got exception: org.apache.hadoop.security.AccessControlException Permission 
> denied: user=anonymous, access=WRITE
> {code}
> This is specifically happening at the time of clearing Hive DB
> {code:java}
> client.updateHiveSQL("drop database if exists " + 
> hiveSyncConfig.databaseName);
> {code}





[GitHub] [hudi] veenaypatil opened a new pull request #3264: [HUDI-2168] Fix for AccessControlException for anonymous user

2021-07-12 Thread GitBox


veenaypatil opened a new pull request #3264:
URL: https://github.com/apache/hudi/pull/3264


   ## What is the purpose of the pull request
   
   Fix the access control exception raised while running test cases that 
involve starting the Hive service.
   
   ## Brief change log
   
   Set config 
   ```
   config.setBoolean("dfs.permissions",false);
   ```
   
   ## Verify this pull request
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   - Verified the tests are running locally after this change
   
   
   ## Committer checklist
   
- [X] Has a corresponding JIRA in PR title & commit

- [X] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379600#comment-17379600
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
 
   * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN
   
   


> Add support to disable meta column to BulkInsert Row Writer path
> 
>
> Key: HUDI-2161
> URL: https://issues.apache.org/jira/browse/HUDI-2161
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>
> Objective here is to disable all meta columns so as to avoid storage cost. 
> Also, some benefits could be seen in write latency with row writer path as no 
> special handling is required at RowCreateHandle layer. 





[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
 
   * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN
   
   




[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379599#comment-17379599
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
 
   * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN
   
   


> Add support to disable meta column to BulkInsert Row Writer path
> 
>
> Key: HUDI-2161
> URL: https://issues.apache.org/jira/browse/HUDI-2161
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>
> Objective here is to disable all meta columns so as to avoid storage cost. 
> Also, some benefits could be seen in write latency with row writer path as no 
> special handling is required at RowCreateHandle layer. 





[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
 
   * caffa0a76af64dddc658d15a1dd3a371f3a8bcda UNKNOWN
   
   




[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379596#comment-17379596
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
 
   
   


> Add support to disable meta column to BulkInsert Row Writer path
> 
>
> Key: HUDI-2161
> URL: https://issues.apache.org/jira/browse/HUDI-2161
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>
> Objective here is to disable all meta columns so as to avoid storage cost. 
> Also, some benefits could be seen in write latency with row writer path as no 
> special handling is required at RowCreateHandle layer. 





[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=876)
 
   
   




[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379595#comment-17379595
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 UNKNOWN
   
   


[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   * e56bac615f087cec7817b846809c9f8fd0cc20a5 UNKNOWN
   
   




[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379593#comment-17379593
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248


   
   ## CI report:
   
   * f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
 
   
   


> BucketAssignFunction NullPointerException
> -
>
> Key: HUDI-2153
> URL: https://issues.apache.org/jira/browse/HUDI-2153
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: moran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> java.lang.NullPointerException
>   at 
> org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:198)
>   at 
> org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:159)
>   at 
> org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
>   at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
>   at 
> org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
>   at 
> org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
>   at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:396)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>   at java.lang.Thread.run(Thread.java:748)
> The error occurs at 
> line 197 of the BucketAssignFunction class  
> (this.context.setCurrentKey(recordKey)).
> Why is this context null?





[GitHub] [hudi] hudi-bot edited a comment on pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248


   
   ## CI report:
   
   * f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
 
   
   




[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379586#comment-17379586
 ] 

ASF GitHub Bot commented on HUDI-2164:
--

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
 
   
   


> Build cluster plan and execute this plan at once for HoodieClusteringJob
> 
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, Hudi lets users submit a HoodieClusteringJob to either build a 
> clustering plan or execute one, through the --schedule or 
> --instant-time config.
> To trigger a clustering job, a user has to:
>  # Submit a HoodieClusteringJob to build a clustering plan through the 
> --schedule config.
>  # Copy the created clustering instant time from the log output.
>  # Submit the HoodieClusteringJob again to execute the created clustering 
> plan through the --instant-time config.
> The pain point is that triggering a clustering takes too many steps, and the 
> instant time has to be copied from the log file manually, so the process 
> cannot be automated.
>  
> I have raised a PR that adds a new config named --mode (-m for short):
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant, so --instant-time is required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a clustering plan and 
> execute it at once using HoodieClusteringJob.
>  
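
The three --mode values in the issue above can be sketched as a small dispatch routine. This is only an illustration of the described behaviour, not the code in the pull request: the class `ClusteringModeSketch`, the methods `parseMode` and `plan`, and the `<new-instant>` placeholder are all hypothetical names, and no real Hudi API is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the --mode dispatch described in HUDI-2164.
final class ClusteringModeSketch {
    enum Mode { EXECUTE, SCHEDULE, SCHEDULE_AND_EXECUTE }

    // Maps the raw --mode value to a Mode; "execute" is the default per the issue text.
    static Mode parseMode(String raw) {
        if (raw == null || raw.isEmpty()) {
            return Mode.EXECUTE;
        }
        switch (raw.toLowerCase(Locale.ROOT)) {
            case "execute": return Mode.EXECUTE;
            case "schedule": return Mode.SCHEDULE;
            case "scheduleandexecute": return Mode.SCHEDULE_AND_EXECUTE;
            default: throw new IllegalArgumentException("Unknown --mode: " + raw);
        }
    }

    // Returns the ordered steps a job would take; strings stand in for real work.
    static List<String> plan(String rawMode, String instantTime) {
        Mode mode = parseMode(rawMode);
        List<String> steps = new ArrayList<>();
        String instant = instantTime;
        if (mode != Mode.EXECUTE) {
            // Assumption: scheduling yields a fresh instant time (placeholder value here).
            instant = "<new-instant>";
            steps.add("schedule " + instant);
        }
        if (mode != Mode.SCHEDULE) {
            if (instant == null) {
                throw new IllegalArgumentException("--instant-time is required in execute mode");
            }
            steps.add("execute " + instant);
        }
        return steps;
    }
}
```

With this shape, scheduleAndExecute hands the instant produced by the scheduling step straight to the execution step, which is the manual copy-and-paste the issue wants to eliminate.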





[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
 
   
   




[jira] [Updated] (HUDI-2168) AccessControlException for anonymous user

2021-07-12 Thread Vinay (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HUDI-2168:

Status: In Progress  (was: Open)

> AccessControlException for anonymous user
> -
>
> Key: HUDI-2168
> URL: https://issues.apache.org/jira/browse/HUDI-2168
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Testing
>Reporter: Vinay
>Assignee: Vinay
>Priority: Trivial
>
> Users are facing the following exception while executing test cases that 
> depend on starting the Hive service:
>  
> {code:java}
> Got exception: org.apache.hadoop.security.AccessControlException Permission 
> denied: user=anonymous, access=WRITE
> {code}
> This is specifically happening at the time of clearing Hive DB
> {code:java}
> client.updateHiveSQL("drop database if exists " + 
> hiveSyncConfig.databaseName);
> {code}





[jira] [Created] (HUDI-2168) AccessControlException for anonymous user

2021-07-12 Thread Vinay (Jira)
Vinay created HUDI-2168:
---

 Summary: AccessControlException for anonymous user
 Key: HUDI-2168
 URL: https://issues.apache.org/jira/browse/HUDI-2168
 Project: Apache Hudi
  Issue Type: Task
  Components: Testing
Reporter: Vinay
Assignee: Vinay


Users are facing the following exception while executing test cases that 
depend on starting the Hive service:

 
{code:java}
Got exception: org.apache.hadoop.security.AccessControlException Permission 
denied: user=anonymous, access=WRITE
{code}
This is specifically happening at the time of clearing Hive DB
{code:java}
client.updateHiveSQL("drop database if exists " + hiveSyncConfig.databaseName);
{code}





[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379584#comment-17379584
 ] 

ASF GitHub Bot commented on HUDI-2164:
--

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a UNKNOWN
   
   


[jira] [Commented] (HUDI-1985) Website re-design implementation

2021-07-12 Thread Vinoth Govindarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379583#comment-17379583
 ] 

Vinoth Govindarajan commented on HUDI-1985:
---

Hi [~xushiyan],
I have past experience building websites and can volunteer to work on 
this re-design.

 

> Website re-design implementation
> 
>
> Key: HUDI-1985
> URL: https://issues.apache.org/jira/browse/HUDI-1985
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Docs
>Reporter: Raymond Xu
>Priority: Blocker
>  Labels: documentation
> Fix For: 0.9.0
>
>
> To provide better navigation and organization of Hudi website's info, we have 
> done a re-design of the web pages.
> Previous discussion
> [https://github.com/apache/hudi/issues/2905]
>  
> See the wireframe and final design in 
> [https://www.figma.com/file/tipod1JZRw7anZRWBI6sZh/Hudi.Apache?node-id=32%3A6]
> (login Figma to comment)
> The design is ready for implementation.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a UNKNOWN
   
   




[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379580#comment-17379580
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   
   


[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   
   




[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379577#comment-17379577
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248


   
   ## CI report:
   
   * f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
 
   
   


[GitHub] [hudi] hudi-bot edited a comment on pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248


   
   ## CI report:
   
   * f1299ed52dcf90635d4f11fef040255cfda9f35b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=873)
 
   
   




[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379576#comment-17379576
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

danny0405 commented on a change in pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#discussion_r668418303



##
File path: 
hudi-flink/src/main/java/org/apache/hudi/streamer/HoodieFlinkStreamer.java
##
@@ -109,7 +109,7 @@ public static void main(String[] args) throws Exception {
 .transform(
 "bucket_assigner",
 TypeInformation.of(HoodieRecord.class),
-new KeyedProcessOperator<>(new BucketAssignFunction<>(conf)))
+new BucketAssignOperator<>(new BucketAssignFunction<>(conf)))
 .setParallelism(conf.getInteger(FlinkOptions.BUCKET_ASSIGN_TASKS))

Review comment:
   Nice catch, can we fix the indentation? There is also another PR with the 
same change, can we close that one?
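
The diff above swaps a generic operator for a dedicated one. One plausible reading, stated here as an assumption rather than as the actual Hudi code, is that the function expects its operator to inject a context before records arrive, and a generic operator never does so. The sketch below uses hypothetical stand-in types (`KeyContext`, `AssignFunction`, `GenericOperator`, `ContextAwareOperator`) and no Flink or Hudi classes.

```java
// Hypothetical sketch of a null-context failure and an operator that avoids it.
final class NullContextSketch {
    /** Stand-in for the key context the function relies on. */
    interface KeyContext {
        void setCurrentKey(String key);
    }

    /** Stand-in function: it only works once a context has been injected. */
    static final class AssignFunction {
        private KeyContext context; // stays null unless an operator injects it
        String lastKey;

        void setContext(KeyContext context) {
            this.context = context;
        }

        void processRecord(String recordKey) {
            // Throws NullPointerException when setContext was never called.
            context.setCurrentKey(recordKey);
        }
    }

    /** Generic operator: forwards records but never injects a context. */
    static final class GenericOperator {
        private final AssignFunction fn;
        GenericOperator(AssignFunction fn) { this.fn = fn; }
        void open() { /* nothing function-specific happens here */ }
        void processElement(String key) { fn.processRecord(key); }
    }

    /** Dedicated operator: injects the context when it is opened. */
    static final class ContextAwareOperator {
        private final AssignFunction fn;
        ContextAwareOperator(AssignFunction fn) { this.fn = fn; }
        void open() { fn.setContext(key -> fn.lastKey = key); }
        void processElement(String key) { fn.processRecord(key); }
    }

    /** Helper: reports whether running the action raises a NullPointerException. */
    static boolean failsWithNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Under these assumptions, wrapping the function in `GenericOperator` reproduces the failure on the first record, while `ContextAwareOperator` processes the same record without error.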






[GitHub] [hudi] danny0405 commented on a change in pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException

2021-07-12 Thread GitBox


danny0405 commented on a change in pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#discussion_r668418303



##
File path: 
hudi-flink/src/main/java/org/apache/hudi/streamer/HoodieFlinkStreamer.java
##
@@ -109,7 +109,7 @@ public static void main(String[] args) throws Exception {
 .transform(
 "bucket_assigner",
 TypeInformation.of(HoodieRecord.class),
-new KeyedProcessOperator<>(new BucketAssignFunction<>(conf)))
+new BucketAssignOperator<>(new BucketAssignFunction<>(conf)))
 .setParallelism(conf.getInteger(FlinkOptions.BUCKET_ASSIGN_TASKS))

Review comment:
   Nice catch, can we fix the indentation? There is also another PR with the 
same change, can we close that one?








[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379575#comment-17379575
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

hudi-bot commented on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248


   
   ## CI report:
   
   * f1299ed52dcf90635d4f11fef040255cfda9f35b UNKNOWN
   
   


[GitHub] [hudi] hudi-bot commented on pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException

2021-07-12 Thread GitBox


hudi-bot commented on pull request #3263:
URL: https://github.com/apache/hudi/pull/3263#issuecomment-878768248


   
   ## CI report:
   
   * f1299ed52dcf90635d4f11fef040255cfda9f35b UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379572#comment-17379572
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

moranyuwen opened a new pull request #3263:
URL: https://github.com/apache/hudi/pull/3263


   JIRA Issue: https://issues.apache.org/jira/browse/HUDI-2153
   
   When HoodieFlinkStreamer is run to write data, the context in 
BucketAssignFunction is null when the function is loaded. This update 
resolves the null context.




[GitHub] [hudi] moranyuwen opened a new pull request #3263: [HUDI-2153] Fix BucketAssignFunction Context NullPointerException

2021-07-12 Thread GitBox


moranyuwen opened a new pull request #3263:
URL: https://github.com/apache/hudi/pull/3263


   JIRA Issue: https://issues.apache.org/jira/browse/HUDI-2153
   
   When HoodieFlinkStreamer is run to write data, the context in 
BucketAssignFunction is null when the function is loaded. This update 
resolves the null context.






[hudi] branch master updated: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250)

2021-07-12 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new b0089b8  [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250)
b0089b8 is described below

commit b0089b894ad12da11fbd6a0fb08508c7adee68e6
Author: Sagar Sumit 
AuthorDate: Tue Jul 13 09:54:40 2021 +0530

[MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250)
---
 .../java/org/apache/hudi/config/HoodieWriteConfig.java |  3 ++-
 .../java/org/apache/hudi/config/TestHoodieWriteConfig.java | 14 --
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
index 20d2846..e2e295d 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
@@ -339,8 +339,9 @@ public class HoodieWriteConfig extends HoodieConfig {
       .withDocumentation("");
 
   public static final ConfigProperty EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION = ConfigProperty
-      .key(AVRO_SCHEMA + ".externalTransformation")
+      .key(AVRO_SCHEMA.key() + ".external.transformation")
       .defaultValue("false")
+      .withAlternatives(AVRO_SCHEMA.key() + ".externalTransformation")
       .withDocumentation("");
 
   private ConsistencyGuardConfig consistencyGuardConfig;
diff --git a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
index 7661e1d..89f7a97 100644
--- a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
@@ -23,6 +23,8 @@ import org.apache.hudi.config.HoodieWriteConfig.Builder;
 
 import org.apache.hudi.index.HoodieIndex;
 import org.junit.jupiter.api.Test;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.ValueSource;
 
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
@@ -33,16 +35,23 @@ import java.util.Map;
 import java.util.Properties;
 
 import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertTrue;
 
 public class TestHoodieWriteConfig {
 
-  @Test
-  public void testPropertyLoading() throws IOException {
+  @ParameterizedTest
+  @ValueSource(booleans = {true, false})
+  public void testPropertyLoading(boolean withAlternative) throws IOException {
     Builder builder = HoodieWriteConfig.newBuilder().withPath("/tmp");
     Map params = new HashMap<>(3);
     params.put(HoodieCompactionConfig.CLEANER_COMMITS_RETAINED_PROP.key(), "1");
     params.put(HoodieCompactionConfig.MAX_COMMITS_TO_KEEP_PROP.key(), "5");
     params.put(HoodieCompactionConfig.MIN_COMMITS_TO_KEEP_PROP.key(), "2");
+    if (withAlternative) {
+      params.put("hoodie.avro.schema.externalTransformation", "true");
+    } else {
+      params.put("hoodie.avro.schema.external.transformation", "true");
+    }
     ByteArrayOutputStream outStream = saveParamsIntoOutputStream(params);
     ByteArrayInputStream inputStream = new ByteArrayInputStream(outStream.toByteArray());
     try {
@@ -54,6 +63,7 @@ public class TestHoodieWriteConfig {
     HoodieWriteConfig config = builder.build();
     assertEquals(5, config.getMaxCommitsToKeep());
     assertEquals(2, config.getMinCommitsToKeep());
+    assertTrue(config.shouldUseExternalSchemaTransformation());
   }
 
   @Test
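
The commit above renames the key to "hoodie.avro.schema.external.transformation" and registers the old spelling as an alternative, so that existing configs presumably keep working. The lookup order this implies can be sketched as follows. This is a minimal, self-contained illustration and not Hudi's ConfigProperty implementation: the KeyWithAlternatives class and its resolve method are hypothetical names introduced here, and only the two key strings are taken from the test in the diff.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for a config property with fallback keys.
// This is NOT Hudi's ConfigProperty class; it only illustrates the idea.
final class KeyWithAlternatives {
  final String key;
  final String defaultValue;
  final List<String> alternatives;

  KeyWithAlternatives(String key, String defaultValue, String... alternatives) {
    this.key = key;
    this.defaultValue = defaultValue;
    this.alternatives = Arrays.asList(alternatives);
  }

  // Try the primary key first, then each alternative in order, then the default.
  String resolve(Properties props) {
    String value = props.getProperty(key);
    if (value != null) {
      return value;
    }
    for (String alternative : alternatives) {
      value = props.getProperty(alternative);
      if (value != null) {
        return value;
      }
    }
    return defaultValue;
  }
}

class ConfigAlternativesSketch {
  // Key strings copied from the test in the diff; the default mirrors defaultValue("false").
  static final KeyWithAlternatives EXTERNAL_TRANSFORMATION = new KeyWithAlternatives(
      "hoodie.avro.schema.external.transformation",
      "false",
      "hoodie.avro.schema.externalTransformation");

  public static void main(String[] args) {
    Properties legacy = new Properties();
    legacy.setProperty("hoodie.avro.schema.externalTransformation", "true");

    Properties current = new Properties();
    current.setProperty("hoodie.avro.schema.external.transformation", "true");

    // Both spellings resolve to the configured value; an empty config falls back to the default.
    System.out.println(EXTERNAL_TRANSFORMATION.resolve(legacy));
    System.out.println(EXTERNAL_TRANSFORMATION.resolve(current));
    System.out.println(EXTERNAL_TRANSFORMATION.resolve(new Properties()));
  }
}
```

The parameterized test in the diff exercises the same two cases: one run sets the old key, the other sets the new key, and both expect the setting to be picked up.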


[GitHub] [hudi] nsivabalan merged pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


nsivabalan merged pull request #3250:
URL: https://github.com/apache/hudi/pull/3250


   






[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379567#comment-17379567
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   
   


> Add support to disable meta column to BulkInsert Row Writer path
> 
>
> Key: HUDI-2161
> URL: https://issues.apache.org/jira/browse/HUDI-2161
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>
> The objective is to disable all meta columns to avoid their storage cost. 
> Write latency on the row writer path may also improve, since no special 
> handling is required at the RowCreateHandle layer. 





[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   * 8a212fd77769cbf7e248e971f66109381ba80f71 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=872)
 
   
   




[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379566#comment-17379566
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   * 8a212fd77769cbf7e248e971f66109381ba80f71 UNKNOWN
   
   


[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   * 8a212fd77769cbf7e248e971f66109381ba80f71 UNKNOWN
   
   




[jira] [Updated] (HUDI-2150) Rename/Restructure configs for better modularity

2021-07-12 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-2150:
-
Description: Now that we have a framework that can capture configs and even 
their alternatives well, it is time to clean things up.  (was: * Rename 
HoodieWriteConfig to HoodieClientConfig 
 * Move a bunch of configs from CompactionConfig to StorageConfig 
 * Introduce a new HoodieCleanConfig
 * Should we consider Lombok or something similar to automate the 
defaults/getters/setters?
 * Consistent naming of properties/defaults 
 * Enforce bounds more strictly)

> Rename/Restructure configs for better modularity
> 
>
> Key: HUDI-2150
> URL: https://issues.apache.org/jira/browse/HUDI-2150
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Code Cleanup
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> Now that we have a framework that can capture configs and even their 
> alternatives well, it is time to clean things up.





[jira] [Commented] (HUDI-2150) Rename/Restructure configs for better modularity

2021-07-12 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379561#comment-17379561
 ] 

Vinoth Chandar commented on HUDI-2150:
--

Cleaner-related configs are to be moved out of HoodieCompactionConfig into 
their own HoodieCleanConfig.



[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379557#comment-17379557
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128


   
   ## CI report:
   
   * afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
 
   
   


[GitHub] [hudi] hudi-bot edited a comment on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128


   
   ## CI report:
   
   * afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
 
   
   




[jira] [Created] (HUDI-2167) HoodieCompactionConfig get HoodieCleaningPolicy NullPointerException

2021-07-12 Thread tsianglei (Jira)
tsianglei created HUDI-2167:
---

 Summary: HoodieCompactionConfig get HoodieCleaningPolicy 
NullPointerException
 Key: HUDI-2167
 URL: https://issues.apache.org/jira/browse/HUDI-2167
 Project: Apache Hudi
  Issue Type: Bug
  Components: CLI, Flink Integration
Reporter: tsianglei


Caused by: java.lang.NullPointerException: Name is null
 at java.lang.Enum.valueOf(Enum.java:236) ~[?:1.8.0_221]
 at org.apache.hudi.common.model.HoodieCleaningPolicy.valueOf(HoodieCleaningPolicy.java:24) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
 at org.apache.hudi.config.HoodieCompactionConfig$Builder.build(HoodieCompactionConfig.java:368) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
 at org.apache.hudi.util.StreamerUtil.getHoodieClientConfig(StreamerUtil.java:155) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
 at org.apache.hudi.util.StreamerUtil.createWriteClient(StreamerUtil.java:277) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
 at org.apache.hudi.sink.StreamWriteOperatorCoordinator.start(StreamWriteOperatorCoordinator.java:154) ~[hudi-flink-bundle_2.11-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
 at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder.start(OperatorCoordinatorHolder.java:189) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
 at org.apache.flink.runtime.scheduler.SchedulerBase.startAllOperatorCoordinators(SchedulerBase.java:1253) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
 at org.apache.flink.runtime.scheduler.SchedulerBase.startScheduling(SchedulerBase.java:624) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
 at org.apache.flink.runtime.jobmaster.JobMaster.startScheduling(JobMaster.java:1032) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
 at java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:705) ~[?:1.8.0_221]
 ... 27 more
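
The top frame of this trace is Enum.valueOf, which throws a NullPointerException when the name passed to it is null. That typically happens when the property holding the enum name was never set. The sketch below reproduces the failure mode in isolation and shows one possible guard. It is an illustration only and not the Hudi fix: the Policy enum, the "cleaner.policy" key, and both helper methods are hypothetical names chosen for this example.

```java
import java.util.Properties;

class EnumValueOfNullSketch {
  // Hypothetical enum used only for this illustration.
  enum Policy { KEEP_LATEST_COMMITS, KEEP_LATEST_FILE_VERSIONS }

  // Mirrors the failing pattern: the property lookup returns null when the
  // key is absent, and valueOf(null) throws a NullPointerException.
  static Policy parseStrict(Properties props, String key) {
    return Policy.valueOf(props.getProperty(key));
  }

  // One possible guard: fall back to a caller-supplied default when the key is absent.
  static Policy parseWithDefault(Properties props, String key, Policy fallback) {
    String name = props.getProperty(key);
    return name == null ? fallback : Policy.valueOf(name);
  }

  public static void main(String[] args) {
    Properties empty = new Properties();
    try {
      parseStrict(empty, "cleaner.policy"); // hypothetical key, never set
    } catch (NullPointerException e) {
      System.out.println("strict parse failed: " + e.getMessage());
    }
    System.out.println(parseWithDefault(empty, "cleaner.policy", Policy.KEEP_LATEST_COMMITS));
  }
}
```

Whether the right remedy in Hudi is a default value, an earlier validation step, or a change in how the Flink options are copied into the write config cannot be determined from the report alone.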





[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379546#comment-17379546
 ] 

ASF GitHub Bot commented on HUDI-2164:
--

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   
   


> Build cluster plan and execute this plan at once for HoodieClusteringJob
> 
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
>
> Currently, Hudi lets users submit a HoodieClusteringJob either to build a 
> clustering plan or to execute one, through the --schedule or 
> --instant-time config.
> To trigger a clustering job, a user has to:
>  # Submit a HoodieClusteringJob with the --schedule config to build a 
> clustering plan.
>  # Copy the created clustering instant time from the log output.
>  # Submit the HoodieClusteringJob again with the --instant-time config to 
> execute the created clustering plan.
> The pain point is that triggering a clustering takes too many steps, and the 
> instant time has to be copied from the log file manually, so the process 
> cannot be automated.
>  
> I have raised a PR that offers a new config named --mode (-m for short):
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant, which means 
> --instant-time is required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan 
> immediately.|
> Users can now use --mode scheduleAndExecute to build a clustering plan and 
> execute it at once with HoodieClusteringJob.
>  
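
The three --mode values in the table above can be read as a small dispatch over the mode string. The sketch below shows that reading. It is an illustration under stated assumptions and not the code from the PR: the ClusteringModeSketch class, the stepsFor method, and the step labels are hypothetical, and the only details taken from the description are the three mode names, the default of execute, and the requirement that execute needs an instant time.

```java
import java.util.ArrayList;
import java.util.List;

class ClusteringModeSketch {
  // Returns the ordered steps a job would run for the given mode.
  // A null mode falls back to "execute", the default named in the description.
  static List<String> stepsFor(String mode, String instantTime) {
    String effectiveMode = (mode == null) ? "execute" : mode;
    List<String> steps = new ArrayList<>();
    switch (effectiveMode) {
      case "schedule":
        steps.add("schedule");
        break;
      case "execute":
        if (instantTime == null) {
          throw new IllegalArgumentException("--instant-time is required in execute mode");
        }
        steps.add("execute@" + instantTime);
        break;
      case "scheduleAndExecute":
        // The instant produced by scheduling is handed straight to execution,
        // so nothing has to be copied from the logs by hand.
        steps.add("schedule");
        steps.add("execute@<scheduled instant>");
        break;
      default:
        throw new IllegalArgumentException("Unknown mode: " + effectiveMode);
    }
    return steps;
  }

  public static void main(String[] args) {
    System.out.println(stepsFor("schedule", null));
    System.out.println(stepsFor(null, "20210712000000")); // illustrative instant time
    System.out.println(stepsFor("scheduleAndExecute", null));
  }
}
```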





[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   
   




[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379543#comment-17379543
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128


   
   ## CI report:
   
   * afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
 
   
   


[GitHub] [hudi] hudi-bot edited a comment on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128


   
   ## CI report:
   
   * afe140f7b9169e5a6129a10a6a12f839658c7b08 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=871)
 
   
   




[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379542#comment-17379542
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

hudi-bot commented on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128


   
   ## CI report:
   
   * afe140f7b9169e5a6129a10a6a12f839658c7b08 UNKNOWN
   
   


[GitHub] [hudi] hudi-bot commented on pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException

2021-07-12 Thread GitBox


hudi-bot commented on pull request #3261:
URL: https://github.com/apache/hudi/pull/3261#issuecomment-878740128


   
   ## CI report:
   
   * afe140f7b9169e5a6129a10a6a12f839658c7b08 UNKNOWN
   
   




[GitHub] [hudi] izhangzhihao opened a new issue #3262: [SUPPORT] No successful commits under path

2021-07-12 Thread GitBox


izhangzhihao opened a new issue #3262:
URL: https://github.com/apache/hudi/issues/3262


   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   code https://github.com/izhangzhihao/Real-time-Data-Warehouse/tree/hudi
   
   ###  create table
   
   ```sql
   CREATE TABLE accident_claims
   (
   claim_id            BIGINT,
   claim_total         DOUBLE,
   claim_total_receipt VARCHAR(50),
   claim_currency      VARCHAR(3),
   member_id           INT,
   accident_date       DATE,
   accident_type       VARCHAR(20),
   accident_detail     VARCHAR(20),
   claim_date          DATE,
   claim_status        VARCHAR(10),
   ts_created          TIMESTAMP(3),
   ts_updated          TIMESTAMP(3),
   ds                  DATE,
   PRIMARY KEY (claim_id) NOT ENFORCED
   ) PARTITIONED BY (ds) WITH (
 'connector'='hudi',
 'path' = '/data/dwd/accident_claims',
 'table.type' = 'MERGE_ON_READ',
 'read.streaming.enabled' = 'true',
 'write.batch.size' = '1',
 'write.task.max.size' = '1',
 'write.tasks' = '1',
 'compaction.tasks' = '1',
 'compaction.delta_seconds' = '60',
 'write.precombine.field' = 'ts_updated',
 'read.tasks' = '1',
 'read.streaming.check-interval' = '5',
 'read.streaming.start-commit' = '20210712134429',
   );
   ```
   
   ### insert from CDC change stream
   
   ```sql
   INSERT INTO dwd.accident_claims
   SELECT claim_id,
  claim_total,
  claim_total_receipt,
  claim_currency,
  member_id,
  CAST (accident_date as DATE),
  accident_type,
  accident_detail,
  CAST (claim_date as DATE),
  claim_status,
  CAST (ts_created as TIMESTAMP),
  CAST (ts_updated as TIMESTAMP),
  CAST (SUBSTRING(claim_date, 0, 9) as DATE)
   FROM datasource.accident_claims;
   ```
   
   **Expected behavior**
   
   ```
   SELECT * FROM accident_claims;
   ```
   
   should return results
   
   But got:
   
   ```
   Flink SQL> SELECT * FROM accident_claims;
   [ERROR] Could not execute SQL statement. Reason:
   org.apache.hudi.exception.HoodieException: No successful commits under path /data/dwd/accident_claims
   ```
   
   But the sample code works:
   
   ```
   CREATE TABLE t1(
 uuid VARCHAR(20), -- you can use 'PRIMARY KEY NOT ENFORCED' syntax to mark the field as record key
 name VARCHAR(10),
 age INT,
 ts TIMESTAMP(3),
 `partition` VARCHAR(20)
   )
   PARTITIONED BY (`partition`)
   WITH (
 'connector' = 'hudi',
 'path' = '/data/t1',
 'write.tasks' = '1', -- default is 4 ,required more resource
 'compaction.tasks' = '1', -- default is 10 ,required more resource
 'table.type' = 'COPY_ON_WRITE', -- this creates a MERGE_ON_READ table, by default is COPY_ON_WRITE
 'read.tasks' = '1', -- default is 4 ,required more resource
 'read.streaming.enabled' = 'true',  -- this option enable the streaming read
 'read.streaming.start-commit' = '20210712134429', -- specifies the start commit instant time
 'read.streaming.check-interval' = '4' -- specifies the check interval for finding new source commits, default 60s.
   );
   
   -- insert data using values
   INSERT INTO t1 VALUES
 ('id1','Danny',23,TIMESTAMP '1970-01-01 00:00:01','par1'),
 ('id2','Stephen',33,TIMESTAMP '1970-01-01 00:00:02','par1'),
 ('id3','Julian',53,TIMESTAMP '1970-01-01 00:00:03','par2'),
 ('id4','Fabian',31,TIMESTAMP '1970-01-01 00:00:04','par2'),
 ('id5','Sophia',18,TIMESTAMP '1970-01-01 00:00:05','par3'),
 ('id6','Emma',20,TIMESTAMP '1970-01-01 00:00:06','par3'),
 ('id7','Bob',44,TIMESTAMP '1970-01-01 00:00:07','par4'),
 ('id8','Han',56,TIMESTAMP '1970-01-01 00:00:08','par4');
   
   SELECT * FROM t1;
   ```
   
   So I didn't get what's wrong here...
   
   **Environment Description**
   
   * Hudi version : 0.9.0 SNAPSHOT
   
   * Flink version :  1.12.2
   
   * Hive version : none
   
   * Hadoop version : 2.8.3
   
   * Storage (HDFS/S3/GCS..) : local file system
   
   * Running on Docker? (yes/no) : yes
   
   
   **Additional context**
   
   Add any other context about the problem here.
   
   
![image](https://user-images.githubusercontent.com/12044174/125382900-20040c80-e3c9-11eb-8ab6-be9a7c3072f5.png)
   
   Taskmanager log: [taskmanager.log.zip](https://github.com/apache/hudi/files/6805564/taskmanager.log.zip)
   
   
   






[jira] [Updated] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-2153:
-
Labels: pull-request-available  (was: )

> BucketAssignFunction NullPointerException
> -
>
> Key: HUDI-2153
> URL: https://issues.apache.org/jira/browse/HUDI-2153
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: moran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> java.lang.NullPointerException
>   at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:198)
>   at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:159)
>   at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
>   at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
>   at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
>   at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
>   at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:396)
>   at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>   at java.lang.Thread.run(Thread.java:748)
> Error at line 197 of the BucketAssignFunction class (this.context.setCurrentKey(recordKey)).
> Why is this context null?





[jira] [Commented] (HUDI-2153) BucketAssignFunction NullPointerException

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379541#comment-17379541
 ] 

ASF GitHub Bot commented on HUDI-2153:
--

moranyuwen opened a new pull request #3261:
URL: https://github.com/apache/hudi/pull/3261


   Running HoodieFlinkStreamer hits an exception in the BucketAssignFunction 
class, where the context is null.
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.




> BucketAssignFunction NullPointerException
> -
>
> Key: HUDI-2153
> URL: https://issues.apache.org/jira/browse/HUDI-2153
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: moran
>Priority: Major
> Fix For: 0.9.0
>
>
> java.lang.NullPointerException
>   at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:198)
>   at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:159)
>   at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
>   at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
>   at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:204)
>   at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:174)
>   at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:396)
>   at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:191)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>   at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>   at java.lang.Thread.run(Thread.java:748)
> Error at line 197 of the BucketAssignFunction class (this.context.setCurrentKey(recordKey)).
> Why is this context null?





[GitHub] [hudi] moranyuwen opened a new pull request #3261: [HUDI-2153] Fix BucketAssignFunction NullPointerException

2021-07-12 Thread GitBox


moranyuwen opened a new pull request #3261:
URL: https://github.com/apache/hudi/pull/3261


   Running HoodieFlinkStreamer hits an exception in the BucketAssignFunction 
class, where the context is null.
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379522#comment-17379522
 ] 

ASF GitHub Bot commented on HUDI-2164:
--

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   
   


> Build cluster plan and execute this plan at once for HoodieClusteringJob
> 
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a 
> clustering plan or execute a clustering plan through the --schedule or 
> --instant-time config.
> If users want to trigger a clustering job, they have to:
>  # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
>  # Copy the created clustering instant time from the log output.
>  # Submit the HoodieClusteringJob again to execute the created clustering plan through the --instant-time config.
> The pain point is that triggering a clustering takes too many steps, and the 
> instant time has to be copied from the log file manually, so the process 
> cannot be automated.
>  
> I just raised a PR to offer a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant, which means --instant-time is required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a clustering plan and 
> execute it at once using HoodieClusteringJob.
>  
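
The three --mode values described above can be sketched as a small dispatch. This is only an illustration under stated assumptions: `ClusteringModeSketch`, `parseMode`, and `steps` are hypothetical names, the case-insensitive parsing is an assumption, and none of this is taken from the actual HoodieClusteringJob code in the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of the --mode dispatch; not the HoodieClusteringJob code. */
final class ClusteringModeSketch {

  enum Mode { EXECUTE, SCHEDULE, SCHEDULE_AND_EXECUTE }

  /** Parses the --mode value (case-insensitively, by assumption); execute is the default. */
  static Mode parseMode(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return Mode.EXECUTE;
    }
    switch (raw.trim().toLowerCase(Locale.ROOT)) {
      case "execute":
        return Mode.EXECUTE;
      case "schedule":
        return Mode.SCHEDULE;
      case "scheduleandexecute":
        return Mode.SCHEDULE_AND_EXECUTE;
      default:
        throw new IllegalArgumentException("Unknown --mode: " + raw);
    }
  }

  /** Returns the ordered steps a run would perform for the given mode. */
  static List<String> steps(Mode mode, String instantTime) {
    List<String> result = new ArrayList<>();
    switch (mode) {
      case SCHEDULE:
        result.add("schedule");
        break;
      case EXECUTE:
        // execute needs an existing plan, so --instant-time is mandatory here
        if (instantTime == null) {
          throw new IllegalArgumentException("--instant-time is required for execute");
        }
        result.add("execute@" + instantTime);
        break;
      case SCHEDULE_AND_EXECUTE:
        // the instant comes from the schedule step, so no copy and paste is needed
        result.add("schedule");
        result.add("execute@<scheduled instant>");
        break;
    }
    return result;
  }
}
```

The point of the third mode is visible in the last branch: the instant time flows from the schedule step to the execute step inside one run instead of through the log file.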





[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249


   
   ## CI report:
   
   * d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
 
   
   




[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379520#comment-17379520
 ] 

ASF GitHub Bot commented on HUDI-2164:
--

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878723946


   @hudi-bot run azure




> Build cluster plan and execute this plan at once for HoodieClusteringJob
> 
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a 
> clustering plan or execute a clustering plan through the --schedule or 
> --instant-time config.
> If users want to trigger a clustering job, they have to:
>  # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
>  # Copy the created clustering instant time from the log output.
>  # Submit the HoodieClusteringJob again to execute the created clustering plan through the --instant-time config.
> The pain point is that triggering a clustering takes too many steps, and the 
> instant time has to be copied from the log file manually, so the process 
> cannot be automated.
>  
> I just raised a PR to offer a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant, which means --instant-time is required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a clustering plan and 
> execute it at once using HoodieClusteringJob.
>  





[GitHub] [hudi] zhangyue19921010 commented on pull request #3259: [HUDI-2164] Build cluster plan and execute this plan at once for HoodieClusteringJob

2021-07-12 Thread GitBox


zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878723946


   @hudi-bot run azure






[GitHub] [hudi] codope commented on a change in pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


codope commented on a change in pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#discussion_r668367569



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##
@@ -339,7 +339,7 @@
   .withDocumentation("");
 
   public static final ConfigProperty 
EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION = ConfigProperty
-  .key(AVRO_SCHEMA + ".externalTransformation")
+  .key(AVRO_SCHEMA.key() + ".externalTransformation")

Review comment:
   Changed the config key to `hoodie.avro.schema.external.transformation` 
and kept `hoodie.avro.schema.externalTransformation` as an alternative for 
backwards compatibility.
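
To illustrate the idea of a renamed key with a backwards-compatible alternative, here is a minimal, self-contained sketch. `AlternativeKeySketch` and `resolve` are hypothetical names invented for this example; it does not use or reproduce Hudi's `ConfigProperty` class, and the lookup order (new key first, then alternatives) is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a config property; not Hudi's ConfigProperty class. */
final class AlternativeKeySketch {
  private final String key;
  private final List<String> alternatives;

  AlternativeKeySketch(String key, String... alternatives) {
    this.key = key;
    this.alternatives = Arrays.asList(alternatives);
  }

  /** Looks up the primary key first, then each alternative in declaration order. */
  Optional<String> resolve(Map<String, String> props) {
    if (props.containsKey(key)) {
      return Optional.ofNullable(props.get(key));
    }
    for (String alternative : alternatives) {
      if (props.containsKey(alternative)) {
        return Optional.ofNullable(props.get(alternative));
      }
    }
    return Optional.empty();
  }
}
```

With this shape, a job that still sets only `hoodie.avro.schema.externalTransformation` keeps working, while a job that sets both keys gets the value of the new one.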








[jira] [Commented] (HUDI-2151) Make performant out-of-box configs

2021-07-12 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379496#comment-17379496
 ] 

Vinoth Chandar commented on HUDI-2151:
--

These need a default value. 

 
{code:java}
public static final ConfigProperty ZK_PORT_PROP = ConfigProperty
 .key(ZK_PORT_PROP_KEY)
 .noDefaultValue()
 .sinceVersion("0.8.0")
 .withDocumentation("Zookeeper port to connect to.");

public static final ConfigProperty ZK_LOCK_KEY_PROP = ConfigProperty
 .key(ZK_LOCK_KEY_PROP_KEY)
 .noDefaultValue()
 .sinceVersion("0.8.0")
 .withDocumentation("Key name under base_path at which to create a ZNode and acquire lock. "
 + "Final path on zk will look like base_path/lock_key. We recommend setting this to the table name");{code}
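
As a rough illustration of why a missing default matters, the sketch below models a property with and without a default. Everything in it is hypothetical: `DefaultedPropertySketch`, `getOrThrow`, and the key `example.zk.port` are made up for this example, and 2181 is used only because it is ZooKeeper's conventional client port; the comment above does not say which defaults Hudi should adopt.

```java
import java.util.Map;

/** Hypothetical property holder; a null default models noDefaultValue(). */
final class DefaultedPropertySketch {
  private final String key;
  private final String defaultValue;

  DefaultedPropertySketch(String key, String defaultValue) {
    this.key = key;
    this.defaultValue = defaultValue;
  }

  /** Returns the configured value, else the default, else fails loudly. */
  String getOrThrow(Map<String, String> props) {
    String value = props.get(key);
    if (value != null) {
      return value;
    }
    if (defaultValue != null) {
      return defaultValue;
    }
    throw new IllegalStateException("No value and no default for " + key);
  }
}
```

A property built as `new DefaultedPropertySketch("example.zk.port", "2181")` resolves on an empty config, whereas the same property with a null default throws until the user sets the key explicitly.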

> Make performant out-of-box configs
> --
>
> Key: HUDI-2151
> URL: https://issues.apache.org/jira/browse/HUDI-2151
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Code Cleanup, Docs
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> We have quite a few configs that deliver better performance or usability 
> but are guarded by flags. 
>  This task is to identify them, change them, test them (functionally and for 
> performance), and make them the default.
>  
> We also need to capture all the backwards compatibility issues that 
> can arise.





[jira] [Issue Comment Deleted] (HUDI-2151) Make performant out-of-box configs

2021-07-12 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-2151:
-
Comment: was deleted

(was: Is this correct? 5s?

 
{code:java}
public static final ConfigProperty LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP = ConfigProperty
 .key(LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY)
 .defaultValue(String.valueOf(5000L))
 .sinceVersion("0.8.0")
 .withDocumentation("Maximum amount of time to wait between retries by lock provider client. This bounds" +
 " the maximum delay from the exponential backoff.");{code})

> Make performant out-of-box configs
> --
>
> Key: HUDI-2151
> URL: https://issues.apache.org/jira/browse/HUDI-2151
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Code Cleanup, Docs
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> We have quite a few configs that deliver better performance or usability 
> but are guarded by flags. 
>  This task is to identify them, change them, test them (functionally and for 
> performance), and make them the default.
>  
> We also need to capture all the backwards compatibility issues that 
> can arise.





[jira] [Commented] (HUDI-2144) Offline clustering(independent sparkJob) will cause insert action losing data

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379491#comment-17379491
 ] 

ASF GitHub Bot commented on HUDI-2144:
--

satishkotha merged pull request #3240:
URL: https://github.com/apache/hudi/pull/3240


   




> Offline clustering(independent sparkJob) will cause insert action losing data
> -
>
> Key: HUDI-2144
> URL: https://issues.apache.org/jira/browse/HUDI-2144
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-07-08-13-52-00-089.png
>
>
> For now we have two kinds of pipelines for Hudi using Spark:
>  # Streaming inserts of data into a specific partition.
>  # An offline clustering Spark job (`org.apache.hudi.utilities.HoodieClusteringJob`) to optimize the size of the files that pipeline 1 created.
> But we hit a bug that loses data.
> These steps reproduce the problem reliably:
>  # Submit a Spark job to ingest data1 using insert mode.
>  # Schedule a clustering plan using `org.apache.hudi.utilities.HoodieClusteringJob`.
>  # Submit a Spark job again to ingest data2 using insert mode (ensure that a new file slice is created in the same file group, which means small file tuning for insert is working). Suppose this file group is called file group 1 and the new file slice is called file slice 2.
>  # Execute the clustering plan scheduled in step 2.
>  # Query data1+data2; you will find that new data for a  is lost compared with common ingestion without clustering.
>  
>   !image-2021-07-08-13-52-00-089.png|width=922,height=728!
> Here is the root cause:
> When ingesting data using insert mode, Hudi finds small files and tries to 
> append new data to them, aiming to tune the data file size.
> [https://github.com/apache/hudi/blob/650c4455c600b0346fed8b5b6aa4cc0bf3452e8c/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java#L149]
> tries to filter out small files that are in clustering, but it only works when 
> the user sets `hoodie.clustering.inline` to true, which is not good enough when 
> users run offline clustering.
> I just raised a PR to fix it, and tested it.
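
The root cause above can be illustrated with a minimal sketch of a filter that depends only on the set of pending clustering file groups rather than on a config flag. This is an assumption-laden illustration, not the code merged in the PR: `SmallFileFilterSketch` and its nested `SmallFile` type are hypothetical stand-ins, and only the file group ID is modeled.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch; not the UpsertPartitioner implementation. */
final class SmallFileFilterSketch {

  /** Minimal stand-in for a small file; only the file group ID matters here. */
  static final class SmallFile {
    final String fileId;

    SmallFile(String fileId) {
      this.fileId = fileId;
    }
  }

  /**
   * Drops small files whose file group has a pending clustering plan, so
   * inserts are never appended to a file group that clustering will replace.
   * The check is driven by the pending set itself, not by an inline flag.
   */
  static List<SmallFile> filterSmallFilesInClustering(
      Set<String> pendingClusteringFileGroupIds, List<SmallFile> smallFiles) {
    if (pendingClusteringFileGroupIds.isEmpty()) {
      return smallFiles;
    }
    return smallFiles.stream()
        .filter(file -> !pendingClusteringFileGroupIds.contains(file.fileId))
        .collect(Collectors.toList());
  }
}
```

Because the early return only skips work when nothing is pending, the filter behaves the same for inline and offline clustering, which is the case the bug report describes.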





[hudi] branch master updated (ca440cc -> c8a2033)

2021-07-12 Thread satish
This is an automated email from the ASF dual-hosted git repository.

satish pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from ca440cc  [HUDI-2107] Support Read Log Only MOR Table For Spark (#3193)
 add c8a2033  [HUDI-2144]Bug-Fix:Offline clustering(HoodieClusteringJob) 
will cause insert action losing data (#3240)

No new revisions were added by this update.

Summary of changes:
 .../table/action/commit/UpsertPartitioner.java |  2 +-
 .../table/action/commit/TestUpsertPartitioner.java | 45 +-
 .../hudi/common/testutils/ClusteringTestUtils.java | 54 ++
 3 files changed, 99 insertions(+), 2 deletions(-)
 create mode 100644 
hudi-common/src/test/java/org/apache/hudi/common/testutils/ClusteringTestUtils.java


[GitHub] [hudi] satishkotha merged pull request #3240: [HUDI-2144]Bug-Fix:Offline clustering(HoodieClusteringJob) will cause insert action losing data

2021-07-12 Thread GitBox


satishkotha merged pull request #3240:
URL: https://github.com/apache/hudi/pull/3240


   






[jira] [Commented] (HUDI-2144) Offline clustering(independent sparkJob) will cause insert action losing data

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379490#comment-17379490
 ] 

ASF GitHub Bot commented on HUDI-2144:
--

lw309637554 commented on a change in pull request #3240:
URL: https://github.com/apache/hudi/pull/3240#discussion_r668356708



##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
##
@@ -146,7 +146,7 @@ private int addUpdateBucket(String partitionPath, String 
fileIdHint) {
* @return smallFiles not in clustering
*/
   private List filterSmallFilesInClustering(final Set 
pendingClusteringFileGroupsId, final List smallFiles) {
-if (this.config.isClusteringEnabled()) {

Review comment:
   @satishkotha @zhangyue19921010 
   Using "if (!pendingClusteringFileGroupsId.isEmpty())" would improve ease of 
use. 
   Another place needs to be modified as well. But would this bring a 
performance loss? @satishkotha 
   
   "  private JavaRDD> 
clusteringHandleUpdate(JavaRDD> inputRecordsRDD) {
   if (config.isClusteringEnabled()) {"






> Offline clustering(independent sparkJob) will cause insert action losing data
> -
>
> Key: HUDI-2144
> URL: https://issues.apache.org/jira/browse/HUDI-2144
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Yue Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-07-08-13-52-00-089.png
>
>
> For now we have two kinds of pipelines for Hudi using Spark:
>  # Streaming inserts of data into a specific partition.
>  # An offline clustering Spark job (`org.apache.hudi.utilities.HoodieClusteringJob`) to optimize the size of the files that pipeline 1 created.
> But we hit a bug that loses data.
> These steps reproduce the problem reliably:
>  # Submit a Spark job to ingest data1 using insert mode.
>  # Schedule a clustering plan using `org.apache.hudi.utilities.HoodieClusteringJob`.
>  # Submit a Spark job again to ingest data2 using insert mode (ensure that a new file slice is created in the same file group, which means small file tuning for insert is working). Suppose this file group is called file group 1 and the new file slice is called file slice 2.
>  # Execute the clustering plan scheduled in step 2.
>  # Query data1+data2; you will find that new data for a  is lost compared with common ingestion without clustering.
>  
>   !image-2021-07-08-13-52-00-089.png|width=922,height=728!
> Here is the root cause:
> When ingesting data using insert mode, Hudi finds small files and tries to 
> append new data to them, aiming to tune the data file size.
> [https://github.com/apache/hudi/blob/650c4455c600b0346fed8b5b6aa4cc0bf3452e8c/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java#L149]
> tries to filter out small files that are in clustering, but it only works when 
> the user sets `hoodie.clustering.inline` to true, which is not good enough when 
> users run offline clustering.
> I just raised a PR to fix it, and tested it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] lw309637554 commented on a change in pull request #3240: [HUDI-2144]Bug-Fix:Offline clustering(HoodieClusteringJob) will cause insert action losing data

2021-07-12 Thread GitBox


lw309637554 commented on a change in pull request #3240:
URL: https://github.com/apache/hudi/pull/3240#discussion_r668356708







[jira] [Commented] (HUDI-2151) Make performant out-of-box configs

2021-07-12 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379488#comment-17379488
 ] 

Vinoth Chandar commented on HUDI-2151:
--

Is this correct? 5s?

 
{code:java}
public static final ConfigProperty<String> LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP = ConfigProperty
    .key(LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY)
    .defaultValue(String.valueOf(5000L))
    .sinceVersion("0.8.0")
    .withDocumentation("Maximum amount of time to wait between retries by lock provider client. This bounds"
        + " the maximum delay from the exponential backoff.");
{code}

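To make the question concrete, here is a minimal sketch of an exponential backoff capped by a maximum wait, as the documentation string above describes. `BoundedBackoffSketch` is a hypothetical helper, not Hudi's lock provider code, and the 1000 ms initial wait used in the usage note is an assumption; only the 5000 ms cap comes from the config above.

```java
// Hypothetical helper, not part of Hudi's lock provider API.
class BoundedBackoffSketch {

  // Returns the wait before retry number `attempt` (0-based): the initial wait,
  // doubled once per attempt, and capped at maxWaitMillis.
  static long waitMillis(long initialWaitMillis, long maxWaitMillis, int attempt) {
    long wait = initialWaitMillis;
    // Stop doubling once the cap is reached so the value cannot keep growing.
    for (int i = 0; i < attempt && wait < maxWaitMillis; i++) {
      wait *= 2;
    }
    return Math.min(wait, maxWaitMillis);
  }
}
```

Assuming an initial wait of 1000 ms, the waits would be 1000, 2000 and 4000 ms, and then 5000 ms for every later retry, so a 5 s default caps the delay after only a few attempts.
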
> Make performant out-of-box configs
> --
>
> Key: HUDI-2151
> URL: https://issues.apache.org/jira/browse/HUDI-2151
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Code Cleanup, Docs
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> We have quite a few configs which deliver better performance or usability 
> but are guarded by flags. 
>  This is to identify them, change them, test (functionally, perf) and make 
> them the default.
>  
> We need to ensure we also capture all the backwards compatibility issues that 
> can arise.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-2144) Offline clustering(independent sparkJob) will cause insert action losing data

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379487#comment-17379487
 ] 

ASF GitHub Bot commented on HUDI-2144:
--

lw309637554 commented on a change in pull request #3240:
URL: https://github.com/apache/hudi/pull/3240#discussion_r668354266



##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
##
@@ -146,7 +146,7 @@ private int addUpdateBucket(String partitionPath, String fileIdHint) {
    * @return smallFiles not in clustering
    */
   private List<SmallFile> filterSmallFilesInClustering(final Set<String> pendingClusteringFileGroupsId, final List<SmallFile> smallFiles) {
-    if (this.config.isClusteringEnabled()) {

Review comment:
   @satishkotha @zhangyue19921010
   To begin with, we have two configs for clustering. If 
ASYNC_CLUSTERING_ENABLE_OPT_KEY is set, it will be OK.
 public boolean isAsyncClusteringEnabled() {
   return Boolean.parseBoolean(props.getProperty(HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE_OPT_KEY));
 }
   
 public boolean isClusteringEnabled() {
   // TODO: future support async clustering
   return inlineClusteringEnabled() || isAsyncClusteringEnabled();
 }
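   The behavior of the two checks quoted above can be sketched with plain `java.util.Properties`. `ClusteringFlagsSketch` is a hypothetical mirror of that logic, not Hudi's config class, and the async key name `hoodie.clustering.async.enabled` is an assumption; `hoodie.clustering.inline` is the key named in the issue.

```java
import java.util.Properties;

// Hypothetical mirror of the two checks quoted above; not Hudi's config class.
class ClusteringFlagsSketch {
  static final String INLINE_KEY = "hoodie.clustering.inline";
  // Assumed key name for the async flag.
  static final String ASYNC_KEY = "hoodie.clustering.async.enabled";

  static boolean isClusteringEnabled(Properties props) {
    // Boolean.parseBoolean(null) is false, so an unset key counts as disabled.
    return Boolean.parseBoolean(props.getProperty(INLINE_KEY))
        || Boolean.parseBoolean(props.getProperty(ASYNC_KEY));
  }
}
```

   A writer that sets neither key, as in the offline clustering setup from the issue, gets `false` here, which is why a guard on this check would skip the small-file filter unless one of the flags is set.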







[GitHub] [hudi] lw309637554 commented on a change in pull request #3240: [HUDI-2144]Bug-Fix:Offline clustering(HoodieClusteringJob) will cause insert action losing data

2021-07-12 Thread GitBox


lw309637554 commented on a change in pull request #3240:
URL: https://github.com/apache/hudi/pull/3240#discussion_r668354266







[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379436#comment-17379436
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   




> Add support to disable meta column to BulkInsert Row Writer path
> 
>
> Key: HUDI-2161
> URL: https://issues.apache.org/jira/browse/HUDI-2161
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>
> Objective here is to disable all meta columns so as to avoid storage cost. 
> Also, some benefits could be seen in write latency with row writer path as no 
> special handling is required at RowCreateHandle layer. 





[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   




[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379432#comment-17379432
 ] 

ASF GitHub Bot commented on HUDI-1828:
--

codecov-commenter edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015








> Ensure All Tests Pass with ORC format
> -
>
> Key: HUDI-1828
> URL: https://issues.apache.org/jira/browse/HUDI-1828
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Storage Management
>Reporter: Teresa Kang
>Priority: Major
>  Labels: pull-request-available
>
> Run all tests with HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT=ORC, ensure all 
> tests pass.





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3237: [HUDI-1828] Update unit tests to support ORC as the base file format

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015










[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379429#comment-17379429
 ] 

ASF GitHub Bot commented on HUDI-1828:
--

codecov-commenter edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3237](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (2619292) into 
[master](https://codecov.io/gh/apache/hudi/commit/2b21ae1775aeb108a4b0e3f89889651a19f93b2f?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (2b21ae1) will **decrease** coverage by `20.04%`.
   > The diff coverage is `40.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3237/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
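   @@             Coverage Diff              @@
   ##             master    #3237       +/-   ##
   =============================================
   - Coverage     47.57%   27.53%   -20.05%
   + Complexity     5481     1292     -4189
   =============================================
     Files           924      385      -539
     Lines         41194    15218    -25976
     Branches       4133     1318     -2815
   =============================================
   - Hits          19599     4190    -15409
   + Misses        19853    10724     -9129
   + Partials       1742      304     -1438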
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.93% <40.00%> (-13.65%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (+1.25%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/io/storage/HoodieOrcWriter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL3N0b3JhZ2UvSG9vZGllT3JjV3JpdGVyLmphdmE=)
 | `0.00% <ø> (-71.88%)` | :arrow_down: |
   | 
[.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==)
 | `94.64% <40.00%> (-5.36%)` | :arrow_down: |
   | 
[...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3237: [HUDI-1828] Update unit tests to support ORC as the base file format

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015



[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together

2021-07-12 Thread Nishith Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379423#comment-17379423
 ] 

Nishith Agarwal commented on HUDI-2159:
---

Thanks for the detailed analysis [~pwason]. I think it is definitely worth 
solving (1) for the 0.9.0 release. This is a legitimate situation that can 
surface, especially because as users schedule ingestion at a lower frequency 
there are more chances of such collisions.

For (2), since it is more of a perf degradation in cases of failures, we can 
address this right after 0.9 by landing the tailing timeline based on 
completion time. 

> Supporting Clustering and Metadata Table together
> -
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Prashant Wason
>Assignee: Prashant Wason
>Priority: Blocker
> Fix For: 0.9.0
>
>
> I am testing clustering support for metadata enabled table and found a few 
> issues.
> *Setup*
> Pipeline 1: Ingestion pipeline with Metadata Table enabled. Runs every 30 
> mins. 
> Pipeline 2: Clustering pipeline with long running jobs (3-4 hours)
> Pipeline 3: Another clustering pipeline with long running jobs (3-4 hours)
>  
> *Issue #1: Parallel commits on Metadata Table*
> Assume the Clustering pipeline is completing T5.replacecommit and the ingestion 
> pipeline is completing T10.commit. Metadata Table will be synced at an instant 
>  Now both pipelines will call syncMetadataTable(), which will do the 
> following:
>  # Find all un-synced instants from the dataset (T5, T6 ... T10)
>  # Read each instant and perform a deltacommit on the Metadata Table with the 
> same timestamp as the instant.
> There is a chance that two processes perform a deltacommit at T5 on the 
> metadata table and one will fail (instant file already exists). This will 
> raise an exception and be detected as a pipeline failure, leading to 
> false-positive alerts.
>  
> *Issue #2: No archiving/rollback support for failed clustering operations*
> If a clustering operation fails, it leaves a left-over 
> T5.replacecommit.inflight. There is no automated way to roll back or archive 
> these. Since clustering is generally a long-running operation and may be run 
> through multiple pipelines at the same time, automated rollback of left-over 
> inflights doesn't work, as we cannot be sure that the process is dead.
> Metadata Table sync only works in completion order. So if 
> T5.replacecommit.inflight is left over, the Metadata Table will not sync beyond 
> T5, causing a large number of LogBlocks to pile up, which will have to be 
> merged in memory, leading to deteriorating performance.
>  
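The completion-order constraint described in Issue #2 can be illustrated with a small model. This is only a sketch: `CompletionOrderSyncSketch` and its `Instant` class are hypothetical and do not use Hudi's timeline API. It assumes a timeline is an ordered list of instants, each either completed or still inflight, and that sync must stop at the first inflight one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a timeline; this is not Hudi's timeline API.
class CompletionOrderSyncSketch {

  static final class Instant {
    final String timestamp;
    final boolean completed;

    Instant(String timestamp, boolean completed) {
      this.timestamp = timestamp;
      this.completed = completed;
    }
  }

  // Returns the timestamps that can be synced: the completed instants, in timeline
  // order, up to (but not including) the first instant that is still inflight.
  static List<String> syncableInstants(List<Instant> timeline) {
    List<String> syncable = new ArrayList<>();
    for (Instant instant : timeline) {
      if (!instant.completed) {
        break; // a left-over inflight blocks everything after it
      }
      syncable.add(instant.timestamp);
    }
    return syncable;
  }
}
```

For a timeline of T4 (completed), T5 (inflight), T6 and T10 (both completed), only T4 is syncable in this model. T6 and T10 stay un-synced for as long as the T5 inflight remains, which matches the pile-up the issue describes.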





[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379422#comment-17379422
 ] 

ASF GitHub Bot commented on HUDI-1828:
--

codecov-commenter edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3237](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (2619292) into 
[master](https://codecov.io/gh/apache/hudi/commit/2b21ae1775aeb108a4b0e3f89889651a19f93b2f?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (2b21ae1) will **decrease** coverage by `31.65%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3237/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@  Coverage Diff  @@
   ## master#3237   +/-   ##
   =
   - Coverage 47.57%   15.91%   -31.66% 
   + Complexity 5481  493 -4988 
   =
 Files   924  283  -641 
 Lines 4119411710-29484 
 Branches   4133  961 -3172 
   =
   - Hits  19599 1864-17735 
   + Misses19853 9683-10170 
   + Partials   1742  163 -1579 
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <ø> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (+1.25%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/io/storage/HoodieOrcWriter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL3N0b3JhZ2UvSG9vZGllT3JjV3JpdGVyLmphdmE=)
 | `0.00% <ø> (-71.88%)` | :arrow_down: |
   | 
[...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3237: [HUDI-1828] Update unit tests to support ORC as the base file format

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876129015


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3237](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (2619292) into 
[master](https://codecov.io/gh/apache/hudi/commit/2b21ae1775aeb108a4b0e3f89889651a19f93b2f?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (2b21ae1) will **decrease** coverage by `31.65%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3237/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
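   @@              Coverage Diff              @@
   ##             master    #3237       +/-   ##
   =============================================
   - Coverage     47.57%   15.91%   -31.66%
   + Complexity     5481      493     -4988
   =============================================
     Files           924      283      -641
     Lines         41194    11710    -29484
     Branches       4133      961     -3172
   =============================================
   - Hits          19599     1864    -17735
   + Misses        19853     9683    -10170
   + Partials       1742      163     -1579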
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <ø> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (+1.25%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3237?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/io/storage/HoodieOrcWriter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL3N0b3JhZ2UvSG9vZGllT3JjV3JpdGVyLmphdmE=)
 | `0.00% <ø> (-71.88%)` | :arrow_down: |
   | 
[...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3237/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379421#comment-17379421
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * f0dd67bb360fe3fd275264127d50a9feb881479a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=851)
 
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add support to disable meta column to BulkInsert Row Writer path
> 
>
>                 Key: HUDI-2161
>                 URL: https://issues.apache.org/jira/browse/HUDI-2161
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: pull-request-available
>
> The objective here is to disable all meta columns to avoid their storage cost.
> The row writer path may also see lower write latency, since no special
> handling is required at the RowCreateHandle layer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * f0dd67bb360fe3fd275264127d50a9feb881479a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=851)
 
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=868)
 
   
   




[jira] [Commented] (HUDI-2161) Add support to disable meta column to BulkInsert Row Writer path

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379419#comment-17379419
 ] 

ASF GitHub Bot commented on HUDI-2161:
--

hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * f0dd67bb360fe3fd275264127d50a9feb881479a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=851)
 
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 UNKNOWN
   
   


> Add support to disable meta column to BulkInsert Row Writer path
> 
>
>                 Key: HUDI-2161
>                 URL: https://issues.apache.org/jira/browse/HUDI-2161
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: pull-request-available
>
> The objective here is to disable all meta columns to avoid their storage cost.
> The row writer path may also see lower write latency, since no special
> handling is required at the RowCreateHandle layer.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3247: [HUDI-2161] Adding support to disable meta columns with bulk insert operation

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3247:
URL: https://github.com/apache/hudi/pull/3247#issuecomment-876918931


   
   ## CI report:
   
   * f0dd67bb360fe3fd275264127d50a9feb881479a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=851)
 
   * 860eabd8a3d02e8709874cb67788e61d0d43d9c5 UNKNOWN
   
   




[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379410#comment-17379410
 ] 

ASF GitHub Bot commented on HUDI-1828:
--

hudi-bot edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876059246


   
   ## CI report:
   
   * 2619292015a49cb34ce484f5ecd2843e97000e52 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=867)
 
   
   


> Ensure All Tests Pass with ORC format
> -
>
>                 Key: HUDI-1828
>                 URL: https://issues.apache.org/jira/browse/HUDI-1828
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Storage Management
>            Reporter: Teresa Kang
>            Priority: Major
>              Labels: pull-request-available
>
> Run all tests with HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT=ORC and ensure
> that they all pass.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3237: [HUDI-1828] Update unit tests to support ORC as the base file format

2021-07-12 Thread GitBox


hudi-bot edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876059246


   
   ## CI report:
   
   * 2619292015a49cb34ce484f5ecd2843e97000e52 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=867)
 
   
   




[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (01525c9) into 
[master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (ca440cc) will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
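   @@            Coverage Diff            @@
   ##             master    #3250   +/-   ##
   =========================================
     Coverage     47.71%   47.72%
   - Complexity     5527     5528    +1
   =========================================
     Files           934      934
     Lines         41456    41457    +1
     Branches       4167     4167
   =========================================
   + Hits          19782    19785    +3
   + Misses        19916    19914    -2
     Partials       1758     1758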
   ```
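
The Coverage row in the diff above can be cross-checked against the Hits and Lines rows. The following is only a sketch, assuming coverage is hits divided by lines and that the percentage is truncated (not rounded) to two decimals; that truncation rule is an inference from the figures in this report, not something Codecov states here, and `coverage_percent` is a hypothetical helper name.

```python
def coverage_percent(hits: int, lines: int) -> str:
    """Return hits/lines as a percentage truncated to two decimals.

    Assumption: truncation rather than rounding, inferred from the
    report figures; this is not a documented Codecov rule.
    """
    if lines <= 0:
        raise ValueError("lines must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    hundredths = hits * 10_000 // lines
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


# Hits and Lines taken from the master and #3250 columns of the report.
print(coverage_percent(19782, 41456))  # master -> 47.71%
print(coverage_percent(19785, 41457))  # #3250  -> 47.72%
```

Under the same assumption, the Hits, Misses, and Partials rows sum to the Lines row (19782 + 19916 + 1758 = 41456 for master), which is a quick way to confirm that the columns of a flattened report have been read correctly.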
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.46% <100.00%> (+0.01%)` | :arrow_up: |
   | hudicommon | `48.57% <ø> (-0.01%)` | :arrow_down: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.55% <ø> (ø)` | |
   | hudisparkdatasource | `67.37% <ø> (+0.05%)` | :arrow_up: |
   | hudisync | `54.51% <ø> (ø)` | |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.26% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `43.07% <100.00%> (+0.26%)` | :arrow_up: |
   | 
[...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==)
 | `78.90% <0.00%> (-0.79%)` | :arrow_down: |
   | 
[...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=)
 | `29.60% <0.00%> (+1.60%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=continue_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=footer_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
 Last update 
[ca440cc...01525c9](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=lastupdated_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
 Read the [comment 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (01525c9) into 
[master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (ca440cc) will **decrease** coverage by `3.38%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
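   @@             Coverage Diff              @@
   ##             master    #3250      +/-   ##
   ============================================
   - Coverage     47.71%   44.33%   -3.39%
   + Complexity     5527     4930     -597
   ============================================
     Files           934      860      -74
     Lines         41456    37415    -4041
     Branches       4167     3496     -671
   ============================================
   - Hits          19782    16589    -3193
   + Misses        19916    19555     -361
   + Partials       1758     1271     -487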
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.46% <100.00%> (+0.01%)` | :arrow_up: |
   | hudicommon | `48.57% <ø> (-0.01%)` | :arrow_down: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.55% <ø> (ø)` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.26% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `43.07% <100.00%> (+0.26%)` | :arrow_up: |
   | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...main/java/org/apache/hudi/hive/HiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN5bmNDb25maWcuamF2YQ==)
 | `0.00% <0.00%> (-98.08%)` | :arrow_down: |
   | 
[...he/hudi/hive/replication/GlobalHiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvcmVwbGljYXRpb24vR2xvYmFsSGl2ZVN5bmNDb25maWcuamF2YQ==)
 | `0.00% <0.00%> (-95.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (01525c9) into 
[master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (ca440cc) will **decrease** coverage by `3.62%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
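   @@             Coverage Diff              @@
   ##             master    #3250      +/-   ##
   ============================================
   - Coverage     47.71%   44.09%   -3.63%
   + Complexity     5527     4868     -659
   ============================================
     Files           934      854      -80
     Lines         41456    36964    -4492
     Branches       4167     3472     -695
   ============================================
   - Hits          19782    16300    -3482
   + Misses        19916    19412     -504
   + Partials       1758     1252     -506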
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.46% <100.00%> (+0.01%)` | :arrow_up: |
   | hudicommon | `48.57% <ø> (-0.01%)` | :arrow_down: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.55% <ø> (ø)` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `43.07% <100.00%> (+0.26%)` | :arrow_up: |
   | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...main/java/org/apache/hudi/hive/HiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN5bmNDb25maWcuamF2YQ==)
 | `0.00% <0.00%> (-98.08%)` | :arrow_down: |
   | 
[...he/hudi/hive/replication/GlobalHiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvcmVwbGljYXRpb24vR2xvYmFsSGl2ZVN5bmNDb25maWcuamF2YQ==)
 | `0.00% <0.00%> (-95.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (01525c9) into 
[master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (ca440cc) will **decrease** coverage by `17.68%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
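   @@              Coverage Diff              @@
   ##             master    #3250       +/-   ##
   =============================================
   - Coverage     47.71%   30.03%   -17.69%
   + Complexity     5527     1562     -3965
   =============================================
     Files           934      421      -513
     Lines         41456    16986    -24470
     Branches       4167     1561     -2606
   =============================================
   - Hits          19782     5102    -14680
   + Misses        19916    11481     -8435
   + Partials       1758      403     -1355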
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.93% <0.00%> (-13.52%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `51.55% <ø> (ø)` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `0.00% <0.00%> (-42.81%)` | :arrow_down: |
   | 
[...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3250: [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config

2021-07-12 Thread GitBox


codecov-commenter edited a comment on pull request #3250:
URL: https://github.com/apache/hudi/pull/3250#issuecomment-877313010


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3250](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (01525c9) into 
[master](https://codecov.io/gh/apache/hudi/commit/ca440ccf881c67c308e72beaf6a561e12e1b4da2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (ca440cc) will **decrease** coverage by `20.18%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3250/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
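   @@              Coverage Diff              @@
   ##             master    #3250       +/-   ##
   =============================================
   - Coverage     47.71%   27.53%   -20.19%
   + Complexity     5527     1291     -4236
   =============================================
     Files           934      385      -549
     Lines         41456    15215    -26241
     Branches       4167     1316     -2851
   =============================================
   - Hits          19782     4189    -15593
   + Misses        19916    10723     -9193
   + Partials       1758      303     -1455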
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.93% <0.00%> (-13.52%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3250?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.81%)` | :arrow_down: |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3250/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[jira] [Commented] (HUDI-1241) Generate config docs automatically

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379394#comment-17379394
 ] 

ASF GitHub Bot commented on HUDI-1241:
--------------------------------------

hudi-bot edited a comment on pull request #3260:
URL: https://github.com/apache/hudi/pull/3260#issuecomment-878357134






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Generate config docs automatically
> ----------------------------------
>
>                 Key: HUDI-1241
>                 URL: https://issues.apache.org/jira/browse/HUDI-1241
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Code Cleanup
>            Reporter: sivabalan narayanan
>            Assignee: Sagar Sumit
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Now that we have `HoodieConfig` and `ConfigProperty`, can we write a small 
> script that builds a given branch or git SHA, uses reflection to load 
> all the `HoodieConfig` classes, and automatically generates a .md file for each 
> `ConfigProperty` defined?
>  
> We will then render the .md file through the site, as always.
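
The reflection step proposed in the issue could look roughly like the sketch below. It is a minimal illustration only, not the script from pull request #3260: `DemoProperty` and `DemoWriteConfig` are hypothetical stand-ins for Hudi's `ConfigProperty` and a `HoodieConfig` subclass, and the Markdown table layout is an assumption. The sketch scans a class for static fields of the property type and emits one table row per property.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ConfigDocSketch {

  // Hypothetical stand-in for Hudi's ConfigProperty; NOT the real class.
  static final class DemoProperty {
    final String key;
    final String defaultValue; // null means "no default"
    final String doc;

    DemoProperty(String key, String defaultValue, String doc) {
      this.key = key;
      this.defaultValue = defaultValue;
      this.doc = doc;
    }
  }

  // Hypothetical config holder, playing the role of a HoodieConfig subclass.
  static final class DemoWriteConfig {
    public static final DemoProperty TABLE_NAME =
        new DemoProperty("demo.table.name", null, "Name of the table.");
    public static final DemoProperty PARALLELISM =
        new DemoProperty("demo.write.parallelism", "200", "Number of parallel write tasks.");
  }

  // Collect every static field of type DemoProperty declared on the holder class.
  static List<DemoProperty> collect(Class<?> holder) {
    List<DemoProperty> found = new ArrayList<>();
    for (Field field : holder.getDeclaredFields()) {
      boolean isStatic = Modifier.isStatic(field.getModifiers());
      if (isStatic && DemoProperty.class.isAssignableFrom(field.getType())) {
        try {
          field.setAccessible(true);
          found.add((DemoProperty) field.get(null)); // null receiver: static field
        } catch (IllegalAccessException e) {
          throw new IllegalStateException("Cannot read " + field.getName(), e);
        }
      }
    }
    // getDeclaredFields() gives no ordering guarantee, so sort for stable output.
    found.sort(Comparator.comparing((DemoProperty p) -> p.key));
    return found;
  }

  // Render one Markdown section with a table row per property.
  static String toMarkdown(Class<?> holder) {
    StringBuilder md = new StringBuilder();
    md.append("## ").append(holder.getSimpleName()).append("\n\n");
    md.append("| Key | Default | Description |\n|---|---|---|\n");
    for (DemoProperty p : collect(holder)) {
      String shownDefault = p.defaultValue == null ? "(none)" : "`" + p.defaultValue + "`";
      md.append("| `").append(p.key).append("` | ")
          .append(shownDefault).append(" | ")
          .append(p.doc).append(" |\n");
    }
    return md.toString();
  }

  public static void main(String[] args) {
    System.out.print(toMarkdown(DemoWriteConfig.class));
  }
}
```

A real generator would also need to discover the config classes on the classpath of the built branch and write the result to the site's docs directory; both steps are left out here.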



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-1828) Ensure All Tests Pass with ORC format

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379395#comment-17379395
 ] 

ASF GitHub Bot commented on HUDI-1828:
--------------------------------------

hudi-bot edited a comment on pull request #3237:
URL: https://github.com/apache/hudi/pull/3237#issuecomment-876059246








> Ensure All Tests Pass with ORC format
> -------------------------------------
>
>                 Key: HUDI-1828
>                 URL: https://issues.apache.org/jira/browse/HUDI-1828
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Storage Management
>            Reporter: Teresa Kang
>            Priority: Major
>              Labels: pull-request-available
>
> Run all tests with HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT=ORC and ensure 
> that they all pass.





[jira] [Commented] (HUDI-2063) Add Doc For Spark Sql Integrates With Hudi

2021-07-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379393#comment-17379393
 ] 

ASF GitHub Bot commented on HUDI-2063:
--------------------------------------

leesf commented on pull request #3140:
URL: https://github.com/apache/hudi/pull/3140#issuecomment-878335579








> Add Doc For Spark Sql Integrates With Hudi
> ------------------------------------------
>
>                 Key: HUDI-2063
>                 URL: https://issues.apache.org/jira/browse/HUDI-2063
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Docs
>            Reporter: pengzhiwei
>            Assignee: pengzhiwei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>





