[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991519008


   
   ## CI report:
   
   * e68824ee98d8dd75f0b8940da24bc8a52301fdd7 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4200)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
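
The CI report in the message above follows a regular shape: a bullet with a commit hash, followed either by `UNKNOWN` or by one or more `[STATUS](...buildId=N)` links that the mail archive wraps onto their own lines. A minimal sketch of reading that shape, assuming only what is visible in these messages (the names `CiEntry` and `parse_ci_report` are hypothetical and not part of hudi-bot):

```python
import re
from typing import List, NamedTuple, Optional


class CiEntry(NamedTuple):
    commit: str              # full 40-character commit hash
    status: str              # status word as printed, e.g. FAILURE or UNKNOWN
    build_id: Optional[int]  # Azure buildId, or None when no build is linked


# Bullet line such as "* <hash> Azure:" or "* <hash> UNKNOWN".
COMMIT_RE = re.compile(r"^\*\s+([0-9a-f]{40})\b(.*)$")
# Markdown link such as "[FAILURE](https://...?buildId=4195)".
BUILD_RE = re.compile(r"\[([A-Z]+)\]\(\S*?buildId=(\d+)\)")


def parse_ci_report(text: str) -> List[CiEntry]:
    """Collect one entry per build link (or per UNKNOWN commit)."""
    entries: List[CiEntry] = []
    commit = None
    for raw in text.splitlines():
        line = raw.strip()
        match = COMMIT_RE.match(line)
        if match:
            commit = match.group(1)
            if match.group(2).strip() == "UNKNOWN":
                entries.append(CiEntry(commit, "UNKNOWN", None))
        if commit is None:
            continue  # skip anything before the first commit bullet
        for status, build_id in BUILD_RE.findall(line):
            entries.append(CiEntry(commit, status, int(build_id)))
    return entries


# Sample copied from one of the reports in this digest.
SAMPLE = """\
   ## CI report:

   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure:
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure:
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)

   * e68824ee98d8dd75f0b8940da24bc8a52301fdd7 Azure:
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4200)
"""

if __name__ == "__main__":
    for entry in parse_ci_report(SAMPLE):
        print(entry)
```

Each build link becomes its own entry, so a commit that was re-run with `@hudi-bot run azure` appears once per build.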




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991498183


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   * e68824ee98d8dd75f0b8940da24bc8a52301fdd7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4200)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4286: [WIP][HUDI-2955] Upgrade Hadoop to 3.3.1, Hive to 3.1.2, HBase to 2.4.8

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4286:
URL: https://github.com/apache/hudi/pull/4286#issuecomment-991505981


   
   ## CI report:
   
   * 4a459976c56d12c1beb46284862113b866bba284 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4286: [WIP][HUDI-2955] Upgrade Hadoop to 3.3.1, Hive to 3.1.2, HBase to 2.4.8

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4286:
URL: https://github.com/apache/hudi/pull/4286#issuecomment-991506999


   
   ## CI report:
   
   * 4a459976c56d12c1beb46284862113b866bba284 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4202)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4286: [WIP][HUDI-2955] Upgrade Hadoop to 3.3.1, Hive to 3.1.2, HBase to 2.4.8

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4286:
URL: https://github.com/apache/hudi/pull/4286#issuecomment-991505981


   
   ## CI report:
   
   * 4a459976c56d12c1beb46284862113b866bba284 UNKNOWN
   
   




[jira] [Updated] (HUDI-2955) Upgrade Hadoop to 3.3.x

2021-12-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-2955:
-
Labels: pull-request-available  (was: )

> Upgrade Hadoop to 3.3.x
> ---
>
> Key: HUDI-2955
> URL: https://issues.apache.org/jira/browse/HUDI-2955
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Alexey Kudinkin
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-12-07 at 2.32.51 PM.png
>
>
> According to the Hadoop compatibility matrix, this is a prerequisite for
> upgrading to JDK 11:
> !Screen Shot 2021-12-07 at 2.32.51 PM.png|width=938,height=230!
> [https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions]
>  
> *Upgrading Hadoop from 2.x to 3.x*
> [https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.x+to+3.x+Upgrade+Efforts]
> Everything relevant to us seems to be in good shape, except Spark 2.2/2.3.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [hudi] alexeykudinkin opened a new pull request #4286: [WIP][HUDI-2955] Upgrade Hadoop to 3.3.1, Hive to 3.1.2, HBase to 2.4.8

2021-12-10 Thread GitBox


alexeykudinkin opened a new pull request #4286:
URL: https://github.com/apache/hudi/pull/4286


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before 
opening a pull request.*
   
   ## What is the purpose of the pull request
   
   Upgrading
 - Hadoop to 3.3.1
 - Hive to 3.1.2 (only Hive 3.x is compatible with Hadoop 3.x)
 - HBase to 2.4.8 (HBase 2.4.x does not seem to be compatible with Hive 2.x)
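
The two compatibility notes in the list above can be written down as a small check. This is only a sketch of the constraints as stated in this description (Hadoop 3.x needs Hive 3.x, and HBase 2.4.x does not appear to work with Hive 2.x); the function names are hypothetical and the rules are not an authoritative compatibility matrix:

```python
from typing import List


def major(version: str) -> int:
    """Return the major component of a dotted version such as '3.3.1'."""
    return int(version.split(".")[0])


def check_versions(hadoop: str, hive: str, hbase: str) -> List[str]:
    """Return a list of problems; an empty list means no known conflict."""
    problems: List[str] = []
    # Assumption taken from the PR text: only Hive 3.x works with Hadoop 3.x.
    if major(hadoop) >= 3 and major(hive) < 3:
        problems.append(f"Hadoop {hadoop} needs Hive 3.x, got Hive {hive}")
    # Assumption taken from the PR text: HBase 2.4.x seems not to work with Hive 2.x.
    if hbase.startswith("2.4.") and major(hive) == 2:
        problems.append(f"HBase {hbase} seems incompatible with Hive {hive}")
    return problems


if __name__ == "__main__":
    # The combination proposed in this pull request.
    print(check_versions("3.3.1", "3.1.2", "2.4.8"))
```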
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[GitHub] [hudi] hudi-bot commented on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991502571


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4197)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4201)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991482008


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4197)
 
   
   




[GitHub] [hudi] danny0405 commented on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


danny0405 commented on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991502518


   @hudi-bot run azure






[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991498183


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   * e68824ee98d8dd75f0b8940da24bc8a52301fdd7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4200)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991497328


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   * e68824ee98d8dd75f0b8940da24bc8a52301fdd7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991494032


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991497328


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   * e68824ee98d8dd75f0b8940da24bc8a52301fdd7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991471002


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991494032


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   
   




[jira] [Resolved] (HUDI-2985) Shade jackson for hudi flink bundle jar

2021-12-10 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved HUDI-2985.
--

> Shade jackson for hudi flink bundle jar
> ---
>
> Key: HUDI-2985
> URL: https://issues.apache.org/jira/browse/HUDI-2985
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>






[jira] [Commented] (HUDI-2985) Shade jackson for hudi flink bundle jar

2021-12-10 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457549#comment-17457549
 ] 

Danny Chen commented on HUDI-2985:
--

Fixed via master branch: 2dcb3f0062074b59676bb65b0afee86994571fd6

> Shade jackson for hudi flink bundle jar
> ---
>
> Key: HUDI-2985
> URL: https://issues.apache.org/jira/browse/HUDI-2985
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>






[GitHub] [hudi] danny0405 merged pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


danny0405 merged pull request #4284:
URL: https://github.com/apache/hudi/pull/4284


   






[hudi] branch master updated (9bdcee0 -> 2dcb3f0)

2021-12-10 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from 9bdcee0  [HUDI-2959] Fix the thread leak of cleaning service (#4252)
 add 2dcb3f0  [HUDI-2985] Shade jackson for hudi flink bundle jar (#4284)

No new revisions were added by this update.

Summary of changes:
 .../java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java | 3 +++
 packaging/hudi-flink-bundle/pom.xml   | 4 
 2 files changed, 7 insertions(+)


[GitHub] [hudi] hudi-bot removed a comment on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991467574


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4197)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991482008


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4197)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4284:
URL: https://github.com/apache/hudi/pull/4284#issuecomment-991467569


   
   ## CI report:
   
   * 5be45ca38379285635a0d1d6cdbc2a17c21193b7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4196)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4284:
URL: https://github.com/apache/hudi/pull/4284#issuecomment-991472721


   
   ## CI report:
   
   * 5be45ca38379285635a0d1d6cdbc2a17c21193b7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4196)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991470753


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991471002


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4198)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991470753


   
   ## CI report:
   
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991460023


   
   ## CI report:
   
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 
   
   




[GitHub] [hudi] YannByron commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


YannByron commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991470760


   @hudi-bot run azure






[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991468392


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * 432c2aff71bf918d22a3c6e81f23b11f5297d3b0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4194)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991453789


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   * 432c2aff71bf918d22a3c6e81f23b11f5297d3b0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4194)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4284:
URL: https://github.com/apache/hudi/pull/4284#issuecomment-991467569


   
   ## CI report:
   
   * 5be45ca38379285635a0d1d6cdbc2a17c21193b7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4196)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4284:
URL: https://github.com/apache/hudi/pull/4284#issuecomment-991466435


   
   ## CI report:
   
   * 5be45ca38379285635a0d1d6cdbc2a17c21193b7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991466458


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991467574


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4197)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4284:
URL: https://github.com/apache/hudi/pull/4284#issuecomment-991466435


   
   ## CI report:
   
   * 5be45ca38379285635a0d1d6cdbc2a17c21193b7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4285:
URL: https://github.com/apache/hudi/pull/4285#issuecomment-991466458


   
   ## CI report:
   
   * 7d18ceca4fb530966ba81e2c954bdf7885567839 UNKNOWN
   
   




[jira] [Updated] (HUDI-2984) Implement #close for AbstractTableFileSystemView

2021-12-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-2984:
-
Labels: pull-request-available  (was: )

> Implement #close for AbstractTableFileSystemView
> 
>
> Key: HUDI-2984
> URL: https://issues.apache.org/jira/browse/HUDI-2984
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core
>Reporter: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HUDI-2985) Shade jackson for hudi flink bundle jar

2021-12-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-2985:
-
Labels: pull-request-available  (was: )

> Shade jackson for hudi flink bundle jar
> ---
>
> Key: HUDI-2985
> URL: https://issues.apache.org/jira/browse/HUDI-2985
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>






[GitHub] [hudi] danny0405 opened a new pull request #4285: [HUDI-2984] Implement #close for AbstractTableFileSystemView

2021-12-10 Thread GitBox


danny0405 opened a new pull request #4285:
URL: https://github.com/apache/hudi/pull/4285


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before 
opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[GitHub] [hudi] danny0405 opened a new pull request #4284: [HUDI-2985] Shade jackson for hudi flink bundle jar

2021-12-10 Thread GitBox


danny0405 opened a new pull request #4284:
URL: https://github.com/apache/hudi/pull/4284


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before 
opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[GitHub] [hudi] york-yu-ctw opened a new issue #4283: [SUPPORT] Data written by hudi 0.10.0 is not able be query by redshift

2021-12-10 Thread GitBox


york-yu-ctw opened a new issue #4283:
URL: https://github.com/apache/hudi/issues/4283


   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
   Yes
   
   - Join the mailing list to engage in conversations and get faster support at 
dev-subscr...@hudi.apache.org.
   
   - If you have triaged this as a bug, then file an 
[issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   After writing data with Hudi 0.10.0, Redshift is no longer able to read any of it.
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Write data to S3 with Hudi 0.10.0
   2. Create a Redshift Spectrum table
   3. Query this table
   
   The Spark write config:
   ```
   df.write
     .format("hudi")
     .option("hoodie.datasource.write.table.type", "MERGE_ON_READ")
     .option("hoodie.datasource.write.partitionpath.field", "dt")
     .option("hoodie.datasource.write.recordkey.field", "uuid")
     .option("hoodie.datasource.write.precombine.field", "time")
     .option("hoodie.datasource.write.hive_style_partitioning", "true")
     .option("hoodie.datasource.write.operation", "insert")
     .option("hoodie.compaction.strategy", "org.apache.hudi.table.action.compact.strategy.UnBoundedCompactionStrategy")
     .option("hoodie.datasource.write.keygenerator.class", "org.apache.hudi.keygen.ComplexAvroKeyGenerator")
     .option("hoodie.table.name", "table1")
     .option("hoodie.insert.shuffle.parallelism", "20")
     .mode("overwrite")
     .save("s3://xx/data");
   ```
   
   The Spectrum table definition:
   ```
   CREATE EXTERNAL TABLE spectrum.biz_game_v3_hudi (
     _hoodie_commit_time    VARCHAR(64),
     _hoodie_record_key     VARCHAR(512),
     _hoodie_partition_path VARCHAR(128),
     uuid                   VARCHAR(45),
     time                   VARCHAR(45)
   )
   PARTITIONED BY
     (dt VARCHAR, appid VARCHAR, region VARCHAR)
   ROW FORMAT SERDE
   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
   STORED AS INPUTFORMAT
   'org.apache.hudi.hadoop.HoodieParquetInputFormat'
   OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
   LOCATION
   's3://x/data';
   ```
   
   I noticed that the instant timestamp format changed from `yyyyMMddHHmmss` to
   `yyyyMMddHHmmssSSS`. When I tried
   `org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat` instead, Redshift
   reported a
   `hudi::ParsedFilename::IsValidCommitTimestamp(std::string(ctx.hudi_commit_timestamp))`
   error. I still have no idea why nothing is returned when using
   `org.apache.hudi.hadoop.HoodieParquetInputFormat`.
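
To illustrate why a longer instant timestamp could trip a reader, here is a minimal sketch. It assumes the two formats are the 14-digit second-precision pattern `yyyyMMddHHmmss` and the 17-digit millisecond-precision pattern `yyyyMMddHHmmssSSS`. The class `InstantTimeCheck` and both methods are hypothetical; Redshift's actual `IsValidCommitTimestamp` logic is not shown in this report and may differ.

```java
import java.util.regex.Pattern;

public class InstantTimeCheck {
    // Assumed layouts: 14 digits (second precision) and 17 digits (millisecond precision).
    private static final Pattern SECONDS = Pattern.compile("\\d{14}");
    private static final Pattern MILLIS = Pattern.compile("\\d{17}");

    // Hypothetical strict reader: accepts only the older 14-digit form.
    static boolean isValidStrict(String ts) {
        return ts != null && SECONDS.matcher(ts).matches();
    }

    // Hypothetical tolerant reader: accepts either form.
    static boolean isValidTolerant(String ts) {
        return ts != null
            && (SECONDS.matcher(ts).matches() || MILLIS.matcher(ts).matches());
    }

    public static void main(String[] args) {
        String older = "20211210120847";     // example 14-digit instant
        String newer = "20211210120847123";  // example 17-digit instant
        System.out.println("strict/older:   " + isValidStrict(older));
        System.out.println("strict/newer:   " + isValidStrict(newer));
        System.out.println("tolerant/newer: " + isValidTolerant(newer));
    }
}
```

A validator written like `isValidStrict` would reject every commit produced by the newer writer, which would be consistent with the error above, though that remains a guess about the reader's behavior.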
   
   
   
   **Expected behavior**
   Upgrading Hudi should not break reads that worked with the lower version.
   
   **Environment Description**
   
   * Hudi version : 0.10.0
   
   * Spark version : 3.1.1
   
   * Hive version :
   
   * Hadoop version : 3.2.1
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   Add any other context about the problem here.
   
   **Stacktrace**
   
   ```Add the stacktrace of the error.```
   
   






[jira] [Created] (HUDI-2985) Shade jackson for hudi flink bundle jar

2021-12-10 Thread Danny Chen (Jira)
Danny Chen created HUDI-2985:


 Summary: Shade jackson for hudi flink bundle jar
 Key: HUDI-2985
 URL: https://issues.apache.org/jira/browse/HUDI-2985
 Project: Apache Hudi
  Issue Type: Task
  Components: Flink Integration
Reporter: Danny Chen
Assignee: Danny Chen
 Fix For: 0.11.0








[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991453766


   
   ## CI report:
   
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991460023


   
   ## CI report:
   
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4195)
 
   
   




[jira] [Created] (HUDI-2984) Implement #close for AbstractTableFileSystemView

2021-12-10 Thread Danny Chen (Jira)
Danny Chen created HUDI-2984:


 Summary: Implement #close for AbstractTableFileSystemView
 Key: HUDI-2984
 URL: https://issues.apache.org/jira/browse/HUDI-2984
 Project: Apache Hudi
  Issue Type: Improvement
  Components: Common Core
Reporter: Danny Chen
 Fix For: 0.11.0








[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991436367


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   * 432c2aff71bf918d22a3c6e81f23b11f5297d3b0 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991453789


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   * 432c2aff71bf918d22a3c6e81f23b11f5297d3b0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4194)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991452976


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991453766


   
   ## CI report:
   
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991452976


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   * 2e84bd7c5f19a16c6db44161658d9dff4cc545a5 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991439751


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   
   




[jira] [Commented] (HUDI-2959) Fix the thread leak of cleaning service

2021-12-10 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457529#comment-17457529
 ] 

Danny Chen commented on HUDI-2959:
--

Fixed via master branch: 9bdcee00c010fd6d7c817ee882550dd78e35ad91

> Fix the thread leak of cleaning service
> ---
>
> Key: HUDI-2959
> URL: https://issues.apache.org/jira/browse/HUDI-2959
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>






[hudi] branch master updated: [HUDI-2959] Fix the thread leak of cleaning service (#4252)

2021-12-10 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 9bdcee0  [HUDI-2959] Fix the thread leak of cleaning service (#4252)
9bdcee0 is described below

commit 9bdcee00c010fd6d7c817ee882550dd78e35ad91
Author: Danny Chan 
AuthorDate: Sat Dec 11 12:08:47 2021 +0800

[HUDI-2959] Fix the thread leak of cleaning service (#4252)
---
 .../org/apache/hudi/async/HoodieAsyncService.java  | 36 +-
 .../hudi/client/AbstractHoodieWriteClient.java |  6 +++-
 .../apache/hudi/client/AsyncCleanerService.java| 14 -
 .../apache/hudi/client/HoodieFlinkWriteClient.java |  6 +++-
 .../java/org/apache/hudi/sink/CleanFunction.java   |  7 +
 5 files changed, 31 insertions(+), 38 deletions(-)

diff --git 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
index 85e0081..f57484d 100644
--- 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
+++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
@@ -29,7 +29,6 @@ import java.util.concurrent.BlockingQueue;
 import java.util.concurrent.CompletableFuture;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
 import java.util.concurrent.LinkedBlockingQueue;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.locks.Condition;
@@ -130,7 +129,7 @@ public abstract class HoodieAsyncService implements 
Serializable {
 future = res.getKey();
 executor = res.getValue();
 started = true;
-monitorThreads(onShutdownCallback);
+shutdownCallback(onShutdownCallback);
   }
 
   /**
@@ -141,34 +140,15 @@ public abstract class HoodieAsyncService implements 
Serializable {
   protected abstract Pair startService();
 
   /**
-   * A monitor thread is started which would trigger a callback if the service 
is shutdown.
+   * Add shutdown callback for the completable future.
* 
-   * @param onShutdownCallback
+   * @param callback The callback
*/
-  private void monitorThreads(Function onShutdownCallback) {
-LOG.info("Submitting monitor thread !!");
-Executors.newSingleThreadExecutor(r -> {
-  Thread t = new Thread(r, "Monitor Thread");
-  t.setDaemon(isRunInDaemonMode());
-  return t;
-}).submit(() -> {
-  boolean error = false;
-  try {
-LOG.info("Monitoring thread(s) !!");
-future.get();
-  } catch (ExecutionException ex) {
-LOG.error("Monitor noticed one or more threads failed. Requesting 
graceful shutdown of other threads", ex);
-error = true;
-  } catch (InterruptedException ie) {
-LOG.error("Got interrupted Monitoring threads", ie);
-error = true;
-  } finally {
-// Mark as shutdown
-shutdown = true;
-if (null != onShutdownCallback) {
-  onShutdownCallback.apply(error);
-}
-shutdown(false);
+  @SuppressWarnings("unchecked")
+  private void shutdownCallback(Function callback) {
+future.whenComplete((resp, error) -> {
+  if (null != callback) {
+callback.apply(null != error);
   }
 });
   }
diff --git 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
index 358e307..18f93fa 100644
--- 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
+++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
@@ -424,7 +424,11 @@ public abstract class AbstractHoodieWriteClient startService() {
+String instantTime = HoodieActiveTimeline.createNewInstantTime();
+LOG.info("Auto cleaning is enabled. Running cleaner async to write 
operation at instant time " + instantTime);
 return Pair.of(CompletableFuture.supplyAsync(() -> {
-  writeClient.clean(cleanInstantTime);
+  writeClient.clean(instantTime);
   return true;
-}), executor);
+}, executor), executor);
   }
 
   public static AsyncCleanerService 
startAsyncCleaningIfEnabled(AbstractHoodieWriteClient writeClient) {
 AsyncCleanerService asyncCleanerService = null;
 if (writeClient.getConfig().isAutoClean() && 
writeClient.getConfig().isAsyncClean()) {
-  String instantTime = HoodieActiveTimeline.createNewInstantTime();
-  LOG.info("Auto cleaning is enabled. Running cleaner async to write 
operation at instant time " + instantTime);
-  asyncCleanerService = new AsyncCleanerService(writeClient, instantTime);
+  asyncCleanerService = 
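
The core of the change above, as far as the visible hunks show, is replacing a dedicated monitor thread with a completion callback on the `CompletableFuture`. A minimal standalone sketch of that pattern follows. The class `ShutdownCallbackSketch` and the helper `runAndReport` are hypothetical, and the generic types on `Function` are an assumption because the diff shows only the raw type.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class ShutdownCallbackSketch {
    // Registers the callback on the future itself, so no extra monitor thread is created.
    // The callback receives true when the future completed exceptionally.
    static void shutdownCallback(CompletableFuture<Boolean> future,
                                 Function<Boolean, Boolean> callback) {
        future.whenComplete((resp, error) -> {
            if (callback != null) {
                callback.apply(error != null);
            }
        });
    }

    // Runs a task on an explicit executor and returns the error flag the callback saw.
    static boolean runAndReport(boolean fail) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
                if (fail) {
                    throw new IllegalStateException("simulated failure");
                }
                return true;
            }, executor); // explicit executor, rather than the common pool
            CompletableFuture<Boolean> seen = new CompletableFuture<>();
            shutdownCallback(future, err -> {
                seen.complete(err);
                return err;
            });
            return seen.get(5, TimeUnit.SECONDS);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("error flag on success: " + runAndReport(false));
        System.out.println("error flag on failure: " + runAndReport(true));
    }
}
```

Because `whenComplete` runs on whichever thread completes the future (or on the caller if the future is already done), no thread is left blocked in `future.get()`, which is the kind of leak the commit title describes.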

[GitHub] [hudi] danny0405 merged pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


danny0405 merged pull request #4252:
URL: https://github.com/apache/hudi/pull/4252


   






[jira] [Resolved] (HUDI-2959) Fix the thread leak of cleaning service

2021-12-10 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved HUDI-2959.
--

> Fix the thread leak of cleaning service
> ---
>
> Key: HUDI-2959
> URL: https://issues.apache.org/jira/browse/HUDI-2959
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>






[GitHub] [hudi] hudi-bot commented on pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4252:
URL: https://github.com/apache/hudi/pull/4252#issuecomment-991444243


   
   ## CI report:
   
   * a9553502b430a7200670e86ebc44078f89cce374 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4189)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4252:
URL: https://github.com/apache/hudi/pull/4252#issuecomment-991418711


   
   ## CI report:
   
   * 41efa313a197fce137e851bc129dd7a941021a8e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4130)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4131)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4132)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4136)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4159)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4160)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4166)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4180)
 
   * a9553502b430a7200670e86ebc44078f89cce374 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4189)
 
   
   




[jira] [Updated] (HUDI-1575) Early detection by periodically checking last written commit & active markers

2021-12-10 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-1575:
-
Description: Check if there are more commits, try to do resolution based on 
its current markers, and abort for a currently running job to avoid using up 
resources and running a concurrent job if we already found a commit that 
happened in the meantime. This can give back so much of the cluster early and 
dramatically lower costs in the cloud.  (was: Check if there are more commits, 
try to do resolution based on its current markers, and abort for a currently 
running job to avoid using up resources and running a concurrent job if we 
already found a commit that happened in the meantime)

> Early detection by periodically checking last written commit & active markers
> -
>
> Key: HUDI-1575
> URL: https://issues.apache.org/jira/browse/HUDI-1575
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Writer Core
>Reporter: Nishith Agarwal
>Assignee: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.11.0
>
>
> Check if there are more commits, try to do resolution based on its current 
> markers, and abort for a currently running job to avoid using up resources 
> and running a concurrent job if we already found a commit that happened in 
> the meantime. This can give back so much of the cluster early and 
> dramatically lower costs in the cloud.





[jira] [Updated] (HUDI-1575) Early detection by periodically checking last written commit & active markers

2021-12-10 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-1575:
-
Summary: Early detection by periodically checking last written commit & 
active markers  (was: Early detection by periodically checking last written 
commit)

> Early detection by periodically checking last written commit & active markers
> -
>
> Key: HUDI-1575
> URL: https://issues.apache.org/jira/browse/HUDI-1575
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: Nishith Agarwal
> Assignee: Vinoth Chandar
> Priority: Blocker
> Fix For: 0.11.0
>
>
> Check if there are more commits, try to do resolution, and abort for a 
> currently running job to avoid using up resources and running a concurrent 
> job if we already found a commit that happened in the meantime





[jira] [Updated] (HUDI-1575) Early detection by periodically checking last written commit & active markers

2021-12-10 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-1575:
-
Description: Check if there are more commits, try to do resolution based on 
its current markers, and abort for a currently running job to avoid using up 
resources and running a concurrent job if we already found a commit that 
happened in the meantime  (was: Check if there are more commits, try to do 
resolution, and abort for a currently running job to avoid using up resources 
and running a concurrent job if we already found a commit that happened in the 
meantime)

> Early detection by periodically checking last written commit & active markers
> -
>
> Key: HUDI-1575
> URL: https://issues.apache.org/jira/browse/HUDI-1575
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: Nishith Agarwal
> Assignee: Vinoth Chandar
> Priority: Blocker
> Fix For: 0.11.0
>
>
> Check if there are more commits, try to do resolution based on its current 
> markers, and abort for a currently running job to avoid using up resources 
> and running a concurrent job if we already found a commit that happened in 
> the meantime





[jira] [Assigned] (HUDI-1575) Early detection by periodically checking last written commit

2021-12-10 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar reassigned HUDI-1575:


Assignee: Vinoth Chandar  (was: Nishith Agarwal)

> Early detection by periodically checking last written commit
> 
>
> Key: HUDI-1575
> URL: https://issues.apache.org/jira/browse/HUDI-1575
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: Nishith Agarwal
> Assignee: Vinoth Chandar
> Priority: Major
>
> Check if there are more commits, try to do resolution, and abort for a 
> currently running job to avoid using up resources and running a concurrent 
> job if we already found a commit that happened in the meantime





[jira] [Updated] (HUDI-1575) Early detection by periodically checking last written commit

2021-12-10 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-1575:
-
Fix Version/s: 0.11.0

> Early detection by periodically checking last written commit
> 
>
> Key: HUDI-1575
> URL: https://issues.apache.org/jira/browse/HUDI-1575
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: Nishith Agarwal
> Assignee: Vinoth Chandar
> Priority: Major
> Fix For: 0.11.0
>
>
> Check if there are more commits, try to do resolution, and abort for a 
> currently running job to avoid using up resources and running a concurrent 
> job if we already found a commit that happened in the meantime





[jira] [Updated] (HUDI-1575) Early detection by periodically checking last written commit

2021-12-10 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-1575:
-
Priority: Blocker  (was: Major)

> Early detection by periodically checking last written commit
> 
>
> Key: HUDI-1575
> URL: https://issues.apache.org/jira/browse/HUDI-1575
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: Nishith Agarwal
> Assignee: Vinoth Chandar
> Priority: Blocker
> Fix For: 0.11.0
>
>
> Check if there are more commits, try to do resolution, and abort for a 
> currently running job to avoid using up resources and running a concurrent 
> job if we already found a commit that happened in the meantime





[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991438889


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991439751


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4191)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991008132


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4270: [HUDI-2811] Support Spark 3.2 and Parquet 1.12.x

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4270:
URL: https://github.com/apache/hudi/pull/4270#issuecomment-991438889


   
   ## CI report:
   
   * 12c1b3c30684dde5c870fe4c26d2992dc9a9b495 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4161)
 
   * ecfe39c51cd1777efa1778da7ac5e94ed8833b4f UNKNOWN
   
   




[hudi] branch master updated (c48a2a1 -> 9797fdf)

2021-12-10 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from c48a2a1  [HUDI-2527] Multi writer test with conflicting async table 
services (#4046)
 add 9797fdf  [HUDI-2974] Make the prefix for metrics name configurable 
(#4274)

No new revisions were added by this update.

Summary of changes:
 .../java/org/apache/hudi/config/HoodieWriteConfig.java |  4 
 .../apache/hudi/config/metrics/HoodieMetricsConfig.java| 14 ++
 .../main/java/org/apache/hudi/metrics/HoodieMetrics.java   |  2 +-
 .../src/main/java/org/apache/hudi/metrics/Metrics.java |  2 +-
 .../deltastreamer/HoodieDeltaStreamerMetrics.java  |  2 +-
 5 files changed, 21 insertions(+), 3 deletions(-)


[GitHub] [hudi] yihua merged pull request #4274: [HUDI-2974] Make the prefix for metrics name configurable

2021-12-10 Thread GitBox


yihua merged pull request #4274:
URL: https://github.com/apache/hudi/pull/4274


   






[GitHub] [hudi] codope commented on a change in pull request #4279: [HUDI-2784] Add a hudi-trino-bundle for Trino

2021-12-10 Thread GitBox


codope commented on a change in pull request #4279:
URL: https://github.com/apache/hudi/pull/4279#discussion_r767083222



##
File path: packaging/hudi-trino-bundle/pom.xml
##
@@ -0,0 +1,273 @@
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+  
+hudi
+org.apache.hudi
+0.11.0-SNAPSHOT
+../../pom.xml
+  
+  4.0.0
+  hudi-trino-bundle
+  jar
+
+  
+true
+${project.parent.basedir}
+  
+
+  
+
+  
+org.apache.rat
+apache-rat-plugin
+  
+  
+org.apache.maven.plugins
+maven-shade-plugin
+${maven-shade-plugin.version}
+
+  
+package
+
+  shade
+
+
+  ${shadeSources}
+  
${project.build.directory}/dependency-reduced-pom.xml
+  
+  
+
+
+
+  true
+
+
+  META-INF/LICENSE
+  target/classes/META-INF/LICENSE
+
+  
+  
+
+  org.apache.hudi:hudi-common
+  org.apache.hudi:hudi-hadoop-mr
+
+  org.apache.parquet:parquet-avro
+  org.apache.avro:avro
+  org.codehaus.jackson:*
+  com.esotericsoftware:kryo-shaded
+  org.objenesis:objenesis
+  com.esotericsoftware:minlog
+  org.apache.hbase:hbase-client
+  org.apache.hbase:hbase-common
+  org.apache.hbase:hbase-protocol
+  org.apache.hbase:hbase-server
+  org.apache.hbase:hbase-annotations
+  org.apache.htrace:htrace-core
+  com.yammer.metrics:metrics-core
+  com.google.guava:guava
+  commons-lang:commons-lang
+  com.google.protobuf:protobuf-java
+
+  
+  
+
+
+  org.apache.avro.
+  
org.apache.hudi.org.apache.avro.
+
+
+  org.codehaus.jackson.
+  
org.apache.hudi.org.codehaus.jackson.
+
+
+  com.esotericsoftware.kryo.
+  
org.apache.hudi.com.esotericsoftware.kryo.
+
+
+  org.objenesis.
+  org.apache.hudi.org.objenesis.
+
+
+  com.esotericsoftware.minlog.
+  
org.apache.hudi.com.esotericsoftware.minlog.
+
+
+  com.yammer.metrics.
+  
org.apache.hudi.com.yammer.metrics.
+
+
+  com.google.common.
+  
${trino.bundle.bootstrap.shade.prefix}com.google.common.
+
+
+  org.apache.commons.lang.
+  
${trino.bundle.bootstrap.shade.prefix}org.apache.commons.lang.
+
+
+  com.google.protobuf.
+  
${trino.bundle.bootstrap.shade.prefix}com.google.protobuf.
+
+  
+  false
+  
+
+  *:*
+  
+META-INF/*.SF
+META-INF/*.DSA
+META-INF/*.RSA
+META-INF/services/javax.*
+  
+
+  
+  ${project.artifactId}-${project.version}
+
+  
+
+  
+
+
+  
+src/main/resources
+  
+  
+src/test/resources
+  
+
+  
+
+  
+
+
+  org.apache.hudi
+  hudi-common
+  ${project.version}
+  
+
+  org.apache.hbase
+  hbase-server
+
+
+  org.apache.hbase
+  hbase-client
+
+  
+
+
+  org.apache.hudi
+  hudi-hadoop-mr-bundle

Review comment:
   I think this may not be strictly necessary. Instead of the whole bundle, 
we can just add `hudi-hadoop-mr`. If we do need the bundle then it already 
comes with hudi-common so we can remove that on #L160.








[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991436367


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   * 432c2aff71bf918d22a3c6e81f23b11f5297d3b0 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991435391


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   
   




[jira] [Updated] (HUDI-2875) Concurrent call to HoodieMergeHandler cause parquet corruption

2021-12-10 Thread ZiyueGuan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZiyueGuan updated HUDI-2875:

Description: 
Problem:

Some corrupted parquet files are generated, and exceptions are thrown when 
they are read.

e.g.

 
Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file 
    at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
    at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
    at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
    at org.apache.hudi.common.util.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:49)
    at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:45)
    at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:112)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    ... 4 more
Caused by: org.apache.parquet.io.ParquetDecodingException: could not read page Page [bytes.size=1054316, valueCount=237, uncompressedSize=1054316] in col  required binary col
    at org.apache.parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:599)
    at org.apache.parquet.column.impl.ColumnReaderImpl.access$300(ColumnReaderImpl.java:57)
    at org.apache.parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:536)
    at org.apache.parquet.column.impl.ColumnReaderImpl$3.visit(ColumnReaderImpl.java:533)
    at org.apache.parquet.column.page.DataPageV1.accept(DataPageV1.java:95)
    at org.apache.parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:533)
    at org.apache.parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:525)
    at org.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:638)
    at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:353)
    at org.apache.parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:80)
    at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:75)
    at org.apache.parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:271)
    at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:147)
    at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:109)
    at org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:165)
    at org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:109)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:222)
    ... 11 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:197)
    at java.io.DataInputStream.readFully(DataInputStream.java:169)
    at org.apache.parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:286)
    at org.apache.parquet.bytes.BytesInput.toByteBuffer(BytesInput.java:237)
    at org.apache.parquet.bytes.BytesInput.toInputStream(BytesInput.java:246)
    at org.apache.parquet.column.impl.ColumnReaderImpl.readPageV1(ColumnReaderImpl.java:592)
 

How to reproduce:

We need a way to interrupt one task without shutting down the JVM, for 
example speculation. When speculation is triggered, other tasks running on the 
same executor risk generating a bad parquet file. This does not always result 
in a corrupted parquet file: nearly half of the affected tasks throw an 
exception, while a few succeed without any warning sign.

RootCause:

ParquetWriter is not thread safe. Its callers must guarantee that there are no 
concurrent calls to ParquetWriter.

In the following code: 

[https://github.com/apache/hudi/blob/master/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/SparkMergeHelper.java#L103]

We call both write and close on the parquet writer concurrently, so data may 
still be being written when close is called. In the close method, the 
compressor (a class parquet uses for compression, which holds stateful data 
structures inside) is cleared and returned to a pool for later reuse. Because 
of the concurrent write mentioned above, data may continue to be pushed to the 
compressor even after it has been cleared. In addition, the compressor has a 
mechanism that tries to detect some kinds of invalid use, which is why some 
invalid usages throw an exception rather than generating a corrupted parquet 
file.
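
One way to rule out a write racing with close is to serialize both calls behind a lock and reject writes once the writer is closed. The following Java sketch illustrates that idea under stated assumptions: `GuardedWriter` is a hypothetical wrapper, the in-memory list is only a stand-in for a real non-thread-safe writer such as ParquetWriter, and this is not necessarily the fix applied in Hudi.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical wrapper that serializes write() and close(); not a Hudi or Parquet API. */
final class GuardedWriter<T> implements AutoCloseable {
    // Stand-in for a non-thread-safe underlying writer (assumption for illustration).
    private final List<T> sink = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock();
    private boolean closed = false;

    /** Writes a record, or returns false if the writer has already been closed. */
    boolean write(T record) {
        lock.lock();
        try {
            if (closed) {
                return false; // never touch the underlying writer after close
            }
            sink.add(record);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Marks the writer closed; cannot interleave with an in-flight write(). */
    @Override
    public void close() {
        lock.lock();
        try {
            closed = true;
        } finally {
            lock.unlock();
        }
    }

    /** Number of records accepted so far. */
    int written() {
        lock.lock();
        try {
            return sink.size();
        } finally {
            lock.unlock();
        }
    }
}
```

Because both methods take the same lock, a close issued from another thread waits for any in-flight write to finish, and any later write is refused instead of reaching a writer whose internal state has already been released.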

Validation:

The current solution has been validated in a production environment. One 
signal that the fix is applied is that no task should fail due 

[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991435391


   
   ## CI report:
   
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991432196


   
   ## CI report:
   
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   
   




[jira] [Commented] (HUDI-2976) Add Hudi 0.10.0 release page with highlights

2021-12-10 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457524#comment-17457524
 ] 

Danny Chen commented on HUDI-2976:
--

Fixed via asf-site branch: 147432ce862676557392ac12352512f73b8aef23

> Add Hudi 0.10.0 release page with highlights
> 
>
> Key: HUDI-2976
> URL: https://issues.apache.org/jira/browse/HUDI-2976
> Project: Apache Hudi
> Issue Type: Task
> Components: Docs
> Reporter: Danny Chen
> Assignee: Danny Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0
>
>






[jira] [Resolved] (HUDI-2976) Add Hudi 0.10.0 release page with highlights

2021-12-10 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved HUDI-2976.
--

> Add Hudi 0.10.0 release page with highlights
> 
>
> Key: HUDI-2976
> URL: https://issues.apache.org/jira/browse/HUDI-2976
> Project: Apache Hudi
> Issue Type: Task
> Components: Docs
> Reporter: Danny Chen
> Assignee: Danny Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0
>
>






[GitHub] [hudi] danny0405 merged pull request #4277: [HUDI-2976] Add Hudi 0.10.0 release page with highlights

2021-12-10 Thread GitBox


danny0405 merged pull request #4277:
URL: https://github.com/apache/hudi/pull/4277


   






[hudi] branch asf-site updated: [HUDI-2976] Add Hudi 0.10.0 release page with highlights (#4277)

2021-12-10 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 147432c  [HUDI-2976] Add Hudi 0.10.0 release page with highlights 
(#4277)
147432c is described below

commit 147432ce862676557392ac12352512f73b8aef23
Author: Danny Chan 
AuthorDate: Sat Dec 11 11:27:15 2021 +0800

[HUDI-2976] Add Hudi 0.10.0 release page with highlights (#4277)
---
 website/docusaurus.config.js   |   6 +-
 website/releases/download.md   |   4 +
 website/releases/older-releases.md |   2 +-
 website/releases/release-0.10.0.md | 241 +
 website/releases/release-0.7.0.md  |   2 +-
 website/releases/release-0.8.0.md  |   2 +-
 website/releases/release-0.9.0.md  |   2 +-
 website/src/pages/index.js |   5 +-
 8 files changed, 253 insertions(+), 11 deletions(-)

diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js
index d95eba0..0b387af 100644
--- a/website/docusaurus.config.js
+++ b/website/docusaurus.config.js
@@ -98,11 +98,11 @@ module.exports = {
   },
   {
 from: ['/docs/releases', '/docs/next/releases'],
-to: '/releases/release-0.9.0',
+to: '/releases/release-0.10.0',
   },
   {
 from: ['/releases'],
-to: '/releases/release-0.9.0',
+to: '/releases/release-0.10.0',
   },
   {
 from: ['/docs/learn'],
@@ -254,7 +254,7 @@ module.exports = {
 },
 {
   label: 'Releases',
-  to: '/releases/release-0.9.0',
+  to: '/releases/release-0.10.0',
 },
 {
   label: 'Download',
diff --git a/website/releases/download.md b/website/releases/download.md
index 4d46d07..312e3ac 100644
--- a/website/releases/download.md
+++ b/website/releases/download.md
@@ -6,6 +6,10 @@ toc: true
 last_modified_at: 2019-12-30T15:59:57-04:00
 ---
 
+### Release 0.10.0
+* Source Release : [Apache Hudi 0.10.0 Source 
Release](https://www.apache.org/dyn/closer.lua/hudi/0.10.0/hudi-0.10.0.src.tgz) 
([asc](https://downloads.apache.org/hudi/0.10.0/hudi-0.10.0.src.tgz.asc), 
[sha512](https://downloads.apache.org/hudi/0.10.0/hudi-0.10.0.src.tgz.sha512))
+* Release Note : ([Release Note for Apache Hudi 
0.10.0](/releases/release-0.10.0))
+
 ### Release 0.9.0
 * Source Release : [Apache Hudi 0.9.0 Source 
Release](https://www.apache.org/dyn/closer.lua/hudi/0.9.0/hudi-0.9.0.src.tgz) 
([asc](https://downloads.apache.org/hudi/0.9.0/hudi-0.9.0.src.tgz.asc), 
[sha512](https://downloads.apache.org/hudi/0.9.0/hudi-0.9.0.src.tgz.sha512))
 * Release Note : ([Release Note for Apache Hudi 
0.9.0](/releases/release-0.9.0))
diff --git a/website/releases/older-releases.md 
b/website/releases/older-releases.md
index f194c96..dee18e9 100644
--- a/website/releases/older-releases.md
+++ b/website/releases/older-releases.md
@@ -1,6 +1,6 @@
 ---
 title: "Older Releases"
-sidebar_position: 7
+sidebar_position: 8
 layout: releases
 toc: true
 last_modified_at: 2020-05-28T08:40:00-07:00
diff --git a/website/releases/release-0.10.0.md 
b/website/releases/release-0.10.0.md
new file mode 100644
index 000..2826004
--- /dev/null
+++ b/website/releases/release-0.10.0.md
@@ -0,0 +1,241 @@
+---
+title: "Release 0.10.0"
+sidebar_position: 2
+layout: releases
+toc: true
+last_modified_at: 2021-12-10T22:07:00+08:00
+---
+# [Release 0.10.0](https://github.com/apache/hudi/releases/tag/release-0.10.0) 
([docs](/docs/quick-start-guide))
+
+## Migration Guide
+- If migrating from an older release, please also check the upgrade 
instructions for each subsequent release below.
+- With 0.10.0, we have made some foundational fix to metadata table and so as 
part of upgrade, any existing metadata table is cleaned up. 
+  Whenever Hudi is launched with newer table version i.e 3 (or moving from an 
earlier release to 0.10.0), an upgrade step will be executed automatically. 
+  This automatic upgrade step will happen just once per Hudi table as the 
hoodie.table.version will be updated in property file after upgrade is 
completed.
+- Similarly, a command line tool for Downgrading (command - downgrade) is 
added if in case some users want to downgrade Hudi 
+  from table version 3 to 2 or move from Hudi 0.10.0 to pre 0.10.0. This needs 
to be executed from a 0.10.0 hudi-cli binary/script.
+- We have made some major fixes to 0.10.0 release around metadata table and 
would recommend users to try out metadata 
+  for better performance from optimized file listings. As part of the upgrade, 
please follow the below steps to enable metadata table.
+
+### Prerequisites for enabling metadata table
+
+Hudi writes and reads have to perform “list files” operation on the file 
system to get the current view of the system.
+This could be very 

[GitHub] [hudi] hudi-bot removed a comment on pull request #4282: [WIP][DO_NOT_MERGE] Enabling debug logs to investigate IT test failure

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4282:
URL: https://github.com/apache/hudi/pull/4282#issuecomment-991418293


   
   ## CI report:
   
   * f7a8f5ea7c09d8c1596027c31ec5348d30f25849 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4188)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991432196


   
   ## CI report:
   
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4190)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991427887


   
   ## CI report:
   
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4282: [WIP][DO_NOT_MERGE] Enabling debug logs to investigate IT test failure

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4282:
URL: https://github.com/apache/hudi/pull/4282#issuecomment-99143


   
   ## CI report:
   
   * f7a8f5ea7c09d8c1596027c31ec5348d30f25849 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4188)
 
   
   




[GitHub] [hudi] zhangyue19921010 edited a comment on issue #4275: [SUPPORT] How can I control the number of archive files

2021-12-10 Thread GitBox


zhangyue19921010 edited a comment on issue #4275:
URL: https://github.com/apache/hudi/issues/4275#issuecomment-991418197


   Hi @JoshuaZhuCN, I also ran into this issue when using S3.
   The storage you are using may not support the append operation, so a new 
archive file is created each time, whether or not you use clustering.
   https://github.com/apache/hudi/pull/4078
   I hope this PR helps.






[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991426720


   
   ## CI report:
   
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991427887


   
   ## CI report:
   
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   * ee2a811b2f3630472bf4aeb00e8e39d9384a0b7f UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991425623


   
   ## CI report:
   
   * 25a3f9b1e23db855abe6f352c391033fdef4d460 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4184)
 
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991426720


   
   ## CI report:
   
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   
   




[GitHub] [hudi] danny0405 commented on pull request #4277: [HUDI-2976] Add Hudi 0.10.0 release page with highlights

2021-12-10 Thread GitBox


danny0405 commented on pull request #4277:
URL: https://github.com/apache/hudi/pull/4277#issuecomment-991425904


   > We have enabled timeline-server-based marker files by default in 0.10.0. 
We tested this extensively, but in case users run into any issues, we want to 
give them a way to repair any dangling data files, so we are introducing a 
standalone tool for this purpose, which will be in master. Users can choose to 
use it as needed.
   
   Thanks for the explanation. Please send me the release note you want to add 
once the PR is finished.






[GitHub] [hudi] hudi-bot removed a comment on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991415363


   
   ## CI report:
   
   * 25a3f9b1e23db855abe6f352c391033fdef4d460 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4184)
 
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4278: [HUDI-2906] Add a repair util to clean up dangling data and log files

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4278:
URL: https://github.com/apache/hudi/pull/4278#issuecomment-991425623


   
   ## CI report:
   
   * 25a3f9b1e23db855abe6f352c391033fdef4d460 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4184)
 
   * 6b705b1a371fe8af7fb65795a1083ca13e9e7348 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4187)
 
   * e8c56862a65de258d657f029cd15466f7e4e41f7 UNKNOWN
   
   




[GitHub] [hudi] nsivabalan commented on pull request #4277: [HUDI-2976] Add Hudi 0.10.0 release page with highlights

2021-12-10 Thread GitBox


nsivabalan commented on pull request #4277:
URL: https://github.com/apache/hudi/pull/4277#issuecomment-991419924


   We have enabled timeline-server-based marker files by default in 0.10.0. We 
tested this extensively, but in case users run into any issues, we want to give 
them a way to repair any dangling data files, so we are introducing a 
standalone tool for this purpose, which will be in master. Users can choose to 
use it as needed.
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4252:
URL: https://github.com/apache/hudi/pull/4252#issuecomment-991418276


   
   ## CI report:
   
   * 41efa313a197fce137e851bc129dd7a941021a8e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4130)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4131)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4132)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4136)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4159)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4160)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4166)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4180)
 
   * a9553502b430a7200670e86ebc44078f89cce374 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4252:
URL: https://github.com/apache/hudi/pull/4252#issuecomment-991418711


   
   ## CI report:
   
   * 41efa313a197fce137e851bc129dd7a941021a8e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4130)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4131)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4132)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4136)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4159)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4160)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4166)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4180)
 
   * a9553502b430a7200670e86ebc44078f89cce374 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4189)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


hudi-bot removed a comment on pull request #4252:
URL: https://github.com/apache/hudi/pull/4252#issuecomment-991364953


   
   ## CI report:
   
   * 41efa313a197fce137e851bc129dd7a941021a8e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4130)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4131)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4132)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4136)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4159)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4160)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4166)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4180)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4252: [HUDI-2959] Fix the thread leak of cleaning service

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4252:
URL: https://github.com/apache/hudi/pull/4252#issuecomment-991418276


   
   ## CI report:
   
   * 41efa313a197fce137e851bc129dd7a941021a8e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4130)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4131)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4132)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4136)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4159)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4160)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4166)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4180)
 
   * a9553502b430a7200670e86ebc44078f89cce374 UNKNOWN
   
   




[GitHub] [hudi] zhangyue19921010 commented on issue #4275: [SUPPORT] How can I control the number of archive files

2021-12-10 Thread GitBox


zhangyue19921010 commented on issue #4275:
URL: https://github.com/apache/hudi/issues/4275#issuecomment-991418197


   Hi @JoshuaZhuCN, I also ran into this issue when using S3.
   https://github.com/apache/hudi/pull/4078
   I hope this PR helps.






[GitHub] [hudi] hudi-bot commented on pull request #4282: [WIP][DO_NOT_MERGE] Enabling debug logs to investigate IT test failure

2021-12-10 Thread GitBox


hudi-bot commented on pull request #4282:
URL: https://github.com/apache/hudi/pull/4282#issuecomment-991418293


   
   ## CI report:
   
   * f7a8f5ea7c09d8c1596027c31ec5348d30f25849 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4188)
 
   
   



