[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027483972


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5667)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027651306


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5667)
 
   * 29dd8404768aa1811844efc1332d02041971b8e0 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027645042


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027648249


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027616904


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027648230


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027616882


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027645011


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027645042


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027623983


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027624006


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027645026


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027616872


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027624006


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027623983


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5673)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027616926


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027616926


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027616904


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027581698


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027579588


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027579601


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027616882


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4734: [DO NOT MERGE] Testing CI 1

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4734:
URL: https://github.com/apache/hudi/pull/4734#issuecomment-1027577511


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027616872


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4734: [DO NOT MERGE] Testing CI 1

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4734:
URL: https://github.com/apache/hudi/pull/4734#issuecomment-1027616856


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027579575


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] danny0405 commented on a change in pull request #4420: [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering

2022-02-01 Thread GitBox


danny0405 commented on a change in pull request #4420:
URL: https://github.com/apache/hudi/pull/4420#discussion_r797294083



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##
@@ -464,27 +464,43 @@ protected void postCommit(HoodieTable table, 
HoodieCommitMetadata me
   }
 
   protected void runTableServicesInline(HoodieTable table, 
HoodieCommitMetadata metadata, Option> extraMetadata) {
-if (config.areAnyTableServicesInline()) {
+if (config.areAnyTableServicesInline() || 
config.scheduleInlineTableServices()) {
   if (config.isMetadataTableEnabled()) {

Review comment:
   Should `areAnyTableServicesInline` also return true when a table service 
is only scheduled inline (not executed)? Personally, I think these two methods 
have overlapping responsibilities, which makes them confusing.

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##
@@ -464,27 +464,43 @@ protected void postCommit(HoodieTable table, 
HoodieCommitMetadata me
   }
 
   protected void runTableServicesInline(HoodieTable table, 
HoodieCommitMetadata metadata, Option> extraMetadata) {
-if (config.areAnyTableServicesInline()) {
+if (config.areAnyTableServicesInline() || 
config.scheduleInlineTableServices()) {
   if (config.isMetadataTableEnabled()) {
 table.getHoodieView().sync();
   }
   // Do an inline compaction if enabled
   if (config.inlineCompactionEnabled()) {
 runAnyPendingCompactions(table);
 metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"true");
-inlineCompact(extraMetadata);
+inlineScheduleCompactAndOptionallyExecute(extraMetadata, 
!config.scheduleInlineCompaction());
   } else {
 metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"false");
   }
 
+  // if just inline schedule is enabled
+  if (!config.inlineCompactionEnabled() && 
config.scheduleInlineCompaction()
+  && 
!table.getActiveTimeline().getWriteTimeline().filterPendingCompactionTimeline().getInstants().findAny().isPresent())
 {
+// proceed only if there are no pending compactions

Review comment:
   Can we add a utility method in `HoodieActiveTimeline` for the check 
`table.getActiveTimeline().getWriteTimeline().filterPendingCompactionTimeline().getInstants().findAny().isPresent()`?
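
A minimal sketch of the kind of helper suggested here, assuming simplified stand-in types. `TimelineInstant`, `PendingCompactionCheck` and `hasPendingCompactions` are hypothetical names for illustration, not Hudi classes; a real helper would wrap the call chain quoted above.

```java
import java.util.List;

// Hypothetical stand-in for a timeline instant; NOT Hudi's HoodieInstant.
final class TimelineInstant {
  final String action;
  final boolean completed;

  TimelineInstant(String action, boolean completed) {
    this.action = action;
    this.completed = completed;
  }
}

// Hypothetical helper that hides the long call chain behind one named check.
final class PendingCompactionCheck {
  private PendingCompactionCheck() {
  }

  // Returns true if at least one compaction instant has not completed yet.
  static boolean hasPendingCompactions(List<TimelineInstant> instants) {
    return instants.stream()
        .anyMatch(i -> "compaction".equals(i.action) && !i.completed);
  }
}
```

With a helper like this, the condition in the diff would read as a single negated call instead of a five-step chain.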

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##
@@ -177,6 +178,16 @@
   .withDocumentation("Determines how to handle updates, deletes to file 
groups that are under clustering."
   + " Default strategy just rejects the update");
 
+  public static final ConfigProperty SCHEDULE_INLINE_CLUSTERING = 
ConfigProperty
+  .key("hoodie.clustering.schedule.inline")

Review comment:
   `SCHEDULE_INLINE_CLUSTERING` -> `CLUSTERING_SCHEDULE_INLINE`

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##
@@ -464,27 +464,43 @@ protected void postCommit(HoodieTable table, 
HoodieCommitMetadata me
   }
 
   protected void runTableServicesInline(HoodieTable table, 
HoodieCommitMetadata metadata, Option> extraMetadata) {
-if (config.areAnyTableServicesInline()) {
+if (config.areAnyTableServicesInline() || 
config.scheduleInlineTableServices()) {
   if (config.isMetadataTableEnabled()) {
 table.getHoodieView().sync();
   }
   // Do an inline compaction if enabled
   if (config.inlineCompactionEnabled()) {
 runAnyPendingCompactions(table);
 metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"true");
-inlineCompact(extraMetadata);
+inlineScheduleCompactAndOptionallyExecute(extraMetadata, 
!config.scheduleInlineCompaction());
   } else {
 metadata.addMetadata(HoodieCompactionConfig.INLINE_COMPACT.key(), 
"false");
   }
 
+  // if just inline schedule is enabled
+  if (!config.inlineCompactionEnabled() && 
config.scheduleInlineCompaction()
+  && 
!table.getActiveTimeline().getWriteTimeline().filterPendingCompactionTimeline().getInstants().findAny().isPresent())
 {
+// proceed only if there are no pending compactions

Review comment:
   Can we also add a utility method in the config for the check 
`config.inlineCompactionEnabled() && config.scheduleInlineCompaction()`?
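
A minimal sketch of such a config helper, under stated assumptions. `InlineCompactionFlags` and `isScheduleOnly` are hypothetical names, not part of Hudi's write config; the sketch follows the condition as it appears in the diff above, where `inlineCompactionEnabled()` is negated.

```java
// Hypothetical stand-in for the two config flags discussed in this review;
// NOT Hudi's HoodieWriteConfig.
final class InlineCompactionFlags {
  private final boolean inlineCompactionEnabled;
  private final boolean scheduleInlineCompaction;

  InlineCompactionFlags(boolean inlineCompactionEnabled, boolean scheduleInlineCompaction) {
    this.inlineCompactionEnabled = inlineCompactionEnabled;
    this.scheduleInlineCompaction = scheduleInlineCompaction;
  }

  boolean inlineCompactionEnabled() {
    return inlineCompactionEnabled;
  }

  boolean scheduleInlineCompaction() {
    return scheduleInlineCompaction;
  }

  // True when compaction should only be scheduled inline, not executed inline.
  boolean isScheduleOnly() {
    return !inlineCompactionEnabled && scheduleInlineCompaction;
  }
}
```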

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java
##
@@ -1005,13 +1021,16 @@ protected void rollbackFailedWrites(Map inlineCompact(Option> 
extraMetadata) {
+  protected Option 
inlineScheduleCompactAndOptionallyExecute(Option> 
extraMetadata, boolean executeInline) {
 Option compactionInstantTimeOpt = 
scheduleCompaction(extraMetadata);

Review comment:

[GitHub] [hudi] stackls edited a comment on issue #4675: [SUPPORT] Hudi Hive Style partitioning not working in 0.5.0

2022-02-01 Thread GitBox


stackls edited a comment on issue #4675:
URL: https://github.com/apache/hudi/issues/4675#issuecomment-1027578046


   Thanks @ganczarek @nsivabalan  
   We are currently using the 0.5 jars, which have no custom changes. We will 
check whether we can write a custom key generator that overrides 
getPartitionPath in the Hudi 0.5 code. Are there no other workarounds to achieve this?  
   
   We are planning to implement a partition path like 
s3://bucketname/folder1/partition_col_nm1=2015-01-01/ in the Hudi 0.5 release, so 
that even when Hudi is upgraded from 0.5 to a higher version (say > 0.8), the 
partitioning stays consistent.  
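
As a rough illustration of the target layout only: hive-style partitioning names each partition folder `column=value`. The sketch below builds such a path with a hypothetical `HiveStylePartitionPath` helper. It is not Hudi's key generator API, and wiring it into a custom key generator's getPartitionPath on 0.5 is an assumption, not something confirmed in this thread.

```java
// Hypothetical helper for illustration; NOT part of Hudi.
final class HiveStylePartitionPath {
  private HiveStylePartitionPath() {
  }

  // Builds one hive-style partition folder name, e.g. "col=value".
  static String partitionSegment(String column, String value) {
    return column + "=" + value;
  }

  // Joins a base path and one partition segment, ending with a slash.
  static String fullPath(String basePath, String column, String value) {
    String base = basePath.endsWith("/") ? basePath : basePath + "/";
    return base + partitionSegment(column, value) + "/";
  }
}
```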


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] stackls edited a comment on issue #4675: [SUPPORT] Hudi Hive Style partitioning not working in 0.5.0

2022-02-01 Thread GitBox


stackls edited a comment on issue #4675:
URL: https://github.com/apache/hudi/issues/4675#issuecomment-1027578046


   Thanks @ganczarek @nsivabalan  
   We just used the existing 0.5 jars, which have no custom changes. We will 
check whether we can write a custom key generator that overrides 
getPartitionPath in the Hudi 0.5 code. Are there no other workarounds to achieve this?  
   
   We are planning to implement a partition path like 
s3://bucketname/folder1/partition_col_nm1=2015-01-01/ in the Hudi 0.5 release, so 
that even when Hudi is upgraded from 0.5 to a higher version (say > 0.8), the 
partitioning stays consistent.  






[GitHub] [hudi] hudi-bot commented on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027581698


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5675)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5672)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027579623


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] stackls edited a comment on issue #4675: [SUPPORT] Hudi Hive Style partitioning not working in 0.5.0

2022-02-01 Thread GitBox


stackls edited a comment on issue #4675:
URL: https://github.com/apache/hudi/issues/4675#issuecomment-1027578046


   Thanks @ganczarek @nsivabalan  
   We just used the existing 0.5 jars, which have no custom changes. We will 
check whether we can write a custom key generator that overrides 
getPartitionPath in the Hudi 0.5 code. Are there no other workarounds to achieve this?  
   
   We think it is good practice to implement a partition path like 
s3://bucketname/folder1/partition_col_nm1=2015-01-01/ in 0.5, so that even when 
Hudi is upgraded from 0.5 to a higher version (say > 0.8), the partitioning 
stays consistent.  






[GitHub] [hudi] hudi-bot commented on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027579575


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027579588


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027577524


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027579601


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5674)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4738:
URL: https://github.com/apache/hudi/pull/4738#issuecomment-1027579623


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027577554


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027577533


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] stackls edited a comment on issue #4675: [SUPPORT] Hudi Hive Style partitioning not working in 0.5.0

2022-02-01 Thread GitBox


stackls edited a comment on issue #4675:
URL: https://github.com/apache/hudi/issues/4675#issuecomment-1020030963


   Thanks @ganczarek for the reply. Does the key generator change the way 
partition folders are created?
   @nsivabalan @codope - Can you please suggest how to implement this in 
0.5.0?






[GitHub] [hudi] stackls commented on issue #4675: [SUPPORT] Hudi Hive Style partitioning not working in 0.5.0

2022-02-01 Thread GitBox


stackls commented on issue #4675:
URL: https://github.com/apache/hudi/issues/4675#issuecomment-1027578046


   Thanks @ganczarek @nsivabalan  
   We just used the existing 0.5 jars, which have no custom changes. We will 
check whether we can write a custom key generator that overrides 
getPartitionPath in the Hudi 0.5 code. Are there no other workarounds to achieve this?  
   
   We think it is good practice to implement a partition path like 
s3://bucketname/folder1/partition_col_nm1=2015-01-01/ in 0.5, so that even when 
Hudi is upgraded from 0.5 to a higher version (say > 0.8), the partitioning 
stays consistent.  






[GitHub] [hudi] hudi-bot commented on pull request #4734: [DO NOT MERGE] Testing CI 1

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4734:
URL: https://github.com/apache/hudi/pull/4734#issuecomment-1027577511


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5671)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4737:
URL: https://github.com/apache/hudi/pull/4737#issuecomment-1027577554


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4736:
URL: https://github.com/apache/hudi/pull/4736#issuecomment-1027577533


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4734: [DO NOT MERGE] Testing CI 1

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4734:
URL: https://github.com/apache/hudi/pull/4734#issuecomment-1027576159


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4735:
URL: https://github.com/apache/hudi/pull/4735#issuecomment-1027577524


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4711: [HUDI-1295][HUDI-3166] Hoodie Index Type Metadata Bloom implementation

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4711:
URL: https://github.com/apache/hudi/pull/4711#issuecomment-1027577458


   
   ## CI report:
   
   * 8f1fa5c06ee5febc006599bac166f9bf3ad98ef7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5670)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4711: [HUDI-1295][HUDI-3166] Hoodie Index Type Metadata Bloom implementation

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4711:
URL: https://github.com/apache/hudi/pull/4711#issuecomment-1027541139


   
   ## CI report:
   
   * c15ad0d2d77cda49cba5961a1baaf20b592499b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5587)
 
   * 8f1fa5c06ee5febc006599bac166f9bf3ad98ef7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5670)
 
   
   




[GitHub] [hudi] codope opened a new pull request #4738: [DO NOT MERGE] Testing CI 5

2022-02-01 Thread GitBox


codope opened a new pull request #4738:
URL: https://github.com/apache/hudi/pull/4738


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before 
opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[GitHub] [hudi] codope opened a new pull request #4737: [DO NOT MERGE] Testing CI 4

2022-02-01 Thread GitBox


codope opened a new pull request #4737:
URL: https://github.com/apache/hudi/pull/4737


   






[GitHub] [hudi] codope opened a new pull request #4736: [DO NOT MERGE] Testing CI 3

2022-02-01 Thread GitBox


codope opened a new pull request #4736:
URL: https://github.com/apache/hudi/pull/4736


   






[GitHub] [hudi] hudi-bot commented on pull request #4734: [DO NOT MERGE] Testing CI 1

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4734:
URL: https://github.com/apache/hudi/pull/4734#issuecomment-1027576159


   
   ## CI report:
   
   * 262da829704a809734834359030d4da40cd0c6c7 UNKNOWN
   
   




[GitHub] [hudi] codope opened a new pull request #4735: [DO NOT MERGE] Testing CI 2

2022-02-01 Thread GitBox


codope opened a new pull request #4735:
URL: https://github.com/apache/hudi/pull/4735


   






[GitHub] [hudi] codope opened a new pull request #4734: [DO NOT MERGE] Testing CI 1

2022-02-01 Thread GitBox


codope opened a new pull request #4734:
URL: https://github.com/apache/hudi/pull/4734


   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[GitHub] [hudi] nsivabalan commented on a change in pull request #4556: [HUDI-3191][Stacked on 4716] Removing duplicating file-listing process w/in Hive's MOR `FileInputFormat`s

2022-02-01 Thread GitBox


nsivabalan commented on a change in pull request #4556:
URL: https://github.com/apache/hudi/pull/4556#discussion_r797255058



##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java
##
@@ -189,8 +190,10 @@ public void testUpsertPartitioner(boolean 
populateMetaFields) throws Exception {
 
   assertTrue(fileIdToNewSize.entrySet().stream().anyMatch(entry -> 
fileIdToSize.get(entry.getKey()) < entry.getValue()));
 
-  List dataFiles = 
roView.getLatestBaseFiles().map(HoodieBaseFile::getPath).collect(Collectors.toList());
-  List recordsRead = 
HoodieMergeOnReadTestUtils.getRecordsUsingInputFormat(hadoopConf(), dataFiles,
+  List inputPaths = roView.getLatestBaseFiles()
+  .map(baseFile -> new Path(baseFile.getPath()).getParent().toString())
+  .collect(Collectors.toList());

Review comment:
   +1 

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieFileInputFormatBase.java
##
@@ -187,13 +285,28 @@ private static FileStatus getFileStatusUnchecked(Option baseFile
   .map(fileSlice -> {
 Option baseFileOpt = fileSlice.getBaseFile();
 Option latestLogFileOpt = fileSlice.getLatestLogFile();
-if (baseFileOpt.isPresent()) {
-  return getFileStatusUnchecked(baseFileOpt);
-} else if (includeLogFilesForSnapShotView() && latestLogFileOpt.isPresent()) {
-  return createRealtimeFileStatusUnchecked(latestLogFileOpt.get(), fileSlice.getLogFiles());
+Stream logFiles = fileSlice.getLogFiles();
+
+Option latestCompletedInstantOpt =
+fromScala(fileIndex.latestCompletedInstant());
+
+// Check if we're reading a MOR table
+if (includeLogFilesForSnapshotView()) {
+  if (baseFileOpt.isPresent()) {
+return createRealtimeFileStatusUnchecked(baseFileOpt.get(), logFiles, latestCompletedInstantOpt, tableMetaClient);

Review comment:
   Can you help me understand what the `snapshotPaths` list can refer to? Can it
represent both a base file and a log file, or only one of them?

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieFileInputFormatBase.java
##
@@ -143,6 +121,126 @@ private static FileStatus getFileStatusUnchecked(Option baseFile
 return returns.toArray(new FileStatus[0]);
   }
 
+  private void validate(List targetFiles, List legacyFileStatuses) {
+List diff = CollectionUtils.diff(targetFiles, legacyFileStatuses);
+checkState(diff.isEmpty(), "Should be empty");
+  }
+
+  @Nonnull
+  private static FileStatus getFileStatusUnchecked(HoodieBaseFile baseFile) {
+try {
+  return HoodieInputFormatUtils.getFileStatus(baseFile);
+} catch (IOException ioe) {
+  throw new HoodieIOException("Failed to get file-status", ioe);
+}
+  }
+
+  /**
+   * Abstracts and exposes {@link FileInputFormat#listStatus(JobConf)} operation to subclasses that
+   * lists files (returning an array of {@link FileStatus}) corresponding to the input paths specified
+   * as part of provided {@link JobConf}
+   */
+  protected final FileStatus[] doListStatus(JobConf job) throws IOException {
+return super.listStatus(job);
+  }
+
+  /**
+   * Achieves listStatus functionality for an incrementally queried table. Instead of listing all
+   * partitions and then filtering based on the commits of interest, this logic first extracts the
+   * partitions touched by the desired commits and then lists only those partitions.
+   */
+  protected List listStatusForIncrementalMode(JobConf job,
+  HoodieTableMetaClient tableMetaClient,
+  List inputPaths,
+  String incrementalTable) throws IOException {
+Job jobContext = Job.getInstance(job);
+Option timeline = HoodieInputFormatUtils.getFilteredCommitsTimeline(jobContext, tableMetaClient);
+if (!timeline.isPresent()) {
+  return null;
+}
+Option> commitsToCheck = HoodieInputFormatUtils.getCommitsForIncrementalQuery(jobContext, incrementalTable, timeline.get());
+if (!commitsToCheck.isPresent()) {
+  return null;
+}
+Option incrementalInputPaths = HoodieInputFormatUtils.getAffectedPartitions(commitsToCheck.get(), tableMetaClient, timeline.get(), inputPaths);
+// Mutate the JobConf to set the input paths to only partitions touched by incremental pull.
+if (!incrementalInputPaths.isPresent()) {
+  return null;
+}
+setInputPaths(job, incrementalInputPaths.get());
+FileStatus[] fileStatuses = doListStatus(job);
+return HoodieInputFormatUtils.filterIncrementalFileStatus(jobContext, tableMetaClient, timeline.get(), 
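
The Javadoc on `listStatusForIncrementalMode` above describes listing only the partitions touched by the commits of interest, rather than listing every partition and filtering afterwards. A minimal sketch of that idea follows, assuming plain in-memory maps in place of Hudi's timeline and file system; the class name, method names, and map layout are all hypothetical and are not part of the PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch: in-memory maps stand in for commit metadata and storage.
final class IncrementalListingSketch {

  // Step 1: collect the partitions touched by the commits of interest.
  static SortedSet<String> affectedPartitions(Map<String, List<String>> partitionsByCommit,
                                              Collection<String> commitsToCheck) {
    SortedSet<String> partitions = new TreeSet<>();
    for (String commit : commitsToCheck) {
      partitions.addAll(partitionsByCommit.getOrDefault(commit, Collections.emptyList()));
    }
    return partitions;
  }

  // Step 2: list files only in those partitions; untouched partitions are never visited.
  static List<String> listFiles(Map<String, List<String>> filesByPartition,
                                Collection<String> partitions) {
    List<String> files = new ArrayList<>();
    for (String partition : partitions) {
      files.addAll(filesByPartition.getOrDefault(partition, Collections.emptyList()));
    }
    return files;
  }
}
```

The cost of the listing step then scales with the number of partitions the selected commits touched, not with the total number of partitions in the table.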

[GitHub] [hudi] nsivabalan commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


nsivabalan commented on a change in pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#discussion_r797247870



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##
@@ -184,12 +185,14 @@ private void enablePartitions() {
   throw new HoodieException("Failed to enable metadata partitions!", e);
 }
 
-enablePartition(MetadataPartitionType.FILES, metadataConfig, metaClient, isBootstrapCompleted);
+Option fsView = Option.ofNullable(
+metaClient.isPresent() ? HoodieTableMetadataUtil.getFileSystemView(metaClient.get()) : null);

Review comment:
   `metaClient.map().orElse(Option.empty())` would be simpler, I guess.
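
To illustrate the suggestion, here is a rough sketch of the two shapes side by side. It assumes `java.util.Optional` as a stand-in for Hudi's own `Option` type, and it uses plain strings plus a hypothetical `getFileSystemView` helper in place of the real meta client and file-system view, so it shows the pattern only, not the PR's code.

```java
import java.util.Optional;

// Sketch only: java.util.Optional stands in for Hudi's Option, and the
// String-based helper below is a hypothetical placeholder for
// HoodieTableMetadataUtil.getFileSystemView(...).
final class OptionMapSketch {

  static String getFileSystemView(String basePath) {
    return "fsview:" + basePath;
  }

  // Shape used in the diff: unwrap by hand, then re-wrap a possibly-null value.
  static Optional<String> viaTernary(Optional<String> metaClient) {
    return Optional.ofNullable(
        metaClient.isPresent() ? getFileSystemView(metaClient.get()) : null);
  }

  // Shape suggested in the review: map over the optional directly.
  static Optional<String> viaMap(Optional<String> metaClient) {
    return metaClient.map(OptionMapSketch::getFileSystemView);
  }
}
```

Both methods return the same result for present and empty inputs; the `map` form avoids the explicit `isPresent()`/`get()` pair and the intermediate `null`.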








[GitHub] [hudi] hudi-bot commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027541190


   
   ## CI report:
   
   * 891d9658daa099eb50560741086aac23924e3600 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5669)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4711: [HUDI-1295][HUDI-3166] Hoodie Index Type Metadata Bloom implementation

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4711:
URL: https://github.com/apache/hudi/pull/4711#issuecomment-1027541139


   
   ## CI report:
   
   * c15ad0d2d77cda49cba5961a1baaf20b592499b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5587)
 
   * 8f1fa5c06ee5febc006599bac166f9bf3ad98ef7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5670)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027513978


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   * 891d9658daa099eb50560741086aac23924e3600 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5669)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4711: [HUDI-1295][HUDI-3166] Hoodie Index Type Metadata Bloom implementation

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4711:
URL: https://github.com/apache/hudi/pull/4711#issuecomment-1027539632


   
   ## CI report:
   
   * c15ad0d2d77cda49cba5961a1baaf20b592499b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5587)
 
   * 8f1fa5c06ee5febc006599bac166f9bf3ad98ef7 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4711: [HUDI-1295][HUDI-3166] Hoodie Index Type Metadata Bloom implementation

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4711:
URL: https://github.com/apache/hudi/pull/4711#issuecomment-1027539632


   
   ## CI report:
   
   * c15ad0d2d77cda49cba5961a1baaf20b592499b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5587)
 
   * 8f1fa5c06ee5febc006599bac166f9bf3ad98ef7 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4711: [HUDI-1295][HUDI-3166] Hoodie Index Type Metadata Bloom implementation

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4711:
URL: https://github.com/apache/hudi/pull/4711#issuecomment-1024801695


   
   ## CI report:
   
   * c15ad0d2d77cda49cba5961a1baaf20b592499b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5587)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] manojpec commented on a change in pull request #4449: [HUDI-2763] Metadata table records - support for key deduplication based on hardcoded key field

2022-02-01 Thread GitBox


manojpec commented on a change in pull request #4449:
URL: https://github.com/apache/hudi/pull/4449#discussion_r797235960



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java
##
@@ -126,7 +126,7 @@
 
   public static final ConfigProperty POPULATE_META_FIELDS = ConfigProperty
   .key(METADATA_PREFIX + ".populate.meta.fields")
-  .defaultValue(true)
+  .defaultValue(false)

Review comment:
   This is only for the metadata table. It is intentional that we are disabling
meta fields, which enables virtual keys for the metadata table.
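
As a rough illustration of what flipping the default means in practice, the sketch below resolves the flag from a `java.util.Properties` object. The key string assumes `METADATA_PREFIX` is `hoodie.metadata`, and the class and method names are hypothetical; this is not Hudi's `ConfigProperty` machinery.

```java
import java.util.Properties;

// Hypothetical sketch: plain Properties stand in for Hudi's config classes.
final class MetaFieldsDefaultSketch {

  // Assumption: METADATA_PREFIX is "hoodie.metadata".
  static final String KEY = "hoodie.metadata.populate.meta.fields";

  // New default from the diff: meta fields are not populated unless requested.
  static final boolean DEFAULT_VALUE = false;

  // The default only applies when the key is absent; an explicit value wins.
  static boolean populateMetaFields(Properties props) {
    return Boolean.parseBoolean(props.getProperty(KEY, String.valueOf(DEFAULT_VALUE)));
  }
}
```

With no explicit setting the lookup yields `false`, while a configuration that sets the key to `true` keeps the earlier behaviour.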








[GitHub] [hudi] manojpec commented on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


manojpec commented on pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#issuecomment-1027528462


   @nsivabalan @codope CI test failure in 
TestHoodieDeltaStreamerWithMultiWriter is fixed by 
https://github.com/apache/hudi/pull/4704






[GitHub] [hudi] hudi-bot commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027513978


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   * 891d9658daa099eb50560741086aac23924e3600 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5669)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027512642


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   * 891d9658daa099eb50560741086aac23924e3600 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027508471


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027512642


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   * 891d9658daa099eb50560741086aac23924e3600 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027443659


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027508471


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4449: [HUDI-2763] Metadata table records - support for key deduplication based on hardcoded key field

2022-02-01 Thread GitBox


alexeykudinkin commented on a change in pull request #4449:
URL: https://github.com/apache/hudi/pull/4449#discussion_r797215038



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java
##
@@ -126,7 +126,7 @@
 
   public static final ConfigProperty POPULATE_META_FIELDS = ConfigProperty
   .key(METADATA_PREFIX + ".populate.meta.fields")
-  .defaultValue(true)
+  .defaultValue(false)

Review comment:
   Why is this changing?








[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027483972


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5667)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027406849


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * 9fb948535b79336115a8e0fddfa955f7bfbff5f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5225)
 
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5667)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4556: [HUDI-3191][Stacked on 4716] Removing duplicating file-listing process w/in Hive's MOR `FileInputFormat`s

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4556:
URL: https://github.com/apache/hudi/pull/4556#issuecomment-1027470496


   
   ## CI report:
   
   * 77d11131baabd1c4e3cc2050337daca4df5f6427 UNKNOWN
   * 3d9c2ae28da858d1e8476052c99391015effb7db UNKNOWN
   * 31b0669d7b638bd65a17b22a2ceb772f2627512c UNKNOWN
   * 28a5a4f537544d35dfcd8700a7b97fb7216682ce UNKNOWN
   * c09e228f7cce78a7dbbc394e93b3cf8c6c3d4d5f UNKNOWN
   * 5b8f5819fff8fec34864eb409fd429b95be17b9b UNKNOWN
   * 5d37935bc8bb33260735d782ca560fd59e02f321 UNKNOWN
   * f911d869f50009e5cd9f3fd341c83c732d7531ba Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5634)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5657)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5659)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5665)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4556: [HUDI-3191][Stacked on 4716] Removing duplicating file-listing process w/in Hive's MOR `FileInputFormat`s

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4556:
URL: https://github.com/apache/hudi/pull/4556#issuecomment-1027389109


   
   ## CI report:
   
   * 77d11131baabd1c4e3cc2050337daca4df5f6427 UNKNOWN
   * 3d9c2ae28da858d1e8476052c99391015effb7db UNKNOWN
   * 31b0669d7b638bd65a17b22a2ceb772f2627512c UNKNOWN
   * 28a5a4f537544d35dfcd8700a7b97fb7216682ce UNKNOWN
   * c09e228f7cce78a7dbbc394e93b3cf8c6c3d4d5f UNKNOWN
   * 5b8f5819fff8fec34864eb409fd429b95be17b9b UNKNOWN
   * 5d37935bc8bb33260735d782ca560fd59e02f321 UNKNOWN
   * f911d869f50009e5cd9f3fd341c83c732d7531ba Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5634)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5657)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5659)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5665)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#issuecomment-1027384717


   
   ## CI report:
   
   * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN
   * 3784c4bf415fec6e48f1438c2f14eb4061c608cf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5644)
 
   * 2541391e4cfa70e28c5d17a63cb920bc3547dd5e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5664)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#issuecomment-1027463742


   
   ## CI report:
   
   * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN
   * 2541391e4cfa70e28c5d17a63cb920bc3547dd5e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5664)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] hudi-bot commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027443659


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5668)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027142618


   
   ## CI report:
   
   * 0fbee41c28aa740d981285db5a7db224da17ee2c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5656)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] zhangyue19921010 commented on pull request #4721: [HUDI-3320] Hoodie metadata table validator

2022-02-01 Thread GitBox


zhangyue19921010 commented on pull request #4721:
URL: https://github.com/apache/hudi/pull/4721#issuecomment-1027441270


   @hudi-bot run azure






[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027384679


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * 9fb948535b79336115a8e0fddfa955f7bfbff5f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5225)
 
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027406849


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * 9fb948535b79336115a8e0fddfa955f7bfbff5f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5225)
 
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5667)
 
   
   




[GitHub] [hudi] nsivabalan commented on a change in pull request #4716: [HUDI-3322][HUDI-3343] Fixing Metadata Table Records Duplication Issues

2022-02-01 Thread GitBox


nsivabalan commented on a change in pull request #4716:
URL: https://github.com/apache/hudi/pull/4716#discussion_r797157620



##
File path: hudi-common/src/main/avro/HoodieRollbackMetadata.avsc
##
@@ -38,14 +38,6 @@
 "type": "long",
 "doc": "Size of this file in bytes"
 }
-}], "default":null },

Review comment:
   Hmm, interesting. Can you try this out explicitly? Write an Avro file using 
the master branch, and then try to read it using this branch. Or you can just 
try it out using a standalone Java main class, whichever works. 
   Just want to ensure we are good here. 
   








[GitHub] [hudi] nsivabalan commented on a change in pull request #4716: [HUDI-3322][HUDI-3343] Fixing Metadata Table Records Duplication Issues

2022-02-01 Thread GitBox


nsivabalan commented on a change in pull request #4716:
URL: https://github.com/apache/hudi/pull/4716#discussion_r797155269



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
##
@@ -90,42 +86,41 @@ public MarkerBasedRollbackStrategy(HoodieTable 
table, HoodieEngineCo
 Collections.singletonList(fullDeletePath.toString()),
 Collections.emptyMap());
   case APPEND:
+// NOTE: This marker file-path does NOT correspond to a log-file, 
but rather is a phony
+//   path serving as a "container" for the following 
components:
+//  - Base file's file-id
+//  - Base file's commit instant
+//  - Partition path
 return 
getRollbackRequestForAppend(WriteMarkers.stripMarkerSuffix(markerFilePath));
   default:
 throw new HoodieRollbackException("Unknown marker type, during 
rollback of " + instantToRollback);
 }
-  }, parallelism).stream().collect(Collectors.toList());
+  }, parallelism);
 } catch (Exception e) {
   throw new HoodieRollbackException("Error rolling back using marker files 
written for " + instantToRollback, e);
 }
   }
 
-  protected HoodieRollbackRequest getRollbackRequestForAppend(String 
appendBaseFilePath) throws IOException {
-Path baseFilePathForAppend = new Path(basePath, appendBaseFilePath);
+  protected HoodieRollbackRequest getRollbackRequestForAppend(String 
markerFilePath) throws IOException {
+Path baseFilePathForAppend = new Path(basePath, markerFilePath);
 String fileId = FSUtils.getFileIdFromFilePath(baseFilePathForAppend);
 String baseCommitTime = 
FSUtils.getCommitTime(baseFilePathForAppend.getName());
-String partitionPath = FSUtils.getRelativePartitionPath(new 
Path(basePath), new Path(basePath, appendBaseFilePath).getParent());
-Map writtenLogFileSizeMap = 
getWrittenLogFileSizeMap(partitionPath, baseCommitTime, fileId);
-Map writtenLogFileStrSizeMap = new HashMap<>();
-for (Map.Entry entry : writtenLogFileSizeMap.entrySet()) 
{
-  writtenLogFileStrSizeMap.put(entry.getKey().getPath().toString(), 
entry.getValue());
-}
-return new HoodieRollbackRequest(partitionPath, fileId, baseCommitTime, 
Collections.emptyList(), writtenLogFileStrSizeMap);
+String relativePartitionPath = FSUtils.getRelativePartitionPath(new 
Path(basePath), baseFilePathForAppend.getParent());
+Path partitionPath = FSUtils.getPartitionPath(config.getBasePath(), 
relativePartitionPath);
+
+// NOTE: Since we're rolling back incomplete Delta Commit, it only could 
have appended its
+//   block to the latest log-file
+// TODO(HUDI-1517) use provided marker-file's path instead
+HoodieLogFile latestLogFile = 
FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
+HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime).get();
+
+// NOTE: Marker's don't carry information about the cumulative size of the 
blocks that have been appended,
+//   therefore we simply stub this value.
+Map logFilesWithBlocsToRollback =

Review comment:
   Based on offline discussion, this will cause an issue in storage schemes 
like HDFS, where a rollback block could be appended to an existing log file. 
Will take it up in a follow-up PR. 
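
The NOTE in the diff above relies on the latest log file of a file group being the only one an incomplete delta commit could have appended to. A minimal sketch of that selection follows. It assumes log file names of the form `.<fileId>_<baseCommitTime>.log.<version>_<writeToken>`; the class and method names are hypothetical, and it does not call Hudi's `FSUtils`.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of Hudi).
final class LatestLogFileSketch {
  // Assumed name layout: .<fileId>_<baseCommitTime>.log.<version>[_<writeToken>]
  private static final Pattern LOG_FILE =
      Pattern.compile("^\\.(.+)_(\\d+)\\.log\\.(\\d+)(?:_(.+))?$");

  // Returns the log file name with the highest version for the given file group.
  static Optional<String> latest(List<String> names, String fileId, String baseCommitTime) {
    String best = null;
    int bestVersion = -1;
    for (String name : names) {
      Matcher m = LOG_FILE.matcher(name);
      if (!m.matches() || !m.group(1).equals(fileId) || !m.group(2).equals(baseCommitTime)) {
        continue; // not a log file of this file group / base commit
      }
      int version = Integer.parseInt(m.group(3));
      if (version > bestVersion) {
        bestVersion = version;
        best = name;
      }
    }
    return Optional.ofNullable(best);
  }
}
```

On a storage scheme that supports appends, the rollback block may land in this same file rather than in a new one, which is the case the review comment flags for follow-up.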








[GitHub] [hudi] nsivabalan commented on a change in pull request #4716: [HUDI-3322][HUDI-3343] Fixing Metadata Table Records Duplication Issues

2022-02-01 Thread GitBox


nsivabalan commented on a change in pull request #4716:
URL: https://github.com/apache/hudi/pull/4716#discussion_r797153919



##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java
##
@@ -442,8 +443,11 @@ private void updateTableMetadata(HoodieTable>, JavaRD
 metaClient, config, context, 
SparkUpgradeDowngradeHelper.getInstance())
 .run(HoodieTableVersion.current(), instantTime);
 metaClient.reloadActiveTimeline();
-initializeMetadataTable(Option.of(instantTime));
   }
+  // Initialize Metadata Table to make sure it's bootstrapped _before_ the 
operation,
+  // if it didn't exist before
+  // See https://issues.apache.org/jira/browse/HUDI-3343 for more details
+  initializeMetadataTable(Option.of(instantTime));

Review comment:
   Based on offline discussion, we are good here. MDT initialization may 
not kick in if there are any pending operations in the data table.








[hudi] branch master updated (16138db -> 72f7348)

2022-02-01 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from 16138db  [HUDI-3368] Revert "[HUDI-3306] Upgrade rocksdb version 
(#4663)" (#4733)
 add 72f7348  [HUDI-2589] RFC-37: Metadata table based bloom index (#3989)

No new revisions were added by this update.

Summary of changes:
 rfc/README.md |   2 +-
 rfc/rfc-37/metadata_index_1.png   | Bin 0 -> 221832 bytes
 rfc/rfc-37/metadata_index_bloom_partition.png | Bin 0 -> 200465 bytes
 rfc/rfc-37/metadata_index_col_stats.png   | Bin 0 -> 57602 bytes
 rfc/rfc-37/rfc-37.md  | 329 ++
 5 files changed, 330 insertions(+), 1 deletion(-)
 create mode 100644 rfc/rfc-37/metadata_index_1.png
 create mode 100644 rfc/rfc-37/metadata_index_bloom_partition.png
 create mode 100644 rfc/rfc-37/metadata_index_col_stats.png
 create mode 100644 rfc/rfc-37/rfc-37.md


[GitHub] [hudi] nsivabalan merged pull request #3989: [HUDI-2589] RFC-37: Metadata table based bloom index

2022-02-01 Thread GitBox


nsivabalan merged pull request #3989:
URL: https://github.com/apache/hudi/pull/3989


   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4556: [HUDI-3191][Stacked on 4716] Removing duplicating file-listing process w/in Hive's MOR `FileInputFormat`s

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4556:
URL: https://github.com/apache/hudi/pull/4556#issuecomment-1027244710


   
   ## CI report:
   
   * 77d11131baabd1c4e3cc2050337daca4df5f6427 UNKNOWN
   * 3d9c2ae28da858d1e8476052c99391015effb7db UNKNOWN
   * 31b0669d7b638bd65a17b22a2ceb772f2627512c UNKNOWN
   * 28a5a4f537544d35dfcd8700a7b97fb7216682ce UNKNOWN
   * c09e228f7cce78a7dbbc394e93b3cf8c6c3d4d5f UNKNOWN
   * 5b8f5819fff8fec34864eb409fd429b95be17b9b UNKNOWN
   * 5d37935bc8bb33260735d782ca560fd59e02f321 UNKNOWN
   * f911d869f50009e5cd9f3fd341c83c732d7531ba Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5634)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5657)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5659)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4556: [HUDI-3191][Stacked on 4716] Removing duplicating file-listing process w/in Hive's MOR `FileInputFormat`s

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4556:
URL: https://github.com/apache/hudi/pull/4556#issuecomment-1027389109


   
   ## CI report:
   
   * 77d11131baabd1c4e3cc2050337daca4df5f6427 UNKNOWN
   * 3d9c2ae28da858d1e8476052c99391015effb7db UNKNOWN
   * 31b0669d7b638bd65a17b22a2ceb772f2627512c UNKNOWN
   * 28a5a4f537544d35dfcd8700a7b97fb7216682ce UNKNOWN
   * c09e228f7cce78a7dbbc394e93b3cf8c6c3d4d5f UNKNOWN
   * 5b8f5819fff8fec34864eb409fd429b95be17b9b UNKNOWN
   * 5d37935bc8bb33260735d782ca560fd59e02f321 UNKNOWN
   * f911d869f50009e5cd9f3fd341c83c732d7531ba Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5634)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5657)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5659)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5665)
 
   
   




[GitHub] [hudi] alexeykudinkin commented on pull request #4556: [HUDI-3191][Stacked on 4716] Removing duplicating file-listing process w/in Hive's MOR `FileInputFormat`s

2022-02-01 Thread GitBox


alexeykudinkin commented on pull request #4556:
URL: https://github.com/apache/hudi/pull/4556#issuecomment-1027387044


   @hudi-bot run azure






[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1012822806


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * 9fb948535b79336115a8e0fddfa955f7bfbff5f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5225)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#issuecomment-1027361753


   
   ## CI report:
   
   * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN
   * 3784c4bf415fec6e48f1438c2f14eb4061c608cf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5644)
 
   * 2541391e4cfa70e28c5d17a63cb920bc3547dd5e UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-1027384679


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 93f3baa443153657ebe212f1c1b453776dc4cc82 UNKNOWN
   * 9fb948535b79336115a8e0fddfa955f7bfbff5f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5225)
 
   * d0b17e523f7d9b4316583ebe8eefe72116a64dd7 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


hudi-bot commented on pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#issuecomment-1027384717


   
   ## CI report:
   
   * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN
   * 3784c4bf415fec6e48f1438c2f14eb4061c608cf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5644)
 
   * 2541391e4cfa70e28c5d17a63cb920bc3547dd5e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5664)
 
   
   




[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


alexeykudinkin commented on a change in pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#discussion_r797123944



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java
##
@@ -481,4 +447,63 @@ public long moveToPrev() throws IOException {
   public void remove() {
 throw new UnsupportedOperationException("Remove not supported for 
HoodieLogFileReader");
   }
+
+  private static Path makeQualified(FileSystem fs, Path path) {
+return path.makeQualified(fs.getUri(), fs.getWorkingDirectory());
+  }
+
+  /**
+   * Fetch the right {@link FSDataInputStream} to be used by wrapping with 
required input streams.
+   * @param fs instance of {@link FileSystem} in use.
+   * @param bufferSize buffer size to be used.
+   * @return the right {@link FSDataInputStream} as required.
+   */
+  private static FSDataInputStream getFSDataInputStream(FileSystem fs,

Review comment:
   This didn't change








[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


alexeykudinkin commented on a change in pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#discussion_r797123287



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieDataBlock.java
##
@@ -18,151 +18,173 @@
 
 package org.apache.hudi.common.table.log.block;
 
-import org.apache.hudi.common.model.HoodieRecord;
-import org.apache.hudi.common.util.Option;
-import org.apache.hudi.exception.HoodieException;
-import org.apache.hudi.exception.HoodieIOException;
-
 import org.apache.avro.Schema;
 import org.apache.avro.generic.IndexedRecord;
 import org.apache.hadoop.fs.FSDataInputStream;
-
-import javax.annotation.Nonnull;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.exception.HoodieIOException;
 
 import java.io.IOException;
+import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.stream.Collectors;
+
+import static org.apache.hudi.common.util.ValidationUtils.checkState;
 
 /**
  * DataBlock contains a list of records serialized using formats compatible 
with the base file format.
  * For each base file format there is a corresponding DataBlock format.
- *
+ * 
  * The Datablock contains:
  *   1. Data Block version
  *   2. Total number of records in the block
  *   3. Actual serialized content of the records
  */
 public abstract class HoodieDataBlock extends HoodieLogBlock {
 
-  protected List records;
-  protected Schema schema;
-  protected String keyField;
+  // TODO rebase records/content to leverage Either to warrant
+  //  that they are mutex (used by read/write flows respectively)
+  private Option> records;
 
-  public HoodieDataBlock(@Nonnull Map 
logBlockHeader,
-  @Nonnull Map logBlockFooter,
-  @Nonnull Option blockContentLocation, 
@Nonnull Option content,
-  FSDataInputStream inputStream, boolean readBlockLazily) {
-super(logBlockHeader, logBlockFooter, blockContentLocation, content, 
inputStream, readBlockLazily);
-this.keyField = HoodieRecord.RECORD_KEY_METADATA_FIELD;
-  }
+  /**
+   * Dot-path notation reference to the key field w/in the record's schema
+   */
+  private final String keyFieldRef;

Review comment:
   Double-checked: this is indeed expected to be just a field name (w/in the 
record's schema), not dot-path notation.
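
To make the distinction concrete, here is a small sketch contrasting a plain field-name lookup with a dot-path lookup. Nested maps stand in for records; the class, method, and field names are hypothetical and do not come from the PR.

```java
import java.util.Map;

// Hypothetical helper, for illustration only (not part of Hudi).
final class KeyFieldLookupSketch {
  // Plain field-name lookup: the key must be a top-level field of the record.
  static Object byFieldName(Map<String, Object> record, String fieldName) {
    return record.get(fieldName);
  }

  // Dot-path lookup: each segment descends one level into a nested record.
  static Object byDotPath(Map<String, Object> record, String path) {
    Object current = record;
    for (String segment : path.split("\\.")) {
      if (!(current instanceof Map)) {
        return null; // path goes deeper than the record does
      }
      current = ((Map<?, ?>) current).get(segment);
    }
    return current;
  }
}
```

With a field-name contract, a reference such as `meta.id` is treated as one literal top-level name, so it resolves nothing unless a field with exactly that name exists.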








[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-02-01 Thread GitBox


alexeykudinkin commented on a change in pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#discussion_r797120652



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieAvroDataBlock.java
##
@@ -224,6 +217,33 @@ public static HoodieAvroDataBlock getBlock(byte[] content, 
Schema readerSchema)
 return new HoodieAvroDataBlock(records, readerSchema);
   }
 
+  private static byte[] compress(String text) {

Review comment:
   This has just been moved down to make it easier to analyze the methods 
that are actually overridden. 
   
   > can we move this to IOUtils or some common utils class?
   
   These are actually deprecated methods.
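
The diff shows only the signature `compress(String text)`, not its body. For readers unfamiliar with this kind of helper, the following is a sketch of a text compress/decompress pair built on `java.util.zip`. The choice of DEFLATE and UTF-8, and the class name, are assumptions for illustration, not a copy of the PR's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, for illustration only (not part of Hudi).
final class TextCompressionSketch {
  // Deflates the UTF-8 bytes of the given text.
  static byte[] compress(String text) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (OutputStream out = new DeflaterOutputStream(buffer)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The stream is closed at this point, so the deflater has been flushed.
    return buffer.toByteArray();
  }

  // Inflates bytes produced by compress() back into the original text.
  static String decompress(byte[] bytes) {
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```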








[jira] [Commented] (HUDI-2899) Fix DataFormatter usages removed in Spark 3.2

2022-02-01 Thread Alexey Kudinkin (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17485520#comment-17485520
 ] 

Alexey Kudinkin commented on HUDI-2899:
---

Yep, it's working fine w/ Spark 3.2 now

> Fix DataFormatter usages removed in Spark 3.2
> -
>
> Key: HUDI-2899
> URL: https://issues.apache.org/jira/browse/HUDI-2899
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark
>Reporter: Alexey Kudinkin
>Assignee: Yann Byron
>Priority: Major
> Fix For: 0.11.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
>  
> Trying to read a table that is partitioned on a string field ("product_category") 
> rather than by date leads to `NoSuchMethodError`
> {code:java}
> scala> val readDf: DataFrame =
>      |   spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), 
> "false").format("hudi").load(outputPath)
> java.lang.NoSuchMethodError: 
> org.apache.spark.sql.catalyst.util.DateFormatter$.apply(Ljava/time/ZoneId;)Lorg/apache/spark/sql/catalyst/util/DateFormatter;
>   at 
> org.apache.spark.sql.execution.datasources.Spark3ParsePartitionUtil.parsePartition(Spark3ParsePartitionUtil.scala:32)
>   at 
> org.apache.hudi.HoodieFileIndex.$anonfun$getAllQueryPartitionPaths$3(HoodieFileIndex.scala:559)
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:286)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.hudi.HoodieFileIndex.getAllQueryPartitionPaths(HoodieFileIndex.scala:511)
>   at 
> org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:575)
>   at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:360)
>   at org.apache.hudi.HoodieFileIndex.(HoodieFileIndex.scala:157)
>   at 
> org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199)
>   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119)
>   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
>   at 
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
>   at 
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
>   at 
> org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
>   at scala.Option.getOrElse(Option.scala:189)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:188)
>   ... 68 elided {code}
>  
>  
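
A `NoSuchMethodError` like the one in the trace above is raised at link time, when code compiled against one method signature runs against a library version that no longer provides it. The sketch below probes for a method reflectively before relying on it. The class and method names are hypothetical, and this is not necessarily how the issue was fixed in Hudi.

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical helper, for illustration only (not part of Hudi or Spark).
final class MethodProbeSketch {
  // Returns the public method if the class and the exact signature exist on the classpath.
  static Optional<Method> find(String className, String methodName, Class<?>... parameterTypes) {
    try {
      return Optional.of(Class.forName(className).getMethod(methodName, parameterTypes));
    } catch (ClassNotFoundException | NoSuchMethodException e) {
      return Optional.empty();
    }
  }
}
```

For example, `MethodProbeSketch.find("java.lang.String", "length")` returns a non-empty result, while a probe for a signature that the running library version lacks returns `Optional.empty()` instead of failing later with a linkage error.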



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Closed] (HUDI-2899) Fix DataFormatter usages removed in Spark 3.2

2022-02-01 Thread Alexey Kudinkin (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin closed HUDI-2899.
-
Resolution: Fixed



[GitHub] [hudi] hudi-bot removed a comment on pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2022-02-01 Thread GitBox


hudi-bot removed a comment on pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#issuecomment-1026531057


   
   ## CI report:
   
   * 235981abd20a498a3e29e98ce0eda9de35018f99 UNKNOWN
   * 3784c4bf415fec6e48f1438c2f14eb4061c608cf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5644)
 
   
   



