[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000683899


   
   ## CI report:
   
   * 10b433d337b81dd2c40deb324eaacd62fe1e5a75 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4715)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000664317


   
   ## CI report:
   
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   * 10b433d337b81dd2c40deb324eaacd62fe1e5a75 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4715)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000664317


   
   ## CI report:
   
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   * 10b433d337b81dd2c40deb324eaacd62fe1e5a75 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4715)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000663650


   
   ## CI report:
   
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   * 10b433d337b81dd2c40deb324eaacd62fe1e5a75 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000658726


   
   ## CI report:
   
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000663650


   
   ## CI report:
   
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   * 10b433d337b81dd2c40deb324eaacd62fe1e5a75 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000658726


   
   ## CI report:
   
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000643984


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000643984


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   * eba225e0b44687c8df5a6da2db11b74810608282 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4714)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000643262


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   * eba225e0b44687c8df5a6da2db11b74810608282 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000641915


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000643262


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   * eba225e0b44687c8df5a6da2db11b74810608282 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000641915


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000622429


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000634264


   
   ## CI report:
   
   * 44f1b5daf2bcadd8ef2cc9a3db6d612dd9bfb746 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4712)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000611396


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   * 44f1b5daf2bcadd8ef2cc9a3db6d612dd9bfb746 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4712)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000631871


   
   ## CI report:
   
   * f639f8bdbd7b805f872e810a17ff90c92f940779 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4711)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000610573


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   * f639f8bdbd7b805f872e810a17ff90c92f940779 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4711)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000622429


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4713)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000621666


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4441:
URL: https://github.com/apache/hudi/pull/4441#issuecomment-1000621666


   
   ## CI report:
   
   * 91c40556be8851456ba1c6e184e128b2b185ecbd UNKNOWN
   
   




[jira] [Updated] (HUDI-3085) Refactor fileId & writeHandler logic into partitioner for bulk_insert

2021-12-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-3085:
-
Labels: pull-request-available  (was: )

> Refactor fileId & writeHandler logic into partitioner for bulk_insert
> -
>
> Key: HUDI-3085
> URL: https://issues.apache.org/jira/browse/HUDI-3085
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Yuwei Xiao
> Priority: Major
> Labels: pull-request-available
>
> a better partitioner abstraction for bulk_insert



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [hudi] YuweiXiao opened a new pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2021-12-23 Thread GitBox


YuweiXiao opened a new pull request #4441:
URL: https://github.com/apache/hudi/pull/4441


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before 
opening a pull request.*
   
   ## What is the purpose of the pull request
   
   Restructure the bulk insert partitioner interface so that it also handles 
the fileIdPfx and the write handle factory.
   
   With this improvement, one can implement a new bulk_insert partitioner that 
is capable of routing records to pre-defined fileIds using a customized write 
handle factory (e.g., different write handle factories for different partitions).
   
   ## Brief change log
   
   - Modify interface of `BulkInsertPartitioner`
   - Modify bulk_insert write path (e.g., `AbstractBulkInsertHelper` and its 
subclasses) to make use of the new partitioner interface
   - The Java bulk_insert write path is mostly left untouched because it is a 
special case: it always writes to a single file group (i.e., parallelism=1) and 
has a customized fileId generator, `FileIdPrefixProvider`.
   
   ## Verify this pull request
   
   Added a fileId generation check to the existing tests; the other parts are 
already covered by existing tests such as `TestBulkInsertInternalPartitioner`.
   
   ## Committer checklist
   
- [x] Has a corresponding JIRA in PR title & commit

- [x] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[jira] [Commented] (HUDI-2941) Show _hoodie_operation in spark sql results

2021-12-23 Thread Forward Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464875#comment-17464875
 ] 

Forward Xu commented on HUDI-2941:
--

Hi [~lsy], are you working on a fix for this problem?

> Show _hoodie_operation in spark sql results
> ---
>
> Key: HUDI-2941
> URL: https://issues.apache.org/jira/browse/HUDI-2941
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Raymond Xu
> Assignee: dalongliu
> Priority: Critical
> Labels: user-support-issues
> Fix For: 0.11.0
>
>
> Details in
> [https://github.com/apache/hudi/issues/4160]
>  





[GitHub] [hudi] hudi-bot removed a comment on pull request #4440: [HUDI-3100] Add config for hive conditional sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4440:
URL: https://github.com/apache/hudi/pull/4440#issuecomment-1000603173


   
   ## CI report:
   
   * b4f66eadd1ac123521940c4616fb66572d1c72f9 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4710)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4440: [HUDI-3100] Add config for hive conditional sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4440:
URL: https://github.com/apache/hudi/pull/4440#issuecomment-1000618141


   
   ## CI report:
   
   * b4f66eadd1ac123521940c4616fb66572d1c72f9 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4710)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000610559


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   * 44f1b5daf2bcadd8ef2cc9a3db6d612dd9bfb746 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000611396


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   * 44f1b5daf2bcadd8ef2cc9a3db6d612dd9bfb746 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4712)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000610573


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   * f639f8bdbd7b805f872e810a17ff90c92f940779 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4711)
 
   
   




[jira] [Comment Edited] (HUDI-2661) java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.catalog.CatalogTable.copy

2021-12-23 Thread Forward Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464516#comment-17464516
 ] 

Forward Xu edited comment on HUDI-2661 at 12/24/21, 2:12 AM:
-

Hi [~wordcount] [~biyan900...@gmail.com], I also ran into this problem in our 
company's internal version. It is caused by inconsistent Spark CatalogTable 
parameters. You need to change the Spark version in Hudi to match the Spark 
version used in your environment, and then recompile Hudi to solve the 
problem. In my case, the problem occurred with Spark 2.4.6.


was (Author: x1q1j1):
hi [~wordcount] [~biyan900...@gmail.com] This problem, I also encountered in 
the company's internal version, is caused by inconsistent Spark CatalogTable 
parameters. You need to change the spark in hudi to be consistent with the 
spark version used in your company's environment, and then compile it to solve 
the problem.

> java.lang.NoSuchMethodError: 
> org.apache.spark.sql.catalyst.catalog.CatalogTable.copy
> 
>
> Key: HUDI-2661
> URL: https://issues.apache.org/jira/browse/HUDI-2661
> Project: Apache Hudi
> Issue Type: Bug
> Components: Spark Integration
> Affects Versions: 0.10.0
> Reporter: Changjun Zhang
> Assignee: Yann Byron
> Priority: Critical
> Fix For: 0.11.0
>
> Attachments: image-2021-11-01-21-47-44-538.png, 
> image-2021-11-01-21-48-22-765.png
>
>
> Hudi integration with Spark SQL:
> When I add:
> {code:sh}
> // Some comments here
> spark-sql --conf 
> 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
> --conf 
> 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
> {code}
> to create a table on an existing Hudi table:
> {code:sql}
> create table testdb.tb_hudi_operation_test using hudi 
> location '/tmp/flinkdb/datas/tb_hudi_operation';
> {code}
> an exception is thrown:
>  !image-2021-11-01-21-47-44-538.png|thumbnail! 
>  !image-2021-11-01-21-48-22-765.png|thumbnail! 





[GitHub] [hudi] hudi-bot removed a comment on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000273098


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000610559


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   * 44f1b5daf2bcadd8ef2cc9a3db6d612dd9bfb746 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000609753


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   * f639f8bdbd7b805f872e810a17ff90c92f940779 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000609753


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   * f639f8bdbd7b805f872e810a17ff90c92f940779 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000589041


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   
   




[GitHub] [hudi] xushiyan commented on a change in pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


xushiyan commented on a change in pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#discussion_r774815485



##
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java
##
@@ -309,6 +309,8 @@ public static HiveSyncConfig 
buildHiveSyncConfig(TypedProperties props, String b
 
DataSourceWriteOptions.HIVE_SKIP_RO_SUFFIX_FOR_READ_OPTIMIZED_TABLE().defaultValue()));
 hiveSyncConfig.supportTimestamp = 
Boolean.valueOf(props.getString(DataSourceWriteOptions.HIVE_SUPPORT_TIMESTAMP_TYPE().key(),
 DataSourceWriteOptions.HIVE_SUPPORT_TIMESTAMP_TYPE().defaultValue()));
+hiveSyncConfig.isConditionalSync = 
Boolean.valueOf(props.getString(DataSourceWriteOptions.HIVE_CONDITIONAL_SYNC().key(),
+DataSourceWriteOptions.HIVE_CONDITIONAL_SYNC().defaultValue()));

Review comment:
   irrelevant change.. moving to separate PR

##
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##
@@ -495,6 +495,16 @@ object DataSourceWriteOptions {
 .withDocumentation("‘INT64’ with original type TIMESTAMP_MICROS is 
converted to hive ‘timestamp’ type. " +
   "Disabled by default for backward compatibility.")
 
+  /**
+   * Flag to indicate whether to use conditional syncing in HiveSync.
+   * If set true, the Hive sync procedure will only run if partition or schema 
changes are detected.
+   * By default false.
+   */
+  val HIVE_CONDITIONAL_SYNC: ConfigProperty[String] = ConfigProperty
+.key("hoodie.datasource.hive_sync.conditional_sync")
+.defaultValue("false")
+.withDocumentation("Enables conditional hive sync, where partition or 
schema change must exist to perform sync to hive.")

Review comment:
   irrelevant change.. moving to separate PR
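
The option in the hunk above can be read as a simple gate in front of the sync step. The following is a minimal sketch under stated assumptions, not Hudi's implementation: the key and the "false" default are taken from the quoted diff, while the `ConditionalSyncSketch` class and its `shouldSync` helper are hypothetical names introduced only for illustration.

```java
import java.util.Properties;

/** Illustrative only: models the conditional-sync flag from the quoted diff. */
final class ConditionalSyncSketch {
  // Key and default value as they appear in the quoted DataSourceOptions.scala hunk.
  static final String CONDITIONAL_SYNC_KEY = "hoodie.datasource.hive_sync.conditional_sync";
  static final String CONDITIONAL_SYNC_DEFAULT = "false";

  /** Reads the flag from properties, falling back to the default when unset. */
  static boolean isConditionalSync(Properties props) {
    return Boolean.parseBoolean(props.getProperty(CONDITIONAL_SYNC_KEY, CONDITIONAL_SYNC_DEFAULT));
  }

  /**
   * Hypothetical decision helper: with conditional sync enabled, sync only
   * when a schema or partition change was detected; otherwise always sync.
   */
  static boolean shouldSync(boolean conditionalSync, boolean schemaChanged, boolean partitionsChanged) {
    return !conditionalSync || schemaChanged || partitionsChanged;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(CONDITIONAL_SYNC_KEY, "true");
    boolean conditional = isConditionalSync(props);
    System.out.println("sync with no changes: " + shouldSync(conditional, false, false));
  }
}
```

With the flag left at its default, `shouldSync` always returns true, which matches the unconditional behavior described as the backward-compatible case.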
   
   








[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


xiarixiaoyao commented on a change in pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#discussion_r774838809



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/AlterHoodieTableDropPartitionCommand.scala
##
@@ -33,7 +35,10 @@ import org.apache.spark.sql.{AnalysisException, Row, 
SaveMode, SparkSession}
 
 case class AlterHoodieTableDropPartitionCommand(
 tableIdentifier: TableIdentifier,
-specs: Seq[TablePartitionSpec])
+specs: Seq[TablePartitionSpec],
+ifExists : scala.Boolean,
+purge : scala.Boolean,

Review comment:
   Why not use Boolean directly








[GitHub] [hudi] hudi-bot removed a comment on pull request #4440: [HUDI-3100] Add config for hive conditional sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4440:
URL: https://github.com/apache/hudi/pull/4440#issuecomment-1000602519


   
   ## CI report:
   
   * b4f66eadd1ac123521940c4616fb66572d1c72f9 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4440: [HUDI-3100] Add config for hive conditional sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4440:
URL: https://github.com/apache/hudi/pull/4440#issuecomment-1000603173


   
   ## CI report:
   
   * b4f66eadd1ac123521940c4616fb66572d1c72f9 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4710)
 
   
   




[jira] [Commented] (HUDI-2199) DynamoDB based external index implementation

2021-12-23 Thread Biswajit mohapatra (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464856#comment-17464856
 ] 

Biswajit mohapatra commented on HUDI-2199:
--

Insert and upsert are complete and testing is in progress; rollback is still 
pending.

DynamoDB can only search efficiently by hash key and sort key, and both of 
them are already used by the Hudi primary key and partition key, so a GSI 
(global secondary index) on commit_ts needs to be created for efficient rollback.
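
The idea in the comment above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions and does not use the DynamoDB API: the class `CommitIndexSketch`, its method names, and the choice of partition path plus record key as the composite lookup key are all hypothetical. The second map plays the role that a secondary index on commit_ts would play, so a rollback can find the entries written by one commit without scanning the whole table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for an external index table; not the DynamoDB API. */
final class CommitIndexSketch {
  /** Value stored per key: the file ID plus the commit that wrote it. */
  private static final class Entry {
    final String fileId;
    final String commitTs;

    Entry(String fileId, String commitTs) {
      this.fileId = fileId;
      this.commitTs = commitTs;
    }
  }

  // Base table: composite key -> entry (models lookups by hash key + sort key).
  private final Map<String, Entry> entries = new HashMap<>();
  // Secondary index: commitTs -> keys written by that commit (the GSI's role).
  private final Map<String, List<String>> keysByCommitTs = new HashMap<>();

  private static String compositeKey(String partitionPath, String recordKey) {
    return partitionPath + "#" + recordKey;
  }

  void put(String partitionPath, String recordKey, String fileId, String commitTs) {
    String key = compositeKey(partitionPath, recordKey);
    entries.put(key, new Entry(fileId, commitTs));
    keysByCommitTs.computeIfAbsent(commitTs, ts -> new ArrayList<>()).add(key);
  }

  /** Returns the file ID for a record, or null when the record is not indexed. */
  String lookup(String partitionPath, String recordKey) {
    Entry entry = entries.get(compositeKey(partitionPath, recordKey));
    return entry == null ? null : entry.fileId;
  }

  /** Removes the mappings written by commitTs and returns how many were removed. */
  int rollback(String commitTs) {
    List<String> keys = keysByCommitTs.remove(commitTs);
    if (keys == null) {
      return 0;
    }
    int removed = 0;
    for (String key : keys) {
      Entry entry = entries.get(key);
      // Skip keys that a later commit has since overwritten.
      if (entry != null && entry.commitTs.equals(commitTs)) {
        entries.remove(key);
        removed++;
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    CommitIndexSketch index = new CommitIndexSketch();
    index.put("2021/12/23", "k1", "file-1", "001");
    index.put("2021/12/23", "k2", "file-2", "002");
    System.out.println("removed by rollback of 002: " + index.rollback("002"));
  }
}
```

This sketch deletes rolled-back entries outright; restoring the mapping that an upsert overwrote is outside its scope.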

> DynamoDB based external index implementation
> 
>
> Key: HUDI-2199
> URL: https://issues.apache.org/jira/browse/HUDI-2199
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Index
>Reporter: Vinoth Chandar
>Assignee: Biswajit mohapatra
>Priority: Major
>
> We have an HBaseIndex that provides users with the ability to store fileID <=> 
> recordKey mappings in an external KV store, for fast lookups during upsert 
> operations. We can potentially create a similar one for DynamoDB. 
> We just use a single column family in HBase, so we should be able to largely 
> re-use the code/key-value schema across them. 





[GitHub] [hudi] hudi-bot commented on pull request #4440: [HUDI-3100] Add config for hive conditional sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4440:
URL: https://github.com/apache/hudi/pull/4440#issuecomment-1000602519


   
   ## CI report:
   
   * b4f66eadd1ac123521940c4616fb66572d1c72f9 UNKNOWN
   
   




[jira] [Updated] (HUDI-3100) Hive Conditional sync cannot be set from deltastreamer

2021-12-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-3100:
-
Labels: pull-request-available  (was: )

> Hive Conditional sync cannot be set from deltastreamer
> --
>
> Key: HUDI-3100
> URL: https://issues.apache.org/jira/browse/HUDI-3100
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: DeltaStreamer, Hive Integration
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0, 0.10.1
>
>






[GitHub] [hudi] xushiyan opened a new pull request #4440: [HUDI-3100] Add config for hive conditional sync

2021-12-23 Thread GitBox


xushiyan opened a new pull request #4440:
URL: https://github.com/apache/hudi/pull/4440


   
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[jira] [Assigned] (HUDI-3100) Hive Conditional sync cannot be set from deltastreamer

2021-12-23 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-3100:


Assignee: Raymond Xu

> Hive Conditional sync cannot be set from deltastreamer
> --
>
> Key: HUDI-3100
> URL: https://issues.apache.org/jira/browse/HUDI-3100
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: DeltaStreamer, Hive Integration
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.11.0, 0.10.1
>
>






[jira] [Created] (HUDI-3100) Hive Conditional sync cannot be set from deltastreamer

2021-12-23 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-3100:


 Summary: Hive Conditional sync cannot be set from deltastreamer
 Key: HUDI-3100
 URL: https://issues.apache.org/jira/browse/HUDI-3100
 Project: Apache Hudi
  Issue Type: Bug
  Components: DeltaStreamer, Hive Integration
Reporter: Raymond Xu








[jira] [Updated] (HUDI-3100) Hive Conditional sync cannot be set from deltastreamer

2021-12-23 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3100:
-
Fix Version/s: 0.11.0
   0.10.1

> Hive Conditional sync cannot be set from deltastreamer
> --
>
> Key: HUDI-3100
> URL: https://issues.apache.org/jira/browse/HUDI-3100
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: DeltaStreamer, Hive Integration
>Reporter: Raymond Xu
>Priority: Major
> Fix For: 0.11.0, 0.10.1
>
>






[GitHub] [hudi] waywtdcc opened a new issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error

2021-12-23 Thread GitBox


waywtdcc opened a new issue #4439:
URL: https://github.com/apache/hudi/issues/4439


   
   
   **Describe the problem you faced**
   
   `org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'hoodie_stream_write' (operator a75843bfd239577dd73ed4d150d8cfdb).
       at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:553)
       at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$start$0(StreamWriteOperatorCoordinator.java:170)
       at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:103)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: Executor executes action [commits the instant 20211221143230391] error
       ... 5 more
   Caused by: org.apache.hudi.exception.HoodieRollbackException: Failed to rollback hdfs://grgbanking/user/hive/warehouse/hudi.db/datagen_hudi1 commits 20211221141250745
       at org.apache.hudi.client.AbstractHoodieWriteClient.rollback(AbstractHoodieWriteClient.java:655)
       at org.apache.hudi.client.AbstractHoodieWriteClient.rollbackFailedWrites(AbstractHoodieWriteClient.java:946)
       at org.apache.hudi.client.AbstractHoodieWriteClient.rollbackFailedWrites(AbstractHoodieWriteClient.java:929)
       at org.apache.hudi.client.AbstractHoodieWriteClient.rollbackFailedWrites(AbstractHoodieWriteClient.java:917)
       at org.apache.hudi.client.AbstractHoodieWriteClient.lambda$startCommitWithTime$97cdbdca$1(AbstractHoodieWriteClient.java:810)
       at org.apache.hudi.common.util.CleanerUtils.rollbackFailedWrites(CleanerUtils.java:143)
       at org.apache.hudi.client.AbstractHoodieWriteClient.startCommitWithTime(AbstractHoodieWriteClient.java:809)
       at org.apache.hudi.client.AbstractHoodieWriteClient.startCommitWithTime(AbstractHoodieWriteClient.java:802)
       at org.apache.hudi.sink.StreamWriteOperatorCoordinator.startInstant(StreamWriteOperatorCoordinator.java:334)
       at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$notifyCheckpointComplete$2(StreamWriteOperatorCoordinator.java:234)
       at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:93)
       ... 3 more
   Caused by: java.lang.IllegalArgumentException: Cannot use marker based rollback strategy on completed instant:[20211221141250745__commit__COMPLETED]
       at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
       at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.<init>(BaseRollbackActionExecutor.java:90)
       at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.<init>(BaseRollbackActionExecutor.java:71)
       at org.apache.hudi.table.action.rollback.MergeOnReadRollbackActionExecutor.<init>(MergeOnReadRollbackActionExecutor.java:48)
       at org.apache.hudi.table.HoodieFlinkMergeOnReadTable.rollback(HoodieFlinkMergeOnReadTable.java:131)
       at org.apache.hudi.client.AbstractHoodieWriteClient.rollback(AbstractHoodieWriteClient.java:640)
       ... 13 more
   `
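
The innermost cause in the trace above is an argument check that rejects a marker-based rollback of an instant that is already completed. The following is a rough sketch of that kind of guard, assuming only what the trace shows (a `checkArgument` call in the rollback executor's constructor and the exception message). The `RollbackGuardSketch` class, the `InstantInfo` type, and the `useMarkerBasedStrategy` parameter are hypothetical names for illustration, not Hudi's actual code.

```java
/** Illustrative guard modeled on the exception message in the trace above. */
final class RollbackGuardSketch {
  /** Minimal stand-in for a timeline instant; the fields are assumptions. */
  static final class InstantInfo {
    final String timestamp;
    final String action;
    final boolean completed;

    InstantInfo(String timestamp, String action, boolean completed) {
      this.timestamp = timestamp;
      this.action = action;
      this.completed = completed;
    }

    @Override
    public String toString() {
      return "[" + timestamp + "__" + action + "__" + (completed ? "COMPLETED" : "INFLIGHT") + "]";
    }
  }

  /** Throws IllegalArgumentException with the given message when the expression is false. */
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  /** Rejects marker-based rollback for instants that have already completed. */
  static void validateRollback(InstantInfo instant, boolean useMarkerBasedStrategy) {
    if (useMarkerBasedStrategy) {
      checkArgument(!instant.completed,
          "Cannot use marker based rollback strategy on completed instant:" + instant);
    }
  }

  public static void main(String[] args) {
    InstantInfo completed = new InstantInfo("20211221141250745", "commit", true);
    try {
      validateRollback(completed, true);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under this reading, the failure means the coordinator tried to roll back a commit that the timeline already marks as completed while a marker-based strategy was selected.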
   
   **To Reproduce**
   
   **Expected behavior**
   
   A clear and concise description of what you expected to happen.
   
   **Environment Description**
   
   * Hudi version : 0.10
   
   * Flink version: 1.13.5
   
   
   






[GitHub] [hudi] hudi-bot commented on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000589041


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000568051


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   
   




[GitHub] [hudi] xushiyan commented on a change in pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


xushiyan commented on a change in pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#discussion_r774815496



##
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##
@@ -495,6 +495,16 @@ object DataSourceWriteOptions {
 .withDocumentation("‘INT64’ with original type TIMESTAMP_MICROS is 
converted to hive ‘timestamp’ type. " +
   "Disabled by default for backward compatibility.")
 
+  /**
+   * Flag to indicate whether to use conditional syncing in HiveSync.
+   * If set true, the Hive sync procedure will only run if partition or schema 
changes are detected.
+   * By default false.
+   */
+  val HIVE_CONDITIONAL_SYNC: ConfigProperty[String] = ConfigProperty
+.key("hoodie.datasource.hive_sync.conditional_sync")
+.defaultValue("false")
+.withDocumentation("Enables conditional hive sync, where partition or 
schema change must exist to perform sync to hive.")

Review comment:
   irrelevant change.. moving to separate PR
   
   

##
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java
##
@@ -309,6 +309,8 @@ public static HiveSyncConfig 
buildHiveSyncConfig(TypedProperties props, String b
 
DataSourceWriteOptions.HIVE_SKIP_RO_SUFFIX_FOR_READ_OPTIMIZED_TABLE().defaultValue()));
 hiveSyncConfig.supportTimestamp = 
Boolean.valueOf(props.getString(DataSourceWriteOptions.HIVE_SUPPORT_TIMESTAMP_TYPE().key(),
 DataSourceWriteOptions.HIVE_SUPPORT_TIMESTAMP_TYPE().defaultValue()));
+hiveSyncConfig.isConditionalSync = 
Boolean.valueOf(props.getString(DataSourceWriteOptions.HIVE_CONDITIONAL_SYNC().key(),
+DataSourceWriteOptions.HIVE_CONDITIONAL_SYNC().defaultValue()));

Review comment:
   irrelevant change.. moving to separate PR








[GitHub] [hudi] hudi-bot commented on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000568051


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4709)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000567466


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4438:
URL: https://github.com/apache/hudi/pull/4438#issuecomment-1000567466


   
   ## CI report:
   
   * 81236ff25e47018f1e152053c2f0ad31b8a42ed2 UNKNOWN
   
   




[jira] [Updated] (HUDI-2989) Hive sync to Glue tables not updating S3 location

2021-12-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-2989:
-
Labels: pull-request-available  (was: )

> Hive sync to Glue tables not updating S3 location
> -
>
> Key: HUDI-2989
> URL: https://issues.apache.org/jira/browse/HUDI-2989
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Hive Integration
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.11.0, 0.10.1
>
>






[GitHub] [hudi] xushiyan opened a new pull request #4438: [HUDI-2989] Update location during hive sync

2021-12-23 Thread GitBox


xushiyan opened a new pull request #4438:
URL: https://github.com/apache/hudi/pull/4438


   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   






[GitHub] [hudi] gunjdesai opened a new issue #4437: Example for CREATE TABLE on TRINO using HUDI [QUESTION]

2021-12-23 Thread GitBox


gunjdesai opened a new issue #4437:
URL: https://github.com/apache/hudi/issues/4437


   I am using `Spark Structured Streaming (3.1.1)` to read data from `Kafka` 
and `HUDI (0.8.0)` as the storage system on S3, partitioning the data by 
date. (No problems with this part.)
   
   I would like to use `Trino (355)` to query that data. As a 
prerequisite, I've already placed the `hudi-presto-bundle-0.8.0.jar` in 
`/data/trino/hive/`.
   
   I created a table with the following schema:
   
   ```
   CREATE TABLE table_new (
 columns, dt
   ) WITH (
 partitioned_by = ARRAY['dt'], 
 external_location = 's3a://bucket/location/',
 format = 'parquet'
   );
   ```
   Even after calling the procedure below, Trino is unable to discover any 
partitions:
   
   ```
   CALL system.sync_partition_metadata('schema', 'table_new', 'ALL')
   ```
   My assessment is that I cannot create a table in Trino using Hudi, 
largely because I am not passing the right values in the 
`WITH` options. I also cannot find a CREATE TABLE example in the 
Hudi documentation.
   
   I would really appreciate it if anyone could give me an example, or point 
me in the right direction in case I've missed anything.
   
   Really appreciate the help.
   
   **Environment Description**
   
   * Hudi version : 0.8.0
   
   * Spark version : 3.1.1
   
   * Hive version : 3.0.0 (Metastore only)
   
   * Hadoop version : 3.2.0
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : Yes
   
   This is more of a question than an issue, sorry couldn't apply a label for 
it.
   
   






[GitHub] [hudi] manojpec commented on a change in pull request #4352: [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups

2021-12-23 Thread GitBox


manojpec commented on a change in pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#discussion_r774718997



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
##
@@ -380,4 +382,76 @@ public Boolean apply(String recordKey) {
 
 return val;
   }
+
+  /**
+   * Parse min/max statistics stored in parquet footers for all columns.
+   */
+  public Collection> 
readColumnStatsFromParquetMetadata(Configuration conf,
+   
   String partitionPath,

Review comment:
   sure @xiarixiaoyao, will take a look at readRangeFromParquetMetadata.








[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000374092


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000410909


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000366547


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000374092


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000366547


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000361588


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   
   




[GitHub] [hudi] stym06 commented on issue #4318: [SUPPORT] Duplicate records in COW table within same partition path

2021-12-23 Thread GitBox


stym06 commented on issue #4318:
URL: https://github.com/apache/hudi/issues/4318#issuecomment-1000362919


   This is set up on EMR. Is there a doc to set it up in local env?






[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000361588


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000306233


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#issuecomment-1000290053


   
   ## CI report:
   
   * eca826ac6531ec16762c5d8216a1333ec5e4ef1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4428)
 
   * 71f26552ed0e36d6d9c1e4ed71a81f56327f6570 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4707)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#issuecomment-1000325662


   
   ## CI report:
   
   * 71f26552ed0e36d6d9c1e4ed71a81f56327f6570 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4707)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000306233


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000266792


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000265115


   
   ## CI report:
   
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4705)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot commented on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000298801


   
   ## CI report:
   
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4705)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#issuecomment-1000288267


   
   ## CI report:
   
   * eca826ac6531ec16762c5d8216a1333ec5e4ef1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4428)
 
   * 71f26552ed0e36d6d9c1e4ed71a81f56327f6570 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#issuecomment-1000290053


   
   ## CI report:
   
   * eca826ac6531ec16762c5d8216a1333ec5e4ef1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4428)
 
   * 71f26552ed0e36d6d9c1e4ed71a81f56327f6570 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4707)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#issuecomment-996689997


   
   ## CI report:
   
   * eca826ac6531ec16762c5d8216a1333ec5e4ef1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4428)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#issuecomment-1000288267


   
   ## CI report:
   
   * eca826ac6531ec16762c5d8216a1333ec5e4ef1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4428)
 
   * 71f26552ed0e36d6d9c1e4ed71a81f56327f6570 UNKNOWN
   
   




[GitHub] [hudi] harsh1231 commented on a change in pull request #4353: [HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t…

2021-12-23 Thread GitBox


harsh1231 commented on a change in pull request #4353:
URL: https://github.com/apache/hudi/pull/4353#discussion_r774554053



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroDFSSource.java
##
@@ -45,6 +45,7 @@
   public AvroDFSSource(TypedProperties props, JavaSparkContext sparkContext, SparkSession sparkSession,
   SchemaProvider schemaProvider) throws IOException {
 super(props, sparkContext, sparkSession, schemaProvider);
+sparkContext.hadoopConfiguration().set("avro.schema.input.key",schemaProvider.getSourceSchema().toString());

Review comment:
   Fixed it 








[GitHub] [hudi] hudi-bot removed a comment on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000243090


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4436: [HUDI-3099] Purge drop partition for spark sql

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4436:
URL: https://github.com/apache/hudi/pull/4436#issuecomment-1000273098


   
   ## CI report:
   
   * 9206ca5fdaa848ea7c7947a80c8fc418aa70fadc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4704)
 
   
   




[jira] [Comment Edited] (HUDI-2661) java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.catalog.CatalogTable.copy

2021-12-23 Thread Forward Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464516#comment-17464516
 ] 

Forward Xu edited comment on HUDI-2661 at 12/23/21, 12:20 PM:
--

Hi [~wordcount] [~biyan900...@gmail.com], I also ran into this problem in our 
company's internal version. It is caused by inconsistent Spark CatalogTable 
parameters. To fix it, change the Spark version that Hudi builds against so 
that it matches the Spark version used in your environment, then recompile Hudi.


was (Author: x1q1j1):
Hi [~wordcount] [~biyan900...@gmail.com], I also ran into this problem in our 
company's internal version. It is caused by inconsistent Spark CatalogTable 
parameters.

> java.lang.NoSuchMethodError: 
> org.apache.spark.sql.catalyst.catalog.CatalogTable.copy
> 
>
> Key: HUDI-2661
> URL: https://issues.apache.org/jira/browse/HUDI-2661
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Spark Integration
>Affects Versions: 0.10.0
>Reporter: Changjun Zhang
>Assignee: Yann Byron
>Priority: Critical
> Fix For: 0.11.0
>
> Attachments: image-2021-11-01-21-47-44-538.png, 
> image-2021-11-01-21-48-22-765.png
>
>
> Hudi integration with Spark SQL:
> When I start spark-sql with:
> {code:sh}
> spark-sql --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
>   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
> {code}
> and create a table on an existing Hudi table:
> {code:sql}
> create table testdb.tb_hudi_operation_test using hudi 
> location '/tmp/flinkdb/datas/tb_hudi_operation';
> {code}
> an exception is thrown:
>  !image-2021-11-01-21-47-44-538.png|thumbnail! 
>  !image-2021-11-01-21-48-22-765.png|thumbnail! 
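
A `NoSuchMethodError` like the one in this issue usually means the method signature present at runtime differs from the one the calling code was compiled against. The following is only a rough diagnostic sketch in plain Java, not part of Hudi or Spark: the class `MethodSignatureProbe` and its method are hypothetical names, and it uses nothing but standard reflection to list how many parameters each public overload of a method takes on the current classpath.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not a Hudi or Spark API.
public class MethodSignatureProbe {

    // Returns the parameter counts of every public method named methodName
    // on className, sorted ascending. Returns an empty list if the class
    // cannot be loaded from the current classpath.
    public static List<Integer> parameterCounts(String className, String methodName) {
        List<Integer> counts = new ArrayList<>();
        Class<?> cls;
        try {
            cls = Class.forName(className);
        } catch (ClassNotFoundException e) {
            return counts; // class not on the classpath
        }
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(methodName)) {
                counts.add(m.getParameterCount());
            }
        }
        Collections.sort(counts);
        return counts;
    }

    public static void main(String[] args) {
        // Defaults use a JDK class so the sketch runs without Spark on the classpath.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        String methodName = args.length > 1 ? args[1] : "substring";
        System.out.println(className + "#" + methodName + " parameter counts: "
            + parameterCounts(className, methodName));
    }
}
```

Run with the Spark jars of the target environment on the classpath and the arguments `org.apache.spark.sql.catalyst.catalog.CatalogTable copy`, this would show how many parameters `copy` takes in that Spark version. Comparing the result across the Spark version Hudi was built against and the one actually deployed is one way to confirm the mismatch described in the comment above.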



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000257502


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000266792


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4706)
 
   
   




[jira] [Commented] (HUDI-2661) java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.catalog.CatalogTable.copy

2021-12-23 Thread Forward Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464516#comment-17464516
 ] 

Forward Xu commented on HUDI-2661:
--

Hi [~wordcount] [~biyan900...@gmail.com], I also ran into this problem in our 
company's internal version. It is caused by inconsistent Spark CatalogTable 
parameters.






[GitHub] [hudi] hudi-bot commented on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot commented on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000265115


   
   ## CI report:
   
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4705)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000247211


   
   ## CI report:
   
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 UNKNOWN
   
   




[GitHub] [hudi] harsh1231 commented on a change in pull request #4342: [HUDI-735] Fixing error messages on record key not found

2021-12-23 Thread GitBox


harsh1231 commented on a change in pull request #4342:
URL: https://github.com/apache/hudi/pull/4342#discussion_r774528025



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -229,7 +229,11 @@ object HoodieSparkSqlWriter {
 }
 sparkContext.getConf.registerAvroSchemas(schema)
 log.info(s"Registered avro schema : ${schema.toString(true)}")
-
+val columnSet = df.columns.toSet
+keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
+  throw new Exception(s"record key '$fieldName' does not exist in existing table schema :  ${schema.toString(true)}")

Review comment:
   Done
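
The check added in this diff can be sketched on its own, without Spark or Hudi. This is only an illustration under stated assumptions: the class `RecordKeyCheck` and its methods are hypothetical names, the column array stands in for `df.columns`, and the key list stands in for the key generator's record key field names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical standalone sketch; names are illustrative, not Hudi API.
public class RecordKeyCheck {

    // Returns the configured record key fields that are absent from the incoming columns.
    public static List<String> missingRecordKeys(List<String> recordKeyFields, String[] columns) {
        Set<String> columnSet = new HashSet<>(Arrays.asList(columns));
        List<String> missing = new ArrayList<>();
        for (String field : recordKeyFields) {
            if (!columnSet.contains(field)) {
                missing.add(field);
            }
        }
        return missing;
    }

    // Fails fast with a message that names every missing key field.
    public static void validate(List<String> recordKeyFields, String[] columns) {
        List<String> missing = missingRecordKeys(recordKeyFields, columns);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                "record key field(s) " + missing + " not found in columns " + Arrays.toString(columns));
        }
    }

    public static void main(String[] args) {
        String[] columns = {"id", "name", "ts"};
        validate(Arrays.asList("id"), columns); // passes silently
        System.out.println(missingRecordKeys(Arrays.asList("id", "uuid"), columns)); // prints [uuid]
    }
}
```

Collecting all missing fields before throwing, rather than failing on the first one as the diff does, is a design choice of this sketch; it lets a single error message report every misconfigured key.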








[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000250274


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000257502


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   * 9cbcdce74e36f936d985a7dcd75d851c71021293 UNKNOWN
   
   




[GitHub] [hudi] leesf commented on a change in pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


leesf commented on a change in pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#discussion_r774396525



##
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadSnapshotRelation.scala
##
@@ -151,9 +151,8 @@ class MergeOnReadSnapshotRelation(val sqlContext: 
SQLContext,
   // Load files from the global paths if it has defined to be compatible 
with the original mode
   val inMemoryFileIndex = 
HoodieSparkUtils.createInMemoryFileIndex(sqlContext.sparkSession, globPaths.get)
   val fsView = new HoodieTableFileSystemView(metaClient,
-// file-slice after pending compaction-requested instant-time is also 
considered valid
-
metaClient.getCommitsAndCompactionTimeline.filterCompletedAndCompactionInstants,
-inMemoryFileIndex.allFiles().toArray)
+metaClient.getActiveTimeline.getCommitsTimeline

Review comment:
   I did not intend to change it; I only moved it from the 
spark-datasource/hudi-spark module to the spark-datasource/hudi-spark-common 
module, so it is strange that it shows up as changed. 








[GitHub] [hudi] leesf commented on a change in pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


leesf commented on a change in pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#discussion_r774522685



##
File path: 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/HoodieSqlCommonUtils.scala
##
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hudi
+
+import scala.collection.JavaConverters._
+import java.net.URI
+import java.util.{Date, Locale, Properties}
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+
+import org.apache.hudi.{AvroConversionUtils, SparkAdapterSupport}
+import org.apache.hudi.client.common.HoodieSparkEngineContext
+import org.apache.hudi.common.config.DFSPropertiesConfiguration
+import org.apache.hudi.common.config.HoodieMetadataConfig
+import org.apache.hudi.common.fs.FSUtils
+import org.apache.hudi.common.model.HoodieRecord
+import org.apache.hudi.common.table.{HoodieTableMetaClient, 
TableSchemaResolver}
+import org.apache.hudi.common.table.timeline.{HoodieActiveTimeline, 
HoodieInstantTimeGenerator}
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.sql.{Column, DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTableType}
+import org.apache.spark.sql.catalyst.expressions.{And, Attribute, Cast, 
Expression, Literal}
+import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, SubqueryAlias}
+import org.apache.spark.sql.execution.datasources.LogicalRelation
+import org.apache.spark.sql.internal.{SQLConf, StaticSQLConf}
+import org.apache.spark.api.java.JavaSparkContext
+import org.apache.spark.sql.types.{DataType, NullType, StringType, 
StructField, StructType}
+
+import java.text.SimpleDateFormat
+
+import scala.collection.immutable.Map
+
+object HoodieSqlCommonUtils extends SparkAdapterSupport {
+  // NOTE: {@code SimpleDateFormat} is NOT thread-safe
+  // TODO replace w/ DateTimeFormatter
+  private val defaultDateFormat =
+  ThreadLocal.withInitial(new java.util.function.Supplier[SimpleDateFormat] {
+override def get() = new SimpleDateFormat("yyyy-MM-dd")
+  })
+
+  def isHoodieTable(table: CatalogTable): Boolean = {
+table.provider.map(_.toLowerCase(Locale.ROOT)).orNull == "hudi"
+  }
+
+  def isHoodieTable(tableId: TableIdentifier, spark: SparkSession): Boolean = {
+val table = spark.sessionState.catalog.getTableMetadata(tableId)
+isHoodieTable(table)
+  }
+
+  def isHoodieTable(table: LogicalPlan, spark: SparkSession): Boolean = {
+tripAlias(table) match {
+  case LogicalRelation(_, _, Some(tbl), _) => isHoodieTable(tbl)
+  case relation: UnresolvedRelation =>
+isHoodieTable(sparkAdapter.toTableIdentify(relation), spark)
+  case _=> false
+}
+  }
+
+  def getTableIdentify(table: LogicalPlan): TableIdentifier = {
+table match {
+  case SubqueryAlias(name, _) => sparkAdapter.toTableIdentify(name)
+  case _ => throw new IllegalArgumentException(s"Illegal table: $table")
+}
+  }
+
+  def getTableSqlSchema(metaClient: HoodieTableMetaClient,
+includeMetadataFields: Boolean = false): 
Option[StructType] = {
+val schemaResolver = new TableSchemaResolver(metaClient)
+val avroSchema = try 
Some(schemaResolver.getTableAvroSchema(includeMetadataFields))
+catch {
+  case _: Throwable => None
+}
+avroSchema.map(AvroConversionUtils.convertAvroSchemaToStructType)
+  }
+
+  def getAllPartitionPaths(spark: SparkSession, table: CatalogTable): 
Seq[String] = {
+val sparkEngine = new HoodieSparkEngineContext(new 
JavaSparkContext(spark.sparkContext))
+val metadataConfig = {
+  val properties = new Properties()
+  properties.putAll((spark.sessionState.conf.getAllConfs ++ 
table.storage.properties ++ table.properties).asJava)
+  HoodieMetadataConfig.newBuilder.fromProperties(properties).build()
+}
+FSUtils.getAllPartitionPaths(sparkEngine, metadataConfig, 
getTableLocation(table, spark)).asScala
+  }
+
+  /**
+   * This method is used to compatible with the old non-hive-styled partition 

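The TODO in the quoted diff above (replace the thread-local `SimpleDateFormat` with `DateTimeFormatter`) could look roughly like the sketch below. This is not code from the PR: the object name `DateFormatSketch`, the helpers `formatDate` and `parseDate`, and the `yyyy-MM-dd` pattern are assumptions made for illustration only.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical stand-in for the date handling in HoodieSqlCommonUtils.
object DateFormatSketch {
  // DateTimeFormatter is immutable and thread-safe, so one shared instance
  // is enough and no ThreadLocal wrapper is required.
  // The pattern is an assumption, not taken from the PR.
  private val defaultDateFormat: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd")

  // Format a date with the shared formatter.
  def formatDate(date: LocalDate): String = defaultDateFormat.format(date)

  // Parse text with the same pattern; throws DateTimeParseException on bad input.
  def parseDate(text: String): LocalDate = LocalDate.parse(text, defaultDateFormat)
}
```

Unlike the `ThreadLocal` approach, this holds a single formatter for all threads. Callers that still pass `java.util.Date` would need a conversion step, which is left out here.
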




[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000234937


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000250274


   
   ## CI report:
   
   * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
   * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
   * 7350710328487b78eaf595d1e092411b6e5b278b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4701)
 
   * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000245726


   
   ## CI report:
   
   * 3e704df22d2756a2e2bff86368967235dbb8b221 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4693)
 
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot commented on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000247211


   
   ## CI report:
   
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000231348


   
   ## CI report:
   
   * 3e704df22d2756a2e2bff86368967235dbb8b221 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4693)
 
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #3957: [HUDI-2688][RFC-40] A new Hudi connector for Trino

2021-12-23 Thread GitBox


hudi-bot commented on pull request #3957:
URL: https://github.com/apache/hudi/pull/3957#issuecomment-1000245726


   
   ## CI report:
   
   * 3e704df22d2756a2e2bff86368967235dbb8b221 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4693)
 
   * 018383c35183c590f21e494b7db6b31420bd9662 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4703)
 
   * ecb89c3de7d29a0002e1306cd0228f489296c4a1 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4435: [Minor] remove unused method in HoodieActiveTimeline

2021-12-23 Thread GitBox


hudi-bot removed a comment on pull request #4435:
URL: https://github.com/apache/hudi/pull/4435#issuecomment-1000209590


   
   ## CI report:
   
   * 191e166e02d6107290c9d12f668b08b932ac2edc Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4702)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4435: [Minor] remove unused method in HoodieActiveTimeline

2021-12-23 Thread GitBox


hudi-bot commented on pull request #4435:
URL: https://github.com/apache/hudi/pull/4435#issuecomment-1000244477


   
   ## CI report:
   
   * 191e166e02d6107290c9d12f668b08b932ac2edc Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4702)
 
   
   




[jira] [Commented] (HUDI-1387) [UMBRELLA] Support Apache Calcite for writing/querying Hudi datasets

2021-12-23 Thread Forward Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464487#comment-17464487
 ] 

Forward Xu commented on HUDI-1387:
--

This sounds very good. Thinking about it, there are probably two big pieces of 
work: first, the parser should support the commonly used OLAP engines such as 
Flink/Spark/Presto; second, the parsed result should be converted into Hudi's 
own execution plan.

> [UMBRELLA] Support Apache Calcite for writing/querying Hudi datasets
> 
>
> Key: HUDI-1387
> URL: https://issues.apache.org/jira/browse/HUDI-1387
> Project: Apache Hudi
>  Issue Type: New Feature
>  Components: Common Core, Writer Core
>Reporter: Raymond Xu
>Priority: Major
>  Labels: gsoc, gsoc2021, hudi-umbrellas, mentor
>
> (More details to be added)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

