[jira] [Assigned] (HUDI-5422) Control KEPP_LATEST_VERSIONS clean replaced files immediately or not

2022-12-19 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang reassigned HUDI-5422: --- Assignee: Yue Zhang > Control KEPP_LATEST_VERSIONS clean replaced files immediately or not >

[jira] [Updated] (HUDI-5422) Control KEPP_LATEST_VERSIONS clean replaced files immediately or not

2022-12-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5422: - Labels: pull-request-available (was: ) > Control KEPP_LATEST_VERSIONS clean replaced files

[GitHub] [hudi] zhangyue19921010 opened a new pull request, #7519: [HUDI-5422] Control KEPP_LATEST_VERSIONS clean replaced files immediately or delete after a while

2022-12-19 Thread GitBox
zhangyue19921010 opened a new pull request, #7519: URL: https://github.com/apache/hudi/pull/7519 ### Change Logs At present, when Hudi cleaner uses the KEEP_LATEST_FILE_VERSIONS strategy to clean the Hudi table, hoodie will assume that once replaced a file group automatically

[jira] [Created] (HUDI-5422) Control KEPP_LATEST_VERSIONS clean replaced files immediately or not

2022-12-19 Thread Yue Zhang (Jira)
Yue Zhang created HUDI-5422: --- Summary: Control KEPP_LATEST_VERSIONS clean replaced files immediately or not Key: HUDI-5422 URL: https://issues.apache.org/jira/browse/HUDI-5422 Project: Apache Hudi

[GitHub] [hudi] zhangyue19921010 commented on pull request #7517: [HUDI-5420] Fix metadata table validator to exclude uncommitted log files due to retry

2022-12-19 Thread GitBox
zhangyue19921010 commented on PR #7517: URL: https://github.com/apache/hudi/pull/7517#issuecomment-1358946971 Ack! Will review this pr later this week. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [hudi] jomach commented on issue #5054: [SUPPORT] Hudi Failing with out of memory issue on Glue with >300 Mio. Records

2022-12-19 Thread GitBox
jomach commented on issue #5054: URL: https://github.com/apache/hudi/issues/5054#issuecomment-1358946467 I have the same issue with 2GB of input. https://user-images.githubusercontent.com/4804546/208608830-f7d41b8a-fae0-4414-a29a-a390a174d7b3.png

[GitHub] [hudi] danny0405 commented on a diff in pull request #7405: [HUDI-5341] IncrementalCleaning consider clustering completed later

2022-12-19 Thread GitBox
danny0405 commented on code in PR #7405: URL: https://github.com/apache/hudi/pull/7405#discussion_r1052980427 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -455,6 +458,10 @@ private Stream getCommitInstantsToArchive()

[GitHub] [hudi] danny0405 commented on a diff in pull request #7405: [HUDI-5341] IncrementalCleaning consider clustering completed later

2022-12-19 Thread GitBox
danny0405 commented on code in PR #7405: URL: https://github.com/apache/hudi/pull/7405#discussion_r1052979624 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java: ## @@ -188,8 +188,12 @@ private List

[GitHub] [hudi] hudi-bot commented on pull request #7517: [HUDI-5420] Fix metadata table validator to exclude uncommitted log files due to retry

2022-12-19 Thread GitBox
hudi-bot commented on PR #7517: URL: https://github.com/apache/hudi/pull/7517#issuecomment-1358921553 ## CI report: * 5cffefb5f39a12d11bc0c1b87d917a21bf462a2e Azure:

[GitHub] [hudi] yihua opened a new pull request, #7518: [MINOR] Fix minor issues in HoodieMetadataTableValidator docs

2022-12-19 Thread GitBox
yihua opened a new pull request, #7518: URL: https://github.com/apache/hudi/pull/7518 ### Change Logs This PR fixes minor issues in HoodieMetadataTableValidator docs. ### Impact Only Javadocs update. ### Risk level none ### Documentation Update

[GitHub] [hudi] hudi-bot commented on pull request #7517: [HUDI-5420] Fix metadata table validator to exclude uncommitted log files due to retry

2022-12-19 Thread GitBox
hudi-bot commented on PR #7517: URL: https://github.com/apache/hudi/pull/7517#issuecomment-1358917009 ## CI report: * 5cffefb5f39a12d11bc0c1b87d917a21bf462a2e UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7499: [HUDI-5413] Add record count payload to support pv/uv

2022-12-19 Thread GitBox
hudi-bot commented on PR #7499: URL: https://github.com/apache/hudi/pull/7499#issuecomment-1358916911 ## CI report: * c158002ed1a0d136ca832a56ecd8452aa8cb09a8 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7499: [HUDI-5413] Add record count payload to support pv/uv

2022-12-19 Thread GitBox
hudi-bot commented on PR #7499: URL: https://github.com/apache/hudi/pull/7499#issuecomment-1358913453 ## CI report: * 72eddbdd7672eac4e7ea739d3ba8ab049dcbc446 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7467: [MINOR] fixed Flink's DataStream does not support creating managed table

2022-12-19 Thread GitBox
hudi-bot commented on PR #7467: URL: https://github.com/apache/hudi/pull/7467#issuecomment-1358913316 ## CI report: * bac8eda6c1b02d75b5ef04d81baaa59c2bc6a85d Azure:

[jira] [Updated] (HUDI-5420) Fix metadata table validator to exclude uncommitted log files in successful deltacommits

2022-12-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5420: - Labels: pull-request-available (was: ) > Fix metadata table validator to exclude uncommitted log

[GitHub] [hudi] yihua opened a new pull request, #7517: [HUDI-5420] Fix metadata table validator to exclude uncommitted log files due to retry

2022-12-19 Thread GitBox
yihua opened a new pull request, #7517: URL: https://github.com/apache/hudi/pull/7517 ### Change Logs When a write transaction writes uncommitted log files in a delta commit, e.g., due to Spark task retries, these log files stay in the file system after the successful delta commit

[GitHub] [hudi] hudi-bot commented on pull request #7513: [HUDI-5419] Revert spark configs after each sql test

2022-12-19 Thread GitBox
hudi-bot commented on PR #7513: URL: https://github.com/apache/hudi/pull/7513#issuecomment-1358909751 ## CI report: * 067359c8b56e237f112e23d7a6064019b08c1731 Azure:

[jira] [Updated] (HUDI-5420) Fix metadata table validator to exclude uncommitted log files in successful deltacommits

2022-12-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5420: Description: When a write transaction writes uncommitted log files in a delta commit, e.g., due to Spark

[GitHub] [hudi] hudi-bot commented on pull request #7509: [HUDI-5414] No need to guard the table initialization by lock for Hoo…

2022-12-19 Thread GitBox
hudi-bot commented on PR #7509: URL: https://github.com/apache/hudi/pull/7509#issuecomment-1358865000 ## CI report: * 71a67f9f2d6149ac6883684491ee0f8903594688 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7509: [HUDI-5414] No need to guard the table initialization by lock for Hoo…

2022-12-19 Thread GitBox
hudi-bot commented on PR #7509: URL: https://github.com/apache/hudi/pull/7509#issuecomment-1358861996 ## CI report: * 71a67f9f2d6149ac6883684491ee0f8903594688 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7509: [HUDI-5414] No need to guard the table initialization by lock for Hoo…

2022-12-19 Thread GitBox
hudi-bot commented on PR #7509: URL: https://github.com/apache/hudi/pull/7509#issuecomment-1358859067 ## CI report: * 71a67f9f2d6149ac6883684491ee0f8903594688 Azure:

[GitHub] [hudi] wzx140 closed pull request #7512: [HUDI-3217] support to read avro from non-legacy map/list in parquet log

2022-12-19 Thread GitBox
wzx140 closed pull request #7512: [HUDI-3217] support to read avro from non-legacy map/list in parquet log URL: https://github.com/apache/hudi/pull/7512

[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-3478] imporve cdc-related codes

2022-12-19 Thread GitBox
hudi-bot commented on PR #7410: URL: https://github.com/apache/hudi/pull/7410#issuecomment-1358855295 ## CI report: * 1195034b3a3dc06084733ae8572ffebc2b79d295 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7144: [HUDI-5164] improve delete files for drop table and truncate table

2022-12-19 Thread GitBox
hudi-bot commented on PR #7144: URL: https://github.com/apache/hudi/pull/7144#issuecomment-1358855026 ## CI report: * 973cf0df2b2f5e1065005a6e34cc8c8445eacd22 UNKNOWN * 455f684f0df1e35b091c52827e62ed77c63eb88f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7499: [HUDI-5413] Add record count payload to support pv/uv

2022-12-19 Thread GitBox
hudi-bot commented on PR #7499: URL: https://github.com/apache/hudi/pull/7499#issuecomment-1358816143 ## CI report: * 72eddbdd7672eac4e7ea739d3ba8ab049dcbc446 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7499: [HUDI-5413] Add record count payload to support pv/uv

2022-12-19 Thread GitBox
hudi-bot commented on PR #7499: URL: https://github.com/apache/hudi/pull/7499#issuecomment-1358812720 ## CI report: * 72eddbdd7672eac4e7ea739d3ba8ab049dcbc446 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7515: Bump gson from 2.6.2 to 2.8.9 in /packaging/hudi-cli-bundle

2022-12-19 Thread GitBox
hudi-bot commented on PR #7515: URL: https://github.com/apache/hudi/pull/7515#issuecomment-1358809393 ## CI report: * 35662a898bc24ed28e32af01d220db8554db720c Azure:

[GitHub] [hudi] hechao-ustc commented on a diff in pull request #7499: [HUDI-5413] Add record count payload to support pv/uv

2022-12-19 Thread GitBox
hechao-ustc commented on code in PR #7499: URL: https://github.com/apache/hudi/pull/7499#discussion_r1052847124 ## hudi-common/src/main/java/org/apache/hudi/common/model/RecordCountAvroPayload.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[jira] [Updated] (HUDI-5305) Detect concurrent writes during compaction and clustering if they shouldn't happen

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5305: -- Sprint: (was: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint) > Detect concurrent writes during

[jira] [Updated] (HUDI-4688) Decouple lazy cleaning of failed writes from clean action in multi-writer

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4688: -- Sprint: (was: 2022/12/12, 0.13.0 Final Sprint) > Decouple lazy cleaning of failed writes from

[jira] [Updated] (HUDI-5401) Hivemetastore URI set in hudi conf not respected.

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5401: -- Sprint: (was: 0.13.0 Final Sprint) > Hivemetastore URI set in hudi conf not respected. >

[jira] [Updated] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3301: -- Sprint: (was: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint) > MergedLogRecordReader inline

[jira] [Updated] (HUDI-4863) Deprecate `hoodie.compaction.payload.class` and re-use hoodie.datasource.write.payload.class

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4863: -- Sprint: (was: 2022/11/01, 2022/12/12, 0.13.0 Final Sprint) > Deprecate

[jira] [Updated] (HUDI-3055) Make sure that Compression Codec configuration is respected across the board

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3055: -- Sprint: (was: 0.13.0 Final Sprint) > Make sure that Compression Codec configuration is

[jira] [Updated] (HUDI-4847) hive sync fails w/ utilities bundle in 0.13-snapshot, but succeeds w/ 0.11

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4847: -- Sprint: (was: 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/12/12, 0.13.0 Final

[jira] [Updated] (HUDI-4967) Improve docs for meta sync with TimestampBasedKeyGenerator

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4967: -- Sprint: (was: 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint) > Improve

[jira] [Updated] (HUDI-4854) Deltastreamer does not respect partition selector regex for metadata-only bootstrap

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4854: -- Sprint: (was: 2022/09/05, 0.13.0 Final Sprint) > Deltastreamer does not respect partition

[jira] [Updated] (HUDI-5018) Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5018: -- Sprint: (was: 2022/11/01, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint) > Make user-provided

[jira] [Updated] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-83: Sprint: (was: Cont' improve - 2021/01/24, Cont' improve - 2021/01/31, 2022/09/05, 2022/10/04,

[jira] [Updated] (HUDI-2740) Support for snapshot querying on MOR table

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-2740: -- Sprint: (was: 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint) > Support

[GitHub] [hudi] hechao-ustc commented on a diff in pull request #7499: [HUDI-5413] Add record count payload to support pv/uv

2022-12-19 Thread GitBox
hechao-ustc commented on code in PR #7499: URL: https://github.com/apache/hudi/pull/7499#discussion_r1052834597 ## hudi-common/src/main/java/org/apache/hudi/common/model/OverwriteWithLatestAvroPayload.java: ## @@ -90,4 +94,35 @@ public Boolean overwriteField(Object value,

[jira] [Updated] (HUDI-5418) Spark Sql Guide says that precombine field is only required for MOR but it is always required

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5418: -- Sprint: (was: 2022/12/12, 0.13.0 Final Sprint) > Spark Sql Guide says that precombine field

[jira] [Updated] (HUDI-5101) Adding spark structured streaming tests to integ tests

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5101: -- Sprint: (was: 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final

[jira] [Updated] (HUDI-5181) Enhance keygen class validation

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5181: -- Sprint: (was: 2022/12/12, 0.13.0 Final Sprint) > Enhance keygen class validation >

[jira] [Updated] (HUDI-4749) PartitionsForFullCleaning in CleanPlanner is using FileSystemBasedListing

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4749: -- Sprint: (was: 2022/08/22, 2022/09/05, 0.13.0 Final Sprint) > PartitionsForFullCleaning in

[jira] [Updated] (HUDI-5419) Spark-SQL tests persist configs

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5419: -- Sprint: (was: 2022/12/12, 0.13.0 Final Sprint) > Spark-SQL tests persist configs >

[jira] [Updated] (HUDI-5262) When creating table in spark-sql setting wrong keygenerator config does not warn

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5262: -- Sprint: (was: 2022/12/12, 0.13.0 Final Sprint) > When creating table in spark-sql setting

[jira] [Closed] (HUDI-4886) Detect incompatible schema change during deltastreamer ingestion

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-4886. - Resolution: Fixed > Detect incompatible schema change during deltastreamer ingestion >

[jira] [Assigned] (HUDI-4886) Detect incompatible schema change during deltastreamer ingestion

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-4886: - Assignee: Alexey Kudinkin > Detect incompatible schema change during deltastreamer

[jira] [Commented] (HUDI-4886) Detect incompatible schema change during deltastreamer ingestion

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649539#comment-17649539 ] Alexey Kudinkin commented on HUDI-4886: --- Addressed in [https://github.com/apache/hudi/pull/6358] >

[jira] [Updated] (HUDI-4878) Fix incremental cleaning for clean based on LATEST_FILE_VERSIONS

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4878: -- Sprint: 2022/09/05, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/29, 2022/12/12 (was:

[jira] [Updated] (HUDI-5231) Address checkstyle warnings while building hudi

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5231: -- Sprint: 2022/11/15, 2022/11/29, 2022/12/12 (was: 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0

[jira] [Updated] (HUDI-4990) Parallelize deduplication in CLI tool

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4990: -- Sprint: 2022/11/01, 2022/11/15 (was: 2022/11/01, 2022/11/15, 0.13.0 Final Sprint) >

[jira] [Updated] (HUDI-4911) Make sure LogRecordReader doesn't flush the cache before each lookup

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4911: -- Priority: Blocker (was: Critical) > Make sure LogRecordReader doesn't flush the cache before

[jira] [Updated] (HUDI-2754) Performance improvement for IncrementalRelation

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-2754: -- Sprint: Cont' improve - 2022/03/7, 2022/08/22, 2022/09/05 (was: Cont' improve - 2022/03/7,

[jira] [Closed] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-3301. - Resolution: Duplicate > MergedLogRecordReader inline reading should be stateless and thread safe

[jira] [Updated] (HUDI-4245) Support nested fields in Column Stats Index

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4245: -- Sprint: (was: 0.13.0 Final Sprint) > Support nested fields in Column Stats Index >

[jira] [Updated] (HUDI-4692) Clean up HoodieSparkSqlWriter

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4692: -- Sprint: (was: 0.13.0 Final Sprint) > Clean up HoodieSparkSqlWriter >

[jira] [Updated] (HUDI-4827) Rebase Azure Image on Ubuntu 22.04

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4827: -- Sprint: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/29, 2022/12/12) > Rebase

[jira] [Updated] (HUDI-4688) Decouple lazy cleaning of failed writes from clean action in multi-writer

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4688: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Decouple lazy cleaning of failed

[jira] [Updated] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3301: -- Sprint: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/29, 2022/12/12) >

[jira] [Updated] (HUDI-5352) Jackson fails to serialize LocalDate when updating Delta Commit metadata

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5352: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Jackson fails to serialize

[jira] [Updated] (HUDI-4613) Avoid the use of regex expressions when call hoodieFileGroup#addLogFile function

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4613: -- Sprint: 2022/09/05, 2022/12/12, 0.13.0 Final Sprint (was: 2022/09/05, 2022/12/12) > Avoid the

[jira] [Updated] (HUDI-5018) Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5018: -- Sprint: 2022/11/01, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/01, 2022/11/29,

[jira] [Updated] (HUDI-5384) Make sure predicates are appropriately pushed down to HoodieFileIndex when lazy listing

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5384: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Make sure predicates are

[jira] [Updated] (HUDI-5419) Spark-SQL tests persist configs

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5419: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Spark-SQL tests persist configs >

[jira] [Updated] (HUDI-5323) Decouple virtual key with writing bloom filters to parquet files

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5323: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Decouple virtual key with writing

[jira] [Updated] (HUDI-5319) NPE in Bloom Filter Index

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5319: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > NPE in Bloom Filter Index >

[jira] [Updated] (HUDI-3249) Performance Improvements

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3249: -- Sprint: 2022/08/22, 2022/09/05, 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/15,

[jira] [Updated] (HUDI-5075) Add support to rollback residual clustering after disabling clustering

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5075: -- Sprint: 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was:

[jira] [Updated] (HUDI-5392) Fix Bootstrap files reader to configure arrays to be read in the new format

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5392: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Fix Bootstrap files reader to

[jira] [Updated] (HUDI-5418) Spark Sql Guide says that precombine field is only required for MOR but it is always required

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5418: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Spark Sql Guide says that

[jira] [Updated] (HUDI-1574) Trim existing unit tests to finish in much shorter amount of time

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-1574: -- Sprint: 2022/08/22, 2022/09/05, 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/15,

[jira] [Updated] (HUDI-4847) hive sync fails w/ utilities bundle in 0.13-snapshot, but succeeds w/ 0.11

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4847: -- Sprint: 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/12/12, 0.13.0 Final Sprint (was:

[jira] [Updated] (HUDI-5262) When creating table in spark-sql setting wrong keygenerator config does not warn

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5262: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > When creating table in spark-sql

[jira] [Updated] (HUDI-4886) Detect incompatible schema change during deltastreamer ingestion

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4886: -- Sprint: 2022/11/01, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/01, 2022/12/12) > Detect

[jira] [Updated] (HUDI-5420) Fix metadata table validator to exclude uncommitted log files in successful deltacommits

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5420: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Fix metadata table validator to

[jira] [Updated] (HUDI-5305) Detect concurrent writes during compaction and clustering if they shouldn't happen

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5305: -- Sprint: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/29, 2022/12/12) > Detect

[jira] [Updated] (HUDI-5238) Hudi throwing "PipeBroken" exception during Merging on GCS

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5238: -- Sprint: 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/15, 2022/11/29,

[jira] [Updated] (HUDI-5231) Address checkstyle warnings while building hudi

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5231: -- Sprint: 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/15, 2022/11/29,

[jira] [Updated] (HUDI-5012) Fix clean planning for very large partitions

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5012: -- Sprint: 2022/10/04, 2022/10/18, 2022/11/01, 2022/12/12, 0.13.0 Final Sprint (was: 2022/10/04,

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3636: -- Sprint: 2022/08/22, 2022/09/05, 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/29,

[jira] [Updated] (HUDI-5101) Adding spark structured streaming tests to integ tests

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5101: -- Sprint: 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was:

[jira] [Updated] (HUDI-3088) Make Spark 3 the default profile for build and test

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3088: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14,

[jira] [Updated] (HUDI-5321) Fix Bulk Insert ColumnSortPartitioners

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5321: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Fix Bulk Insert

[jira] [Updated] (HUDI-4911) Make sure LogRecordReader doesn't flush the cache before each lookup

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4911: -- Sprint: 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/15, 2022/11/29,

[jira] [Updated] (HUDI-3673) Add a common hudi-hbase-shaded for shaded hbase dependencies

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3673: -- Sprint: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/29, 2022/12/12) > Add a

[jira] [Updated] (HUDI-5131) Bundle validation: upgrade/downgrade

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5131: -- Sprint: 2022/11/01, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/01, 2022/12/12) > Bundle

[jira] [Updated] (HUDI-4954) Shade avro in all bundles where it is included

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4954: -- Sprint: 2022/10/04, 2022/10/18, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/10/04,

[jira] [Updated] (HUDI-2608) Support JSON schema in schema registry provider

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-2608: -- Sprint: 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/29, 2022/12/12) > Support

[jira] [Updated] (HUDI-5080) UnpersistRdds unpersist all rdds in the spark context

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5080: -- Sprint: 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was:

[jira] [Updated] (HUDI-4937) Fix HoodieTable injecting HoodieBackedTableMetadata not reusing underlying MT readers

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4937: -- Sprint: 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final

[jira] [Updated] (HUDI-2740) Support for snapshot querying on MOR table

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-2740: -- Sprint: 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/01,

[jira] [Updated] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-83: Sprint: Cont' improve - 2021/01/24, Cont' improve - 2021/01/31, 2022/09/05, 2022/10/04, 2022/10/18,

[jira] [Updated] (HUDI-3967) Automatic savepoint in Hudi

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3967: -- Sprint: 2022/08/22, 2022/09/05, 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/15,

[jira] [Updated] (HUDI-4967) Improve docs for meta sync with TimestampBasedKeyGenerator

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4967: -- Sprint: 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12, 0.13.0 Final Sprint (was: 2022/11/01,

[jira] [Updated] (HUDI-4921) Fix last completed commit in CleanPlanner

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4921: -- Sprint: 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29, 2022/12/12,

[jira] [Updated] (HUDI-4878) Fix incremental cleaning for clean based on LATEST_FILE_VERSIONS

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4878: -- Sprint: 2022/09/05, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/29, 2022/12/12, 0.13.0 Final

[jira] [Updated] (HUDI-3601) Support multi-arch builds in docker setup

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3601: -- Sprint: 2022/09/05, 2022/09/19, 2022/10/04, 2022/10/18, 2022/11/01, 2022/11/15, 2022/11/29,

[jira] [Updated] (HUDI-5181) Enhance keygen class validation

2022-12-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5181: -- Sprint: 2022/12/12, 0.13.0 Final Sprint (was: 2022/12/12) > Enhance keygen class validation >
