[GitHub] [hudi] pintusoliya opened a new pull request, #7537: feat: landing page design changes

2022-12-21 Thread GitBox
pintusoliya opened a new pull request, #7537: URL: https://github.com/apache/hudi/pull/7537 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performa

[GitHub] [hudi] voonhous commented on pull request #7480: [HUDI-5400] Fix read issues when Hudi-FULL schema evolution is not enabled

2022-12-21 Thread GitBox
voonhous commented on PR #7480: URL: https://github.com/apache/hudi/pull/7480#issuecomment-1362461598 @xiarixiaoyao I have added support for Hudi tables that are schema-evolved via ASR for Spark2.4. Can you please help to review the PR again? Thank you! -- This is an autom

[GitHub] [hudi] XuQianJin-Stars commented on pull request #4966: [HUDI-3572]support DAY_ROLLING strategy in ClusteringPlanPartitionFilterMode

2022-12-21 Thread GitBox
XuQianJin-Stars commented on PR #4966: URL: https://github.com/apache/hudi/pull/4966#issuecomment-1362446802 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [hudi] koochiswathiTR commented on issue #7530: Hudi Log files are increasing in our application day by day

2022-12-21 Thread GitBox
koochiswathiTR commented on issue #7530: URL: https://github.com/apache/hudi/issues/7530#issuecomment-1362416370 ![hudi_snapshot](https://user-images.githubusercontent.com/53506762/209059112-7711243e-b8eb-4a87-a5fb-60a873c6aea8.PNG) Please find my hudi table config. -- This is an autom

[jira] [Updated] (HUDI-5455) Add commons configuration2 to cli bundle

2022-12-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5455: - Labels: pull-request-available (was: ) > Add commons configuration2 to cli bundle > -

[GitHub] [hudi] hudi-bot commented on pull request #7536: [HUDI-5455] Add commons-configuration2 in hudi cli bundle

2022-12-21 Thread GitBox
hudi-bot commented on PR #7536: URL: https://github.com/apache/hudi/pull/7536#issuecomment-1362394896 ## CI report: * 1104498741d7d1c6355579bdcff3034c90ea7226 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] koochiswathiTR commented on issue #7530: Hudi Log files are increasing in our application day by day

2022-12-21 Thread GitBox
koochiswathiTR commented on issue #7530: URL: https://github.com/apache/hudi/issues/7530#issuecomment-1362379587 Earlier we were using KEEP_LATEST_COMMITS; now we are using CLEANER_HOURS_RETAINED. The log files created with KEEP_LATEST_COMMITS configuration log files will not be delet

[jira] [Created] (HUDI-5455) Add commons configuration2 to cli bundle

2022-12-21 Thread Rahil Chertara (Jira)
Rahil Chertara created HUDI-5455: Summary: Add commons configuration2 to cli bundle Key: HUDI-5455 URL: https://issues.apache.org/jira/browse/HUDI-5455 Project: Apache Hudi Issue Type: Task

[GitHub] [hudi] rahil-c commented on pull request #7536: Add commons-configuration2 in hudi cli bundle

2022-12-21 Thread GitBox
rahil-c commented on PR #7536: URL: https://github.com/apache/hudi/pull/7536#issuecomment-1362363885 @yihua When you get a chance if you can approve would be great -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [hudi] rahil-c opened a new pull request, #7536: Add commons-configuration2 in hudi cli bundle

2022-12-21 Thread GitBox
rahil-c opened a new pull request, #7536: URL: https://github.com/apache/hudi/pull/7536 ### Change Logs Add commons-configuration2 in hudi cli bundle due to spark3-hadoop3 tar providing this dependency in spark jars folder. ### Impact medium ### Risk level (write none, low me

[GitHub] [hudi] yihua commented on issue #7453: [SUPPORT] Hudi Upsert fails for

2022-12-21 Thread GitBox
yihua commented on issue #7453: URL: https://github.com/apache/hudi/issues/7453#issuecomment-1362317260 Hi @kepplertreet Thanks for reporting the issue. Could you provide the data types of the following columns: `id`, `_year_month`? It would be even better if you can also provide a sample

[GitHub] [hudi] TengHuo commented on pull request #7307: [HUDI-5271] fix issue inconsistent reader and writer schema in HoodieAvroDataBlock

2022-12-21 Thread GitBox
TengHuo commented on PR #7307: URL: https://github.com/apache/hudi/pull/7307#issuecomment-1362316237 Hi, is there anyone who can help review it? I'd really appreciate it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[jira] [Updated] (HUDI-5411) Make sure Trino does not re-instantiates Hive's InputFormat for every partition during file listing

2022-12-21 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5411: -- Status: In Progress (was: Open) > Make sure Trino does not re-instantiates Hive's InputFormat for every

[jira] [Updated] (HUDI-5429) Investigate lot of head requests in MDT

2022-12-21 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5429: -- Status: In Progress (was: Open) > Investigate lot of head requests in MDT > ---

[jira] [Updated] (HUDI-5428) Investigate S3 connection leaks w/ MDT

2022-12-21 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5428: -- Status: In Progress (was: Open) > Investigate S3 connection leaks w/ MDT > ---

[GitHub] [hudi] yihua commented on issue #7472: [SUPPORT] Too many metadata timeline file caused by old rollback active timeline

2022-12-21 Thread GitBox
yihua commented on issue #7472: URL: https://github.com/apache/hudi/issues/7472#issuecomment-1362311362 Hi @yyar Thanks for raising this issue. We're aware of the problem, which is documented here: HUDI-5434. I'm currently working on a fix. -- This is an automated message from the Apa

[GitHub] [hudi] yihua commented on issue #7487: [SUPPORT] S3 Buckets reached quota limit when reading from hudi tables

2022-12-21 Thread GitBox
yihua commented on issue #7487: URL: https://github.com/apache/hudi/issues/7487#issuecomment-1362308145 Hi @AdarshKadameriTR Thanks for raising this. Does the read timeout happen in a write job or a query? Could you ask the AWS support to clarify what types of quota limits are reached?

[GitHub] [hudi] hudi-bot commented on pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2022-12-21 Thread GitBox
hudi-bot commented on PR #6361: URL: https://github.com/apache/hudi/pull/6361#issuecomment-1362283186 ## CI report: * a28a39f44afe5561fbf33a6381721e98911a01db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1374

[GitHub] [hudi] alexeykudinkin commented on pull request #6782: [HUDI-4911][HUDI-3301] Fixing `HoodieMetadataLogRecordReader` to avoid flushing cache for every lookup

2022-12-21 Thread GitBox
alexeykudinkin commented on PR #6782: URL: https://github.com/apache/hudi/pull/6782#issuecomment-1362265756 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [hudi] alexeykudinkin commented on pull request #6782: [HUDI-4911][HUDI-3301] Fixing `HoodieMetadataLogRecordReader` to avoid flushing cache for every lookup

2022-12-21 Thread GitBox
alexeykudinkin commented on PR #6782: URL: https://github.com/apache/hudi/pull/6782#issuecomment-1362265671 hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread waywtdcc (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] waywtdcc updated HUDI-5454: --- Component/s: reader-core writer-core (was: core) > Support LSM tree writ

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread waywtdcc (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] waywtdcc updated HUDI-5454: --- Description: [https://github.com/apache/hudi/issues/7414] [https://github.com/apache/hudi/issues/7529] From

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread waywtdcc (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] waywtdcc updated HUDI-5454: --- Issue Type: Epic (was: New Feature) > Support LSM tree writing > > >

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread waywtdcc (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] waywtdcc updated HUDI-5454: --- Component/s: core > Support LSM tree writing > > > Key: HUDI-5454 >

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread waywtdcc (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] waywtdcc updated HUDI-5454: --- Labels: hudi-umbrellas (was: ) > Support LSM tree writing > > > Key:

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #6361: URL: https://github.com/apache/hudi/pull/6361#discussion_r1054943048 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala: ## @@ -25,265 +25,266 @@ import org.apache.h

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #6361: URL: https://github.com/apache/hudi/pull/6361#discussion_r957962272 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala: ## @@ -26,127 +26,163 @@ import org.apache.hu

[GitHub] [hudi] dyang108 commented on issue #7524: [SUPPORT] Deltastreamer continuous mode failed async compaction

2022-12-21 Thread GitBox
dyang108 commented on issue #7524: URL: https://github.com/apache/hudi/issues/7524#issuecomment-1362181143 Hey Ethan - thanks for the response. I see this exception in the driver logs before the async compaction failure: ``` 22/12/19 20:10:33 WARN scheduler.TaskSetManager: Lost tas

[GitHub] [hudi] yihua commented on issue #7524: [SUPPORT] Deltastreamer continuous mode failed async compaction

2022-12-21 Thread GitBox
yihua commented on issue #7524: URL: https://github.com/apache/hudi/issues/7524#issuecomment-1362176615 Hi @dyang108 Thanks for raising the issue. Could you search for any exception that happened inside the Spark driver logs? The exception logs can happen way before the async compaction f

[GitHub] [hudi] szingerpeter commented on issue #7533: [SUPPORT] Recreate deleted metadata table

2022-12-21 Thread GitBox
szingerpeter commented on issue #7533: URL: https://github.com/apache/hudi/issues/7533#issuecomment-1362167702 @yihua , thank you for your quick reply. Unfortunately, the environment is fixed and upgrading EMR is not possible right now. i sent the requested files via slack

[GitHub] [hudi] yihua commented on issue #7529: [SUPPORT][RFC] Support lsm tree writing

2022-12-21 Thread GitBox
yihua commented on issue #7529: URL: https://github.com/apache/hudi/issues/7529#issuecomment-1362163536 Closing this one as we'll track the feature dev in [HUDI-5454](https://issues.apache.org/jira/browse/HUDI-5454). -- This is an automated message from the Apache Git Service. To respond

[GitHub] [hudi] yihua closed issue #7529: [SUPPORT][RFC] Support lsm tree writing

2022-12-21 Thread GitBox
yihua closed issue #7529: [SUPPORT][RFC] Support lsm tree writing URL: https://github.com/apache/hudi/issues/7529 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

[GitHub] [hudi] yihua closed issue #7414: [SUPPORT] Support lsm tree writing

2022-12-21 Thread GitBox
yihua closed issue #7414: [SUPPORT] Support lsm tree writing URL: https://github.com/apache/hudi/issues/7414 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

[GitHub] [hudi] yihua commented on issue #7414: [SUPPORT] Support lsm tree writing

2022-12-21 Thread GitBox
yihua commented on issue #7414: URL: https://github.com/apache/hudi/issues/7414#issuecomment-1362162747 Closing this one as we'll track the feature dev in [HUDI-5454](https://issues.apache.org/jira/browse/HUDI-5454). -- This is an automated message from the Apache Git Service. To respond

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7527: [HUDI-5411] Avoid virtual key info for COW table in the input format

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #7527: URL: https://github.com/apache/hudi/pull/7527#discussion_r1054874088 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java: ## @@ -247,81 +239,33 @@ private List listStatusForSnapshotMode(JobC

[GitHub] [hudi] yihua commented on issue #7529: [SUPPORT][RFC] Support lsm tree writing

2022-12-21 Thread GitBox
yihua commented on issue #7529: URL: https://github.com/apache/hudi/issues/7529#issuecomment-1362162358 Hi @waywtdcc Thanks for bringing up the idea! Let's follow up using the Jira ticket HUDI-5454 for the new feature. Could you also follow the RFC Process here: https://hudi.apache.org/co

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7527: [HUDI-5411] Avoid virtual key info for COW table in the input format

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #7527: URL: https://github.com/apache/hudi/pull/7527#discussion_r1054873749 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java: ## @@ -247,81 +239,33 @@ private List listStatusForSnapshotMode(JobC

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5454: Description: [https://github.com/apache/hudi/issues/7414] [https://github.com/apache/hudi/issues/7529] Fr

[jira] [Updated] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5454: Fix Version/s: 1.0.0 > Support LSM tree writing > > > Key: HUDI-545

[jira] [Created] (HUDI-5454) Support LSM tree writing

2022-12-21 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-5454: --- Summary: Support LSM tree writing Key: HUDI-5454 URL: https://issues.apache.org/jira/browse/HUDI-5454 Project: Apache Hudi Issue Type: New Feature Reporter

[GitHub] [hudi] hudi-bot commented on pull request #7535: [MINOR] Fixed some spelling and syntax issues in InternalSchemaUtils

2022-12-21 Thread GitBox
hudi-bot commented on PR #7535: URL: https://github.com/apache/hudi/pull/7535#issuecomment-1362095226 ## CI report: * 21d8d919a6898ba4e6568afda847a8b95f4de353 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] parisni commented on issue #7531: [SUPPORT] table comments not fully supported

2022-12-21 Thread GitBox
parisni commented on issue #7531: URL: https://github.com/apache/hudi/issues/7531#issuecomment-1362091294 > When saying spark DF with comments metadata, do you mean the schema associated with the dataframe has the comments? That's it. Well, basically the steps are:

[GitHub] [hudi] yihua commented on issue #7530: Hudi Log files are increasing in our application day by day

2022-12-21 Thread GitBox
yihua commented on issue #7530: URL: https://github.com/apache/hudi/issues/7530#issuecomment-1362077983 Hi @koochiswathiTR thanks for reporting this issue. The cleaner is supposed to delete the file slices containing the base and log files that are older than 48 hours. One quick res
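For readers following issue #7530, the hours-based cleaning mentioned in this thread might be configured roughly as below. This is only a sketch: the helper `cleaner_options` is hypothetical, the 48-hour window is taken from the comment above, and the option keys are the Hudi cleaner settings as commonly documented, so verify them against the configuration reference for your Hudi version before relying on them.

```python
# Hypothetical helper (not part of Hudi): builds writer options for an
# hours-based cleaning policy. The keys are assumed to match the Hudi
# cleaner configuration names; confirm them for your release.
def cleaner_options(hours_retained: int = 48) -> dict:
    if hours_retained <= 0:
        raise ValueError("hours_retained must be positive")
    return {
        # Run the cleaner automatically after each commit.
        "hoodie.clean.automatic": "true",
        # Retain file slices by age instead of by number of commits.
        "hoodie.cleaner.policy": "KEEP_LATEST_BY_HOURS",
        # File slices older than this many hours become eligible for cleaning.
        "hoodie.cleaner.hours.retained": str(hours_retained),
    }


options = cleaner_options(48)
for key, value in sorted(options.items()):
    print(f"{key}={value}")
```

A dictionary like this would typically be passed to a Spark writer through `.options(**options)`; whether that alone clears the existing log files depends on the table type and on the compaction settings, which the thread does not show.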

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6782: [HUDI-4911][HUDI-3301] Fixing `HoodieMetadataLogRecordReader` to avoid flushing cache for every lookup

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #6782: URL: https://github.com/apache/hudi/pull/6782#discussion_r1054824916 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java: ## @@ -188,40 +179,41 @@ protected AbstractHoodieLogRecordReader(F

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6782: [HUDI-4911][HUDI-3301] Fixing `HoodieMetadataLogRecordReader` to avoid flushing cache for every lookup

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #6782: URL: https://github.com/apache/hudi/pull/6782#discussion_r1054823911 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/Iterators.scala: ## @@ -261,7 +253,7 @@ object LogFileIterator { tableState

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6782: [HUDI-4911][HUDI-3301] Fixing `HoodieMetadataLogRecordReader` to avoid flushing cache for every lookup

2022-12-21 Thread GitBox
alexeykudinkin commented on code in PR #6782: URL: https://github.com/apache/hudi/pull/6782#discussion_r1054824254 ## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataLogRecordReader.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software Foundation (A

[hudi] branch asf-site updated: [DOCS] Add videos page for tracking all guides, tutorials and hands on labs (#7534)

2022-12-21 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 135364477b [DOCS] Add videos page for tracking

[GitHub] [hudi] yihua merged pull request #7534: [DOCS] Add videos page for tracking all guides, tutorials and hands o…

2022-12-21 Thread GitBox
yihua merged PR #7534: URL: https://github.com/apache/hudi/pull/7534 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

[GitHub] [hudi] jonvex opened a new pull request, #7535: [MINOR] Fixed some spelling and syntax issues in InternalSchemaUtils

2022-12-21 Thread GitBox
jonvex opened a new pull request, #7535: URL: https://github.com/apache/hudi/pull/7535 ### Change Logs Made the error messages better ### Impact Makes our project look better ### Risk level (write none, low medium or high below) none ### Documentation

[GitHub] [hudi] yihua commented on issue #7531: [SUPPORT] table comments not fully supported

2022-12-21 Thread GitBox
yihua commented on issue #7531: URL: https://github.com/apache/hudi/issues/7531#issuecomment-1362061079 @parisni Thanks for raising this issue. Could you provide more details and reproducible steps? When saying `spark DF with comments metadata`, do you mean the schema associated with the

[GitHub] [hudi] yihua commented on issue #7533: [SUPPORT] Recreate deleted metadata table

2022-12-21 Thread GitBox
yihua commented on issue #7533: URL: https://github.com/apache/hudi/issues/7533#issuecomment-1362057544 @szingerpeter Thanks for reporting the issue. On Hudi 0.11.0 release, the metadata table is enabled by default. After you deleted the `s3:///.hoodie/metadata`, the next write operation

[GitHub] [hudi] hudi-bot commented on pull request #7476: [HUDI-5023] Switching default Write Executor type to `SIMPLE`

2022-12-21 Thread GitBox
hudi-bot commented on PR #7476: URL: https://github.com/apache/hudi/pull/7476#issuecomment-1362018077 ## CI report: * 45952c5ff7b5a1bbedd202050442f6d770e5fa89 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=137

[GitHub] [hudi] xccui commented on issue #7375: [SUPPORT] Hudi 0.12.1 support for Spark Structured Streaming. read clustering metadata replace avro file error. Unrecognized token 'Obj^A^B^Vavro'

2022-12-21 Thread GitBox
xccui commented on issue #7375: URL: https://github.com/apache/hudi/issues/7375#issuecomment-1362011808 Stacktrace: ``` Caused by: org.apache.hudi.exception.HoodieException: java.io.IOException: unable to read commit metadata at org.apache.hudi.sink.partitioner.profile.WritePr

[jira] [Created] (HUDI-5453) Ensure new fileId format is good across all code paths and backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5453: - Summary: Ensure new fileId format is good across all code paths and backwards compatible Key: HUDI-5453 URL: https://issues.apache.org/jira/browse/HUDI-5453

[jira] [Assigned] (HUDI-5453) Ensure new fileId format is good across all code paths and backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-5453: - Assignee: sivabalan narayanan > Ensure new fileId format is good across all code

[jira] [Updated] (HUDI-5453) Ensure new fileId format is good across all code paths and backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5453: -- Sprint: 0.13.0 Final Sprint > Ensure new fileId format is good across all code paths and

[jira] [Updated] (HUDI-5453) Ensure new fileId format is good across all code paths and backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5453: -- Epic Link: HUDI-466 Story Points: 2 > Ensure new fileId format is good across all

[jira] [Updated] (HUDI-5453) Ensure new fileId format is good across all code paths and backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5453: -- Fix Version/s: 0.13.0 > Ensure new fileId format is good across all code paths and backw

[GitHub] [hudi] bhasudha commented on pull request #7534: [DOCS] Add videos page for tracking all guides, tutorials and hands o…

2022-12-21 Thread GitBox
bhasudha commented on PR #7534: URL: https://github.com/apache/hudi/pull/7534#issuecomment-1362009354 Locally tested https://user-images.githubusercontent.com/2179254/208989207-9ceb2550-210d-4230-a5d0-fa019135e758.png https://user-images.githubusercontent.com/2179254/208989265-15

[jira] [Created] (HUDI-5452) Spark-sql long datatype conversion to bigint in hive causes issues with alter table

2022-12-21 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5452: - Summary: Spark-sql long datatype conversion to bigint in hive causes issues with alter table Key: HUDI-5452 URL: https://issues.apache.org/jira/browse/HUDI-5452 Pro

[jira] [Updated] (HUDI-5451) Ensure switching "001" and "002" suffix for compaction and cleaning in MDT is backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5451: -- Sprint: 0.13.0 Final Sprint > Ensure switching "001" and "002" suffix for compaction and

[jira] [Created] (HUDI-5451) Ensure switching "001" and "002" suffix for compaction and cleaning in MDT is backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5451: - Summary: Ensure switching "001" and "002" suffix for compaction and cleaning in MDT is backwards compatible Key: HUDI-5451 URL: https://issues.apache.org/jira/browse/HU

[jira] [Updated] (HUDI-5451) Ensure switching "001" and "002" suffix for compaction and cleaning in MDT is backwards compatible

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5451: -- Fix Version/s: 0.13.0 > Ensure switching "001" and "002" suffix for compaction and clean

[GitHub] [hudi] bhasudha opened a new pull request, #7534: [DOCS] Add videos page for tracking all guides, tutorials and hands o…

2022-12-21 Thread GitBox
bhasudha opened a new pull request, #7534: URL: https://github.com/apache/hudi/pull/7534 …n labs ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or

[jira] [Updated] (HUDI-5446) Add support to write record level index to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5446: -- Story Points: 4 (was: 3) > Add support to write record level index to MDT > ---

[jira] [Updated] (HUDI-5450) Test Record level index as default w/ azure CI

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5450: -- Epic Link: HUDI-466 Story Points: 6 > Test Record level index as default w/ azure

[jira] [Updated] (HUDI-5450) Test Record level index as default w/ azure CI

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5450: -- Fix Version/s: 0.13.0 > Test Record level index as default w/ azure CI > ---

[jira] [Assigned] (HUDI-5445) Fix/Unify deletion code paths for MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-5445: - Assignee: sivabalan narayanan > Fix/Unify deletion code paths for MDT > -

[jira] [Created] (HUDI-5450) Test Record level index as default w/ azure CI

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5450: - Summary: Test Record level index as default w/ azure CI Key: HUDI-5450 URL: https://issues.apache.org/jira/browse/HUDI-5450 Project: Apache Hudi Is

[jira] [Assigned] (HUDI-5446) Add support to write record level index to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-5446: - Assignee: sivabalan narayanan > Add support to write record level index to MDT >

[jira] [Assigned] (HUDI-5447) Add support for Record level index read from MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-5447: - Assignee: sivabalan narayanan > Add support for Record level index read from MDT

[jira] [Assigned] (HUDI-5448) Add metrics to record level index in MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-5448: - Assignee: sivabalan narayanan > Add metrics to record level index in MDT > --

[jira] [Updated] (HUDI-5449) Misc and adhoc fixes to add record level index support to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5449: -- Fix Version/s: 0.13.0 > Misc and adhoc fixes to add record level index support to MDT >

[jira] [Created] (HUDI-5449) Misc and adhoc fixes to add record level index support to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5449: - Summary: Misc and adhoc fixes to add record level index support to MDT Key: HUDI-5449 URL: https://issues.apache.org/jira/browse/HUDI-5449 Project: Apache H

[jira] [Assigned] (HUDI-5449) Misc and adhoc fixes to add record level index support to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-5449: - Assignee: sivabalan narayanan > Misc and adhoc fixes to add record level index su

[jira] [Updated] (HUDI-5449) Misc and adhoc fixes to add record level index support to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5449: -- Epic Link: HUDI-466 Story Points: 1 > Misc and adhoc fixes to add record level in

[jira] [Updated] (HUDI-5448) Add metrics to record level index in MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5448: -- Fix Version/s: 0.13.0 > Add metrics to record level index in MDT > -

[jira] [Updated] (HUDI-5448) Add metrics to record level index in MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5448: -- Epic Link: HUDI-466 Story Points: 2 > Add metrics to record level index in MDT >

[jira] [Created] (HUDI-5448) Add metrics to record level index in MDT

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5448: - Summary: Add metrics to record level index in MDT Key: HUDI-5448 URL: https://issues.apache.org/jira/browse/HUDI-5448 Project: Apache Hudi Issue Ty

[jira] [Created] (HUDI-5447) Add support for Record level index read from MDT

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5447: - Summary: Add support for Record level index read from MDT Key: HUDI-5447 URL: https://issues.apache.org/jira/browse/HUDI-5447 Project: Apache Hudi

[jira] [Updated] (HUDI-5447) Add support for Record level index read from MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5447: -- Fix Version/s: 0.13.0 > Add support for Record level index read from MDT > -

[jira] [Updated] (HUDI-5447) Add support for Record level index read from MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5447: -- Epic Link: HUDI-466 Story Points: 4 > Add support for Record level index read fro

[GitHub] [hudi] alexeykudinkin commented on pull request #7527: [HUDI-5411] Avoid virtual key info for COW table in the input format

2022-12-21 Thread GitBox
alexeykudinkin commented on PR #7527: URL: https://github.com/apache/hudi/pull/7527#issuecomment-1361868932 @codope let's mark this as stacked on top of https://github.com/apache/hudi/pull/7526/files -- This is an automated message from the Apache Git Service. To respond to the message, p

[jira] [Updated] (HUDI-5446) Add support to write record level index to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5446: -- Epic Link: HUDI-466 Story Points: 3 > Add support to write record level index to

[jira] [Updated] (HUDI-5446) Add support to write record level index to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5446: -- Sprint: 0.13.0 Final Sprint > Add support to write record level index to MDT > -

[jira] [Updated] (HUDI-5446) Add support to write record level index to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5446: -- Fix Version/s: 0.13.0 > Add support to write record level index to MDT > ---

[jira] [Created] (HUDI-5446) Add support to write record level index to MDT

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5446: - Summary: Add support to write record level index to MDT Key: HUDI-5446 URL: https://issues.apache.org/jira/browse/HUDI-5446 Project: Apache Hudi Is

[jira] [Updated] (HUDI-5298) Optimize WriteStatus storing HoodieRecord

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5298: -- Priority: Blocker (was: Major) > Optimize WriteStatus storing HoodieRecord > --

[jira] [Updated] (HUDI-5300) Optimize initial commit w/ metadata table

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5300: -- Priority: Blocker (was: Critical) > Optimize initial commit w/ metadata table > ---

[jira] [Closed] (HUDI-5288) Optimize drop duplicates by avoiding index look up twice

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-5288. - Resolution: Invalid > Optimize drop duplicates by avoiding index look up twice > -

[jira] [Updated] (HUDI-5297) Deprecate InternalWriteStatus and re-use WriteStatus

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5297: -- Priority: Blocker (was: Major) > Deprecate InternalWriteStatus and re-use WriteStatus

[jira] [Updated] (HUDI-5300) Optimize initial commit w/ metadata table

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5300: -- Sprint: 0.13.0 Final Sprint > Optimize initial commit w/ metadata table > --

[jira] [Updated] (HUDI-5298) Optimize WriteStatus storing HoodieRecord

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5298: -- Sprint: 0.13.0 Final Sprint > Optimize WriteStatus storing HoodieRecord > --

[jira] [Commented] (HUDI-5288) Optimize drop duplicates by avoiding index look up twice

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17650970#comment-17650970 ] sivabalan narayanan commented on HUDI-5288: --- w/ insert, there is no index look u

[jira] [Updated] (HUDI-5297) Deprecate InternalWriteStatus and re-use WriteStatus

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5297: -- Sprint: 0.13.0 Final Sprint > Deprecate InternalWriteStatus and re-use WriteStatus > --

[jira] [Created] (HUDI-5445) Fix/Unify deletion code paths for MDT

2022-12-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5445: - Summary: Fix/Unify deletion code paths for MDT Key: HUDI-5445 URL: https://issues.apache.org/jira/browse/HUDI-5445 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-5445) Fix/Unify deletion code paths for MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5445: -- Epic Link: HUDI-466 Story Points: 1 > Fix/Unify deletion code paths for MDT > ---

[jira] [Updated] (HUDI-5445) Fix/Unify deletion code paths for MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5445: -- Fix Version/s: 0.13.0 > Fix/Unify deletion code paths for MDT >

[jira] [Updated] (HUDI-5445) Fix/Unify deletion code paths for MDT

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5445: -- Priority: Critical (was: Major) > Fix/Unify deletion code paths for MDT > -

[jira] [Updated] (HUDI-5298) Optimize WriteStatus storing HoodieRecord

2022-12-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5298: -- Epic Link: HUDI-466 > Optimize WriteStatus storing HoodieRecord > --
