[jira] [Updated] (HUDI-5743) 0.13.0 release note part 1

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5743: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 1 > -- > > Key:

[jira] [Updated] (HUDI-5677) [DOCS] Update AWS libs version

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5677: Sprint: Sprint 2023-01-31 > [DOCS] Update AWS libs version > -- > >

[jira] [Updated] (HUDI-5745) 0.13.0 release note part 3

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5745: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 3 > -- > > Key:

[jira] [Updated] (HUDI-5747) 0.13.0 release note part 5

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5747: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 5 > -- > > Key:

[jira] [Updated] (HUDI-5750) 0.13.0 release note part 8

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5750: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 8 > -- > > Key:

[jira] [Updated] (HUDI-5748) 0.13.0 release note part 6

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5748: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 6 > -- > > Key:

[jira] [Updated] (HUDI-5746) 0.13.0 release note part 4

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5746: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 4 > -- > > Key:

[jira] [Updated] (HUDI-5744) 0.13.0 release note part 2

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5744: Sprint: Sprint 2023-01-31 > 0.13.0 release note part 2 > -- > > Key:

[jira] [Updated] (HUDI-5757) Add Log Compaction to Write Operation docs

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5757: Sprint: Sprint 2023-01-31 > Add Log Compaction to Write Operation docs > ---

[jira] [Updated] (HUDI-5753) Add feature docs for Record Payload

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5753: Sprint: Sprint 2023-01-31 > Add feature docs for Record Payload > --- > >

[jira] [Updated] (HUDI-5751) Add feature docs for Metaserver

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5751: Sprint: Sprint 2023-01-31 > Add feature docs for Metaserver > --- > >

[jira] [Updated] (HUDI-5752) Add feature docs for Change Data Capture

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5752: Sprint: Sprint 2023-01-31 > Add feature docs for Change Data Capture > -

[jira] [Updated] (HUDI-5754) Add detailed description of GCS Incr, Proto Kafka, and Pulsar Sources in Deltastreamer page

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5754: Sprint: Sprint 2023-01-31 > Add detailed description of GCS Incr, Proto Kafka, and Pulsar Sources in > Delt

[jira] [Updated] (HUDI-5755) Add detailed description of OCC early conflict detection to concurrency control docs

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5755: Sprint: Sprint 2023-01-31 > Add detailed description of OCC early conflict detection to concurrency > contr

[jira] [Updated] (HUDI-5760) Make sure DeleteBlock doesn't use Kryo for serialization to disk

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5760: Fix Version/s: 0.13.1 > Make sure DeleteBlock doesn't use Kryo for serialization to disk > -

[jira] [Updated] (HUDI-5758) MOR table w/ delete block in 0.12.2 not readable in 0.13 and also not compactable

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5758: Sprint: Sprint 2023-01-31 > MOR table w/ delete block in 0.12.2 not readable in 0.13 and also not > compact

[jira] [Updated] (HUDI-5756) Add Consistent Hashing Index to Indexing docs

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5756: Sprint: Sprint 2023-01-31 > Add Consistent Hashing Index to Indexing docs >

[jira] [Updated] (HUDI-5767) Add known regression of Hive Sync performance to release notes

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5767: Sprint: Sprint 2023-01-31 > Add known regression of Hive Sync performance to release notes > ---

[jira] [Updated] (HUDI-5769) Partitions created by Async indexer could be deleted by regular writers

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5769: Sprint: Sprint 2023-01-31 > Partitions created by Async indexer could be deleted by regular writers > --

[jira] [Updated] (HUDI-5771) Improve deploy script of release artifacts

2023-02-13 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5771: Sprint: Sprint 2023-01-31 > Improve deploy script of release artifacts > ---

[hudi] branch master updated: [HUDI-3580] [RFC-48] Create RFC for LogCompaction support to Hudi (#5041)

2023-02-13 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new d395f058183 [HUDI-3580] [RFC-48] Create RFC for

[GitHub] [hudi] nsivabalan merged pull request #5041: [HUDI-3580] [RFC-48] Create RFC for LogCompaction support to Hudi

2023-02-13 Thread via GitHub
nsivabalan merged PR #5041: URL: https://github.com/apache/hudi/pull/5041 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apach

[GitHub] [hudi] xushiyan commented on issue #7820: [SUPPORT] Errors are thrown when upserting a record with cleaner service enabled

2023-02-13 Thread via GitHub
xushiyan commented on issue #7820: URL: https://github.com/apache/hudi/issues/7820#issuecomment-1428882771 > Update 2: We believe the issue is related to having the CSE (client-side encryption) on the cluster and does not appear when the encryption is not enabled. @imuntyan Any furth

[GitHub] [hudi] xushiyan closed issue #7757: [SUPPORT] missing records when HoodieDeltaStreamer run in continuous mode

2023-02-13 Thread via GitHub
xushiyan closed issue #7757: [SUPPORT] missing records when HoodieDeltaStreamer run in continuous mode URL: https://github.com/apache/hudi/issues/7757

[GitHub] [hudi] xushiyan commented on issue #7757: [SUPPORT] missing records when HoodieDeltaStreamer run in continuous mode

2023-02-13 Thread via GitHub
xushiyan commented on issue #7757: URL: https://github.com/apache/hudi/issues/7757#issuecomment-1428877513 > It's a known critical bug for 0.11.x release, I have put a fix for 0.12.0: #6179 Closing as fixed.

[jira] [Closed] (HUDI-5686) Missing records when HoodieDeltaStreamer run in continuous

2023-02-13 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-5686. Fix Version/s: 0.12.0 (was: 0.13.1) Resolution: Duplicate > Missing records wh

[GitHub] [hudi] hudi-bot commented on pull request #7934: [HUDI-5777] Support Metrics for Multiple Tables Simultaneously

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7934: URL: https://github.com/apache/hudi/pull/7934#issuecomment-1428867003 ## CI report: * 63ab1c1d6bf7adecea5a59f97c6d0e52648cf0f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[jira] [Updated] (HUDI-5711) NPE occurs when enabling metadata on table which does'nt has metadata previously

2023-02-13 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5711: - Description: https://github.com/apache/hudi/issues/7824#issuecomment-1420170722 > NPE occurs when enablin

[jira] [Updated] (HUDI-5711) NPE occurs when enabling metadata on table which does'nt has metadata previously

2023-02-13 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5711: - Component/s: metadata > NPE occurs when enabling metadata on table which does'nt has metadata > previousl

[GitHub] [hudi] hudi-bot commented on pull request #7935: [DRAFT] Add maven-build-cache-extension

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7935: URL: https://github.com/apache/hudi/pull/7935#issuecomment-1428779381 ## CI report: * 28a41cd538a50171538b8898adcc799628f4fd60 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[jira] [Created] (HUDI-5778) Support Absolute Paths for config includes

2023-02-13 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5778: - Summary: Support Absolute Paths for config includes Key: HUDI-5778 URL: https://issues.apache.org/jira/browse/HUDI-5778 Project: Apache Hudi Issue Type: Im

[GitHub] [hudi] hudi-bot commented on pull request #7935: [DRAFT] Add maven-build-cache-extension

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7935: URL: https://github.com/apache/hudi/pull/7935#issuecomment-1428729323 ## CI report: * 28a41cd538a50171538b8898adcc799628f4fd60 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[GitHub] [hudi] hudi-bot commented on pull request #7934: [HUDI-5777] Support Metrics for Multiple Tables Simultaneously

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7934: URL: https://github.com/apache/hudi/pull/7934#issuecomment-1428729287 ## CI report: * 63ab1c1d6bf7adecea5a59f97c6d0e52648cf0f1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[GitHub] [hudi] hudi-bot commented on pull request #7935: [DRAFT] Add maven-build-cache-extension

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7935: URL: https://github.com/apache/hudi/pull/7935#issuecomment-1428721319 ## CI report: * 28a41cd538a50171538b8898adcc799628f4fd60 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Updated] (HUDI-5777) Support Multiple tables for metrics

2023-02-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5777: - Labels: pull-request-available (was: ) > Support Multiple tables for metrics > --

[GitHub] [hudi] hudi-bot commented on pull request #7934: [HUDI-5777] Support Metrics for Multiple Tables Simultaneously

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7934: URL: https://github.com/apache/hudi/pull/7934#issuecomment-1428721251 ## CI report: * 63ab1c1d6bf7adecea5a59f97c6d0e52648cf0f1 UNKNOWN

[jira] [Created] (HUDI-5777) Support Multiple tables for metrics

2023-02-13 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5777: - Summary: Support Multiple tables for metrics Key: HUDI-5777 URL: https://issues.apache.org/jira/browse/HUDI-5777 Project: Apache Hudi Issue Type: Improveme

[GitHub] [hudi] kazdy opened a new pull request, #7935: [DRAFT] Add maven-build-cache-extension

2023-02-13 Thread via GitHub
kazdy opened a new pull request, #7935: URL: https://github.com/apache/hudi/pull/7935 ### Change Logs Add maven-build-cache-extension to enable incremental builds. ### Impact _Describe any public API or user-facing feature change or any performance impact._ ### Ri

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7362: [HUDI-5315] The record size is dynamically estimated when the table i…

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7362: URL: https://github.com/apache/hudi/pull/7362#discussion_r1105023710 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -418,4 +426,23 @@ public Partitioner ge

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7915: [HUDI-5759] Supports add column on mor table with log

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7915: URL: https://github.com/apache/hudi/pull/7915#discussion_r1105014773 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala: ## @@ -202,6 +202,13 @@ private[sql] object SchemaConver

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7915: [HUDI-5759] Supports add column on mor table with log

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7915: URL: https://github.com/apache/hudi/pull/7915#discussion_r1105014257 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala: ## @@ -202,6 +202,13 @@ private[sql] object SchemaConver

[GitHub] [hudi] xushiyan commented on issue #7835: [SUPPORT] Hudi bootstrapping with METADATA_ONLY option is re-writing the complete dataset instead of just creating HUDI metadata skeleton files separ

2023-02-13 Thread via GitHub
xushiyan commented on issue #7835: URL: https://github.com/apache/hudi/issues/7835#issuecomment-1428677586 cc @lokeshj1703

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7915: [HUDI-5759] Supports add column on mor table with log

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7915: URL: https://github.com/apache/hudi/pull/7915#discussion_r1105007223 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala: ## @@ -202,6 +202,13 @@ private[sql] object SchemaConver

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7871: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7871: URL: https://github.com/apache/hudi/pull/7871#discussion_r1105004523 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala: ## @@ -554,4 +657,16 @@ object MergeIntoHood

[GitHub] [hudi] jonvex opened a new pull request, #7934: metrics now offers multitable support

2023-02-13 Thread via GitHub
jonvex opened a new pull request, #7934: URL: https://github.com/apache/hudi/pull/7934 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

[GitHub] [hudi] xushiyan commented on issue #7877: [SUPPORT] Hudi examples: An exception or error caused a run to abort: org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism

2023-02-13 Thread via GitHub
xushiyan commented on issue #7877: URL: https://github.com/apache/hudi/issues/7877#issuecomment-1428655860 cc @jonvex

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7871: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7871: URL: https://github.com/apache/hudi/pull/7871#discussion_r1104996074 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala: ## @@ -127,164 +155,189 @@ case class MergeI

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7871: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7871: URL: https://github.com/apache/hudi/pull/7871#discussion_r1104995855 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala: ## @@ -28,97 +28,125 @@ import org.apache.h

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7871: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7871: URL: https://github.com/apache/hudi/pull/7871#discussion_r1104985764 ## hudi-common/src/main/java/org/apache/hudi/common/util/CollectionUtils.java: ## @@ -69,6 +69,26 @@ public static boolean nonEmpty(Collection c) { return !isN

[GitHub] [hudi] xushiyan commented on issue #7899: [SUPPORT] Error "Could not create interface org.apache.hudi.org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactory Is the hadoop comp

2023-02-13 Thread via GitHub
xushiyan commented on issue #7899: URL: https://github.com/apache/hudi/issues/7899#issuecomment-1428636069 For classpath dependency conflicts, can you share the exact jars passed to the Spark job, such as what is in `--jars /tmp/XX.jar`, and which jars are on the classpath? You can enable pri

[GitHub] [hudi] xushiyan commented on issue #7902: [SUPPORT].UnresolvedUnionException: Not in union exception occurred when writing data through spark

2023-02-13 Thread via GitHub
xushiyan commented on issue #7902: URL: https://github.com/apache/hudi/issues/7902#issuecomment-1428631149 There is nothing special about this setup, so it looks like a data issue. @nb can you share the schema and some sample data to help reproduce? cc @jonvex

[GitHub] [hudi] xushiyan commented on issue #7909: Failed to create Marker file

2023-02-13 Thread via GitHub
xushiyan commented on issue #7909: URL: https://github.com/apache/hudi/issues/7909#issuecomment-1428613918 ``` Caused by: org.apache.http.NoHttpResponseException: ip-100-67-243-210.8043.aws-int.thomsonreuters.com:38839 failed to respond ``` @koochiswathiTR this usually means tim

[GitHub] [hudi] xushiyan commented on issue #7919: How hudi manages old parquet file with new data

2023-02-13 Thread via GitHub
xushiyan commented on issue #7919: URL: https://github.com/apache/hudi/issues/7919#issuecomment-1428588555 Old parquet files will only be deleted by the cleaner (run as a table service). In MOR, records belonging to the same file group will be compacted and written to new file slices (new parquet

[GitHub] [hudi] xushiyan closed issue #7919: How hudi manages old parquet file with new data

2023-02-13 Thread via GitHub
xushiyan closed issue #7919: How hudi manages old parquet file with new data URL: https://github.com/apache/hudi/issues/7919

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5041: [HUDI-3580] [RFC-48] [UBER] Create RFC for LogCompaction action

2023-02-13 Thread via GitHub
nsivabalan commented on code in PR #5041: URL: https://github.com/apache/hudi/pull/5041#discussion_r1104923554 ## rfc/rfc-48/rfc-48.md: ## @@ -0,0 +1,174 @@ + +# RFC-46: Optimize Record Payload handling + +## Proposers + +- @suryaprasanna + +## Approvers +- @vinothchandar +- @pw

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7672: [HUDI-5557]Avoid converting columns that are not indexed in CSI

2023-02-13 Thread via GitHub
alexeykudinkin commented on code in PR #7672: URL: https://github.com/apache/hudi/pull/7672#discussion_r1104911822 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/ColumnStatsIndexSupport.scala: ## @@ -209,11 +209,11 @@ class ColumnStatsIndexSupport(spar

[GitHub] [hudi] Gatsby-Lee commented on issue #4839: Hudi upsert doesnt trigger compaction for MOR

2023-02-13 Thread via GitHub
Gatsby-Lee commented on issue #4839: URL: https://github.com/apache/hudi/issues/4839#issuecomment-1428450326 AWS Glue has Glue 4.0 (Hudi 0.12.1 support). I haven't tried it yet, but it might support MoR. https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.h

[GitHub] [hudi] hudi-bot commented on pull request #7886: [HUDI-5726]Fix timestamp field is 8 hours longer than the time

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7886: URL: https://github.com/apache/hudi/pull/7886#issuecomment-1428444513 ## CI report: * 88351933e3cc70cadb34f2cbf45f91c0ee108d77 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1428370770 ## CI report: * 74b25c01f1730a9b8d250ab7e41a808fb1f95f22 UNKNOWN * 0c6d81deb75b0aabe16bdd088e65b36ec9d1c387 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #7933: [HUDI-5774] Support for adding labels to prometheus metrics

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7933: URL: https://github.com/apache/hudi/pull/7933#issuecomment-1428359395 ## CI report: * 62dd45b5f189526ba0595abb566c38ed3bcadad6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[jira] [Updated] (HUDI-5776) Investigate Azure CI flakey test custom detection

2023-02-13 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-5776: -- Description: Flakey test detection is currently enabled, but it only seems to be reporting at th

[GitHub] [hudi] xushiyan commented on a diff in pull request #7927: [HUDI-5771] Improve deploy script of release artifacts

2023-02-13 Thread via GitHub
xushiyan commented on code in PR #7927: URL: https://github.com/apache/hudi/pull/7927#discussion_r1104786887 ## scripts/release/deploy_staging_jars.sh: ## @@ -36,38 +36,41 @@ if [ "$#" -gt "1" ]; then exit 1 fi -BUNDLE_MODULES=$(find -s packaging -name 'hudi-*-bundle' -typ

[jira] [Created] (HUDI-5776) Investigate Azure CI flakey test custom detection

2023-02-13 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5776: - Summary: Investigate Azure CI flakey test custom detection Key: HUDI-5776 URL: https://issues.apache.org/jira/browse/HUDI-5776 Project: Apache Hudi Issue T

[jira] [Created] (HUDI-5775) Improve detection of flakey tests in Azure CI

2023-02-13 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5775: - Summary: Improve detection of flakey tests in Azure CI Key: HUDI-5775 URL: https://issues.apache.org/jira/browse/HUDI-5775 Project: Apache Hudi Issue Type:

[GitHub] [hudi] hudi-bot commented on pull request #6121: [HUDI-4406] Support Flink compaction/clustering write error resolvement to avoid data loss

2023-02-13 Thread via GitHub
hudi-bot commented on PR #6121: URL: https://github.com/apache/hudi/pull/6121#issuecomment-1428247024 ## CI report: * 52b6f55e196007f993b0506d899c48bb80b36546 UNKNOWN * c846b9e08ee1c9f1ee39d7076a3a24cebcb162f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] jonvex commented on issue #7910: [SUPPORT]

2023-02-13 Thread via GitHub
jonvex commented on issue #7910: URL: https://github.com/apache/hudi/issues/7910#issuecomment-1428221207 @nsivabalan We have recently seen some questions in Slack about small file sizes. Do you think something bigger than configuration is at play?

[GitHub] [hudi] hudi-bot commented on pull request #7362: [HUDI-5315] The record size is dynamically estimated when the table i…

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7362: URL: https://github.com/apache/hudi/pull/7362#issuecomment-1428137081 ## CI report: * b3e842754a302dc1372b330a8c32298d49732107 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1483

[GitHub] [hudi] hudi-bot commented on pull request #7918: [MINOR] Fix spark sql run clean do not exit

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7918: URL: https://github.com/apache/hudi/pull/7918#issuecomment-1428025828 ## CI report: * 0f35441097e274abe020127c5bd2a5f3d46e0b99 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1512

[GitHub] [hudi] sandyfog commented on a diff in pull request #7886: [HUDI-5726]Fix timestamp field is 8 hours longer than the time

2023-02-13 Thread via GitHub
sandyfog commented on code in PR #7886: URL: https://github.com/apache/hudi/pull/7886#discussion_r1104494581 ## hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/storage/row/RowDataParquetWriteSupport.java: ## @@ -53,7 +53,7 @@ public WriteContext init(Configuration

[GitHub] [hudi] hudi-bot commented on pull request #7931: [HUDI-5773] Support archive command for spark sql

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7931: URL: https://github.com/apache/hudi/pull/7931#issuecomment-1427964455 ## CI report: * 978d8b7b51f80bbeb22891c53d85b2ca5e166efd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[GitHub] [hudi] hudi-bot commented on pull request #7886: [HUDI-5726]Fix timestamp field is 8 hours longer than the time

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7886: URL: https://github.com/apache/hudi/pull/7886#issuecomment-1427964078 ## CI report: * 8e69fe9f5530f7249c48e6c28691783c449302fd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1510

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1427953239 ## CI report: * 74b25c01f1730a9b8d250ab7e41a808fb1f95f22 UNKNOWN * 17034dd7b4b2316e10ce3dc2829b6e9a3fe8f0f6 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[GitHub] [hudi] hudi-bot commented on pull request #7886: [HUDI-5726]Fix timestamp field is 8 hours longer than the time

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7886: URL: https://github.com/apache/hudi/pull/7886#issuecomment-1427952865 ## CI report: * 8e69fe9f5530f7249c48e6c28691783c449302fd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1510

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1427943258 ## CI report: * 82b52107672f324918988ef7b9b914fe992202df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[GitHub] [hudi] liaotian1005 commented on pull request #7633: [HUDI-5737] Fix Deletes issued without any prior commits

2023-02-13 Thread via GitHub
liaotian1005 commented on PR #7633: URL: https://github.com/apache/hudi/pull/7633#issuecomment-1427938398 > The cmd to rebase the code: > > ```shell > -- fetch upstream branch 'master', assume you already set up the upstream as the hudi git repository > git fetch upstream master

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #7891: [HUDI-5728] HoodieTimelineArchiver archives the latest instant before inflight replacecommit

2023-02-13 Thread via GitHub
zhuanshenbsj1 commented on code in PR #7891: URL: https://github.com/apache/hudi/pull/7891#discussion_r1103965694 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -473,6 +473,33 @@ private Stream getCommitInstantsToArchive

[GitHub] [hudi] hudi-bot commented on pull request #7894: [HUDI-5729] Fix RowDataKeyGen method getRecordKey

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7894: URL: https://github.com/apache/hudi/pull/7894#issuecomment-1427882316 ## CI report: * 7576ed9c5b6208a497a55cfbf4c7ba463a34cc4a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[GitHub] [hudi] hudi-bot commented on pull request #7933: [HUDI-5774] Support for adding labels to prometheus metrics

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7933: URL: https://github.com/apache/hudi/pull/7933#issuecomment-1427874194 ## CI report: * 62dd45b5f189526ba0595abb566c38ed3bcadad6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1514

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1427874114 ## CI report: * 82b52107672f324918988ef7b9b914fe992202df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[jira] [Updated] (HUDI-5774) Support for adding labels to prometheus metrics

2023-02-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5774: - Labels: pull-request-available (was: ) > Support for adding labels to prometheus metrics > --

[GitHub] [hudi] hudi-bot commented on pull request #7933: [HUDI-5774] Support for adding labels to prometheus metrics

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7933: URL: https://github.com/apache/hudi/pull/7933#issuecomment-1427864275 ## CI report: * 62dd45b5f189526ba0595abb566c38ed3bcadad6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1427864214 ## CI report: * 82b52107672f324918988ef7b9b914fe992202df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1427854179 ## CI report: * 82b52107672f324918988ef7b9b914fe992202df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[jira] [Created] (HUDI-5774) Support for adding labels to prometheus metrics

2023-02-13 Thread Lokesh Jain (Jira)
Lokesh Jain created HUDI-5774: - Summary: Support for adding labels to prometheus metrics Key: HUDI-5774 URL: https://issues.apache.org/jira/browse/HUDI-5774 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] lokeshj1703 opened a new pull request, #7933: Add labels to prometheus

2023-02-13 Thread via GitHub
lokeshj1703 opened a new pull request, #7933: URL: https://github.com/apache/hudi/pull/7933 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performa

[GitHub] [hudi] hudi-bot commented on pull request #6121: [HUDI-4406] Support Flink compaction/clustering write error resolvement to avoid data loss

2023-02-13 Thread via GitHub
hudi-bot commented on PR #6121: URL: https://github.com/apache/hudi/pull/6121#issuecomment-1427805620 ## CI report: * 52b6f55e196007f993b0506d899c48bb80b36546 UNKNOWN * 5dc463fcade7c5a495cca1437fca8230b01d0229 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7928: URL: https://github.com/apache/hudi/pull/7928#issuecomment-1427800516 ## CI report: * 82b52107672f324918988ef7b9b914fe992202df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[GitHub] [hudi] SteNicholas commented on a diff in pull request #7928: [HUDI-5772] Align Flink clustering configuration with HoodieClusteringConfig

2023-02-13 Thread via GitHub
SteNicholas commented on code in PR #7928: URL: https://github.com/apache/hudi/pull/7928#discussion_r1104354977 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java: ## @@ -712,6 +708,36 @@ private FlinkOptions() { .defaultValu

[GitHub] [hudi] lokeshj1703 commented on pull request #7932: [DOCS] [WIP] Add doc update for HUDI-5295

2023-02-13 Thread via GitHub
lokeshj1703 commented on PR #7932: URL: https://github.com/apache/hudi/pull/7932#issuecomment-1427793172 https://user-images.githubusercontent.com/9255455/218447404-723e31be-b3f4-4c97-99ac-81cf3b9bc354.png -- This is an automated message from the Apache Git Service. To respond to th

[GitHub] [hudi] lokeshj1703 opened a new pull request, #7932: [DOCS] [WIP] Add doc update for HUDI-5295

2023-02-13 Thread via GitHub
lokeshj1703 opened a new pull request, #7932: URL: https://github.com/apache/hudi/pull/7932 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performa

[GitHub] [hudi] hudi-bot commented on pull request #7362: [HUDI-5315] The record size is dynamically estimated when the table i…

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7362: URL: https://github.com/apache/hudi/pull/7362#issuecomment-1427781275 ## CI report: * b3e842754a302dc1372b330a8c32298d49732107 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1483

[GitHub] [hudi] hudi-bot commented on pull request #6121: [HUDI-4406] Support Flink compaction/clustering write error resolvement to avoid data loss

2023-02-13 Thread via GitHub
hudi-bot commented on PR #6121: URL: https://github.com/apache/hudi/pull/6121#issuecomment-1427779190 ## CI report: * 52b6f55e196007f993b0506d899c48bb80b36546 UNKNOWN * 5dc463fcade7c5a495cca1437fca8230b01d0229 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] weimingdiit commented on pull request #7362: [HUDI-5315] The record size is dynamically estimated when the table i…

2023-02-13 Thread via GitHub
weimingdiit commented on PR #7362: URL: https://github.com/apache/hudi/pull/7362#issuecomment-1427769776 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] chenshzh commented on pull request #6121: [HUDI-4406] Support Flink compaction/clustering write error resolvement to avoid data loss

2023-02-13 Thread via GitHub
chenshzh commented on PR #6121: URL: https://github.com/apache/hudi/pull/6121#issuecomment-1427748025 > Thanks for the contribution, reviewed and attach a patch here: [4406.zip](https://github.com/apache/hudi/files/10720334/4406.zip) > > You can apply the patch with cmd: > > ``

[GitHub] [hudi] hudi-bot commented on pull request #7931: [HUDI-5773] Support archive command for spark sql

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7931: URL: https://github.com/apache/hudi/pull/7931#issuecomment-1427713005 ## CI report: * 978d8b7b51f80bbeb22891c53d85b2ca5e166efd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1513

[GitHub] [hudi] hudi-bot commented on pull request #7918: [MINOR] Fix spark sql run clean do not exit

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7918: URL: https://github.com/apache/hudi/pull/7918#issuecomment-1427712874 ## CI report: * 0f35441097e274abe020127c5bd2a5f3d46e0b99 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1512

[GitHub] [hudi] hudi-bot commented on pull request #7894: [HUDI-5729] Fix RowDataKeyGen method getRecordKey

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7894: URL: https://github.com/apache/hudi/pull/7894#issuecomment-1427712679 ## CI report: * 3d5313695087d4198d4439201cdb2588b6df5e41 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=151

[GitHub] [hudi] sandyfog commented on a diff in pull request #7894: [HUDI-5729] Fix RowDataKeyGen method getRecordKey

2023-02-13 Thread via GitHub
sandyfog commented on code in PR #7894: URL: https://github.com/apache/hudi/pull/7894#discussion_r1104272567 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/RowDataKeyGen.java: ## @@ -138,9 +142,11 @@ public static RowDataKeyGen instance(Configuration

[GitHub] [hudi] hudi-bot commented on pull request #7931: [HUDI-5773] Support archive command for spark sql

2023-02-13 Thread via GitHub
hudi-bot commented on PR #7931: URL: https://github.com/apache/hudi/pull/7931#issuecomment-1427702925 ## CI report: * 978d8b7b51f80bbeb22891c53d85b2ca5e166efd UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] stream2000 commented on pull request #7918: [MINOR] Fix spark sql run clean do not exit

2023-02-13 Thread via GitHub
stream2000 commented on PR #7918: URL: https://github.com/apache/hudi/pull/7918#issuecomment-1427702787 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
