[jira] [Created] (HUDI-3104) Hudi-kafka-connect can not scan hadoop config files by HADOOP_CONF_DIR
cdmikechen created HUDI-3104:

Summary: Hudi-kafka-connect can not scan hadoop config files by HADOOP_CONF_DIR
Key: HUDI-3104
URL: https://issues.apache.org/jira/browse/HUDI-3104
Project: Apache Hudi
Issue Type: Bug
Components: configs
Reporter: cdmikechen
Fix For: 0.10.1

I used hudi-kafka-connect to test pulling Kafka topic data into Hudi. I built a Kafka Connect Docker image from this Dockerfile:
{code}
FROM confluentinc/cp-kafka-connect:6.1.1
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-hdfs:10.1.3
COPY hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar /usr/share/confluent-hub-components/confluentinc-kafka-connect-hdfs/lib
{code}
When I started the container and submitted a task, Hudi reported this error:
{code}
[2021-12-27 15:04:55,214] INFO Setting record key volume and partition fields date for table hdfs://hdp-syzh-cluster/hive/warehouse/default.db/hudi-test-topichudi-test-topic (org.apache.hudi.connect.writers.KafkaConnectTransactionServices)
[2021-12-27 15:04:55,224] INFO Initializing hdfs://hdp-syzh-cluster/hive/warehouse/default.db/hudi-test-topic as hoodie table hdfs://hdp-syzh-cluster/hive/warehouse/default.db/hudi-test-topic (org.apache.hudi.common.table.HoodieTableMetaClient)
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/usr/share/confluent-hub-components/confluentinc-kafka-connect-hdfs/lib/hadoop-auth-2.10.1.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[2021-12-27 15:04:55,571] WARN Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (org.apache.hadoop.util.NativeCodeLoader)
[2021-12-27 15:04:56,154] ERROR Fatal error initializing task null for partition 0 (org.apache.hudi.connect.HoodieSinkTask)
org.apache.hudi.exception.HoodieException: Fatal error instantiating Hudi Transaction Services
    at org.apache.hudi.connect.writers.KafkaConnectTransactionServices.<init>(KafkaConnectTransactionServices.java:113) ~[hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT]
    at org.apache.hudi.connect.transaction.ConnectTransactionCoordinator.<init>(ConnectTransactionCoordinator.java:88) ~[hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT]
    at org.apache.hudi.connect.HoodieSinkTask.bootstrap(HoodieSinkTask.java:191) [hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT]
    at org.apache.hudi.connect.HoodieSinkTask.open(HoodieSinkTask.java:151) [hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT]
    at org.apache.kafka.connect.runtime.WorkerSinkTask.openPartitions(WorkerSinkTask.java:640) [connect-runtime-6.1.1-ccs.jar:?]
    at org.apache.kafka.connect.runtime.WorkerSinkTask.access$1100(WorkerSinkTask.java:71) [connect-runtime-6.1.1-ccs.jar:?]
    at org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance.onPartitionsAssigned(WorkerSinkTask.java:705) [connect-runtime-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsAssigned(ConsumerCoordinator.java:293) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:430) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:449) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:365) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1257) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1226) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1206) [kafka-clients-6.1.1-ccs.jar:?]
    at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:457) [connect-runtime-6.1.1-ccs.jar:?]
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:324) [connect-runtime-6.1.1-ccs.jar:?]
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232) [
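For readers unfamiliar with what the issue title expects, here is a minimal sketch of discovering Hadoop client settings through `HADOOP_CONF_DIR`. It is written in Python for brevity, not in Hudi's Java, and it is not the fix that was applied to Hudi. The helper names `read_site_file` and `load_hadoop_conf` are hypothetical; the only things taken as given are the standard `core-site.xml` / `hdfs-site.xml` file names and the `<configuration><property><name>/<value>` layout of Hadoop site files.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard Hadoop client config file names; the helper names below are hypothetical.
SITE_FILES = ("core-site.xml", "hdfs-site.xml")


def read_site_file(path):
    """Parse one Hadoop *-site.xml file into a {name: value} dict."""
    props = {}
    root = ET.parse(path).getroot()
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def load_hadoop_conf(env=None):
    """Merge properties from the site files found under HADOOP_CONF_DIR.

    Returns an empty dict when the variable is unset or the directory is missing.
    """
    env = os.environ if env is None else env
    conf_dir = env.get("HADOOP_CONF_DIR")
    merged = {}
    if not conf_dir or not Path(conf_dir).is_dir():
        return merged
    for file_name in SITE_FILES:
        candidate = Path(conf_dir) / file_name
        if candidate.is_file():
            # Files later in SITE_FILES override earlier ones on key collisions.
            merged.update(read_site_file(candidate))
    return merged
```

A connector that performed a lookup of this kind at start-up would see the settings in those files (for example, how a name such as `hdp-syzh-cluster` is defined) without them being repeated in the connector configuration.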
[GitHub] [hudi] hudi-bot removed a comment on pull request #4287: [DO NOT MERGE] 0.10.0 release patch for flink
hudi-bot removed a comment on pull request #4287: URL: https://github.com/apache/hudi/pull/4287#issuecomment-997321582 ## CI report: * 45769dd17905240d5b513d304e5f9e86fe094642 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4539) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot commented on pull request #4287: [DO NOT MERGE] 0.10.0 release patch for flink
hudi-bot commented on pull request #4287: URL: https://github.com/apache/hudi/pull/4287#issuecomment-1001395825 ## CI report: * 45769dd17905240d5b513d304e5f9e86fe094642 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4539) * 5b7a535559d80359a3febc2d1a80bf9a8ac20cf9 UNKNOWN
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001343881 ## CI report: * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735) * 5b45722373a6802da251ad35ebd146379cddb6f7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4736)
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001372419 ## CI report: * 5b45722373a6802da251ad35ebd146379cddb6f7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4736)
[GitHub] [hudi] China-JasonW commented on issue #3042: [SUPPORT]Caused by: java.lang.NoSuchMethodError: org.apache.parquet.schema.Types$PrimitiveBuilder.as(Lorg/apache/parquet/schema/LogicalTypeAnnota
China-JasonW commented on issue #3042: URL: https://github.com/apache/hudi/issues/3042#issuecomment-1001361825 I hit the same exception. How can I fix it?
[GitHub] [hudi] danny0405 commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
danny0405 commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001357063

> > > > I have fired a fix in #4443
> > >
> > > Does hudi0.9 have this bug?
> >
> > No, but generally 0.10 is better.
>
> After the patch file is compiled, the restart task still reports this error. Do you want to delete the data and then write it again?

You can delete the error meta file only.
[jira] [Updated] (HUDI-3026) HoodieAppendhandle may result in duplicate key for hbase index
[ https://issues.apache.org/jira/browse/HUDI-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZiyueGuan updated HUDI-3026:

Description:
Problem: The same key may occur in two file groups when the HBase index is used. These two file groups will have the same FileID prefix. As the HBase index is global, this is unexpected.

How to repro: We need a table whose records are not sorted in Spark. Let's say we have records 1, 2, 3, 4, 5 to write. They may be iterated in a different order on each attempt. In the first attempt (attempt 1), we write records 5, 4, 3 to fileID_1_log.1_attempt1, but this attempt fails. Spark retries in a second task attempt (attempt 2), in which we write records 1, 2, 3, 4 to fileID_1_log.1_attempt2. Then canWrite reports that this file group is large enough, so Hudi writes record 5 to fileID_2_log.1_attempt2 and finishes the commit. When we run compaction, fileID_1_log.1_attempt1 and fileID_1_log.1_attempt2 are both compacted. We end up with 543 + 1234 = 12345 in fileID_1 while we also have 5 in fileID_2. Record 5 appears in two file groups.

Reason: The marker file doesn't reconcile log files, as the code shows in [https://github.com/apache/hudi/blob/9a2030ab3190acf600ce4820be9a08929595763e/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L553]. And log files are actually not fail-safe. I'm not sure if [~danny0405] has found this problem too, as I see FlinkAppendHandle had been made to always return true. But it was changed back recently.

Solution: A quick fix may be to make canWrite in HoodieAppendHandle always return true. However, I think there may be a more elegant solution: use the append result to generate the compaction plan rather than listing log files, which would give more granular control over log blocks instead of log files.

was:
Problem: The same key may occur in two file groups. These two file groups will have the same FileID prefix.

How to repro: We need a table whose records are not sorted in Spark. Let's say we have records 1, 2, 3, 4, 5 to write. They may be iterated in a different order on each attempt. In the first attempt (attempt 1), we write records 5, 4, 3 to fileID_1_log.1_attempt1, but this attempt fails. Spark retries in a second task attempt (attempt 2), in which we write records 1, 2, 3, 4 to fileID_1_log.1_attempt2. Then canWrite reports that this file group is large enough, so Hudi writes record 5 to fileID_2_log.1_attempt2 and finishes the commit. When we run compaction, fileID_1_log.1_attempt1 and fileID_1_log.1_attempt2 are both compacted. We end up with 543 + 1234 = 12345 in fileID_1 while we also have 5 in fileID_2. Record 5 appears in two file groups.

Reason: The marker file doesn't reconcile log files, as the code shows in [https://github.com/apache/hudi/blob/9a2030ab3190acf600ce4820be9a08929595763e/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L553]. And log files are actually not fail-safe. I'm not sure if [~danny0405] has found this problem too, as I see FlinkAppendHandle had been made to always return true. But it was changed back recently.

Solution: A quick fix may be to make canWrite in HoodieAppendHandle always return true. However, I think there may be a more elegant solution: use the append result to generate the compaction plan rather than listing log files, which would give more granular control over log blocks instead of log files.

> HoodieAppendhandle may result in duplicate key for hbase index
> --
>
> Key: HUDI-3026
> URL: https://issues.apache.org/jira/browse/HUDI-3026
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: ZiyueGuan
> Assignee: ZiyueGuan
> Priority: Major
> Labels: pull-request-available
>
> Problem: The same key may occur in two file groups when the HBase index is used.
> These two file groups will have the same FileID prefix. As the HBase index is global,
> this is unexpected.
> How to repro:
> We need a table whose records are not sorted in Spark. Let's say we have
> records 1, 2, 3, 4, 5 to write. They may be iterated in a different order on each attempt.
> In the first attempt (attempt 1), we write records 5, 4, 3 to fileID_1_log.1_attempt1, but this
> attempt fails. Spark retries in a second task attempt (attempt 2), in which
> we write records 1, 2, 3, 4 to fileID_1_log.1_attempt2. Then canWrite reports that this file group
> is large enough, so Hudi writes record 5 to
> fileID_2_log.1_attempt2 and finishes the commit.
> When we run compaction, fileID_1_log.1_attempt1 and fileID_1_log.1_attempt2
> are both compacted. We end up with 543 + 1234 = 12345 in fileID_1 while we
> also have 5 in fileID_2. Record 5 appears in two file groups.
> Reason: The marker file doesn't reconcile log files, as the code shows in
> [https://github.com/apache/hudi/blob/9a2030ab3190acf600ce4820be9a08929595763e/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/Ho
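The repro steps in HUDI-3026 can be traced with a toy model. This is a sketch in Python, not Hudi code: `write_attempt` and `compact` are hypothetical stand-ins, the capacity of 4 records per file group is an assumed number chosen to match the 1234 / 5 split in the report, and `canWrite` is reduced to a simple record count.

```python
from collections import defaultdict


def write_attempt(records, attempt, max_records_per_group):
    """Append records to log files, rolling to a new file group once the
    current one is full (a stand-in for the canWrite check)."""
    logs = defaultdict(list)  # (file_id, attempt) -> records in that log file
    file_id = 1
    for record in records:
        if len(logs[(file_id, attempt)]) >= max_records_per_group:
            file_id += 1
        logs[(file_id, attempt)].append(record)
    return dict(logs)


def compact(log_files):
    """Merge every log file of a file group, regardless of which attempt wrote it."""
    groups = defaultdict(set)
    for (file_id, _attempt), records in log_files.items():
        groups[file_id].update(records)
    return {file_id: sorted(records) for file_id, records in groups.items()}


# Attempt 1 iterates 5, 4, 3 and then fails; its log file is left behind.
leftover = write_attempt([5, 4, 3], attempt=1, max_records_per_group=4)
# Attempt 2 iterates in a different order and succeeds.
committed = write_attempt([1, 2, 3, 4, 5], attempt=2, max_records_per_group=4)

compacted = compact({**leftover, **committed})
duplicates = sorted(set(compacted[1]) & set(compacted[2]))
print(compacted)   # {1: [1, 2, 3, 4, 5], 2: [5]}
print(duplicates)  # [5]
```

In this model the duplicate exists only because the failed attempt's log file is still picked up at compaction time while the successful attempt placed record 5 in a different file group, which is the interaction the report describes.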
[GitHub] [hudi] guanziyue commented on pull request #4214: [HUDI-2928] Switching default Parquet's column encoding to zstd
guanziyue commented on pull request #4214: URL: https://github.com/apache/hudi/pull/4214#issuecomment-1001350751 We do use zstd in Hudi and have benefited a lot from it. However, as mentioned in this issue, zstd is not widely bundled in Hadoop 2.x or Spark 2.x environments. Such an upgrade may need more work on the Hudi side.
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001343881 ## CI report: * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735) * 5b45722373a6802da251ad35ebd146379cddb6f7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4736)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001343150 ## CI report: * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735) * 5b45722373a6802da251ad35ebd146379cddb6f7 UNKNOWN
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001341495 ## CI report: * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735)
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001343150 ## CI report: * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735) * 5b45722373a6802da251ad35ebd146379cddb6f7 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001341495 ## CI report: * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001316238 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733) * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot removed a comment on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001300542 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN * f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN * 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4734)
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001317161 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN * f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN * 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4734)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001315713 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733) * 5b55f33d803d901cdf7881358546b948aadc5de1 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001316238 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733) * 5b55f33d803d901cdf7881358546b948aadc5de1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4735)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001257177 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733)
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001315713 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733) * 5b55f33d803d901cdf7881358546b948aadc5de1 UNKNOWN
[GitHub] [hudi] zhangyue19921010 commented on pull request #4346: [HUDI-3045] New ClusteringPlanStrategy to use regex choose partitions when building clustering plan
zhangyue19921010 commented on pull request #4346: URL: https://github.com/apache/hudi/pull/4346#issuecomment-1001314000 Ack! Will do it asap :)
[GitHub] [hudi] waywtdcc commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001304712

> Did you use the offline compaction yet ?

Yes, I use offline compaction. The error is:
`
2021-12-27T10:59:37.248+0800: 69.147: Total time for which application threads were stopped: 0.0002348 seconds, Stopping threads took: 0.544 seconds
2021-12-27T10:59:37.253+0800: 69.153: Total time for which application threads were stopped: 0.0002075 seconds, Stopping threads took: 0.396 seconds
2021-12-27T10:59:37.254+0800: 69.153: Total time for which application threads were stopped: 0.0001900 seconds, Stopping threads took: 0.293 seconds
2021-12-27T10:59:37.254+0800: 69.154: Total time for which application threads were stopped: 0.0001623 seconds, Stopping threads took: 0.290 seconds
2021-12-27T10:59:37.255+0800: 69.154: Total time for which application threads were stopped: 0.0001547 seconds, Stopping threads took: 0.273 seconds
2021-12-27T10:59:37.255+0800: 69.154: Total time for which application threads were stopped: 0.0001694 seconds, Stopping threads took: 0.235 seconds
2021-12-27T10:59:37.255+0800: 69.155: Total time for which application threads were stopped: 0.0001623 seconds, Stopping threads took: 0.263 seconds
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Failed to execute job 'flink_hudi_compaction'.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: org.apache.flink.util.FlinkException: Failed to execute job 'flink_hudi_compaction'.
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1970)
    at org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:137)
    at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:76)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1834)
    at org.apache.hudi.sink.compact.HoodieFlinkCompactor.main(HoodieFlinkCompactor.java:155)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
    ... 11 more
Caused by: java.lang.RuntimeException: Error while waiting for job to be initialized
    at org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160)
    at org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:443)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number
[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot removed a comment on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001300012 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN * f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN * 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001300542 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN * f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN * 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4734)
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001300012 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN * f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN * 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot removed a comment on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001210539 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN * f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN * 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731)
[GitHub] [hudi] leesf commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
leesf commented on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001299699 @hudi-bot run azure
[GitHub] [hudi] vinothchandar commented on a change in pull request #3173: [HUDI-1951] Add bucket hash index, compatible with the hive bucket
vinothchandar commented on a change in pull request #3173: URL: https://github.com/apache/hudi/pull/3173#discussion_r775310381 ## File path: hudi-client/hudi-client-common/src/test/resources/hive_bucket_id_check.csv ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +true,0,527706,0.9105978527125206,0.5821051620180587,57 Review comment: I'd avoid checking in these data files
[GitHub] [hudi] vinothchandar commented on a change in pull request #3173: [HUDI-1951] Add bucket hash index, compatible with the hive bucket
vinothchandar commented on a change in pull request #3173: URL: https://github.com/apache/hudi/pull/3173#discussion_r775310284 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/storage/HoodieStorageLayout.java ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hudi.table.storage; + +import org.apache.hudi.common.model.WriteOperationType; + +import java.io.Serializable; + +public class HoodieStorageLayout implements Serializable { Review comment: nts: Review these classes for names, docs ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java ## @@ -195,6 +195,10 @@ public boolean canWrite(HoodieRecord record) { return false; } + public boolean canSwitchToNewOne() { Review comment: Consider naming this better? 
## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/deltacommit/AbstractSparkDeltaCommitActionExecutor.java ## @@ -74,8 +74,8 @@ public Partitioner getUpsertPartitioner(WorkloadProfile profile) { public Iterator> handleUpdate(String partitionPath, String fileId, Iterator> recordItr) throws IOException { LOG.info("Merging updates for commit " + instantTime + " for file " + fileId); - -if (!table.getIndex().canIndexLogFiles() && mergeOnReadUpsertPartitioner.getSmallFileIds().contains(fileId)) { +if (!table.getIndex().canIndexLogFiles() && mergeOnReadUpsertPartitioner != null Review comment: let me take a closer look and suggest what I think ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndex.java ## @@ -122,13 +123,29 @@ public O updateLocation(O writeStatuses, HoodieEngineContext context, @PublicAPIMethod(maturity = ApiMaturityLevel.STABLE) public abstract boolean isImplicitWithStorage(); + /** + * If the `getCustomizedPartitioner` returns a partitioner, it has to be true. + */ + @PublicAPIMethod(maturity = ApiMaturityLevel.EVOLVING) + public boolean performTagging(WriteOperationType operationType) { +switch (operationType) { + case INSERT: + case INSERT_OVERWRITE: +return false; + case UPSERT: Review comment: I think this is better push into `WriteOperationType` and we have more operations that may need tagging. ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java ## @@ -303,7 +305,7 @@ protected void commit(Option> extraMetadata, HoodieWriteMeta @SuppressWarnings("unchecked") protected Iterator> handleUpsertPartition(String instantTime, Integer partition, Iterator recordItr, Partitioner partitioner) { -UpsertPartitioner upsertPartitioner = (UpsertPartitioner) partitioner; +SparkHoodiePartitioner upsertPartitioner = (SparkHoodiePartitioner) partitioner; Review comment: is this renamed? 
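The review suggestion above, to push the tagging decision into `WriteOperationType` rather than keep a `switch` in the index, could look roughly like the sketch below. This is only an illustration under stated assumptions: `TaggingSketch` and `OperationType` are hypothetical stand-ins, not Hudi classes, and the `true` for `UPSERT` is an assumption because the quoted diff is cut off before that branch returns.

```java
// Hypothetical sketch; OperationType is a stand-in, not Hudi's WriteOperationType.
class TaggingSketch {

  enum OperationType {
    INSERT(false),
    INSERT_OVERWRITE(false),
    // Assumption: the quoted diff ends before showing what UPSERT returns.
    UPSERT(true);

    private final boolean requiresTagging;

    OperationType(boolean requiresTagging) {
      this.requiresTagging = requiresTagging;
    }

    // Each constant carries its own answer, so new operations must declare one.
    boolean requiresTagging() {
      return requiresTagging;
    }
  }

  // The index delegates to the operation type instead of switching on it.
  static boolean performTagging(OperationType operationType) {
    return operationType.requiresTagging();
  }

  public static void main(String[] args) {
    for (OperationType type : OperationType.values()) {
      System.out.println(type + " requires tagging: " + performTagging(type));
    }
  }
}
```

Keeping the flag on the enum constant means a newly added operation cannot silently fall through a `switch` default; whether that matches the final design of the PR is not stated in this thread.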
## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSourceWithBucket.scala ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hudi.functional + +import org.apache.hudi.common.testutils.HoodieTestDataGenerator +import org.apache.hudi.config.{Hoodi
[GitHub] [hudi] waywtdcc edited a comment on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc edited a comment on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001292749

> > > I have fired a fix in #4443
> >
> > Does hudi0.9 have this bug?
>
> No, but generally 0.10 is better.

After compiling with the patch, the restarted task still reports this error. Do I need to delete the data and then write it again?
[GitHub] [hudi] waywtdcc commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001292749

> > > I have fired a fix in #4443
> >
> > Does hudi0.9 have this bug?
>
> No, but generally 0.10 is better.

After restarting the task, it still reports this error. Do I need to delete the data and then write it again?
[GitHub] [hudi] manojpec edited a comment on pull request #4067: [WIP][HUDI-2763] Metadata table records key deduplication
manojpec edited a comment on pull request #4067: URL: https://github.com/apache/hudi/pull/4067#issuecomment-1001237923 @vinothchandar @prashantwason Posted PR https://github.com/apache/hudi/pull/4447 based on the comments and discussion we had here. Please take a look. This is a generic version and the Log/Base file readers/writers pass in the needed key field for de-duplication and materialization. And here is the much simplified version of the PR https://github.com/apache/hudi/pull/4449, where the HFile readers and writers do the key deduplication and materialization based on the hardcoded metadata payload key field 'key'.
[GitHub] [hudi] hudi-bot commented on pull request #4449: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
hudi-bot commented on pull request #4449: URL: https://github.com/apache/hudi/pull/4449#issuecomment-1001272026 ## CI report: * dc9fe1b878dc47eaed13911fc5ca7eaffb80fb2f UNKNOWN
[GitHub] [hudi] manojpec opened a new pull request #4449: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
manojpec opened a new pull request #4449: URL: https://github.com/apache/hudi/pull/4449

## What is the purpose of the pull request

- The backing log format for the metadata table is HFile, a KeyValue type. Since the key field in the metadata record payload duplicates the key in the Cell, the redundant key field in the record can be emptied to save storage.
- HoodieHFileWriter and HoodieHFileDataBlock now serialize records with the key field emptied by default. The HFile writer checks whether the record has the metadata payload schema field 'key' and, if so, trims the key from the record payload.
- When reading the serialized records back from disk, HoodieHFileReader materializes the missing key fields, if any. The HFile reader checks whether the record has the metadata payload schema field 'key' and, if so, materializes the key in the record payload.

## Verify this pull request

- Tests have been added to verify the default virtual keys and key deduplication support for the metadata table records.
- Manually verified that the key field is trimmed from the serialized records on disk.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
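The write-side trimming and read-side materialization described in the PR above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `KeyDedupSketch`, `SortedStore`, and the `Map`-based record are hypothetical stand-ins for the HFile writer, reader, and record types, and only the field name 'key' comes from the PR description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these types are Hudi or HBase classes.
class KeyDedupSketch {

  // The metadata payload key field named in the PR description.
  static final String KEY_FIELD = "key";

  // Stand-in for a key-value file: a sorted map from cell key to stored record.
  static final class SortedStore {
    private final TreeMap<String, Map<String, String>> cells = new TreeMap<>();

    // Write path: the cell key already holds the key, so empty it in the record.
    void write(Map<String, String> record) {
      String key = record.get(KEY_FIELD);
      if (key == null || key.isEmpty()) {
        // Simplification for this sketch: every record must carry a key.
        throw new IllegalArgumentException("record has no '" + KEY_FIELD + "' field");
      }
      Map<String, String> trimmed = new HashMap<>(record);
      trimmed.put(KEY_FIELD, "");
      cells.put(key, trimmed);
    }

    // Read path: restore the emptied key field from the cell key.
    Map<String, String> read(String key) {
      Map<String, String> stored = cells.get(key);
      if (stored == null) {
        return null;
      }
      Map<String, String> materialized = new HashMap<>(stored);
      String storedKey = materialized.get(KEY_FIELD);
      if (storedKey != null && storedKey.isEmpty()) {
        materialized.put(KEY_FIELD, key);
      }
      return materialized;
    }

    // Exposes the stored form so the trimming can be inspected.
    Map<String, String> rawStored(String key) {
      return cells.get(key);
    }
  }

  public static void main(String[] args) {
    SortedStore store = new SortedStore();
    Map<String, String> record = new HashMap<>();
    record.put(KEY_FIELD, "partition-1");
    record.put("type", "2");
    store.write(record);
    System.out.println("stored:       " + store.rawStored("partition-1"));
    System.out.println("materialized: " + store.read("partition-1"));
  }
}
```

The caller's record is copied before trimming so that the write does not mutate the input; whether the actual writer copies or mutates is not stated in the PR text.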
[GitHub] [hudi] waywtdcc commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001267449

But we care more about stability. I feel that 0.10 is not stable enough, right?
[GitHub] [hudi] danny0405 commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
danny0405 commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001266146

> > I have fired a fix in #4443
>
> Does hudi0.9 have this bug?

No, but generally 0.10 is better.
[GitHub] [hudi] waywtdcc commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001263459

> I have fired a fix in #4443

Does hudi0.9 have this bug?
[GitHub] [hudi] waywtdcc removed a comment on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc removed a comment on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001263394 Does hudi0.9 have this bug? > I have fired a fix in #4443 Does hudi0.9 have this bug?
[GitHub] [hudi] waywtdcc commented on issue #4439: [BUG] ROLLBACK meet Cannot use marker based rollback strategy on completed error
waywtdcc commented on issue #4439: URL: https://github.com/apache/hudi/issues/4439#issuecomment-1001263394 Does hudi0.9 have this bug? > I have fired a fix in #4443 Does hudi0.9 have this bug?
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001257177 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001250517 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot removed a comment on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001250295 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001250517 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4733)
[GitHub] [hudi] hudi-bot commented on pull request #4448: [HUDI-2987] Fix payload event time extraction
hudi-bot commented on pull request #4448: URL: https://github.com/apache/hudi/pull/4448#issuecomment-1001250295 ## CI report: * fb8f38443ab24bf7c8b1a1d8a5c37e6fa3e653f3 UNKNOWN
[jira] [Updated] (HUDI-2987) event time not recorded in commit metadata when insert or bulk insert
[ https://issues.apache.org/jira/browse/HUDI-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-2987:
    Labels: pull-request-available sev:high (was: sev:high)

> event time not recorded in commit metadata when insert or bulk insert
>
> Key: HUDI-2987
> URL: https://issues.apache.org/jira/browse/HUDI-2987
> Project: Apache Hudi
> Issue Type: Bug
> Components: Writer Core
> Reporter: Raymond Xu
> Assignee: Raymond Xu
> Priority: Blocker
> Labels: pull-request-available, sev:high
> Fix For: 0.11.0, 0.10.1

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [hudi] xushiyan opened a new pull request #4448: [HUDI-2987] Fix payload event time extraction
xushiyan opened a new pull request #4448: URL: https://github.com/apache/hudi/pull/4448

- Pass `eventTime` in via the constructor for `DefaultHoodiePayload` and `OverwriteWithLatestAvroPayload`
- Move the extraction logic out of `getInsertValue()` and `combineAndGetUpdateValue()`

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
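The constructor-based approach described in the PR above might be sketched as below. This is a hedged illustration, not the PR's code: `EventTimeSketch` and `SketchPayload` are hypothetical names, the `String` record value is a simplification, and the method shapes do not mirror the signatures of the real Hudi payload classes.

```java
import java.util.Optional;

// Hypothetical sketch; SketchPayload is not a Hudi class.
class EventTimeSketch {

  static final class SketchPayload {
    private final String value;
    // Nullable: a record may carry no event time at all.
    private final Long eventTime;

    // Event time is supplied by the caller at construction time.
    SketchPayload(String value, Long eventTime) {
      this.value = value;
      this.eventTime = eventTime;
    }

    // Returns the record value only; event time is not derived here.
    String getInsertValue() {
      return value;
    }

    // Available whether or not getInsertValue() was ever called.
    Optional<Long> getEventTime() {
      return Optional.ofNullable(eventTime);
    }
  }

  public static void main(String[] args) {
    SketchPayload withTime = new SketchPayload("row-1", 1640000000000L);
    SketchPayload withoutTime = new SketchPayload("row-2", null);
    System.out.println(withTime.getEventTime());
    System.out.println(withoutTime.getEventTime());
  }
}
```

With the event time fixed at construction, reading it does not depend on which value-producing method a given write path happens to call. Whether that is the exact motivation behind HUDI-2987 is an inference from the issue title, not something the PR text states.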
[GitHub] [hudi] xushiyan commented on issue #4202: [SUPPORT] Support Apache Spark 3.2
xushiyan commented on issue #4202: URL: https://github.com/apache/hudi/issues/4202#issuecomment-1001244186

> Facing the same issue. When will this be fixed? Is it part of the next minor release of 0.10.1 ?

@maddy2u Please let me clarify: Spark 3.2 support is a feature to be added, not an issue. And no, 0.10.1 will be a bug fix release, so no new features should be added there. This is expected in the 0.11.0 major release.
[GitHub] [hudi] hudi-bot commented on pull request #4447: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
hudi-bot commented on pull request #4447: URL: https://github.com/apache/hudi/pull/4447#issuecomment-1001244089 ## CI report: * 925005eaa8cde214dd66c1c7f57cff06bbdc1fd0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4732)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4447: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
hudi-bot removed a comment on pull request #4447: URL: https://github.com/apache/hudi/pull/4447#issuecomment-1001238168 ## CI report: * 925005eaa8cde214dd66c1c7f57cff06bbdc1fd0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4732)
[jira] [Commented] (HUDI-2811) Support Spark 3.2 and Parquet 1.12.x
[ https://issues.apache.org/jira/browse/HUDI-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465449#comment-17465449 ]

Madhavan commented on HUDI-2811:

When are we targeting to have this fixed? 0.10.1?

> Support Spark 3.2 and Parquet 1.12.x
>
> Key: HUDI-2811
> URL: https://issues.apache.org/jira/browse/HUDI-2811
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Spark Integration
> Reporter: Raymond Xu
> Assignee: Yann Byron
> Priority: Blocker
> Labels: pull-request-available, sev:critical
> Fix For: 0.11.0
>
> Reported issues
> * [https://github.com/apache/hudi/issues/4001]
> * [https://github.com/apache/hudi/issues/3841]
> * [https://github.com/apache/hudi/issues/4202]
> * [https://github.com/apache/hudi/issues/3834]
[GitHub] [hudi] maddy2u commented on issue #4202: [SUPPORT] Support Apache Spark 3.2
maddy2u commented on issue #4202: URL: https://github.com/apache/hudi/issues/4202#issuecomment-1001239229

Facing the same issue. When will this be fixed? Is it part of the next minor release, 0.10.1?
[jira] [Comment Edited] (HUDI-1657) build failed on AArch64, Fedora 33
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465442#comment-17465442 ] Madhavan edited comment on HUDI-1657 at 12/26/21, 8:32 PM: --- Can you also share your JAVA_HOME path here by echoing it? I had the same problem and was stuck here. Fixing JAVA_HOME fixed the problem for me. This was in spite of java -version correctly showing the 1.8 version needed for the Hudi build. was (Author: maddy2u): Can you also share the JAVA_HOME path here by echoing the JAVA_HOME path. > build failed on AArch64, Fedora 33 > --- > > Key: HUDI-1657 > URL: https://issues.apache.org/jira/browse/HUDI-1657 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lutz Weischer >Priority: Major > Labels: sev:triage, user-support-issues > > [jw@cn05 hudi]$ mvn package -DskipTests > [INFO] Scanning for projects... > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-java-client:jar:0.8.0-SNAPSHOT > [WARNING] The expression ${parent.version} is deprecated. Please use > ${project.parent.version} instead. > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-spark-client:jar:0.8.0-SNAPSHOT > [WARNING] The expression ${parent.version} is deprecated. Please use > ${project.parent.version} instead. > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-flink-client:jar:0.8.0-SNAPSHOT > [WARNING] The expression ${parent.version} is deprecated. Please use > ${project.parent.version} instead. > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-spark_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. 
@ > org.apache.hudi:hudi-spark_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/hudi-spark-datasource/hudi-spark/pom.xml, line 26, > column 15 > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-spark2_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. @ > org.apache.hudi:hudi-spark2_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/hudi-spark-datasource/hudi-spark2/pom.xml, line 24, > column 15 > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-utilities_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. @ > org.apache.hudi:hudi-utilities_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/hudi-utilities/pom.xml, line 26, column 15 > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-spark-bundle_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. @ > org.apache.hudi:hudi-spark-bundle_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/packaging/hudi-spark-bundle/pom.xml, line 26, column 15 > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-utilities-bundle_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. @ > org.apache.hudi:hudi-utilities-bundle_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/packaging/hudi-utilities-bundle/pom.xml, line 26, column > 15 > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-flink_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. 
@ > org.apache.hudi:hudi-flink_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/hudi-flink/pom.xml, line 28, column 15 > [WARNING] > [WARNING] Some problems were encountered while building the effective model > for org.apache.hudi:hudi-flink-bundle_2.11:jar:0.8.0-SNAPSHOT > [WARNING] 'artifactId' contains an expression but should be a constant. @ > org.apache.hudi:hudi-flink-bundle_${scala.binary.version}:0.8.0-SNAPSHOT, > /home/jw/apache/hudi/packaging/hudi-flink-bundle/pom.xml, line 28, column 15 > [WARNING] > [WARNING] It is highly recommended to fix these problems because they > threaten the stability of your build. > [WARNING] > [WARNING] For this reason, future Maven versions might no longer support > building such malformed projects. > [WARNING] > [INFO] > > [INFO] Reactor Build Order: > [INFO] > [INFO] Hudi
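Madhavan's comment on HUDI-1657 describes a case where `java -version` reports 1.8 while `JAVA_HOME` names a different JDK; the `mvn` launcher script uses `JAVA_HOME` when that variable is set. A minimal sketch of that comparison in Java follows. The class and method names are hypothetical and not part of Hudi, nothing here is taken from the reporter's machine, and symbolic links are not resolved.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic, not part of Hudi: compares JAVA_HOME with the running JVM.
class JavaHomeCheck {

    // True when the running JVM's home directory lies inside the JAVA_HOME directory.
    // A JDK 8 runtime reports "<JAVA_HOME>/jre" as java.home, hence the prefix test.
    static boolean sameInstall(String javaHomeEnv, String runtimeHome) {
        if (javaHomeEnv == null || javaHomeEnv.isEmpty() || runtimeHome == null) {
            return false;
        }
        Path env = Paths.get(javaHomeEnv).normalize();
        Path runtime = Paths.get(runtimeHome).normalize();
        return runtime.startsWith(env);
    }

    // Java 8 reports its specification version as "1.8"; later releases report "9", "11", ...
    static boolean isJava8(String specVersion) {
        return "1.8".equals(specVersion);
    }

    public static void main(String[] args) {
        String javaHomeEnv = System.getenv("JAVA_HOME");
        String runtimeHome = System.getProperty("java.home");
        String specVersion = System.getProperty("java.specification.version");

        System.out.println("JAVA_HOME             = " + javaHomeEnv);
        System.out.println("java.home (running)   = " + runtimeHome);
        System.out.println("specification version = " + specVersion);
        System.out.println("running Java 8        : " + isJava8(specVersion));
        System.out.println("JAVA_HOME matches JVM : " + sameInstall(javaHomeEnv, runtimeHome));
    }
}
```

If the last line prints `false` while `java -version` shows 1.8, Maven may be building with a different JDK than the one on the `PATH`.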
[GitHub] [hudi] hudi-bot commented on pull request #4447: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
hudi-bot commented on pull request #4447:
URL: https://github.com/apache/hudi/pull/4447#issuecomment-1001238168

## CI report:

* 925005eaa8cde214dd66c1c7f57cff06bbdc1fd0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4732)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[jira] [Comment Edited] (HUDI-1657) build failed on AArch64, Fedora 33
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465442#comment-17465442 ]

Madhavan edited comment on HUDI-1657 at 12/26/21, 8:23 PM:
---

Can you also share the JAVA_HOME path here by echoing the JAVA_HOME path.

was (Author: maddy2u):
Can you also share this - Execute the following and share the results.
(1) /usr/libexec/java_home
(2) /usr/libexec/java_home -V

> build failed on AArch64, Fedora 33
> ---
>
> Key: HUDI-1657
> URL: https://issues.apache.org/jira/browse/HUDI-1657
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lutz Weischer
> Priority: Major
> Labels: sev:triage, user-support-issues
[jira] [Commented] (HUDI-1657) build failed on AArch64, Fedora 33
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465442#comment-17465442 ]

Madhavan commented on HUDI-1657:

Can you also share this - Execute the following and share the results.
(1) /usr/libexec/java_home
(2) /usr/libexec/java_home -V

> build failed on AArch64, Fedora 33
> ---
>
> Key: HUDI-1657
> URL: https://issues.apache.org/jira/browse/HUDI-1657
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lutz Weischer
> Priority: Major
> Labels: sev:triage, user-support-issues
>
> [jw@cn05 hudi]$ mvn package -DskipTests
> [INFO] Scanning for projects...
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-java-client:jar:0.8.0-SNAPSHOT
> [WARNING] The expression ${parent.version} is deprecated. Please use ${project.parent.version} instead.
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-spark-client:jar:0.8.0-SNAPSHOT
> [WARNING] The expression ${parent.version} is deprecated. Please use ${project.parent.version} instead.
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-flink-client:jar:0.8.0-SNAPSHOT
> [WARNING] The expression ${parent.version} is deprecated. Please use ${project.parent.version} instead.
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-spark_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-spark_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/hudi-spark-datasource/hudi-spark/pom.xml, line 26, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-spark2_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-spark2_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/hudi-spark-datasource/hudi-spark2/pom.xml, line 24, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-utilities_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-utilities_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/hudi-utilities/pom.xml, line 26, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-spark-bundle_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-spark-bundle_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/packaging/hudi-spark-bundle/pom.xml, line 26, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-utilities-bundle_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-utilities-bundle_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/packaging/hudi-utilities-bundle/pom.xml, line 26, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-flink_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-flink_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/hudi-flink/pom.xml, line 28, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective model for org.apache.hudi:hudi-flink-bundle_2.11:jar:0.8.0-SNAPSHOT
> [WARNING] 'artifactId' contains an expression but should be a constant. @ org.apache.hudi:hudi-flink-bundle_${scala.binary.version}:0.8.0-SNAPSHOT, /home/jw/apache/hudi/packaging/hudi-flink-bundle/pom.xml, line 28, column 15
> [WARNING]
> [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
> [WARNING]
> [WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
> [WARNING]
> [INFO]
>
> [INFO] Reactor Build Order:
> [INFO]
> [INFO] Hudi [pom]
> [INFO] hudi-common [jar]
> [INFO] hudi-timeline-service [jar]
> [INFO] hudi-client [pom]
[GitHub] [hudi] hudi-bot commented on pull request #4447: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
hudi-bot commented on pull request #4447:
URL: https://github.com/apache/hudi/pull/4447#issuecomment-1001237934

## CI report:

* 925005eaa8cde214dd66c1c7f57cff06bbdc1fd0 UNKNOWN
[GitHub] [hudi] manojpec commented on pull request #4067: [WIP][HUDI-2763] Metadata table records key deduplication
manojpec commented on pull request #4067:
URL: https://github.com/apache/hudi/pull/4067#issuecomment-1001237923

@vinothchandar @prashantwason Posted PR https://github.com/apache/hudi/pull/4447 based on the comments and discussion we had here. Please take a look.
[GitHub] [hudi] manojpec opened a new pull request #4447: [HUDI-2763] Metadata table records - support for key deduplication and virtual keys
manojpec opened a new pull request #4447:
URL: https://github.com/apache/hudi/pull/4447

## What is the purpose of the pull request

- The backing log format for the metadata table is HFile, a KeyValue type. Since the key field in the metadata record payload is a duplicate of the Key in the Cell, the redundant key field in the record can be emptied to save on storage cost.
- HoodieHFileWriter and HoodieHFileDataBlock will now serialize records with the key field emptied by default. The HFile writer relies on its callers to identify the key field in the schema.
- When reading the serialized records back from disk, HoodieHFileReader materializes the missing key fields, if any. It relies on its callers to identify the key field in the record schema.
- WriteHandles and all their derived classes rely on the table properties for the key field when constructing the file and log readers. This way base file creation, append and merging all work seamlessly irrespective of data or metadata table.

## Verify this pull request

- Tests have been added to verify the default virtual keys and key deduplication support for the metadata table records.
- Manually verified that the serialized records on disk have the key field trimmed.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
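The key deduplication idea in the description of PR #4447 can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the Hudi implementation: `KeyDedupSketch` and its methods are hypothetical names, a `TreeMap` stands in for the sorted key-value file, and records are plain string maps. As in the description, the caller tells the store which field holds the key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of key deduplication; none of these names are Hudi classes.
class KeyDedupSketch {
    // Stand-in for a sorted key-value file: cell key -> stored record payload.
    private final TreeMap<String, Map<String, String>> cells = new TreeMap<>();
    // Name of the record field that duplicates the cell key; supplied by the caller.
    private final String keyField;

    KeyDedupSketch(String keyField) {
        this.keyField = keyField;
    }

    // Write path: keep the key only as the cell key and empty it inside the payload.
    void write(Map<String, String> record) {
        String key = record.get(keyField);
        if (key == null) {
            throw new IllegalArgumentException("record has no value for field " + keyField);
        }
        Map<String, String> stored = new HashMap<>(record);
        stored.put(keyField, "");
        cells.put(key, stored);
    }

    // What is physically stored for a key (the key field is empty here).
    Map<String, String> storedPayload(String key) {
        return cells.get(key);
    }

    // Read path: copy the stored payload and restore the key field from the cell key.
    Map<String, String> read(String key) {
        Map<String, String> stored = cells.get(key);
        if (stored == null) {
            return null;
        }
        Map<String, String> record = new HashMap<>(stored);
        record.put(keyField, key);
        return record;
    }

    public static void main(String[] args) {
        KeyDedupSketch file = new KeyDedupSketch("key");
        Map<String, String> record = new HashMap<>();
        record.put("key", "2021/12/26");
        record.put("files", "a.parquet,b.parquet");
        file.write(record);
        System.out.println("stored: " + file.storedPayload("2021/12/26"));
        System.out.println("read:   " + file.read("2021/12/26"));
    }
}
```

The stored payload carries an empty key field, while `read` returns a record equal to the one that was written, which mirrors the writer and reader behaviour the description outlines.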
[GitHub] [hudi] xushiyan edited a comment on issue #4429: [SUPPORT] Spark SQL hudi statements are not working after hudi 0.10.0 version upgrade
xushiyan edited a comment on issue #4429:
URL: https://github.com/apache/hudi/issues/4429#issuecomment-1001117485

@vingov Spark 3.2 is not supported with Hudi 0.10. I put a support matrix here: https://hudi.apache.org/docs/next/quick-start-guide#setup

This shouldn't be a problem when the right version is used. Note that some changes are highlighted here too: https://hudi.apache.org/docs/quick-start-guide/#create-table
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001210539

## CI report:

* 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
* 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
* 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
* daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
* 082742e8794ec236f63d45ba5780305045babefb UNKNOWN
* f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN
* 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731)
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001198445

## CI report:

* 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
* 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
* 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
* daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
* 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
* 082742e8794ec236f63d45ba5780305045babefb UNKNOWN
* f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN
* 2543c009c235d9019068a958e3c6fdf6c0758648 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4731)
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001198067

## CI report:

* 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
* 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
* 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
* daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
* 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
* 082742e8794ec236f63d45ba5780305045babefb UNKNOWN
* f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN
* 2543c009c235d9019068a958e3c6fdf6c0758648 UNKNOWN
[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot removed a comment on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001135883

## CI report:

* 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
* 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
* 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
* daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
* 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
* 082742e8794ec236f63d45ba5780305045babefb UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001192608

## CI report:

* 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN
* 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN
* 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN
* daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN
* 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
* 082742e8794ec236f63d45ba5780305045babefb UNKNOWN
* f984f3a9e4f4b7cde1371c9f03e77e3fffd622ed UNKNOWN
[GitHub] [hudi] cdmikechen edited a comment on issue #3429: [SUPPORT] Upserting timestamp with microseconds precision truncate the microseconds part
cdmikechen edited a comment on issue #3429:
URL: https://github.com/apache/hudi/issues/3429#issuecomment-1001179764

@nsivabalan You can replace `Timestamp.valueOf("2015-01-01T13:51:39.345397Z")` with `Timestamp.valueOf("2015-01-01 13:51:39.345397")`.

The problem may be here: https://github.com/apache/hudi/blob/c81df99e50f2df84d85f08ff3a839595dad974d7/hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/AvroConversionHelper.scala#L123-L139

I think we may need to add a new configuration to support this feature (microsecond precision).
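cdmikechen's suggestion follows from how `java.sql.Timestamp.valueOf` parses its argument: it accepts the JDBC escape format `yyyy-[m]m-[d]d hh:mm:ss[.f...]` and throws `IllegalArgumentException` for an ISO-8601 string containing `T` and `Z`. The sketch below shows both forms. The class and method names are hypothetical, and `Timestamp.from(Instant.parse(...))` is one alternative for ISO input, not something proposed in the issue. The `getTime()` call illustrates that an epoch-millisecond value cannot carry the microsecond digits; whether that is what happens in the linked `AvroConversionHelper` lines is not confirmed here.

```java
import java.sql.Timestamp;
import java.time.Instant;

// Illustration only; not Hudi code.
class TimestampParsing {

    // JDBC escape format "yyyy-[m]m-[d]d hh:mm:ss[.f...]", read in the JVM's default time zone.
    static Timestamp fromJdbcString(String text) {
        return Timestamp.valueOf(text);
    }

    // ISO-8601 instant such as "2015-01-01T13:51:39.345397Z" (always UTC).
    static Timestamp fromIsoInstant(String text) {
        return Timestamp.from(Instant.parse(text));
    }

    // Reports whether Timestamp.valueOf accepts the given text.
    static boolean valueOfAccepts(String text) {
        try {
            Timestamp.valueOf(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Timestamp jdbc = fromJdbcString("2015-01-01 13:51:39.345397");
        Timestamp iso = fromIsoInstant("2015-01-01T13:51:39.345397Z");

        // Both keep the full fractional second as nanoseconds.
        System.out.println("nanos (JDBC string): " + jdbc.getNanos());
        System.out.println("nanos (ISO instant): " + iso.getNanos());

        // getTime() is epoch milliseconds, so only the millisecond part survives.
        System.out.println("millisecond part:    " + (jdbc.getTime() % 1000));

        System.out.println("valueOf accepts ISO: " + valueOfAccepts("2015-01-01T13:51:39.345397Z"));
    }
}
```

Note that the two parsers interpret the wall-clock time differently: `valueOf` uses the JVM's default time zone, while the trailing `Z` in the ISO form fixes the instant to UTC.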
[GitHub] [hudi] cdmikechen commented on issue #3429: [SUPPORT] Upserting timestamp with microseconds precision truncate the microseconds part
cdmikechen commented on issue #3429:
URL: https://github.com/apache/hudi/issues/3429#issuecomment-1001179764

@nsivabalan You can replace `Timestamp.valueOf("2015-01-01T13:51:39.345397Z")` with `Timestamp.valueOf("2015-01-01 13:51:39.345397")`.
[GitHub] [hudi] hudi-bot commented on pull request #4446: [HUDI-2917] rollback insert data appended to log file when using Hbase Index
hudi-bot commented on pull request #4446:
URL: https://github.com/apache/hudi/pull/4446#issuecomment-1001176364

## CI report:

* e849105e74b6fc7ac3a9f1f78f8f32d2a66345c4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4729) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4730)
[GitHub] [hudi] guanziyue commented on pull request #4446: [HUDI-2917] rollback insert data appended to log file when using Hbase Index
guanziyue commented on pull request #4446:
URL: https://github.com/apache/hudi/pull/4446#issuecomment-1001160811

@hudi-bot run azure
[GitHub] [hudi] hudi-bot removed a comment on pull request #4446: [HUDI-2917] rollback insert data appended to log file when using Hbase Index
hudi-bot removed a comment on pull request #4446:
URL: https://github.com/apache/hudi/pull/4446#issuecomment-1001124541

## CI report:

* e849105e74b6fc7ac3a9f1f78f8f32d2a66345c4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4729)
[GitHub] [hudi] hudi-bot commented on pull request #4446: [HUDI-2917] rollback insert data appended to log file when using Hbase Index
hudi-bot commented on pull request #4446:
URL: https://github.com/apache/hudi/pull/4446#issuecomment-1001160824

## CI report:

* e849105e74b6fc7ac3a9f1f78f8f32d2a66345c4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4729) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4730)
[GitHub] [hudi] leesf commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
leesf commented on pull request #4350:
URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001138558

> @leesf before looking at the code deeply, Can you please clarify what breaking changes we plan in this approach? Is the current "hudi " source codepath renamed to something else? So existing users need to update "All" their jobs to use the new source?

@vinothchandar There are no breaking changes with this approach; users do not need to change their format. I moved some classes from `hudi-spark-datasource/hudi-spark` into the `hudi-spark-datasource/hudi-spark-common` module so that they can be reused. The `hudi` format that was previously under the `spark-datasource/hudi-spark` module has moved to `spark-datasource/hudi-spark-common` to reuse the `DefaultSource` code. I also introduced a `hudi` format located in both `hudi-spark-datasource/hudi-spark2` and `hudi-spark-datasource/hudi-spark3`, which means that no matter which Spark version users run, they still use the `hudi` format, since the `hudi-spark-datasource/hudi-spark` module depends on the `hudi-spark-datasource/hudi-spark2` or `hudi-spark-datasource/hudi-spark3` module.
[GitHub] [hudi] hudi-bot commented on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot commented on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1001135883 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708) * 082742e8794ec236f63d45ba5780305045babefb UNKNOWN
[GitHub] [hudi] hudi-bot removed a comment on pull request #4350: [HUDI-3047] Basic Implementation of Spark Datasource V2
hudi-bot removed a comment on pull request #4350: URL: https://github.com/apache/hudi/pull/4350#issuecomment-1000410909 ## CI report: * 5f2bceb6f745b359ba7b5691ef1f2ab02eddde06 UNKNOWN * 3855884f4791a45fa3a973e1e540e6988e863223 UNKNOWN * 78e8080c9d530e1e54799afbef69edb67394bb29 UNKNOWN * daaabf8b5843585fa2cc4a4414ae287a8cd36dae UNKNOWN * 66d2d16028d5982e9e863d8c2fe5b1dc7ca45a5c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4708)
[GitHub] [hudi] hudi-bot commented on pull request #4446: [HUDI-2917] rollback insert data appended to log file when using Hbase Index
hudi-bot commented on pull request #4446: URL: https://github.com/apache/hudi/pull/4446#issuecomment-1001124541 ## CI report: * e849105e74b6fc7ac3a9f1f78f8f32d2a66345c4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4729)
[GitHub] [hudi] hudi-bot removed a comment on pull request #4446: [HUDI-2917] rollback insert data appended to log file when using Hbase Index
hudi-bot removed a comment on pull request #4446: URL: https://github.com/apache/hudi/pull/4446#issuecomment-1001116652 ## CI report: * fb42977822edf73e31efcd94c1e14e1b9c7a9f22 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4728) * e849105e74b6fc7ac3a9f1f78f8f32d2a66345c4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4729)