[hudi] branch master updated: [HUDI-6515] Fix Spark2 do not support bucket bulk insert (#9163)

2023-07-10 Thread leesf
This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 0258a89112a [HUDI-6515] Fix Spark2 do not support bucket bulk insert (#9163)
0258a89112a is described below

commit 0258a89112a6071a8074757236e19a7b27539dbd
Author: StreamingFlames <18889897...@163.com>
AuthorDate: Tue Jul 11 13:45:35 2023 +0800

[HUDI-6515] Fix Spark2 do not support bucket bulk insert (#9163)
---
 .../apache/hudi/internal/HoodieBulkInsertDataInternalWriter.java | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hudi-spark-datasource/hudi-spark2/src/main/java/org/apache/hudi/internal/HoodieBulkInsertDataInternalWriter.java b/hudi-spark-datasource/hudi-spark2/src/main/java/org/apache/hudi/internal/HoodieBulkInsertDataInternalWriter.java
index 9f878dfa572..666ca3ec989 100644
--- a/hudi-spark-datasource/hudi-spark2/src/main/java/org/apache/hudi/internal/HoodieBulkInsertDataInternalWriter.java
+++ b/hudi-spark-datasource/hudi-spark2/src/main/java/org/apache/hudi/internal/HoodieBulkInsertDataInternalWriter.java
@@ -19,7 +19,9 @@
 package org.apache.hudi.internal;
 
 import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
 import org.apache.hudi.table.HoodieTable;
+import org.apache.hudi.table.action.commit.BucketBulkInsertDataInternalWriterHelper;
 import org.apache.hudi.table.action.commit.BulkInsertDataInternalWriterHelper;
 
 import org.apache.spark.sql.catalyst.InternalRow;
@@ -39,8 +41,11 @@ public class HoodieBulkInsertDataInternalWriter implements DataWriter

[GitHub] [hudi] leesf merged pull request #9163: [MINOR] Fix spark2.4 do not support bucket index row writer

2023-07-10 Thread via GitHub


leesf merged PR #9163:
URL: https://github.com/apache/hudi/pull/9163


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] leesf commented on pull request #9163: [MINOR] Fix spark2.4 do not support bucket index row writer

2023-07-10 Thread via GitHub


leesf commented on PR #9163:
URL: https://github.com/apache/hudi/pull/9163#issuecomment-1630177494

   The second failing case passed in the first CI run. Let's merge it to unblock other PRs.


[GitHub] [hudi] voonhous commented on issue #9162: [SUPPORT] Permission denied access ckp_meta while start flink job

2023-07-10 Thread via GitHub


voonhous commented on issue #9162:
URL: https://github.com/apache/hudi/issues/9162#issuecomment-1630159382

   Hmmm, is it possible to change the permission of `/path_to_hudi_table/.hoodie/.aux/ckp_meta` to `drwx------`?
   
   ```shell
   hdfs dfs -chmod -R 700 /path_to_hudi_table/.hoodie/.aux/ckp_meta
   ```
   
   You might need to check with your Hadoop admin why the folder was created without execute permissions.
   
   When a folder is created by a user, the default permissions should be: `775` or `drwxrwxr-x`.
   
   Not sure how it got modified to `600` or `drw-------` in your environment.
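
   The relationship between the octal modes and the symbolic strings mentioned above can be checked with a short Python sketch. It uses only the standard `stat` module; the helper names `symbolic` and `owner_can_traverse` are illustrative and are not part of Hudi or Hadoop, and the check ignores ACLs and superuser rights.

   ```python
   import stat


   def symbolic(mode: int) -> str:
       """Render an octal permission mode for a directory the way `ls -l` would."""
       return stat.filemode(stat.S_IFDIR | mode)


   def owner_can_traverse(mode: int) -> bool:
       """The owner can only enter a directory when the owner execute bit is set."""
       return bool(mode & stat.S_IXUSR)


   for mode in (0o700, 0o775, 0o600):
       print(oct(mode), symbolic(mode), "owner can traverse:", owner_can_traverse(mode))
   ```

   A mode of `600` leaves the owner with read and write but no execute bit, which is why even the owning user cannot access files under such a directory.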


[GitHub] [hudi] Zouxxyy commented on pull request #9164: [HUDI-6516] Correct the use of hoodie.bootstrap.mode.selector

2023-07-10 Thread via GitHub


Zouxxyy commented on PR #9164:
URL: https://github.com/apache/hudi/pull/9164#issuecomment-1630137355

   The CI failure is caused by something else. Can you help with a review? @codope


[GitHub] [hudi] hudi-bot commented on pull request #9168: [HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9168:
URL: https://github.com/apache/hudi/pull/9168#issuecomment-1630128240

   
   ## CI report:
   
   * 4dff8041f9695a645f35d846006610be6ec85963 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18476) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18474)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


[GitHub] [hudi] hudi-bot commented on pull request #9163: [MINOR] Fix spark2.4 do not support bucket index row writer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9163:
URL: https://github.com/apache/hudi/pull/9163#issuecomment-1630128022

   
   ## CI report:
   
   * 784dbe3012ec4312a4877b389994081272ad6506 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18450) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18475)
   
   
[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1630126159

   
   ## CI report:
   
   * 4dff8041f9695a645f35d846006610be6ec85963 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18476) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18474)
   
   
[GitHub] [hudi] chandu-1101 commented on issue #9141: [SUPPORT] Example from Hudi Quick start doesnt work!

2023-07-10 Thread via GitHub


chandu-1101 commented on issue #9141:
URL: https://github.com/apache/hudi/issues/9141#issuecomment-1630119428

   @ad1happy2go thank you for the reply. I will check these and get back.


[GitHub] [hudi] SteNicholas commented on a diff in pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


SteNicholas commented on code in PR #9160:
URL: https://github.com/apache/hudi/pull/9160#discussion_r1259146107


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestBase.java:
##
@@ -770,6 +770,7 @@ protected void updateBatchWithoutCommit(String newCommitTime, List
 HoodieWriteConfig hoodieWriteConfig = getConfigBuilder(HoodieFailedWritesCleaningPolicy.LAZY)
 .withAutoCommit(false) // disable auto commit
 .withRollbackUsingMarkers(true)
+.withHeartbeatTolerableMisses(0)
 .build();

Review Comment:
   @danny0405, this accelerates the expiration of the heartbeat for the deltacommit, so that `TestSavepointRestoreMergeOnRead#testCleaningCompletedRollback` does not have to wait for the heartbeat to expire in [line 308](https://github.com/apache/hudi/blob/master/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestSavepointRestoreMergeOnRead.java#L308).
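
   The effect of `withHeartbeatTolerableMisses(0)` described above can be illustrated with a simplified model. The sketch assumes that a heartbeat counts as expired once the time since the last beat exceeds the heartbeat interval multiplied by the number of tolerable misses. The function name, its parameters, and the formula are illustrative assumptions, not Hudi's actual implementation.

   ```python
   def is_heartbeat_expired(last_heartbeat_ms: int, now_ms: int,
                            interval_ms: int, tolerable_misses: int) -> bool:
       """Simplified expiry check (assumed formula, for illustration only)."""
       max_allowed_gap_ms = interval_ms * tolerable_misses
       return now_ms - last_heartbeat_ms > max_allowed_gap_ms


   # With zero tolerable misses, any elapsed time makes the heartbeat expired,
   # so a test does not need to sleep through several heartbeat intervals.
   print(is_heartbeat_expired(1_000, 1_001, interval_ms=60_000, tolerable_misses=0))
   # With two tolerable misses, the same heartbeat is still considered alive.
   print(is_heartbeat_expired(1_000, 1_001, interval_ms=60_000, tolerable_misses=2))
   ```

   Under this model the test setting trades realism for speed: the lazy cleaning policy can treat the uncommitted write as failed immediately.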



[GitHub] [hudi] nylqd commented on issue #9162: [SUPPORT] Permission denied access ckp_meta while start flink job

2023-07-10 Thread via GitHub


nylqd commented on issue #9162:
URL: https://github.com/apache/hudi/issues/9162#issuecomment-1630060302

   > Are you restarting the job with the hadoop user that you initially created 
the table with? If not, possible to ensure that the job is started with the 
same hadoop user as the one that you created the table with?
   
   We followed these steps:
   
   1. Submit the Flink job as Hadoop user A
   2. Cancel the job with a savepoint via the REST API
   3. Start the job from the savepoint taken in step 2 (same Hadoop user A)
   
   Then we hit a permission problem: `AccessControlException: Permission denied: user=A, access=ALL,inode="/path_to_hudi_table/.hoodie/.aux/ckp_meta":A:hadoop:drw-------`
   


[GitHub] [hudi] luongngochoa commented on issue #9132: [SUPPORT] hudi deltastreamer jsonkafka source schema registry fail

2023-07-10 Thread via GitHub


luongngochoa commented on issue #9132:
URL: https://github.com/apache/hudi/issues/9132#issuecomment-1630043791

   @danny0405 I tried this configuration, following [this PR](https://github.com/apache/hudi/pull/7727), but it still does not work:
   
   `hoodie.deltastreamer.schemaprovider.registry.schemaconverter=org.apache.hudi.utilities.schema.converter.JsonToAvroSchemaConverter`
   
   I then tested another setup, which worked: I changed the source class to AvroKafkaSource, defined its schema in Avro, and produced messages to the topic using AvroSerializer with that Avro schema.
   
   One problem remains. If I use JsonKafkaSource as the source class and define its schema with the Avro schema type 'records', the schema can be parsed, but deserialization fails while decoding because the data contains some UTF-8 characters. Even after I added this additional config, I still get an error:
   
   `hoodie.deltastreamer.source.kafka.value.deserializer.class=io.confluent.kafka.serializers.KafkaAvroDeserializer` or `KafkaSchemaAvroDeserializer`
   
   These are the logs:
   ```
   23/07/06 12:53:47 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks resource profile 0
   23/07/06 12:53:47 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (10.233.125.104, executor 1, partition 0, PROCESS_LOCAL, 4391 bytes) taskResourceAssignments Map()
   23/07/06 12:53:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.233.125.104:45235 (size: 4.3 KiB, free: 110.0 MiB)
   23/07/06 12:53:53 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (10.233.125.104 executor 1): org.apache.hudi.exception.HoodieIOException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens
    at [Source: (String)"\00\00\00{"type": 1, "name": "\u00d4 T\u00d4 TR\u01af", "address": "L\u00f4", "brand": "PEUGEOT", "filename": "_E0102376505_"[truncated 11 chars]; line: 1, column: 2]
       at org.apache.hudi.avro.MercifulJsonConverter.convert(MercifulJsonConverter.java:96)
       at org.apache.hudi.utilities.sources.helpers.AvroConvertor.fromJson(AvroConvertor.java:87)
       at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
       at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
       at scala.collection.Iterator$SliceIterator.next(Iterator.scala:273)
       at scala.collection.Iterator.foreach(Iterator.scala:943)
       at scala.collection.Iterator.foreach$(Iterator.scala:943)
       at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
       at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
       at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
       at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
       at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
       at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
       at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
       at scala.collection.AbstractIterator.to(Iterator.scala:1431)
       at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
       at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
       at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1431)
       at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
       at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
       at scala.collection.AbstractIterator.toArray(Iterator.scala:1431)
       at org.apache.spark.rdd.RDD.$anonfun$take$2(RDD.scala:1449)
       at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)
       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
       at org.apache.spark.scheduler.Task.run(Task.scala:131)
       at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
       at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       at java.lang.Thread.run(Thread.java:750)
   Caused by: com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens
    at [Source: (String)"\00\00\00{"type": 1, line: 1, column: 2]
       at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2337)
       at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:710)
       at com.fasterxml.jackson.core.base.ParserMinimalBase._throwInvalidSpace(ParserMinimalBase.java:688)
   ```

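   The NUL characters at the start of the logged source string suggest, though the thread does not confirm it, that the Kafka value carries the Confluent Schema Registry wire-format header ahead of the JSON text: one zero "magic" byte followed by a 4-byte big-endian schema ID. A JSON parser fed such bytes directly would fail on the first character, as in the log above. The Python sketch below only illustrates that framing; `split_confluent_frame` and the sample record are hypothetical and are not Hudi or Confluent APIs.

   ```python
   import json
   import struct

   MAGIC_BYTE = 0  # first byte of the Confluent wire format


   def split_confluent_frame(raw: bytes):
       """Return (schema_id, payload); schema_id is None when no header is present."""
       if len(raw) >= 5 and raw[0] == MAGIC_BYTE:
           (schema_id,) = struct.unpack(">I", raw[1:5])
           return schema_id, raw[5:]
       return None, raw


   # Hypothetical message: header for schema ID 1 followed by a JSON document.
   body = json.dumps({"type": 1, "brand": "PEUGEOT"}).encode("utf-8")
   framed = b"\x00" + struct.pack(">I", 1) + body

   schema_id, payload = split_confluent_frame(framed)
   record = json.loads(payload.decode("utf-8"))
   print(schema_id, record)
   ```

   If this assumption holds, the options are to consume the topic with a deserializer that understands the framing or to produce plain JSON without it; which of these applies to this pipeline is not established in the thread.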
[GitHub] [hudi] big-doudou commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


big-doudou commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1630036429

   My TM is running on k8s, and the k8s exception caused TM to exit. The error 
seems to have occurred after this. I will sort out the details and post them 
here


[GitHub] [hudi] hudi-bot commented on pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9131:
URL: https://github.com/apache/hudi/pull/9131#issuecomment-1630035617

   
   ## CI report:
   
   * 0fa3be40442ad7e9570a5f7b3f26a75618f0447e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18469)
   * e0622a6d0f2f7294ed079dd42cf5ff65a8718da3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18478)
   
   
[GitHub] [hudi] pftn commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


pftn commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1630034977

   > @pftn I mean @big-doudou's logs.
   > 
   > Is he running the same version as you? Also, if this error was thrown 
recently on his pipeline, is it possible for him to share his JM + TM logs with 
me privately to assist in reproducing this issue locally?
   > 
   > Thanks.
   
   I don't have his contact information.


[GitHub] [hudi] hudi-bot commented on pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9131:
URL: https://github.com/apache/hudi/pull/9131#issuecomment-1630030137

   
   ## CI report:
   
   * 75157cf41f09e386d93f2e154553e9190731095c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18458)
   * 0fa3be40442ad7e9570a5f7b3f26a75618f0447e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18469)
   * e0622a6d0f2f7294ed079dd42cf5ff65a8718da3 UNKNOWN
   
   
[GitHub] [hudi] danny0405 commented on a diff in pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


danny0405 commented on code in PR #9160:
URL: https://github.com/apache/hudi/pull/9160#discussion_r1259104247


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestBase.java:
##
@@ -770,6 +770,7 @@ protected void updateBatchWithoutCommit(String newCommitTime, List
 HoodieWriteConfig hoodieWriteConfig = getConfigBuilder(HoodieFailedWritesCleaningPolicy.LAZY)
 .withAutoCommit(false) // disable auto commit
 .withRollbackUsingMarkers(true)
+.withHeartbeatTolerableMisses(0)
 .build();

Review Comment:
   What is the purpose of this change?



[GitHub] [hudi] voonhous commented on issue #9162: [SUPPORT] Permission denied access ckp_meta while start flink job

2023-07-10 Thread via GitHub


voonhous commented on issue #9162:
URL: https://github.com/apache/hudi/issues/9162#issuecomment-1630006589

   Are you restarting the job with the hadoop user that you initially created 
the table with?


[GitHub] [hudi] voonhous commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


voonhous commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1630005948

   @pftn I mean @big-doudou's logs.
   
   Is he running the same version as you? Also, if this error was thrown 
recently on his pipeline, is it possible for him to share his JM + TM logs with 
me privately to assist in reproducing this issue locally? 
   
   Thanks.


[GitHub] [hudi] pftn commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


pftn commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-163795

   > @big-doudou
   > 
   > > * Replace partition files using the repairedOutputPath in step 2
   > 
   > Can you please share your Hudi-version + stack trace, thanks.
   
   Listed in the 1st comment


[GitHub] [hudi] hudi-bot commented on pull request #9164: [HUDI-6516] Correct the use of hoodie.bootstrap.mode.selector

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9164:
URL: https://github.com/apache/hudi/pull/9164#issuecomment-1629997571

   
   ## CI report:
   
   * 5861f76644649791a3b050c8fb0baa12d0c4135a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18454)
   * 57a349c17f63c05f982f1309f2c52fca3f30d096 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18477)
   
   
[GitHub] [hudi] voonhous commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


voonhous commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1629996333

   @big-doudou 
   
   > * Replace partition files using the repairedOutputPath in step 2
   
   Can you please share your Hudi-version + stack trace, thanks.
   
   


[GitHub] [hudi] pftn commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


pftn commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1629995280

   > Is this problem solved? I also got the same error
   
   Fix:
   1. Stop the flink job
   2. Run this command in hudi-cli: `repair deduplicate --duplicatedPartitionPath 20220604 --repairedOutputPath hdfs:///hudi/tmp.db/table_repair/20220604/ --dedupeType upsert_type --sparkMaster local`
   3. Replace partition files using the repairedOutputPath in step 2
   4. Start the flink job


[GitHub] [hudi] hudi-bot commented on pull request #9164: [HUDI-6516] Correct the use of hoodie.bootstrap.mode.selector

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9164:
URL: https://github.com/apache/hudi/pull/9164#issuecomment-1629991794

   
   ## CI report:
   
   * 5861f76644649791a3b050c8fb0baa12d0c4135a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18454)
   * 57a349c17f63c05f982f1309f2c52fca3f30d096 UNKNOWN
   
   
[GitHub] [hudi] bvaradar commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-07-10 Thread via GitHub


bvaradar commented on PR #8452:
URL: https://github.com/apache/hudi/pull/8452#issuecomment-1629985656

   @boneanxs: Can you kindly rebase the PR and fix the conflicts?


[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-07-10 Thread via GitHub


bvaradar commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1259089188


##
hudi-common/src/main/java/org/apache/hudi/metadata/AbstractHoodieTableMetadata.java:
##
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.metadata;
+
+import org.apache.hudi.common.config.SerializableConfiguration;
+import org.apache.hudi.common.engine.HoodieEngineContext;
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.table.HoodieTableConfig;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.util.PartitionPathEncodeUtils;
+import org.apache.hudi.common.util.StringUtils;
+import org.apache.hudi.exception.TableNotFoundException;
+import org.apache.hudi.expression.ArrayData;
+import org.apache.hudi.hadoop.CachingPath;
+import org.apache.hudi.hadoop.SerializablePath;
+import org.apache.hudi.internal.schema.Types;
+import org.apache.hudi.internal.schema.utils.Conversions;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+public abstract class AbstractHoodieTableMetadata implements HoodieTableMetadata {
+
+  protected final transient HoodieEngineContext engineContext;
+
+  protected final SerializableConfiguration hadoopConf;
+  protected final SerializablePath dataBasePath;
+
+  protected final boolean hiveStylePartitioningEnabled;
+  protected final boolean urlEncodePartitioningEnabled;
+
+  // TODO get this from HoodieConfig
+  protected final boolean caseSensitive = false;
+
+  public AbstractHoodieTableMetadata(HoodieEngineContext engineContext, SerializableConfiguration conf, String dataBasePath) {
+    this.engineContext = engineContext;
+    this.hadoopConf = conf;
+    this.dataBasePath = new SerializablePath(new CachingPath(dataBasePath));
+    FileSystem fs = FSUtils.getFs(dataBasePath, conf.get());
+    Path metaPath = new Path(dataBasePath, HoodieTableMetaClient.METAFOLDER_NAME);
+    TableNotFoundException.checkTableValidity(fs, this.dataBasePath.get(), metaPath);
+    HoodieTableConfig tableConfig = new HoodieTableConfig(fs, metaPath.toString(), null, null);

Review Comment:
   Hmm, we are not creating HoodieTableMetaClient so many times that this overhead would be the reason, right?



[GitHub] [hudi] big-doudou commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-07-10 Thread via GitHub


big-doudou commented on issue #8892:
URL: https://github.com/apache/hudi/issues/8892#issuecomment-1629982551

   Is this problem solved? I also got the same error


[GitHub] [hudi] sunneebaby commented on issue #9073: Spark-3.2 Insert Into Hudi Table UnsupportedOperationException: S3A streams are not Syncable

2023-07-10 Thread via GitHub


sunneebaby commented on issue #9073:
URL: https://github.com/apache/hudi/issues/9073#issuecomment-1629982525

   > @sunneebaby Did you got a chance to try it? Did it worked?
   
   Yes, I tried it, but it didn't work either. The insert failed.





[GitHub] [hudi] hudi-bot commented on pull request #9145: [HUDI-6464] [WIP] Codreview changes for Spark SQL Merge Into for pkless tables'

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9145:
URL: https://github.com/apache/hudi/pull/9145#issuecomment-1629947745

   
   ## CI report:
   
   * 10f9adc8c62d6fae7219a4472cba010a1c1c0da0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18472)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] leesf commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


leesf commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629944171

   LGTM, let's wait for https://github.com/apache/hudi/pull/9163 to be 
merged to fix CI.





[GitHub] [hudi] hudi-bot commented on pull request #9167: [HUDI-6519] The default value of read.streaming.enabled is determined by execution.runtime-mode

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9167:
URL: https://github.com/apache/hudi/pull/9167#issuecomment-1629942478

   
   ## CI report:
   
   * 7a8ef331fc7f4e4862d7b0f0cd6e8c35123e6c8a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18467)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9136: [HUDI-6509] Add GitHub CI for Java 17

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9136:
URL: https://github.com/apache/hudi/pull/9136#issuecomment-1629936547

   
   ## CI report:
   
   * a0e7207fb19738237d56fa0060c91cb7865ae9c0 UNKNOWN
   * cb101756f1bb906839b8f135b618f26205e022a9 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18462)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9080: [HUDI-6445] Making some of Spark DS tests as functional

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9080:
URL: https://github.com/apache/hudi/pull/9080#issuecomment-1629936364

   
   ## CI report:
   
   * d28ff949a1dd43456fda75e5624848bb63e030f4 UNKNOWN
   * 61f56fd22ac70d3ede05692efc86b201bc86326a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18466)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9168: [HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9168:
URL: https://github.com/apache/hudi/pull/9168#issuecomment-1629905445

   
   ## CI report:
   
   * 7f0cf39c716e969938e110e4de9b31e648eec83e Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18471)
 
   * 4dff8041f9695a645f35d846006610be6ec85963 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18476)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629904891

   
   ## CI report:
   
   * 7f0cf39c716e969938e110e4de9b31e648eec83e Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18471)
 
   * 4dff8041f9695a645f35d846006610be6ec85963 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18476)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9163: [MINOR] Fix spark2.4 do not support bucket index row writer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9163:
URL: https://github.com/apache/hudi/pull/9163#issuecomment-1629899255

   
   ## CI report:
   
   * 784dbe3012ec4312a4877b389994081272ad6506 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18450)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18475)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9168: [HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9168:
URL: https://github.com/apache/hudi/pull/9168#issuecomment-1629899322

   
   ## CI report:
   
   * 7f0cf39c716e969938e110e4de9b31e648eec83e Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18471)
 
   * 4dff8041f9695a645f35d846006610be6ec85963 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629898651

   
   ## CI report:
   
   * dac4ff91b1c79911ffd9bd787e8f772a716b241a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18464)
 
   * 7f0cf39c716e969938e110e4de9b31e648eec83e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18471)
 
   * 4dff8041f9695a645f35d846006610be6ec85963 UNKNOWN
   
   



[GitHub] [hudi] stream2000 commented on pull request #9163: [MINOR] Fix spark2.4 do not support bucket index row writer

2023-07-10 Thread via GitHub


stream2000 commented on PR #9163:
URL: https://github.com/apache/hudi/pull/9163#issuecomment-1629898784

   @hudi-bot run azure





[GitHub] [hudi] hudi-bot commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629892081

   
   ## CI report:
   
   * b2a0e7e0a2539122bc9178f0b2e2283a175c9de8 UNKNOWN
   * 6358410f19457f5fbe23b209b0b1b59faee0ade4 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18465)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9145: [HUDI-6464] [WIP] Codreview changes for Spark SQL Merge Into for pkless tables'

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9145:
URL: https://github.com/apache/hudi/pull/9145#issuecomment-1629891978

   
   ## CI report:
   
   * d43ca83278ca3c0222ac3635ca7de0b9db6d0f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18413)
 
   * 10f9adc8c62d6fae7219a4472cba010a1c1c0da0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18472)
 
   
   



[GitHub] [hudi] yihua merged pull request #9152: [HUDI-6112] Basic and All configs pages generated from Config doc generation tool

2023-07-10 Thread via GitHub


yihua merged PR #9152:
URL: https://github.com/apache/hudi/pull/9152





[hudi] branch asf-site updated: [DOCS] Updated video and blogs (#9166)

2023-07-10 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 161d2e4efa3 [DOCS] Updated video and blogs (#9166)
161d2e4efa3 is described below

commit 161d2e4efa34f115b4bd193d608dad8dddc9966c
Author: nadine farah 
AuthorDate: Mon Jul 10 16:33:00 2023 -0700

[DOCS] Updated video and blogs (#9166)
---
 .../2023-03-17-introduction-to-apache-hudi.mdx |  14 +
 website/blog/2023-06-06-TOCTOU.mdx |  14 +
 ...3-06-11-cleaner-and-archival-in-apache-hudi.mdx |  16 +++
 ...Hudi-and-Presto-Power-New-Insights-at-Scale.mdx |  17 
 ...o-query-data-in-Apache-Hudi-using-StarRocks.mdx |  14 +
 .../2023-06-20-timeline-server-in-apache-hudi.mdx  |  14 +
 ...3-06-24-multi-writer-support-in-apache-hudi.mdx |  15 ++
 ...e-DolphinScheduler-and-Hudi-Hangzhou-Meetup.mdx |  14 +
 ...t-Apache-Hudi-Apache-Iceberg-and-Delta-Lake.mdx |  18 +
 .../2023-07-01-monitoring-table-size-stats.mdx |  13 
 ...es-with-Hudi-multi-modal-indexing-subsystem.mdx |  19 ++
 ...-Quickly-start-using-Apache-Hudi-on-AWS-EMR.mdx |  15 ++
 ...e-Foundational-pillar-for-ACID-transactions.mdx |  16 +++
 website/src/pages/videos.md|  22 -
 .../2023-03-17-introduction-to-apache-hudi.png | Bin 0 -> 355902 bytes
 ...Hudi-and-Presto-Power-New-Insights-at-Scale.png | Bin 0 -> 307494 bytes
 ...t-Apache-Hudi-Apache-Iceberg-and-Delta-Lake.png | Bin 0 -> 27787 bytes
 ...es-with-Hudi-multi-modal-indexing-subsystem.png | Bin 0 -> 862160 bytes
 18 files changed, 220 insertions(+), 1 deletion(-)

diff --git a/website/blog/2023-03-17-introduction-to-apache-hudi.mdx 
b/website/blog/2023-03-17-introduction-to-apache-hudi.mdx
new file mode 100644
index 000..9921b02995f
--- /dev/null
+++ b/website/blog/2023-03-17-introduction-to-apache-hudi.mdx
@@ -0,0 +1,14 @@
+---
+title: "Introduction to Apache Hudi"
+authors: 
+- name: Itamar Syn-Hershko
+category: blog
+image: /assets/images/blog/2023-03-17-introduction-to-apache-hudi.png
+tags:
+- how-to
+- guide
+- introduction
+---
+import Redirect from '@site/src/components/Redirect';
+
+<Redirect url="https://bigdataboutique.com/blog/introduction-to-apache-hudi-c83367">Redirecting... please wait!! </Redirect>
\ No newline at end of file
diff --git a/website/blog/2023-06-06-TOCTOU.mdx 
b/website/blog/2023-06-06-TOCTOU.mdx
new file mode 100644
index 000..dd7e788c43f
--- /dev/null
+++ b/website/blog/2023-06-06-TOCTOU.mdx
@@ -0,0 +1,14 @@
+---
+title: "TOCTOU"
+authors:
+- name: Sivabalan Narayanan
+category: blog
+tags:
+- blog
+- medium
+- race conditions
+
+---
+import Redirect from '@site/src/components/Redirect';
+
+<Redirect url="https://medium.com/@simpsons/toctou-5224db6470dc">Redirecting... please wait!! </Redirect>
diff --git a/website/blog/2023-06-11-cleaner-and-archival-in-apache-hudi.mdx 
b/website/blog/2023-06-11-cleaner-and-archival-in-apache-hudi.mdx
new file mode 100644
index 000..af4455f49b6
--- /dev/null
+++ b/website/blog/2023-06-11-cleaner-and-archival-in-apache-hudi.mdx
@@ -0,0 +1,16 @@
+---
+title: "Cleaner and Archival in Apache Hudi"
+authors:
+- name: Sivabalan Narayanan
+category: blog
+tags:
+- blog
+- cleaner
+- timeline
+- active timeline
+- archival timeline
+- medium
+---
+import Redirect from '@site/src/components/Redirect';
+
+<Redirect url="https://medium.com/@simpsons/cleaner-and-archival-in-apache-hudi-9e15b08b2933">Redirecting... please wait!! </Redirect>
diff --git 
a/website/blog/2023-06-16-Exploring-New-Frontiers-How-Apache-Flink-Apache-Hudi-and-Presto-Power-New-Insights-at-Scale.mdx
 
b/website/blog/2023-06-16-Exploring-New-Frontiers-How-Apache-Flink-Apache-Hudi-and-Presto-Power-New-Insights-at-Scale.mdx
new file mode 100644
index 000..c7c2e9273d3
--- /dev/null
+++ 
b/website/blog/2023-06-16-Exploring-New-Frontiers-How-Apache-Flink-Apache-Hudi-and-Presto-Power-New-Insights-at-Scale.mdx
@@ -0,0 +1,17 @@
+---
+title: "Exploring New Frontiers: How Apache Flink, Apache Hudi and Presto 
Power New Insights at Scale"
+authors:
+- name: Nadine Farah
+category: blog
+image: 
/assets/images/blog/2023-06-16-Exploring-New-Frontiers-How-Apache-Flink-Apache-Hudi-and-Presto-Power-New-Insights-at-Scale.png
+tags:
+- blog
+- prestocon
+- flink
+- presto
+- streaming
+- incremental etl
+---
+import Redirect from '@site/src/components/Redirect';
+
+<Redirect url="https://www.onehouse.ai/blog/exploring-new-frontiers-how-apache-flink-apache-hudi-and-presto-power-new-insights-at-scale">Redirecting... please wait!! </Redirect>
diff --git 
a/website/blog/2023-06-20-How-to-query-data-in-Apache-Hudi-using-StarRocks.mdx 
b/website/blog/2023-06-20-How-to-query-data-in-Apache-Hudi-using-StarRocks.mdx
new file mode 100644
index 000..f5b016fd8df
--- /dev/null
+++ 

[GitHub] [hudi] yihua merged pull request #9166: [DOCS] Update video and blogs

2023-07-10 Thread via GitHub


yihua merged PR #9166:
URL: https://github.com/apache/hudi/pull/9166





[hudi] branch master updated (a9225bd9093 -> 5eeefaeccd5)

2023-07-10 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from a9225bd9093 [HUDI-6515] Fix bucket index row writer write record to wrong handle (#9156)
 add 5eeefaeccd5 [HUDI-6511] Mark more recently added configs advanced as appropriate (#9149)

No new revisions were added by this update.

Summary of changes:
 .../org/apache/hudi/config/HoodieClusteringConfig.java   |  1 +
 .../org/apache/hudi/config/HoodieCompactionConfig.java   |  1 +
 .../java/org/apache/hudi/config/HoodieWriteConfig.java   |  2 ++
 .../apache/hudi/common/config/HoodieCommonConfig.java|  1 +
 .../apache/hudi/common/config/HoodieMetadataConfig.java  | 10 ++
 .../main/scala/org/apache/hudi/DataSourceOptions.scala   | 16 +---
 .../hudi/utilities/config/HoodieDeltaStreamerConfig.java |  2 +-
 7 files changed, 25 insertions(+), 8 deletions(-)



[GitHub] [hudi] hudi-bot commented on pull request #9145: [HUDI-6464] [WIP] Codreview changes for Spark SQL Merge Into for pkless tables'

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9145:
URL: https://github.com/apache/hudi/pull/9145#issuecomment-1629854751

   
   ## CI report:
   
   * d43ca83278ca3c0222ac3635ca7de0b9db6d0f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18413)
 
   * 10f9adc8c62d6fae7219a4472cba010a1c1c0da0 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9168: [HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9168:
URL: https://github.com/apache/hudi/pull/9168#issuecomment-1629801524

   
   ## CI report:
   
   * 7f0cf39c716e969938e110e4de9b31e648eec83e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18471)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9165: [HUDI-6517] Throw error if deletion of invalid data file fails

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9165:
URL: https://github.com/apache/hudi/pull/9165#issuecomment-1629801470

   
   ## CI report:
   
   * a0055f7d26380a87af8c29ec2abb0d4d1e0606d2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18456)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9149: [HUDI-6511] Mark more recently added configs advanced as appropriate

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9149:
URL: https://github.com/apache/hudi/pull/9149#issuecomment-1629801364

   
   ## CI report:
   
   * 1520d0f6f10b04cc69ec1392e42d8b5e3c1d3233 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18420)
 
   * d56320773472f7453a94336b3d4df09dcf457622 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18470)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9131:
URL: https://github.com/apache/hudi/pull/9131#issuecomment-1629801249

   
   ## CI report:
   
   * 75157cf41f09e386d93f2e154553e9190731095c Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18458)
 
   * 0fa3be40442ad7e9570a5f7b3f26a75618f0447e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18469)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629800648

   
   ## CI report:
   
   * dac4ff91b1c79911ffd9bd787e8f772a716b241a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18464)
 
   * 7f0cf39c716e969938e110e4de9b31e648eec83e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18471)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9149: [HUDI-6511] Mark more recently added configs advanced as appropriate

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9149:
URL: https://github.com/apache/hudi/pull/9149#issuecomment-1629793740

   
   ## CI report:
   
   * 1520d0f6f10b04cc69ec1392e42d8b5e3c1d3233 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18420)
 
   * d56320773472f7453a94336b3d4df09dcf457622 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9168: [HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9168:
URL: https://github.com/apache/hudi/pull/9168#issuecomment-1629793858

   
   ## CI report:
   
   * 7f0cf39c716e969938e110e4de9b31e648eec83e UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9131:
URL: https://github.com/apache/hudi/pull/9131#issuecomment-1629793624

   
   ## CI report:
   
   * 75157cf41f09e386d93f2e154553e9190731095c Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18458)
 
   * 0fa3be40442ad7e9570a5f7b3f26a75618f0447e UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9035:
URL: https://github.com/apache/hudi/pull/9035#issuecomment-1629793353

   
   ## CI report:
   
   * b29e7cad9b2f4384dba40833e09b5885809b836f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18461)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629792954

   
   ## CI report:
   
   * dac4ff91b1c79911ffd9bd787e8f772a716b241a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18464)
 
   * 7f0cf39c716e969938e110e4de9b31e648eec83e UNKNOWN
   
   



[GitHub] [hudi] yihua commented on a diff in pull request #9149: [HUDI-6511] Mark more recently added configs advanced as appropriate

2023-07-10 Thread via GitHub


yihua commented on code in PR #9149:
URL: https://github.com/apache/hudi/pull/9149#discussion_r1258950508


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala:
##
@@ -124,10 +124,12 @@ object DataSourceReadOptions {
   val END_INSTANTTIME: ConfigProperty[String] = ConfigProperty
 .key("hoodie.datasource.read.end.instanttime")
 .noDefaultValue()
-.withDocumentation("Instant time to limit incrementally fetched data to. "
-  + "New data written with an instant_time <= END_INSTANTTIME are fetched out. Note that if `"
-  + HoodieCommonConfig.INCREMENTAL_READ_HANDLE_HOLLOW_COMMIT.key() + "` set to "
-  + HollowCommitHandling.USE_STATE_TRANSITION_TIME + ", will use instant's "
+.withDocumentation("Used when `" + QUERY_TYPE.key() + "` is set to `" + QUERY_TYPE_INCREMENTAL_OPT_VAL +
+  "`. Represents the instant time to limit incrementally fetched data to. When not specified latest commit time from " +
+  "timeline is assumed by default. When specified, new data written with an instant_time <= END_INSTANTTIME are fetched out. " +
+  "Point in time type queries makes more sense with begin and end instant times specified. Note that if `"

Review Comment:
   ```suggestion
 "Point in time type queries make more sense with begin and end instant 
times specified. Note that if `"
   ```
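
The window semantics described in the documentation string above can be sketched in plain Java. This is only an illustration, not Hudi code: `IncrementalWindow` and `select` are hypothetical names, instants are assumed to be timestamp strings that sort lexicographically, and the begin bound is assumed to be exclusive. The inclusive end bound and the fallback to the latest commit when no end is given follow the quoted text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IncrementalWindow {
    // Hypothetical helper (not a Hudi API): keeps instants t with begin < t <= end.
    // A null end means "no upper bound", i.e. read up to the latest commit.
    static List<String> select(List<String> instants, String begin, String end) {
        List<String> out = new ArrayList<>();
        for (String t : instants) {
            boolean afterBegin = begin == null || t.compareTo(begin) > 0; // assumed exclusive
            boolean uptoEnd = end == null || t.compareTo(end) <= 0;       // inclusive, per the doc string
            if (afterBegin && uptoEnd) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> timeline = Arrays.asList("20230701000000", "20230705000000", "20230710000000");
        // Point-in-time style query: both bounds given.
        System.out.println(select(timeline, "20230701000000", "20230705000000")); // [20230705000000]
        // End not specified: everything after begin, up to the latest commit.
        System.out.println(select(timeline, "20230701000000", null)); // [20230705000000, 20230710000000]
    }
}
```

Specifying both bounds is what makes the result reproducible: with an open end, the same query returns more rows as new commits land.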






[GitHub] [hudi] yihua commented on a diff in pull request #9149: [HUDI-6511] Mark more recently added configs advanced as appropriate

2023-07-10 Thread via GitHub


yihua commented on code in PR #9149:
URL: https://github.com/apache/hudi/pull/9149#discussion_r1258946099


##
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##
@@ -251,50 +253,58 @@ public final class HoodieMetadataConfig extends 
HoodieConfig {
   .key(METADATA_PREFIX + ".record.index.enable")
   .defaultValue(false)
   .sinceVersion("0.14.0")
+  .markAdvanced()
   .withDocumentation("Create the HUDI Record Index within the Metadata 
Table");
 
   public static final ConfigProperty 
RECORD_INDEX_MIN_FILE_GROUP_COUNT_PROP = ConfigProperty
   .key(METADATA_PREFIX + ".record.index.min.filegroup.count")
   .defaultValue(10)
   .sinceVersion("0.14.0")
+  .markAdvanced()
   .withDocumentation("Minimum number of file groups to use for Record 
Index.");
 
   public static final ConfigProperty 
RECORD_INDEX_MAX_FILE_GROUP_COUNT_PROP = ConfigProperty
   .key(METADATA_PREFIX + ".record.index.max.filegroup.count")
   .defaultValue(1000)
   .sinceVersion("0.14.0")
+  .markAdvanced()
   .withDocumentation("Maximum number of file groups to use for Record 
Index.");
 
   public static final ConfigProperty 
RECORD_INDEX_MAX_FILE_GROUP_SIZE_BYTES_PROP = ConfigProperty
   .key(METADATA_PREFIX + ".record.index.max.filegroup.size")
   .defaultValue(1024 * 1024 * 1024)
   .sinceVersion("0.14.0")
+  .markAdvanced()
   .withDocumentation("Maximum size in bytes of a single file group. Large 
file group takes longer to compact.");
 
   public static final ConfigProperty RECORD_INDEX_GROWTH_FACTOR_PROP = 
ConfigProperty
   .key(METADATA_PREFIX + ".record.index.growth.factor")
   .defaultValue(2.0f)
   .sinceVersion("0.14.0")
+  .markAdvanced()
   .withDocumentation("The current number of records are multiplied by this 
number when estimating the number of "
   + "file groups to create automatically. This helps account for 
growth in the number of records in the dataset.");
 
   public static final ConfigProperty MAX_READER_MEMORY_PROP = 
ConfigProperty
   .key(METADATA_PREFIX + ".max.reader.memory")
   .defaultValue(1024 * 1024 * 1024L)
   .sinceVersion("0.14.0")
+  .markAdvanced()
   .withDocumentation("Max memory to use for the reader to read from 
metadata");
 
   public static final ConfigProperty MAX_READER_BUFFER_SIZE_PROP = 
ConfigProperty
   .key(METADATA_PREFIX + ".max.reader.buffer.size")
   .defaultValue(10 * 1024 * 1024)
   .sinceVersion("0.14.0")
+  .markAdvanced()

Review Comment:
   nit: let's put `.markAdvanced()` before `.sinceVersion()`






[GitHub] [hudi] yihua opened a new pull request, #9168: [HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


yihua opened a new pull request, #9168:
URL: https://github.com/apache/hudi/pull/9168

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was 
copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance 
impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the 
risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] amrishlal commented on a diff in pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


amrishlal commented on code in PR #9131:
URL: https://github.com/apache/hudi/pull/9131#discussion_r1258932073


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestUpdateTable.scala:
##
@@ -48,235 +48,14 @@ class TestUpdateTable extends HoodieSparkSqlTestBase {
   Seq(1, "a1", 10.0, 1000)
 )
 
-// update data
-spark.sql(s"update $tableName set price = 20 where id = 1")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 20.0, 1000)
-)
-
-// update data
-spark.sql(s"update $tableName set price = price * 2 where id = 1")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 40.0, 1000)
-)
-  }
-})
-  }
-
-  test("Test Update Table Without Primary Key") {
-withRecordType()(withTempDir { tmp =>
-  Seq("cow", "mor").foreach { tableType =>
-val tableName = generateTableName
-// create table
-spark.sql(
-  s"""
- |create table $tableName (
- |  id int,
- |  name string,
- |  price double,
- |  ts long
- |) using hudi
- | location '${tmp.getCanonicalPath}/$tableName'
- | tblproperties (
- |  type = '$tableType',
- |  preCombineField = 'ts'
- | )
- """.stripMargin)
-
-// insert data to table
-spark.sql(s"insert into $tableName select 1, 'a1', 10, 1000")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 10.0, 1000)
-)
+spark.sql("set hoodie.enable.spark.sql.optimized.update=false")
 
 // update data
 spark.sql(s"update $tableName set price = 20 where id = 1")
 checkAnswer(s"select id, name, price, ts from $tableName")(
   Seq(1, "a1", 20.0, 1000)
 )
-
-// update data
-spark.sql(s"update $tableName set price = price * 2 where id = 1")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 40.0, 1000)
-)
   }
 })
   }
-
-  test("Test Update Table On Non-PK Condition") {
-withRecordType()(withTempDir { tmp =>
-  Seq("cow", "mor").foreach {tableType =>
-/** non-partitioned table */
-val tableName = generateTableName
-// create table
-spark.sql(
-  s"""
- |create table $tableName (
- |  id int,
- |  name string,
- |  price double,
- |  ts long
- |) using hudi
- | location '${tmp.getCanonicalPath}/$tableName'
- | tblproperties (
- |  type = '$tableType',
- |  primaryKey = 'id',
- |  preCombineField = 'ts'
- | )
-   """.stripMargin)
-
-// insert data to table
-if (isSpark2) {
-  spark.sql(s"insert into $tableName values (1, 'a1', cast(10.0 as 
double), 1000), (2, 'a2', cast(20.0 as double), 1000)")
-} else {
-  spark.sql(s"insert into $tableName values (1, 'a1', 10.0, 1000), (2, 
'a2', 20.0, 1000)")
-}
-
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 10.0, 1000),
-  Seq(2, "a2", 20.0, 1000)
-)
-
-// update data on non-pk condition
-spark.sql(s"update $tableName set price = 11.0, ts = 1001 where name = 
'a1'")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 11.0, 1001),
-  Seq(2, "a2", 20.0, 1000)
-)
-
-/** partitioned table */
-val ptTableName = generateTableName + "_pt"
-// create table
-spark.sql(
-  s"""
- |create table $ptTableName (
- |  id int,
- |  name string,
- |  price double,
- |  ts long,
- |  pt string
- |) using hudi
- | location '${tmp.getCanonicalPath}/$ptTableName'
- | tblproperties (
- |  type = '$tableType',
- |  primaryKey = 'id',
- |  preCombineField = 'ts'
- | )
- | partitioned by (pt)
-  """.stripMargin)
-
-// insert data to table
-if (isSpark2) {
-  spark.sql(
-s"""
-   |insert into $ptTableName
-   |values (1, 'a1', cast(10.0 as double), 1000, "2021"), (2, 
'a2', cast(20.0 as double), 1000, "2021"), (3, 'a2', cast(30.0 as double), 
1000, "2022")
-   |""".stripMargin)
-} else {
-  spark.sql(
-s"""
-   |insert into $ptTableName
-   |values (1, 'a1', 10.0, 1000, "2021"), (2, 'a2', 20.0, 1000, 
"2021"), (3, 'a2', 30.0, 1000, "2022")
-   |""".stripMargin)
-}
-
-checkAnswer(s"select id, name, price, ts, pt from 

[GitHub] [hudi] hudi-bot commented on pull request #9167: [HUDI-6519] The default value of read.streaming.enabled is determined by execution.runtime-mode

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9167:
URL: https://github.com/apache/hudi/pull/9167#issuecomment-1629739737

   
   ## CI report:
   
   * 7a8ef331fc7f4e4862d7b0f0cd6e8c35123e6c8a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18467)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9080: [HUDI-6445] Making some of Spark DS tests as functional

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9080:
URL: https://github.com/apache/hudi/pull/9080#issuecomment-1629739302

   
   ## CI report:
   
   * d28ff949a1dd43456fda75e5624848bb63e030f4 UNKNOWN
   * 9dadc7f288b7aac6204637e4d292cc1fb59fa540 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18455)
 
   * 61f56fd22ac70d3ede05692efc86b201bc86326a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18466)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9167: [HUDI-6519] The default value of read.streaming.enabled is determined by execution.runtime-mode

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9167:
URL: https://github.com/apache/hudi/pull/9167#issuecomment-1629729800

   
   ## CI report:
   
   * 7a8ef331fc7f4e4862d7b0f0cd6e8c35123e6c8a UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9080: [HUDI-6445] Making some of Spark DS tests as functional

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9080:
URL: https://github.com/apache/hudi/pull/9080#issuecomment-1629729006

   
   ## CI report:
   
   * d28ff949a1dd43456fda75e5624848bb63e030f4 UNKNOWN
   * 9dadc7f288b7aac6204637e4d292cc1fb59fa540 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18455)
 
   * 61f56fd22ac70d3ede05692efc86b201bc86326a UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629718295

   
   ## CI report:
   
   * b2a0e7e0a2539122bc9178f0b2e2283a175c9de8 UNKNOWN
   * c72150232719f86cc0a2ee1e429b81177698487b Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18463)
 
   * 6358410f19457f5fbe23b209b0b1b59faee0ade4 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18465)
 
   
   



[jira] [Updated] (HUDI-6519) The default value of read.streaming.enabled is determined by execution.runtime-mode

2023-07-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6519:
-
Labels: pull-request-available  (was: )

> The default value of read.streaming.enabled is determined by 
> execution.runtime-mode
> ---
>
> Key: HUDI-6519
> URL: https://issues.apache.org/jira/browse/HUDI-6519
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: flink
>Reporter: Nicholas Jiang
>Assignee: Nicholas Jiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.14.0
>
>
> The default value of read.streaming.enabled could be determined by 
> execution.runtime-mode from which you can choose depending on the 
> requirements of your use case and the characteristics of your job.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] SteNicholas opened a new pull request, #9167: [HUDI-6519] The default value of read.streaming.enabled is determined by execution.runtime-mode

2023-07-10 Thread via GitHub


SteNicholas opened a new pull request, #9167:
URL: https://github.com/apache/hudi/pull/9167

   ### Change Logs
   
   The default value of `read.streaming.enabled` could be determined by 
`execution.runtime-mode` from which you can choose depending on the 
requirements of your use case and the characteristics of your job.
   
   ### Impact
   
   The default value of `read.streaming.enabled` is determined by 
`execution.runtime-mode`.
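
A minimal sketch of the behaviour described above, under stated assumptions: `StreamingDefault` and `resolveStreamingRead` are hypothetical names, options are modelled as a plain string map rather than a Flink `Configuration`, and the mapping (STREAMING means true, anything else or unset means false) is an assumption for illustration, not taken from the patch. An explicitly set `read.streaming.enabled` is assumed to win over the derived default.

```java
import java.util.HashMap;
import java.util.Map;

public class StreamingDefault {
    static final String READ_STREAMING = "read.streaming.enabled";
    static final String RUNTIME_MODE = "execution.runtime-mode";

    // Hypothetical resolver: an explicit value wins; otherwise derive the default
    // from the runtime mode (assumed mapping: STREAMING -> true, else false).
    static boolean resolveStreamingRead(Map<String, String> conf) {
        String explicit = conf.get(READ_STREAMING);
        if (explicit != null) {
            return Boolean.parseBoolean(explicit);
        }
        return "STREAMING".equalsIgnoreCase(conf.get(RUNTIME_MODE));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RUNTIME_MODE, "STREAMING");
        System.out.println(resolveStreamingRead(conf)); // true (derived from the mode)

        conf.put(READ_STREAMING, "false");
        System.out.println(resolveStreamingRead(conf)); // false (explicit setting wins)
    }
}
```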
   
   ### Risk level (write none, low medium or high below)
   
   none.
   
   ### Documentation Update
   
   none.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [x] CI passed





[jira] [Created] (HUDI-6519) The default value of read.streaming.enabled is determined by execution.runtime-mode

2023-07-10 Thread Nicholas Jiang (Jira)
Nicholas Jiang created HUDI-6519:


 Summary: The default value of read.streaming.enabled is determined 
by execution.runtime-mode
 Key: HUDI-6519
 URL: https://issues.apache.org/jira/browse/HUDI-6519
 Project: Apache Hudi
  Issue Type: Improvement
  Components: flink
Reporter: Nicholas Jiang
Assignee: Nicholas Jiang
 Fix For: 0.14.0


The default value of read.streaming.enabled could be determined by 
execution.runtime-mode from which you can choose depending on the requirements 
of your use case and the characteristics of your job.





[GitHub] [hudi] hudi-bot commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629662745

   
   ## CI report:
   
   * b2a0e7e0a2539122bc9178f0b2e2283a175c9de8 UNKNOWN
   * 3eb4090f6ad4d2488cf12fa7c789a5841120e523 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18448)
 
   * c72150232719f86cc0a2ee1e429b81177698487b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18463)
 
   * 6358410f19457f5fbe23b209b0b1b59faee0ade4 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629661488

   
   ## CI report:
   
   * 6c7bcd138f11262da6969cead534dd3d951b21a2 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18459)
 
   * dac4ff91b1c79911ffd9bd787e8f772a716b241a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18464)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629651597

   
   ## CI report:
   
   * b2a0e7e0a2539122bc9178f0b2e2283a175c9de8 UNKNOWN
   * 3eb4090f6ad4d2488cf12fa7c789a5841120e523 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18448)
 
   * c72150232719f86cc0a2ee1e429b81177698487b UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9136: [HUDI-6509] Add GitHub CI for Java 17

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9136:
URL: https://github.com/apache/hudi/pull/9136#issuecomment-1629651357

   
   ## CI report:
   
   * a0e7207fb19738237d56fa0060c91cb7865ae9c0 UNKNOWN
   * 0b0d70ccaa422db468feadb36816be896e798e0b Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18460)
 
   * cb101756f1bb906839b8f135b618f26205e022a9 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18462)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9164: [HUDI-6516] Correct the use of hoodie.bootstrap.mode.selector

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9164:
URL: https://github.com/apache/hudi/pull/9164#issuecomment-1629651712

   
   ## CI report:
   
   * 5861f76644649791a3b050c8fb0baa12d0c4135a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18454)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9053: [HUDI-6369] Fix spacial curve with sample strategy fails when 0 or 1 rows only is incoming

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9053:
URL: https://github.com/apache/hudi/pull/9053#issuecomment-1629650886

   
   ## CI report:
   
   * 170cbb483e442332ddc189d4200fbf3fe390059a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18453)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][Testing Only][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629650318

   
   ## CI report:
   
   * 30096da4b800f96401fa16eee37e68befc2f8216 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17428)
 
   * 6c7bcd138f11262da6969cead534dd3d951b21a2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18459)
 
   * dac4ff91b1c79911ffd9bd787e8f772a716b241a UNKNOWN
   
   



[GitHub] [hudi] SteNicholas commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


SteNicholas commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629585323

   @n3nash, could you help review this pull request?





[GitHub] [hudi] nfarah86 opened a new pull request, #9166: [DOCS]-update-content-0709

2023-07-10 Thread via GitHub


nfarah86 opened a new pull request, #9166:
URL: https://github.com/apache/hudi/pull/9166

   ### Change Logs
   
   cc @bhasudha please review
   
   updated blog and videos with new content
   https://github.com/apache/hudi/assets/5392555/f085ffac-16e1-4c50-b786-3d40e591341c
   https://github.com/apache/hudi/assets/5392555/1350830d-16ad-4625-b6f9-ad3991e700d3
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9136: [HUDI-6509] Add GitHub CI for Java 17

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9136:
URL: https://github.com/apache/hudi/pull/9136#issuecomment-1629579335

   
   ## CI report:
   
   * a0e7207fb19738237d56fa0060c91cb7865ae9c0 UNKNOWN
   * 360e04ff0549c152e0847b71d8a9651507582ec8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18437)
 
   * 0b0d70ccaa422db468feadb36816be896e798e0b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18460)
 
   * cb101756f1bb906839b8f135b618f26205e022a9 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8998: [HUDI-6400] Fail when merger class not found

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8998:
URL: https://github.com/apache/hudi/pull/8998#issuecomment-1629545756

   
   ## CI report:
   
   * 5f3bc519f22cf53fe727f07f007950523600ff23 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18452)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9136: [HUDI-6509] Skip tests for Java 17

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9136:
URL: https://github.com/apache/hudi/pull/9136#issuecomment-1629472367

   
   ## CI report:
   
   * a0e7207fb19738237d56fa0060c91cb7865ae9c0 UNKNOWN
   * 360e04ff0549c152e0847b71d8a9651507582ec8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18437)
 
   * 0b0d70ccaa422db468feadb36816be896e798e0b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18460)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9035:
URL: https://github.com/apache/hudi/pull/9035#issuecomment-1629471792

   
   ## CI report:
   
   * 9085094a917ff54e32258029478eca02ae4709ed Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18457)
 
   * b29e7cad9b2f4384dba40833e09b5885809b836f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18461)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9136: [HUDI-6509] Skip tests for Java 17

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9136:
URL: https://github.com/apache/hudi/pull/9136#issuecomment-1629460929

   
   ## CI report:
   
   * a0e7207fb19738237d56fa0060c91cb7865ae9c0 UNKNOWN
   * 360e04ff0549c152e0847b71d8a9651507582ec8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18437)
 
   * 0b0d70ccaa422db468feadb36816be896e798e0b UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9035:
URL: https://github.com/apache/hudi/pull/9035#issuecomment-1629460506

   
   ## CI report:
   
   * e132cdb5b70389de7509265a174a18b601d2ca1d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18370)
   * 9085094a917ff54e32258029478eca02ae4709ed Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18457)
   * b29e7cad9b2f4384dba40833e09b5885809b836f UNKNOWN
   
   



[GitHub] [hudi] nbalajee commented on a diff in pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


nbalajee commented on code in PR #9035:
URL: https://github.com/apache/hudi/pull/9035#discussion_r1258659622


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##
@@ -138,9 +139,35 @@ protected Path makeNewFilePath(String partitionPath, String fileName) {
*
* @param partitionPath Partition path
*/
-  protected void createMarkerFile(String partitionPath, String dataFileName) {
-WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime)
-.create(partitionPath, dataFileName, getIOType(), config, fileId, hoodieTable.getMetaClient().getActiveTimeline());
+  protected void createInProgressMarkerFile(String partitionPath, String dataFileName, String markerInstantTime) {
+WriteMarkers writeMarkers = WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime);
+if (!writeMarkers.doesMarkerDirExist()) {
+  throw new HoodieIOException(String.format("Marker root directory absent : %s/%s (%s)",
+  partitionPath, dataFileName, markerInstantTime));
+}
+if (config.enforceFinalizeWriteCheck()
+&& writeMarkers.markerExists(writeMarkers.getCompletionMarkerPath("", "FINALIZE_WRITE", markerInstantTime, IOType.CREATE))) {
+  throw new HoodieCorruptedDataException("Reconciliation for instant " + instantTime + " is completed, job is trying to re-write the data files.");

Review Comment:
   Fixed. Picked up the change.






[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629448482

   
   ## CI report:
   
   * 30096da4b800f96401fa16eee37e68befc2f8216 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17428)
   * 6c7bcd138f11262da6969cead534dd3d951b21a2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18459)
   
   



[GitHub] [hudi] nbalajee commented on a diff in pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


nbalajee commented on code in PR #9035:
URL: https://github.com/apache/hudi/pull/9035#discussion_r1258658841


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##
@@ -138,9 +139,35 @@ protected Path makeNewFilePath(String partitionPath, String fileName) {
*
* @param partitionPath Partition path
*/
-  protected void createMarkerFile(String partitionPath, String dataFileName) {
-WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime)
-.create(partitionPath, dataFileName, getIOType(), config, fileId, hoodieTable.getMetaClient().getActiveTimeline());
+  protected void createInProgressMarkerFile(String partitionPath, String dataFileName, String markerInstantTime) {
+WriteMarkers writeMarkers = WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime);
+if (!writeMarkers.doesMarkerDirExist()) {
+  throw new HoodieIOException(String.format("Marker root directory absent : %s/%s (%s)",
+  partitionPath, dataFileName, markerInstantTime));
+}
+if (config.enforceFinalizeWriteCheck()
+&& writeMarkers.markerExists(writeMarkers.getCompletionMarkerPath("", "FINALIZE_WRITE", markerInstantTime, IOType.CREATE))) {

Review Comment:
   Consider a job that has:
   (a) completed the write stage and created the data files;
   (b) started a commit and finalized the writes (keeping the files that are part of the write statuses and removing the duplicate files);
   (c) while updating the MDT for RLI (or before updating the MDT), found that the write-status information (RDD blocks persisted in the containers' local storage) was lost because containers were lost or failed.
   With this flag turned on, the job is forced to fail instead of retrying the tasks/stages to recreate the data files associated with the missing write statuses.
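
   The behavior described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Hudi's implementation: the class `FinalizeWriteGuardSketch`, its methods, the marker-name strings, and the in-memory set standing in for the marker directory are all hypothetical, and `IllegalStateException` stands in for the exception type used in the diff. Only the idea of refusing new in-progress markers once a `FINALIZE_WRITE` completion marker exists is taken from the change under review.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, in-memory stand-in for the marker directory of one table.
final class FinalizeWriteGuardSketch {
    private final Set<String> markers = new HashSet<>();
    private final boolean enforceFinalizeWriteCheck;

    FinalizeWriteGuardSketch(boolean enforceFinalizeWriteCheck) {
        this.enforceFinalizeWriteCheck = enforceFinalizeWriteCheck;
    }

    // Step (b): the commit has reconciled its data files for this instant.
    void markFinalizeWriteCompleted(String instantTime) {
        markers.add("FINALIZE_WRITE-" + instantTime);
    }

    // Steps (a) and (c): a task, possibly a retried one, wants to write a data file.
    void createInProgressMarker(String instantTime, String dataFileName) {
        boolean finalized = markers.contains("FINALIZE_WRITE-" + instantTime);
        if (enforceFinalizeWriteCheck && finalized) {
            // With the flag on, a retry after finalization fails the job
            // instead of recreating the data file.
            throw new IllegalStateException("Reconciliation for instant " + instantTime
                + " is completed, job is trying to re-write the data files.");
        }
        markers.add(instantTime + "/" + dataFileName + ".inprogress");
    }

    boolean hasMarker(String name) {
        return markers.contains(name);
    }
}
```

   With the flag off, the same retried call simply records a new in-progress marker, which corresponds to the retry path mentioned in the comment.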






[GitHub] [hudi] hudi-bot commented on pull request #8683: [HUDI-5533] Support spark columns comments

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8683:
URL: https://github.com/apache/hudi/pull/8683#issuecomment-1629447846

   
   ## CI report:
   
   * 8d6893fd9daf07c30524474cf9a4d39c66a37cba UNKNOWN
   * de250fbbcf1d16ba358dd08270eab5e11a5e3740 UNKNOWN
   * a8b972a6d624afb07e197ffef99473f644e3305f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18451)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9131:
URL: https://github.com/apache/hudi/pull/9131#issuecomment-1629390619

   
   ## CI report:
   
   * 430e85d5284cb537f0783245d76fa6b504531c6b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18407)
   * 75157cf41f09e386d93f2e154553e9190731095c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18458)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9080: [HUDI-6445] Making some of Spark DS tests as functional

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9080:
URL: https://github.com/apache/hudi/pull/9080#issuecomment-1629390228

   
   ## CI report:
   
   * d28ff949a1dd43456fda75e5624848bb63e030f4 UNKNOWN
   * b9dd8237e187586c5d05b46d4d4eee891822813e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18232)
   * 9dadc7f288b7aac6204637e4d292cc1fb59fa540 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18455)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9035:
URL: https://github.com/apache/hudi/pull/9035#issuecomment-1629390085

   
   ## CI report:
   
   * e132cdb5b70389de7509265a174a18b601d2ca1d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18370)
   * 9085094a917ff54e32258029478eca02ae4709ed Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18457)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8827: [DNM][HUDI-6276] Rename HoodieDeltaStreamer to HoodieStreamer

2023-07-10 Thread via GitHub


hudi-bot commented on PR #8827:
URL: https://github.com/apache/hudi/pull/8827#issuecomment-1629389500

   
   ## CI report:
   
   * 30096da4b800f96401fa16eee37e68befc2f8216 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17428)
   * 6c7bcd138f11262da6969cead534dd3d951b21a2 UNKNOWN
   
   



[GitHub] [hudi] amrishlal commented on a diff in pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


amrishlal commented on code in PR #9131:
URL: https://github.com/apache/hudi/pull/9131#discussion_r1258613313


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestUpdateTable.scala:
##
@@ -48,235 +48,14 @@ class TestUpdateTable extends HoodieSparkSqlTestBase {
   Seq(1, "a1", 10.0, 1000)
 )
 
-// update data
-spark.sql(s"update $tableName set price = 20 where id = 1")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 20.0, 1000)
-)
-
-// update data
-spark.sql(s"update $tableName set price = price * 2 where id = 1")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 40.0, 1000)
-)
-  }
-})
-  }
-
-  test("Test Update Table Without Primary Key") {
-withRecordType()(withTempDir { tmp =>
-  Seq("cow", "mor").foreach { tableType =>
-val tableName = generateTableName
-// create table
-spark.sql(
-  s"""
- |create table $tableName (
- |  id int,
- |  name string,
- |  price double,
- |  ts long
- |) using hudi
- | location '${tmp.getCanonicalPath}/$tableName'
- | tblproperties (
- |  type = '$tableType',
- |  preCombineField = 'ts'
- | )
- """.stripMargin)
-
-// insert data to table
-spark.sql(s"insert into $tableName select 1, 'a1', 10, 1000")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 10.0, 1000)
-)
+spark.sql("set hoodie.enable.spark.sql.optimized.update=false")
 
 // update data
 spark.sql(s"update $tableName set price = 20 where id = 1")
 checkAnswer(s"select id, name, price, ts from $tableName")(
   Seq(1, "a1", 20.0, 1000)
 )
-
-// update data
-spark.sql(s"update $tableName set price = price * 2 where id = 1")
-checkAnswer(s"select id, name, price, ts from $tableName")(
-  Seq(1, "a1", 40.0, 1000)
-)
   }
 })
   }
-
-  test("Test Update Table On Non-PK Condition") {

Review Comment:
   Fixed. This was just for limiting testing to a single test for development.






[GitHub] [hudi] amrishlal commented on a diff in pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


amrishlal commented on code in PR #9131:
URL: https://github.com/apache/hudi/pull/9131#discussion_r1258612112


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestUpdateTable.scala:
##
@@ -23,7 +23,7 @@ class TestUpdateTable extends HoodieSparkSqlTestBase {
 
   test("Test Update Table") {
 withRecordType()(withTempDir { tmp =>
-  Seq("cow", "mor").foreach { tableType =>
+  Seq("cow").foreach { tableType =>

Review Comment:
   Fixed.






[GitHub] [hudi] hudi-bot commented on pull request #9160: [HUDI-6501] HoodieHeartbeatClient should stop all heartbeats and not delete heartbeat files for close

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9160:
URL: https://github.com/apache/hudi/pull/9160#issuecomment-1629377807

   
   ## CI report:
   
   * b2a0e7e0a2539122bc9178f0b2e2283a175c9de8 UNKNOWN
   * 3eb4090f6ad4d2488cf12fa7c789a5841120e523 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18448)
   
   



[GitHub] [hudi] amrishlal commented on a diff in pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


amrishlal commented on code in PR #9131:
URL: https://github.com/apache/hudi/pull/9131#discussion_r1258610870


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/HoodieSparkSqlTestBase.scala:
##
@@ -195,7 +195,7 @@ class HoodieSparkSqlTestBase extends FunSuite with BeforeAndAfterAll {
 
   protected def withRecordType(recordConfig: Map[HoodieRecordType, Map[String, String]]=Map.empty)(f: => Unit) {
 // TODO HUDI-5264 Test parquet log with avro record in spark sql test
-Seq(HoodieRecordType.AVRO, HoodieRecordType.SPARK).foreach { recordType =>
+Seq(HoodieRecordType.AVRO).foreach { recordType =>

Review Comment:
   Fixed.






[GitHub] [hudi] hudi-bot commented on pull request #9165: [HUDI-6517] Throw error if deletion of invalid data file fails

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9165:
URL: https://github.com/apache/hudi/pull/9165#issuecomment-1629377911

   
   ## CI report:
   
   * a0055f7d26380a87af8c29ec2abb0d4d1e0606d2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18456)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9131:
URL: https://github.com/apache/hudi/pull/9131#issuecomment-1629377557

   
   ## CI report:
   
   * 430e85d5284cb537f0783245d76fa6b504531c6b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18407)
   * 75157cf41f09e386d93f2e154553e9190731095c UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9080: [HUDI-6445] Making some of Spark DS tests as functional

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9080:
URL: https://github.com/apache/hudi/pull/9080#issuecomment-1629376948

   
   ## CI report:
   
   * d28ff949a1dd43456fda75e5624848bb63e030f4 UNKNOWN
   * b9dd8237e187586c5d05b46d4d4eee891822813e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18232)
   * 9dadc7f288b7aac6204637e4d292cc1fb59fa540 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-07-10 Thread via GitHub


hudi-bot commented on PR #9035:
URL: https://github.com/apache/hudi/pull/9035#issuecomment-1629376661

   
   ## CI report:
   
   * e132cdb5b70389de7509265a174a18b601d2ca1d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18370)
   * 9085094a917ff54e32258029478eca02ae4709ed UNKNOWN
   
   



[GitHub] [hudi] amrishlal commented on a diff in pull request #9131: [HUDI-6315] Feature flag for disabling optimized update/delete code path.

2023-07-10 Thread via GitHub


amrishlal commented on code in PR #9131:
URL: https://github.com/apache/hudi/pull/9131#discussion_r1258604541


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala:
##
@@ -605,6 +605,18 @@ object DataSourceWriteOptions {
 
   val DROP_PARTITION_COLUMNS: ConfigProperty[java.lang.Boolean] = HoodieTableConfig.DROP_PARTITION_COLUMNS
 
+  val ENABLE_OPTIMIZED_UPDATE: ConfigProperty[String] = ConfigProperty
+.key("hoodie.enable.spark.sql.optimized.update")
+.defaultValue("true")
+.markAdvanced()
+.withDocumentation("Controls whether spark sql optimized update is enabled.")
+
+  val ENABLE_OPTIMIZED_DELETE: ConfigProperty[String] = ConfigProperty
+.key("hoodie.enable.spark.sql.optimized.delete")

Review Comment:
   Changed to `hoodie.spark.sql.writes.optimized.enable`





