[GitHub] [hudi] fengjian428 edited a comment on issue #3755: [Delta Streamer] file name mismatch with meta when compaction running
fengjian428 edited a comment on issue #3755:
URL: https://github.com/apache/hudi/issues/3755#issuecomment-1053172311

> #4753

Could this happen when using a COW table? @yihua

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] fengjian428 edited a comment on issue #3755: [Delta Streamer] file name mismatch with meta when compaction running
fengjian428 edited a comment on issue #3755:
URL: https://github.com/apache/hudi/issues/3755#issuecomment-945089345

> Are you having concurrent writers? If yes, I have come across a similar issue reported by someone else. Let me know.

I have a question: when Hudi does a delta commit and the data is new, it needs to append it to an existing parquet file. Meanwhile, this may cause a concurrency issue with the async compaction thread if the compaction plan contains the same parquet file. How does Hudi avoid that?
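For context on the concurrent-writer question above: within a single Delta Streamer process, async compaction is coordinated internally with ingestion, but if genuinely separate writers touch the same table, Hudi's optimistic concurrency control (OCC) must be enabled so conflicting commits are detected and aborted. A minimal sketch of the relevant write configs follows, assuming a ZooKeeper-based lock provider; the ZK host, port, and lock paths are illustrative placeholders, not values taken from this issue:

```shell
# Hedged sketch: enable optimistic concurrency control for multi-writer setups.
# The ZooKeeper host/port and lock paths below are illustrative placeholders.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  ...(other Delta Streamer args)... \
  --hoodie-conf hoodie.write.concurrency.mode=optimistic_concurrency_control \
  --hoodie-conf hoodie.cleaner.policy.failed.writes=LAZY \
  --hoodie-conf hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider \
  --hoodie-conf hoodie.write.lock.zookeeper.url=zk-host \
  --hoodie-conf hoodie.write.lock.zookeeper.port=2181 \
  --hoodie-conf hoodie.write.lock.zookeeper.lock_key=table_lock \
  --hoodie-conf hoodie.write.lock.zookeeper.base_path=/hudi/locks
```

With OCC enabled, two writers modifying the same file group will have one commit fail at conflict-resolution time rather than silently corrupting file/metadata consistency.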
[GitHub] [hudi] fengjian428 edited a comment on issue #3755: [Delta Streamer] file name mismatch with meta when compaction running
fengjian428 edited a comment on issue #3755:
URL: https://github.com/apache/hudi/issues/3755#issuecomment-940179806

> It seems that the file left in the reconcile stage is different from the commit meta. Could you kindly share relevant logs and file status about the marker file?

Before I got the error above, this RPC timeout error happened, and when I restarted Delta Streamer, the error above occurred. Should I change the unpersist method to non-blocking and try again?

```
Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Futures timed out after [600 seconds]. This timeout is controlled by spark.rpc.askTimeout
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
	at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:90)
	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:171)
	... 8 more
Caused by: org.apache.hudi.exception.HoodieException: Futures timed out after [600 seconds]. This timeout is controlled by spark.rpc.askTimeout
	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:657)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [600 seconds]. This timeout is controlled by spark.rpc.askTimeout
	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:47)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:62)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:58)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
	at org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:131)
	at org.apache.spark.SparkContext.unpersistRDD(SparkContext.scala:1821)
	at org.apache.spark.rdd.RDD.unpersist(RDD.scala:217)
	at org.apache.spark.api.java.JavaRDD.unpersist(JavaRDD.scala:53)
	at org.apache.hudi.client.SparkRDDWriteClient.lambda$releaseResources$5(SparkRDDWriteClient.java:499)
	at java.lang.Iterable.forEach(Iterable.java:75)
	at org.apache.hudi.client.SparkRDDWriteClient.releaseResources(SparkRDDWriteClient.java:499)
	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:193)
	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:124)
	at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:525)
	at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:304)
	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:633)
	... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [600 seconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:220)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	... 16 more
```
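The trace above shows the 600-second failure originating in BlockManagerMaster.removeRdd during SparkRDDWriteClient.releaseResources, i.e. a blocking RDD.unpersist call exceeding spark.rpc.askTimeout. One low-risk experiment, independent of whether Hudi's unpersist can safely be made non-blocking, is to raise the Spark RPC timeout; a sketch follows (the 1200s value is an arbitrary example, not a recommendation):

```shell
# Hedged sketch: raise the Spark RPC ask timeout that governs the blocking
# unpersist call seen in the stack trace. 1200s is an illustrative value;
# spark.network.timeout is the fallback default for spark.rpc.askTimeout.
spark-submit \
  --conf spark.rpc.askTimeout=1200s \
  --conf spark.network.timeout=1200s \
  ...(Delta Streamer class and args)...
```

Whether switching to a non-blocking unpersist is safe here depends on what Hudi's releaseResources expects after the call returns, so raising the timeout is the less invasive first step.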