[jira] [Updated] (HUDI-3555) re-use spark config for parquet timestamp format instead of having our own config

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3555:
-
Sprint: 2022/05/16

> re-use spark config for parquet timestamp format instead of having our own 
> config
> -
>
> Key: HUDI-3555
> URL: https://issues.apache.org/jira/browse/HUDI-3555
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark
>Reporter: sivabalan narayanan
>Priority: Major
>
> We have two different configs for setting the right timestamp format: 
> "hoodie.parquet.outputtimestamptype": "TIMESTAMP_MICROS",
> and the Spark config
> --conf spark.sql.parquet.outputTimestampType=TIMESTAMP_MICROS 
>  
> We should deprecate our own config and rely solely on Spark's. 
>  
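For illustration, a minimal sketch (the app name and local master are illustrative) of relying on Spark's own config when writing through a Hudi session:

{code:java}
import org.apache.spark.sql.SparkSession;

public class TimestampTypeDemo {
  public static void main(String[] args) {
    // Set Spark's native parquet timestamp type once, instead of the
    // Hudi-specific "hoodie.parquet.outputtimestamptype" key.
    SparkSession spark = SparkSession.builder()
        .appName("timestamp-type-demo")
        .master("local[1]")
        .config("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MICROS")
        .getOrCreate();
    // Hudi writes issued through this session would then inherit Spark's
    // parquet timestamp setting, which is the behavior this ticket proposes.
    spark.stop();
  }
}
{code}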



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-3199) Hive sync config unification

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3199:
-
Sprint: 2022/05/16

>  Hive sync config unification
> -
>
> Key: HUDI-3199
> URL: https://issues.apache.org/jira/browse/HUDI-3199
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: configs, hive
>Reporter: cdmikechen
>Priority: Major
>
> There are hive sync configs in Flink, the Spark API, DeltaStreamer and 
> kafka-connect.
> But these properties live in the *hudi-spark-common* package, so services 
> like Flink or kafka-connect cannot use them directly (the related classes 
> carry Scala or Spark references).
> For an example of a problem caused by such a class reference, see: 
> https://issues.apache.org/jira/browse/HUDI-3112
> We should unify the hive sync configs (e.g. move them to *hudi-hive-sync*), so 
> that other packages pick up changes when they are updated.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-2536) Rename compaction config keys

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2536:
-
Sprint: 2022/05/16

> Rename compaction config keys 
> --
>
> Key: HUDI-2536
> URL: https://issues.apache.org/jira/browse/HUDI-2536
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: compaction, configs
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Critical
>
> There are certain compaction configs, such as 
> "hoodie.compact.inline.trigger.strategy" (used for triggering compaction), 
> which have "inline" in their name but are used for async compaction as well. This 
> task is to rename such configs while ensuring backward compatibility with the older names.
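A minimal sketch of how such a rename could stay backward compatible via Hudi's ConfigProperty alternatives (the new key name below is a hypothetical example, not the decided name):

{code:java}
// Hypothetical rename: drop "inline" from the key, keep the old key as an
// alternative so existing jobs keep working.
public static final ConfigProperty<String> COMPACTION_TRIGGER_STRATEGY = ConfigProperty
    .key("hoodie.compact.trigger.strategy")                      // assumed new name
    .defaultValue("NUM_COMMITS")
    .withAlternatives("hoodie.compact.inline.trigger.strategy")  // old name still honored
    .withDocumentation("Strategy to trigger compaction, whether inline or async");
{code}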



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-2197) Replace ConfigOptions with ConfigProperty for FlinkOptions

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2197:
-
Sprint: 2022/05/16

> Replace ConfigOptions with ConfigProperty for FlinkOptions
> --
>
> Key: HUDI-2197
> URL: https://issues.apache.org/jira/browse/HUDI-2197
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Rajesh Mahindra
>Assignee: liujinhui
>Priority: Major
>
> The FlinkOptions class currently uses ConfigOptions for each config value. Use 
> ConfigProperty instead, to keep it consistent with the rest of the configs 
> in Hudi.
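A minimal sketch of the change for one option (the key and default mirror an existing FlinkOptions entry; treat the exact builder chain as illustrative):

{code:java}
// Before: Flink's native ConfigOptions builder
// public static final ConfigOption<Boolean> READ_AS_STREAMING = ConfigOptions
//     .key("read.streaming.enabled").booleanType().defaultValue(false);

// After: Hudi's ConfigProperty, consistent with other Hudi config classes
public static final ConfigProperty<Boolean> READ_AS_STREAMING = ConfigProperty
    .key("read.streaming.enabled")
    .defaultValue(false)
    .withDocumentation("Whether to read as a streaming source");
{code}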



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-2196) Breakdown FlinkOptions into FlinkReadOptions, FlinkWriteOptions...

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2196:
-
Sprint: 2022/05/16

> Breakdown FlinkOptions into FlinkReadOptions, FlinkWriteOptions...
> --
>
> Key: HUDI-2196
> URL: https://issues.apache.org/jira/browse/HUDI-2196
> Project: Apache Hudi
>  Issue Type: Task
>  Components: configs
>Reporter: Rajesh Mahindra
>Priority: Major
>  Labels: pull-request-available
>
> Break down FlinkOptions.java into FlinkReadOptions.java, 
> FlinkWriteOptions.java, etc.
>  
> This will help improve the config docs and group the Flink configs 
> appropriately, similar to the Spark configs; see the sketch below.
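A minimal sketch of the proposed split (the class names follow the summary; the options shown are illustrative examples of which class each would land in):

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

// FlinkReadOptions.java - read-path options only
public class FlinkReadOptions {
  public static final ConfigOption<Boolean> READ_AS_STREAMING = ConfigOptions
      .key("read.streaming.enabled")
      .booleanType()
      .defaultValue(false)
      .withDescription("Whether to read as a streaming source");
}

// FlinkWriteOptions.java - write-path options only
class FlinkWriteOptions {
  public static final ConfigOption<Integer> WRITE_TASKS = ConfigOptions
      .key("write.tasks")
      .intType()
      .defaultValue(4)
      .withDescription("Parallelism of the write tasks");
}
{code}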



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] minihippo commented on a diff in pull request #5572: [HUDI-3654] Preparations for hudi metastore.

2022-05-15 Thread GitBox


minihippo commented on code in PR #5572:
URL: https://github.com/apache/hudi/pull/5572#discussion_r873379752


##
hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java:
##
@@ -184,6 +191,11 @@ public static HoodieTableFileSystemView createInMemoryFileSystemViewWithTimeline
 if (metadataConfig.enabled()) {
   return new HoodieMetadataFileSystemView(engineContext, metaClient, timeline, metadataConfig);
 }
+if (metaClient.getMetastoreConfig().enableMetastore()) {
+  return (HoodieTableFileSystemView) ReflectionUtils.loadClass("org.apache.hudi.common.table.view.HoodieMetastoreFileSystemView",

Review Comment:
   fixed



##
hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java:
##
@@ -165,6 +167,11 @@ private static HoodieTableFileSystemView createInMemoryFileSystemView(HoodieMeta
   return new HoodieMetadataFileSystemView(metaClient, metaClient.getActiveTimeline().filterCompletedAndCompactionInstants(),
   metadataSupplier.get());
 }
+if (metaClient.getMetastoreConfig().enableMetastore()) {
+  return (HoodieTableFileSystemView) ReflectionUtils.loadClass("org.apache.hudi.common.table.view.HoodieMetastoreFileSystemView",

Review Comment:
   fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] minihippo commented on a diff in pull request #5572: [HUDI-3654] Preparations for hudi metastore.

2022-05-15 Thread GitBox


minihippo commented on code in PR #5572:
URL: https://github.com/apache/hudi/pull/5572#discussion_r873379586


##
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetastoreConfig.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+import javax.annotation.concurrent.Immutable;
+import java.util.Properties;
+
+/**
+ * Configurations used by the HUDI Metadata Table.
+ */
+@Immutable
+@ConfigClassProperty(name = "Metastore Configs",
+groupName = ConfigGroups.Names.WRITE_CLIENT,
+description = "Configurations used by the Hudi Metastore.")
+public class HoodieMetastoreConfig extends HoodieConfig {
+
+  public static final String METASTORE_PREFIX = "hoodie.metastore";
+
+  public static final ConfigProperty METASTORE_ENABLE = ConfigProperty
+  .key(METASTORE_PREFIX + ".enable")
+  .defaultValue(false)
+  .withDocumentation("Use metastore server to store hoodie table metadata");
+
+  public static final ConfigProperty METASTORE_URLS = ConfigProperty
+  .key(METASTORE_PREFIX + ".uris")
+  .defaultValue("thrift://localhost:9090")
+  .withDocumentation("Metastore server uris");
+
+  public static final ConfigProperty METASTORE_CONNECTION_RETRIES = ConfigProperty
+  .key(METASTORE_PREFIX + ".connect.retries")
+  .defaultValue(3)
+  .withDocumentation("Number of retries while opening a connection to metastore");
+
+  public static final ConfigProperty METASTORE_CONNECTION_RETRY_DELAY = ConfigProperty
+  .key(METASTORE_PREFIX + ".connect.retry.delay")
+  .defaultValue(1)
+  .withDocumentation("Number of seconds for the client to wait between consecutive connection attempts");
+

Review Comment:
   Good suggestions. Will add it in the following PR #5064
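   For context, a minimal usage sketch for this config class (relying on the Builder already used by HoodieTableMetaClient in this PR; the property values are examples):

   ```java
   import java.util.Properties;
   import org.apache.hudi.common.config.HoodieMetastoreConfig;

   Properties props = new Properties();
   props.setProperty("hoodie.metastore.enable", "true");
   props.setProperty("hoodie.metastore.uris", "thrift://localhost:9090");

   // Build the metastore config from user-supplied properties.
   HoodieMetastoreConfig config = new HoodieMetastoreConfig.Builder()
       .fromProperties(props)
       .build();
   // config.enableMetastore() now returns true, switching on the metastore path.
   ```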



##
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetastoreConfig.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+import javax.annotation.concurrent.Immutable;
+import java.util.Properties;
+
+/**
+ * Configurations used by the HUDI Metadata Table.

Review Comment:
   fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-940) Audit bad/dangling configs and code

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-940:

Sprint: 2022/05/16

> Audit bad/dangling configs and code 
> 
>
> Key: HUDI-940
> URL: https://issues.apache.org/jira/browse/HUDI-940
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Common Core
>Affects Versions: 0.9.0
>Reporter: Balaji Varadarajan
>Priority: Major
> Fix For: 0.12.0
>
>
> Motivation : Avoid bad configs like the one fixed in  
> [https://github.com/apache/hudi/pull/1654]
> We need to take a pass on the code to remove dead/bad configs and code



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-1480) Inspect all tools and provide a standard way to pass in hoodie configs

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1480:
-
Sprint: 2022/05/16

> Inspect all tools and provide a standard way to pass in hoodie configs
> --
>
> Key: HUDI-1480
> URL: https://issues.apache.org/jira/browse/HUDI-1480
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: code-quality
>Reporter: Vinoth Chandar
>Priority: Major
>
> Tools = classes with a main method that we expect the user to be running, 
>  e.g. _DLASyncConfig/HiveSyncConfig_. Add an entry for direct hudi configs like 
> useFileListingFromMetadata/verifyMetadataFileListing; one possible standard 
> mechanism is sketched below.
>  
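A minimal sketch of one standard mechanism (modeled on DeltaStreamer's --hoodie-conf option; the helper itself is hypothetical):

{code:java}
import java.util.Properties;

// Hypothetical helper each tool could share: fold repeated
// "--hoodie-conf key=value" arguments into a single Properties object.
public static Properties parseHoodieConfigs(String[] args) {
  Properties props = new Properties();
  for (int i = 0; i < args.length - 1; i++) {
    if ("--hoodie-conf".equals(args[i])) {
      String[] kv = args[++i].split("=", 2);
      if (kv.length == 2) {
        props.setProperty(kv[0], kv[1]);
      }
    }
  }
  return props;
}
{code}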



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-2871) Decouple metrics dependencies from hudi-client-common

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2871:
-
Sprint: Hudi-Sprint-Mar-01, Hudi-Sprint-Mar-07, Hudi-Sprint-Mar-14, 
Hudi-Sprint-Mar-21, Hudi-Sprint-Mar-22, 2022/05/16  (was: Hudi-Sprint-Mar-01, 
Hudi-Sprint-Mar-07, Hudi-Sprint-Mar-14, Hudi-Sprint-Mar-21, Hudi-Sprint-Mar-22)

> Decouple metrics dependencies from hudi-client-common
> -
>
> Key: HUDI-2871
> URL: https://issues.apache.org/jira/browse/HUDI-2871
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: code-quality, dependencies, metrics, writer-core
>Reporter: Vinoth Chandar
>Assignee: Rajesh
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> Some metrics dependencies - Cloudwatch, graphite, prometheus etc. - are all 
> pulled in. 
> It might be good to break these out into their own modules and include them during 
> packaging. This needs some form of reflection-based instantiation of the 
> metrics reporter, as sketched below.
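A minimal sketch of that reflection-based instantiation (the no-arg constructor is an assumption for illustration; MetricsReporter stands in for Hudi's reporter base type):

{code:java}
// Hypothetical: resolve the reporter class by name at runtime, so
// hudi-client-common need not depend on each metrics module directly.
private static MetricsReporter loadReporter(String reporterClassName) {
  try {
    Class<?> clazz = Class.forName(reporterClassName);
    return (MetricsReporter) clazz.getDeclaredConstructor().newInstance();
  } catch (ReflectiveOperationException e) {
    throw new IllegalStateException(
        "Metrics reporter " + reporterClassName + " is not on the classpath", e);
  }
}
{code}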



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-1891) Jetty Dependency conflict when upgrade to hive3.1.1 and hadoop3.0.0

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1891:
-
Sprint: Cont' improve - 2022/03/7, 2022/05/16  (was: Cont' improve - 
2022/03/7)

> Jetty Dependency conflict when upgrade to hive3.1.1 and hadoop3.0.0
> ---
>
> Key: HUDI-1891
> URL: https://issues.apache.org/jira/browse/HUDI-1891
> Project: Apache Hudi
>  Issue Type: Task
>  Components: dependencies
>Reporter: shenbing
>Priority: Critical
>
> When packaging hudi 0.7.0 or 0.9.0-SNAPSHOT using 
> {code:java}
> mvn clean install -DskipTests -DskipITs -Dcheckstyle.skip=true 
> -Drat.skip=true -Dhadoop.version=3.0.0  -Dhive.version=3.1.1{code}
> and then importing hudi-spark-bundle_2.11-0.9.0-SNAPSHOT.jar into my project, I 
> get an error:
>  
> {code:java}
> org.apache.hudi.org.apache.jetty.server.session.SessionHandler.setHttpOnly(Z)Vjava.lang.NoSuchMethodError:
>  
> org.apache.hudi.org.apache.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> at 
> io.javalin.core.util.JettyServerUtil.defaultSessionHandler(JettyServerUtil.kt:50)
> at io.javalin.Javalin.(Javalin.java:94)
> at io.javalin.Javalin.create(Javalin.java:107)
> at 
> org.apache.hudi.timeline.service.TimelineService.startService(TimelineService.java:156)
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineService.startServer(EmbeddedTimelineService.java:88)
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineServerHelper.createEmbeddedTimelineService(EmbeddedTimelineServerHelper.java:56)
> at 
> org.apache.hudi.client.AbstractHoodieClient.startEmbeddedServerView(AbstractHoodieClient.java:109)
> at 
> org.apache.hudi.client.AbstractHoodieClient.(AbstractHoodieClient.java:77)
> at 
> org.apache.hudi.client.AbstractHoodieWriteClient.(AbstractHoodieWriteClient.java:132)
> at 
> org.apache.hudi.client.AbstractHoodieWriteClient.(AbstractHoodieWriteClient.java:120)
> at 
> org.apache.hudi.client.SparkRDDWriteClient.(SparkRDDWriteClient.java:84)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] minihippo commented on a diff in pull request #5572: [HUDI-3654] Preparations for hudi metastore.

2022-05-15 Thread GitBox


minihippo commented on code in PR #5572:
URL: https://github.com/apache/hudi/pull/5572#discussion_r873375900


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -612,6 +623,21 @@ public void initializeBootstrapDirsIfNotExists() throws IOException {
 initializeBootstrapDirsIfNotExists(getHadoopConf(), basePath.toString(), getFs());
   }
 
+  private static HoodieTableMetaClient newMetaClient(Configuration conf, String basePath, boolean loadActiveTimelineOnLoad,
+  ConsistencyGuardConfig consistencyGuardConfig, Option layoutVersion,
+  String payloadClassName, FileSystemRetryConfig fileSystemRetryConfig, Properties props) {
+HoodieMetastoreConfig metastoreConfig = null == props
+? new HoodieMetastoreConfig.Builder().build()
+: new HoodieMetastoreConfig.Builder().fromProperties(props).build();
+return metastoreConfig.enableMetastore()
+? (HoodieTableMetaClient) ReflectionUtils.loadClass("org.apache.hudi.common.table.HoodieTableMetastoreClient",

Review Comment:
   HoodieTableMetastoreClient is under `hudi-metastore`, so it has to use 
reflection here
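   For context, a minimal sketch of the pattern (the constructor arguments are placeholders; the diff above elides the real ones):

   ```java
   // hudi-common cannot depend on hudi-metastore, so the concrete class is
   // resolved by name at runtime via Hudi's ReflectionUtils.
   HoodieTableMetaClient client = (HoodieTableMetaClient) ReflectionUtils.loadClass(
       "org.apache.hudi.common.table.HoodieTableMetastoreClient",
       conf, basePath /* placeholder args, not the actual signature */);
   ```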



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-4023) de-couple spark from utilities bundle

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4023:
-
Sprint: 2022/05/16

> de-couple spark from utilities bundle
> -
>
> Key: HUDI-4023
> URL: https://issues.apache.org/jira/browse/HUDI-4023
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: dependencies
>Reporter: sivabalan narayanan
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.12.0
>
>
> We should be able to combine 
> utilities-slim + any of the spark/presto/trino/kafka bundles + any of the sync bundles. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-4052) Move AWS Cloudwatch metric configs to hudi-client-common instead of hudi-aws

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4052:
-
Sprint: 2022/05/16

> Move AWS Cloudwatch metric configs to hudi-client-common instead of hudi-aws
> 
>
> Key: HUDI-4052
> URL: https://issues.apache.org/jira/browse/HUDI-4052
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Major
>  Labels: client, pull-request-available, third-party
> Fix For: 0.11.1
>
>
> Currently, HoodieMetricsCloudWatchConfig is part of the hudi-aws module, while 
> HoodieMetricConfig is part of hudi-client-common. The builder of 
> HoodieMetricConfig uses HoodieMetricsCloudWatchConfig. When we want to use the 
> hudi write client in other engines (presto/trino), we need to include the 
> hudi-aws module because HoodieMetricsConfig initialization falls in that 
> path. But this bloats the presto/trino bundles, which we want to keep as 
> light as possible.
> We should try to decouple the two. Let all metric configs live in client-common 
> (e.g. the prometheus and datadog configs are already there). This way we can 
> exclude hudi-aws from the bundles; see the sketch below.
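A minimal sketch of the proposed placement (the key and default below are illustrative, not the actual CloudWatch config values):

{code:java}
// Hypothetical: the CloudWatch keys defined in hudi-client-common, so that
// building HoodieMetricsConfig no longer drags hudi-aws onto the classpath.
public class HoodieMetricsCloudWatchConfig extends HoodieConfig {
  public static final ConfigProperty<Integer> REPORT_PERIOD_SECONDS = ConfigProperty
      .key("hoodie.metrics.cloudwatch.report.period.seconds")
      .defaultValue(60)
      .withDocumentation("Reporting interval, in seconds, for the CloudWatch reporter");
}
{code}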



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-3958) Resolve parquet-avro conflict in hudi-gcp-bundle and hudi-spark3.1-bundle

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3958:
-
Sprint: 2022/05/16

> Resolve parquet-avro conflict in hudi-gcp-bundle and hudi-spark3.1-bundle
> -
>
> Key: HUDI-3958
> URL: https://issues.apache.org/jira/browse/HUDI-3958
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: dependencies
>Reporter: Raymond Xu
>Priority: Major
> Fix For: 0.12.0
>
>
> In the gcp bundle (master version) we include parquet-avro, which results in an 
> issue when running on dataproc 2.0.34-ubuntu18 with the spark3.1-bundle and 
> utilities-slim bundle:
> {code:text}
> 22/04/23 15:02:14 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 
> 0.0 in stage 36.0 (TID 93) 
> (cluster-4275-m.asia-southeast1-a.c.hudi-bq.internal executor 1): 
> java.lang.RuntimeException: org.apache.hudi.exception.HoodieException: 
> org.apache.hudi.exception.HoodieException: 
> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
> org/apache/parquet/schema/LogicalTypeAnnotation$UUIDLogicalTypeAnnotation
>   at 
> org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:121)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:46)
>   at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
>   at 
> org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
>   at 
> org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349)
>   at 
> org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1440)
>   at 
> org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1350)
>   at 
> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1414)
>   at 
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1237)
>   at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>   at org.apache.spark.scheduler.Task.run(Task.scala:131)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hudi.exception.HoodieException: 
> org.apache.hudi.exception.HoodieException: 
> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
> org/apache/parquet/schema/LogicalTypeAnnotation$UUIDLogicalTypeAnnotation
>   at 
> org.apache.hudi.execution.SparkLazyInsertIterable.computeNext(SparkLazyInsertIterable.java:94)
>   at 
> org.apache.hudi.execution.SparkLazyInsertIterable.computeNext(SparkLazyInsertIterable.java:37)
>   at 
> org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:119)
>   ... 22 more
> Caused by: org.apache.hudi.exception.HoodieException: 
> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
> org/apache/parquet/schema/LogicalTypeAnnotation$UUIDLogicalTypeAnnotation
>   at 
> org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:160)
>   at 
> org.apache.hudi.execution.SparkLazyInsertIterable.computeNext(SparkLazyInsertIterable.java:90)
>   ... 24 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.NoClassDefFoundError: 
> org/apache/parquet/schema/LogicalTypeAnnotation$UUIDLogicalTypeAnnotation
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:154)
>   ... 25 more
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/parquet/schema/LogicalTypeAnnotation$UUIDLogicalTypeAnnotation
>   at 
> org.apache.hudi.io.storage.HoodieFileWriterFactory.newParquetFileWriter(HoodieFileWriterFactory.java:78)
>   at 
> org.apache.hudi.io.storage.HoodieFileW

[jira] [Updated] (HUDI-4011) Add a Hudi AWS bundle

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4011:
-
Sprint: 2022/05/16

> Add a Hudi AWS bundle
> -
>
> Key: HUDI-4011
> URL: https://issues.apache.org/jira/browse/HUDI-4011
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Udit Mehrotra
>Assignee: Wenning Ding
>Priority: Major
> Fix For: 0.12.0
>
>
> As was raised in [https://github.com/apache/hudi/issues/5451,] the Hudi AWS 
> jars were moved out of hudi-spark-bundle. Hence, customers need to manually 
> pass jars like the DynamoDb lock client and the DynamoDb AWS SDK to be able to use 
> the DynamoDb lock provider implementation.
> We need an AWS-specific bundle that packages these dependencies, to make it 
> easier for customers. They can use this bundle along with hudi-spark-bundle 
> when they need to use the DynamoDb lock provider.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-3878) create hudi-aws-bundle to use with utilities-slim and spark/flink on aws

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3878:
-
Sprint: 2022/05/16

> create hudi-aws-bundle to use with utilities-slim and spark/flink on aws
> 
>
> Key: HUDI-3878
> URL: https://issues.apache.org/jira/browse/HUDI-3878
> Project: Apache Hudi
>  Issue Type: Task
>  Components: dependencies
>Reporter: Raymond Xu
>Priority: Blocker
> Fix For: 0.12.0
>
>
> Create a bundle with all AWS dependencies and make it work with utilities-slim and 
> an engine bundle (spark or flink).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] minihippo commented on a diff in pull request #5572: [HUDI-3654] Preparations for hudi metastore.

2022-05-15 Thread GitBox


minihippo commented on code in PR #5572:
URL: https://github.com/apache/hudi/pull/5572#discussion_r873370500


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java:
##
@@ -459,6 +459,9 @@ private Stream getCommitInstantsToArchive() {
 
   private Stream getInstantsToArchive() {
 Stream instants = Stream.concat(getCleanInstantsToArchive(), getCommitInstantsToArchive());
+if (config.isMetastoreEnabled()) {
+  return Stream.empty();

Review Comment:
   With the hudi metastore, the timeline no longer needs to be archived.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


wzx140 commented on code in PR #5522:
URL: https://github.com/apache/hudi/pull/5522#discussion_r873368687


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##
@@ -257,46 +255,45 @@ protected void init(String fileId, Iterator> newRecordsItr) {
 + ((ExternalSpillableMap) keyToNewRecords).getSizeOfFileOnDiskInBytes());
   }
 
-  private boolean writeUpdateRecord(HoodieRecord hoodieRecord, GenericRecord oldRecord, Option indexedRecord) {
+  private boolean writeUpdateRecord(HoodieRecord hoodieRecord, HoodieRecord oldRecord, Option combineRecordOp) throws IOException {
 boolean isDelete = false;
-if (indexedRecord.isPresent()) {
+Schema schema = useWriterSchemaForCompaction ? tableSchemaWithMetaFields : tableSchema;
+if (combineRecordOp.isPresent()) {
   updatedRecordsWritten++;
-  GenericRecord record = (GenericRecord) indexedRecord.get();
-  if (oldRecord != record) {
+  if (!oldRecord.equals(combineRecordOp.get())) {
 // the incoming record is chosen
 isDelete = HoodieOperation.isDelete(hoodieRecord.getOperation());
   }
 }
-return writeRecord(hoodieRecord, indexedRecord, isDelete);
+return writeRecord(hoodieRecord, combineRecordOp, schema, config.getProps(), isDelete);
   }
 
   protected void writeInsertRecord(HoodieRecord hoodieRecord) throws IOException {
 Schema schema = useWriterSchemaForCompaction ? tableSchemaWithMetaFields : tableSchema;
-Option insertRecord = hoodieRecord.getData().getInsertValue(schema, config.getProps());
 // just skip the ignored record
-if (insertRecord.isPresent() && insertRecord.get().equals(IGNORE_RECORD)) {
+if (hoodieRecord.shouldIgnore(schema, config.getProps())) {
   return;
 }
-if (writeRecord(hoodieRecord, insertRecord, HoodieOperation.isDelete(hoodieRecord.getOperation()))) {
+if (writeRecord(hoodieRecord, Option.of(hoodieRecord), schema, config.getProps(), HoodieOperation.isDelete(hoodieRecord.getOperation()))) {

Review Comment:
   Because the operation has nothing to do with the record type, it is unnecessary to add an isDelete API to the record.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


wzx140 commented on code in PR #5522:
URL: https://github.com/apache/hudi/pull/5522#discussion_r873366141


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##
@@ -257,46 +255,45 @@ protected void init(String fileId, Iterator> newRecordsItr) {
 + ((ExternalSpillableMap) keyToNewRecords).getSizeOfFileOnDiskInBytes());
   }
 
-  private boolean writeUpdateRecord(HoodieRecord hoodieRecord, GenericRecord oldRecord, Option indexedRecord) {
+  private boolean writeUpdateRecord(HoodieRecord hoodieRecord, HoodieRecord oldRecord, Option combineRecordOp) throws IOException {
 boolean isDelete = false;
-if (indexedRecord.isPresent()) {
+Schema schema = useWriterSchemaForCompaction ? tableSchemaWithMetaFields : tableSchema;
+if (combineRecordOp.isPresent()) {
   updatedRecordsWritten++;
-  GenericRecord record = (GenericRecord) indexedRecord.get();
-  if (oldRecord != record) {
+  if (!oldRecord.equals(combineRecordOp.get())) {

Review Comment:
   `oldRecord` cannot be null, because oldRecord is read directly from HoodieAvroFileReader.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-195) Bump jackson-databind to prevent deserialization loophole

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-195:

Sprint: 2022/05/16

> Bump jackson-databind to prevent deserialization loophole
> -
>
> Key: HUDI-195
> URL: https://issues.apache.org/jira/browse/HUDI-195
> Project: Apache Hudi
>  Issue Type: Task
>  Components: code-quality, writer-core
>Reporter: vinoyang
>Assignee: liujinhui
>Priority: Major
>
> At Tencent, we cannot use version 2.6.4 of 
> com.fasterxml.jackson.core:jackson-databind, because it contains a 
> deserialization vulnerability. The vulnerability is described here: 
> [https://www.cnvd.org.cn/flaw/show/CNVD-2017-04483] (unfortunately, it's a 
> Chinese web page).
> We recommend upgrading to 2.7.9.2, 2.8.11, or 2.9.4+.
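For context, a minimal sketch of the class of issue being avoided (general jackson-databind hardening, not Hudi code; that this flag is relevant to CNVD-2017-04483 is an assumption based on the well-known jackson deserialization CVEs):

{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;

public class JacksonHardeningDemo {
  public static void main(String[] args) {
    ObjectMapper mapper = new ObjectMapper();
    // The well-known jackson-databind deserialization flaws exploit
    // polymorphic "default typing"; it should stay disabled for untrusted
    // input. Newer releases also blocklist known gadget classes, which is
    // why the version bump matters.
    mapper.disableDefaultTyping();
  }
}
{code}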



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HUDI-195) Bump jackson-databind to prevent deserialization loophole

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-195:
---

Assignee: liujinhui  (was: vinoyang)

> Bump jackson-databind to prevent deserialization loophole
> -
>
> Key: HUDI-195
> URL: https://issues.apache.org/jira/browse/HUDI-195
> Project: Apache Hudi
>  Issue Type: Task
>  Components: code-quality, writer-core
>Reporter: vinoyang
>Assignee: liujinhui
>Priority: Major
>
> At Tencent, we cannot use version 2.6.4 of 
> com.fasterxml.jackson.core:jackson-databind, because it contains a 
> deserialization vulnerability. The vulnerability is described here: 
> [https://www.cnvd.org.cn/flaw/show/CNVD-2017-04483] (unfortunately, it's a 
> Chinese web page).
> We recommend upgrading to 2.7.9.2, 2.8.11, or 2.9.4+.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HUDI-1976) Upgrade hive, jackson, log4j, hadoop to remove vulnerability

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-1976:


Assignee: liujinhui  (was: Vinay)

> Upgrade hive, jackson, log4j, hadoop to remove vulnerability
> 
>
> Key: HUDI-1976
> URL: https://issues.apache.org/jira/browse/HUDI-1976
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: dependencies, hive
>Reporter: Nishith Agarwal
>Assignee: liujinhui
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> [https://github.com/apache/hudi/issues/2827]
> [https://github.com/apache/hudi/issues/2826]
> [https://github.com/apache/hudi/issues/2824]
> [https://github.com/apache/hudi/issues/2823]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-1976) Upgrade hive, jackson, log4j, hadoop to remove vulnerability

2022-05-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1976:
-
Sprint: Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25, 2022/05/16  (was: 
Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25)

> Upgrade hive, jackson, log4j, hadoop to remove vulnerability
> 
>
> Key: HUDI-1976
> URL: https://issues.apache.org/jira/browse/HUDI-1976
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: dependencies, hive
>Reporter: Nishith Agarwal
>Assignee: liujinhui
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> [https://github.com/apache/hudi/issues/2827]
> [https://github.com/apache/hudi/issues/2826]
> [https://github.com/apache/hudi/issues/2824]
> [https://github.com/apache/hudi/issues/2823]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] hudi-bot commented on pull request #5590: [HUDI-4101] BucketIndexPartitioner should take partition path for bet…

2022-05-15 Thread GitBox


hudi-bot commented on PR #5590:
URL: https://github.com/apache/hudi/pull/5590#issuecomment-1127263867

   
   ## CI report:
   
   * 3ce638e5bba2da5b47291cbe2ac6c35a0f542ee5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8673)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


hudi-bot commented on PR #5588:
URL: https://github.com/apache/hudi/pull/5588#issuecomment-1127263837

   
   ## CI report:
   
   * a94b8616a5515088ca59836f64124f3577aab4a9 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8669)
 
   * 6c7d529cec597e77fe0913af6f589a73203d9738 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8672)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5583: [HUDI-4098] Metadata table heartbeat for instant has expired, last he…

2022-05-15 Thread GitBox


hudi-bot commented on PR #5583:
URL: https://github.com/apache/hudi/pull/5583#issuecomment-1127263808

   
   ## CI report:
   
   * 9192e47e4069c54b8a305bd174b00f995c613fbb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8651)
 
   * 2ed49804a1a583d38eb00db77d7b091af1db8024 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8671)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] Consistent bucket index: bucket resizing (split&merge) & concurrent write during resizing

2022-05-15 Thread GitBox


hudi-bot commented on PR #4958:
URL: https://github.com/apache/hudi/pull/4958#issuecomment-1127263134

   
   ## CI report:
   
   * a88d737dcd685b0fcb027d8eb68c741124e826eb Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8523)
 
   * 33aace381fc3633b52e8321adeab29f47f7a24f8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8670)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] nleena123 commented on issue #5540: [SUPPORT]HoodieException: Commit 20220509105215 failed and rolled-back ! at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.

2022-05-15 Thread GitBox


nleena123 commented on issue #5540:
URL: https://github.com/apache/hudi/issues/5540#issuecomment-1127262001

   Hi @nsivabalan 
   The property file attached below contains all the configs that we used for this job,
   and we passed the arguments below to the Databricks job (we are running the hudi job 
through Azure Databricks).
   
   
[metrics.properties.txt](https://github.com/apache/hudi/files/8697581/metrics.properties.txt)
   
   
["--table-type","COPY_ON_WRITE","--source-ordering-field","CDC_TS","--source-class","com.optum.df.hudi.sources.DFAvroKafkaSource","--target-base-path","/mnt/ulp/dataassets-lake/metrics/","--target-table","metrics","--schemaprovider-class","org.apache.hudi.utilities.schema.SchemaRegistryProvider","--props","/mnt/ulp/artifacts/properties/metrics.properties"]
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5590: [HUDI-4101] BucketIndexPartitioner should take partition path for bet…

2022-05-15 Thread GitBox


hudi-bot commented on PR #5590:
URL: https://github.com/apache/hudi/pull/5590#issuecomment-1127261227

   
   ## CI report:
   
   * 3ce638e5bba2da5b47291cbe2ac6c35a0f542ee5 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


hudi-bot commented on PR #5588:
URL: https://github.com/apache/hudi/pull/5588#issuecomment-1127261183

   
   ## CI report:
   
   * a94b8616a5515088ca59836f64124f3577aab4a9 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8669)
 
   * 6c7d529cec597e77fe0913af6f589a73203d9738 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5583: [HUDI-4098] Metadata table heartbeat for instant has expired, last he…

2022-05-15 Thread GitBox


hudi-bot commented on PR #5583:
URL: https://github.com/apache/hudi/pull/5583#issuecomment-1127261141

   
   ## CI report:
   
   * 9192e47e4069c54b8a305bd174b00f995c613fbb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8651)
 
   * 2ed49804a1a583d38eb00db77d7b091af1db8024 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] Consistent bucket index: bucket resizing (split&merge) & concurrent write during resizing

2022-05-15 Thread GitBox


hudi-bot commented on PR #4958:
URL: https://github.com/apache/hudi/pull/4958#issuecomment-1127260542

   
   ## CI report:
   
   * a88d737dcd685b0fcb027d8eb68c741124e826eb Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8523)
 
   * 33aace381fc3633b52e8321adeab29f47f7a24f8 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


hudi-bot commented on PR #5588:
URL: https://github.com/apache/hudi/pull/5588#issuecomment-1127258788

   
   ## CI report:
   
   * a94b8616a5515088ca59836f64124f3577aab4a9 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8669)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] BalaMahesh commented on issue #5591: [SUPPORT] Hudi Error rolling back using marker files

2022-05-15 Thread GitBox


BalaMahesh commented on issue #5591:
URL: https://github.com/apache/hudi/issues/5591#issuecomment-1127252282

   @yihua - please do take a look at it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] BalaMahesh opened a new issue, #5591: [SUPPORT] Hudi Error rolling back using marker files

2022-05-15 Thread GitBox


BalaMahesh opened a new issue, #5591:
URL: https://github.com/apache/hudi/issues/5591

   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
   
   - Join the mailing list to engage in conversations and get faster support at 
dev-subscr...@hudi.apache.org.
   
   - If you have triaged this as a bug, then file an 
[issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   A clear and concise description of the problem.
   
   We recently upgraded hudi to 0.11.0 and the application had been running 
well for the last few days. Then, all of a sudden and for some unknown reason, marker files in 
the .hoodie/.temp directory were not created with the .marker extension; while trying to 
roll back this commit, the marker-file validation fails and the application stops. I have attached a screenshot 
of the marker files created inside the folder.
   
   Steps to reproduce the behavior:
   
   1. Start the application and keep it running. 
   2. We don't know exactly what stopped the application, but it started 
trying to roll back the failed commit and is failing to do that. 
   
   **Expected behavior**
   
   Run, and roll back the failed commits if the application has failed. 
   
   **Environment Description**
   
   * Hudi version : 0.11.0
   
   * Spark version : 3.2.1
   
   * Hive version : 2.1.3
   
   * Hadoop version : 3.x.x
   
   * Storage (HDFS/S3/GCS..) : GCS
   
   * Running on Docker? (yes/no) : yes(k8's).
   
   
   **Additional context**
   
   
   
   **Stacktrace**
   
   ```
   Caused by: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieRollbackException: Failed to rollback 
gs://xxx/hudi// commits 20220514180424825
at java.base/java.util.concurrent.CompletableFuture.reportGet(Unknown 
Source)
at java.base/java.util.concurrent.CompletableFuture.get(Unknown Source)
at 
org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
at 
org.apache.hudi.async.AsyncCleanerService.waitForCompletion(AsyncCleanerService.java:75)
... 11 more
   Caused by: org.apache.hudi.exception.HoodieRollbackException: Failed to 
rollback gs://xxx/hudi/xxx/ commits 20220514180424825
at 
org.apache.hudi.client.BaseHoodieWriteClient.rollback(BaseHoodieWriteClient.java:783)
at 
org.apache.hudi.client.BaseHoodieWriteClient.rollbackFailedWrites(BaseHoodieWriteClient.java:1193)
at 
org.apache.hudi.client.BaseHoodieWriteClient.rollbackFailedWrites(BaseHoodieWriteClient.java:1176)
at 
org.apache.hudi.client.BaseHoodieWriteClient.lambda$clean$33796fd2$1(BaseHoodieWriteClient.java:856)
at 
org.apache.hudi.common.util.CleanerUtils.rollbackFailedWrites(CleanerUtils.java:142)
at 
org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:855)
at 
org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:825)
at 
org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55)
... 4 more
   Caused by: org.apache.hudi.exception.HoodieRollbackException: Error rolling 
back using marker files written for [==>20220514180424825__commit__INFLIGHT]
at 
org.apache.hudi.table.action.rollback.MarkerBasedRollbackStrategy.getRollbackRequests(MarkerBasedRollbackStrategy.java:103)
at 
org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor.requestRollback(BaseRollbackPlanActionExecutor.java:109)
at 
org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor.execute(BaseRollbackPlanActionExecutor.java:132)
at 
org.apache.hudi.table.HoodieSparkCopyOnWriteTable.scheduleRollback(HoodieSparkCopyOnWriteTable.java:212)
at 
org.apache.hudi.client.BaseHoodieWriteClient.lambda$rollback$6(BaseHoodieWriteClient.java:757)
at org.apache.hudi.common.util.Option.orElseGet(Option.java:142)
at 
org.apache.hudi.client.BaseHoodieWriteClient.rollback(BaseHoodieWriteClient.java:757)
... 11 more
   Caused by: java.lang.IllegalArgumentException
at 
org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
at 
org.apache.hudi.common.util.MarkerUtils.stripMarkerFolderPrefix(MarkerUtils.java:67)
at 
org.apache.hudi.table.marker.DirectWriteMarkers.lambda$allMarkerFilePaths$0(DirectWriteMarkers.java:136)
at org.apache.hudi.common.fs.FSUtils.processFiles(FSUtils.java:277)
at 
org.apache.hudi.table.marker.DirectWriteMarkers.allMarkerFilePaths(DirectWriteMarkers.java:135)
at 
org.apache.hudi.table.marker.MarkerBasedRollbackUtils.getAllMarkerPaths(MarkerBasedRollbackUtils.java:62)
at 
org.apache.hudi.table.action.rollback.MarkerBasedRollbackStrategy.getRollbackRequests(MarkerBasedRollbackStrategy.java:76)
... 17 more
   
   ```
   
   https://user

[jira] [Updated] (HUDI-4101) BucketIndexPartitioner should take partition path for better dispersion

2022-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-4101:
-
Labels: pull-request-available  (was: )

> BucketIndexPartitioner should take partition path for better dispersion
> ---
>
> Key: HUDI-4101
> URL: https://issues.apache.org/jira/browse/HUDI-4101
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: core
>Reporter: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.1, 0.12.0
>
>
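Since the description is empty, here is a minimal illustrative sketch of the idea in the summary (the names and hash scheme are assumptions, not the actual patch): fold the partition path into the task index so that buckets with the same number, but from different partitions, disperse across write tasks:

{code:java}
// Hypothetical sketch: without the partition path, bucket N of every
// partition lands on the same task; mixing the path in spreads the load.
public static int taskIndexFor(String partitionPath, int bucketNumber, int numTasks) {
  int mixed = (partitionPath.hashCode() & Integer.MAX_VALUE) + bucketNumber;
  return (mixed & Integer.MAX_VALUE) % numTasks;
}
{code}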




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] danny0405 opened a new pull request, #5590: [HUDI-4101] BucketIndexPartitioner should take partition path for bet…

2022-05-15 Thread GitBox


danny0405 opened a new pull request, #5590:
URL: https://github.com/apache/hudi/pull/5590

   …ter dispersion
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before 
opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-4101) BucketIndexPartitioner should take partition path for better dispersion

2022-05-15 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-4101:
-
Summary: BucketIndexPartitioner should take partition path for better 
dispersion  (was: Bucket index hash algorithm should take partition path for 
better dispersion)

> BucketIndexPartitioner should take partition path for better dispersion
> ---
>
> Key: HUDI-4101
> URL: https://issues.apache.org/jira/browse/HUDI-4101
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: core
>Reporter: Danny Chen
>Priority: Major
> Fix For: 0.11.1, 0.12.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] jinxing64 commented on a diff in pull request #5564: [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

2022-05-15 Thread GitBox


jinxing64 commented on code in PR #5564:
URL: https://github.com/apache/hudi/pull/5564#discussion_r873331963


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/DropHoodieTableCommand.scala:
##
@@ -69,13 +67,13 @@ extends HoodieLeafRunnableCommand {
 val catalog = sparkSession.sessionState.catalog
 
 // Drop table in the catalog
-val enableHive = isEnableHive(sparkSession)
-if (enableHive) {
-  dropHiveDataSourceTable(sparkSession, hoodieCatalogTable)
+if (HoodieTableType.MERGE_ON_READ == hoodieCatalogTable.tableType && purge) {
+  val (rtTableOpt, roTableOpt) = getTableRTAndRO(catalog, hoodieCatalogTable)
+  rtTableOpt.foreach(table => catalog.dropTable(table.identifier, true, false))
+  roTableOpt.foreach(table => catalog.dropTable(table.identifier, true, false))
+  catalog.dropTable(table.identifier.copy(table = hoodieCatalogTable.tableName), ifExists, purge)
 } else {
-  if (catalog.tableExists(tableIdentifier)) {
-catalog.dropTable(tableIdentifier, ifExists, purge)
-  }
+  catalog.dropTable(table.identifier, ifExists, purge)

Review Comment:
   In the existing code, RT and RO tables are dropped only when purging a MOR 
table; this PR respects the current behavior. From my point of view, such 
logic makes sense and is acceptable for the user.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] jinxing64 commented on a diff in pull request #5564: [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

2022-05-15 Thread GitBox


jinxing64 commented on code in PR #5564:
URL: https://github.com/apache/hudi/pull/5564#discussion_r873331963


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/DropHoodieTableCommand.scala:
##
@@ -69,13 +67,13 @@ extends HoodieLeafRunnableCommand {
 val catalog = sparkSession.sessionState.catalog
 
 // Drop table in the catalog
-val enableHive = isEnableHive(sparkSession)
-if (enableHive) {
-  dropHiveDataSourceTable(sparkSession, hoodieCatalogTable)
+if (HoodieTableType.MERGE_ON_READ == hoodieCatalogTable.tableType && purge) {
+  val (rtTableOpt, roTableOpt) = getTableRTAndRO(catalog, hoodieCatalogTable)
+  rtTableOpt.foreach(table => catalog.dropTable(table.identifier, true, false))
+  roTableOpt.foreach(table => catalog.dropTable(table.identifier, true, false))
+  catalog.dropTable(table.identifier.copy(table = hoodieCatalogTable.tableName), ifExists, purge)
 } else {
-  if (catalog.tableExists(tableIdentifier)) {
-catalog.dropTable(tableIdentifier, ifExists, purge)
-  }
+  catalog.dropTable(table.identifier, ifExists, purge)

Review Comment:
   In the existing code, RT and RO tables are dropped only when purging a MOR table; this PR preserves that behavior.
   From my point of view, such logic makes sense and is acceptable for the user.






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] jinxing64 commented on pull request #5564: [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

2022-05-15 Thread GitBox


jinxing64 commented on PR #5564:
URL: https://github.com/apache/hudi/pull/5564#issuecomment-1127232082

   Thanks @XuQianJin-Stars for taking the time to review this PR ~
   
   > There is one more situation to consider here.
   The operation is as follows:
   Create the mor table hudi_table with sparksql; there are no ro and rt tables yet.
   Then flink writes data and triggers the creation of the ro and rt tables.
   At this point, the mor table has generated three tables:
   hudi_table, hudi_table_ro, hudi_table_rt.
   So we need to drop three tables.
   
   I think the current "Test Drop RO & RT table by purging base table." case in TestDropTable covers this scenario. Right?
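A minimal sketch of the three-table layout behind this scenario (a hypothetical helper, not the PR's code; the `_ro`/`_rt` suffixes follow Hudi's MOR naming convention):

```java
import java.util.Arrays;
import java.util.List;

public class MorTableDropHelper {

  // A MOR base table may have companion read-optimized (_ro) and
  // real-time (_rt) tables registered in the catalog; purging the
  // base table should remove all three.
  public static List<String> tablesToDrop(String baseTable) {
    return Arrays.asList(baseTable, baseTable + "_ro", baseTable + "_rt");
  }

  public static void main(String[] args) {
    // Prints [hudi_table, hudi_table_ro, hudi_table_rt]
    System.out.println(tablesToDrop("hudi_table"));
  }
}
```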
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (HUDI-3781) spark delete sql can't delete record

2022-05-15 Thread KnightChess (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KnightChess closed HUDI-3781.
-
Resolution: Fixed

> spark delete sql can't delete record
> 
>
> Key: HUDI-3781
> URL: https://issues.apache.org/jira/browse/HUDI-3781
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark, spark-sql
>Reporter: KnightChess
>Assignee: KnightChess
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>
> create a table and set *hoodie.datasource.write.operation* to upsert. 
> when I use _*sql*_ to delete, the delete *operation key* will be overwritten 
> by *hoodie.datasource.write.operation* from the table or env
>  
> {code:java}
> withSparkConf(sparkSession, hoodieCatalogTable.catalogProperties) {
>   Map(
> "path" -> path,
> RECORDKEY_FIELD.key -> hoodieCatalogTable.primaryKeys.mkString(","),
> TBL_NAME.key -> tableConfig.getTableName,
> HIVE_STYLE_PARTITIONING.key -> tableConfig.getHiveStylePartitioningEnable,
> URL_ENCODE_PARTITIONING.key -> tableConfig.getUrlEncodePartitioning,
> KEYGENERATOR_CLASS_NAME.key -> classOf[SqlKeyGenerator].getCanonicalName,
> SqlKeyGenerator.ORIGIN_KEYGEN_CLASS_NAME -> 
> tableConfig.getKeyGeneratorClassName,
> OPERATION.key -> DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL,
> PARTITIONPATH_FIELD.key -> tableConfig.getPartitionFieldProp,
> HiveSyncConfig.HIVE_SYNC_MODE.key -> HiveSyncMode.HMS.name(),
> HiveSyncConfig.HIVE_SUPPORT_TIMESTAMP_TYPE.key -> "true",
> HoodieWriteConfig.DELETE_PARALLELISM_VALUE.key -> "200",
> SqlKeyGenerator.PARTITION_SCHEMA -> partitionSchema.toDDL
>   )
> } {code}
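A minimal sketch of the precedence bug described above, using plain Java maps to stand in for Spark's option merging (the merge orders shown are illustrative assumptions, not the actual Hudi code):

```java
import java.util.HashMap;
import java.util.Map;

public class WriteOptionPrecedence {
  public static void main(String[] args) {
    Map<String, String> tableProps = new HashMap<>();
    tableProps.put("hoodie.datasource.write.operation", "upsert"); // from table/env

    Map<String, String> stmtOpts = new HashMap<>();
    stmtOpts.put("hoodie.datasource.write.operation", "delete");   // from the DELETE statement

    // Buggy merge order: table/env properties applied last override the statement.
    Map<String, String> buggy = new HashMap<>(stmtOpts);
    buggy.putAll(tableProps);
    System.out.println(buggy.get("hoodie.datasource.write.operation")); // upsert (wrong)

    // Fixed merge order: the statement's operation takes precedence.
    Map<String, String> fixed = new HashMap<>(tableProps);
    fixed.putAll(stmtOpts);
    System.out.println(fixed.get("hoodie.datasource.write.operation")); // delete
  }
}
```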



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-3781) spark delete sql can't delete record

2022-05-15 Thread KnightChess (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KnightChess updated HUDI-3781:
--
Fix Version/s: 0.11.0

> spark delete sql can't delete record
> 
>
> Key: HUDI-3781
> URL: https://issues.apache.org/jira/browse/HUDI-3781
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark, spark-sql
>Reporter: KnightChess
>Assignee: KnightChess
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>
> create a table and set *hoodie.datasource.write.operation* to upsert. 
> when I use _*sql*_ to delete, the delete *operation key* will be overwritten 
> by *hoodie.datasource.write.operation* from the table or env
>  
> {code:java}
> withSparkConf(sparkSession, hoodieCatalogTable.catalogProperties) {
>   Map(
> "path" -> path,
> RECORDKEY_FIELD.key -> hoodieCatalogTable.primaryKeys.mkString(","),
> TBL_NAME.key -> tableConfig.getTableName,
> HIVE_STYLE_PARTITIONING.key -> tableConfig.getHiveStylePartitioningEnable,
> URL_ENCODE_PARTITIONING.key -> tableConfig.getUrlEncodePartitioning,
> KEYGENERATOR_CLASS_NAME.key -> classOf[SqlKeyGenerator].getCanonicalName,
> SqlKeyGenerator.ORIGIN_KEYGEN_CLASS_NAME -> 
> tableConfig.getKeyGeneratorClassName,
> OPERATION.key -> DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL,
> PARTITIONPATH_FIELD.key -> tableConfig.getPartitionFieldProp,
> HiveSyncConfig.HIVE_SYNC_MODE.key -> HiveSyncMode.HMS.name(),
> HiveSyncConfig.HIVE_SUPPORT_TIMESTAMP_TYPE.key -> "true",
> HoodieWriteConfig.DELETE_PARALLELISM_VALUE.key -> "200",
> SqlKeyGenerator.PARTITION_SCHEMA -> partitionSchema.toDDL
>   )
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] YuweiXiao commented on a diff in pull request #4480: [HUDI-3123] consistent hashing index: basic write path (upsert/insert)

2022-05-15 Thread GitBox


YuweiXiao commented on code in PR #4480:
URL: https://github.com/apache/hudi/pull/4480#discussion_r873322299


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/BucketIndexLocationMapper.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.index.bucket;
+
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecordLocation;
+import org.apache.hudi.common.util.Option;
+
+import java.io.Serializable;
+
+public interface BucketIndexLocationMapper extends Serializable {
+
+  /**
+   * Get record location given hoodie key and partition path
+   */
+  Option<HoodieRecordLocation> getRecordLocation(HoodieKey key, String partitionPath);
+

Review Comment:
   Hey Danny, I am also planning to introduce this consistent hashing index feature into the hudi flink engine, in order to support a dynamic bucket number. Does the flink engine currently have any roadmap for 'dynamic bucket num'?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] YuweiXiao commented on a diff in pull request #4480: [HUDI-3123] consistent hashing index: basic write path (upsert/insert)

2022-05-15 Thread GitBox


YuweiXiao commented on code in PR #4480:
URL: https://github.com/apache/hudi/pull/4480#discussion_r873317466


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/BucketIndexLocationMapper.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.index.bucket;
+
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecordLocation;
+import org.apache.hudi.common.util.Option;
+
+import java.io.Serializable;
+
+public interface BucketIndexLocationMapper extends Serializable {
+
+  /**
+   * Get record location given hoodie key and partition path
+   */
+  Option<HoodieRecordLocation> getRecordLocation(HoodieKey key, String partitionPath);
+

Review Comment:
   Ah, you are right. I will simplify the interface in the follow-up PR.
   
   ps. the partition path here is used to fetch the hashing metadata for the given partition.
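A minimal sketch (hypothetical names, not Hudi's API) of the per-partition metadata lookup that the extra `partitionPath` parameter enables:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerPartitionHashingMetadata {
  // Hashing metadata (the bucket nodes of the ring) is maintained per partition,
  // so it is cached keyed by partition path.
  private final Map<String, int[]> nodeHashesByPartition = new ConcurrentHashMap<>();

  public int[] getOrLoad(String partitionPath) {
    return nodeHashesByPartition.computeIfAbsent(partitionPath, this::loadFromStorage);
  }

  private int[] loadFromStorage(String partitionPath) {
    // In Hudi this would read the persisted consistent-hashing metadata for the
    // partition; here a fixed ring is returned purely for illustration.
    return new int[] {Integer.MIN_VALUE / 2, 0, Integer.MAX_VALUE / 2};
  }
}
```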



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] danny0405 commented on a diff in pull request #4480: [HUDI-3123] consistent hashing index: basic write path (upsert/insert)

2022-05-15 Thread GitBox


danny0405 commented on code in PR #4480:
URL: https://github.com/apache/hudi/pull/4480#discussion_r873316153


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/BucketIndexLocationMapper.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.index.bucket;
+
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecordLocation;
+import org.apache.hudi.common.util.Option;
+
+import java.io.Serializable;
+
+public interface BucketIndexLocationMapper extends Serializable {
+
+  /**
+   * Get record location given hoodie key and partition path
+   */
+  Option<HoodieRecordLocation> getRecordLocation(HoodieKey key, String partitionPath);
+

Review Comment:
   Hey, the `HoodieKey` already contains a partition path field. Is there any possibility that the `partitionPath` does not equal the partition path of the `HoodieKey`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-4025) Add support to validate presto, trino and hive queries in integ test framework

2022-05-15 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-4025:
--
Status: In Progress  (was: Open)

> Add support to validate presto, trino and hive queries in integ test framework
> --
>
> Key: HUDI-4025
> URL: https://issues.apache.org/jira/browse/HUDI-4025
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: tests-ci
>Reporter: sivabalan narayanan
>Assignee: Sagar Sumit
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> hive sync should also be supported via this enhancement



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] codope commented on a diff in pull request #5560: [HUDI-2673] Add kafka connect sink in docker demo

2022-05-15 Thread GitBox


codope commented on code in PR #5560:
URL: https://github.com/apache/hudi/pull/5560#discussion_r873312838


##
docker/hoodie/kafka_connect/pom.xml:
##
@@ -0,0 +1,122 @@
+
+
+
+http://maven.apache.org/POM/4.0.0";
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
+  
+hudi
+org.apache.hudi
+0.12.0-SNAPSHOT
+../../../pom.xml
+  
+  4.0.0
+  pom
+  hudi-kafka-connect-docker
+  Hoodie Kafka Connect Docker
+
+  
+
+  org.apache.hudi
+  hudi-kafka-connect-bundle

Review Comment:
   Got it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


wzx140 commented on code in PR #5522:
URL: https://github.com/apache/hudi/pull/5522#discussion_r873311346


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java:
##
@@ -180,18 +176,14 @@ public void write() {
 } else {
   keyIterator = recordMap.keySet().stream().iterator();
 }
-try {
-  while (keyIterator.hasNext()) {
-final String key = keyIterator.next();
-HoodieRecord record = recordMap.get(key);
-if (useWriterSchema) {
-  write(record, 
record.getData().getInsertValue(tableSchemaWithMetaFields, config.getProps()));
-} else {
-  write(record, record.getData().getInsertValue(tableSchema, 
config.getProps()));
-}
+while (keyIterator.hasNext()) {
+  final String key = keyIterator.next();
+  HoodieRecord record = recordMap.get(key);
+  if (useWriterSchema) {
+write(record, tableSchemaWithMetaFields, config.getProps());
+  } else {
+write(record, tableSchema, config.getProps());

Review Comment:
   Do you mean `write(record, useWriterSchema ? tableSchemaWithMetaFields : tableSchema, config.getProps());`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


hudi-bot commented on PR #5588:
URL: https://github.com/apache/hudi/pull/5588#issuecomment-1127199004

   
   ## CI report:
   
   * a94b8616a5515088ca59836f64124f3577aab4a9 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8669)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] LinMingQiang commented on issue #5589: [SUPPORT] Optimization of bucket index in Flink

2022-05-15 Thread GitBox


LinMingQiang commented on issue #5589:
URL: https://github.com/apache/hudi/issues/5589#issuecomment-1127197963

   https://issues.apache.org/jira/browse/HUDI-4102


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] LinMingQiang opened a new issue, #5589: [SUPPORT] Optimization of bucket index in Flink

2022-05-15 Thread GitBox


LinMingQiang opened a new issue, #5589:
URL: https://github.com/apache/hudi/issues/5589

   The write.task value can only be less than or equal to the bucket num value: even if it is set higher, write efficiency cannot improve. This is because the bucket index does not take the partition path into account in partitionCustom.
   
   Hudi version: master
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


hudi-bot commented on PR #5588:
URL: https://github.com/apache/hudi/pull/5588#issuecomment-1127196510

   
   ## CI report:
   
   * a94b8616a5515088ca59836f64124f3577aab4a9 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


hudi-bot commented on PR #5522:
URL: https://github.com/apache/hudi/pull/5522#issuecomment-1127196379

   
   ## CI report:
   
   * 986960516f86a1426725141cd7cb25e84d260020 UNKNOWN
   * c5fb81a0b229ded9a2b925790366b62f1bf7ade9 UNKNOWN
   * 07029886584651cb81f767b7bc8c5a310f47d400 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8668)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (HUDI-4102) Optimization of bucket index in Flink

2022-05-15 Thread HunterHunter (Jira)
HunterHunter created HUDI-4102:
--

 Summary: Optimization of bucket index in Flink
 Key: HUDI-4102
 URL: https://issues.apache.org/jira/browse/HUDI-4102
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: HunterHunter
 Fix For: 0.12.0
 Attachments: write.task=80,bucket=1.png

The write.task value can only be less than or equal to the bucket num value: even if it is set higher, write efficiency cannot improve. This is because the bucket index does not take the partition path into account in partitionCustom.
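A minimal sketch (hypothetical names, not the actual Hudi implementation) of how a partitioner can mix the partition path into the task assignment so that writes spread across more than bucketNum tasks:

```java
import java.util.Objects;

public class PartitionAwareBucketPartitioner {

  private final int numTasks; // parallelism of the write operator

  public PartitionAwareBucketPartitioner(int numTasks) {
    this.numTasks = numTasks;
  }

  // Old behavior: the task depends on the bucket id only, so bucket i of every
  // partition lands on the same task and at most bucketNum tasks are ever used.
  public int taskByBucketOnly(int bucketId) {
    return bucketId % numTasks;
  }

  // New behavior: hashing (partitionPath, bucketId) disperses buckets of
  // different partitions across all write tasks.
  public int taskByPartitionAndBucket(String partitionPath, int bucketId) {
    int mixed = Objects.hash(partitionPath, bucketId);
    return Math.floorMod(mixed, numTasks); // floorMod guards against negative hashes
  }
}
```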



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] hudi-bot commented on pull request #5532: [HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource table

2022-05-15 Thread GitBox


hudi-bot commented on PR #5532:
URL: https://github.com/apache/hudi/pull/5532#issuecomment-1127194188

   
   ## CI report:
   
   * be25641540231b290306ef9195faedd46106175e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8666)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


wzx140 commented on code in PR #5522:
URL: https://github.com/apache/hudi/pull/5522#discussion_r873308062


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -201,33 +202,24 @@ protected boolean isUpdateRecord(HoodieRecord hoodieRecord) {
     return hoodieRecord.getCurrentLocation() != null;
   }
 
-  private Option<IndexedRecord> getIndexedRecord(HoodieRecord hoodieRecord) {
-    Option<Map<String, String>> recordMetadata = hoodieRecord.getData().getMetadata();
+  private Option<HoodieRecord> prepareRecord(HoodieRecord hoodieRecord) {
+    Option<Map<String, String>> recordMetadata = hoodieRecord.getMetadata();
     try {
       // Pass the isUpdateRecord to the props for HoodieRecordPayload to judge
       // Whether it is an update or insert record.
       boolean isUpdateRecord = isUpdateRecord(hoodieRecord);
       // If the format can not record the operation field, nullify the DELETE payload manually.
       boolean nullifyPayload = HoodieOperation.isDelete(hoodieRecord.getOperation()) && !config.allowOperationMetadataField();
       recordProperties.put(HoodiePayloadProps.PAYLOAD_IS_UPDATE_RECORD_FOR_MOR, String.valueOf(isUpdateRecord));
-      Option<IndexedRecord> avroRecord = nullifyPayload ? Option.empty() : hoodieRecord.getData().getInsertValue(tableSchema, recordProperties);
-      if (avroRecord.isPresent()) {
-        if (avroRecord.get().equals(IGNORE_RECORD)) {
-          return avroRecord;
+      Option<HoodieRecord> finalRecord = Option.empty();

Review Comment:
   finalRecord is returned at L238. If finalRecord is not assigned the value of populatedRecord (L222), it is returned as empty (L214).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #5564: [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

2022-05-15 Thread GitBox


XuQianJin-Stars commented on code in PR #5564:
URL: https://github.com/apache/hudi/pull/5564#discussion_r873305945


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/DropHoodieTableCommand.scala:
##
@@ -69,13 +67,13 @@ extends HoodieLeafRunnableCommand {
     val catalog = sparkSession.sessionState.catalog
 
     // Drop table in the catalog
-    val enableHive = isEnableHive(sparkSession)
-    if (enableHive) {
-      dropHiveDataSourceTable(sparkSession, hoodieCatalogTable)
+    if (HoodieTableType.MERGE_ON_READ == hoodieCatalogTable.tableType && purge) {
+      val (rtTableOpt, roTableOpt) = getTableRTAndRO(catalog, hoodieCatalogTable)
+      rtTableOpt.foreach(table => catalog.dropTable(table.identifier, true, false))
+      roTableOpt.foreach(table => catalog.dropTable(table.identifier, true, false))
+      catalog.dropTable(table.identifier.copy(table = hoodieCatalogTable.tableName), ifExists, purge)
     } else {
-      if (catalog.tableExists(tableIdentifier)) {
-        catalog.dropTable(tableIdentifier, ifExists, purge)
-      }
+      catalog.dropTable(table.identifier, ifExists, purge)

Review Comment:
   When `purge` is false and `HoodieTableType.MERGE_ON_READ == hoodieCatalogTable.tableType`, don't we need to drop the `ro` and `rt` tables?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (HUDI-4101) Bucket index hash algorithm should take partition path for better dispersion

2022-05-15 Thread Danny Chen (Jira)
Danny Chen created HUDI-4101:


 Summary: Bucket index hash algorithm should take partition path 
for better dispersion
 Key: HUDI-4101
 URL: https://issues.apache.org/jira/browse/HUDI-4101
 Project: Apache Hudi
  Issue Type: Improvement
  Components: core
Reporter: Danny Chen
 Fix For: 0.11.1, 0.12.0






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] xushiyan commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


xushiyan commented on code in PR #5522:
URL: https://github.com/apache/hudi/pull/5522#discussion_r873267889


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/execution/CopyOnWriteInsertHandler.java:
##
@@ -69,8 +69,8 @@ public CopyOnWriteInsertHandler(HoodieWriteConfig config, String instantTime,
 
   @Override
   public void consumeOneRecord(HoodieInsertValueGenResult<HoodieRecord> payload) {

Review Comment:
   /nit renaming to avoid confusion
   
   ```suggestion
   public void consumeOneRecord(HoodieInsertValueGenResult<HoodieRecord> genResult) {
   ```



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -201,33 +202,24 @@ protected boolean isUpdateRecord(HoodieRecord hoodieRecord) {
     return hoodieRecord.getCurrentLocation() != null;
   }
 
-  private Option<IndexedRecord> getIndexedRecord(HoodieRecord hoodieRecord) {
-    Option<Map<String, String>> recordMetadata = hoodieRecord.getData().getMetadata();
+  private Option<HoodieRecord> prepareRecord(HoodieRecord hoodieRecord) {
+    Option<Map<String, String>> recordMetadata = hoodieRecord.getMetadata();
     try {
       // Pass the isUpdateRecord to the props for HoodieRecordPayload to judge
       // Whether it is an update or insert record.
       boolean isUpdateRecord = isUpdateRecord(hoodieRecord);
       // If the format can not record the operation field, nullify the DELETE payload manually.
       boolean nullifyPayload = HoodieOperation.isDelete(hoodieRecord.getOperation()) && !config.allowOperationMetadataField();
       recordProperties.put(HoodiePayloadProps.PAYLOAD_IS_UPDATE_RECORD_FOR_MOR, String.valueOf(isUpdateRecord));
-      Option<IndexedRecord> avroRecord = nullifyPayload ? Option.empty() : hoodieRecord.getData().getInsertValue(tableSchema, recordProperties);
-      if (avroRecord.isPresent()) {
-        if (avroRecord.get().equals(IGNORE_RECORD)) {
-          return avroRecord;
+      Option<HoodieRecord> finalRecord = Option.empty();

Review Comment:
   no need to init `finalRecord` here. it's only used in L222



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##
@@ -257,46 +255,45 @@ protected void init(String fileId, Iterator<HoodieRecord> newRecordsItr) {
         + ((ExternalSpillableMap) keyToNewRecords).getSizeOfFileOnDiskInBytes());
   }
 
-  private boolean writeUpdateRecord(HoodieRecord hoodieRecord, GenericRecord oldRecord, Option<IndexedRecord> indexedRecord) {
+  private boolean writeUpdateRecord(HoodieRecord hoodieRecord, HoodieRecord oldRecord, Option<HoodieRecord> combineRecordOp) throws IOException {
     boolean isDelete = false;
-    if (indexedRecord.isPresent()) {
+    Schema schema = useWriterSchemaForCompaction ? tableSchemaWithMetaFields : tableSchema;
+    if (combineRecordOp.isPresent()) {
       updatedRecordsWritten++;
-      GenericRecord record = (GenericRecord) indexedRecord.get();
-      if (oldRecord != record) {
+      if (!oldRecord.equals(combineRecordOp.get())) {
         // the incoming record is chosen
         isDelete = HoodieOperation.isDelete(hoodieRecord.getOperation());
       }
     }
-    return writeRecord(hoodieRecord, indexedRecord, isDelete);
+    return writeRecord(hoodieRecord, combineRecordOp, schema, config.getProps(), isDelete);
   }
 
   protected void writeInsertRecord(HoodieRecord hoodieRecord) throws IOException {
     Schema schema = useWriterSchemaForCompaction ? tableSchemaWithMetaFields : tableSchema;
-    Option<IndexedRecord> insertRecord = hoodieRecord.getData().getInsertValue(schema, config.getProps());
     // just skip the ignored record
-    if (insertRecord.isPresent() && insertRecord.get().equals(IGNORE_RECORD)) {
+    if (hoodieRecord.shouldIgnore(schema, config.getProps())) {
       return;
     }
-    if (writeRecord(hoodieRecord, insertRecord, HoodieOperation.isDelete(hoodieRecord.getOperation()))) {
+    if (writeRecord(hoodieRecord, Option.of(hoodieRecord), schema, config.getProps(), HoodieOperation.isDelete(hoodieRecord.getOperation()))) {

Review Comment:
   maybe `isDelete()` can be an API too?



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -502,12 +521,16 @@ private void writeToBuffer(HoodieRecord record) {
       record.seal();
     }
     // fetch the ordering val first in case the record was deflated.
-    final Comparable orderingVal = record.getData().getOrderingValue();
-    Option<IndexedRecord> indexedRecord = getIndexedRecord(record);
+    final Comparable orderingVal = ((HoodieRecordPayload) record.getData()).getOrderingValue();
Review Comment:
   should we have an API like `getOrderingValue()` ?



##
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieAvroRecord.java:
##
@@ -47,4 +69,170 @@ public T getData() {
     }
     return data;
   }
+
+  @Override
+  public String getRecordKey(Option<BaseKeyGenerator> keyGeneratorOpt) {
+    return getRecordKey();
+  }
+
+  // TODO remove
+  public Option<IndexedRecord> asAvro(Schema schema) throws IOException {
+    return getData().getInser

[GitHub] [hudi] jinxing64 commented on a diff in pull request #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


jinxing64 commented on code in PR #5588:
URL: https://github.com/apache/hudi/pull/5588#discussion_r873303190


##
hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/hudi/catalog/HoodieCatalog.scala:
##
@@ -192,8 +200,31 @@ class HoodieCatalog extends DelegatingCatalogExtension
 loadTable(ident)
   }
 
+  def deduceTableLocationURI(

Review Comment:
   All the logic in deduceTableLocationURI is moved from the previous HoodieCatalog#createHoodieTable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] lihuahui5683 closed issue #5382: [SUPPORT] org.apache.hudi.hadoop.hive.HoodieCombineRealtimeFileSplit cannot be cast to org.apache.hadoop.hive.shims.HadoopShimsSecure$InputSplitShim

2022-05-15 Thread GitBox


lihuahui5683 closed issue #5382: [SUPPORT] 
org.apache.hudi.hadoop.hive.HoodieCombineRealtimeFileSplit cannot be cast to 
org.apache.hadoop.hive.shims.HadoopShimsSecure$InputSplitShim
URL: https://github.com/apache/hudi/issues/5382


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-4100) CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-4100:
-
Labels: pull-request-available  (was: )

> CTAS failed to clean up when given an illegal MANAGED table definition
> --
>
> Key: HUDI-4100
> URL: https://issues.apache.org/jira/browse/HUDI-4100
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Jin Xing
>Priority: Major
>  Labels: pull-request-available
>
> Current HoodieStagedTable#abortStagedChanges cleans up data path by the table 
> property of location, which doesn't work for a MANAGED table



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] jinxing64 opened a new pull request, #5588: [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread GitBox


jinxing64 opened a new pull request, #5588:
URL: https://github.com/apache/hudi/pull/5588

   ## What is the purpose of the pull request
   
   The current HoodieStagedTable#abortStagedChanges cleans up the data path using the table's `location` property, which doesn't work for a MANAGED table. This PR proposes passing the table path to HoodieStagedTable so that cleanup is done properly.
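A minimal sketch of the proposed fix, assuming the catalog deduces the table path up front and hands it to the staged table (a simplified, hypothetical class, not the PR's actual code):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StagedTableCleanup {
  // Deduced by the catalog at creation time and passed in explicitly, so abort
  // can clean up even for a MANAGED table with no "location" property.
  private final Path tablePath;

  public StagedTableCleanup(Path tablePath) {
    this.tablePath = tablePath;
  }

  public void abortStagedChanges(Configuration hadoopConf) throws IOException {
    FileSystem fs = tablePath.getFileSystem(hadoopConf);
    if (fs.exists(tablePath)) {
      fs.delete(tablePath, true); // recursive: remove data written before the failure
    }
  }
}
```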
   
   ## Brief change log
   
     - *HoodieCatalog adds a function to deduce the table path and pass it to HoodieStagedTable*
   
   ## Verify this pull request
   
     - *Added test*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-4100) CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing updated HUDI-4100:
---
Description: Current HoodieStagedTable#abortStagedChanges cleans up data 
path by the table property of location, which doesn't work for a MANAGED table

> CTAS failed to clean up when given an illegal MANAGED table definition
> --
>
> Key: HUDI-4100
> URL: https://issues.apache.org/jira/browse/HUDI-4100
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Jin Xing
>Priority: Major
>
> Current HoodieStagedTable#abortStagedChanges cleans up data path by the table 
> property of location, which doesn't work for a MANAGED table



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HUDI-4100) CTAS failed to clean up when given an illegal MANAGED table definition

2022-05-15 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing updated HUDI-4100:
---
Summary: CTAS failed to clean up when given an illegal MANAGED table 
definition  (was: CTAS failed to clean up when given an illegal table 
definition)

> CTAS failed to clean up when given an illegal MANAGED table definition
> --
>
> Key: HUDI-4100
> URL: https://issues.apache.org/jira/browse/HUDI-4100
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Jin Xing
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HUDI-4100) CTAS failed to clean up when given an illegal table definition

2022-05-15 Thread Jin Xing (Jira)
Jin Xing created HUDI-4100:
--

 Summary: CTAS failed to clean up when given an illegal table 
definition
 Key: HUDI-4100
 URL: https://issues.apache.org/jira/browse/HUDI-4100
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Jin Xing






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [hudi] hudi-bot commented on pull request #5583: [HUDI-4098] Metadata table heartbeat for instant has expired, last he…

2022-05-15 Thread GitBox


hudi-bot commented on PR #5583:
URL: https://github.com/apache/hudi/pull/5583#issuecomment-1127174807

   
   ## CI report:
   
   * 9192e47e4069c54b8a305bd174b00f995c613fbb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8651)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5583: [HUDI-4098] Metadata table heartbeat for instant has expired, last he…

2022-05-15 Thread GitBox


hudi-bot commented on PR #5583:
URL: https://github.com/apache/hudi/pull/5583#issuecomment-1127172841

   
   ## CI report:
   
   * 9192e47e4069c54b8a305bd174b00f995c613fbb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8651)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] danny0405 commented on pull request #5583: [HUDI-4098] Metadata table heartbeat for instant has expired, last he…

2022-05-15 Thread GitBox


danny0405 commented on PR #5583:
URL: https://github.com/apache/hudi/pull/5583#issuecomment-1127171825

   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


hudi-bot commented on PR #5522:
URL: https://github.com/apache/hudi/pull/5522#issuecomment-1127169092

   
   ## CI report:
   
   * 986960516f86a1426725141cd7cb25e84d260020 UNKNOWN
   * c5fb81a0b229ded9a2b925790366b62f1bf7ade9 UNKNOWN
   * f0e8e71d9c2891115daa6bbef5b40231c7a76460 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8591)
 
   * 07029886584651cb81f767b7bc8c5a310f47d400 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8668)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


hudi-bot commented on PR #5522:
URL: https://github.com/apache/hudi/pull/5522#issuecomment-1127167426

   
   ## CI report:
   
   * 986960516f86a1426725141cd7cb25e84d260020 UNKNOWN
   * c5fb81a0b229ded9a2b925790366b62f1bf7ade9 UNKNOWN
   * f0e8e71d9c2891115daa6bbef5b40231c7a76460 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8591)
 
   * 07029886584651cb81f767b7bc8c5a310f47d400 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[hudi] branch master updated: [HUDI-3123] consistent hashing index: basic write path (upsert/insert) (#4480)

2022-05-15 Thread leesf
This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 61030d8e7a [HUDI-3123] consistent hashing index: basic write path 
(upsert/insert) (#4480)
61030d8e7a is described below

commit 61030d8e7a5a05e215efed672267ac163b0cbcf6
Author: Yuwei XIAO 
AuthorDate: Mon May 16 11:07:01 2022 +0800

[HUDI-3123] consistent hashing index: basic write path (upsert/insert) 
(#4480)

 1. basic write path(insert/upsert) implementation
 2. adapt simple bucket index
---
 .../hudi/client/utils/LazyIterableIterator.java|   4 +-
 .../org/apache/hudi/config/HoodieIndexConfig.java  |  46 +++-
 .../org/apache/hudi/config/HoodieWriteConfig.java  |   4 +
 .../java/org/apache/hudi/index/HoodieIndex.java|   6 +-
 .../apache/hudi/index/bucket/BucketIdentifier.java |  49 ++--
 .../index/bucket/BucketIndexLocationMapper.java|  35 +++
 .../index/bucket/ConsistentBucketIdentifier.java   | 104 +
 .../hudi/index/bucket/HoodieBucketIndex.java   | 119 --
 .../hudi/index/bucket/HoodieSimpleBucketIndex.java |  99 
 .../org/apache/hudi/io/WriteHandleFactory.java |   3 +-
 .../action/commit/BaseCommitActionExecutor.java|   2 +-
 ...yout.java => HoodieConsistentBucketLayout.java} |  44 ++--
 .../hudi/table/storage/HoodieDefaultLayout.java|   7 +-
 .../hudi/table/storage/HoodieLayoutFactory.java|   9 +-
 ...etLayout.java => HoodieSimpleBucketLayout.java} |  32 +--
 .../hudi/table/storage/HoodieStorageLayout.java|   2 +-
 .../hudi/index/bucket/TestBucketIdentifier.java| 122 ++
 .../bucket/TestConsistentBucketIdIdentifier.java   |  79 +++
 .../apache/hudi/client/SparkRDDWriteClient.java|   2 +-
 .../java/org/apache/hudi/data/HoodieJavaRDD.java   |   5 +
 .../apache/hudi/index/SparkHoodieIndexFactory.java |  16 +-
 .../bucket/HoodieSparkConsistentBucketIndex.java   | 210 +
 .../functional/TestConsistentBucketIndex.java  | 250 +
 .../hudi/client/functional/TestHoodieIndex.java|   3 +
 .../apache/hudi/index/TestHoodieIndexConfigs.java  |  14 +-
 ...Index.java => TestHoodieSimpleBucketIndex.java} |  17 +-
 .../commit/TestCopyOnWriteActionExecutor.java  |   5 +-
 .../org/apache/hudi/common/data/HoodieData.java|   9 +
 .../org/apache/hudi/common/data/HoodieList.java|   5 +
 .../java/org/apache/hudi/common/fs/FSUtils.java|   4 +
 .../hudi/common/model/ConsistentHashingNode.java   |  78 +++
 .../hudi/common/model/HoodieCommitMetadata.java|  23 +-
 .../model/HoodieConsistentHashingMetadata.java | 142 
 .../common/model/HoodieReplaceCommitMetadata.java  |  17 +-
 .../common/model/HoodieRollingStatMetadata.java|   4 +-
 .../hudi/common/table/HoodieTableMetaClient.java   |   8 +
 .../org/apache/hudi/common/util/JsonUtils.java |  38 
 .../org/apache/hudi/common/util/hash/HashID.java   |   9 +
 .../model/TestHoodieConsistentHashingMetadata.java |  25 +--
 .../common/testutils/HoodieCommonTestHarness.java  |   4 +
 .../hudi/index/bucket/TestBucketIdentifier.java|  67 --
 41 files changed, 1444 insertions(+), 277 deletions(-)

diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/LazyIterableIterator.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/LazyIterableIterator.java
index 020944e7ab..ad54f8c0a0 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/LazyIterableIterator.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/LazyIterableIterator.java
@@ -45,7 +45,7 @@ public abstract class LazyIterableIterator<I, O> implements Iterable<O>, Iterator<O> {
   /**
    * Called once, before any elements are processed.
    */
-  protected abstract void start();
+  protected void start() {}
 
   /**
    * Block computation to be overwritten by sub classes.
@@ -55,7 +55,7 @@ public abstract class LazyIterableIterator<I, O> implements Iterable<O>, Iterator<O> {
   /**
    * Called once, after all elements are processed.
    */
-  protected abstract void end();
+  protected void end() {}
 
   //
   // iterable implementation
diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
index 7c1f7e00e7..dbd45b9738 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
@@ -216,19 +216,40 @@ public class HoodieIndexConfig extends HoodieConfig {
   /**
    * * Bucket Index Configs *
    * Bucket Index is targeted to locate the record fast by hash in big data scenarios.
-   * The cur
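For readers following this feature, a minimal sketch of the consistent-hashing lookup idea behind the new ConsistentBucketIdentifier (simplified and hypothetical, not the committed code):

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentBucketRing {
  // Each node owns the hash range ending at its value; assumes at least one node.
  private final TreeMap<Integer, String> ring = new TreeMap<>();

  public void addBucket(int nodeHash, String fileGroupId) {
    ring.put(nodeHash, fileGroupId);
  }

  // A key maps to the first node whose hash is >= the key's hash, wrapping
  // around to the smallest node. Splitting one bucket only remaps keys in its
  // range, unlike modulo hashing where changing the bucket count remaps nearly
  // every key.
  public String getBucket(int keyHash) {
    SortedMap<Integer, String> tail = ring.tailMap(keyHash);
    return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
  }
}
```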

[GitHub] [hudi] leesf merged pull request #4480: [HUDI-3123] consistent hashing index: basic write path (upsert/insert)

2022-05-15 Thread GitBox


leesf merged PR #4480:
URL: https://github.com/apache/hudi/pull/4480


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] wzx140 commented on pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


wzx140 commented on PR #5522:
URL: https://github.com/apache/hudi/pull/5522#issuecomment-1127158433

   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] BruceKellan closed issue #5561: [SUPPORT] [Metadata table] Metadata table heartbeat for instant has expired, last heartbeat 0

2022-05-15 Thread GitBox


BruceKellan closed issue #5561: [SUPPORT] [Metadata table] Metadata table 
heartbeat for instant has expired, last heartbeat 0
URL: https://github.com/apache/hudi/issues/5561


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] BruceKellan commented on issue #5561: [SUPPORT] [Metadata table] Metadata table heartbeat for instant has expired, last heartbeat 0

2022-05-15 Thread GitBox


BruceKellan commented on issue #5561:
URL: https://github.com/apache/hudi/issues/5561#issuecomment-1127153034

   I will close this issue. If there are still problems I will reopen this 
issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] yuzhaojing closed pull request #3599: [HUDI-2207] Support independent flink hudi clustering function

2022-05-15 Thread GitBox


yuzhaojing closed pull request #3599: [HUDI-2207] Support independent flink 
hudi clustering function
URL: https://github.com/apache/hudi/pull/3599


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-05-15 Thread GitBox


hudi-bot commented on PR #5522:
URL: https://github.com/apache/hudi/pull/5522#issuecomment-1127152108

   
   ## CI report:
   
   * 986960516f86a1426725141cd7cb25e84d260020 UNKNOWN
   * c5fb81a0b229ded9a2b925790366b62f1bf7ade9 UNKNOWN
   * f0e8e71d9c2891115daa6bbef5b40231c7a76460 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8591)
 
   * 07029886584651cb81f767b7bc8c5a310f47d400 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #3599: [HUDI-2207] Support independent flink hudi clustering function

2022-05-15 Thread GitBox


hudi-bot commented on PR #3599:
URL: https://github.com/apache/hudi/pull/3599#issuecomment-1127151573

   
   ## CI report:
   
   * c3405300e9bc97445637c7251536ec0f0d6fbbd1 UNKNOWN
   * e5003cfa74eb34255bceb50f1916c57c9c8812de Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8653)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #3599: [HUDI-2207] Support independent flink hudi clustering function

2022-05-15 Thread GitBox


hudi-bot commented on PR #3599:
URL: https://github.com/apache/hudi/pull/3599#issuecomment-1127150032

   
   ## CI report:
   
   * c3405300e9bc97445637c7251536ec0f0d6fbbd1 UNKNOWN
   * e5003cfa74eb34255bceb50f1916c57c9c8812de Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8653)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] yuzhaojing commented on pull request #3599: [HUDI-2207] Support independent flink hudi clustering function

2022-05-15 Thread GitBox


yuzhaojing commented on PR #3599:
URL: https://github.com/apache/hudi/pull/3599#issuecomment-1127149977

   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5532: [HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource table

2022-05-15 Thread GitBox


hudi-bot commented on PR #5532:
URL: https://github.com/apache/hudi/pull/5532#issuecomment-1127147455

   
   ## CI report:
   
   * 56bb2432a41ff2ae576db041e40a0e32080724f8 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8640)
 
   * be25641540231b290306ef9195faedd46106175e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8666)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5532: [HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource table

2022-05-15 Thread GitBox


hudi-bot commented on PR #5532:
URL: https://github.com/apache/hudi/pull/5532#issuecomment-1127145919

   
   ## CI report:
   
   * 56bb2432a41ff2ae576db041e40a0e32080724f8 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8640)
 
   * be25641540231b290306ef9195faedd46106175e UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] XuQianJin-Stars commented on issue #5586: [SUPPORT] 0.11.0 SparkSQL ParseException occurs in 0.11.0 when creating view with `timestamp as of`

2022-05-15 Thread GitBox


XuQianJin-Stars commented on issue #5586:
URL: https://github.com/apache/hudi/issues/5586#issuecomment-1127135537

   hi @gnailJC Thanks for the question. Currently `timestamp as of` only supports table operations, not view-related operations. If there is a business scenario that requires it, we can implement it; please file a jira.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[hudi] branch master updated: fix hive sync no partition table error (#5585)

2022-05-15 Thread forwardxu
This is an automated email from the ASF dual-hosted git repository.

forwardxu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 1fded18dff fix hive sync no partition table error (#5585)
1fded18dff is described below

commit 1fded18dff5bae064479d52b4e44f9fcf5bbb1b7
Author: 陈浩 
AuthorDate: Mon May 16 09:51:24 2022 +0800

fix hive sync no partition table error (#5585)
---
 .../src/main/java/org/apache/hudi/common/config/TypedProperties.java  | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hudi-common/src/main/java/org/apache/hudi/common/config/TypedProperties.java b/hudi-common/src/main/java/org/apache/hudi/common/config/TypedProperties.java
index 09671ba2a3..08015f61b2 100644
--- a/hudi-common/src/main/java/org/apache/hudi/common/config/TypedProperties.java
+++ b/hudi-common/src/main/java/org/apache/hudi/common/config/TypedProperties.java
@@ -18,6 +18,8 @@
 
 package org.apache.hudi.common.config;
 
+import org.apache.hudi.common.util.StringUtils;
+
 import java.io.Serializable;
 import java.util.Arrays;
 import java.util.Enumeration;
@@ -73,7 +75,7 @@ public class TypedProperties extends Properties implements Serializable {
     if (!containsKey(property)) {
       return defaultVal;
     }
-    return Arrays.stream(getProperty(property).split(delimiter)).map(String::trim).collect(Collectors.toList());
+    return Arrays.stream(getProperty(property).split(delimiter)).map(String::trim).filter(s -> !StringUtils.isNullOrEmpty(s)).collect(Collectors.toList());
   }
 
   public int getInteger(String property) {
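
For context, a minimal standalone sketch (values hypothetical) of why the added filter matters: splitting an unset or empty list-valued property yields a single empty string, which the pre-patch code passed through, presumably causing the hive sync error for non-partitioned tables that the commit title describes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SplitDemo {
  public static void main(String[] args) {
    String property = "";  // e.g. an empty partition-fields setting

    // Before the patch: "".split(",") yields [""], so one empty entry survives.
    List<String> before = Arrays.stream(property.split(","))
        .map(String::trim)
        .collect(Collectors.toList());
    System.out.println(before.size()); // 1 -> one bogus empty entry

    // After the patch: empty entries are filtered out.
    List<String> after = Arrays.stream(property.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())  // stand-in for !StringUtils.isNullOrEmpty(s)
        .collect(Collectors.toList());
    System.out.println(after.size()); // 0 -> no entries, as intended
  }
}
```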



[GitHub] [hudi] XuQianJin-Stars merged pull request #5585: [HUDI-4099][MINOR]fix hive sync no partition table error

2022-05-15 Thread GitBox


XuQianJin-Stars merged PR #5585:
URL: https://github.com/apache/hudi/pull/5585


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] XuQianJin-Stars merged pull request #5495: [HUDI-4001] Filter the properties should not be used when create table for Spark SQL

2022-05-15 Thread GitBox


XuQianJin-Stars merged PR #5495:
URL: https://github.com/apache/hudi/pull/5495


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[hudi] branch master updated: [HUDI-4001] Filter the properties should not be used when create table for Spark SQL (#5495)

2022-05-15 Thread forwardxu
This is an automated email from the ASF dual-hosted git repository.

forwardxu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 75f847691f [HUDI-4001] Filter the properties should not be used when create table for Spark SQL (#5495)
75f847691f is described below

commit 75f847691f0bdaf226d4713a8cb8c7639cffd5e5
Author: 董可伦 
AuthorDate: Mon May 16 09:50:29 2022 +0800

[HUDI-4001] Filter the properties should not be used when create table for Spark SQL (#5495)
---
 .../sql/catalyst/catalog/HoodieCatalogTable.scala  |   3 +
 .../spark/sql/hudi/ProvidesHoodieConfig.scala  |   3 +-
 .../hudi/command/CreateHoodieTableCommand.scala|   6 +-
 .../command/CreateHoodieTableAsSelectCommand.scala |  23 -
 .../apache/spark/sql/hudi/TestCreateTable.scala| 103 -
 5 files changed, 127 insertions(+), 11 deletions(-)

diff --git a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala
index 7ee8f6ad56..76cea362a3 100644
--- a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala
+++ b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.sql.catalyst.catalog
 
 import org.apache.hudi.AvroConversionUtils
+import org.apache.hudi.DataSourceWriteOptions.OPERATION
 import org.apache.hudi.HoodieWriterUtils._
 import org.apache.hudi.common.config.DFSPropertiesConfiguration
 import org.apache.hudi.common.model.HoodieTableType
@@ -321,6 +322,8 @@ class HoodieCatalogTable(val spark: SparkSession, val 
table: CatalogTable) exten
 }
 
 object HoodieCatalogTable {
+  // Properties that should not be used when creating a table
+  val needFilterProps: List[String] = List(HoodieTableConfig.DATABASE_NAME.key, HoodieTableConfig.NAME.key, OPERATION.key)
 
   def apply(sparkSession: SparkSession, tableIdentifier: TableIdentifier): HoodieCatalogTable = {
     val catalogTable = sparkSession.sessionState.catalog.getTableMetadata(tableIdentifier)
diff --git a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala
index 31fb0ad6cb..131ebebe85 100644
--- a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala
+++ b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala
@@ -255,8 +255,7 @@ trait ProvidesHoodieConfig extends Logging {
     val hoodieProps = getHoodieProps(catalogProperties, tableConfig, sparkSession.sqlContext.conf)
     val hiveSyncConfig = buildHiveSyncConfig(hoodieProps, hoodieCatalogTable)
 
-    // operation can not be overwrite
-    val options = hoodieCatalogTable.catalogProperties.-(OPERATION.key())
+    val options = hoodieCatalogTable.catalogProperties
 
 withSparkConf(sparkSession, options) {
   Map(
diff --git a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
index 195bf4153c..9bf1d72152 100644
--- a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
+++ b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
@@ -26,6 +26,7 @@ import org.apache.hudi.hadoop.utils.HoodieInputFormatUtils
 import org.apache.hudi.{DataSourceWriteOptions, SparkAdapterSupport}
 import org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException
 import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.catalog.HoodieCatalogTable.needFilterProps
 import org.apache.spark.sql.hive.HiveClientUtils
 import org.apache.spark.sql.hive.HiveExternalCatalog._
 import org.apache.spark.sql.hudi.HoodieSqlCommonUtils.isEnableHive
@@ -130,8 +131,9 @@ object CreateHoodieTableCommand {
       .copy(table = tableName, database = Some(newDatabaseName))
 
     val partitionColumnNames = hoodieCatalogTable.partitionSchema.map(_.name)
-    // append pk, preCombineKey, type to the properties of table
-    val newTblProperties = hoodieCatalogTable.catalogProperties ++ HoodieOptionConfig.extractSqlOptions(properties)
+    // Remove properties that should not be used; append pk, preCombineKey, type to the table properties
+    val newTblProperties =
+      hoodieCatalogTable.catalogProperties.--(needFilterProps) ++ Hoodi
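
The gist of the (truncated) diff above, as a minimal sketch: strip reserved identity/operation keys from the catalog properties before persisting them as table properties. The key strings below are assumed from the Hudi config constant names in the diff; the helper itself is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class FilterPropsDemo {
  // Mirrors needFilterProps: keys that identify the table or the write
  // operation and should not leak into persisted table properties.
  static final Set<String> RESERVED = Set.of(
      "hoodie.database.name",                // HoodieTableConfig.DATABASE_NAME (assumed key)
      "hoodie.table.name",                   // HoodieTableConfig.NAME (assumed key)
      "hoodie.datasource.write.operation");  // DataSourceWriteOptions.OPERATION (assumed key)

  static Map<String, String> filter(Map<String, String> props) {
    return props.entrySet().stream()
        .filter(e -> !RESERVED.contains(e.getKey()))
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("hoodie.table.name", "t1");
    props.put("hoodie.datasource.write.operation", "upsert");
    props.put("type", "mor");
    System.out.println(filter(props)); // {type=mor}
  }
}
```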

[GitHub] [hudi] XuQianJin-Stars commented on pull request #5564: [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

2022-05-15 Thread GitBox


XuQianJin-Stars commented on PR #5564:
URL: https://github.com/apache/hudi/pull/5564#issuecomment-1127131256

   hi @jinxing64 
   
   There is one more situation to consider here.
   
   The steps are as follows:
   1. Create the MOR table hudi_table with Spark SQL; there are no `ro` and `rt` tables yet.
   2. Flink then writes data, which triggers the creation of the `ro` and `rt` tables.
   
   At this point, the MOR table corresponds to three tables:
   hudi_table, hudi_table_ro, hudi_table_rt
   
   So we need to drop all three tables, as the sketch below illustrates.
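   
   A tiny sketch of the naming convention described above (the helper name is hypothetical):
   
   ```java
   import java.util.List;
   
   public class MorTables {
     // Given a MOR table name, the physical tables a drop must cover.
     static List<String> tablesToDrop(String tableName) {
       return List.of(tableName, tableName + "_ro", tableName + "_rt");
     }
   
     public static void main(String[] args) {
       System.out.println(tablesToDrop("hudi_table"));
       // [hudi_table, hudi_table_ro, hudi_table_rt]
     }
   }
   ```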


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] cdmikechen commented on a diff in pull request #3391: [HUDI-83] Fix Timestamp/Date type read by Hive3

2022-05-15 Thread GitBox


cdmikechen commented on code in PR #3391:
URL: https://github.com/apache/hudi/pull/3391#discussion_r873107531


##
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java:
##
@@ -160,7 +160,7 @@ public class HiveSyncConfig extends HoodieSyncConfig {
 
   public static final ConfigProperty<String> HIVE_SUPPORT_TIMESTAMP_TYPE = ConfigProperty
       .key("hoodie.datasource.hive_sync.support_timestamp")
-      .defaultValue("false")
+      .defaultValue("true")
       .withDocumentation("‘INT64’ with original type TIMESTAMP_MICROS is converted to hive ‘timestamp’ type. "
           + "Disabled by default for backward compatibility.");

Review Comment:
   > > @XuQianJin-Stars Does this clearly explain the PR?
   > > 'INT64' with original type TIMESTAMP_MICROS is converted to hive 'timestamp' type. From 0.12.0, 'timestamp' type will be supported and also can be disabled by this variable. Previous versions keep being disabled by default.
   > 
   > In the `deprecatedAfter` method, write version 0.12.0 and change withDocumentation's content?
   
   @XuQianJin-Stars 
   Is this right? 
   ```java
     public static final ConfigProperty<String> HIVE_SUPPORT_TIMESTAMP_TYPE = ConfigProperty
         .key("hoodie.datasource.hive_sync.support_timestamp")
         .defaultValue("true")
         .deprecatedAfter("0.12.0")
         .withDocumentation("'INT64' with original type TIMESTAMP_MICROS is converted to hive 'timestamp' type. "
             + "From 0.12.0, 'timestamp' type will be supported and also can be disabled by this variable. "
             + "Previous versions keep being disabled by default.");
   ```
   If there's no problem, I'll change all the other descriptions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] bettermouse commented on issue #5569: [SUPPORT] Issues with URL_ENCODE_PARTITIONING_OPT_KEY in hudi 0.11.0

2022-05-15 Thread GitBox


bettermouse commented on issue #5569:
URL: https://github.com/apache/hudi/issues/5569#issuecomment-1126979897

   
   See `getRecordPartitionPath` in hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/KeyGenUtils.java (changed by https://github.com/apache/hudi/pull/2645):
   
   ```java
   String encodeVersion08 = URLEncoder.encode("", StandardCharsets.UTF_8.toString());
   String encodeVersion11 = PartitionPathEncodeUtils.escapePathName("");
   
   System.out.println(encodeVersion08);
   System.out.println(encodeVersion11);
   ```
   
   It seems the partition path is encoded differently in versions 0.8 and 0.11.
   Does the table need to be rebuilt for hudi 0.11?
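   
   A runnable sketch of the difference (the partition value below is hypothetical, since the value in the original snippet was lost in the archive, and the `PartitionPathEncodeUtils` import path is assumed):
   
   ```java
   import java.net.URLEncoder;
   import java.nio.charset.StandardCharsets;
   import org.apache.hudi.common.util.PartitionPathEncodeUtils;
   
   public class EncodeCompare {
     public static void main(String[] args) throws Exception {
       String partition = "2022-05-15 10:30";  // hypothetical partition value
   
       // Hudi 0.8.x path: plain URL encoding (space becomes '+', ':' becomes %3A).
       String v08 = URLEncoder.encode(partition, StandardCharsets.UTF_8.toString());
   
       // Hudi 0.11.x path: Hive-style escaping, which escapes a different,
       // narrower character set (spaces, for example, are expected to pass
       // through unescaped).
       String v11 = PartitionPathEncodeUtils.escapePathName(partition);
   
       System.out.println(v08);
       System.out.println(v11);
     }
   }
   ```
   
   If existing partitions were written with the 0.8 encoding, the two schemes will not agree on values like this, which is why the table may need to be rebuilt.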


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5587: fix rat plugin issue

2022-05-15 Thread GitBox


hudi-bot commented on PR #5587:
URL: https://github.com/apache/hudi/pull/5587#issuecomment-1126915244

   
   ## CI report:
   
   * dc7921b3be5d009873a13ec761386e52ad952854 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8664)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5320: [HUDI-3861] update tblp 'path' when rename table

2022-05-15 Thread GitBox


hudi-bot commented on PR #5320:
URL: https://github.com/apache/hudi/pull/5320#issuecomment-1126910787

   
   ## CI report:
   
   * 855eb02e145337afea5b1e9bec95c60ecd282bc8 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8663)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] KnightChess commented on a diff in pull request #5320: [HUDI-3861] update tblp 'path' when rename table

2022-05-15 Thread GitBox


KnightChess commented on code in PR #5320:
URL: https://github.com/apache/hudi/pull/5320#discussion_r873147813


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/AlterHoodieTableRenameCommand.scala:
##
@@ -47,7 +47,7 @@ case class AlterHoodieTableRenameCommand(
   AlterTableRenameCommand(oldName, newName, isView).run(sparkSession)
 
   // update table properties path in every op
-  if (hoodieCatalogTable.catalogProperties.contains("path")) {
+  if (hoodieCatalogTable.table.properties.contains("path")) {

Review Comment:
   catalogProperties contains both the table properties (tblp) and the storage properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] KnightChess commented on a diff in pull request #5320: [HUDI-3861] update tblp 'path' when rename table

2022-05-15 Thread GitBox


KnightChess commented on code in PR #5320:
URL: https://github.com/apache/hudi/pull/5320#discussion_r873147603


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestAlterTable.scala:
##
@@ -194,9 +194,14 @@ class TestAlterTable extends HoodieSparkSqlTestBase {
     val oldLocation = spark.sessionState.catalog.getTableMetadata(new TableIdentifier(tableName)).properties.get("path")
     spark.sql(s"alter table $tableName rename to $newTableName")
     val newLocation = spark.sessionState.catalog.getTableMetadata(new TableIdentifier(newTableName)).properties.get("path")
-    assertResult(false)(
-      newLocation.equals(oldLocation)
-    )
+    if (HoodieSparkUtils.isSpark3_2) {

Review Comment:
   Only when HoodieCatalog is used (Spark 3.2) does `path` end up in the table properties (tblp).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5587: fix rat plugin issue

2022-05-15 Thread GitBox


hudi-bot commented on PR #5587:
URL: https://github.com/apache/hudi/pull/5587#issuecomment-1126903528

   
   ## CI report:
   
   * dc7921b3be5d009873a13ec761386e52ad952854 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8664)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5587: fix rat plugin issue

2022-05-15 Thread GitBox


hudi-bot commented on PR #5587:
URL: https://github.com/apache/hudi/pull/5587#issuecomment-1126903081

   
   ## CI report:
   
   * dc7921b3be5d009873a13ec761386e52ad952854 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #5560: [HUDI-2673] Add kafka connect sink in docker demo

2022-05-15 Thread GitBox


hudi-bot commented on PR #5560:
URL: https://github.com/apache/hudi/pull/5560#issuecomment-1126902519

   
   ## CI report:
   
   * 3203eb09510f40b607078ded11159ce135ed2e11 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8662)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


