[GitHub] [hudi] danny0405 opened a new pull request, #8710: [HUDI-6210] Failed to add fields in BUCKET index table

2023-05-14 Thread via GitHub


danny0405 opened a new pull request, #8710:
URL: https://github.com/apache/hudi/pull/8710

   ### Change Logs
   
   The `ALTER TABLE` command misses the conversion from SQL options to data source 
   options; this PR fixes that.
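
   As a rough illustration of the kind of conversion described above (this is not the code in the PR), the sketch below renames short SQL option keys to data source option keys. The three key pairs follow Hudi's documented Spark SQL table options as far as I know; the class name, method name, and the pass-through behaviour for unknown keys are assumptions made for the example, and value translation (for instance `cow` to `COPY_ON_WRITE`) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SqlOptionMappingSketch {

  // Illustrative subset: short SQL option key -> data source option key.
  private static final Map<String, String> SQL_TO_DATASOURCE_KEY = new LinkedHashMap<>();

  static {
    SQL_TO_DATASOURCE_KEY.put("primaryKey", "hoodie.datasource.write.recordkey.field");
    SQL_TO_DATASOURCE_KEY.put("preCombineField", "hoodie.datasource.write.precombine.field");
    SQL_TO_DATASOURCE_KEY.put("type", "hoodie.datasource.write.table.type");
  }

  /** Renames known SQL keys; unknown keys (e.g. hoodie.* configs) pass through unchanged. */
  public static Map<String, String> toDataSourceOptions(Map<String, String> sqlOptions) {
    Map<String, String> converted = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : sqlOptions.entrySet()) {
      String key = SQL_TO_DATASOURCE_KEY.getOrDefault(entry.getKey(), entry.getKey());
      converted.put(key, entry.getValue());
    }
    return converted;
  }

  public static void main(String[] args) {
    Map<String, String> sqlOptions = new LinkedHashMap<>();
    sqlOptions.put("primaryKey", "id");
    sqlOptions.put("hoodie.index.type", "BUCKET");
    System.out.println(toDataSourceOptions(sqlOptions));
  }
}
```

   If a command skips a step like this, a writer that only looks up the long data source keys would not see values the user supplied under the short SQL keys.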
   
   ### Impact
   
   none
   
   ### Risk level (write none, low medium or high below)
   
   none
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
   config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
     default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
     Please create a Jira ticket, attach the ticket number here and follow the 
     [instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
     make changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #8707: [MINOR] Remove unused imports in Spark adapters

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8707:
URL: https://github.com/apache/hudi/pull/8707#issuecomment-1547288225

   
   ## CI report:
   
   * 8a6d79c445afe80656999d18afe557d8ab3843f6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17069)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #8709: [DNM] Release 0.13.1 rc1 testing

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8709:
URL: https://github.com/apache/hudi/pull/8709#issuecomment-1547288329

   
   ## CI report:
   
   * ef16653b95c06919c62bd1d7488566c525e1399b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17071)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8708: [HUDI-6209] Move test deps to tests-common

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8708:
URL: https://github.com/apache/hudi/pull/8708#issuecomment-1547288271

   
   ## CI report:
   
   * c04ee93a562d07012bc6b48ea064909066191526 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17070)
 
   
   



[jira] [Created] (HUDI-6210) Failed to add fields in BUCKET index table

2023-05-14 Thread Danny Chen (Jira)
Danny Chen created HUDI-6210:


 Summary: Failed to add fields in BUCKET index table
 Key: HUDI-6210
 URL: https://issues.apache.org/jira/browse/HUDI-6210
 Project: Apache Hudi
  Issue Type: Bug
  Components: spark-sql
Reporter: Danny Chen
 Fix For: 0.14.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] hudi-bot commented on pull request #8709: [DNM] Release 0.13.1 rc1 testing

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8709:
URL: https://github.com/apache/hudi/pull/8709#issuecomment-1547280416

   
   ## CI report:
   
   * ef16653b95c06919c62bd1d7488566c525e1399b UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8707: [MINOR] Remove unused imports in Spark adapters

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8707:
URL: https://github.com/apache/hudi/pull/8707#issuecomment-1547280311

   
   ## CI report:
   
   * 8a6d79c445afe80656999d18afe557d8ab3843f6 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8708: [HUDI-6209] Move test deps to tests-common

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8708:
URL: https://github.com/apache/hudi/pull/8708#issuecomment-1547280353

   
   ## CI report:
   
   * c04ee93a562d07012bc6b48ea064909066191526 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8690: [HUDI-6199] Fix deletes with custom payload implementation

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8690:
URL: https://github.com/apache/hudi/pull/8690#issuecomment-1547280222

   
   ## CI report:
   
   * 042b47a253141773fef79156777f2f2eef48af7c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17068)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8690: [HUDI-6199] Fix deletes with custom payload implementation

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8690:
URL: https://github.com/apache/hudi/pull/8690#issuecomment-1547273604

   
   ## CI report:
   
   * 042b47a253141773fef79156777f2f2eef48af7c UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8706: [HUDI-6208]Fix jetty conflicts in the packaging process

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8706:
URL: https://github.com/apache/hudi/pull/8706#issuecomment-1547273665

   
   ## CI report:
   
   * 3b0af30efe11a323921253f6bd0ff1cd39bdd22c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17066)
 
   
   



[hudi] branch release-0.13.1-rc1 created (now ef16653b95c)

2023-05-14 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a change to branch release-0.13.1-rc1
in repository https://gitbox.apache.org/repos/asf/hudi.git


  at ef16653b95c [HUDI-6199] Fix deletes with custom payload implementation

No new revisions were added by this update.



[GitHub] [hudi] yihua opened a new pull request, #8709: [DO NOT MERGE] Release 0.13.1 rc1 testing

2023-05-14 Thread via GitHub


yihua opened a new pull request, #8709:
URL: https://github.com/apache/hudi/pull/8709

   ### Change Logs
   
   As above
   
   ### Impact
   
   Testing only
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-14 Thread via GitHub


bvaradar commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1193380088


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/storage/row/TestHoodieRowCreateHandle.java:
##
@@ -190,16 +189,8 @@ public void testInstantiationFailure(boolean 
enableMetadataTable) {
   HoodieTable table = HoodieSparkTable.create(cfg, context, metaClient);
   new HoodieRowCreateHandle(table, cfg, " def", 
UUID.randomUUID().toString(), "001", RANDOM.nextInt(10), RANDOM.nextLong(), 
RANDOM.nextLong(), SparkDatasetTestUtils.STRUCT_TYPE);
   fail("Should have thrown exception");
-} catch (HoodieInsertException ioe) {

Review Comment:
   Makes sense. 






[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-14 Thread via GitHub


bvaradar commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1193378879


##
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/util/FilterGenVisitor.java:
##
@@ -42,9 +43,10 @@ private String quoteStringLiteral(String value) {
 }
   }
 
-  private String visitAnd(Expression left, Expression right) {
-String leftResult = left.accept(this);
-String rightResult = right.accept(this);
+  @Override
+  public String visitAnd(Predicates.And and) {

Review Comment:
   Sounds good.






[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-14 Thread via GitHub


bvaradar commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1193378558


##
hudi-spark-datasource/hudi-spark2/src/main/scala/org/apache/spark/sql/adapter/Spark2Adapter.scala:
##
@@ -186,4 +186,13 @@ class Spark2Adapter extends SparkAdapter {
 case OFF_HEAP => "OFF_HEAP"
 case _ => throw new IllegalArgumentException(s"Invalid StorageLevel: 
$level")
   }
+
+  override def translateFilter(predicate: Expression,
+   supportNestedPredicatePushdown: Boolean = 
false): Option[Filter] = {
+if (supportNestedPredicatePushdown) {

Review Comment:
   Sounds good. Can you add them to the Javadoc?






[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-14 Thread via GitHub


bvaradar commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1193377052


##
hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:
##
@@ -58,13 +62,21 @@ public class FileSystemBackedTableMetadata implements 
HoodieTableMetadata {
   private final SerializableConfiguration hadoopConf;
   private final String datasetBasePath;
   private final boolean assumeDatePartitioning;
+  private final boolean hiveStylePartitioningEnabled;
+  private final boolean urlEncodePartitioningEnabled;
 
   public FileSystemBackedTableMetadata(HoodieEngineContext engineContext, 
SerializableConfiguration conf, String datasetBasePath,
boolean assumeDatePartitioning) {
 this.engineContext = engineContext;
 this.hadoopConf = conf;
 this.datasetBasePath = datasetBasePath;
 this.assumeDatePartitioning = assumeDatePartitioning;
+HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder()

Review Comment:
   Can we create a new abstract class AbstractHoodieTableMetadata and move 
   metaClient, hiveStylePartitioningEnabled, and urlEncodePartitioningEnabled 
   into it as member variables? We can have BaseTableMetadata and 
   FileSystemBackedTableMetadata extend it. Later we can refactor to move any 
   common functions to this class.
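
   The suggested hierarchy could look roughly like the sketch below. It is only an outline under stated assumptions: `MetaClientStub` is a hypothetical stand-in for `HoodieTableMetaClient`, the two subclasses are simplified placeholders that share only their names with the real classes, and the constructor shape is a guess rather than anything taken from the PR.

```java
public class TableMetadataHierarchySketch {

  /** Hypothetical stand-in for HoodieTableMetaClient; holds only what the sketch needs. */
  public static final class MetaClientStub {
    final boolean hiveStylePartitioning;
    final boolean urlEncodePartitioning;

    public MetaClientStub(boolean hiveStylePartitioning, boolean urlEncodePartitioning) {
      this.hiveStylePartitioning = hiveStylePartitioning;
      this.urlEncodePartitioning = urlEncodePartitioning;
    }
  }

  /** Shared parent that owns the meta client and the two partitioning flags. */
  public abstract static class AbstractHoodieTableMetadata {
    protected final MetaClientStub metaClient;
    protected final boolean hiveStylePartitioningEnabled;
    protected final boolean urlEncodePartitioningEnabled;

    protected AbstractHoodieTableMetadata(MetaClientStub metaClient) {
      this.metaClient = metaClient;
      this.hiveStylePartitioningEnabled = metaClient.hiveStylePartitioning;
      this.urlEncodePartitioningEnabled = metaClient.urlEncodePartitioning;
    }

    public boolean isHiveStylePartitioningEnabled() {
      return hiveStylePartitioningEnabled;
    }

    public boolean isUrlEncodePartitioningEnabled() {
      return urlEncodePartitioningEnabled;
    }
  }

  /** Simplified placeholder; the real class has far more state and behaviour. */
  public static class BaseTableMetadata extends AbstractHoodieTableMetadata {
    public BaseTableMetadata(MetaClientStub metaClient) {
      super(metaClient);
    }
  }

  /** Simplified placeholder; the flags are now inherited instead of declared here. */
  public static class FileSystemBackedTableMetadata extends AbstractHoodieTableMetadata {
    public FileSystemBackedTableMetadata(MetaClientStub metaClient) {
      super(metaClient);
    }
  }

  public static void main(String[] args) {
    AbstractHoodieTableMetadata metadata =
        new FileSystemBackedTableMetadata(new MetaClientStub(true, false));
    System.out.println(metadata.isHiveStylePartitioningEnabled());
  }
}
```

   Keeping the flags in one parent means both implementations read them the same way, and shared helpers can move up later without touching callers.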






[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8394: [HUDI-6085] Eliminate cleaning tasks for flink mor table if online async compaciton is disabled

2023-05-14 Thread via GitHub


zhuanshenbsj1 commented on code in PR #8394:
URL: https://github.com/apache/hudi/pull/8394#discussion_r1193373503


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSink.java:
##
@@ -103,6 +103,8 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Context 
context) {
   conf.setBoolean(FlinkOptions.COMPACTION_ASYNC_ENABLED, false);
 }
 return Pipelines.compact(conf, pipeline);
+  } else if (OptionsResolver.isMorTable(conf)) {
+return Pipelines.dummySink(pipeline);
   } else {

Review Comment:
   If CLEAN_ASYNC_ENABLED = true, a schedule will still be executed. I think 
   clustering and compaction should be consistent here (if async clustering is 
   disabled for a COW table, there is no clean operator). Also, both Spark and 
   Flink now clean up when executing offline jobs unless cleaning is explicitly 
   disabled. What do you think?






[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8394: [HUDI-6085] Eliminate cleaning tasks for flink mor table if online async compaciton is disabled

2023-05-14 Thread via GitHub


zhuanshenbsj1 commented on code in PR #8394:
URL: https://github.com/apache/hudi/pull/8394#discussion_r1193373503


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSink.java:
##
@@ -103,6 +103,8 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Context 
context) {
   conf.setBoolean(FlinkOptions.COMPACTION_ASYNC_ENABLED, false);
 }
 return Pipelines.compact(conf, pipeline);
+  } else if (OptionsResolver.isMorTable(conf)) {
+return Pipelines.dummySink(pipeline);
   } else {

Review Comment:
   If CLEAN_ASYNC_ENABLED = true, a schedule will still be executed. I think 
   clustering and compaction should be consistent here. Also, both Spark and 
   Flink now clean up when executing offline jobs unless cleaning is explicitly 
   disabled. What do you think?






[GitHub] [hudi] xushiyan opened a new pull request, #8708: [HUDI-6209] Move test deps to tests-common

2023-05-14 Thread via GitHub


xushiyan opened a new pull request, #8708:
URL: https://github.com/apache/hudi/pull/8708

   ### Change Logs
   
   Move junit, hadoop, awaitility, scalatest to `hudi-tests-common` module
   
   ### Impact
   
   Only test deps
   
   ### Risk level
   
   Low
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] yihua opened a new pull request, #8707: [MINOR] Remove unused imports in Spark adapters

2023-05-14 Thread via GitHub


yihua opened a new pull request, #8707:
URL: https://github.com/apache/hudi/pull/8707

   ### Change Logs
   
   As above.
   
   ### Impact
   
   Code cleanup only.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-14 Thread via GitHub


bvaradar commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1193370249


##
hudi-common/src/main/java/org/apache/hudi/expression/Expression.java:
##
@@ -40,14 +51,19 @@ public enum Operator {
 }
   }
 
-  private final List children;
+  List getChildren();

Review Comment:
   Sounds good. 






[GitHub] [hudi] hudi-bot commented on pull request #8669: [HUDI-5362] Rebase IncrementalRelation over HoodieBaseRelation

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8669:
URL: https://github.com/apache/hudi/pull/8669#issuecomment-1547222193

   
   ## CI report:
   
   * 0eacefd8bc063e0c574068f09670014804f10dc2 UNKNOWN
   * 530990d2ada5b8c33d31f8d45284e9038b2df719 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17065)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1547185601

   
   ## CI report:
   
   * 748d2c1eaefffba730a2fc71e437a25066d0cec5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17061)
 
   * 68253fa253913008089a1661392d2ddfa172335a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17067)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8659:
URL: https://github.com/apache/hudi/pull/8659#issuecomment-1547182395

   
   ## CI report:
   
   * eaf85f368095751acf44e8df86752a335bc719ba Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17063)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1547182132

   
   ## CI report:
   
   * 748d2c1eaefffba730a2fc71e437a25066d0cec5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17061)
 
   * 68253fa253913008089a1661392d2ddfa172335a UNKNOWN
   
   



[GitHub] [hudi] eric9204 commented on a diff in pull request #8706: [HUDI-6208]Fix jetty conflicts in the packaging process

2023-05-14 Thread via GitHub


eric9204 commented on code in PR #8706:
URL: https://github.com/apache/hudi/pull/8706#discussion_r1193313618


##
hudi-timeline-service/pom.xml:
##
@@ -87,6 +87,12 @@
       <artifactId>kryo-shaded</artifactId>
 
 
+
+      <groupId>org.eclipse.jetty</groupId>
+      <artifactId>jetty-util</artifactId>
+      <version>${jetty.version}</version>

Review Comment:
   Command to reproduce the problem:
   ```
   mvn clean package -DskipTests -Dspark3.1 -Dscala-2.12 -Dflink1.17 
-Dhive.version=3.1.2 -Pflink-bundle-shade-hive3 -Dhadoop.version=3.3.0 
-Dcheckstyle.skip=true -Drat.skip=true  -pl packaging/hudi-flink-bundle -am 
   ```
   
   There is only one version of the Jetty jar on the dependency tree of 
   `hudi-timeline-service`.






[jira] [Created] (HUDI-6209) Move test dependencies to test common module

2023-05-14 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-6209:


 Summary: Move test dependencies to test common module
 Key: HUDI-6209
 URL: https://issues.apache.org/jira/browse/HUDI-6209
 Project: Apache Hudi
  Issue Type: Improvement
  Components: tests-ci
Reporter: Raymond Xu








[GitHub] [hudi] hudi-bot commented on pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8190:
URL: https://github.com/apache/hudi/pull/8190#issuecomment-1547150618

   
   ## CI report:
   
   * 1253db4e3d1372ebad8ec4e8c1bf143bb5947693 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17017)
 
   * 1ca7cd7dcd3534fa8d21fcf80eefc760c16e6f75 UNKNOWN
   * 30d043fb18dca954f9df59c450671106a7fa070e UNKNOWN
   
   



[GitHub] [hudi] danny0405 commented on a diff in pull request #8706: [HUDI-6208]Fix jetty conflicts in the packaging process

2023-05-14 Thread via GitHub


danny0405 commented on code in PR #8706:
URL: https://github.com/apache/hudi/pull/8706#discussion_r1193285733


##
hudi-timeline-service/pom.xml:
##
@@ -87,6 +87,12 @@
       <artifactId>kryo-shaded</artifactId>
 
 
+
+      <groupId>org.eclipse.jetty</groupId>
+      <artifactId>jetty-util</artifactId>
+      <version>${jetty.version}</version>

Review Comment:
   How could the conflict happen? Are there multiple Jetty jars on the 
   dependency tree?






[GitHub] [hudi] danny0405 commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


danny0405 commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1193284045


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String 
instantTime) {
   return false;
 }
   }
+
+  private static ZoneId getZoneId() {
+return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+? ZoneId.systemDefault()

Review Comment:
   > I currently get HoodieTimelineTimeZone by instantiating a HoodieTableConfig
   
   If no existing table meta client or table config can be reused, we must 
   instantiate a new one. For `HoodieTableConfig`, we usually fetch a meta client 
   first and then get the config; see 
   https://github.com/apache/hudi/blob/42b517d9666f5aafe3faa2a153b07d6f7c774dae/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java#L307
   for reference.






[GitHub] [hudi] c-f-cooper commented on issue #8696: [SUPPORT]File does not exist

2023-05-14 Thread via GitHub


c-f-cooper commented on issue #8696:
URL: https://github.com/apache/hudi/issues/8696#issuecomment-1547124053

   > You still use insert + COW + multi-writer, right?
   
   Yes, insert + COW + multi-writer. All configs are below:
   ```
   metadata.enabled=false
   clustering.schedule.enabled=true
   clustering.async.enabled=true
   hoodie.cleaner.policy.failed.writes=LAZY
   ```





[GitHub] [hudi] danny0405 commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


danny0405 commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1193284045


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String 
instantTime) {
   return false;
 }
   }
+
+  private static ZoneId getZoneId() {
+return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+? ZoneId.systemDefault()

Review Comment:
   > I currently get HoodieTimelineTimeZone by instantiating a HoodieTableConfig
   
   If no existing table meta client or table config can be reused, we must 
instantiate a new one.






[GitHub] [hudi] danny0405 commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


danny0405 commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1189302187


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String 
instantTime) {
   return false;
 }
   }
+
+  private static ZoneId getZoneId() {
+return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+? ZoneId.systemDefault()

Review Comment:
   If possible, fetch the time zone from `metaClient.tableConfig`; 
`HoodieTimelineTimeZone` alone cannot guarantee that `zoneId` is initialized.
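
A minimal sketch of the idea in this comment, assuming the caller has already read the time zone setting from the table config and passes it in explicitly, so nothing relies on a static field having been initialized earlier. `TimelineTimeZoneSketch` and `ZoneIdResolverSketch` are hypothetical stand-ins, not Hudi classes, and mapping the non-local case to UTC is an assumption made for illustration.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical stand-in for Hudi's HoodieTimelineTimeZone; not the real enum.
enum TimelineTimeZoneSketch { LOCAL, UTC }

final class ZoneIdResolverSketch {
  private ZoneIdResolverSketch() {
  }

  // The time zone is a parameter supplied by the caller (for example, read
  // from the table config), so there is no dependency on static state.
  static ZoneId resolve(TimelineTimeZoneSketch tableTimeZone) {
    if (tableTimeZone == null) {
      throw new IllegalArgumentException("time zone must be read from the table config");
    }
    // Assumption: anything other than LOCAL maps to UTC.
    return tableTimeZone == TimelineTimeZoneSketch.LOCAL
        ? ZoneId.systemDefault()
        : ZoneOffset.UTC;
  }

  public static void main(String[] args) {
    System.out.println(resolve(TimelineTimeZoneSketch.UTC));
    System.out.println(resolve(TimelineTimeZoneSketch.LOCAL));
  }
}
```

Passing the value in keeps the resolver a pure function, which also makes it straightforward to test without a table on disk.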






[GitHub] [hudi] danny0405 commented on pull request #8657: [HUDI-6150] Support bucketing for each hive client

2023-05-14 Thread via GitHub


danny0405 commented on PR #8657:
URL: https://github.com/apache/hudi/pull/8657#issuecomment-1547119881

   > I am not sure it is a good design to introduce spark concepts within 
hudi-client-common
   
   Obviously it is a bad design that we should avoid. Can we just implement 
Spark's Murmur3 hash in full within Hudi? The data types are not a big 
deal: we can use the Avro data types instead, or use the Spark data types 
for the Spark impl and the Flink data types for the Flink impl.
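
As a rough illustration of what an engine-neutral implementation could look like, here is a sketch of the standard 32-bit Murmur3 (x86 variant) for a single `int` key, followed by bucket selection. The class name `Murmur3BucketSketch`, the method names, and the choice of seed 42 in the usage are assumptions for illustration; whether this matches Spark's bucketing bit for bit would need to be verified against Spark itself.

```java
// Engine-neutral Murmur3 (x86, 32-bit) for one int key, plus bucket selection.
final class Murmur3BucketSketch {
  private static final int C1 = 0xcc9e2d51;
  private static final int C2 = 0x1b873593;

  private Murmur3BucketSketch() {
  }

  static int hashInt(int input, int seed) {
    // Mix the 4-byte block.
    int k1 = input * C1;
    k1 = Integer.rotateLeft(k1, 15);
    k1 *= C2;

    // Fold the block into the running hash.
    int h1 = seed ^ k1;
    h1 = Integer.rotateLeft(h1, 13);
    h1 = h1 * 5 + 0xe6546b64;

    // Finalization: fold in the input length (4 bytes), then avalanche.
    h1 ^= 4;
    h1 ^= h1 >>> 16;
    h1 *= 0x85ebca6b;
    h1 ^= h1 >>> 13;
    h1 *= 0xc2b2ae35;
    h1 ^= h1 >>> 16;
    return h1;
  }

  // floorMod keeps the bucket non-negative even when the hash is negative.
  static int bucketFor(int key, int seed, int numBuckets) {
    return Math.floorMod(hashInt(key, seed), numBuckets);
  }

  public static void main(String[] args) {
    // Seed 42 is an assumed value chosen for this example.
    System.out.println(bucketFor(7, 42, 8));
  }
}
```

Because the sketch depends only on primitive arithmetic, the same code could sit in a common module, with each engine converting its own data types to the hashed representation.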





[GitHub] [hudi] hudi-bot commented on pull request #8706: [HUDI-6208]Fix jetty conflicts in the packaging process

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8706:
URL: https://github.com/apache/hudi/pull/8706#issuecomment-1547116549

   
   ## CI report:
   
   * 3b0af30efe11a323921253f6bd0ff1cd39bdd22c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17066)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #8669: [HUDI-5362] Rebase IncrementalRelation over HoodieBaseRelation

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8669:
URL: https://github.com/apache/hudi/pull/8669#issuecomment-1547116474

   
   ## CI report:
   
   * 0eacefd8bc063e0c574068f09670014804f10dc2 UNKNOWN
   * d2f1d265394f8a6fa33f96577704af6f8422e996 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17036)
 
   * 530990d2ada5b8c33d31f8d45284e9038b2df719 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17065)
 
   
   



[GitHub] [hudi] c-f-cooper commented on issue #8696: [SUPPORT]File does not exist

2023-05-14 Thread via GitHub


c-f-cooper commented on issue #8696:
URL: https://github.com/apache/hudi/issues/8696#issuecomment-1547114475

   > Did you config the correct client id for multi-writer?
   
   I use the master branch code, which generates the client id by default.





[GitHub] [hudi] danny0405 commented on a diff in pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-05-14 Thread via GitHub


danny0405 commented on code in PR #8190:
URL: https://github.com/apache/hudi/pull/8190#discussion_r1193278783


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/common/fs/TestHoodieFileStatusSerialization.java:
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.fs;
+
+import org.apache.hudi.avro.model.HoodieFileStatus;
+import org.apache.hudi.client.common.HoodieSparkEngineContext;
+import org.apache.hudi.common.bootstrap.FileStatusUtils;
+import org.apache.hudi.common.engine.HoodieEngineContext;
+import org.apache.hudi.common.util.ValidationUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.testutils.HoodieClientTestHarness;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.TestInstance;
+import org.junit.jupiter.api.TestInstance.Lifecycle;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+/**
+ * Test the if {@link HoodieFileStatus} is serializable
+ */
+@TestInstance(Lifecycle.PER_CLASS)
+public class TestHoodieFileStatusSerialization extends HoodieClientTestHarness 
{
+
+  HoodieEngineContext engineContext;
+  List<Path> testPaths;
+
+  @BeforeAll
+  public void setUp() throws IOException {
+initSparkContexts();
+testPaths = new ArrayList<>(5);
+for (int i = 0; i < 5; i++) {
+  testPaths.add(new Path("s3://table-bucket/"));
+}
+engineContext = new HoodieSparkEngineContext(jsc);
+  }
+
+  @Test
+  public void testNonSerializableFileStatus() {
+try {
+  // this is supposed to throw exception
+  List statuses = engineContext.flatMap(testPaths, path -> {
+FileSystem fileSystem = new NonSerializableFileSystem();
+return Arrays.stream(fileSystem.listStatus(path));
+  }, 5);
+} catch (Exception e) {
+  System.out.println("Exception message:" + e.getMessage());

Review Comment:
   Can we use `org.junit.jupiter.api.Assertions.assertThrows` instead? Also, 
tests should not print to stdout.
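
A minimal sketch of the pattern this comment asks for: assert that the call throws, rather than catching the exception and printing it. With JUnit 5 on the classpath, `Assertions.assertThrows` covers this directly; the `ExpectThrowsSketch` class and its `expectThrows` helper below are hypothetical stand-ins, written only so the example has no dependencies.

```java
import java.util.concurrent.Callable;

final class ExpectThrowsSketch {
  private ExpectThrowsSketch() {
  }

  // Hand-rolled stand-in for an assertThrows-style check: fails if the action
  // completes normally or throws a different type, otherwise returns the
  // exception so the caller can inspect it.
  static <T extends Throwable> T expectThrows(Class<T> expected, Callable<?> action) {
    try {
      action.call();
    } catch (Throwable actual) {
      if (expected.isInstance(actual)) {
        return expected.cast(actual);
      }
      throw new AssertionError("Unexpected exception type: " + actual.getClass().getName(), actual);
    }
    throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
  }

  public static void main(String[] args) {
    // The lambda stands in for the code under test that is supposed to fail.
    IllegalStateException e = expectThrows(IllegalStateException.class, () -> {
      throw new IllegalStateException("listing failed");
    });
    if (!"listing failed".equals(e.getMessage())) {
      throw new AssertionError("unexpected message: " + e.getMessage());
    }
  }
}
```

Unlike a `try`/`catch` that only prints, this form fails the test when no exception is thrown, and it produces no stdout output on success.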






[GitHub] [hudi] hudi-bot commented on pull request #8669: [HUDI-5362] Rebase IncrementalRelation over HoodieBaseRelation

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8669:
URL: https://github.com/apache/hudi/pull/8669#issuecomment-1547113162

   
   ## CI report:
   
   * 0eacefd8bc063e0c574068f09670014804f10dc2 UNKNOWN
   * d2f1d265394f8a6fa33f96577704af6f8422e996 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17036)
 
   * 530990d2ada5b8c33d31f8d45284e9038b2df719 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8706: [HUDI-6208]Fix jetty conflicts in the packaging process

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8706:
URL: https://github.com/apache/hudi/pull/8706#issuecomment-1547113228

   
   ## CI report:
   
   * 3b0af30efe11a323921253f6bd0ff1cd39bdd22c UNKNOWN
   
   



[GitHub] [hudi] danny0405 commented on a diff in pull request #8683: [HUDI-5533] Support spark columns comments

2023-05-14 Thread via GitHub


danny0405 commented on code in PR #8683:
URL: https://github.com/apache/hudi/pull/8683#discussion_r1193277450


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala:
##
@@ -59,32 +59,32 @@ private[sql] object SchemaConverters {
   private val unionFieldMemberPrefix = "member"
 
   private def toSqlTypeHelper(avroSchema: Schema, existingRecordNames: 
Set[String]): SchemaType = {
-avroSchema.getType match {
-  case INT => avroSchema.getLogicalType match {
-case _: Date => SchemaType(DateType, nullable = false)
-case _ => SchemaType(IntegerType, nullable = false)
+(avroSchema.getType, Option(avroSchema.getDoc)) match {

Review Comment:
   The conversion tool is copied from Spark: 
https://github.com/apache/spark/blob/dd4db21cb69a9a9c3715360673a76e6f150303d4/connector/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala#L58
   I just noticed that Spark also does not keep comments from Avro fields 
during the conversion.






[GitHub] [hudi] danny0405 commented on issue #8696: [SUPPORT]File does not exist

2023-05-14 Thread via GitHub


danny0405 commented on issue #8696:
URL: https://github.com/apache/hudi/issues/8696#issuecomment-1547107807

   Did you config the correct client id for multi-writer?





[GitHub] [hudi] danny0405 closed issue #8700: [SUPPORT] URISyntaxException when connnecting from a remote client to the docker compose distro

2023-05-14 Thread via GitHub


danny0405 closed issue #8700: [SUPPORT] URISyntaxException when connnecting 
from a remote client to the docker compose distro
URL: https://github.com/apache/hudi/issues/8700





[hudi] branch master updated (b3f2c753e4b -> 42b517d9666)

2023-05-14 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from b3f2c753e4b [MINOR] Update 
docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml (#8701)
 add 42b517d9666 [MINOR] Update 
docker-compose_hadoop284_hive233_spark244.yml (#8702)

No new revisions were added by this update.

Summary of changes:
 docker/compose/docker-compose_hadoop284_hive233_spark244.yml | 1 +
 1 file changed, 1 insertion(+)



[GitHub] [hudi] danny0405 merged pull request #8702: Update docker-compose_hadoop284_hive233_spark244.yml

2023-05-14 Thread via GitHub


danny0405 merged PR #8702:
URL: https://github.com/apache/hudi/pull/8702





[hudi] branch master updated: [MINOR] Update docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml (#8701)

2023-05-14 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new b3f2c753e4b [MINOR] Update 
docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml (#8701)
b3f2c753e4b is described below

commit b3f2c753e4b86098b536a9953daa691f48616e86
Author: Albert Wong 
AuthorDate: Sun May 14 19:28:36 2023 -0700

[MINOR] Update docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml 
(#8701)

Fix for https://github.com/apache/hudi/issues/8700
---
 docker/compose/docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git 
a/docker/compose/docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml 
b/docker/compose/docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml
index 857180cfbee..0abcf676d5f 100644
--- a/docker/compose/docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml
+++ b/docker/compose/docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml
@@ -257,3 +257,4 @@ volumes:
 
 networks:
   default:
+ name: hudi



[GitHub] [hudi] danny0405 merged pull request #8701: [MINOR] Update docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml

2023-05-14 Thread via GitHub


danny0405 merged PR #8701:
URL: https://github.com/apache/hudi/pull/8701





[jira] [Updated] (HUDI-6208) Fix jetty conflicts in the packaging process

2023-05-14 Thread eric (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

eric updated HUDI-6208:
---
Description: 
!image-2023-05-15-09-48-18-179.png!

 

 

[[HUDI-6208]Fix jetty conflicts in the packaging process by eric9204 · Pull 
Request #8706 · apache/hudi 
(github.com)|https://github.com/apache/hudi/pull/8706]

  was:!image-2023-05-15-09-48-18-179.png!


> Fix jetty conflicts in the packaging process
> 
>
> Key: HUDI-6208
> URL: https://issues.apache.org/jira/browse/HUDI-6208
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: timeline-server
>Affects Versions: 0.14.0
> Environment: hudi-master
>Reporter: eric
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-05-15-09-48-18-179.png
>
>
> !image-2023-05-15-09-48-18-179.png!
>  
>  
> [[HUDI-6208]Fix jetty conflicts in the packaging process by eric9204 · Pull 
> Request #8706 · apache/hudi 
> (github.com)|https://github.com/apache/hudi/pull/8706]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] eric9204 opened a new pull request, #8706: Fix jetty conflicts in the packaging process

2023-05-14 Thread via GitHub


eric9204 opened a new pull request, #8706:
URL: https://github.com/apache/hudi/pull/8706

   ### Change Logs
   
   none
   
   ### Impact
   
   none
   
   ### Risk level (write none, low medium or high below)
   
   low
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Created] (HUDI-6208) Fix jetty conflicts in the packaging process

2023-05-14 Thread eric (Jira)
eric created HUDI-6208:
--

 Summary: Fix jetty conflicts in the packaging process
 Key: HUDI-6208
 URL: https://issues.apache.org/jira/browse/HUDI-6208
 Project: Apache Hudi
  Issue Type: Bug
  Components: timeline-server
Affects Versions: 0.14.0
 Environment: hudi-master
Reporter: eric
 Attachments: image-2023-05-15-09-48-18-179.png

!image-2023-05-15-09-48-18-179.png!





[GitHub] [hudi] hudi-bot commented on pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8659:
URL: https://github.com/apache/hudi/pull/8659#issuecomment-1547080847

   
   ## CI report:
   
   * 485205559daff6698e273a9e724077fc4cd1c8b1 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17062)
 
   * eaf85f368095751acf44e8df86752a335bc719ba Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17063)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8659:
URL: https://github.com/apache/hudi/pull/8659#issuecomment-1547077054

   
   ## CI report:
   
   * 96b692c2e03d48b69686b899359bd15a818025b5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16978)
 
   * 485205559daff6698e273a9e724077fc4cd1c8b1 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17062)
 
   * eaf85f368095751acf44e8df86752a335bc719ba UNKNOWN
   
   



[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1193244670


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String 
instantTime) {
   return false;
 }
   }
+
+  private static ZoneId getZoneId() {
+return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+? ZoneId.systemDefault()

Review Comment:
   > See the discussions we take in: #8631
   
   It seems that there is no good way to get `HoodieTimelineTimeZone` through 
`HoodieTableMetaClient` in `HoodieInstantTimeGenerator`. I currently get 
`HoodieTimelineTimeZone` by instantiating a `HoodieTableConfig`. Can you give 
me some advice?






[GitHub] [hudi] hudi-bot commented on pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8659:
URL: https://github.com/apache/hudi/pull/8659#issuecomment-1547052898

   
   ## CI report:
   
   * 96b692c2e03d48b69686b899359bd15a818025b5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16978)
 
   * 485205559daff6698e273a9e724077fc4cd1c8b1 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17062)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8659:
URL: https://github.com/apache/hudi/pull/8659#issuecomment-1547050611

   
   ## CI report:
   
   * 96b692c2e03d48b69686b899359bd15a818025b5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16978)
 
   * 485205559daff6698e273a9e724077fc4cd1c8b1 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8190:
URL: https://github.com/apache/hudi/pull/8190#issuecomment-1547030431

   
   ## CI report:
   
   * 1253db4e3d1372ebad8ec4e8c1bf143bb5947693 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17017)
 
   * 1ca7cd7dcd3534fa8d21fcf80eefc760c16e6f75 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1547028682

   
   ## CI report:
   
   * 748d2c1eaefffba730a2fc71e437a25066d0cec5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17061)
 
   
   



[GitHub] [hudi] CTTY commented on a diff in pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-05-14 Thread via GitHub


CTTY commented on code in PR #8190:
URL: https://github.com/apache/hudi/pull/8190#discussion_r1193222415


##
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieSerializableFileStatus.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.fs;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+
+import java.io.Serializable;
+
+/**
+ * A serializable wrapper for FileStatus.
+ * 
+ * Hadoop 2.x FileStatus does not implement Serializable and can cause issues. 
(HUDI-5936)
+ * This class is supposed to make sure FileStatus can be safely serialized by 
wrapping FileStatus
+ * with it, and it should be only used when we absolutely need to serialize 
FileStatus.
+ */
+public class HoodieSerializableFileStatus extends FileStatus implements 
Serializable {
+
+  Path path;
+  long length;
+  boolean isDirectory;
+  short blockReplication;
+  long blockSize;

Review Comment:
   Good point. I just updated it to reuse `HoodieFileStatus`, and it works well.
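
For context, the general idea under discussion (copying the fields of a non-serializable status object into a small serializable holder before shipping it between processes) can be sketched as follows. This is a sketch under stated assumptions, not the PR's code: `PlainStatusSketch` is a hypothetical stand-in for a status class that does not implement `Serializable`, and `SerializableStatusSketch` is a hypothetical holder. The real PR reuses `HoodieFileStatus` instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a status class that is NOT Serializable.
final class PlainStatusSketch {
  final String path;
  final long length;
  final boolean directory;

  PlainStatusSketch(String path, long length, boolean directory) {
    this.path = path;
    this.length = length;
    this.directory = directory;
  }
}

// Serializable holder that copies only the fields that must cross the wire.
final class SerializableStatusSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  final String path;
  final long length;
  final boolean directory;

  private SerializableStatusSketch(String path, long length, boolean directory) {
    this.path = path;
    this.length = length;
    this.directory = directory;
  }

  static SerializableStatusSketch from(PlainStatusSketch status) {
    return new SerializableStatusSketch(status.path, status.length, status.directory);
  }

  PlainStatusSketch toPlain() {
    return new PlainStatusSketch(path, length, directory);
  }

  // Java-serializes and deserializes the holder, as a shuffle or broadcast would.
  static SerializableStatusSketch roundTrip(SerializableStatusSketch in) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(in);
      }
      try (ObjectInputStream oin =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SerializableStatusSketch) oin.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }
}
```

Converting at the boundary keeps the non-serializable type out of any closure or collected result, which is where the serialization failure would otherwise surface.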






[GitHub] [hudi] parisni commented on issue #7117: [SUPPORT] parquet bloom filters not supported by hudi

2023-05-14 Thread via GitHub


parisni commented on issue #7117:
URL: https://github.com/apache/hudi/issues/7117#issuecomment-1547017607

   Hudi is able to benefit from parquet files written with bloom filters 
(tested by replacing the Hudi parquet files with ones written by vanilla 
Spark: the Hudi datasource then triggers the bloom filter).
   
   Digging into the source code, I guess the reason bloom filters are not 
taken into consideration is in [Hudi's parquetWriter 
wrapper](https://github.com/apache/hudi/blob/67ae0c8e7e4e58454cce18a8f58bfa43f67c1183/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieBaseParquetWriter.java#L49-L59).
 It calls [the parquetWriter public 
constructor](https://github.com/apache/parquet-mr/blob/cac8f7cf55b390c2ac5ef5d14a6aa72597b99284/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java#L231-L236),
 which has very limited parquet feature support. [There is a more complete 
constructor](https://github.com/apache/parquet-mr/blob/cac8f7cf55b390c2ac5ef5d14a6aa72597b99284/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java#L276),
 but sadly its access is limited to the package.
   
   The package-private constructor can be accessed by moving 
`HoodieBaseParquetWriter` to the `org.apache.parquet.hadoop` package, but 
then `ParquetWriter` also has to be present in the same jar (a common package 
cannot be spread over multiple jars).
   
   A better option would be for parquet to provide more suitable 
constructors. Or am I missing something?





[GitHub] [hudi] hudi-bot commented on pull request #7998: [HUDI-5824] Fix: do not combine if write operation is Upsert and COMBINE_BEFORE_UPSERT is false

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7998:
URL: https://github.com/apache/hudi/pull/7998#issuecomment-1547016098

   
   ## CI report:
   
   * 27d61f01fb6709e3aaa08de9ace7738dbedffb24 UNKNOWN
   * b572d737ef10724f71642084c0edf9a9a26540cc UNKNOWN
   * a44c71610c2efd1ebdb1a19c5195f8b1b5e59df7 UNKNOWN
   * 93db1f02dc47c597f6ce7708e98ef943d50a1206 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17058)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546998716

   
   ## CI report:
   
   * 7f2b79876e197fca26e764017ab793d902a3dce6 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17059)
 
   * 748d2c1eaefffba730a2fc71e437a25066d0cec5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17061)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546997511

   
   ## CI report:
   
   * e7016517a3bcad6a4bbcb8d9e5f4ddef350bd9e6 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17057)
 
   * 7f2b79876e197fca26e764017ab793d902a3dce6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17059)
 
   * 748d2c1eaefffba730a2fc71e437a25066d0cec5 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7998: [HUDI-5824] Fix: do not combine if write operation is Upsert and COMBINE_BEFORE_UPSERT is false

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7998:
URL: https://github.com/apache/hudi/pull/7998#issuecomment-1546986111

   
   ## CI report:
   
   * 27d61f01fb6709e3aaa08de9ace7738dbedffb24 UNKNOWN
   * b572d737ef10724f71642084c0edf9a9a26540cc UNKNOWN
   * a44c71610c2efd1ebdb1a19c5195f8b1b5e59df7 UNKNOWN
   * dfdd33316d71f2866b3052f45b3328e30678f1a3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17044)
 
   * 93db1f02dc47c597f6ce7708e98ef943d50a1206 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17058)
 
   
   



[GitHub] [hudi] xushiyan merged pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


xushiyan merged PR #7865:
URL: https://github.com/apache/hudi/pull/7865





[hudi] branch master updated (a8b1fa33ead -> 67ae0c8e7e4)

2023-05-14 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository.

xushiyan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from a8b1fa33ead [HUDI-4630] Fix hudi-utilities test's transformer 
misconfig (#8705)
 add 67ae0c8e7e4 [HUDI-5710] Load all partitions in advance for clean when 
MDT is enabled (#7865)

No new revisions were added by this update.

Summary of changes:
 .../hudi/client/utils/MetadataTableUtils.java  | 44 ++
 .../hudi/table/action/clean/CleanPlanner.java  |  8 
 .../action/savepoint/SavepointActionExecutor.java  | 19 +-
 .../table/view/AbstractTableFileSystemView.java| 11 ++
 .../table/view/PriorityBasedFileSystemView.java|  5 +++
 .../view/RemoteHoodieTableFileSystemView.java  | 12 ++
 .../common/table/view/TableFileSystemView.java |  6 +++
 .../hudi/timeline/service/RequestHandler.java  |  7 
 .../service/handlers/FileSliceHandler.java |  5 +++
 9 files changed, 99 insertions(+), 18 deletions(-)
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/MetadataTableUtils.java



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546984391

   
   ## CI report:
   
   * e7016517a3bcad6a4bbcb8d9e5f4ddef350bd9e6 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17057)
 
   * 7f2b79876e197fca26e764017ab793d902a3dce6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17059)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7998: [HUDI-5824] Fix: do not combine if write operation is Upsert and COMBINE_BEFORE_UPSERT is false

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7998:
URL: https://github.com/apache/hudi/pull/7998#issuecomment-1546984236

   
   ## CI report:
   
   * 27d61f01fb6709e3aaa08de9ace7738dbedffb24 UNKNOWN
   * b572d737ef10724f71642084c0edf9a9a26540cc UNKNOWN
   * a44c71610c2efd1ebdb1a19c5195f8b1b5e59df7 UNKNOWN
   * dfdd33316d71f2866b3052f45b3328e30678f1a3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17044)
 
   * 93db1f02dc47c597f6ce7708e98ef943d50a1206 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546984187

   
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17054)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546974840

   
   ## CI report:
   
   * b13036afd0be130692c62ea29d49e2657151ef96 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17055)
 
   * e7016517a3bcad6a4bbcb8d9e5f4ddef350bd9e6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17057)
 
   * 7f2b79876e197fca26e764017ab793d902a3dce6 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546973446

   
   ## CI report:
   
   * b13036afd0be130692c62ea29d49e2657151ef96 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17055)
 
   * e7016517a3bcad6a4bbcb8d9e5f4ddef350bd9e6 UNKNOWN
   
   



[hudi] branch master updated: [HUDI-4630] Fix hudi-utilities test's transformer misconfig (#8705)

2023-05-14 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository.

xushiyan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new a8b1fa33ead [HUDI-4630] Fix hudi-utilities test's transformer 
misconfig (#8705)
a8b1fa33ead is described below

commit a8b1fa33eadf27fa89b02db860094339f104c709
Author: Shiyan Xu <2701446+xushi...@users.noreply.github.com>
AuthorDate: Mon May 15 01:23:43 2023 +0800

[HUDI-4630] Fix hudi-utilities test's transformer misconfig (#8705)
---
 .../utilities/deltastreamer/TestHoodieMultiTableDeltaStreamer.java   | 5 +++--
 .../delta-streamer-config/short_trip_uber_config.properties  | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieMultiTableDeltaStreamer.java b/hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieMultiTableDeltaStreamer.java
index 4d6235779a1..8baa4ddb431 100644
--- a/hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieMultiTableDeltaStreamer.java
+++ b/hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieMultiTableDeltaStreamer.java
@@ -256,12 +256,13 @@ public class TestHoodieMultiTableDeltaStreamer extends HoodieDeltaStreamerTestBa
   String tableLevelKeyGeneratorClass = tableExecutionContext.getProperties().getString(DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME().key());
   assertEquals(TestHoodieDeltaStreamer.TestTableLevelGenerator.class.getName(), tableLevelKeyGeneratorClass);
   List transformerClass = tableExecutionContext.getConfig().transformerClassNames;
-  assertEquals("org.apache.hudi.utilities.transform.SqlFileBasedTransformer", transformerClass.get(0)); // HUDI-4630
+  assertEquals(1, transformerClass.size());
+  assertEquals("org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer$TestIdentityTransformer", transformerClass.get(0));
   break;
 default:
   String defaultKeyGeneratorClass = tableExecutionContext.getProperties().getString(DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME().key());
   assertEquals(TestHoodieDeltaStreamer.TestGenerator.class.getName(), defaultKeyGeneratorClass);
-  assertNull(tableExecutionContext.getConfig().transformerClassNames); //HUDI-4630
+  assertNull(tableExecutionContext.getConfig().transformerClassNames);
   }
 });
   }
diff --git a/hudi-utilities/src/test/resources/delta-streamer-config/short_trip_uber_config.properties b/hudi-utilities/src/test/resources/delta-streamer-config/short_trip_uber_config.properties
index 370826b8949..25b392d580a 100644
--- a/hudi-utilities/src/test/resources/delta-streamer-config/short_trip_uber_config.properties
+++ b/hudi-utilities/src/test/resources/delta-streamer-config/short_trip_uber_config.properties
@@ -25,4 +25,4 @@ hoodie.datasource.hive_sync.table=short_trip_uber_hive_dummy_table
 hoodie.datasource.write.keygenerator.class=org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer$TestTableLevelGenerator
 hoodie.deltastreamer.schemaprovider.registry.baseUrl=http://localhost:8081/subjects/
 hoodie.deltastreamer.schemaprovider.registry.urlSuffix=-value/versions/latest
-hoodie.deltastreamer.transformer.class=org.apache.hudi.utilities.transform.SqlFileBasedTransformer
+hoodie.deltastreamer.transformer.class=org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer$TestIdentityTransformer



[GitHub] [hudi] xushiyan merged pull request #8705: [HUDI-4630] Fix hudi-utilities test's transformer misconfig

2023-05-14 Thread via GitHub


xushiyan merged PR #8705:
URL: https://github.com/apache/hudi/pull/8705





[GitHub] [hudi] xushiyan commented on pull request #8705: [HUDI-4630] Fix hudi-utilities test's transformer misconfig

2023-05-14 Thread via GitHub


xushiyan commented on PR #8705:
URL: https://github.com/apache/hudi/pull/8705#issuecomment-1546952826

   https://github.com/apache/hudi/assets/2701446/8736a751-e931-4f62-af7a-d5e5283fcd77
   





[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546944779

   
   ## CI report:
   
   * b13036afd0be130692c62ea29d49e2657151ef96 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17055)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546926641

   
   ## CI report:
   
   * 07e889c9e6cca2b26677a623e3e3d4d4467aed8c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16366)
 
   * b13036afd0be130692c62ea29d49e2657151ef96 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17055)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8445:
URL: https://github.com/apache/hudi/pull/8445#issuecomment-1546917748

   
   ## CI report:
   
   * 07e889c9e6cca2b26677a623e3e3d4d4467aed8c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16366)
 
   * b13036afd0be130692c62ea29d49e2657151ef96 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8705: [HUDI-4630] Fix hudi-utilities test's transformer misconfig

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8705:
URL: https://github.com/apache/hudi/pull/8705#issuecomment-1546916445

   
   ## CI report:
   
   * 0a2e85bcef904d693083f20dafc9215328913a87 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17053)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546916162

   
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051)
 
   * d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17054)
 
   
   



[GitHub] [hudi] parisni commented on pull request #8657: [HUDI-6150] Support bucketing for each hive client

2023-05-14 Thread via GitHub


parisni commented on PR #8657:
URL: https://github.com/apache/hudi/pull/8657#issuecomment-1546907688

   I dug a bit into the Spark Murmur3 implementation. It is not standard, for 
at least two reasons:
   1. They use a hardcoded seed = 42 (which likely would not be the same as 
Hive's).
   2. [They state that their way of dealing with murmur is not 
standard](https://github.com/apache/spark/blob/b23185080cc3e5a00b88496cec70c2b3cd7019f5/common/unsafe/src/main/java/org/apache/spark/unsafe/hash/Murmur3_x86_32.java#L67-L68).
 There is [an issue about 
this](https://issues.apache.org/jira/browse/SPARK-23381), and another 
implementation (`hashUnsafeBytes2`) exists, but it is not used so far.
   
   So I am not sure we could use [Guava murmur3 as 
is](https://guava.dev/releases/23.0/api/docs/com/google/common/hash/Hashing.html#murmur3_32-int-).
   
   The Spark 3 implementation is based on Catalyst expressions, while in Hudi 
we work with Java types. If we want to use their implementation, we would have 
to import spark-unsafe as a dependency in hudi-client-common. We could also 
copy their implementation into Hudi and maintain it. However, in both cases we 
would have to convert basic Java types into Catalyst types to be able to reuse 
the Spark implementation (see 
https://github.com/apache/spark/blob/v3.4.0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L523-L596).
 I am not sure it is good design to introduce Spark concepts into 
hudi-client-common.
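
To make the two differences concrete, here is a minimal, dependency-free Java sketch of 32-bit Murmur3. It is an illustration only, not code from Spark, Guava, or Hudi: the class `Murmur3Sketch` and the helper `bucketId` are hypothetical names, `sparkStyleHash` reflects my reading of the linked `hashUnsafeBytes` (each trailing byte is mixed in as its own block), and `Math.floorMod` is assumed to stand in for a positive modulo over the bucket count.

```java
import java.nio.charset.StandardCharsets;

public class Murmur3Sketch {
    public static int mixK1(int k1) {
        k1 *= 0xcc9e2d51;
        k1 = Integer.rotateLeft(k1, 15);
        k1 *= 0x1b873593;
        return k1;
    }

    public static int mixH1(int h1, int k1) {
        h1 ^= k1;
        h1 = Integer.rotateLeft(h1, 13);
        return h1 * 5 + 0xe6546b64;
    }

    public static int fmix(int h1, int length) {
        h1 ^= length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // Mixes the 4-byte-aligned prefix, reading each block as a little-endian int.
    public static int hashAligned(byte[] data, int alignedLen, int seed) {
        int h1 = seed;
        for (int i = 0; i < alignedLen; i += 4) {
            int k1 = (data[i] & 0xff)
                | ((data[i + 1] & 0xff) << 8)
                | ((data[i + 2] & 0xff) << 16)
                | ((data[i + 3] & 0xff) << 24);
            h1 = mixH1(h1, mixK1(k1));
        }
        return h1;
    }

    // Reference Murmur3 tail handling: the 1-3 trailing bytes form one partial block.
    public static int standardHash(byte[] data, int seed) {
        int aligned = data.length - data.length % 4;
        int h1 = hashAligned(data, aligned, seed);
        int k1 = 0;
        for (int i = data.length - 1; i >= aligned; i--) {
            k1 = (k1 << 8) | (data[i] & 0xff);
        }
        if (aligned < data.length) {
            h1 ^= mixK1(k1);
        }
        return fmix(h1, data.length);
    }

    // Assumed Spark-style tail handling: every trailing (sign-extended) byte is a full block.
    public static int sparkStyleHash(byte[] data, int seed) {
        int aligned = data.length - data.length % 4;
        int h1 = hashAligned(data, aligned, seed);
        for (int i = aligned; i < data.length; i++) {
            h1 = mixH1(h1, mixK1(data[i]));
        }
        return fmix(h1, data.length);
    }

    // Hypothetical helper: maps a possibly negative hash to a bucket in [0, numBuckets).
    public static int bucketId(int hash, int numBuckets) {
        return Math.floorMod(hash, numBuckets);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"abcd", "abcde"}) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            int standard = standardHash(bytes, 42);
            int sparkStyle = sparkStyleHash(bytes, 42);
            System.out.println(key + ": standard bucket=" + bucketId(standard, 8)
                + ", spark-style bucket=" + bucketId(sparkStyle, 8));
        }
    }
}
```

The two variants agree whenever the input length is a multiple of 4 and can diverge otherwise, so string keys are where bucket assignments from a standard library and from Spark would be expected to differ, on top of any difference in seed.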





[GitHub] [hudi] hudi-bot commented on pull request #8683: [HUDI-5533] Support spark columns comments

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8683:
URL: https://github.com/apache/hudi/pull/8683#issuecomment-1546904147

   
   ## CI report:
   
   * 7bdb94998ee2853e15de0b4ce6c20735f43a0f5c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17006)
 
   * 8d6893fd9daf07c30524474cf9a4d39c66a37cba UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546903848

   
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051)
 
   * d26c15f3bf8ad3d9097b46d4ab663849b3ec50d3 UNKNOWN
   
   



[GitHub] [hudi] parisni commented on pull request #8683: [HUDI-5533] Support spark columns comments

2023-05-14 Thread via GitHub


parisni commented on PR #8683:
URL: https://github.com/apache/hudi/pull/8683#issuecomment-1546895581

   > Does Option.ofNullable work correctly here?
   
   Scala's Option does not have ofNullable (Java's Optional does). BTW, 
`Option(value)` is equivalent to what you suggest, and I have corrected it.
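
For reference, a minimal Java sketch of the null handling that `Optional.ofNullable` provides; Scala's `Option(value)` behaves analogously by mapping `null` to `None`. The class `OptionalNullSketch` and the method `wrap` are hypothetical names used only for illustration.

```java
import java.util.Optional;

public class OptionalNullSketch {
    // Wraps a possibly-null column comment; null becomes Optional.empty().
    public static Optional<String> wrap(String comment) {
        return Optional.ofNullable(comment);
    }

    public static void main(String[] args) {
        System.out.println(wrap(null).isPresent());
        System.out.println(wrap("user id").isPresent());
        try {
            // Optional.of, unlike ofNullable, rejects null outright.
            Optional.of((String) null);
        } catch (NullPointerException e) {
            System.out.println("Optional.of rejects null");
        }
    }
}
```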





[GitHub] [hudi] hudi-bot commented on pull request #8705: [HUDI-4630] Fix hudi-utilities test's transformer misconfig

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8705:
URL: https://github.com/apache/hudi/pull/8705#issuecomment-1546893783

   
   ## CI report:
   
   * 0a2e85bcef904d693083f20dafc9215328913a87 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17053)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8705: [HUDI-4630] Fix hudi-utilities test's transformer misconfig

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8705:
URL: https://github.com/apache/hudi/pull/8705#issuecomment-1546892274

   
   ## CI report:
   
   * 0a2e85bcef904d693083f20dafc9215328913a87 UNKNOWN
   
   



[GitHub] [hudi] xushiyan commented on a diff in pull request #8399: [HUDI-4630] Add transformer capability to individual feeds in MultiTableDeltaStreamer

2023-05-14 Thread via GitHub


xushiyan commented on code in PR #8399:
URL: https://github.com/apache/hudi/pull/8399#discussion_r1193140118


##
hudi-utilities/src/test/resources/delta-streamer-config/short_trip_uber_config.properties:
##
@@ -25,3 +25,4 @@ hoodie.datasource.hive_sync.table=short_trip_uber_hive_dummy_table
 hoodie.datasource.write.keygenerator.class=org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer$TestTableLevelGenerator
 hoodie.deltastreamer.schemaprovider.registry.baseUrl=http://localhost:8081/subjects/
 hoodie.deltastreamer.schemaprovider.registry.urlSuffix=-value/versions/latest
+hoodie.deltastreamer.transformer.class=org.apache.hudi.utilities.transform.SqlFileBasedTransformer

Review Comment:
   This is a misconfig: this transformer needs additional config to work 
properly, without which it breaks table creation in the failing test cases. 
Fixing it in https://github.com/apache/hudi/pull/8705
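
As a rough illustration of the kind of misconfig described here, the following Java sketch checks a properties set for a transformer that is configured without its companion setting. This is a hypothetical helper, not part of Hudi: the class `TransformerConfigCheck` and the method `missingKeys` are invented for the example, and the key `hoodie.deltastreamer.transformer.sql.file` is an assumption about which extra setting `SqlFileBasedTransformer` expects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class TransformerConfigCheck {
    // Key taken from the properties file in the diff above.
    public static final String TRANSFORMER_CLASS_KEY = "hoodie.deltastreamer.transformer.class";
    // Assumed companion key; verify against the Hudi version in use.
    public static final String SQL_FILE_KEY = "hoodie.deltastreamer.transformer.sql.file";
    public static final String SQL_FILE_TRANSFORMER =
        "org.apache.hudi.utilities.transform.SqlFileBasedTransformer";

    // Returns the keys that the configured transformer needs but the properties lack.
    public static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        String transformer = props.getProperty(TRANSFORMER_CLASS_KEY);
        if (SQL_FILE_TRANSFORMER.equals(transformer) && props.getProperty(SQL_FILE_KEY) == null) {
            missing.add(SQL_FILE_KEY);
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(TRANSFORMER_CLASS_KEY, SQL_FILE_TRANSFORMER);
        System.out.println("Missing keys: " + missingKeys(props));
    }
}
```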






[GitHub] [hudi] xushiyan closed pull request #8704: Revert [HUDI-4630] to fix CI failing.

2023-05-14 Thread via GitHub


xushiyan closed pull request #8704: Revert [HUDI-4630] to fix CI failing.
URL: https://github.com/apache/hudi/pull/8704





[GitHub] [hudi] xushiyan commented on pull request #8704: Revert [HUDI-4630] to fix CI failing.

2023-05-14 Thread via GitHub


xushiyan commented on PR #8704:
URL: https://github.com/apache/hudi/pull/8704#issuecomment-1546890960

   fixing the test itself in https://github.com/apache/hudi/pull/8705





[GitHub] [hudi] xushiyan opened a new pull request, #8705: [HUDI-4630] Fix transformer misconfig

2023-05-14 Thread via GitHub


xushiyan opened a new pull request, #8705:
URL: https://github.com/apache/hudi/pull/8705

   ### Change Logs
   
   Fix misconfig in 
https://github.com/apache/hudi/pull/8399/files#diff-cead696b47f975856c3053a840d25202e8e2964fea26b9ab0060c7f5b4e39b4aR28
 that caused CI failure
   
   ### Impact
   
   Fix CI.
   
   ### Risk level
   
   None
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546881551

   
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8704: Revert [HUDI-4630] to fix CI failing.

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8704:
URL: https://github.com/apache/hudi/pull/8704#issuecomment-1546870051

   
   ## CI report:
   
   * 47eb9c99c9e054b11661409771306fb7f838151b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17052)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8704: Revert [HUDI-4630] to fix CI failing.

2023-05-14 Thread via GitHub


hudi-bot commented on PR #8704:
URL: https://github.com/apache/hudi/pull/8704#issuecomment-1546868565

   
   ## CI report:
   
   * 47eb9c99c9e054b11661409771306fb7f838151b UNKNOWN
   
   



[GitHub] [hudi] zhangyue19921010 commented on pull request #8704: Revert [HUDI-4630] to fix CI failing.

2023-05-14 Thread via GitHub


zhangyue19921010 commented on PR #8704:
URL: https://github.com/apache/hudi/pull/8704#issuecomment-1546867803

   Hi @xushiyan, **maybe** this is the root cause of the 
[HUDI-5710](https://github.com/apache/hudi/pull/7865) CI failure.
   Would you mind taking a look?
   CC [yesemsanthoshkumar](https://github.com/yesemsanthoshkumar)





[GitHub] [hudi] zhangyue19921010 opened a new pull request, #8704: Revert [HUDI-4630] to fix CI failing.

2023-05-14 Thread via GitHub


zhangyue19921010 opened a new pull request, #8704:
URL: https://github.com/apache/hudi/pull/8704

   Original PR: "Add transformer capability to individual feeds in… 
MultiTableDeltaStreamer (#8399)"
   
   This reverts commit b497ef1a3f09c50bca889eeb457be70f1c6544c6.
   
   Master CI keeps failing, possibly because of this PR: 
https://github.com/apache/hudi/pull/8399
   
   





[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546857667

   
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039)
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17051)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7865: [HUDI-5710] Load all partitions in advance for clean when MDT is enabled

2023-05-14 Thread via GitHub


hudi-bot commented on PR #7865:
URL: https://github.com/apache/hudi/pull/7865#issuecomment-1546856358

   
   ## CI report:
   
   * 59c457e89bef1b404627f9b3700d65235044387c UNKNOWN
   * bb88f6ebfc9ac76a3789073190c7cc5c21fc1d80 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17039)
   * bcc1a4521c3d16a0e391a79428ce5efed8d88687 UNKNOWN
   
   



[GitHub] [hudi] xushiyan closed pull request #7143: [HUDI-5175] Improving FileIndex load performance in PARALLELISM mode

2023-05-14 Thread via GitHub


xushiyan closed pull request #7143: [HUDI-5175] Improving FileIndex load performance in PARALLELISM mode
URL: https://github.com/apache/hudi/pull/7143





[GitHub] [hudi] xushiyan commented on pull request #7143: [HUDI-5175] Improving FileIndex load performance in PARALLELISM mode

2023-05-14 Thread via GitHub


xushiyan commented on PR #7143:
URL: https://github.com/apache/hudi/pull/7143#issuecomment-1546825904

   Discussed with @zhangyue19921010: we can close this. With lazy loading, this 
change would not bring much improvement.

