Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-23 Thread via GitHub


nsivabalan merged PR #10342:
URL: https://github.com/apache/hudi/pull/10342


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-23 Thread via GitHub


nsivabalan commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1907165124

   https://github.com/apache/hudi/assets/513218/200599ac-5f88-431f-85c3-d7dc1efad863
   





Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1905071110

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * e83c0ad8df5632271dd53f03dedf5410d307e19e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22095)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1905063294

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * e83c0ad8df5632271dd53f03dedf5410d307e19e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22095)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


the-other-tim-brown commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1905025827

   @hudi-bot run azure
   





Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1904740395

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * e83c0ad8df5632271dd53f03dedf5410d307e19e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22095)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1904575693

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 75dfe589da0f996e5c9e114bb09e03314fe6e1b8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22075)
 
   * e83c0ad8df5632271dd53f03dedf5410d307e19e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22095)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1904562869

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 75dfe589da0f996e5c9e114bb09e03314fe6e1b8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22075)
 
   * e83c0ad8df5632271dd53f03dedf5410d307e19e UNKNOWN
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


the-other-tim-brown commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1462190319


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/InputBatch.java:
##
@@ -55,7 +55,7 @@ public SchemaProvider getSchemaProvider() {
 if (batch.isPresent() && schemaProvider == null) {
   throw new HoodieException("Please provide a valid schema provider class!");
 }
-return Option.ofNullable(schemaProvider).orElse(new NullSchemaProvider());

Review Comment:
   yes, updating now






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-22 Thread via GitHub


nsivabalan commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1462179473


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/InputBatch.java:
##
@@ -55,7 +55,7 @@ public SchemaProvider getSchemaProvider() {
 if (batch.isPresent() && schemaProvider == null) {
   throw new HoodieException("Please provide a valid schema provider class!");
 }
-return Option.ofNullable(schemaProvider).orElse(new NullSchemaProvider());

Review Comment:
   Minor: do you think we can define this statically once and reuse it, rather 
than creating a new object every time?
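
One way to read the suggestion above: if the fallback provider carries no state, a single shared instance can replace each `new` call. This is a minimal sketch under that assumption. `Provider`, `NullProvider`, and `orFallback` are hypothetical stand-ins for illustration, not Hudi's actual classes or methods.

```java
final class SharedFallbackSketch {
  // Hypothetical stand-in for a schema provider; not Hudi's SchemaProvider.
  interface Provider {
    String sourceSchema();
  }

  // Stateless fallback, so one instance can be shared safely by all callers.
  static final class NullProvider implements Provider {
    static final NullProvider INSTANCE = new NullProvider();

    private NullProvider() {
    }

    @Override
    public String sourceSchema() {
      return null;
    }
  }

  // Returns the configured provider, or the shared fallback when none is set.
  static Provider orFallback(Provider configured) {
    return configured != null ? configured : NullProvider.INSTANCE;
  }

  public static void main(String[] args) {
    // Both calls return the same object, so no allocation happens per call.
    System.out.println(orFallback(null) == orFallback(null)); // prints "true"
  }
}
```

The private constructor keeps callers from creating extra instances, which is what makes the identity check in `main` hold.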






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-19 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1900958834

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 75dfe589da0f996e5c9e114bb09e03314fe6e1b8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22075)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-19 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1900720862

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 35c5681b0d1af9da97fc15f3befc3345ef3b5bd0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21747)
 
   * 75dfe589da0f996e5c9e114bb09e03314fe6e1b8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22075)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2024-01-19 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1900708923

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 35c5681b0d1af9da97fc15f3befc3345ef3b5bd0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21747)
 
   * 75dfe589da0f996e5c9e114bb09e03314fe6e1b8 UNKNOWN
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-28 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1871442674

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 35c5681b0d1af9da97fc15f3befc3345ef3b5bd0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21747)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-28 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1871338874

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 272e12256eb0187dc12e43c7ce371967d9c0f8ae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21735)
 
   * 35c5681b0d1af9da97fc15f3befc3345ef3b5bd0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21747)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-28 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-187129

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 272e12256eb0187dc12e43c7ce371967d9c0f8ae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21735)
 
   * 35c5681b0d1af9da97fc15f3befc3345ef3b5bd0 UNKNOWN
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-28 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1870989929

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 272e12256eb0187dc12e43c7ce371967d9c0f8ae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21735)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-27 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1870872297

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * a774c2a1efccf6012b20f0a94b44f8b3ae4cdbbe Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21728)
 
   * 272e12256eb0187dc12e43c7ce371967d9c0f8ae Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21735)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-27 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1870868450

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * a774c2a1efccf6012b20f0a94b44f8b3ae4cdbbe Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21728)
 
   * 272e12256eb0187dc12e43c7ce371967d9c0f8ae UNKNOWN
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-27 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1870840422

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * a774c2a1efccf6012b20f0a94b44f8b3ae4cdbbe Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21728)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-27 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1870798249

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 3f0829263192c35ae636e707106a97d7c0142ff7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21555)
 
   * a774c2a1efccf6012b20f0a94b44f8b3ae4cdbbe Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21728)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-27 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1870795525

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 3f0829263192c35ae636e707106a97d7c0142ff7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21555)
 
   * a774c2a1efccf6012b20f0a94b44f8b3ae4cdbbe UNKNOWN
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


the-other-tim-brown commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429453468


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
* this constraint.
*/
   private void setupWriteClient(Option> recordsOpt) 
throws IOException {
-if ((null != schemaProvider)) {
+if (null != schemaProvider) {
   Schema sourceSchema = schemaProvider.getSourceSchema();
-  Schema targetSchema = schemaProvider.getTargetSchema();
+  Schema targetSchema = 
getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   `getHoodieClientConfig` is also called from the constructor so you need to 
consider that path too if you are looking simply at paths that can eventually 
hit this `getSchemaForWriteConfig` method. I've updated the code so that there 
is no repeated call anymore and the call from the constructor avoids a 
potential schema lookup entirely.






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


nsivabalan commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429412658


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
* this constraint.
*/
   private void setupWriteClient(Option> recordsOpt) 
throws IOException {
-if ((null != schemaProvider)) {
+if (null != schemaProvider) {
   Schema sourceSchema = schemaProvider.getSourceSchema();
-  Schema targetSchema = schemaProvider.getTargetSchema();
+  Schema targetSchema = 
getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1008
 
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1014
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1049
 
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1076C26-L1076C49
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L
   






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


nsivabalan commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429412658


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
* this constraint.
*/
   private void setupWriteClient(Option> recordsOpt) 
throws IOException {
-if ((null != schemaProvider)) {
+if (null != schemaProvider) {
   Schema sourceSchema = schemaProvider.getSourceSchema();
-  Schema targetSchema = schemaProvider.getTargetSchema();
+  Schema targetSchema = 
getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1008
 
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1049
 
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L1076C26-L1076C49
   
   
https://github.com/apache/hudi/blob/50f0d9f3baeafe92c38dfb0b59b337d83391ea42/hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java#L
   






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859281330

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 3f0829263192c35ae636e707106a97d7c0142ff7 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21555)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859254794

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 7c3ea778cc509ea71d9b837d4b228bb07abf18b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21544)
 
   * 3f0829263192c35ae636e707106a97d7c0142ff7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21555)
 
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859253145

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 7c3ea778cc509ea71d9b837d4b228bb07abf18b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21544)
 
   * 3f0829263192c35ae636e707106a97d7c0142ff7 UNKNOWN
   
   



Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


the-other-tim-brown commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429243393


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -1010,8 +1008,9 @@ private void reInitWriteClient(Schema sourceSchema, 
Schema targetSchema, Option<
 if (HoodieStreamerUtils.isDropPartitionColumns(props)) {
   targetSchema = HoodieAvroUtils.removeFields(targetSchema, 
HoodieStreamerUtils.getPartitionColumns(props));
 }
-registerAvroSchemas(sourceSchema, targetSchema);
-final HoodieWriteConfig initialWriteConfig = 
getHoodieClientConfig(targetSchema);
+final Pair initialWriteConfigAndSchema = 
getHoodieClientConfigAndWriterSchema(targetSchema, true);
+final HoodieWriteConfig initialWriteConfig = 
initialWriteConfigAndSchema.getLeft();
+registerAvroSchemas(sourceSchema, initialWriteConfigAndSchema.getRight());

Review Comment:
   The ordering was wrong previously. In this edge case, we would not register 
the Avro schema for the actual target schema when falling back to reading it 
from the commits.
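
The ordering point above can be sketched as "resolve first, register second". This is only an illustration under stated assumptions: schemas are modelled as plain strings, and `resolveWriterSchema`, `initWriteClient`, and the `registered` list are hypothetical names, not the actual `StreamSync` code.

```java
import java.util.ArrayList;
import java.util.List;

final class RegisterAfterResolveSketch {
  // Hypothetical record of which schemas were registered, in order.
  static final List<String> registered = new ArrayList<>();

  // Hypothetical fallback: use the provided target schema when present,
  // otherwise the schema read back from the latest commit.
  static String resolveWriterSchema(String provided, String fromLatestCommit) {
    return provided != null ? provided : fromLatestCommit;
  }

  // Resolves the writer schema before registering, so the schema that is
  // registered is the one that will actually be written.
  static String initWriteClient(String source, String target, String fromLatestCommit) {
    String writerSchema = resolveWriterSchema(target, fromLatestCommit);
    registered.add(source);
    registered.add(writerSchema);
    return writerSchema;
  }

  public static void main(String[] args) {
    String writer = initWriteClient("source-schema", null, "committed-schema");
    System.out.println(writer + " registered=" + registered.contains(writer));
  }
}
```

Registering before resolving would have recorded the missing target schema rather than the fallback, which is the mismatch the comment describes.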






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


the-other-tim-brown commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429241599


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
* this constraint.
*/
   private void setupWriteClient(Option> recordsOpt) 
throws IOException {
-if ((null != schemaProvider)) {
+if (null != schemaProvider) {
   Schema sourceSchema = schemaProvider.getSourceSchema();
-  Schema targetSchema = schemaProvider.getTargetSchema();
+  Schema targetSchema = 
getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   @nsivabalan can you link the line where we make the call? I am missing 
something or have some outdated code somehow.






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


nsivabalan commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429238447


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
    * this constraint.
    */
   private void setupWriteClient(Option> recordsOpt) throws IOException {
-    if ((null != schemaProvider)) {
+    if (null != schemaProvider) {
       Schema sourceSchema = schemaProvider.getSourceSchema();
-      Schema targetSchema = schemaProvider.getTargetSchema();
+      Schema targetSchema = getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   We do make the call: 
   ```
   reInitWriteClient {
     final HoodieWriteConfig initialWriteConfig = getHoodieClientConfig(targetSchema);
   }
   ```
   
   Within `getHoodieClientConfig`, we call: 
   ```
   if (schema != null) {
     builder.withSchema(getSchemaForWriteConfig(schema).toString());
   }
   ```
   
   But I agree that if we have already called `getSchemaForWriteConfig` and 
fixed the target schema, the second call will be a no-op. I was just pointing 
out that we are making repeated calls. 
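
To illustrate why the repeated call is harmless, here is a minimal sketch of a fallback that does nothing when applied a second time. It is not the Hudi implementation: plain strings stand in for Avro `Schema` objects, and the class name `SchemaFallbackSketch`, the `NULL_SCHEMA` marker, and the `latestCommitSchema` supplier are assumptions made for illustration only.

```java
import java.util.function.Supplier;

class SchemaFallbackSketch {
  // Hypothetical marker for "no usable schema"; not the real Hudi constant.
  static final String NULL_SCHEMA = "null";

  // Stand-in for getSchemaForWriteConfig: keep a usable target schema,
  // otherwise fall back to the schema recorded by the latest commit.
  static String getSchemaForWriteConfig(String targetSchema, Supplier<String> latestCommitSchema) {
    if (targetSchema == null || NULL_SCHEMA.equals(targetSchema)) {
      return latestCommitSchema.get();
    }
    return targetSchema;
  }

  public static void main(String[] args) {
    Supplier<String> fromCommits = () -> "schema-from-commits";
    // First call resolves the missing schema from the commits.
    String once = getSchemaForWriteConfig(null, fromCommits);
    // Second call sees a usable schema and returns it unchanged.
    String twice = getSchemaForWriteConfig(once, fromCommits);
    System.out.println(once.equals(twice));
  }
}
```

In this sketch the second call never touches the supplier, so the only cost of calling it from both the caller and the config builder is redundancy, not a second read.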
   
   






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


the-other-tim-brown commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429204766


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
    * this constraint.
    */
   private void setupWriteClient(Option> recordsOpt) throws IOException {
-    if ((null != schemaProvider)) {
+    if (null != schemaProvider) {
       Schema sourceSchema = schemaProvider.getSourceSchema();
-      Schema targetSchema = schemaProvider.getTargetSchema();
+      Schema targetSchema = getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   We don't call `getSchemaForWriteConfig` from `reInitWriteClient()`. There is 
a call to `getHoodieClientConfig` that takes in a schema that must be non-null 
to hit the `getSchemaForWriteConfig` path though. 
   
   I think one way to simplify this is to make this call always happen if the 
schema is required in the config. When we initialize the StreamSync class, we 
pass in some schema, or potentially null, but I don't think we even need the 
writer schema in that case, and I'd like to avoid a filesystem read to fetch 
the latest commit schema where possible. If you agree, I can clean this up so 
that `getHoodieClientConfig` takes in a schema and a boolean indicating whether 
the schema must be set in the config.
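
A rough sketch of what that signature could look like, under stated assumptions: strings stand in for Avro `Schema`, a map stands in for `HoodieWriteConfig`, and the class name `ClientConfigSketch`, the `requireSchemaInConfig` parameter name, the `"schema"` key, and the `latestCommitSchema` supplier (a stand-in for the filesystem read) are all hypothetical rather than taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ClientConfigSketch {
  // Hypothetical stand-in for reading the latest commit schema from the filesystem.
  private final Supplier<String> latestCommitSchema;

  ClientConfigSketch(Supplier<String> latestCommitSchema) {
    this.latestCommitSchema = latestCommitSchema;
  }

  // Stand-in for getSchemaForWriteConfig: fall back to the commit schema when none is given.
  public String getSchemaForWriteConfig(String schema) {
    return schema != null ? schema : latestCommitSchema.get();
  }

  // Stand-in for getHoodieClientConfig(schema, flag): the fallback read only
  // happens when the caller says the schema must be present in the config.
  public Map<String, String> getHoodieClientConfig(String schema, boolean requireSchemaInConfig) {
    Map<String, String> config = new HashMap<>();
    if (requireSchemaInConfig) {
      config.put("schema", getSchemaForWriteConfig(schema));
    } else if (schema != null) {
      config.put("schema", schema);
    }
    return config;
  }

  public static void main(String[] args) {
    ClientConfigSketch sketch = new ClientConfigSketch(() -> "schema-from-commits");
    // Initialization path: no schema needed, so no fallback read is triggered.
    System.out.println(sketch.getHoodieClientConfig(null, false));
    // Write path: schema required, so the fallback resolves it.
    System.out.println(sketch.getHoodieClientConfig(null, true));
  }
}
```

With this shape, the decision to pay for the fallback read sits with the caller, instead of being implied by whether the schema argument happens to be non-null.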






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-17 Thread via GitHub


nsivabalan commented on code in PR #10342:
URL: https://github.com/apache/hudi/pull/10342#discussion_r1429201396


##
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##
@@ -998,9 +996,9 @@ public void runMetaSync() {
    * this constraint.
    */
   private void setupWriteClient(Option> recordsOpt) throws IOException {
-    if ((null != schemaProvider)) {
+    if (null != schemaProvider) {
       Schema sourceSchema = schemaProvider.getSourceSchema();
-      Schema targetSchema = schemaProvider.getTargetSchema();
+      Schema targetSchema = getSchemaForWriteConfig(schemaProvider.getTargetSchema());

Review Comment:
   We call `reInitWriteClient` twice in this class. If we are fixing the target 
schema here to go through `getSchemaForWriteConfig`, should we fix the other 
caller as well? 
   Also, it looks like we are calling `getSchemaForWriteConfig` within 
`reInitWriteClient()`. Should we remove that call if the callers already take 
care of it? 






Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-16 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859029231

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 7c3ea778cc509ea71d9b837d4b228bb07abf18b2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21544)
 
   
   
   





Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-16 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859021797

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 7c3ea778cc509ea71d9b837d4b228bb07abf18b2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21544)
 
   
   
   





Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-16 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859020883

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   * 7c3ea778cc509ea71d9b837d4b228bb07abf18b2 UNKNOWN
   
   
   





Re: [PR] [HUDI-7237] Hudi Streamer: Handle edge case with null schema, minor cleanups [hudi]

2023-12-16 Thread via GitHub


hudi-bot commented on PR #10342:
URL: https://github.com/apache/hudi/pull/10342#issuecomment-1859014631

   
   ## CI report:
   
   * cb62ad9bed32bf3acc6f8227e5e824cb73e8f0e4 UNKNOWN
   
   
   

