Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]

2024-05-28 Thread via GitHub


danny0405 merged PR #11297:
URL: https://github.com/apache/hudi/pull/11297


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]

2024-05-28 Thread via GitHub


hudi-bot commented on PR #11297:
URL: https://github.com/apache/hudi/pull/11297#issuecomment-2136083376

   
   ## CI report:
   
   * b890368f0e4246fd7b9982181b8a3762a25e1249 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24047)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]

2024-05-27 Thread via GitHub


danny0405 commented on PR #11297:
URL: https://github.com/apache/hudi/pull/11297#issuecomment-2134144554

   > LGTM. In this case, using `queue` for `inputSplits` is better than `ArrayList`, right?
   
I didn't see any difference.
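
For context on the exchange above, here is a minimal sketch of the two approaches, using plain strings as stand-ins for `InputSplit` objects. The class and method names are hypothetical and do not come from the PR. Both variants drain the splits in the same order. The differences are that `List.remove(0)` on an `ArrayList` shifts the remaining elements on every call and throws on an empty list, while `Queue.poll()` returns `null` once the queue is empty. With only a handful of splits the cost difference is likely negligible, which is consistent with the reply.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical demo class; plain strings stand in for Flink InputSplit objects.
final class SplitContainerDemo {

  // Drains a List from the front, mirroring inputSplits.remove(0) in the PR.
  static List<String> drainList(String... splits) {
    List<String> pending = new ArrayList<>(Arrays.asList(splits));
    List<String> opened = new ArrayList<>();
    while (!pending.isEmpty()) {
      // remove(0) shifts every remaining element left: O(n) per call.
      opened.add(pending.remove(0));
    }
    return opened;
  }

  // Drains a Queue from the head, mirroring the suggested inputSplits.poll().
  static List<String> drainQueue(String... splits) {
    Queue<String> pending = new ArrayDeque<>(Arrays.asList(splits));
    List<String> opened = new ArrayList<>();
    String next;
    // poll() returns null once the queue is empty, so no isEmpty() check is needed.
    while ((next = pending.poll()) != null) {
      opened.add(next);
    }
    return opened;
  }

  public static void main(String[] args) {
    System.out.println(drainList("split-0", "split-1", "split-2"));
    System.out.println(drainQueue("split-0", "split-1", "split-2"));
  }
}
```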





Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]

2024-05-27 Thread via GitHub


beyond1920 commented on code in PR #11297:
URL: https://github.com/apache/hudi/pull/11297#discussion_r1616165704


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java:
##
@@ -49,15 +54,24 @@ public HoodieLookupTableReader(SerializableSupplier> inp
   public void open() throws IOException {
     this.inputFormat = inputFormatSupplier.get();
     inputFormat.configure(conf);
-    InputSplit[] inputSplits = inputFormat.createInputSplits(1);
+    this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1)).collect(Collectors.toList());

Review Comment:
   this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1))
       .collect(Collectors.toCollection(LinkedList::new));
   



##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java:
##
@@ -41,6 +44,8 @@ public class HoodieLookupTableReader implements Serializable {
 
   private InputFormat inputFormat;
 
+  private List inputSplits;

Review Comment:
   private Queue inputSplits;



##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java:
##
@@ -49,15 +54,24 @@ public HoodieLookupTableReader(SerializableSupplier> inp
   public void open() throws IOException {
     this.inputFormat = inputFormatSupplier.get();
     inputFormat.configure(conf);
-    InputSplit[] inputSplits = inputFormat.createInputSplits(1);
+    this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1)).collect(Collectors.toList());
     ((RichInputFormat) inputFormat).openInputFormat();
-    inputFormat.open(inputSplits[0]);
+    inputFormat.open(inputSplits.remove(0));

Review Comment:
   inputFormat.open(inputSplits.poll());



##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java:
##
@@ -49,15 +54,24 @@ public HoodieLookupTableReader(SerializableSupplier> inp
   public void open() throws IOException {
     this.inputFormat = inputFormatSupplier.get();
     inputFormat.configure(conf);
-    InputSplit[] inputSplits = inputFormat.createInputSplits(1);
+    this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1)).collect(Collectors.toList());
     ((RichInputFormat) inputFormat).openInputFormat();
-    inputFormat.open(inputSplits[0]);
+    inputFormat.open(inputSplits.remove(0));
   }
 
   @Nullable
   public RowData read(RowData reuse) throws IOException {
     if (!inputFormat.reachedEnd()) {
       return (RowData) inputFormat.nextRecord(reuse);
+    } else {
+      while (!inputSplits.isEmpty()) {
+        // release the last itr first.
+        inputFormat.close();
+        inputFormat.open(inputSplits.remove(0));

Review Comment:
   inputFormat.open(inputSplits.poll());
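
The quoted diff is cut off right after the `inputFormat.open(...)` call, so the rest of the loop in `read` is not visible in this thread. As a rough illustration of the pattern it appears to introduce (advance to the next split once the current one is exhausted, and return `null` only when no splits remain), here is a self-contained sketch. It does not use the Flink or Hudi APIs: `MultiSplitReaderSketch` and `readAll` are hypothetical names, and each "split" is modelled as a plain list of rows.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the lookup reader; each "split" is just a list of rows.
final class MultiSplitReaderSketch {
  private final Queue<List<String>> pendingSplits;
  // Plays the role of the currently opened input format.
  private Iterator<String> current;

  MultiSplitReaderSketch(List<List<String>> splits) {
    this.pendingSplits = new ArrayDeque<>(splits);
    List<String> first = pendingSplits.poll();
    this.current = first == null ? Collections.<String>emptyIterator() : first.iterator();
  }

  // Returns the next row, or null once every split has been exhausted.
  String read() {
    while (!current.hasNext()) {
      List<String> next = pendingSplits.poll();
      if (next == null) {
        return null; // no splits left
      }
      current = next.iterator(); // "open" the next split; empty splits are skipped
    }
    return current.next();
  }

  // Convenience helper: reads every row across all splits.
  static List<String> readAll(List<List<String>> splits) {
    MultiSplitReaderSketch reader = new MultiSplitReaderSketch(splits);
    List<String> rows = new ArrayList<>();
    String row;
    while ((row = reader.read()) != null) {
      rows.add(row);
    }
    return rows;
  }

  public static void main(String[] args) {
    List<List<String>> splits = Arrays.asList(
        Arrays.asList("a", "b"),
        Collections.<String>emptyList(),
        Arrays.asList("c"));
    System.out.println(readAll(splits));
  }
}
```

A reader that opened only the first split, as the removed `inputSplits[0]` line did, would stop after the rows of that split in this model.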
   






Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]

2024-05-24 Thread via GitHub


hudi-bot commented on PR #11297:
URL: https://github.com/apache/hudi/pull/11297#issuecomment-2130654798

   
   ## CI report:
   
   * b890368f0e4246fd7b9982181b8a3762a25e1249 UNKNOWN
   
   
   





[PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]

2024-05-24 Thread via GitHub


danny0405 opened a new pull request, #11297:
URL: https://github.com/apache/hudi/pull/11297

   ### Change Logs
   
   The lookup table reader should load all the input splits of the table, not only the first one.
   
   ### Impact
   
   none
   
   ### Risk level (write none, low medium or high below)
   
   none
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

