Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]
danny0405 merged PR #11297:
URL: https://github.com/apache/hudi/pull/11297

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]
hudi-bot commented on PR #11297:
URL: https://github.com/apache/hudi/pull/11297#issuecomment-2136083376

## CI report:

* b890368f0e4246fd7b9982181b8a3762a25e1249 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24047)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]
danny0405 commented on PR #11297:
URL: https://github.com/apache/hudi/pull/11297#issuecomment-2134144554

> LGTM. In this case, using `queue` for `inputSplits` is better than `ArrayList`, right?

I didn't see any difference.
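For readers following the `Queue` versus `ArrayList` question, here is a minimal, self-contained sketch of the two access patterns being compared. It is not code from the PR: `SplitContainerDemo`, `drainList`, and `drainQueue` are hypothetical names, and plain strings stand in for Flink `InputSplit` objects. Both loops visit the splits in the same order. `remove(0)` on an `ArrayList` shifts the remaining elements on every call and throws on an empty list, while `poll()` on a `Queue` returns `null` when nothing is left. For the handful of splits a lookup table typically has, the cost difference is likely negligible, which is consistent with the reply above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical demo class; strings stand in for Flink InputSplit objects.
final class SplitContainerDemo {

  /** Drains a list front to back with remove(0), the pattern used in the PR. */
  static List<String> drainList(List<String> splits) {
    List<String> opened = new ArrayList<>();
    while (!splits.isEmpty()) {
      // On an ArrayList this shifts every remaining element left by one.
      opened.add(splits.remove(0));
    }
    return opened;
  }

  /** Drains a queue with poll(), the pattern suggested in the review. */
  static List<String> drainQueue(Queue<String> splits) {
    List<String> opened = new ArrayList<>();
    String next;
    // poll() returns null once the queue is empty, so no isEmpty() guard is needed.
    while ((next = splits.poll()) != null) {
      opened.add(next);
    }
    return opened;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("split-0", "split-1", "split-2");
    System.out.println(drainList(new ArrayList<>(names)));   // [split-0, split-1, split-2]
    System.out.println(drainQueue(new ArrayDeque<>(names))); // [split-0, split-1, split-2]
  }
}
```

Either container yields the same read order; the choice mainly affects how the empty case is expressed.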
Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]
beyond1920 commented on code in PR #11297:
URL: https://github.com/apache/hudi/pull/11297#discussion_r1616165704

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java: ##
@@ -49,15 +54,24 @@ public HoodieLookupTableReader(SerializableSupplier> inp
   public void open() throws IOException {
     this.inputFormat = inputFormatSupplier.get();
     inputFormat.configure(conf);
-    InputSplit[] inputSplits = inputFormat.createInputSplits(1);
+    this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1)).collect(Collectors.toList());

Review Comment:
   this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1))
       .collect(Collectors.toCollection(LinkedList::new));

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java: ##
@@ -41,6 +44,8 @@ public class HoodieLookupTableReader implements Serializable {
   private InputFormat inputFormat;
+  private List<InputSplit> inputSplits;

Review Comment:
   private Queue<InputSplit> inputSplits;

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java: ##
@@ -49,15 +54,24 @@ public HoodieLookupTableReader(SerializableSupplier> inp
   public void open() throws IOException {
     this.inputFormat = inputFormatSupplier.get();
     inputFormat.configure(conf);
-    InputSplit[] inputSplits = inputFormat.createInputSplits(1);
+    this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1)).collect(Collectors.toList());
     ((RichInputFormat) inputFormat).openInputFormat();
-    inputFormat.open(inputSplits[0]);
+    inputFormat.open(inputSplits.remove(0));

Review Comment:
   inputFormat.open(inputSplits.poll());

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/HoodieLookupTableReader.java: ##
@@ -49,15 +54,24 @@ public HoodieLookupTableReader(SerializableSupplier> inp
   public void open() throws IOException {
     this.inputFormat = inputFormatSupplier.get();
     inputFormat.configure(conf);
-    InputSplit[] inputSplits = inputFormat.createInputSplits(1);
+    this.inputSplits = Arrays.stream(inputFormat.createInputSplits(1)).collect(Collectors.toList());
     ((RichInputFormat) inputFormat).openInputFormat();
-    inputFormat.open(inputSplits[0]);
+    inputFormat.open(inputSplits.remove(0));
   }

   @Nullable
   public RowData read(RowData reuse) throws IOException {
     if (!inputFormat.reachedEnd()) {
       return (RowData) inputFormat.nextRecord(reuse);
+    } else {
+      while (!inputSplits.isEmpty()) {
+        // release the last itr first.
+        inputFormat.close();
+        inputFormat.open(inputSplits.remove(0));

Review Comment:
   inputFormat.open(inputSplits.poll());
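The hunks above show only part of the new `read` method, so the following is a minimal sketch of the overall idea (keep every split, and advance to the next one when the current split is exhausted) rather than the merged code. Everything here is an assumption made for illustration: `SplitFormat` is a hypothetical, simplified stand-in for the few Flink `InputFormat` methods visible in the diff, a split is modelled as a plain `List`, checked `IOException`s are omitted, and the names `MultiSplitReaderSketch`, `ListFormat`, `Reader`, and `readAll` do not exist in Hudi.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

final class MultiSplitReaderSketch {

  /** Hypothetical stand-in for the InputFormat methods used in the diff. */
  interface SplitFormat<T> {
    void open(List<T> split);
    boolean reachedEnd();
    T nextRecord();
    void close();
  }

  /** Toy format whose "split" is just an in-memory list of records. */
  static final class ListFormat<T> implements SplitFormat<T> {
    private Iterator<T> itr = Collections.emptyIterator();
    public void open(List<T> split) { itr = split.iterator(); }
    public boolean reachedEnd() { return !itr.hasNext(); }
    public T nextRecord() { return itr.next(); }
    public void close() { itr = Collections.emptyIterator(); }
  }

  /** Reads records across all splits instead of only the first one. */
  static final class Reader<T> {
    private final SplitFormat<T> format;
    private final Queue<List<T>> splits;

    Reader(SplitFormat<T> format, List<List<T>> allSplits) {
      this.format = format;
      this.splits = new ArrayDeque<>(allSplits);
      List<T> first = splits.poll();
      format.open(first == null ? Collections.<T>emptyList() : first);
    }

    /** Returns the next record, or null once every split is exhausted. */
    T read() {
      while (format.reachedEnd()) {
        List<T> next = splits.poll();
        if (next == null) {
          return null; // no splits left
        }
        format.close(); // release the previous split before opening the next
        format.open(next);
      }
      return format.nextRecord();
    }
  }

  /** Convenience helper that drains a reader into a list. */
  static <T> List<T> readAll(List<List<T>> splits) {
    Reader<T> reader = new Reader<>(new ListFormat<T>(), splits);
    List<T> out = new ArrayList<>();
    T record;
    while ((record = reader.read()) != null) {
      out.add(record);
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<Integer>> splits = Arrays.asList(
        Arrays.asList(1, 2), Collections.<Integer>emptyList(), Arrays.asList(3));
    System.out.println(readAll(splits)); // [1, 2, 3]
  }
}
```

The loop in `read` also skips empty splits, which a single `if` check would not handle; whether the merged change treats empty splits the same way is not visible in the quoted hunks.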
Re: [PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]
hudi-bot commented on PR #11297:
URL: https://github.com/apache/hudi/pull/11297#issuecomment-2130654798

## CI report:

* b890368f0e4246fd7b9982181b8a3762a25e1249 UNKNOWN

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
[PR] [HUDI-7795] Fix loading of input splits from look up table reader [hudi]
danny0405 opened a new pull request, #11297:
URL: https://github.com/apache/hudi/pull/11297

### Change Logs

Should load all the input splits of the table.

### Impact

none

### Risk level (write none, low medium or high below)

none

### Documentation Update

none

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed