Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


xushiyan merged PR #10381:
URL: https://github.com/apache/hudi/pull/10381


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


jonvex commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874662101

   Azure CI is passing: 
https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21791





Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874480730

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * 9819ca4db7b4ab9f2476aecc753e3fcc09c7cb7a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21791)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874359251

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * 8fd105afa86dc4d815dd94d7a55bca5bb85031d2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21788)
 
   * 9819ca4db7b4ab9f2476aecc753e3fcc09c7cb7a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21791)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874240503

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21656)
 
   * 8fd105afa86dc4d815dd94d7a55bca5bb85031d2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21788)
 
   * 9819ca4db7b4ab9f2476aecc753e3fcc09c7cb7a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21791)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874227930

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21656)
 
   * 8fd105afa86dc4d815dd94d7a55bca5bb85031d2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21788)
 
   * 9819ca4db7b4ab9f2476aecc753e3fcc09c7cb7a UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874152332

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21656)
 
   * 8fd105afa86dc4d815dd94d7a55bca5bb85031d2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21788)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2024-01-02 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1874140814

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21656)
 
   * 8fd105afa86dc4d815dd94d7a55bca5bb85031d2 UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1866990936

   
   ## CI report:
   
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21656)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1866807501

   
   ## CI report:
   
   * fb6c451f6c26df527833ba6e5a79b00ef6e9ed6a Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21636)
 
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21656)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1866764206

   
   ## CI report:
   
   * fb6c451f6c26df527833ba6e5a79b00ef6e9ed6a Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21636)
 
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   * fa4b20f1f5cabcd03bc488badaa4e97d26da49c8 UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1866745066

   
   ## CI report:
   
   * fb6c451f6c26df527833ba6e5a79b00ef6e9ed6a Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21636)
 
   * 33a87e77b985a8fd3fe0a6a997059ee20fbedb8b UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


jonvex commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1866717921

   > > Did we chase down every reader (base files, log files, iterator) part of 
the new file group and ensure this is the only gap? @jonvex @linliu-code
   > 
   > I haven't. I will try to do that later.
   
   I checked all the log blocks and the reader context. Could you check the 
record buffers?





Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


linliu-code commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1866700166

   > Did we chase down every reader (base files, log files, iterator) part of 
the new file group and ensure this is the only gap? @jonvex @linliu-code
   
   I haven't. I will try to do that later.





Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-21 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865829450

   
   ## CI report:
   
   * fb6c451f6c26df527833ba6e5a79b00ef6e9ed6a Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21636)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865493144

   
   ## CI report:
   
   * 5fcf42972d5f140a1730cea1ee18aeb0bb3bbc86 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21628)
 
   * fb6c451f6c26df527833ba6e5a79b00ef6e9ed6a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21636)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865488062

   
   ## CI report:
   
   * 5fcf42972d5f140a1730cea1ee18aeb0bb3bbc86 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21628)
 
   * fb6c451f6c26df527833ba6e5a79b00ef6e9ed6a UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


nsivabalan commented on code in PR #10381:
URL: https://github.com/apache/hudi/pull/10381#discussion_r1433487063


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -359,4 +360,13 @@ class HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
   protected def getLogFilesFromSlice(fileSlice: FileSlice): List[HoodieLogFile] = {
     fileSlice.getLogFiles.sorted(HoodieLogFile.getLogFileComparator).iterator().asScala.toList
   }
+
+  protected def makeMappingIterator(closeableFileGroupRecordIterator: HoodieFileGroupReader.HoodieFileGroupReaderIterator[InternalRow],

Review Comment:
   Can you also name this method appropriately, e.g. 
   getCloseableFileGroupRecordIterator?






Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865448157

   
   ## CI report:
   
   * 5fcf42972d5f140a1730cea1ee18aeb0bb3bbc86 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21628)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


nsivabalan commented on code in PR #10381:
URL: https://github.com/apache/hudi/pull/10381#discussion_r1433435209


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -359,4 +360,15 @@ class HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
   protected def getLogFilesFromSlice(fileSlice: FileSlice): List[HoodieLogFile] = {
     fileSlice.getLogFiles.sorted(HoodieLogFile.getLogFileComparator).iterator().asScala.toList
   }
+
+  protected def makeMappingIterator(iter: HoodieFileGroupReader.HoodieFileGroupReaderIterator[InternalRow],
+                                    f: Function[InternalRow, InternalRow]): Iterator[InternalRow] = {

Review Comment:
   How about CloseableFileGroupRecordIterator?






Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865410772

   
   ## CI report:
   
   * 07cae2aa8e635046bb510ba81981d66b207f3725 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21627)
 
   * 5fcf42972d5f140a1730cea1ee18aeb0bb3bbc86 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21628)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


danny0405 commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865389275

   Looks promising: the OOM is gone, and the test failures look unrelated.





Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


linliu-code commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865312890

   @jonvex, the tests are failing.





Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


linliu-code commented on code in PR #10381:
URL: https://github.com/apache/hudi/pull/10381#discussion_r1433302562


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -359,4 +360,15 @@ class HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
   protected def getLogFilesFromSlice(fileSlice: FileSlice): List[HoodieLogFile] = {
     fileSlice.getLogFiles.sorted(HoodieLogFile.getLogFileComparator).iterator().asScala.toList
   }
+
+  protected def makeMappingIterator(iter: HoodieFileGroupReader.HoodieFileGroupReaderIterator[InternalRow],
+                                    f: Function[InternalRow, InternalRow]): Iterator[InternalRow] = {

Review Comment:
   nit: better name for "f"?






Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865304902

   
   ## CI report:
   
   * fe4175984f1eea5443b482e8cbc7e5d2a4ef40a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21620)
 
   * 07cae2aa8e635046bb510ba81981d66b207f3725 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21627)
 
   * 5fcf42972d5f140a1730cea1ee18aeb0bb3bbc86 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21628)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865299610

   
   ## CI report:
   
   * fe4175984f1eea5443b482e8cbc7e5d2a4ef40a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21620)
 
   * 07cae2aa8e635046bb510ba81981d66b207f3725 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21627)
 
   * 5fcf42972d5f140a1730cea1ee18aeb0bb3bbc86 UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865266445

   
   ## CI report:
   
   * fe4175984f1eea5443b482e8cbc7e5d2a4ef40a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21620)
 
   * 07cae2aa8e635046bb510ba81981d66b207f3725 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21627)
 
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


linliu-code commented on code in PR #10381:
URL: https://github.com/apache/hudi/pull/10381#discussion_r1433267886


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -359,4 +362,16 @@ class HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
   protected def getLogFilesFromSlice(fileSlice: FileSlice): List[HoodieLogFile] = {
     fileSlice.getLogFiles.sorted(HoodieLogFile.getLogFileComparator).iterator().asScala.toList
   }
+
+  protected def makeMappingIterator(iter: HoodieFileGroupReader.HoodieFileGroupReaderIterator[InternalRow],
+                                    f: Function[InternalRow, InternalRow]): Iterator[InternalRow] = {
+    val mapIter = new CloseableMappingIterator[InternalRow, InternalRow](iter, JFunction.toJavaFunction(f))

Review Comment:
   With this, we have three iterators. Is it possible for us to remove this 
intermediate iterator?
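
One possible reading of this question is that a single wrapper could both apply the row mapping and close the underlying reader, so no separate mapping iterator sits in between. The sketch below only illustrates that idea and is not the code in this PR. `FakeReaderIterator` and `ClosingMappingIterator` are hypothetical names, and `FakeReaderIterator` merely stands in for a reader-backed iterator such as `HoodieFileGroupReaderIterator`.

```scala
import java.io.Closeable

// Hypothetical stand-in for a reader-backed iterator; NOT a Hudi class.
// It records whether close() was called so the behaviour can be observed.
final class FakeReaderIterator(rows: Seq[Int]) extends Iterator[Int] with Closeable {
  private val underlying = rows.iterator
  var closed = false
  override def hasNext: Boolean = underlying.hasNext
  override def next(): Int = underlying.next()
  override def close(): Unit = closed = true
}

// Hypothetical single wrapper: maps each element and closes the source
// as soon as the source reports that it is exhausted.
final class ClosingMappingIterator[A, B](source: Iterator[A] with Closeable, mapRow: A => B)
    extends Iterator[B] with Closeable {
  private var closed = false

  override def hasNext: Boolean = {
    val more = !closed && source.hasNext
    if (!more) close() // release the reader once nothing is left to read
    more
  }

  override def next(): B = {
    if (!hasNext) throw new NoSuchElementException("iterator exhausted")
    mapRow(source.next())
  }

  // Idempotent, so an explicit close() after exhaustion is harmless.
  override def close(): Unit = if (!closed) {
    closed = true
    source.close()
  }
}
```

Draining `new ClosingMappingIterator[Int, Int](new FakeReaderIterator(Seq(1, 2, 3)), _ * 10)` closes the fake reader on the final `hasNext` call. A caller that abandons the iterator before exhaustion would still have to call `close()` itself.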






Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865261200

   
   ## CI report:
   
   * fe4175984f1eea5443b482e8cbc7e5d2a4ef40a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21620)
 
   * 07cae2aa8e635046bb510ba81981d66b207f3725 UNKNOWN
   
   



Re: [PR] [HUDI-7244] Ensure HoodieFileGroupReader.close() is called in spark [hudi]

2023-12-20 Thread via GitHub


hudi-bot commented on PR #10381:
URL: https://github.com/apache/hudi/pull/10381#issuecomment-1865255732

   
   ## CI report:
   
   * fe4175984f1eea5443b482e8cbc7e5d2a4ef40a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21620)
 
   
   