[ 
https://issues.apache.org/jira/browse/HADOOP-19385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916160#comment-17916160
 ] 

ASF GitHub Bot commented on HADOOP-19385:
-----------------------------------------

steveloughran opened a new pull request, #7316:
URL: https://github.com/apache/hadoop/pull/7316

   
   Add Iceberg core to the hadoop-aws test classpath.
   
   Iceberg is Java 17+ only, so this adds:
   * A new test source path, src/test/java17.
   * A new profile, "java-17-or-later", which includes this path and declares
the dependency on iceberg-core.
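
   A sketch of what such a profile could look like. The profile id and the
src/test/java17 source root come from the description above; the
build-helper-maven-plugin wiring and the `${iceberg.version}` property are
assumptions, not the PR's actual pom changes:

```xml
<profile>
  <id>java-17-or-later</id>
  <activation>
    <jdk>[17,)</jdk>
  </activation>
  <dependencies>
    <!-- only pulled in when building on Java 17+ -->
    <dependency>
      <groupId>org.apache.iceberg</groupId>
      <artifactId>iceberg-core</artifactId>
      <version>${iceberg.version}</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <!-- register src/test/java17 as an extra test source root -->
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>add-java17-test-source</id>
            <phase>generate-test-sources</phase>
            <goals>
              <goal>add-test-source</goal>
            </goals>
            <configuration>
              <sources>
                <source>src/test/java17</source>
              </sources>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```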
   
   The new test is ITestIcebergBulkDelete; it is parameterized on Iceberg bulk
delete enabled/disabled and S3A multipart delete enabled/disabled.
   
   There is a superclass contract test,
     org.apache.fs.test.formats.AbstractIcebergDeleteTest,
   to support future stores which implement bulk delete. This is currently a
minimal superclass; all tests
   live in ITestIcebergBulkDelete.
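
   The two switches above give a four-way parameter matrix. A minimal sketch
of how that matrix might be generated (the class and field names here are
illustrative stand-ins, not the PR's actual test code):

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteMatrix {

    /** One test configuration: both on/off switches. */
    public static final class Params {
        public final boolean icebergBulkDelete;
        public final boolean s3aMultipartDelete;

        Params(boolean icebergBulkDelete, boolean s3aMultipartDelete) {
            this.icebergBulkDelete = icebergBulkDelete;
            this.s3aMultipartDelete = s3aMultipartDelete;
        }

        @Override
        public String toString() {
            return "bulkDelete=" + icebergBulkDelete
                + ", multipartDelete=" + s3aMultipartDelete;
        }
    }

    /** Cross product of the two switches: four configurations in total. */
    public static List<Params> parameters() {
        List<Params> params = new ArrayList<>();
        for (boolean bulk : new boolean[] {true, false}) {
            for (boolean multipart : new boolean[] {true, false}) {
                params.add(new Params(bulk, multipart));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        parameters().forEach(p -> System.out.println(p));
    }
}
```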
   
   
   ### How was this patch tested?
   
   Fun!
   
   1. Install Java 17.
   2. Build this Iceberg PR locally:
https://github.com/apache/iceberg/pull/10233
   3. Switch the Hadoop build back to Java 8.
   4. Verify that the aws module builds cleanly.
   5. Rebuild Hadoop with Java 17.
   6. Run the new test suite against an S3 store.
   
   Once we have iceberg PR 10233 merged in, we can merge this. Until then the
test code doesn't compile, unless we move to DynMethods there
(maybe I should).
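
   DynMethods is Iceberg's reflective method-binding utility: look a method up
once, tolerate its absence, invoke it later, so the test source compiles even
when the target API isn't on the classpath. A minimal sketch of the same idea
using plain java.lang.reflect; the bound method (String.concat) is just a
stand-in for the real API the test would bind to:

```java
import java.lang.reflect.Method;

public class DynBind {

    /** Look up a method reflectively; return null if it is not present. */
    static Method tryBind(Class<?> cls, String name, Class<?>... argTypes) {
        try {
            return cls.getMethod(name, argTypes);
        } catch (NoSuchMethodException e) {
            // API not on this classpath/version: caller can skip the test
            return null;
        }
    }

    /** Invoke the bound method; here it simply concatenates two strings. */
    public static String invokeConcat(String base, String suffix) {
        Method concat = tryBind(String.class, "concat", String.class);
        if (concat == null) {
            throw new IllegalStateException("String.concat not found");
        }
        try {
            return (String) concat.invoke(base, suffix);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeConcat("bulk-", "delete"));
    }
}
```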
   
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> S3A: add a file-format-parsing module for testing format parsing
> ----------------------------------------------------------------
>
>                 Key: HADOOP-19385
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19385
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure, fs/s3
>    Affects Versions: 3.4.2
>            Reporter: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Create a cloud-storage/format-parsing module declaring various file formats 
> as dependencies (parquet, iceberg, orc) purely for integration/regression 
> testing store support for them.
> h2. Parquet
> for parquet reading we'd want
> * parquet lib
> * samples of well formed files
> * samples of malformed files.
> Test runs would upload the files, then open them.
> h2. Iceberg
> Verify bulk delete through iceberg FileIO api. 
> *Update: Iceberg needs java17*
> It can't be merged until hadoop trunk moves to Java 17. The parquet stuff we
> can put in earlier and backport.
> It does let me set up the module, though.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
