yyanyy commented on pull request #2807:
URL: https://github.com/apache/iceberg/pull/2807#issuecomment-912112920


   > Before we start splitting the PR into smaller PRs, I think we, the Iceberg 
community, need to reach a consensus on public/private vendor integration 
contributions. The iceberg-aws module is a great example: it provides 
independent mock unit tests for each small feature. Most importantly, Adobe 
provides an S3 integration test utility, 
[com.adobe.testing:s3mock-junit4](https://github.com/adobe/S3Mock), which 
launches a local mini S3 cluster for accessing the HTTP API (S3Mock acts as a 
real S3 HTTP server by implementing the S3 API on top of a local filesystem 
directory). The S3Mock simulator has thorough test coverage to guarantee that 
the local S3 has the same semantics as [AWS S3](https://aws.amazon.com/cn/s3/).
   > 
   > When I implemented [the Aliyun OSS 
integration](https://github.com/apache/iceberg/pull/2230/files), I thought I 
should provide a similar object storage simulator to keep the local tests 
aligned with the public Aliyun OSS, so I provided an 
[OSSMockApplication](https://github.com/apache/iceberg/pull/2230/files#diff-cae7d6bade136ee5e97da24f979e6352929af6df9d244a3afc3a94770396c1bc)
 and 
[TestLocalOSS](https://github.com/apache/iceberg/pull/2230/files#diff-f8329e3691562000032033a485ecc5e30bf6d6a3b7e25e5f8cdd4f4e387b604aR53)
 to align the semantics. Personally, I would prefer to provide a fully tested 
simulator for each private vendor integration so that we can build unit tests 
on top of it to verify correctness.
   > 
   > As we will introduce more and more public/private vendor integrations in 
the future, I think we should agree on the details of introducing a vendor as 
soon as possible, and provide a more complete guide for community contributors 
to follow.
   > 
   > FYI @rdblue & @danielcweeks .
   
   I think in an ideal world we should, but I'm not sure we need to completely 
block new cloud vendor integration contributions when no working backend 
library is available for unit testing the storage service. In the aws module we 
have an [integration 
test](https://github.com/apache/iceberg/tree/master/aws/src/integration/java/org/apache/iceberg/aws)
 package that talks to the actual service. However, we don't run these tests 
during PR submission; they are run manually before each release. I think we 
should try to add them to the automated tests to catch regressions. With or 
without a library that provides full functionality for unit testing, I think 
these integration tests are still valuable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


