gh-yzou commented on PR #1190: URL: https://github.com/apache/polaris/pull/1190#issuecomment-2749581082
@snazy It is possible to put this in a different repo, but I don't see much benefit in doing so. There are two common ways to handle a Spark client, and both approaches are widely used:

1) Put the client in the main repo and bundle its release with the main release. Projects like Iceberg, Trino, and the Unity Catalog Spark connector use this approach.
2) Keep the client in a separate repo, where it has a dependency on a specific server version, and publish a version compatibility matrix describing which server versions each released client version works with. This approach is adopted by projects like the Cassandra Spark connector and the Couchbase Spark connector.

In terms of the release matrix, I think both are manageable, since both approaches have real use cases. Furthermore, I don't think we need to release against multiple Iceberg versions: Iceberg's backward compatibility is guaranteed by the Iceberg community, so ideally we just need to stay consistent with the most recent version.

The major difference I see between the two approaches is the release process:

- With approach 1), we are forced to cut a release with every server release, even if there is no change in the API. The benefit is that the version compatibility guarantee is clear: every release is guaranteed to be compatible with the current server version and all versions below it. Also, users do not need to wait for a separate release to get a compatible client, since it is released along with the server.
- With approach 2), a new release is only needed when there are changes to the APIs, or new APIs the client needs to accommodate. However, users will need to consult the version compatibility table to find a compatible pairing. Also, for new client-side changes, the client release has to wait for the server release to finish before it can start, so there is a delay before users can get a compatible client.

Both approaches seem doable to me, but I am leaning towards approach 1), since it makes things easier for users in terms of version compatibility and release timing (a rough sketch of how the two options look from the user side is at the end of this comment).

Regarding the point about integration tests:
```
Integration tests become extremely difficult, especially if both Polaris and the Spark plugin share the same (Maven) group ID.
```
I am not quite getting this point. Could you give some more details?
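For concreteness, here is roughly how the two options would look from a user's build file. This is just a sketch; all coordinates and version numbers below are hypothetical, purely for illustration:

```kotlin
// Approach 1: the client is released in lockstep with the server from the main repo.
// The client version matches the Polaris server version, so picking a compatible
// client is trivial: use the same version as the server you are running.
dependencies {
    implementation("org.apache.polaris:polaris-spark-3.5_2.12:1.2.0")
}

// Approach 2: the client lives in its own repo with independent versioning.
// The user has to consult a published compatibility matrix, e.g.:
//   client 0.3.x  ->  server 1.1.0 - 1.2.x
//   client 0.2.x  ->  server 1.0.x
dependencies {
    implementation("org.apache.polaris.spark:polaris-spark-connector_2.12:0.3.1")
}
```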
