gh-yzou commented on PR #1190: URL: https://github.com/apache/polaris/pull/1190#issuecomment-2749581082
@snazy It is possible to put this in a different repo, but I don't see much benefit in doing so. There are two common ways to handle a Spark client, and both approaches are widely used:

1) Put the client in the main repo and bundle its release with the main release. Projects like Iceberg, Trino, and the Unity Catalog Spark connector use this approach.
2) Keep the client in a separate repo, where it has a dependency on a specific server version, and publish a version compatibility matrix describing which server versions each released client version works with. This approach is adopted by projects like the Cassandra Spark connector and the Couchbase Spark connector.

In terms of the release matrix, I think both are manageable, since both approaches have real use cases. Furthermore, I don't think we need to release against multiple Iceberg versions: Iceberg's backward compatibility is guaranteed by the Iceberg community, so ideally we just need to stay consistent with the most recent version.

The major difference I see between the two approaches is the release process:

- With approach 1), we are forced to cut a release with every server release, even if there is no change in the API. The benefit is that the version compatibility guarantee is clear: every release is guaranteed to be compatible with the current server version and all versions below it. Also, users do not need to wait for a separate release to get a compatible client, since it is released along with the server.
- With approach 2), a new release is only needed when there are changes to the APIs, or new APIs the client needs to accommodate. However, users will need to consult the version compatibility table to find a compatible pairing. Also, for new client-side changes, the client release has to wait for the server release to finish before it can start, so there is a delay before users can get a compatible client.

Both approaches seem doable to me, but I am leaning towards approach 1), since it makes things easier for users in terms of version compatibility and release timing (a rough sketch of how the two options look from the user side is at the end of this comment).

Regarding the point about integration tests:
```
Integration tests become extremely difficult, especially if both Polaris and the Spark plugin share the same (Maven) group ID.
```
I am not quite getting this point. Could you give some more details?
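For concreteness, here is roughly how the two options would look from a user's build file. This is just a sketch; all coordinates and version numbers below are hypothetical, purely for illustration:

```kotlin
// Approach 1: the client is released in lockstep with the server from the main repo.
// The client version matches the Polaris server version, so picking a compatible
// client is trivial: use the same version as the server you are running.
dependencies {
    implementation("org.apache.polaris:polaris-spark-3.5_2.12:1.2.0")
}

// Approach 2: the client lives in its own repo with independent versioning.
// The user has to consult a published compatibility matrix, e.g.:
//   client 0.3.x  ->  server 1.1.0 - 1.2.x
//   client 0.2.x  ->  server 1.0.x
dependencies {
    implementation("org.apache.polaris.spark:polaris-spark-connector_2.12:0.3.1")
}
```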
