Imran Rashid created SPARK-27780:
------------------------------------

             Summary: Shuffle server & client should be versioned to enable 
smoother upgrade
                 Key: SPARK-27780
                 URL: https://issues.apache.org/jira/browse/SPARK-27780
             Project: Spark
          Issue Type: New Feature
          Components: Shuffle, Spark Core
    Affects Versions: 3.0.0
            Reporter: Imran Rashid


The external shuffle service is often upgraded at a different time than Spark 
itself.  When the protocol between the shuffle service and the Spark runtime 
changes, this causes problems: users are forced to upgrade everything 
simultaneously.

We should add versioning to the shuffle client & server, so each knows what 
messages the other supports.  This would allow better handling of mixed 
versions, from better error messages to allowing some mismatched versions (with 
reduced capabilities).
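
To make "mismatched versions with reduced capabilities" concrete, here is a 
minimal sketch, not a proposed implementation.  It assumes protocol versions are 
plain integers and that each version maps to a known set of message types.  The 
table, the message names and the helper functions are all hypothetical and are 
not taken from Spark's actual shuffle protocol.

```python
# Hypothetical table: protocol version -> message types that version understands.
# Version numbers and message names are illustrative only.
SUPPORTED_MESSAGES = {
    1: frozenset({"REGISTER_EXECUTOR", "OPEN_BLOCKS"}),
    2: frozenset({"REGISTER_EXECUTOR", "OPEN_BLOCKS", "FETCH_SHUFFLE_BLOCKS"}),
}


class UnsupportedMessageError(Exception):
    """Raised when a message is not understood at the agreed version."""


def common_version(client_version: int, server_version: int) -> int:
    """Both sides fall back to the older of the two versions."""
    version = min(client_version, server_version)
    if version not in SUPPORTED_MESSAGES:
        raise ValueError(f"unknown protocol version {version}")
    return version


def check_message(message_type: str, client_version: int, server_version: int) -> int:
    """Return the agreed version, or fail with a descriptive error."""
    version = common_version(client_version, server_version)
    if message_type not in SUPPORTED_MESSAGES[version]:
        raise UnsupportedMessageError(
            f"{message_type} is not supported at protocol version {version} "
            f"(client={client_version}, server={server_version})"
        )
    return version
```

With a version 2 client and a version 1 server, check_message("OPEN_BLOCKS", 2, 1) 
succeeds at version 1, while check_message("FETCH_SHUFFLE_BLOCKS", 2, 1) raises 
an error that names both versions instead of failing with an opaque decode error.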

This originally came up in a discussion here: 
https://github.com/apache/spark/pull/24565#issuecomment-493496466

There are a few ways we could do the versioning, which we still need to discuss:

1) Version specified by config.  This allows for mixed versions across the 
cluster and rolling upgrades.  It would also let a Spark 3.0 client talk to a 
2.4 shuffle service.  However, it may be a nuisance for users to get this right.

2) Auto-detection during registration with the local shuffle service.  This 
makes the versioning easy for the end user, and can handle a 2.4 shuffle service 
even though that service does not support the new versioning.  However, it will 
not handle a rolling upgrade correctly: if the local shuffle service has been 
upgraded but other nodes in the cluster have not, the client will get the 
version wrong for those nodes.

3) Exchange versions per-connection.  When a connection is opened, the server & 
client could first exchange messages with their versions, so they know how to 
continue communication after that.
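
Option 3 could look roughly like the sketch below.  It assumes a fixed-size 
"hello" frame (a 4-byte marker followed by a 16-bit version) that each side 
sends as soon as the connection opens, after which both sides use the lower of 
the two versions.  The frame layout, the MAGIC marker and all function names are 
hypothetical; nothing here reflects the existing wire format.

```python
import socket
import struct

MAGIC = b"SHFL"                # hypothetical marker identifying a version hello
HELLO = struct.Struct(">4sH")  # marker + unsigned 16-bit version, big-endian


def encode_hello(version: int) -> bytes:
    return HELLO.pack(MAGIC, version)


def decode_hello(data: bytes) -> int:
    magic, version = HELLO.unpack(data)
    if magic != MAGIC:
        raise ValueError("peer did not send a version hello")
    return version


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed during handshake")
        buf += chunk
    return buf


def send_hello(sock: socket.socket, my_version: int) -> None:
    sock.sendall(encode_hello(my_version))


def read_hello(sock: socket.socket) -> int:
    return decode_hello(recv_exact(sock, HELLO.size))


def demo() -> tuple:
    """Simulate one connection: client speaks version 3, server version 2."""
    client_sock, server_sock = socket.socketpair()
    try:
        # Each side sends its hello first, then reads the peer's hello.
        send_hello(client_sock, 3)
        send_hello(server_sock, 2)
        client_agreed = min(3, read_hello(client_sock))
        server_agreed = min(2, read_hello(server_sock))
        return client_agreed, server_agreed
    finally:
        client_sock.close()
        server_sock.close()
```

Here demo() returns (2, 2): both ends independently settle on version 2.  Since 
the exchange happens on every connection, it does not depend on what the local 
shuffle service reports, which is the case option 2 gets wrong during a rolling 
upgrade.  A peer that predates the handshake would not send a hello at all, so 
a real design would still need a fallback for that case.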



