[
https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303702#comment-14303702
]
Marcelo Vanzin edited comment on SPARK-5388 at 2/3/15 7:48 PM:
---
Hi [~pwendell],
Let me try to write a point-by-point feedback for the current spec.
h4. Public protocol or not?
If this is not supposed to be public (i.e., we don't expect someone to try to
talk directly to the Spark master; it's always going to happen through
the Spark libraries), then the underlying protocol is less important, since we
only care about different versions being compatible in some way.
Assuming a non-public protocol, my question would be: why implement your own
RPC framework? Why not reuse something that's already there? For example, Avro
has a stable serialization infrastructure that defines semantics for versioned
data, and works well on top of HTTP. It handles serialization and dispatching -
which would remove a lot of code from the current patch, and probably has other
features that the current, cluster-mode only protocol doesn't need but other
future uses might.
h4. Non-submission uses
Similarly, in the non-public protocol scenario, a proper REST-based API would
look like overkill. But a proper REST infrastructure provides interesting room
for growth of the master's public-facing API. For example, you could easily
expose an endpoint for listing the current applications being tracked by the
master, or an endpoint to kill an application. The former could also benefit
the history server, which could expose the same API to list the applications it
has found.
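To make that concrete, a hypothetical set of endpoints along those lines (none of these paths are in the current spec) might look like:

```
GET    /v1/applications          # list applications tracked by the master
GET    /v1/applications/{appId}  # details for a single application
DELETE /v1/applications/{appId}  # kill a running application
POST   /v1/submissions           # submit a new application
```

The history server could then serve the same GET endpoints for the applications it has found, with no extra protocol work.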
h4. Evolvability and Versioning
The current spec does not specify the behavior of the cluster nor the client
with regards to different versions of the protocol. It has a table that
basically says future versions need to be able to submit standalone cluster
applications to a 1.3 master, but it doesn't explain what that means or how
that happens.
Does it mean that, after 1.3, you can't ever change any of the messages used to
launch a standalone cluster app, nor can you add new messages? Or, if that's
allowed, what happens on the server side if it sees a field it doesn't
understand? Does it ignore it, which could potentially break the application
being submitted? Does it throw an error, in which case the client should make
sure to submit an older version of the data structures if that's compatible
with the app being submitted? If the latter, how does it know which version to
use?
As an example of how you could do this negotiation: the client checks what
features the app being submitted needs, and chooses the oldest supported API
version based on that. It can then submit the request to, e.g., /v2 and, if
submitting to a 1.3 cluster, the request will fail, because that cluster
doesn't support the features needed by the app.
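A sketch of that client-side choice (all names and feature strings here are made up for illustration, not from the spec):

```python
def choose_api_version(required_features, supported):
    """Pick the oldest API version whose feature set covers everything
    the application being submitted needs.

    required_features: set of feature names the app needs.
    supported: dict mapping version number -> set of features that
               version provides (hypothetical; not from the spec).
    """
    for version in sorted(supported):
        if required_features <= supported[version]:
            return version
    raise ValueError("no supported API version provides all of: %s"
                     % sorted(required_features))

# A 1.3 master would only serve /v1; an app needing a v2-only feature
# simply cannot be submitted to it, and the client knows that up front.
versions = {1: {"submit", "kill"}, 2: {"submit", "kill", "upload-files"}}
print(choose_api_version({"submit"}, versions))                  # -> 1
print(choose_api_version({"submit", "upload-files"}, versions))  # -> 2
```

The point is that the mapping from features to versions has to be part of the spec for this to work; the client can't guess it.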
Also, thinking about the framework, what if later you need different features
than the ones provided now? What if you need to use query params, path params,
or non-json request bodies (e.g. for uploading files)? Are you going to extend
the current framework to the point where it starts looking like other existing
ones?
Or, if HTTP is being used mostly as a dumb pipe, what are the semantics for
when something goes wrong? Should clients only look at the response body when
the status is 200 OK, or should they try to interpret the body of a 500
Internal Server Error or a 400 Bad Request? Those things need to be specified.
h4. Others
If the suggestions above don't sound particularly interesting for this use
case, I'd strongly suggest, at the very least, removing any mention of REST
from the spec and the code, because this is not a REST protocol in any way.
Also, a question: if it's an HTTP protocol, why not expose it through the
existing http port?
To reply to the questions about my suggestions for how to use REST:
* when you add a new version, you don't remove old ones. Spark v1.4 could add
/v2, but it must still support /v1 in the way that it was specified.
* as for new fields / types, that really depends on how you specify things.
Personally, I like to declare a released API frozen: you can't add new types,
fields, or anything that the old release doesn't know about. Any new thing
requires a new protocol version. But you could take a different approach, by
adding optional fields that don't cause breakage when submitted to an older
server that doesn't know about them. Again, these choices need to be specified
up front, otherwise the implementation of v1 becomes the spec, since where the
spec is not clear, the choices made by the implementation will become a de
facto specification.
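The difference between the two approaches can be sketched like this (the v1 field names are made up for illustration):

```python
import json

V1_FIELDS = {"appResource", "mainClass", "appArgs"}  # hypothetical v1 schema

def validate_strict(body):
    """Frozen-API approach: any field v1 doesn't know about is an error,
    forcing the client to fall back to an older message version."""
    unknown = set(json.loads(body)) - V1_FIELDS
    if unknown:
        raise ValueError("unknown fields: %s" % sorted(unknown))
    return True

def validate_lenient(body):
    """Optional-field approach: silently drop what the server doesn't
    understand -- which may break the app if the field mattered."""
    msg = json.loads(body)
    return {k: v for k, v in msg.items() if k in V1_FIELDS}

ok = '{"appResource": "app.jar", "mainClass": "Main"}'
new = '{"appResource": "app.jar", "mainClass": "Main", "pyFiles": "a.py"}'
print(validate_strict(ok))
print(validate_lenient(new))  # "pyFiles" is silently dropped
```

Either behavior is workable; what's not workable is leaving the choice to whatever the v1 code happens to do.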
(BTW, especially with a v1, the implementation will invariably become a de
facto specification; that's unavoidable. But it helps to have the spec clearly
cover all areas, so that hopefully the implementation doesn't have to make
those choices on its own.)