Thanks for the discussion! A few responses:

The decision needs to happen at api/config change time, otherwise the
> deprecated warning has no purpose if we are never going to remove them.
>

Even if we never remove an API, I think deprecation warnings (when done
right) can still serve a purpose. For new users, a deprecation can serve as
a pointer to newer, faster APIs or ones with less sharp edges. I would be
supportive of efforts that use them to clean up the docs. For example, we
could hide deprecated APIs after some time so they don't clutter scala/java
doc. We can and should audit things like the user guide and our own
examples to make sure they don't use deprecated APIs.


> That said we still need to be able to remove deprecated things and change
> APIs in major releases, otherwise why do a  major release in the first
> place.  Is it purely to support newer Scala/python/java versions.
>

I don't think Major versions are purely for
Scala/Java/Python/Hive/Metastore, but they are a good chance to move the
project forward. Spark 3.0 has a lot of upgrades here, and I think we did
make the right trade-offs here, even though there are some API breaks.

Major versions are also a good time to drop major changes (i.e. in 2.0 we
released whole-stage code gen).


> I think the hardest part listed here is what the impact is.  Who's call is
> that, it's hard to know how everyone is using things and I think it's been
> harder to get feedback on SPIPs and API changes in general as people are
> busy with other things.
>

This is the hardest part, and we won't always get it right. I think that
having the rubric though will help guide the conversation and help
reviewers ask the right questions.

One other thing I'll add is, sometimes the users come to us and we should
listen! I was very surprised by the response to Karen's email on this list
last week. An actual user was giving us feedback on the impact of the
changes in Spark 3.0 and rather than listen there was a lot of push back.
Users are never wrong when they are telling you what matters to them!


> Like you mention, I think stackoverflow is unreliable, the posts could be
> many years old and no longer relevant.
>

While this is unfortunate, I think the more we can do to keep these
answer relevant (either by updating them or by not breaking them) is good
for the health of the Spark community.

Reply via email to