Yes, that's exactly the pain point. I totally agree with you.
For now, we are focusing more on other things, but we need to resolve this
situation soon.

Dongjoon.


On Sun, Jun 11, 2023 at 1:21 AM yangjie01 <yangji...@baidu.com> wrote:

> Perhaps we should reconsider our reliance on and use of Ammonite? One week
> after the release of Scala 2.12.18 and 2.13.11, there is still no new
> Ammonite version available, and the question about the release schedule
> raised in the Ammonite community has not received a response, which I did
> not expect. Of course, we can also wait for a while before making a
> decision.
>
>
>
> ```
>
> Scala version upgrade is blocked by the Ammonite library dev cycle
> currently.
>
>     Although we discussed it here and it had good intentions,
>     the current master branch cannot use the latest Scala.
>
>     - https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
>     "Ammonite as REPL for Spark Connect"
>      SPARK-42884 Add Ammonite REPL integration
>
>     Specifically, the following are blocked and I'm monitoring the
> Ammonite repository.
>     - SPARK-40497 Upgrade Scala to 2.13.11
>     - SPARK-43832 Upgrade Scala to 2.12.18
>     - According to https://github.com/com-lihaoyi/Ammonite/issues,
>       Scala 3.3.0 LTS support also looks infeasible.
>
>     Although we may be able to wait for a while, there are two fundamental
>     solutions to unblock this situation from a long-term maintenance
>     perspective.
>     - Replace it with a Scala-shell based implementation (see the sketch
>       just below)
>     - Move `connector/connect/client/jvm/pom.xml` outside of the Spark repo.
>       Maybe we can put it into a new repo, like the Rust and Go clients.
>
> ```
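
For context on the first option quoted above (a Scala-shell based
implementation), here is a minimal, hypothetical sketch of what such a
bootstrap could look like: an init script fed to the stock Scala REPL (for
example via `scala -i connect-init.scala`) instead of embedding Ammonite.
The `remote(...)`/`build()` builder calls follow the Spark Connect Scala
client as of branch-3.4; the endpoint and the exact method names are
assumptions and may differ by version.

```
// connect-init.scala (hypothetical): pre-create a Spark Connect session so
// the plain Scala REPL can be used in place of Ammonite. This is a sketch,
// not the actual SPARK-42884 implementation.
import org.apache.spark.sql.SparkSession

// "sc://localhost:15002" is an assumed local Spark Connect endpoint.
val spark = SparkSession
  .builder()
  .remote("sc://localhost:15002")
  .build()

// Quick sanity check that the session is usable from the shell.
spark.range(5).collect().foreach(println)
```

For comparison, the classic `spark-shell` already builds on the standard
Scala REPL (`org.apache.spark.repl.SparkILoop`), so a similar approach for
the Connect client would remove the Ammonite dependency from the Scala
upgrade path.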
>
> From: Grisha Weintraub <grisha.weintr...@gmail.com>
> Date: Thursday, June 8, 2023, 04:05
> To: Dongjoon Hyun <dongjoon.h...@gmail.com>
> Cc: Nan Zhu <zhunanmcg...@gmail.com>, Sean Owen <sro...@gmail.com>,
> "dev@spark.apache.org" <dev@spark.apache.org>
> Subject: Re: ASF policy violation and Scala version issues
>
>
>
> Dongjoon,
>
>
>
> I followed the conversation, and in my opinion, your concern is totally
> legit.
> It just feels that the discussion is focused solely on Databricks, and as
> I said above, the same issue occurs with other vendors as well.
>
>
>
>
>
> On Wed, Jun 7, 2023 at 10:28 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
> To Grisha, we are talking about what the right way is and how to comply
> with the ASF legal advice which I shared in this thread from the
> "legal-discuss@" mailing list.
>
>
>
> https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4 (legal-discuss@)
>
> https://www.apache.org/foundation/marks/downstream.html#source (ASF Website)
>
>
>
> Dongjoon
>
>
>
>
>
> On Wed, Jun 7, 2023 at 12:16 PM Grisha Weintraub <
> grisha.weintr...@gmail.com> wrote:
>
> Yes, in Spark UI you have it as "3.1.2-amazon", but when you create a
> cluster it's just Spark 3.1.2.
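
As a side note, a quick way to see which version string a given distribution
actually reports (the same value the Spark UI displays) is to ask the running
session itself. A small sketch; `spark` is assumed to be the active session,
as in `spark-shell`:

```
// Print the version strings the running distribution reports. Upstream
// Apache Spark reports e.g. "3.1.2"; a vendor build may report a suffixed
// string such as the "3.1.2-amazon" mentioned above.
println(spark.version)                  // version of the active SparkSession
println(org.apache.spark.SPARK_VERSION) // compile-time version constant
```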
>
>
>
> On Wed, Jun 7, 2023 at 10:05 PM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
>
>
>  for EMR, I think they show 3.1.2-amazon in Spark UI, no?
>
>
>
>
>
> On Wed, Jun 7, 2023 at 11:30 Grisha Weintraub <grisha.weintr...@gmail.com>
> wrote:
>
> Hi,
>
>
>
> I am not taking sides here, but just for fairness, I think it should be
> noted that AWS EMR does exactly the same thing.
>
> We choose the EMR version (e.g., 6.4.0) and it has an associated Spark
> version (e.g., 3.1.2).
>
> The Spark version here is not the original Apache version but AWS's Spark
> distribution.
>
>
>
> On Wed, Jun 7, 2023 at 8:24 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
> I disagree with you in several ways.
>
>
>
> The following is not a *minor* change like the given examples (alterations
> to the start-up and shutdown scripts, configuration files, file layout
> etc.).
>
>
>
> > The change you cite meets the 4th point, minor change, made for
> integration reasons.
>
>
>
> The following is also wrong. There was no such state of Apache Spark 3.4.0
> after the 3.4.0 tag was created. The Apache Spark community did not accept
> the Scala-reverting patches in either the `master` branch or `branch-3.4`.
>
>
>
> > There is no known technical objection; this was after all at one point
> the state of Apache Spark.
>
>
>
> Is the following your main point? So, you are selling a box "including
> Harry Potter by J. K. Rowling, whose main character is Barry instead of
> Harry", but it's okay because you didn't sell the book itself? And, as a
> cloud vendor, you lent out the box instead of selling it, like a private
> library?
>
>
>
> > There is no standalone distribution of Apache Spark anywhere here.
>
>
>
> We are not asking for a big thing. Why are you so reluctant to make clear
> that it is not "Apache Spark 3.4.0", by simply saying "Apache Spark
> 3.4.0-databricks"? What is the marketing reason here?
>
>
>
> Dongjoon.
>
>
>
>
>
> On Wed, Jun 7, 2023 at 9:27 AM Sean Owen <sro...@gmail.com> wrote:
>
> Hi Dongjoon, I think this conversation is not advancing anymore. I
> personally consider the matter closed unless you can find other support or
> respond with more specifics. While this perhaps should be on private@, I
> think it's not wrong as an instructive discussion on dev@.
>
>
>
> I don't believe you've made a clear argument about the problem, or how it
> relates specifically to policy. Nevertheless I will show you my logic.
>
>
>
> You are asserting that a vendor cannot call a product Apache Spark 3.4.0
> if it omits a patch updating a Scala maintenance version. This difference
> has no known impact on usage, as far as I can tell.
>
>
>
> Let's see what policy requires:
>
>
>
> 1/ All source code changes must meet at least one of the acceptable
> change criteria set out below:
>
> - The change has been accepted by the relevant Apache project community for
> inclusion in a future release. Note that the process used to accept changes
> and how that acceptance is documented varies between projects.
> - A change is a fix for an undisclosed security issue; and the fix is not
> publicly disclosed as a security fix; and the Apache project has been
> notified of both the issue and the proposed fix; and the PMC has rejected
> neither the vulnerability report nor the proposed fix.
> - A change is a fix for a bug; and the Apache project has been notified of
> both the bug and the proposed fix; and the PMC has rejected neither the bug
> report nor the proposed fix.
> - Minor changes (e.g. alterations to the start-up and shutdown scripts,
> configuration files, file layout etc.) to integrate with the target
> platform providing the Apache project has not objected to those changes.
>
>
>
> The change you cite meets the 4th point, minor change, made for
> integration reasons. There is no known technical objection; this was after
> all at one point the state of Apache Spark.
>
>
>
> 2/ A version number must be used that both clearly differentiates it from
> an Apache Software Foundation release and clearly identifies the Apache
> Software Foundation version on which the software is based.
>
>
>
> Keep in mind the product here is not "Apache Spark", but the "Databricks
> Runtime 13.1 (including Apache Spark 3.4.0)". That is, there is far more
> than a version number differentiating this product from Apache Spark. There
> is no standalone distribution of Apache Spark anywhere here. I believe that
> easily matches the intent.
>
>
>
> 3/ The documentation must clearly identify the Apache Software Foundation
> version on which the software is based.
>
>
>
> Clearly, yes.
>
>
> 4/ The end user expects that the distribution channel will back-port
> fixes. It is not necessary to back-port all fixes. Selection of fixes to
> back-port must be consistent with the update policy of that distribution
> channel.
>
>
>
> I think this is safe to say too. Indeed this explicitly contemplates not
> back-porting a change.
>
>
>
>
>
> Backing up, you can see from this document that the spirit of it is: don't
> include changes in your own Apache Foo x.y that aren't wanted by the
> project, and still call it Apache Foo x.y. I don't believe your case
> matches this spirit either.
>
>
>
> I do think it's not crazy to suggest, hey vendor, would you call this
> "Apache Spark + patches" or ".vendor123". But that's at best a suggestion,
> and I think it does nothing in particular for users. You've made the
> suggestion, and I do not see that some police action from the PMC must follow.
>
>
>
>
>
> I think you're simply objecting to a vendor choice, but that is not
> on-topic here unless you can specifically rebut the reasoning above and
> show it's connected.
>
>
>
>
>
> On Wed, Jun 7, 2023 at 11:02 AM Dongjoon Hyun <dongj...@apache.org> wrote:
>
> Sean, it seems that you are confused here. We are not talking about your
> upper system (the notebook environment). We are talking about the
> submodule, "Apache Spark 3.4.0-databricks". Whatever you call it, both of
> us know that "Apache Spark 3.4.0-databricks" is different from "Apache
> Spark 3.4.0". You should not use "3.4.0" for your subsystem.
>
> > This also is aimed at distributions of "Apache Foo", not products that
> > "include Apache Foo", which are clearly not Apache Foo.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
