Dongjoon, I followed the conversation, and in my opinion, your concern is totally legit. It just feels that the discussion is focused solely on Databricks, and, as I said above, the same issue occurs with other vendors as well.
On Wed, Jun 7, 2023 at 10:28 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> To Grisha, we are talking about what the right way is and how to comply
> with the ASF legal advice which I shared in this thread from the
> "legal-discuss@" mailing list.
>
> https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4
> (legal-discuss@)
> https://www.apache.org/foundation/marks/downstream.html#source (ASF
> Website)
>
> Dongjoon
>
>
> On Wed, Jun 7, 2023 at 12:16 PM Grisha Weintraub <
> grisha.weintr...@gmail.com> wrote:
>
>> Yes, in the Spark UI you have it as "3.1.2-amazon", but when you create
>> a cluster it's just Spark 3.1.2.
>>
>> On Wed, Jun 7, 2023 at 10:05 PM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>
>>> for EMR, I think they show 3.1.2-amazon in the Spark UI, no?
>>>
>>>
>>> On Wed, Jun 7, 2023 at 11:30 Grisha Weintraub <
>>> grisha.weintr...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am not taking sides here, but just for fairness, I think it should
>>>> be noted that AWS EMR does exactly the same thing.
>>>> We choose the EMR version (e.g., 6.4.0), and it has an associated
>>>> Spark version (e.g., 3.1.2).
>>>> The Spark version here is not the original Apache version but the
>>>> AWS Spark distribution.
>>>>
>>>> On Wed, Jun 7, 2023 at 8:24 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
>>>> wrote:
>>>>
>>>>> I disagree with you in several ways.
>>>>>
>>>>> The following is not a *minor* change like the given examples
>>>>> (alterations to the start-up and shutdown scripts, configuration
>>>>> files, file layout, etc.):
>>>>>
>>>>> > The change you cite meets the 4th point, minor change, made for
>>>>> > integration reasons.
>>>>>
>>>>> The following is also wrong. There was no such state of Apache Spark
>>>>> 3.4.0 at any point after the 3.4.0 tag was created. The Apache Spark
>>>>> community didn't allow the Scala-reverting patches in either the
>>>>> `master` branch or `branch-3.4`.
>>>>> > There is no known technical objection; this was after all at one
>>>>> > point the state of Apache Spark.
>>>>>
>>>>> Is the following your main point? So, you are selling a box
>>>>> "including Harry Potter by J. K. Rowling, whose main character is
>>>>> Barry instead of Harry", but it's okay because you didn't sell the
>>>>> book itself? And, as a cloud vendor, you let people borrow the box
>>>>> instead of selling it, like private libraries do?
>>>>>
>>>>> > There is no standalone distribution of Apache Spark anywhere here.
>>>>>
>>>>> We are not asking for a big thing. Why are you so reluctant to say
>>>>> you are not "Apache Spark 3.4.0" by simply saying "Apache Spark
>>>>> 3.4.0-databricks"? What is the marketing reason here?
>>>>>
>>>>> Dongjoon.
>>>>>
>>>>>
>>>>> On Wed, Jun 7, 2023 at 9:27 AM Sean Owen <sro...@gmail.com> wrote:
>>>>>
>>>>>> Hi Dongjoon, I think this conversation is not advancing anymore. I
>>>>>> personally consider the matter closed unless you can find other
>>>>>> support or respond with more specifics. While this perhaps should
>>>>>> be on private@, I think it's not wrong as an instructive discussion
>>>>>> on dev@.
>>>>>>
>>>>>> I don't believe you've made a clear argument about the problem, or
>>>>>> how it relates specifically to policy. Nevertheless, I will show
>>>>>> you my logic.
>>>>>>
>>>>>> You are asserting that a vendor cannot call a product Apache Spark
>>>>>> 3.4.0 if it omits a patch updating a Scala maintenance version.
>>>>>> This difference has no known impact on usage, as far as I can tell.
>>>>>>
>>>>>> Let's see what the policy requires:
>>>>>>
>>>>>> 1/ All source code changes must meet at least one of the acceptable
>>>>>> changes criteria set out below:
>>>>>> - The change has been accepted by the relevant Apache project
>>>>>> community for inclusion in a future release. Note that the process
>>>>>> used to accept changes and how that acceptance is documented varies
>>>>>> between projects.
>>>>>> - A change is a fix for an undisclosed security issue; and the fix
>>>>>> is not publicly disclosed as a security fix; and the Apache project
>>>>>> has been notified of both the issue and the proposed fix; and the
>>>>>> PMC has rejected neither the vulnerability report nor the proposed
>>>>>> fix.
>>>>>> - A change is a fix for a bug; and the Apache project has been
>>>>>> notified of both the bug and the proposed fix; and the PMC has
>>>>>> rejected neither the bug report nor the proposed fix.
>>>>>> - Minor changes (e.g., alterations to the start-up and shutdown
>>>>>> scripts, configuration files, file layout, etc.) to integrate with
>>>>>> the target platform, providing the Apache project has not objected
>>>>>> to those changes.
>>>>>>
>>>>>> The change you cite meets the 4th point, a minor change made for
>>>>>> integration reasons. There is no known technical objection; this
>>>>>> was after all at one point the state of Apache Spark.
>>>>>>
>>>>>>
>>>>>> 2/ A version number must be used that both clearly differentiates
>>>>>> it from an Apache Software Foundation release and clearly
>>>>>> identifies the Apache Software Foundation version on which the
>>>>>> software is based.
>>>>>>
>>>>>> Keep in mind the product here is not "Apache Spark", but the
>>>>>> "Databricks Runtime 13.1 (including Apache Spark 3.4.0)". That is,
>>>>>> there is far more than a version number differentiating this
>>>>>> product from Apache Spark. There is no standalone distribution of
>>>>>> Apache Spark anywhere here. I believe that easily matches the
>>>>>> intent.
>>>>>>
>>>>>>
>>>>>> 3/ The documentation must clearly identify the Apache Software
>>>>>> Foundation version on which the software is based.
>>>>>>
>>>>>> Clearly, yes.
>>>>>>
>>>>>>
>>>>>> 4/ The end user expects that the distribution channel will
>>>>>> back-port fixes. It is not necessary to back-port all fixes.
>>>>>> Selection of fixes to back-port must be consistent with the update
>>>>>> policy of that distribution channel.
>>>>>>
>>>>>> I think this is safe to say too. Indeed, this explicitly
>>>>>> contemplates not back-porting a change.
>>>>>>
>>>>>>
>>>>>> Backing up, you can see from this document that its spirit is:
>>>>>> don't include changes in your own Apache Foo x.y that aren't
>>>>>> wanted by the project, and still call it Apache Foo x.y. I don't
>>>>>> believe your case matches this spirit either.
>>>>>>
>>>>>> I do think it's not crazy to suggest: hey vendor, would you call
>>>>>> this "Apache Spark + patches" or ".vendor123"? But that's at best
>>>>>> a suggestion, and I think it does nothing in particular for users.
>>>>>> You've made the suggestion, and I do not see that some police
>>>>>> action from the PMC must follow.
>>>>>>
>>>>>>
>>>>>> I think you're simply objecting to a vendor choice, but that is
>>>>>> not on-topic here unless you can specifically rebut the reasoning
>>>>>> above and show it's connected.
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 7, 2023 at 11:02 AM Dongjoon Hyun <dongj...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Sean, it seems that you are confused here. We are not talking
>>>>>>> about your upper system (the notebook environment). We are
>>>>>>> talking about the submodule, "Apache Spark 3.4.0-databricks".
>>>>>>> Whatever you call it, both of us know "Apache Spark
>>>>>>> 3.4.0-databricks" is different from "Apache Spark 3.4.0". You
>>>>>>> should not use "3.4.0" for your subsystem.
>>>>>>>
>>>>>>> > This also is aimed at distributions of "Apache Foo", not
>>>>>>> > products that "include Apache Foo", which are clearly not
>>>>>>> > Apache Foo.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
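
[Editor's note] For readers following the version-string detail in the thread above: the naming convention under discussion (e.g., "3.1.2-amazon" as shown in the EMR Spark UI, or the "3.4.0-databricks" suffix Dongjoon proposes) can be checked programmatically. The sketch below is purely illustrative: `vendor_suffix` is a hypothetical helper, not a Spark API, and in a live session the input string would come from the real `spark.version` (or `sc.version`) property rather than the literals used here.

```python
import re
from typing import Optional

def vendor_suffix(version: str) -> Optional[str]:
    """Return the vendor suffix of a Spark version string, if any.

    "3.4.0"        -> None      (a plain Apache release string)
    "3.1.2-amazon" -> "amazon"  (a vendor-distribution string)
    """
    m = re.fullmatch(r"\d+\.\d+\.\d+(?:-(.+))?", version)
    if m is None:
        raise ValueError(f"unrecognized Spark version string: {version!r}")
    return m.group(1)  # None when there is no suffix

# In a live PySpark session the string would come from `spark.version`;
# here we use the literal examples mentioned in the thread.
print(vendor_suffix("3.1.2-amazon"))  # -> amazon
print(vendor_suffix("3.4.0"))         # -> None
```

Note that Apache's own pre-release tags (e.g., "-preview") would also carry a suffix under this naive split, so a real check would need an allow-list of Apache-issued suffixes.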