Sounds good. If the libraries follow semantic versioning properly, we could try and only have the major version in the module name?
On Mon, Jun 26, 2017 at 2:45 PM, Chesnay Schepler <ches...@apache.org> wrote: > You're raising good points, now i see why having the version in the name > is useful. > > I'll adjust my PR accordingly. And yes, ideally we only release the > modified modules, not everything again. > > > On 26.06.2017 14:29, Stephan Ewen wrote: > >> How should it work when one of the shaded dependencies is updated? We >> probably do not want to release all of them, just because the overall >> version number is in their version number. >> >> How about that: >> >> - Under normal circumstances, we only increase the version of the root >> project, when we add / bump a shaded dependency version >> >> - Shaded dependencies include the version in the same. That way we can >> possibly have two different versions of a dependency, such as >> "flink-shaded-kryo-2" and "flink-shaded-kryo-3". >> - Shaded dependencies should have the version in the relocation pattern >> as well, for the same reason as above (unless the two versions have >> separate namespaces already). >> >> - The released version of the module and artifact is always "1.0" >> unless >> we find that we did shading/relocation errors, in which case we need to >> re-release that artifact (1.1, 1.2, ...) >> >> >> >> On Mon, Jun 26, 2017 at 2:23 PM, Chesnay Schepler <ches...@apache.org> >> wrote: >> >> the guava version is already included in the version of the flink-shaded >>> module. >>> >>> For example, for the first release of flink-shaded-guava the version >>> would >>> be 1-18.0. >>> >>> 1 is the version of flink-shaded itself, 18.0 is the guava version. >>> >>> >>> On 26.06.2017 14:01, Stephan Ewen wrote: >>> >>> Looks good, thanks Chesnay!. >>>> >>>> How about including the dependency version names in the module names, >>>> like >>>> "flink-shaded-guava-18.0"? >>>> >>>> On Mon, Jun 26, 2017 at 11:49 AM, Chesnay Schepler <ches...@apache.org> >>>> wrote: >>>> >>>> The new repo was created and is accessible here: >>>> >>>>> https://github.com/apache/flink-shaded >>>>> >>>>> I've already opened a PR to add a shaded guava module. >>>>> >>>>> Once the shaded-guava module is merged I would like to do a first >>>>> release >>>>> of flink-shaded, >>>>> only containing guava. I already have a branch with all the changes >>>>> necessary to >>>>> integrate this dependency into Flink. >>>>> >>>>> The alternative would be to first create shaded modules for all >>>>> dependencies and make a >>>>> single release, but I feel that would delay things quite a bit. >>>>> >>>>> >>>>> >>>>> On 21.06.2017 17:00, Robert Metzger wrote: >>>>> >>>>> Okay, I'll request a repo for the shading. >>>>> >>>>>> On Wed, Jun 21, 2017 at 1:38 PM, Chesnay Schepler <ches...@apache.org >>>>>> > >>>>>> wrote: >>>>>> >>>>>> I like your suggestion Robert. A lot actually. >>>>>> >>>>>> Having the actual dependency version (i.e 18 for guava) in the version >>>>>>> should improve clarity a lot. >>>>>>> >>>>>>> Originally i intended to release 1 artifact per Flink version, with >>>>>>> the >>>>>>> normal versioning scheme >>>>>>> that we use. But given that the shaded dependencies aren't changed >>>>>>> often >>>>>>> (even rarely might be a stretch), >>>>>>> and aren't actually coupled to the Flink release cycle this doesn't >>>>>>> make >>>>>>> a >>>>>>> lot of sense. >>>>>>> >>>>>>> Having separate repos looks like a reasonable separation of concerns. >>>>>>> The >>>>>>> release for Flink itself >>>>>>> would work just like it does now; we don't have to modify any scripts >>>>>>> or >>>>>>> do extra steps. >>>>>>> >>>>>>> Since the build, release and development process are separate (since >>>>>>> flink-shaded isn't part of Flink build >>>>>>> process, has a separate release process and changes to it will /never >>>>>>> /require immediate changes to Flink) >>>>>>> it seems like a very good candidate to move it into a separate repo. >>>>>>> >>>>>>> >>>>>>> On 21.06.2017 11:26, Robert Metzger wrote: >>>>>>> >>>>>>> Its not completely clear to me how we want to version the shaded >>>>>>> >>>>>>> dependencies, and where we are putting them. >>>>>>>> >>>>>>>> One concern are the official apache release rules. If we want to >>>>>>>> release >>>>>>>> something to maven central, we need to do a proper vote over a >>>>>>>> source >>>>>>>> archive. >>>>>>>> I would propose to create a new repository "flink-shaded.git" that >>>>>>>> contains >>>>>>>> the following maven module structure: >>>>>>>> - flink-shaded: 1 >>>>>>>> - flink-shaded-asm: 1-5.2 >>>>>>>> - flink-shaded-guava: 1-18.0 >>>>>>>> - ... >>>>>>>> >>>>>>>> The number indicates the version (for ASM, I've just guessed). >>>>>>>> The version for the parent "flink-shaded" needs to be updated on >>>>>>>> each >>>>>>>> parent pom change (new module added, new / changed plugins, ...) >>>>>>>> >>>>>>>> We could create a separate release script in this repository that >>>>>>>> creates >>>>>>>> the flink-shaded-src.zip from the code and deploys the artifacts to >>>>>>>> the >>>>>>>> maven staging area. >>>>>>>> >>>>>>>> The advantage of a separate repo would be that we don't need to >>>>>>>> maintain >>>>>>>> separate maven projects in the same git repo. >>>>>>>> Also, the src archives for the release vote can be created from the >>>>>>>> repo >>>>>>>> content (without much filtering). >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jun 20, 2017 at 9:44 PM, Stephan Ewen <se...@apache.org> >>>>>>>> wrote: >>>>>>>> >>>>>>>> I like this approach. >>>>>>>> >>>>>>>> Two additional things can be mention here: >>>>>>>> >>>>>>>>> - We need to deploy these artifacts independently and not as >>>>>>>>> part >>>>>>>>> of >>>>>>>>> the >>>>>>>>> build. That is a manual step once per "bump" in the dependency of >>>>>>>>> that >>>>>>>>> library. >>>>>>>>> >>>>>>>>> - We reduce the shading complexity of the original build and >>>>>>>>> should >>>>>>>>> thus >>>>>>>>> also speed up build times :-) >>>>>>>>> >>>>>>>>> Stephan >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Jun 20, 2017 at 1:15 PM, Chesnay Schepler < >>>>>>>>> ches...@apache.org> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I would like to start working on this. >>>>>>>>> >>>>>>>>> I've looked into adding a flink-shaded-guava module. Working >>>>>>>>> against >>>>>>>>> >>>>>>>>>> the >>>>>>>>>> shaded namespaces seems >>>>>>>>>> to work without problems from the IDE, and we could forbid >>>>>>>>>> un-shaded >>>>>>>>>> usages with checkstyle. >>>>>>>>>> >>>>>>>>>> So for the list of dependencies that we want to shade we currently >>>>>>>>>> got: >>>>>>>>>> >>>>>>>>>> * asm >>>>>>>>>> * guava >>>>>>>>>> * netty >>>>>>>>>> * hadoop >>>>>>>>>> * curator >>>>>>>>>> >>>>>>>>>> I've had a chat with Stephan Ewan and he brought up kryo + chill >>>>>>>>>> as >>>>>>>>>> well. >>>>>>>>>> >>>>>>>>>> The nice thing is that we can do this incrementally, one >>>>>>>>>> dependency >>>>>>>>>> at a >>>>>>>>>> time. As such i would propose >>>>>>>>>> to go through the whole process for guava and see what problems >>>>>>>>>> arise. >>>>>>>>>> >>>>>>>>>> This would include adding a flink-shaded module and a child >>>>>>>>>> flink-shaded-guava module to the flink repository >>>>>>>>>> that are not part of the build process, replacing all usages of >>>>>>>>>> guava >>>>>>>>>> in >>>>>>>>>> Flink, adding the >>>>>>>>>> checkstyle rule (optional) and deploying the artifact to maven >>>>>>>>>> central. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 11.05.2017 10:54, Stephan Ewen wrote: >>>>>>>>>> >>>>>>>>>> @Ufuk - I have never set up artifact deployment in Maven, could >>>>>>>>>> need >>>>>>>>>> some >>>>>>>>>> help there. >>>>>>>>>> >>>>>>>>>> Regarding shading Netty, I agree, would be good to do that as >>>>>>>>>> >>>>>>>>>>> well... >>>>>>>>>>> >>>>>>>>>>> On Thu, May 11, 2017 at 10:52 AM, Ufuk Celebi <u...@apache.org> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> The advantages you've listed sound really compelling to me. >>>>>>>>>>> >>>>>>>>>>> - Do you have time to implement these changes or do we need a >>>>>>>>>>> >>>>>>>>>>> volunteer? >>>>>>>>>>>> >>>>>>>>>>>> ;) >>>>>>>>>>> >>>>>>>>>> - I assume that republishing the artifacts as you propose doesn't >>>>>>>>>> >>>>>>>>>>> have >>>>>>>>>>>> any new legal implications since we already publish them with >>>>>>>>>>>> our >>>>>>>>>>>> JARs, right? >>>>>>>>>>>> >>>>>>>>>>>> - We might think about adding Netty to the list of shaded >>>>>>>>>>>> artifacts >>>>>>>>>>>> since some dependency conflicts were reported recently. Would >>>>>>>>>>>> have >>>>>>>>>>>> to >>>>>>>>>>>> double check the reported issues before doing that though. ;-) >>>>>>>>>>>> >>>>>>>>>>>> – Ufuk >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, May 10, 2017 at 8:45 PM, Stephan Ewen <se...@apache.org >>>>>>>>>>>> > >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> @chesnay: I used ASM as an example in the proposal. Maybe I did >>>>>>>>>>> not >>>>>>>>>>> >>>>>>>>>> say >>>>>>>>>> >>>>>>>>>>> that clearly. >>>>>>>>>>> If we like that approach, we should deal with the other libraries >>>>>>>>>>> (at >>>>>>>>>>> >>>>>>>>>>> least >>>>>>>>>>>> >>>>>>>>>>>>> the frequently used ones) in the same way. >>>>>>>>>>>>> >>>>>>>>>>>> I would imagine to have a project layout like that: >>>>>>>>>>>> >>>>>>>>>>>>> flink-shaded-deps >>>>>>>>>>>>> - flink-shaded-asm >>>>>>>>>>>>> - flink-shaded-guava >>>>>>>>>>>>> - flink-shaded-curator >>>>>>>>>>>>> - flink-shaded-hadoop >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> "flink-shaded-deps" would not be built every time (and not be >>>>>>>>>>>>> released >>>>>>>>>>>>> every time), but only when needed. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, May 10, 2017 at 7:28 PM, Chesnay Schepler < >>>>>>>>>>>>> ches...@apache.org >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> I like the idea, thank you for bringing it up. >>>>>>>>>>>>> >>>>>>>>>>>>> Given that the raised problems aren't really ASM specific would >>>>>>>>>>>>> it >>>>>>>>>>>>> >>>>>>>>>>>>> make >>>>>>>>>>>>>> >>>>>>>>>>>>>> sense to create one flink-shaded module that contains all >>>>>>>>>>>>> >>>>>>>>>>>> frequently >>>>>>>>>>>> >>>>>>>>>>>> shaded >>>>>>>>>>> >>>>>>>>>>> libraries? (or maybe even all shaded dependencies by core >>>>>>>>>>>> modules) >>>>>>>>>>>> >>>>>>>>>>>>> The >>>>>>>>>>>>> >>>>>>>>>>>>> proposal limits the scope of this to ASM and i was wondering >>>>>>>>>>>>> why. >>>>>>>>>>>>> >>>>>>>>>>>>> I also remember that there was a discussion recently about why >>>>>>>>>>>>>> we >>>>>>>>>>>>>> >>>>>>>>>>>>>> shade >>>>>>>>>>>>>> >>>>>>>>>>>>>> things at all, and the idea of working against the shaded >>>>>>>>>>>>> >>>>>>>>>>>> namespaces >>>>>>>>>>>> >>>>>>>>>>>> was >>>>>>>>>>> >>>>>>>>>>> brought up. Back then i was expressing doubts as to whether IDE's >>>>>>>>>>>> >>>>>>>>>>>>> would >>>>>>>>>>>>>> >>>>>>>>>>>>>> properly support this; what's the state on that? >>>>>>>>>>>>> >>>>>>>>>>>> On 10.05.2017 18:18, Stephan Ewen wrote: >>>>>>>>>>> >>>>>>>>>>> Hi! >>>>>>>>>>>> >>>>>>>>>>>>> This is a discussion about altering the way we handle >>>>>>>>>>>>>> dependencies >>>>>>>>>>>>>> >>>>>>>>>>>>>> and >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> shading in Flink. >>>>>>>>>>>>>> >>>>>>>>>>>>> I ran into quite a view problems trying to adjust / fix some >>>>>>>>>>>> shading >>>>>>>>>>>> >>>>>>>>>>>>> issues >>>>>>>>>>>>>>> during release validation. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The issue is tracked under: https://issues.apache.org/jira >>>>>>>>>>>>>>> /browse/FLINK-6529 >>>>>>>>>>>>>>> Bring this discussion thread up because it is a bigger issue >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Problem* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Currently, Flink shades dependencies like ASM and Guava into >>>>>>>>>>>>>>> all >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> jars >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> of >>>>>>>>>>>>>> >>>>>>>>>>>>> projects that reference it and relocate the classes. >>>>>>>>>>>> There are some drawbacks to that approach, let's discuss them at >>>>>>>>>>>> >>>>>>>>>>>>> the >>>>>>>>>>>>>> >>>>>>>>>>>>>> example of ASM: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> - The ASM classes are for example in flink-core, >>>>>>>>>>>>>>> flink-java, >>>>>>>>>>>>>>> flink-scala, >>>>>>>>>>>>>>> flink-runtime, etc. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> - Users that reference these dependencies have the >>>>>>>>>>>>>>> classes >>>>>>>>>>>>>>> multiple >>>>>>>>>>>>>>> times >>>>>>>>>>>>>>> in the classpath. That is unclean (works, through, because >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> classes >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> are >>>>>>>>>>>>>> >>>>>>>>>>>>> identical). The same happens when building the final dist. jar. >>>>>>>>>>>> - Some of these dependencies require to include license >>>>>>>>>>>> >>>>>>>>>>>>> files >>>>>>>>>>>>> >>>>>>>>>>>>> in >>>>>>>>>>>>>> >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> shaded jar. It is hard to impossible to build a good >>>>>>>>>>>>>>> automatic >>>>>>>>>>>>>>> solution >>>>>>>>>>>>>>> for >>>>>>>>>>>>>>> that, partly due to Maven's very poor cross-project path >>>>>>>>>>>>>>> support >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> - Most importantly: Scala does not support shading >>>>>>>>>>>>>>> really >>>>>>>>>>>>>>> well. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Scala >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> classes have references to classes in more places than just >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> >>>>>>>>>>>>>> class >>>>>>>>>>>>> >>>>>>>>>>>>> names >>>>>>>>>>>>> >>>>>>>>>>>>> (apparently for Scala reflect support). Referencing a Scala >>>>>>>>>>>>>> >>>>>>>>>>>>>>> project >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> with >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> shaded ASM still requires to add a reference to unshaded ASM >>>>>>>>>>>>>>> >>>>>>>>>>>>>> (at >>>>>>>>>>>>>> >>>>>>>>>>>>>> least >>>>>>>>>>>>> >>>>>>>>>>>>> as >>>>>>>>>>>>> >>>>>>>>>>>>> a >>>>>>>>>>>>>> >>>>>>>>>>>>>> compile dependency). >>>>>>>>>>>>> >>>>>>>>>>>>> *Proposal* >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I propose that we build and deploy a asm-flink-shaded version >>>>>>>>>>>>>>> of >>>>>>>>>>>>>>> ASM >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> directly program against the relocated namespaces. Since we >>>>>>>>>>>>>>> >>>>>>>>>>>>>> never >>>>>>>>>>>>>> >>>>>>>>>>>>>> use >>>>>>>>>>>>> >>>>>>>>>>>>> classes that we relocate in public interfaces, Flink users will >>>>>>>>>>>>> >>>>>>>>>>>>> never >>>>>>>>>>>>>> >>>>>>>>>>>>>> see >>>>>>>>>>>>>> >>>>>>>>>>>>> the relocated class names. Internally, it does not hurt to use >>>>>>>>>>>> >>>>>>>>>>> them. >>>>>>>>>>> >>>>>>>>>>> - Proper maven dependency management, no hidden (shaded) >>>>>>>>>>>> >>>>>>>>>>>>> dependencies >>>>>>>>>>>>>> >>>>>>>>>>>>>>> - One copy of each class for shaded dependencies >>>>>>>>>>>>>>> >>>>>>>>>>>>>> - Proper Scala interoperability >>>>>>>>>>>>> >>>>>>>>>>>>> - Natural License management (license is part of >>>>>>>>>>>>>> deployed >>>>>>>>>>>>>> >>>>>>>>>>>>>>> asm-flink-shaded jar) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Happy to hear thoughts! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Stephan >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >