the guava version is already included in the version of the flink-shaded module.

For example, for the first release of flink-shaded-guava the version would be 1-18.0.

1 is the version of flink-shaded itself, 18.0 is the guava version.

On 26.06.2017 14:01, Stephan Ewen wrote:
Looks good, thanks Chesnay!.

How about including the dependency version names in the module names, like
"flink-shaded-guava-18.0"?

On Mon, Jun 26, 2017 at 11:49 AM, Chesnay Schepler <ches...@apache.org>
wrote:

The new repo was created and is accessible here:
https://github.com/apache/flink-shaded

I've already opened a PR to add a shaded guava module.

Once the shaded-guava module is merged I would like to do a first release
of flink-shaded,
only containing guava. I already have a branch with all the changes
necessary to
integrate this dependency into Flink.

The alternative would be to first create shaded modules for all
dependencies and make a
single release, but I feel that would delay things quite a bit.



On 21.06.2017 17:00, Robert Metzger wrote:

Okay, I'll request a repo for the shading.

On Wed, Jun 21, 2017 at 1:38 PM, Chesnay Schepler <ches...@apache.org>
wrote:

I like your suggestion Robert. A lot actually.
Having the actual dependency version (i.e 18 for guava) in the version
should improve clarity a lot.

Originally i intended to release 1 artifact per Flink version, with the
normal versioning scheme
that we use. But given that the shaded dependencies aren't changed often
(even rarely might be a stretch),
and aren't actually coupled to the Flink release cycle this doesn't make
a
lot of sense.

Having separate repos looks like a reasonable separation of concerns. The
release for Flink itself
would work just like it does now; we don't have to modify any scripts or
do extra steps.

Since the build, release and development process are separate (since
flink-shaded isn't part of Flink build
process, has a separate release process and changes to it will /never
/require immediate changes to Flink)
it seems like a very good candidate to move it into a separate repo.


On 21.06.2017 11:26, Robert Metzger wrote:

Its not completely clear to me how we want to version the shaded
dependencies, and where we are putting them.

One concern are the official apache release rules. If we want to release
something to maven central, we need to do a proper vote over a source
archive.
I would propose to create a new repository "flink-shaded.git" that
contains
the following maven module structure:
- flink-shaded: 1
      - flink-shaded-asm: 1-5.2
      - flink-shaded-guava: 1-18.0
      - ...

The number indicates the version (for ASM, I've just guessed).
The version for the parent "flink-shaded" needs to be updated on each
parent pom change (new module added, new / changed plugins, ...)

We could create a separate release script in this repository that
creates
the flink-shaded-src.zip from the code and deploys the artifacts to the
maven staging area.

The advantage of a separate repo would be that we don't need to maintain
separate maven projects in the same git repo.
Also, the src archives for the release vote can be created from the repo
content (without much filtering).


On Tue, Jun 20, 2017 at 9:44 PM, Stephan Ewen <se...@apache.org> wrote:

I like this approach.

Two additional things can be mention here:

     - We need to deploy these artifacts independently and not as part
of
the
build. That is a manual step once per "bump" in the dependency of that
library.

     - We reduce the shading complexity of the original build and should
thus
also speed up build times :-)

Stephan




On Tue, Jun 20, 2017 at 1:15 PM, Chesnay Schepler <ches...@apache.org>
wrote:

I would like to start working on this.

I've looked into adding a flink-shaded-guava module. Working against
the
shaded namespaces seems
to work without problems from the IDE, and we could forbid un-shaded
usages with checkstyle.

So for the list of dependencies that we want to shade we currently
got:

    * asm
    * guava
    * netty
    * hadoop
    * curator

I've had a chat with Stephan Ewan and he brought up kryo + chill as
well.

The nice thing is that we can do this incrementally, one dependency
at a
time. As such i would propose
to go through the whole process for guava and see what problems arise.

This would include adding a flink-shaded module and a child
flink-shaded-guava module to the flink repository
that are not part of the build process, replacing all usages of guava
in
Flink, adding the
checkstyle rule (optional) and deploying the artifact to maven
central.


On 11.05.2017 10:54, Stephan Ewen wrote:

@Ufuk  - I have never set up artifact deployment in Maven, could need
some
help there.

Regarding shading Netty, I agree, would be good to do that as well...

On Thu, May 11, 2017 at 10:52 AM, Ufuk Celebi <u...@apache.org>
wrote:

The advantages you've listed sound really compelling to me.

- Do you have time to implement these changes or do we need a
volunteer?
;)

- I assume that republishing the artifacts as you propose doesn't
have
any new legal implications since we already publish them with our
JARs, right?

- We might think about adding Netty to the list of shaded artifacts
since some dependency conflicts were reported recently. Would have
to
double check the reported issues before doing that though. ;-)

– Ufuk


On Wed, May 10, 2017 at 8:45 PM, Stephan Ewen <se...@apache.org>

wrote:
@chesnay: I used ASM as an example in the proposal. Maybe I did not

say
that clearly.
If we like that approach, we should deal with the other libraries (at
least

the frequently used ones) in the same way.

I would imagine to have a project layout like that:

flink-shaded-deps
      - flink-shaded-asm
      - flink-shaded-guava
      - flink-shaded-curator
      - flink-shaded-hadoop


"flink-shaded-deps" would not be built every time (and not be
released
every time), but only when needed.






On Wed, May 10, 2017 at 7:28 PM, Chesnay Schepler <
ches...@apache.org
wrote:

I like the idea, thank you for bringing it up.

Given that the raised problems aren't really ASM specific would it
make
sense to create one flink-shaded module that contains all frequently
shaded
libraries? (or maybe even all shaded dependencies by core modules)
The

proposal limits the scope of this to ASM and i was wondering why.
I also remember that there was a discussion recently about why we

shade
things at all, and the idea of working against the shaded namespaces
was
brought up. Back then i was expressing doubts as to whether IDE's

would
properly support this; what's the state on that?
On 10.05.2017 18:18, Stephan Ewen wrote:
Hi!

This is a discussion about altering the way we handle dependencies
and
shading in Flink.
I ran into quite a view problems trying to adjust / fix some
shading
issues
during release validation.

The issue is tracked under: https://issues.apache.org/jira
/browse/FLINK-6529
Bring this discussion thread up because it is a bigger issue

*Problem*

Currently, Flink shades dependencies like ASM and Guava into all

jars
of
projects that reference it and relocate the classes.
There are some drawbacks to that approach, let's discuss them at
the

example of ASM:

       - The ASM classes are for example in flink-core,
flink-java,
flink-scala,
flink-runtime, etc.

       - Users that reference these dependencies have the classes
multiple
times
in the classpath. That is unclean (works, through, because the

classes
are
identical). The same happens when building the final dist. jar.
       - Some of these dependencies require to include license files
in
the
shaded jar. It is hard to impossible to build a good automatic
solution
for
that, partly due to Maven's very poor cross-project path support

       - Most importantly: Scala does not support shading really
well.

Scala

classes have references to classes in more places than just the
class

names
(apparently for Scala reflect support). Referencing a Scala
project

with

shaded ASM still requires to add a reference to unshaded ASM (at
least

as
a
compile dependency).
*Proposal*

I propose that we build and deploy a asm-flink-shaded version of
ASM

and

directly program against the relocated namespaces. Since we never
use

classes that we relocate in public interfaces, Flink users will
never

see
the relocated class names. Internally, it does not hurt to use them.
       - Proper maven dependency management, no hidden (shaded)
dependencies

       - One copy of each class for shaded dependencies
       - Proper Scala interoperability
       - Natural License management (license is part of deployed
asm-flink-shaded jar)


Happy to hear thoughts!

Stephan






Reply via email to