Not sure we'll be able to replace them all. Things like guava Table and
Multimap don't have great replacements in Java8.

On Wed, Jan 31, 2018 at 10:11 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> +1, it was on my TODO for a while waiting the Java8 update.
>
> Regards
> JB
>
> On 02/01/2018 06:56 AM, Romain Manni-Bucau wrote:
> > Why not dropping guava for all beam codebase? With java 8 it is quite
> easy to do
> > it and avoid a bunch of conflicts. Did it in 2 projects with quite a
> good result.
> >
> > Le 1 févr. 2018 06:50, "Lukasz Cwik" <lc...@google.com
> > <mailto:lc...@google.com>> a écrit :
> >
> >     Make sure to include the guava version in the artifact name so that
> we can
> >     have multiple vendored versions.
> >
> >     On Wed, Jan 31, 2018 at 9:16 PM, Kenneth Knowles <k...@google.com
> >     <mailto:k...@google.com>> wrote:
> >
> >         I didn't have time for this, but it just bit me. We definitely
> have
> >         Guava on the API surface of runner support code in ways that get
> >         incompatibly shaded. I will probably start "1a" by making a
> shaded
> >         library org.apache.beam:vendored-guava and starting to use it.
> It sounds
> >         like there is generally unanimous support for that much, anyhow.
> >
> >         Kenn
> >
> >         On Wed, Dec 13, 2017 at 8:31 AM, Aljoscha Krettek <
> aljos...@apache.org
> >         <mailto:aljos...@apache.org>> wrote:
> >
> >             Thanks Ismaël for bringing up this discussion again!
> >
> >             I would be in favour of 1) and more specifically of 1a)
> >
> >             Aljoscha
> >
> >
> >>             On 12. Dec 2017, at 18:56, Lukasz Cwik <lc...@google.com
> >>             <mailto:lc...@google.com>> wrote:
> >>
> >>             You can always run tests on post shaded artifacts instead
> of the
> >>             compiled classes, it just requires us to change our maven
> surefire
> >>             / gradle test configurations but it is true that most IDEs
> would
> >>             behave better with a dependency jar unless you delegate all
> the
> >>             build/test actions to the build system and then it won't
> matter.
> >>
> >>             On Mon, Dec 11, 2017 at 9:05 PM, Kenneth Knowles <
> k...@google.com
> >>             <mailto:k...@google.com>> wrote:
> >>
> >>                 There's also, with additional overhead,
> >>
> >>                 1a) A relocated and shipped package for each thing we
> want to
> >>                 relocate. I think this has also been tried outside
> Beam...
> >>
> >>                 Pros:
> >>                 * all the pros of 1) plus no bloat beyond what is
> necessary
> >>                 Cons:
> >>                 * abandons whitelist approach for public deps,
> reverting to
> >>                 blacklist approach for trouble things like guava, so a
> bit
> >>                 less principled
> >>
> >>                 For both 1) and 1a) I would add:
> >>
> >>                 Pros:
> >>                 * clearly readable dependency since code will `import
> >>                 org.apache.beam.private.guava21` and IDEs will
> understand it
> >>                 is a distinct lilbrary
> >>                 * can run tests on unpackaged classes, as long as deps
> are
> >>                 shaded or provided as jars
> >>                 * no mysterious action at a distance from inherited
> configuration
> >>                 Cons:
> >>                 * need to adjust imports
> >>
> >>                 Kenn
> >>
> >>                 On Mon, Dec 11, 2017 at 9:57 AM, Lukasz Cwik <
> lc...@google.com
> >>                 <mailto:lc...@google.com>> wrote:
> >>
> >>                     I would suggest that either we use:
> >>                     1) A common deps package containing shaded
> dependencies
> >>                     allows for
> >>                     Pros
> >>                     * doesn't require the user to build an uber jar
> >>                     Risks
> >>                     * dependencies package will keep growing even if
> something
> >>                     is or isn't needed by all of Apache Beam leading to
> a
> >>                     large jar anyways negating any space savings
> >>
> >>                     2) Shade within each module to a common location
> like
> >>                     org.apache.beam.relocated.guava....
> >>                     Pros
> >>                     * you only get the shaded dependencies of the
> things that
> >>                     are required
> >>                     * its one less dependency for users to manage
> >>                     Risks
> >>                     * requires an uber jar to be built to get the space
> >>                     savings (either by a user or a distribution of
> Apache
> >>                     Beam) otherwise we negate any space savings.
> >>
> >>                     If we either use a common relocation scheme or a
> >>                     dependencies jar then each relocation should
> specifically
> >>                     contain the version number of the package because
> we would
> >>                     like to allow for us to be using two different
> versions of
> >>                     the same library.
> >>
> >>                     For the common deps package approach, should we
> check in
> >>                     code where the imports contain the relocated
> location
> >>                     (e.g. import org.apache.beam.guava.20.0.com
> >>                     <http://org.apache.beam.guava.20.0.com/
> >.google.common.collect.ImmutableList)?
> >>
> >>
> >>                     On Mon, Dec 11, 2017 at 8:47 AM, Jean-Baptiste
> Onofré
> >>                     <j...@nanthrax.net <mailto:j...@nanthrax.net>> wrote:
> >>
> >>                         Thanks for bringing that back.
> >>
> >>                         Indeed guava is shaded in different uber-jar.
> Maybe we
> >>                         can have a common deps module that we include
> once
> >>                         (but the user will have to explicitly define
> the dep) ?
> >>
> >>                         Basically, what do you propose for protobuf
> >>                         (unfortunately, I don't see an obvious) ?
> >>
> >>                         Regards
> >>                         JB
> >>
> >>
> >>                         On 12/11/2017 05:35 PM, Ismaël Mejía wrote:
> >>
> >>                             Hello, I wanted to bring back this subject
> because
> >>                             I think we should
> >>                             take action on this and at least first have
> a
> >>                             shaded version of guava.
> >>                             I was playing with a toy project and I did
> the
> >>                             procedure we use to
> >>                             submit jars to a Hadoop cluster via
> Flink/Spark
> >>                             which involves
> >>                             creating an uber jar and I realized that
> the size
> >>                             of the jar was way
> >>                             bigger than I expected, and the fact that
> we shade
> >>                             guava in every
> >>                             module contributes to this. I found guava
> shaded on:
> >>
> >>                             sdks/java/core
> >>                             runners/core-construction-java
> >>                             runners/core-java
> >>                             model/job-management
> >>                             runners/spark
> >>                             sdks/java/io/hadoop-file-system
> >>                             sdks/java/io/kafka
> >>
> >>                             This means at least 6 times more of the
> size it
> >>                             should which counts in
> >>                             around 15MB more (2.4MB*6 deps) of extra
> weight
> >>                             that we can simply
> >>                             reduce by using a shaded version of the
> library.
> >>
> >>                             I add this point to the previous ones
> mentioned
> >>                             during the discussion
> >>                             because this goes against the end user
> experience
> >>                             and affects us all
> >>                             (devs/users).
> >>
> >>                             Another question is if we should shade (and
> how)
> >>                             protocol buffers
> >>                             because now with the portability work we are
> >>                             exposing it closer to the
> >>                             end users. I say this because I also found
> an
> >>                             issue while running a
> >>                             job on YARN with the spark runner because
> >>                             hadoop-common includes
> >>                             protobuf-java 2 and I had to explicitly
> provide
> >>                             protocol-buffers 3 as
> >>                             a dependency to be able to use triggers
> (note the
> >>                             Spark runner
> >>                             translates them using some method from
> >>                             runners/core-java). Since
> >>                             hadoop-common is provided in the cluster
> with the
> >>                             older version of
> >>                             protobuf, I am afraid that this will bite
> us in
> >>                             the future.
> >>
> >>                             Ismaël
> >>
> >>                             ps. There is already a JIRA for that
> shading for
> >>                             protobuf on
> >>                             hadoop-common but this is not coming until
> version
> >>                             3 is out.
> >>                             https://issues.apache.org/
> jira/browse/HADOOP-13136
> >>                             <https://issues.apache.org/
> jira/browse/HADOOP-13136>
> >>
> >>                             ps2. Extra curious situation is to see that
> the
> >>                             dataflow-runner ends
> >>                             up having guava shaded twice via its shaded
> version on
> >>                             core-construction-java.
> >>
> >>                             ps3. Of course this message means a
> de-facto +1 at
> >>                             least to do it for
> >>                             guava and evaluate it for other libs.
> >>
> >>
> >>                             On Tue, Oct 17, 2017 at 7:29 PM, Lukasz Cwik
> >>                             <lc...@google.com.invalid
> >>                             <mailto:lc...@google.com.invalid>> wrote:
> >>
> >>                                 An issue to call out is how to deal
> with our
> >>                                 generated code (.avro and
> >>                                 .proto) as I don't believe those
> plugins allow
> >>                                 you to generate code using a
> >>                                 shaded package prefix on imports.
> >>
> >>                                 On Tue, Oct 17, 2017 at 10:28 AM,
> Thomas Groh
> >>                                 <tg...@google.com.invalid
> >>                                 <mailto:tg...@google.com.invalid>>
> >>                                 wrote:
> >>
> >>                                     +1 to the goal. I'm hugely in favor
> of not
> >>                                     doing the same shading work
> >>                                     every time for dependencies we know
> we'll use.
> >>
> >>                                     This also means that if we end up
> pulling
> >>                                     in transitive dependencies we
> >>                                     don't want in any particular module
> we can
> >>                                     avoid having to adjust our
> >>                                     repackaging strategy for that
> module -
> >>                                     which I have run into face-first in
> >>                                     the past.
> >>
> >>                                     On Tue, Oct 17, 2017 at 9:48 AM,
> Kenneth
> >>                                     Knowles <k...@google.com.invalid
> >>                                     <mailto:k...@google.com.invalid>>
> >>                                     wrote:
> >>
> >>                                         Hi all,
> >>
> >>                                         Shading is a big part of how we
> keep
> >>                                         our dependencies sane in Beam.
> But
> >>                                         downsides: shading is super
> slow,
> >>                                         causes massive jar bloat, and
> kind of
> >>                                         hard to get right because
> artifacts
> >>                                         and namespaces are not 1-to-1.
> >>
> >>                                         I know that some communities
> >>                                         distribute their own shaded
> >>                                         distributions of
> >>                                         dependencies. I had a thought
> about
> >>                                         doing something similar that I
> wanted
> >>                                         to throw out there for people
> to poke
> >>                                         holes in.
> >>
> >>                                         To set the scene, here is how I
> view
> >>                                         shading:
> >>
> >>                                           - A module has public
> dependencies
> >>                                         and private dependencies.
> >>                                           - Public deps are used for
> data
> >>                                         interchange; users must share
> these
> >>
> >>                                     deps.
> >>
> >>                                           - Private deps are just
> >>                                         functionality and can be hidden
> (in
> >>                                         our case,
> >>                                         relocated + bundled)
> >>                                           - It isn't necessarily that
> simple,
> >>                                         because public and private deps
> >>
> >>                                     might
> >>
> >>                                         interact in higher-order ways
> >>                                         ("public" is contagious)
> >>
> >>                                         Shading is an implementation
> detail of
> >>                                         expressing these
> characteristics.
> >>
> >>                                     We
> >>
> >>                                         use shading selectively because
> of its
> >>                                         downsides I mentioned above.
> >>
> >>                                         But what about this idea:
> Introduce
> >>                                         shaded deps as a single separate
> >>                                         artifact.
> >>
> >>                                           - sdks/java/private-deps:
> bundled
> >>                                         uber jar with relocated
> versions of
> >>                                         everything we want to shade
> >>
> >>                                           - sdks/java/core and
> >>                                         sdks/java/harness: no
> relocation or
> >>                                         bundling -
> >>                                         depends on
> >>                                         `beam-sdks-java-private-deps`
> and
> >>                                         imports like
> >>                                         `org.apache.beam.sdk.private.
> com.google.common`
> >>                                         directly (this is what
> >>                                         they
> >>                                         are rewritten to
> >>
> >>                                         Some benefits
> >>
> >>                                           - much faster builds of other
> modules
> >>                                           - only one shaded uber jar
> >>                                           - rare/no rebuilds of the
> uber jar
> >>                                           - can use maven enforcer to
> forbid
> >>                                         imports like com.google.common
> >>                                           - configuration all in one
> place
> >>                                           - no automated rewriting of
> our real
> >>                                         code, which has led to some
> major
> >>                                         confusion
> >>                                           - easy to implement
> incrementally
> >>
> >>                                         Downsides:
> >>
> >>                                           - plenty of effort work to
> get there
> >>                                           - unclear how many different
> such
> >>                                         deps modules we need; sharing
> them
> >>
> >>                                     could
> >>
> >>                                         get weird
> >>                                           - if we hit a roadblock, we
> will
> >>                                         have committed a lot of time
> >>
> >>                                         Just something I was musing as
> I spent
> >>                                         another evening waiting for slow
> >>                                         builds to try to confirm
> changes to
> >>                                         brittle poms.
> >>
> >>                                         Kenn
> >>
> >>
> >>
> >>                         --
> >>                         Jean-Baptiste Onofré
> >>                         jbono...@apache.org <mailto:jbono...@apache.org
> >
> >>                         http://blog.nanthrax.net <
> http://blog.nanthrax.net/>
> >>                         Talend - http://www.talend.com <
> http://www.talend.com/>
> >>
> >>
> >>
> >>
> >
> >
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Reply via email to