well we can disagree on the code - it is fine ;), but the needed part of it
by beam is not huge and in any case it can be forked without requiring 10
classes - if so we'll use another impl than the guava one ;). This is the
whole point.


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-02 17:24 GMT+01:00 Reuven Lax <re...@google.com>:

> TypeToken is not trivial. I've written code to do what TypeToken does
> before (figuring out generic ancestor types). It's actually somewhat
> tricky, and the code I wrote had subtle bugs in it; eventually we removed
> this code in favor of Guava's implementation :)
>
> On Fri, Feb 2, 2018 at 7:47 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> Yep, note I never said to reinvent the wheel, we can copy it from guava,
>> openwebbeans or any other impl. Point was more to avoid to depend on
>> something we don't own for that which is after all not that much code. I
>> also think we can limit it a lot to align it on what is supported by beam
>> (I'm thinking to coders here) but this can be another topic.
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> | Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> 2018-02-02 16:33 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>
>>> On Fri, Feb 2, 2018 at 7:18 AM, Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>> Don't forget beam doesn't support much behind it (mainly only a few
>>>> ParameterizedType due the usage code path) so it is mainly only about
>>>> handling parameterized types and typevariables recursively. Not a lot of
>>>> work. Main concern being it is in the API so using a shade as an API is
>>>> never a good idea, in particular cause the shade can be broken and requires
>>>> to setup clirr or things like that and when it breaks you are done and need
>>>> to fork it anyway. Limiting the dependencies for an API - as beam is - is
>>>> always saner even if it requires a small fork of code.
>>>>
>>>
>>> The main thing that TypeToken is used for is capturing generics that are
>>> lost by Java reflection. It is a bit tricky, actually.
>>>
>>> Kenn
>>>
>>>
>>>
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> | Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>> 2018-02-02 15:49 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>>>
>>>>> On Fri, Feb 2, 2018 at 6:41 AM, Romain Manni-Bucau <
>>>>> rmannibu...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 2018-02-02 15:37 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>>>>>
>>>>>>> Another couple:
>>>>>>>
>>>>>>>  - User-facing TypeDescriptor is a very thin wrapper on Guava's
>>>>>>> TypeToken
>>>>>>>
>>>>>>
>>>>>> Technically reflect Type is enough
>>>>>>
>>>>>
>>>>> If you want to try to remove TypeToken from underneath TypeDescriptor,
>>>>> I have no objections as long as you expand the test suite to cover all the
>>>>> functionality where we trust TypeToken's tests. Good luck :-)
>>>>>
>>>>> Kenn
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>  - ImmutableList and friends and their builders are very widely used
>>>>>>> and IMO still add a lot for readability and preventing someone coming 
>>>>>>> along
>>>>>>> and adding mistakes to a codebase
>>>>>>>
>>>>>>
>>>>>> Sugar but not required. When you compare the cost of a shade versus
>>>>>> of duplicating the parts we need there is no real match IMHO.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> So considering it all, I would keep a vendored Guava (but also move
>>>>>>> off where we can, and also have our own improvements). I hope it will 
>>>>>>> be a
>>>>>>> near-empty build file to generate it so not a maintenance burden.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Thu, Feb 1, 2018 at 8:44 PM, Kenneth Knowles <k...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Nice. This sounds like a great idea (or two?) and goes along with
>>>>>>>> what I just started for futures.
>>>>>>>>
>>>>>>>> Guava: filed https://issues.apache.org/jira/browse/BEAM-3606 and
>>>>>>>> assigned to Ismaël :-) and converted my futures thing to a subtask.
>>>>>>>>
>>>>>>>> Specific things for our micro guava:
>>>>>>>>
>>>>>>>>  - checkArgumentNotNull can throw the right exception
>>>>>>>>  - our own Optional because Java's is not Serializable
>>>>>>>>  - futures combinators since many are missing, especially Java's
>>>>>>>> don't do exceptions right
>>>>>>>>
>>>>>>>> Protobuf: didn't file an issue because I'm not sure
>>>>>>>>
>>>>>>>> I was wondering if pre-shading works. We really need to get rid of
>>>>>>>> it from public APIs in a 100% reliable way. It is also a problem for
>>>>>>>> Dataflow. I was wondering if one approach is to pre-shade
>>>>>>>> gcpio-protobuf-java, gcpio-protobuf-java-util, gcpio-grpc-java, etc.
>>>>>>>> Anything that needs to take a Message object. (and do the same for
>>>>>>>> beam-model-protobuf-java since the model bits have to depend on each 
>>>>>>>> other
>>>>>>>> but nothing else).
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Thu, Feb 1, 2018 at 1:56 AM, Ismaël Mejía <ieme...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Huge +1 to get rid of Guava!
>>>>>>>>>
>>>>>>>>> This solves annoying dependency issues for some IOs and allow us to
>>>>>>>>> get rid of the shading that makes current jars bigger than they
>>>>>>>>> should
>>>>>>>>> be.
>>>>>>>>>
>>>>>>>>> We can create our own 'micro guava' package with some classes for
>>>>>>>>> things that are hard to migrate, or that we  prefer to still have
>>>>>>>>> like
>>>>>>>>> the check* methods for example. Given the size of the task we
>>>>>>>>> should
>>>>>>>>> probably divide it into subtasks, more important is to get rid of
>>>>>>>>> it
>>>>>>>>> for 'sdks/java/core'. We can then attack other areas afterwards.
>>>>>>>>>
>>>>>>>>> Other important idea would be to get rid of Protobuf in public APIs
>>>>>>>>> like GCPIO and to better shade it from leaking into the runners. An
>>>>>>>>> unexpected side effect of this is a leak of netty via gRPC/protobuf
>>>>>>>>> that is byting us for the Spark runner, but well that's worth a
>>>>>>>>> different discussion.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 1, 2018 at 10:08 AM, Romain Manni-Bucau
>>>>>>>>> <rmannibu...@gmail.com> wrote:
>>>>>>>>> > a map of list is fine and not a challenge we'll face long I hope
>>>>>>>>> ;)
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Romain Manni-Bucau
>>>>>>>>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn
>>>>>>>>> >
>>>>>>>>> > 2018-02-01 9:40 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>>> >>
>>>>>>>>> >> Not sure we'll be able to replace them all. Things like guava
>>>>>>>>> Table and
>>>>>>>>> >> Multimap don't have great replacements in Java8.
>>>>>>>>> >>
>>>>>>>>> >> On Wed, Jan 31, 2018 at 10:11 PM, Jean-Baptiste Onofré <
>>>>>>>>> j...@nanthrax.net>
>>>>>>>>> >> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>> +1, it was on my TODO for a while waiting the Java8 update.
>>>>>>>>> >>>
>>>>>>>>> >>> Regards
>>>>>>>>> >>> JB
>>>>>>>>> >>>
>>>>>>>>> >>> On 02/01/2018 06:56 AM, Romain Manni-Bucau wrote:
>>>>>>>>> >>> > Why not dropping guava for all beam codebase? With java 8 it
>>>>>>>>> is quite
>>>>>>>>> >>> > easy to do
>>>>>>>>> >>> > it and avoid a bunch of conflicts. Did it in 2 projects with
>>>>>>>>> quite a
>>>>>>>>> >>> > good result.
>>>>>>>>> >>> >
>>>>>>>>> >>> > Le 1 févr. 2018 06:50, "Lukasz Cwik" <lc...@google.com
>>>>>>>>> >>> > <mailto:lc...@google.com>> a écrit :
>>>>>>>>> >>> >
>>>>>>>>> >>> >     Make sure to include the guava version in the artifact
>>>>>>>>> name so that
>>>>>>>>> >>> > we can
>>>>>>>>> >>> >     have multiple vendored versions.
>>>>>>>>> >>> >
>>>>>>>>> >>> >     On Wed, Jan 31, 2018 at 9:16 PM, Kenneth Knowles <
>>>>>>>>> k...@google.com
>>>>>>>>> >>> >     <mailto:k...@google.com>> wrote:
>>>>>>>>> >>> >
>>>>>>>>> >>> >         I didn't have time for this, but it just bit me. We
>>>>>>>>> definitely
>>>>>>>>> >>> > have
>>>>>>>>> >>> >         Guava on the API surface of runner support code in
>>>>>>>>> ways that
>>>>>>>>> >>> > get
>>>>>>>>> >>> >         incompatibly shaded. I will probably start "1a" by
>>>>>>>>> making a
>>>>>>>>> >>> > shaded
>>>>>>>>> >>> >         library org.apache.beam:vendored-guava and starting
>>>>>>>>> to use it.
>>>>>>>>> >>> > It sounds
>>>>>>>>> >>> >         like there is generally unanimous support for that
>>>>>>>>> much,
>>>>>>>>> >>> > anyhow.
>>>>>>>>> >>> >
>>>>>>>>> >>> >         Kenn
>>>>>>>>> >>> >
>>>>>>>>> >>> >         On Wed, Dec 13, 2017 at 8:31 AM, Aljoscha Krettek
>>>>>>>>> >>> > <aljos...@apache.org
>>>>>>>>> >>> >         <mailto:aljos...@apache.org>> wrote:
>>>>>>>>> >>> >
>>>>>>>>> >>> >             Thanks Ismaël for bringing up this discussion
>>>>>>>>> again!
>>>>>>>>> >>> >
>>>>>>>>> >>> >             I would be in favour of 1) and more specifically
>>>>>>>>> of 1a)
>>>>>>>>> >>> >
>>>>>>>>> >>> >             Aljoscha
>>>>>>>>> >>> >
>>>>>>>>> >>> >
>>>>>>>>> >>> >>             On 12. Dec 2017, at 18:56, Lukasz Cwik <
>>>>>>>>> lc...@google.com
>>>>>>>>> >>> >>             <mailto:lc...@google.com>> wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>             You can always run tests on post shaded
>>>>>>>>> artifacts instead
>>>>>>>>> >>> >> of the
>>>>>>>>> >>> >>             compiled classes, it just requires us to change
>>>>>>>>> our maven
>>>>>>>>> >>> >> surefire
>>>>>>>>> >>> >>             / gradle test configurations but it is true
>>>>>>>>> that most IDEs
>>>>>>>>> >>> >> would
>>>>>>>>> >>> >>             behave better with a dependency jar unless you
>>>>>>>>> delegate
>>>>>>>>> >>> >> all the
>>>>>>>>> >>> >>             build/test actions to the build system and then
>>>>>>>>> it won't
>>>>>>>>> >>> >> matter.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>             On Mon, Dec 11, 2017 at 9:05 PM, Kenneth Knowles
>>>>>>>>> >>> >> <k...@google.com
>>>>>>>>> >>> >>             <mailto:k...@google.com>> wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 There's also, with additional overhead,
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 1a) A relocated and shipped package for
>>>>>>>>> each thing we
>>>>>>>>> >>> >> want to
>>>>>>>>> >>> >>                 relocate. I think this has also been tried
>>>>>>>>> outside
>>>>>>>>> >>> >> Beam...
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 Pros:
>>>>>>>>> >>> >>                 * all the pros of 1) plus no bloat beyond
>>>>>>>>> what is
>>>>>>>>> >>> >> necessary
>>>>>>>>> >>> >>                 Cons:
>>>>>>>>> >>> >>                 * abandons whitelist approach for public
>>>>>>>>> deps,
>>>>>>>>> >>> >> reverting to
>>>>>>>>> >>> >>                 blacklist approach for trouble things like
>>>>>>>>> guava, so a
>>>>>>>>> >>> >> bit
>>>>>>>>> >>> >>                 less principled
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 For both 1) and 1a) I would add:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 Pros:
>>>>>>>>> >>> >>                 * clearly readable dependency since code
>>>>>>>>> will `import
>>>>>>>>> >>> >>                 org.apache.beam.private.guava21` and IDEs
>>>>>>>>> will
>>>>>>>>> >>> >> understand it
>>>>>>>>> >>> >>                 is a distinct lilbrary
>>>>>>>>> >>> >>                 * can run tests on unpackaged classes, as
>>>>>>>>> long as deps
>>>>>>>>> >>> >> are
>>>>>>>>> >>> >>                 shaded or provided as jars
>>>>>>>>> >>> >>                 * no mysterious action at a distance from
>>>>>>>>> inherited
>>>>>>>>> >>> >> configuration
>>>>>>>>> >>> >>                 Cons:
>>>>>>>>> >>> >>                 * need to adjust imports
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 Kenn
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                 On Mon, Dec 11, 2017 at 9:57 AM, Lukasz Cwik
>>>>>>>>> >>> >> <lc...@google.com
>>>>>>>>> >>> >>                 <mailto:lc...@google.com>> wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                     I would suggest that either we use:
>>>>>>>>> >>> >>                     1) A common deps package containing
>>>>>>>>> shaded
>>>>>>>>> >>> >> dependencies
>>>>>>>>> >>> >>                     allows for
>>>>>>>>> >>> >>                     Pros
>>>>>>>>> >>> >>                     * doesn't require the user to build an
>>>>>>>>> uber jar
>>>>>>>>> >>> >>                     Risks
>>>>>>>>> >>> >>                     * dependencies package will keep
>>>>>>>>> growing even if
>>>>>>>>> >>> >> something
>>>>>>>>> >>> >>                     is or isn't needed by all of Apache
>>>>>>>>> Beam leading
>>>>>>>>> >>> >> to a
>>>>>>>>> >>> >>                     large jar anyways negating any space
>>>>>>>>> savings
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                     2) Shade within each module to a common
>>>>>>>>> location
>>>>>>>>> >>> >> like
>>>>>>>>> >>> >>                     org.apache.beam.relocated.guava....
>>>>>>>>> >>> >>                     Pros
>>>>>>>>> >>> >>                     * you only get the shaded dependencies
>>>>>>>>> of the
>>>>>>>>> >>> >> things that
>>>>>>>>> >>> >>                     are required
>>>>>>>>> >>> >>                     * its one less dependency for users to
>>>>>>>>> manage
>>>>>>>>> >>> >>                     Risks
>>>>>>>>> >>> >>                     * requires an uber jar to be built to
>>>>>>>>> get the
>>>>>>>>> >>> >> space
>>>>>>>>> >>> >>                     savings (either by a user or a
>>>>>>>>> distribution of
>>>>>>>>> >>> >> Apache
>>>>>>>>> >>> >>                     Beam) otherwise we negate any space
>>>>>>>>> savings.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                     If we either use a common relocation
>>>>>>>>> scheme or a
>>>>>>>>> >>> >>                     dependencies jar then each relocation
>>>>>>>>> should
>>>>>>>>> >>> >> specifically
>>>>>>>>> >>> >>                     contain the version number of the
>>>>>>>>> package because
>>>>>>>>> >>> >> we would
>>>>>>>>> >>> >>                     like to allow for us to be using two
>>>>>>>>> different
>>>>>>>>> >>> >> versions of
>>>>>>>>> >>> >>                     the same library.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                     For the common deps package approach,
>>>>>>>>> should we
>>>>>>>>> >>> >> check in
>>>>>>>>> >>> >>                     code where the imports contain the
>>>>>>>>> relocated
>>>>>>>>> >>> >> location
>>>>>>>>> >>> >>                     (e.g. import
>>>>>>>>> org.apache.beam.guava.20.0.com
>>>>>>>>> >>> >>
>>>>>>>>> >>> >> <http://org.apache.beam.guava.20.0.com/
>>>>>>>>> >.google.common.collect.ImmutableList)?
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                     On Mon, Dec 11, 2017 at 8:47 AM,
>>>>>>>>> Jean-Baptiste
>>>>>>>>> >>> >> Onofré
>>>>>>>>> >>> >>                     <j...@nanthrax.net <mailto:
>>>>>>>>> j...@nanthrax.net>> wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                         Thanks for bringing that back.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                         Indeed guava is shaded in different
>>>>>>>>> uber-jar.
>>>>>>>>> >>> >> Maybe we
>>>>>>>>> >>> >>                         can have a common deps module that
>>>>>>>>> we include
>>>>>>>>> >>> >> once
>>>>>>>>> >>> >>                         (but the user will have to
>>>>>>>>> explicitly define
>>>>>>>>> >>> >> the dep) ?
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                         Basically, what do you propose for
>>>>>>>>> protobuf
>>>>>>>>> >>> >>                         (unfortunately, I don't see an
>>>>>>>>> obvious) ?
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                         Regards
>>>>>>>>> >>> >>                         JB
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                         On 12/11/2017 05:35 PM, Ismaël
>>>>>>>>> Mejía wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             Hello, I wanted to bring back
>>>>>>>>> this subject
>>>>>>>>> >>> >> because
>>>>>>>>> >>> >>                             I think we should
>>>>>>>>> >>> >>                             take action on this and at
>>>>>>>>> least first
>>>>>>>>> >>> >> have a
>>>>>>>>> >>> >>                             shaded version of guava.
>>>>>>>>> >>> >>                             I was playing with a toy
>>>>>>>>> project and I did
>>>>>>>>> >>> >> the
>>>>>>>>> >>> >>                             procedure we use to
>>>>>>>>> >>> >>                             submit jars to a Hadoop cluster
>>>>>>>>> via
>>>>>>>>> >>> >> Flink/Spark
>>>>>>>>> >>> >>                             which involves
>>>>>>>>> >>> >>                             creating an uber jar and I
>>>>>>>>> realized that
>>>>>>>>> >>> >> the size
>>>>>>>>> >>> >>                             of the jar was way
>>>>>>>>> >>> >>                             bigger than I expected, and the
>>>>>>>>> fact that
>>>>>>>>> >>> >> we shade
>>>>>>>>> >>> >>                             guava in every
>>>>>>>>> >>> >>                             module contributes to this. I
>>>>>>>>> found guava
>>>>>>>>> >>> >> shaded on:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             sdks/java/core
>>>>>>>>> >>> >>                             runners/core-construction-java
>>>>>>>>> >>> >>                             runners/core-java
>>>>>>>>> >>> >>                             model/job-management
>>>>>>>>> >>> >>                             runners/spark
>>>>>>>>> >>> >>                             sdks/java/io/hadoop-file-system
>>>>>>>>> >>> >>                             sdks/java/io/kafka
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             This means at least 6 times
>>>>>>>>> more of the
>>>>>>>>> >>> >> size it
>>>>>>>>> >>> >>                             should which counts in
>>>>>>>>> >>> >>                             around 15MB more (2.4MB*6 deps)
>>>>>>>>> of extra
>>>>>>>>> >>> >> weight
>>>>>>>>> >>> >>                             that we can simply
>>>>>>>>> >>> >>                             reduce by using a shaded
>>>>>>>>> version of the
>>>>>>>>> >>> >> library.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             I add this point to the
>>>>>>>>> previous ones
>>>>>>>>> >>> >> mentioned
>>>>>>>>> >>> >>                             during the discussion
>>>>>>>>> >>> >>                             because this goes against the
>>>>>>>>> end user
>>>>>>>>> >>> >> experience
>>>>>>>>> >>> >>                             and affects us all
>>>>>>>>> >>> >>                             (devs/users).
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             Another question is if we
>>>>>>>>> should shade
>>>>>>>>> >>> >> (and how)
>>>>>>>>> >>> >>                             protocol buffers
>>>>>>>>> >>> >>                             because now with the
>>>>>>>>> portability work we
>>>>>>>>> >>> >> are
>>>>>>>>> >>> >>                             exposing it closer to the
>>>>>>>>> >>> >>                             end users. I say this because I
>>>>>>>>> also found
>>>>>>>>> >>> >> an
>>>>>>>>> >>> >>                             issue while running a
>>>>>>>>> >>> >>                             job on YARN with the spark
>>>>>>>>> runner because
>>>>>>>>> >>> >>                             hadoop-common includes
>>>>>>>>> >>> >>                             protobuf-java 2 and I had to
>>>>>>>>> explicitly
>>>>>>>>> >>> >> provide
>>>>>>>>> >>> >>                             protocol-buffers 3 as
>>>>>>>>> >>> >>                             a dependency to be able to use
>>>>>>>>> triggers
>>>>>>>>> >>> >> (note the
>>>>>>>>> >>> >>                             Spark runner
>>>>>>>>> >>> >>                             translates them using some
>>>>>>>>> method from
>>>>>>>>> >>> >>                             runners/core-java). Since
>>>>>>>>> >>> >>                             hadoop-common is provided in
>>>>>>>>> the cluster
>>>>>>>>> >>> >> with the
>>>>>>>>> >>> >>                             older version of
>>>>>>>>> >>> >>                             protobuf, I am afraid that this
>>>>>>>>> will bite
>>>>>>>>> >>> >> us in
>>>>>>>>> >>> >>                             the future.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             Ismaël
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             ps. There is already a JIRA for
>>>>>>>>> that
>>>>>>>>> >>> >> shading for
>>>>>>>>> >>> >>                             protobuf on
>>>>>>>>> >>> >>                             hadoop-common but this is not
>>>>>>>>> coming until
>>>>>>>>> >>> >> version
>>>>>>>>> >>> >>                             3 is out.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >> https://issues.apache.org/jira/browse/HADOOP-13136
>>>>>>>>> >>> >>
>>>>>>>>> >>> >> <https://issues.apache.org/jira/browse/HADOOP-13136>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             ps2. Extra curious situation is
>>>>>>>>> to see
>>>>>>>>> >>> >> that the
>>>>>>>>> >>> >>                             dataflow-runner ends
>>>>>>>>> >>> >>                             up having guava shaded twice
>>>>>>>>> via its
>>>>>>>>> >>> >> shaded version on
>>>>>>>>> >>> >>                             core-construction-java.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             ps3. Of course this message
>>>>>>>>> means a
>>>>>>>>> >>> >> de-facto +1 at
>>>>>>>>> >>> >>                             least to do it for
>>>>>>>>> >>> >>                             guava and evaluate it for other
>>>>>>>>> libs.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                             On Tue, Oct 17, 2017 at 7:29
>>>>>>>>> PM, Lukasz
>>>>>>>>> >>> >> Cwik
>>>>>>>>> >>> >>                             <lc...@google.com.invalid
>>>>>>>>> >>> >>                             <mailto:lc...@google.com.inva
>>>>>>>>> lid>> wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                 An issue to call out is how
>>>>>>>>> to deal
>>>>>>>>> >>> >> with our
>>>>>>>>> >>> >>                                 generated code (.avro and
>>>>>>>>> >>> >>                                 .proto) as I don't believe
>>>>>>>>> those
>>>>>>>>> >>> >> plugins allow
>>>>>>>>> >>> >>                                 you to generate code using a
>>>>>>>>> >>> >>                                 shaded package prefix on
>>>>>>>>> imports.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                 On Tue, Oct 17, 2017 at
>>>>>>>>> 10:28 AM,
>>>>>>>>> >>> >> Thomas Groh
>>>>>>>>> >>> >>                                 <tg...@google.com.invalid
>>>>>>>>> >>> >>                                 <mailto:
>>>>>>>>> tg...@google.com.invalid>>
>>>>>>>>> >>> >>                                 wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     +1 to the goal. I'm
>>>>>>>>> hugely in
>>>>>>>>> >>> >> favor of not
>>>>>>>>> >>> >>                                     doing the same shading
>>>>>>>>> work
>>>>>>>>> >>> >>                                     every time for
>>>>>>>>> dependencies we
>>>>>>>>> >>> >> know we'll use.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     This also means that if
>>>>>>>>> we end up
>>>>>>>>> >>> >> pulling
>>>>>>>>> >>> >>                                     in transitive
>>>>>>>>> dependencies we
>>>>>>>>> >>> >>                                     don't want in any
>>>>>>>>> particular
>>>>>>>>> >>> >> module we can
>>>>>>>>> >>> >>                                     avoid having to adjust
>>>>>>>>> our
>>>>>>>>> >>> >>                                     repackaging strategy
>>>>>>>>> for that
>>>>>>>>> >>> >> module -
>>>>>>>>> >>> >>                                     which I have run into
>>>>>>>>> face-first
>>>>>>>>> >>> >> in
>>>>>>>>> >>> >>                                     the past.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     On Tue, Oct 17, 2017 at
>>>>>>>>> 9:48 AM,
>>>>>>>>> >>> >> Kenneth
>>>>>>>>> >>> >>                                     Knowles
>>>>>>>>> <k...@google.com.invalid
>>>>>>>>> >>> >>                                     <mailto:
>>>>>>>>> k...@google.com.invalid>>
>>>>>>>>> >>> >>                                     wrote:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Hi all,
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Shading is a big
>>>>>>>>> part of how
>>>>>>>>> >>> >> we keep
>>>>>>>>> >>> >>                                         our dependencies
>>>>>>>>> sane in Beam.
>>>>>>>>> >>> >> But
>>>>>>>>> >>> >>                                         downsides: shading
>>>>>>>>> is super
>>>>>>>>> >>> >> slow,
>>>>>>>>> >>> >>                                         causes massive jar
>>>>>>>>> bloat, and
>>>>>>>>> >>> >> kind of
>>>>>>>>> >>> >>                                         hard to get right
>>>>>>>>> because
>>>>>>>>> >>> >> artifacts
>>>>>>>>> >>> >>                                         and namespaces are
>>>>>>>>> not 1-to-1.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         I know that some
>>>>>>>>> communities
>>>>>>>>> >>> >>                                         distribute their
>>>>>>>>> own shaded
>>>>>>>>> >>> >>                                         distributions of
>>>>>>>>> >>> >>                                         dependencies. I had
>>>>>>>>> a thought
>>>>>>>>> >>> >> about
>>>>>>>>> >>> >>                                         doing something
>>>>>>>>> similar that I
>>>>>>>>> >>> >> wanted
>>>>>>>>> >>> >>                                         to throw out there
>>>>>>>>> for people
>>>>>>>>> >>> >> to poke
>>>>>>>>> >>> >>                                         holes in.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         To set the scene,
>>>>>>>>> here is how
>>>>>>>>> >>> >> I view
>>>>>>>>> >>> >>                                         shading:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                           - A module has
>>>>>>>>> public
>>>>>>>>> >>> >> dependencies
>>>>>>>>> >>> >>                                         and private
>>>>>>>>> dependencies.
>>>>>>>>> >>> >>                                           - Public deps are
>>>>>>>>> used for
>>>>>>>>> >>> >> data
>>>>>>>>> >>> >>                                         interchange; users
>>>>>>>>> must share
>>>>>>>>> >>> >> these
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     deps.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                           - Private deps
>>>>>>>>> are just
>>>>>>>>> >>> >>                                         functionality and
>>>>>>>>> can be
>>>>>>>>> >>> >> hidden (in
>>>>>>>>> >>> >>                                         our case,
>>>>>>>>> >>> >>                                         relocated + bundled)
>>>>>>>>> >>> >>                                           - It isn't
>>>>>>>>> necessarily that
>>>>>>>>> >>> >> simple,
>>>>>>>>> >>> >>                                         because public and
>>>>>>>>> private
>>>>>>>>> >>> >> deps
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     might
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         interact in
>>>>>>>>> higher-order ways
>>>>>>>>> >>> >>                                         ("public" is
>>>>>>>>> contagious)
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Shading is an
>>>>>>>>> implementation
>>>>>>>>> >>> >> detail of
>>>>>>>>> >>> >>                                         expressing these
>>>>>>>>> >>> >> characteristics.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     We
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         use shading
>>>>>>>>> selectively
>>>>>>>>> >>> >> because of its
>>>>>>>>> >>> >>                                         downsides I
>>>>>>>>> mentioned above.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         But what about this
>>>>>>>>> idea:
>>>>>>>>> >>> >> Introduce
>>>>>>>>> >>> >>                                         shaded deps as a
>>>>>>>>> single
>>>>>>>>> >>> >> separate
>>>>>>>>> >>> >>                                         artifact.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                           -
>>>>>>>>> sdks/java/private-deps:
>>>>>>>>> >>> >> bundled
>>>>>>>>> >>> >>                                         uber jar with
>>>>>>>>> relocated
>>>>>>>>> >>> >> versions of
>>>>>>>>> >>> >>                                         everything we want
>>>>>>>>> to shade
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                           - sdks/java/core
>>>>>>>>> and
>>>>>>>>> >>> >>                                         sdks/java/harness:
>>>>>>>>> no
>>>>>>>>> >>> >> relocation or
>>>>>>>>> >>> >>                                         bundling -
>>>>>>>>> >>> >>                                         depends on
>>>>>>>>> >>> >>
>>>>>>>>>  `beam-sdks-java-private-deps`
>>>>>>>>> >>> >> and
>>>>>>>>> >>> >>                                         imports like
>>>>>>>>> >>> >>
>>>>>>>>> >>> >> `org.apache.beam.sdk.private.com.google.common`
>>>>>>>>> >>> >>                                         directly (this is
>>>>>>>>> what
>>>>>>>>> >>> >>                                         they
>>>>>>>>> >>> >>                                         are rewritten to
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Some benefits
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                           - much faster
>>>>>>>>> builds of
>>>>>>>>> >>> >> other modules
>>>>>>>>> >>> >>                                           - only one shaded
>>>>>>>>> uber jar
>>>>>>>>> >>> >>                                           - rare/no
>>>>>>>>> rebuilds of the
>>>>>>>>> >>> >> uber jar
>>>>>>>>> >>> >>                                           - can use maven
>>>>>>>>> enforcer to
>>>>>>>>> >>> >> forbid
>>>>>>>>> >>> >>                                         imports like
>>>>>>>>> com.google.common
>>>>>>>>> >>> >>                                           - configuration
>>>>>>>>> all in one
>>>>>>>>> >>> >> place
>>>>>>>>> >>> >>                                           - no automated
>>>>>>>>> rewriting of
>>>>>>>>> >>> >> our real
>>>>>>>>> >>> >>                                         code, which has led
>>>>>>>>> to some
>>>>>>>>> >>> >> major
>>>>>>>>> >>> >>                                         confusion
>>>>>>>>> >>> >>                                           - easy to
>>>>>>>>> implement
>>>>>>>>> >>> >> incrementally
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Downsides:
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                           - plenty of
>>>>>>>>> effort work to
>>>>>>>>> >>> >> get there
>>>>>>>>> >>> >>                                           - unclear how
>>>>>>>>> many different
>>>>>>>>> >>> >> such
>>>>>>>>> >>> >>                                         deps modules we
>>>>>>>>> need; sharing
>>>>>>>>> >>> >> them
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                     could
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         get weird
>>>>>>>>> >>> >>                                           - if we hit a
>>>>>>>>> roadblock, we
>>>>>>>>> >>> >> will
>>>>>>>>> >>> >>                                         have committed a
>>>>>>>>> lot of time
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Just something I
>>>>>>>>> was musing as
>>>>>>>>> >>> >> I spent
>>>>>>>>> >>> >>                                         another evening
>>>>>>>>> waiting for
>>>>>>>>> >>> >> slow
>>>>>>>>> >>> >>                                         builds to try to
>>>>>>>>> confirm
>>>>>>>>> >>> >> changes to
>>>>>>>>> >>> >>                                         brittle poms.
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                                         Kenn
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>                         --
>>>>>>>>> >>> >>                         Jean-Baptiste Onofré
>>>>>>>>> >>> >>                         jbono...@apache.org
>>>>>>>>> >>> >> <mailto:jbono...@apache.org>
>>>>>>>>> >>> >>                         http://blog.nanthrax.net
>>>>>>>>> >>> >> <http://blog.nanthrax.net/>
>>>>>>>>> >>> >>                         Talend - http://www.talend.com
>>>>>>>>> >>> >> <http://www.talend.com/>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >>
>>>>>>>>> >>> >
>>>>>>>>> >>> >
>>>>>>>>> >>> >
>>>>>>>>> >>>
>>>>>>>>> >>> --
>>>>>>>>> >>> Jean-Baptiste Onofré
>>>>>>>>> >>> jbono...@apache.org
>>>>>>>>> >>> http://blog.nanthrax.net
>>>>>>>>> >>> Talend - http://www.talend.com
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to