Yep, note I never said to reinvent the wheel, we can copy it from guava,
openwebbeans or any other impl. Point was more to avoid to depend on
something we don't own for that which is after all not that much code. I
also think we can limit it a lot to align it on what is supported by beam
(I'm thinking to coders here) but this can be another topic.


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-02 16:33 GMT+01:00 Kenneth Knowles <k...@google.com>:

> On Fri, Feb 2, 2018 at 7:18 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> Don't forget beam doesn't support much behind it (mainly only a few
>> ParameterizedType due the usage code path) so it is mainly only about
>> handling parameterized types and typevariables recursively. Not a lot of
>> work. Main concern being it is in the API so using a shade as an API is
>> never a good idea, in particular cause the shade can be broken and requires
>> to setup clirr or things like that and when it breaks you are done and need
>> to fork it anyway. Limiting the dependencies for an API - as beam is - is
>> always saner even if it requires a small fork of code.
>>
>
> The main thing that TypeToken is used for is capturing generics that are
> lost by Java reflection. It is a bit tricky, actually.
>
> Kenn
>
>
>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> | Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> 2018-02-02 15:49 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>
>>> On Fri, Feb 2, 2018 at 6:41 AM, Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> 2018-02-02 15:37 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>>>
>>>>> Another couple:
>>>>>
>>>>>  - User-facing TypeDescriptor is a very thin wrapper on Guava's
>>>>> TypeToken
>>>>>
>>>>
>>>> Technically reflect Type is enough
>>>>
>>>
>>> If you want to try to remove TypeToken from underneath TypeDescriptor, I
>>> have no objections as long as you expand the test suite to cover all the
>>> functionality where we trust TypeToken's tests. Good luck :-)
>>>
>>> Kenn
>>>
>>>
>>>>
>>>>
>>>>>  - ImmutableList and friends and their builders are very widely used
>>>>> and IMO still add a lot for readability and preventing someone coming 
>>>>> along
>>>>> and adding mistakes to a codebase
>>>>>
>>>>
>>>> Sugar but not required. When you compare the cost of a shade versus of
>>>> duplicating the parts we need there is no real match IMHO.
>>>>
>>>>
>>>>>
>>>>> So considering it all, I would keep a vendored Guava (but also move
>>>>> off where we can, and also have our own improvements). I hope it will be a
>>>>> near-empty build file to generate it so not a maintenance burden.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Thu, Feb 1, 2018 at 8:44 PM, Kenneth Knowles <k...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Nice. This sounds like a great idea (or two?) and goes along with
>>>>>> what I just started for futures.
>>>>>>
>>>>>> Guava: filed https://issues.apache.org/jira/browse/BEAM-3606 and
>>>>>> assigned to Ismaël :-) and converted my futures thing to a subtask.
>>>>>>
>>>>>> Specific things for our micro guava:
>>>>>>
>>>>>>  - checkArgumentNotNull can throw the right exception
>>>>>>  - our own Optional because Java's is not Serializable
>>>>>>  - futures combinators since many are missing, especially Java's
>>>>>> don't do exceptions right
>>>>>>
>>>>>> Protobuf: didn't file an issue because I'm not sure
>>>>>>
>>>>>> I was wondering if pre-shading works. We really need to get rid of it
>>>>>> from public APIs in a 100% reliable way. It is also a problem for 
>>>>>> Dataflow.
>>>>>> I was wondering if one approach is to pre-shade gcpio-protobuf-java,
>>>>>> gcpio-protobuf-java-util, gcpio-grpc-java, etc. Anything that needs to 
>>>>>> take
>>>>>> a Message object. (and do the same for beam-model-protobuf-java since the
>>>>>> model bits have to depend on each other but nothing else).
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Thu, Feb 1, 2018 at 1:56 AM, Ismaël Mejía <ieme...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Huge +1 to get rid of Guava!
>>>>>>>
>>>>>>> This solves annoying dependency issues for some IOs and allow us to
>>>>>>> get rid of the shading that makes current jars bigger than they
>>>>>>> should
>>>>>>> be.
>>>>>>>
>>>>>>> We can create our own 'micro guava' package with some classes for
>>>>>>> things that are hard to migrate, or that we  prefer to still have
>>>>>>> like
>>>>>>> the check* methods for example. Given the size of the task we should
>>>>>>> probably divide it into subtasks, more important is to get rid of it
>>>>>>> for 'sdks/java/core'. We can then attack other areas afterwards.
>>>>>>>
>>>>>>> Other important idea would be to get rid of Protobuf in public APIs
>>>>>>> like GCPIO and to better shade it from leaking into the runners. An
>>>>>>> unexpected side effect of this is a leak of netty via gRPC/protobuf
>>>>>>> that is byting us for the Spark runner, but well that's worth a
>>>>>>> different discussion.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 1, 2018 at 10:08 AM, Romain Manni-Bucau
>>>>>>> <rmannibu...@gmail.com> wrote:
>>>>>>> > a map of list is fine and not a challenge we'll face long I hope ;)
>>>>>>> >
>>>>>>> >
>>>>>>> > Romain Manni-Bucau
>>>>>>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn
>>>>>>> >
>>>>>>> > 2018-02-01 9:40 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>> >>
>>>>>>> >> Not sure we'll be able to replace them all. Things like guava
>>>>>>> Table and
>>>>>>> >> Multimap don't have great replacements in Java8.
>>>>>>> >>
>>>>>>> >> On Wed, Jan 31, 2018 at 10:11 PM, Jean-Baptiste Onofré <
>>>>>>> j...@nanthrax.net>
>>>>>>> >> wrote:
>>>>>>> >>>
>>>>>>> >>> +1, it was on my TODO for a while waiting the Java8 update.
>>>>>>> >>>
>>>>>>> >>> Regards
>>>>>>> >>> JB
>>>>>>> >>>
>>>>>>> >>> On 02/01/2018 06:56 AM, Romain Manni-Bucau wrote:
>>>>>>> >>> > Why not dropping guava for all beam codebase? With java 8 it
>>>>>>> is quite
>>>>>>> >>> > easy to do
>>>>>>> >>> > it and avoid a bunch of conflicts. Did it in 2 projects with
>>>>>>> quite a
>>>>>>> >>> > good result.
>>>>>>> >>> >
>>>>>>> >>> > Le 1 févr. 2018 06:50, "Lukasz Cwik" <lc...@google.com
>>>>>>> >>> > <mailto:lc...@google.com>> a écrit :
>>>>>>> >>> >
>>>>>>> >>> >     Make sure to include the guava version in the artifact
>>>>>>> name so that
>>>>>>> >>> > we can
>>>>>>> >>> >     have multiple vendored versions.
>>>>>>> >>> >
>>>>>>> >>> >     On Wed, Jan 31, 2018 at 9:16 PM, Kenneth Knowles <
>>>>>>> k...@google.com
>>>>>>> >>> >     <mailto:k...@google.com>> wrote:
>>>>>>> >>> >
>>>>>>> >>> >         I didn't have time for this, but it just bit me. We
>>>>>>> definitely
>>>>>>> >>> > have
>>>>>>> >>> >         Guava on the API surface of runner support code in
>>>>>>> ways that
>>>>>>> >>> > get
>>>>>>> >>> >         incompatibly shaded. I will probably start "1a" by
>>>>>>> making a
>>>>>>> >>> > shaded
>>>>>>> >>> >         library org.apache.beam:vendored-guava and starting to
>>>>>>> use it.
>>>>>>> >>> > It sounds
>>>>>>> >>> >         like there is generally unanimous support for that
>>>>>>> much,
>>>>>>> >>> > anyhow.
>>>>>>> >>> >
>>>>>>> >>> >         Kenn
>>>>>>> >>> >
>>>>>>> >>> >         On Wed, Dec 13, 2017 at 8:31 AM, Aljoscha Krettek
>>>>>>> >>> > <aljos...@apache.org
>>>>>>> >>> >         <mailto:aljos...@apache.org>> wrote:
>>>>>>> >>> >
>>>>>>> >>> >             Thanks Ismaël for bringing up this discussion
>>>>>>> again!
>>>>>>> >>> >
>>>>>>> >>> >             I would be in favour of 1) and more specifically
>>>>>>> of 1a)
>>>>>>> >>> >
>>>>>>> >>> >             Aljoscha
>>>>>>> >>> >
>>>>>>> >>> >
>>>>>>> >>> >>             On 12. Dec 2017, at 18:56, Lukasz Cwik <
>>>>>>> lc...@google.com
>>>>>>> >>> >>             <mailto:lc...@google.com>> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>             You can always run tests on post shaded artifacts
>>>>>>> instead
>>>>>>> >>> >> of the
>>>>>>> >>> >>             compiled classes, it just requires us to change
>>>>>>> our maven
>>>>>>> >>> >> surefire
>>>>>>> >>> >>             / gradle test configurations but it is true that
>>>>>>> most IDEs
>>>>>>> >>> >> would
>>>>>>> >>> >>             behave better with a dependency jar unless you
>>>>>>> delegate
>>>>>>> >>> >> all the
>>>>>>> >>> >>             build/test actions to the build system and then
>>>>>>> it won't
>>>>>>> >>> >> matter.
>>>>>>> >>> >>
>>>>>>> >>> >>             On Mon, Dec 11, 2017 at 9:05 PM, Kenneth Knowles
>>>>>>> >>> >> <k...@google.com
>>>>>>> >>> >>             <mailto:k...@google.com>> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                 There's also, with additional overhead,
>>>>>>> >>> >>
>>>>>>> >>> >>                 1a) A relocated and shipped package for each
>>>>>>> thing we
>>>>>>> >>> >> want to
>>>>>>> >>> >>                 relocate. I think this has also been tried
>>>>>>> outside
>>>>>>> >>> >> Beam...
>>>>>>> >>> >>
>>>>>>> >>> >>                 Pros:
>>>>>>> >>> >>                 * all the pros of 1) plus no bloat beyond
>>>>>>> what is
>>>>>>> >>> >> necessary
>>>>>>> >>> >>                 Cons:
>>>>>>> >>> >>                 * abandons whitelist approach for public deps,
>>>>>>> >>> >> reverting to
>>>>>>> >>> >>                 blacklist approach for trouble things like
>>>>>>> guava, so a
>>>>>>> >>> >> bit
>>>>>>> >>> >>                 less principled
>>>>>>> >>> >>
>>>>>>> >>> >>                 For both 1) and 1a) I would add:
>>>>>>> >>> >>
>>>>>>> >>> >>                 Pros:
>>>>>>> >>> >>                 * clearly readable dependency since code will
>>>>>>> `import
>>>>>>> >>> >>                 org.apache.beam.private.guava21` and IDEs
>>>>>>> will
>>>>>>> >>> >> understand it
>>>>>>> >>> >>                 is a distinct lilbrary
>>>>>>> >>> >>                 * can run tests on unpackaged classes, as
>>>>>>> long as deps
>>>>>>> >>> >> are
>>>>>>> >>> >>                 shaded or provided as jars
>>>>>>> >>> >>                 * no mysterious action at a distance from
>>>>>>> inherited
>>>>>>> >>> >> configuration
>>>>>>> >>> >>                 Cons:
>>>>>>> >>> >>                 * need to adjust imports
>>>>>>> >>> >>
>>>>>>> >>> >>                 Kenn
>>>>>>> >>> >>
>>>>>>> >>> >>                 On Mon, Dec 11, 2017 at 9:57 AM, Lukasz Cwik
>>>>>>> >>> >> <lc...@google.com
>>>>>>> >>> >>                 <mailto:lc...@google.com>> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                     I would suggest that either we use:
>>>>>>> >>> >>                     1) A common deps package containing shaded
>>>>>>> >>> >> dependencies
>>>>>>> >>> >>                     allows for
>>>>>>> >>> >>                     Pros
>>>>>>> >>> >>                     * doesn't require the user to build an
>>>>>>> uber jar
>>>>>>> >>> >>                     Risks
>>>>>>> >>> >>                     * dependencies package will keep growing
>>>>>>> even if
>>>>>>> >>> >> something
>>>>>>> >>> >>                     is or isn't needed by all of Apache Beam
>>>>>>> leading
>>>>>>> >>> >> to a
>>>>>>> >>> >>                     large jar anyways negating any space
>>>>>>> savings
>>>>>>> >>> >>
>>>>>>> >>> >>                     2) Shade within each module to a common
>>>>>>> location
>>>>>>> >>> >> like
>>>>>>> >>> >>                     org.apache.beam.relocated.guava....
>>>>>>> >>> >>                     Pros
>>>>>>> >>> >>                     * you only get the shaded dependencies of
>>>>>>> the
>>>>>>> >>> >> things that
>>>>>>> >>> >>                     are required
>>>>>>> >>> >>                     * its one less dependency for users to
>>>>>>> manage
>>>>>>> >>> >>                     Risks
>>>>>>> >>> >>                     * requires an uber jar to be built to get
>>>>>>> the
>>>>>>> >>> >> space
>>>>>>> >>> >>                     savings (either by a user or a
>>>>>>> distribution of
>>>>>>> >>> >> Apache
>>>>>>> >>> >>                     Beam) otherwise we negate any space
>>>>>>> savings.
>>>>>>> >>> >>
>>>>>>> >>> >>                     If we either use a common relocation
>>>>>>> scheme or a
>>>>>>> >>> >>                     dependencies jar then each relocation
>>>>>>> should
>>>>>>> >>> >> specifically
>>>>>>> >>> >>                     contain the version number of the package
>>>>>>> because
>>>>>>> >>> >> we would
>>>>>>> >>> >>                     like to allow for us to be using two
>>>>>>> different
>>>>>>> >>> >> versions of
>>>>>>> >>> >>                     the same library.
>>>>>>> >>> >>
>>>>>>> >>> >>                     For the common deps package approach,
>>>>>>> should we
>>>>>>> >>> >> check in
>>>>>>> >>> >>                     code where the imports contain the
>>>>>>> relocated
>>>>>>> >>> >> location
>>>>>>> >>> >>                     (e.g. import
>>>>>>> org.apache.beam.guava.20.0.com
>>>>>>> >>> >>
>>>>>>> >>> >> <http://org.apache.beam.guava.20.0.com/>.google.common.colle
>>>>>>> ct.ImmutableList)?
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>                     On Mon, Dec 11, 2017 at 8:47 AM,
>>>>>>> Jean-Baptiste
>>>>>>> >>> >> Onofré
>>>>>>> >>> >>                     <j...@nanthrax.net 
>>>>>>> >>> >> <mailto:j...@nanthrax.net>>
>>>>>>> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                         Thanks for bringing that back.
>>>>>>> >>> >>
>>>>>>> >>> >>                         Indeed guava is shaded in different
>>>>>>> uber-jar.
>>>>>>> >>> >> Maybe we
>>>>>>> >>> >>                         can have a common deps module that we
>>>>>>> include
>>>>>>> >>> >> once
>>>>>>> >>> >>                         (but the user will have to explicitly
>>>>>>> define
>>>>>>> >>> >> the dep) ?
>>>>>>> >>> >>
>>>>>>> >>> >>                         Basically, what do you propose for
>>>>>>> protobuf
>>>>>>> >>> >>                         (unfortunately, I don't see an
>>>>>>> obvious) ?
>>>>>>> >>> >>
>>>>>>> >>> >>                         Regards
>>>>>>> >>> >>                         JB
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>                         On 12/11/2017 05:35 PM, Ismaël Mejía
>>>>>>> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                             Hello, I wanted to bring back
>>>>>>> this subject
>>>>>>> >>> >> because
>>>>>>> >>> >>                             I think we should
>>>>>>> >>> >>                             take action on this and at least
>>>>>>> first
>>>>>>> >>> >> have a
>>>>>>> >>> >>                             shaded version of guava.
>>>>>>> >>> >>                             I was playing with a toy project
>>>>>>> and I did
>>>>>>> >>> >> the
>>>>>>> >>> >>                             procedure we use to
>>>>>>> >>> >>                             submit jars to a Hadoop cluster
>>>>>>> via
>>>>>>> >>> >> Flink/Spark
>>>>>>> >>> >>                             which involves
>>>>>>> >>> >>                             creating an uber jar and I
>>>>>>> realized that
>>>>>>> >>> >> the size
>>>>>>> >>> >>                             of the jar was way
>>>>>>> >>> >>                             bigger than I expected, and the
>>>>>>> fact that
>>>>>>> >>> >> we shade
>>>>>>> >>> >>                             guava in every
>>>>>>> >>> >>                             module contributes to this. I
>>>>>>> found guava
>>>>>>> >>> >> shaded on:
>>>>>>> >>> >>
>>>>>>> >>> >>                             sdks/java/core
>>>>>>> >>> >>                             runners/core-construction-java
>>>>>>> >>> >>                             runners/core-java
>>>>>>> >>> >>                             model/job-management
>>>>>>> >>> >>                             runners/spark
>>>>>>> >>> >>                             sdks/java/io/hadoop-file-system
>>>>>>> >>> >>                             sdks/java/io/kafka
>>>>>>> >>> >>
>>>>>>> >>> >>                             This means at least 6 times more
>>>>>>> of the
>>>>>>> >>> >> size it
>>>>>>> >>> >>                             should which counts in
>>>>>>> >>> >>                             around 15MB more (2.4MB*6 deps)
>>>>>>> of extra
>>>>>>> >>> >> weight
>>>>>>> >>> >>                             that we can simply
>>>>>>> >>> >>                             reduce by using a shaded version
>>>>>>> of the
>>>>>>> >>> >> library.
>>>>>>> >>> >>
>>>>>>> >>> >>                             I add this point to the previous
>>>>>>> ones
>>>>>>> >>> >> mentioned
>>>>>>> >>> >>                             during the discussion
>>>>>>> >>> >>                             because this goes against the end
>>>>>>> user
>>>>>>> >>> >> experience
>>>>>>> >>> >>                             and affects us all
>>>>>>> >>> >>                             (devs/users).
>>>>>>> >>> >>
>>>>>>> >>> >>                             Another question is if we should
>>>>>>> shade
>>>>>>> >>> >> (and how)
>>>>>>> >>> >>                             protocol buffers
>>>>>>> >>> >>                             because now with the portability
>>>>>>> work we
>>>>>>> >>> >> are
>>>>>>> >>> >>                             exposing it closer to the
>>>>>>> >>> >>                             end users. I say this because I
>>>>>>> also found
>>>>>>> >>> >> an
>>>>>>> >>> >>                             issue while running a
>>>>>>> >>> >>                             job on YARN with the spark runner
>>>>>>> because
>>>>>>> >>> >>                             hadoop-common includes
>>>>>>> >>> >>                             protobuf-java 2 and I had to
>>>>>>> explicitly
>>>>>>> >>> >> provide
>>>>>>> >>> >>                             protocol-buffers 3 as
>>>>>>> >>> >>                             a dependency to be able to use
>>>>>>> triggers
>>>>>>> >>> >> (note the
>>>>>>> >>> >>                             Spark runner
>>>>>>> >>> >>                             translates them using some method
>>>>>>> from
>>>>>>> >>> >>                             runners/core-java). Since
>>>>>>> >>> >>                             hadoop-common is provided in the
>>>>>>> cluster
>>>>>>> >>> >> with the
>>>>>>> >>> >>                             older version of
>>>>>>> >>> >>                             protobuf, I am afraid that this
>>>>>>> will bite
>>>>>>> >>> >> us in
>>>>>>> >>> >>                             the future.
>>>>>>> >>> >>
>>>>>>> >>> >>                             Ismaël
>>>>>>> >>> >>
>>>>>>> >>> >>                             ps. There is already a JIRA for
>>>>>>> that
>>>>>>> >>> >> shading for
>>>>>>> >>> >>                             protobuf on
>>>>>>> >>> >>                             hadoop-common but this is not
>>>>>>> coming until
>>>>>>> >>> >> version
>>>>>>> >>> >>                             3 is out.
>>>>>>> >>> >>
>>>>>>> >>> >> https://issues.apache.org/jira/browse/HADOOP-13136
>>>>>>> >>> >>
>>>>>>> >>> >> <https://issues.apache.org/jira/browse/HADOOP-13136>
>>>>>>> >>> >>
>>>>>>> >>> >>                             ps2. Extra curious situation is
>>>>>>> to see
>>>>>>> >>> >> that the
>>>>>>> >>> >>                             dataflow-runner ends
>>>>>>> >>> >>                             up having guava shaded twice via
>>>>>>> its
>>>>>>> >>> >> shaded version on
>>>>>>> >>> >>                             core-construction-java.
>>>>>>> >>> >>
>>>>>>> >>> >>                             ps3. Of course this message means
>>>>>>> a
>>>>>>> >>> >> de-facto +1 at
>>>>>>> >>> >>                             least to do it for
>>>>>>> >>> >>                             guava and evaluate it for other
>>>>>>> libs.
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>                             On Tue, Oct 17, 2017 at 7:29 PM,
>>>>>>> Lukasz
>>>>>>> >>> >> Cwik
>>>>>>> >>> >>                             <lc...@google.com.invalid
>>>>>>> >>> >>                             <mailto:lc...@google.com.invalid>>
>>>>>>> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                                 An issue to call out is how
>>>>>>> to deal
>>>>>>> >>> >> with our
>>>>>>> >>> >>                                 generated code (.avro and
>>>>>>> >>> >>                                 .proto) as I don't believe
>>>>>>> those
>>>>>>> >>> >> plugins allow
>>>>>>> >>> >>                                 you to generate code using a
>>>>>>> >>> >>                                 shaded package prefix on
>>>>>>> imports.
>>>>>>> >>> >>
>>>>>>> >>> >>                                 On Tue, Oct 17, 2017 at 10:28
>>>>>>> AM,
>>>>>>> >>> >> Thomas Groh
>>>>>>> >>> >>                                 <tg...@google.com.invalid
>>>>>>> >>> >>                                 <mailto:tg...@google.com.inva
>>>>>>> lid>>
>>>>>>> >>> >>                                 wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                                     +1 to the goal. I'm
>>>>>>> hugely in
>>>>>>> >>> >> favor of not
>>>>>>> >>> >>                                     doing the same shading
>>>>>>> work
>>>>>>> >>> >>                                     every time for
>>>>>>> dependencies we
>>>>>>> >>> >> know we'll use.
>>>>>>> >>> >>
>>>>>>> >>> >>                                     This also means that if
>>>>>>> we end up
>>>>>>> >>> >> pulling
>>>>>>> >>> >>                                     in transitive
>>>>>>> dependencies we
>>>>>>> >>> >>                                     don't want in any
>>>>>>> particular
>>>>>>> >>> >> module we can
>>>>>>> >>> >>                                     avoid having to adjust our
>>>>>>> >>> >>                                     repackaging strategy for
>>>>>>> that
>>>>>>> >>> >> module -
>>>>>>> >>> >>                                     which I have run into
>>>>>>> face-first
>>>>>>> >>> >> in
>>>>>>> >>> >>                                     the past.
>>>>>>> >>> >>
>>>>>>> >>> >>                                     On Tue, Oct 17, 2017 at
>>>>>>> 9:48 AM,
>>>>>>> >>> >> Kenneth
>>>>>>> >>> >>                                     Knowles
>>>>>>> <k...@google.com.invalid
>>>>>>> >>> >>                                     <mailto:
>>>>>>> k...@google.com.invalid>>
>>>>>>> >>> >>                                     wrote:
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Hi all,
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Shading is a big part
>>>>>>> of how
>>>>>>> >>> >> we keep
>>>>>>> >>> >>                                         our dependencies sane
>>>>>>> in Beam.
>>>>>>> >>> >> But
>>>>>>> >>> >>                                         downsides: shading is
>>>>>>> super
>>>>>>> >>> >> slow,
>>>>>>> >>> >>                                         causes massive jar
>>>>>>> bloat, and
>>>>>>> >>> >> kind of
>>>>>>> >>> >>                                         hard to get right
>>>>>>> because
>>>>>>> >>> >> artifacts
>>>>>>> >>> >>                                         and namespaces are
>>>>>>> not 1-to-1.
>>>>>>> >>> >>
>>>>>>> >>> >>                                         I know that some
>>>>>>> communities
>>>>>>> >>> >>                                         distribute their own
>>>>>>> shaded
>>>>>>> >>> >>                                         distributions of
>>>>>>> >>> >>                                         dependencies. I had a
>>>>>>> thought
>>>>>>> >>> >> about
>>>>>>> >>> >>                                         doing something
>>>>>>> similar that I
>>>>>>> >>> >> wanted
>>>>>>> >>> >>                                         to throw out there
>>>>>>> for people
>>>>>>> >>> >> to poke
>>>>>>> >>> >>                                         holes in.
>>>>>>> >>> >>
>>>>>>> >>> >>                                         To set the scene,
>>>>>>> here is how
>>>>>>> >>> >> I view
>>>>>>> >>> >>                                         shading:
>>>>>>> >>> >>
>>>>>>> >>> >>                                           - A module has
>>>>>>> public
>>>>>>> >>> >> dependencies
>>>>>>> >>> >>                                         and private
>>>>>>> dependencies.
>>>>>>> >>> >>                                           - Public deps are
>>>>>>> used for
>>>>>>> >>> >> data
>>>>>>> >>> >>                                         interchange; users
>>>>>>> must share
>>>>>>> >>> >> these
>>>>>>> >>> >>
>>>>>>> >>> >>                                     deps.
>>>>>>> >>> >>
>>>>>>> >>> >>                                           - Private deps are
>>>>>>> just
>>>>>>> >>> >>                                         functionality and can
>>>>>>> be
>>>>>>> >>> >> hidden (in
>>>>>>> >>> >>                                         our case,
>>>>>>> >>> >>                                         relocated + bundled)
>>>>>>> >>> >>                                           - It isn't
>>>>>>> necessarily that
>>>>>>> >>> >> simple,
>>>>>>> >>> >>                                         because public and
>>>>>>> private
>>>>>>> >>> >> deps
>>>>>>> >>> >>
>>>>>>> >>> >>                                     might
>>>>>>> >>> >>
>>>>>>> >>> >>                                         interact in
>>>>>>> higher-order ways
>>>>>>> >>> >>                                         ("public" is
>>>>>>> contagious)
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Shading is an
>>>>>>> implementation
>>>>>>> >>> >> detail of
>>>>>>> >>> >>                                         expressing these
>>>>>>> >>> >> characteristics.
>>>>>>> >>> >>
>>>>>>> >>> >>                                     We
>>>>>>> >>> >>
>>>>>>> >>> >>                                         use shading
>>>>>>> selectively
>>>>>>> >>> >> because of its
>>>>>>> >>> >>                                         downsides I mentioned
>>>>>>> above.
>>>>>>> >>> >>
>>>>>>> >>> >>                                         But what about this
>>>>>>> idea:
>>>>>>> >>> >> Introduce
>>>>>>> >>> >>                                         shaded deps as a
>>>>>>> single
>>>>>>> >>> >> separate
>>>>>>> >>> >>                                         artifact.
>>>>>>> >>> >>
>>>>>>> >>> >>                                           -
>>>>>>> sdks/java/private-deps:
>>>>>>> >>> >> bundled
>>>>>>> >>> >>                                         uber jar with
>>>>>>> relocated
>>>>>>> >>> >> versions of
>>>>>>> >>> >>                                         everything we want to
>>>>>>> shade
>>>>>>> >>> >>
>>>>>>> >>> >>                                           - sdks/java/core and
>>>>>>> >>> >>                                         sdks/java/harness: no
>>>>>>> >>> >> relocation or
>>>>>>> >>> >>                                         bundling -
>>>>>>> >>> >>                                         depends on
>>>>>>> >>> >>
>>>>>>>  `beam-sdks-java-private-deps`
>>>>>>> >>> >> and
>>>>>>> >>> >>                                         imports like
>>>>>>> >>> >>
>>>>>>> >>> >> `org.apache.beam.sdk.private.com.google.common`
>>>>>>> >>> >>                                         directly (this is what
>>>>>>> >>> >>                                         they
>>>>>>> >>> >>                                         are rewritten to
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Some benefits
>>>>>>> >>> >>
>>>>>>> >>> >>                                           - much faster
>>>>>>> builds of
>>>>>>> >>> >> other modules
>>>>>>> >>> >>                                           - only one shaded
>>>>>>> uber jar
>>>>>>> >>> >>                                           - rare/no rebuilds
>>>>>>> of the
>>>>>>> >>> >> uber jar
>>>>>>> >>> >>                                           - can use maven
>>>>>>> enforcer to
>>>>>>> >>> >> forbid
>>>>>>> >>> >>                                         imports like
>>>>>>> com.google.common
>>>>>>> >>> >>                                           - configuration all
>>>>>>> in one
>>>>>>> >>> >> place
>>>>>>> >>> >>                                           - no automated
>>>>>>> rewriting of
>>>>>>> >>> >> our real
>>>>>>> >>> >>                                         code, which has led
>>>>>>> to some
>>>>>>> >>> >> major
>>>>>>> >>> >>                                         confusion
>>>>>>> >>> >>                                           - easy to implement
>>>>>>> >>> >> incrementally
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Downsides:
>>>>>>> >>> >>
>>>>>>> >>> >>                                           - plenty of effort
>>>>>>> work to
>>>>>>> >>> >> get there
>>>>>>> >>> >>                                           - unclear how many
>>>>>>> different
>>>>>>> >>> >> such
>>>>>>> >>> >>                                         deps modules we need;
>>>>>>> sharing
>>>>>>> >>> >> them
>>>>>>> >>> >>
>>>>>>> >>> >>                                     could
>>>>>>> >>> >>
>>>>>>> >>> >>                                         get weird
>>>>>>> >>> >>                                           - if we hit a
>>>>>>> roadblock, we
>>>>>>> >>> >> will
>>>>>>> >>> >>                                         have committed a lot
>>>>>>> of time
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Just something I was
>>>>>>> musing as
>>>>>>> >>> >> I spent
>>>>>>> >>> >>                                         another evening
>>>>>>> waiting for
>>>>>>> >>> >> slow
>>>>>>> >>> >>                                         builds to try to
>>>>>>> confirm
>>>>>>> >>> >> changes to
>>>>>>> >>> >>                                         brittle poms.
>>>>>>> >>> >>
>>>>>>> >>> >>                                         Kenn
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>                         --
>>>>>>> >>> >>                         Jean-Baptiste Onofré
>>>>>>> >>> >>                         jbono...@apache.org
>>>>>>> >>> >> <mailto:jbono...@apache.org>
>>>>>>> >>> >>                         http://blog.nanthrax.net
>>>>>>> >>> >> <http://blog.nanthrax.net/>
>>>>>>> >>> >>                         Talend - http://www.talend.com
>>>>>>> >>> >> <http://www.talend.com/>
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >
>>>>>>> >>> >
>>>>>>> >>> >
>>>>>>> >>>
>>>>>>> >>> --
>>>>>>> >>> Jean-Baptiste Onofré
>>>>>>> >>> jbono...@apache.org
>>>>>>> >>> http://blog.nanthrax.net
>>>>>>> >>> Talend - http://www.talend.com
>>>>>>> >>
>>>>>>> >>
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to