Hi Chesnay,

Thanks for joining this discussion and sharing your thoughts!


> Connectors shouldn't depend on flink-shaded.
>

Perfect! We are on the same page. As the earlier discussion shows, however,
many connectors currently still depend on flink-shaded.


> Connectors are small enough in scope that depending directly on
> guava/jackson/etc. is a fine approach, and they have plenty of other
> dependencies that they need to manage anyway; let's treat these the same
> way.
>

It would be even better if we could do that. A Jira ticket[1] has been created.
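For illustration, a connector that depends on Guava directly can shade and relocate it under its own namespace, so it can never clash with Flink's copy or the user's. This is only a hedged sketch: the relocation prefix and connector name below are made up for the example, not an agreed convention, and the plugin version is left to the connector's parent pom.

```xml
<!-- Sketch: a connector bundling and relocating its own Guava.
     "myconnector" and the shaded prefix are illustrative only. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.flink.connector.myconnector.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a setup along these lines, the connector jar is self-contained and no longer cares which Guava version Flink or flink-shaded ships.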


> As for class-loading, there has been a long-standing goal of each
> connector being loaded in their own classloader. That still is the north
> star and the only reasonable way to ensure that multiple connectors can
> be safely used with SQL.
>

What is the current status? Do we have any Jira ticket for that?
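For readers following along, here is a rough sketch of what per-connector class loading could look like: each connector's jars go into a child-first classloader, while Flink API classes stay parent-first so the runtime and the connector agree on shared types. All class names and prefixes below are illustrative assumptions, not Flink's actual implementation.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Sketch of a child-first classloader for a connector. Classes are resolved
// from the connector's own jars first, except for a configurable set of
// parent-first prefixes (e.g. Flink API packages, JDK packages).
class ChildFirstClassLoader extends URLClassLoader {

    private final String[] alwaysParentFirst;

    ChildFirstClassLoader(URL[] connectorJars, ClassLoader parent, String[] alwaysParentFirst) {
        super(connectorJars, parent);
        this.alwaysParentFirst = alwaysParentFirst;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded != null) {
                return loaded;
            }
            // Shared API types must come from the parent so that the runtime
            // and the connector see the same Class objects.
            for (String prefix : alwaysParentFirst) {
                if (name.startsWith(prefix)) {
                    return super.loadClass(name, resolve);
                }
            }
            try {
                // Child first: the connector's own (possibly different)
                // Guava/Jackson versions win over Flink's.
                Class<?> c = findClass(name);
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            } catch (ClassNotFoundException e) {
                // Not in the connector jars: fall back to the parent.
                return super.loadClass(name, resolve);
            }
        }
    }
}
```

Child-first resolution would let every connector bring its own dependency versions without clashing with Flink's or another connector's, which is what makes mixing connectors in SQL safe.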

Best regards,
Jing

[1] https://issues.apache.org/jira/browse/FLINK-33190

On Wed, Oct 4, 2023 at 4:43 PM Chesnay Schepler <ches...@apache.org> wrote:

> There is no "monolithic" flink-shaded dependency.
> Connectors shouldn't depend on anything that Flink provides, but be
> self-contained as Martijn pointed out.
>
>

> Connectors shouldn't depend on flink-shaded.
>


> The overhead and/or risks of doing/supporting that right now far
> outweigh the benefits.
> ( Because we either have to encode the full version for all dependencies
> into the package, or accept the risk of minor/patch dependency clashes)
>


> Connectors are small enough in scope that depending directly on
> guava/jackson/etc. is a fine approach, and they have plenty of other
> dependencies that they need to manage anyway; let's treat these the same
> way.
>


> Naturally this is also an argument against flink-shaded-connectors; on
> top of that we already experience repo creep and managing releases is
> difficult enough as-is.
>
>

> As for class-loading, there has been a long-standing goal of each
> connector being loaded in their own classloader. That still is the north
> star and the only reasonable way to ensure that multiple connectors can
> be safely used with SQL.
>


> On 02/10/2023 18:32, Jing Ge wrote:
> > Hi Sergey,
> >
> > Thanks for sharing your thoughts. It could somehow help but didn't get to
> > the root of this issue.
> >
> > According to the documentation, Flink shaded is used to provide a single
> > instance of a shaded dependency across sub-modules in Flink repo. Shaded
> > namespaces should be used where shaded dependencies are configured. After
> > connectors have been externalized, it ends up with more repos depending
> on
> > one shaded jar, e.g. guava. This is a "monolithic" dependency setup that
> > makes it difficult to change the root(flink-shade), because any changes
> of
> > the root have to be propagated to all downstream repos. Even worse is
> that
> > not every downstream repo is known while modifying the root.
> >
> > Since all externalized connectors have their own repos and are not
> > sub-modules of Flink anymore, I would suggest the following upgrade:
> >
> > 1. Connectors should use their own classloader instead of Flink's
> > classloader. This will break the monolithic dependency. Connectors and
> > Flink can use different versions of flink-shaded.
> > 2. [optional] It would be even better that all connector repos depend on
> > their own individual shaded repo, e.g. flink-connector-shaded.
> flink-shaded
> > should only be used by Flink.
> >
> > WDYT?
> >
> > Best regards,
> > Jing
> >
> >
> > On Thu, Sep 14, 2023 at 11:28 PM Sergey Nuyanzin <snuyan...@gmail.com>
> > wrote:
> >
> >> Yes, that's a reasonable question, thanks for raising it.
> >>
> >> I think this is not only about flink-shaded, rather about dependencies
> in
> >> general
> >>
> >> I guess there is no rule of thumb, or at least I'm not aware of
> >> Here are my thoughts
> >> 1. If bumping dependency doesn't require breaking changes and passes
> >> existing tests then just bump it
> >> 2. In case there are breaking changes we could consider doing this
> within
> >> next major release
> >>     for minor release
> >>      a. try to answer a question whether it impacts Flink or not
> >>      b. in case it impacts Flink and fix itself is relatively small
> then to
> >> avoid breaking change
> >>         we could copy classes with solutions to Flink repo like it
> usually
> >> happens with Calcite related fixes.
> >>         The problem of this approach is that I guess it will not work
> for
> >> non jvm deps like e.g. RocksDB
> >>      c. In case no way to do it without breaking changes for minor
> release
> >> then probably need sort of announcement motivating to move to another
> major
> >> version where the issue is fixed
> >>
> >> looking forward to seeing other opinions about that
> >>
> >> On Wed, Sep 13, 2023 at 9:47 PM Jing Ge <j...@ververica.com.invalid>
> >> wrote:
> >>
> >>> Hi Sergey,
> >>>
> >>> Thanks for doing the analysis and providing the great insight. I did my
> >> own
> >>> analysis and got the same conclusion. I just wanted to use this example
> >> to
> >>> kick off a discussion and check if there is a common guideline or
> concept
> >>> in the community to handle such cases, since it seems any bump-up might
> >>> have a big impact. This time, we are kind of lucky. What if CVE related
> >>> code has been used in Flink? I do see it is an issue that upgrading a
> lib
> >>> in the flink-shaded repo is not recommended because of the complexity.
> >>> IMHO, we don't want to put the cart before the horse.
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>> On Wed, Sep 13, 2023 at 9:11 PM Sergey Nuyanzin <snuyan...@gmail.com>
> >>> wrote:
> >>>
> >>>> Thanks for raising this
> >>>>
> >>>> I would suggest trying to double check whether it actually impacts
> >> Flink
> >>> or
> >>>> not.
> >>>>
> >>>> For instance from one side Calcite between 1.22.0..1.31.0 has a
> >> critical
> >>>> CVE-2022-39135 [1]
> >>>> from another side Flink does not use this functionality and is not
> >>> impacted
> >>>> by this.
> >>>>
> >>>> Regarding guava
> >>>> After closer look at Guava's high CVE you've mentioned [2]
> >>>> Based on Github issue describing the problem (exists since 2016) [3]
> >>>> there is a security issue with class
> >>>> com.google.common.io.FileBackedOutputStream
> >>>>
> >>>> While looking at source code for
> >>>> com.google.common.io.FileBackedOutputStream usage I was not able to
> >> find
> >>>> such.
> >>>> Also I was not able to find usage of
> >>>> org.apache.flink.shaded.guava30.com.google.common.io
> >> .Files#createTempDir
> >>>> which was fixed within commit [4]
> >>>> Also I was not able to find other Guava classes which use the problem
> >>> one.
> >>>> Currently I tend to think that it probably does not impact Flink as
> >> well.
> >>>> Please correct me if I'm wrong
> >>>>
> >>>> [1] https://nvd.nist.gov/vuln/detail/CVE-2022-39135
> >>>> [2] https://nvd.nist.gov/vuln/detail/CVE-2023-2976
> >>>> [3] https://github.com/google/guava/issues/2575
> >>>> [4]
> >>>>
> >>>>
> >>
> https://github.com/google/guava/commit/b3719763b9ca777b94554d7ea2d2b92e27ac6164
> >>>> On Wed, Sep 13, 2023 at 6:26 PM Jing Ge <j...@ververica.com.invalid>
> >>>> wrote:
> >>>>
> >>>>> Hi Sergey, Hi Martijn,
> >>>>>
> >>>>> Thanks for the information and suggestions. It is rational.
> >>>>>
> >>>>> Since on one hand, we should not do the upgrade according to all your
> >>>>> thoughts, and on the other hand, Guava 30.1.1-jre has known CVEs[1]
> >> and
> >>>> the
> >>>>> next closest version that has no CVEs is 32.0.1-jre. Without the
> >>> upgrade,
> >>>>> users might not be able to use Flink 1.17 in their production and
> >> Flink
> >>>>> 1.18 has not been released yet. Do we have any known concept, rule,
> >> or
> >>>>> process in the community to handle such a common CVE issue?
> >>>>>
> >>>>> Best regards,
> >>>>> Jing
> >>>>>
> >>>>>
> >>>>>
> >>>>> [1]
> >>> https://mvnrepository.com/artifact/com.google.guava/guava/30.1.1-jre
> >>>>> On Wed, Sep 13, 2023 at 4:31 PM Martijn Visser <
> >>> martijnvis...@apache.org
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Jing,
> >>>>>>
> >>>>>> Like Sergey said, making that change would break all connectors
> >> that
> >>>> used
> >>>>>> Flink Shaded in the released versions, given that Guava is part of
> >>> the
> >>>>>> package name. I think that was the case for Kafka, Pulsar, JDBC,
> >>>> PubSub,
> >>>>>> HBase, Cassandra and maybe even more. We shouldn't do this upgrade.
> >>>>>>
> >>>>>> Best regards,
> >>>>>>
> >>>>>> Martijn
> >>>>>>
> >>>>>> On Wed, Sep 13, 2023 at 2:54 PM Sergey Nuyanzin <
> >> snuyan...@gmail.com
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Jing,
> >>>>>>>
> >>>>>>> If I remember correctly, bumping of guava to another major
> >> version
> >>>>>> usually
> >>>>>>> leads to package rename
> >>>>>>> since the major version number is a part of the package name...
> >>>>>>> Not 100% sure however this could be the reason for some potential
> >>>>>> breaking
> >>>>>>> changes (we also faced that with connectors while the last
> >>>> flink-shaded
> >>>>>>> update)
> >>>>>>>
> >>>>>>> On Wed, Sep 13, 2023 at 2:43 PM Jing Ge
> >> <j...@ververica.com.invalid
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Martijn,
> >>>>>>>>
> >>>>>>>> For example, bump up guava from 30.1.1-jre to 32.1.2-jre.
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>> Jing
> >>>>>>>>
> >>>>>>>> On Wed, Sep 13, 2023 at 2:37 PM Martijn Visser <
> >>>>>> martijnvis...@apache.org
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Jing,
> >>>>>>>>>
> >>>>>>>>> A PR to update the readme sounds like a good plan.
> >>>>>>>>>
> >>>>>>>>> I think it depends on what are the expected updates for a
> >>>>>> flink-shaded
> >>>>>>>>> 16.2, since that version doesn't exist.
> >>>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>>
> >>>>>>>>> Martijn
> >>>>>>>>>
> >>>>>>>>> On Wed, Sep 13, 2023 at 1:48 PM Jing Ge
> >>>> <j...@ververica.com.invalid
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Martijn,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for your reply with details. Appreciate it.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Flink-Shaded is usually only updated whenever a
> >>>>>>>>>>> new Flink minor version is released and only at the
> >>> beginning
> >>>>> of
> >>>>>>> the
> >>>>>>>>>>> release cycle, so that there's enough time to stabilize
> >>>> Flink.
> >>>>>>>>>>
> >>>>>>>>>> This is the information I am looking for. It will help devs
> >>>>>>> understand
> >>>>>>>>> the
> >>>>>>>>>> compatibility between different versions of flink and
> >>>>> flink-shaded,
> >>>>>>> if
> >>>>>>>> it
> >>>>>>>>>> could be described in the readme. If you don't mind, I can
> >>>>> create a
> >>>>>>> pr
> >>>>>>>>> and
> >>>>>>>>>> update it.
> >>>>>>>>>>
> >>>>>>>>>> Speaking of this rule, I have a follow-up question: do you
> >>> have
> >>>>> any
> >>>>>>>>> concern
> >>>>>>>>>> if flink-shaded 16.2 will be released and upgraded(from
> >> 16.1
> >>> to
> >>>>>> 16.2)
> >>>>>>>> in
> >>>>>>>>>> Flink 1.17? Is there anything we should pay attention to
> >>> while
> >>>>>>>> releasing
> >>>>>>>>> a
> >>>>>>>>>> new minor flink-shaded version?
> >>>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>> Jing
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Sep 13, 2023 at 9:01 AM Martijn Visser <
> >>>>>>>> martijnvis...@apache.org
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Jing,
> >>>>>>>>>>>
> >>>>>>>>>>> Flink Shaded exists so that Flinks internal usage of
> >>> commonly
> >>>>>> used
> >>>>>>>>>> packages
> >>>>>>>>>>> such as Guava, Jackson inside of Flink don't clash with
> >>>>> different
> >>>>>>>>>> versions
> >>>>>>>>>>> that users might use when creating a Flink application.
> >>> When
> >>>> I
> >>>>>> did
> >>>>>>>> the
> >>>>>>>>>>> upgrade of Flink Shaded, we already ran into a bunch of
> >>>>> problems
> >>>>>>>>> because
> >>>>>>>>>> a
> >>>>>>>>>>> lot of the externalized connectors relied on Flink
> >> Shaded,
> >>>>> which
> >>>>>>> made
> >>>>>>>>> it
> >>>>>>>>>>> problematic to get the connector to work on both Flink
> >> 1.17
> >>>> and
> >>>>>>> Flink
> >>>>>>>>>> 1.18.
> >>>>>>>>>>> There's been quite a lot of effort put into making sure
> >>> that
> >>>>>>>>> externalized
> >>>>>>>>>>> connectors don't rely on Flink Shaded at all anymore, by
> >>>> either
> >>>>>>> using
> >>>>>>>>>> their
> >>>>>>>>>>> own versions of shaded artifacts (which was the case with
> >>> the
> >>>>>>> Pulsar
> >>>>>>>>>>> connector) or just removing the dependency on Flink
> >> Shaded
> >>>> all
> >>>>>>>>> together,
> >>>>>>>>>> by
> >>>>>>>>>>> using regular Java.
> >>>>>>>>>>>
> >>>>>>>>>>> If you would upgrade flink-shaded from 16.1 to 17.0 in
> >>> Flink
> >>>>>> 1.17,
> >>>>>>>> you
> >>>>>>>>>>> would break all externalized connectors that rely on
> >> Flink
> >>>>>> Shaded's
> >>>>>>>>> Guava
> >>>>>>>>>>> version, plus you potentially would impact the runtime
> >>> given
> >>>>> that
> >>>>>>>>>> there's a
> >>>>>>>>>>> newer Netty version etc. Flink-Shaded is usually only
> >>> updated
> >>>>>>>> whenever
> >>>>>>>>> a
> >>>>>>>>>> new Flink minor version is released and only at the
> >> beginning
> >>>> of
> >>>>>> the
> >>>>>>>>>>> release cycle, so that there's enough time to stabilize
> >>>> Flink.
> >>>>>>>>>> All in all, we shouldn't upgrade flink-shaded in Flink
> >> 1.17.
> >>>>>>>>>>> Best regards,
> >>>>>>>>>>>
> >>>>>>>>>>> Martijn
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Sep 12, 2023 at 7:26 PM Jing Ge
> >>>>>> <j...@ververica.com.invalid
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Dev,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Currently Flink 1.17 is using flink-shaded 16.1 and
> >> Flink
> >>>>> 1.18
> >>>>>> is
> >>>>>>>>> using
> >>>>>>>>>>>> flink-shaded 17.0. Do we need to consider any
> >>> compatibility
> >>>>>> rules
> >>>>>>>>>> between
> >>>>>>>>>>>> them? E.g. is there any concern to upgrade flink-shaded
> >>>> from
> >>>>>> 16.1
> >>>>>>>> to
> >>>>>>>>>> 17.x
> >>>>>>>>>>>> for Flink 1.17? Or there are some implicit dependency
> >>> rules
> >>>>>>> between
> >>>>>>>>>>>> them. Looking
> >>>>>>>>>>>> forward to hearing from you.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best regards,
> >>>>>>>>>>>> Jing
> >>>>>>>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best regards,
> >>>>>>> Sergey
> >>>>>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>> Sergey
> >>>>
> >>
> >> --
> >> Best regards,
> >> Sergey
> >>
>
>
