We're still facing the version constraint problem caused by the Gradle plugins :(

On Wed, Apr 22, 2020 at 12:08 PM jiantao yu <yujt...@gmail.com> wrote:

> Hi Saisai,
> Would you please share your progress on merging the spark-3 branch into
> master?
> We are trying Iceberg with Spark SQL, which is only supported in Spark 3.
>
> On 2020/03/27 01:53:09, Saisai Shao <s...@gmail.com> wrote:
> > Thanks Ryan, let me give it a try.
> >
> > Best regards,
> > Saisai
> >
> > On Fri, Mar 27, 2020 at 12:15 AM Ryan Blue <rb...@netflix.com.invalid> wrote:
> >
> > > Here’s how it was done before:
> > > https://github.com/apache/incubator-iceberg/blob/867ec79a5c2f7619cb10546b5cc7f7bbc7d61621/build.gradle#L225-L244
> > >>
> > > That defines a set of projects called baselineProjects and applies
> > > baseline like this:
> > >>
> > > configure(baselineProjects) {
> > >   apply plugin: 'com.palantir.baseline-checkstyle'
> > >   ...
> > > }
> > >>
> > > The baseline config has since been moved into baseline.gradle
> > > <https://github.com/apache/incubator-iceberg/blob/master/baseline.gradle>
> > > so changes should probably go into that file. Thanks for looking into
> > > this!
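> > >>
> > > A configure block like that, but excluding the Spark modules, might be
> > > the starting point (a rough, untested sketch; ':iceberg-spark2' and
> > > ':iceberg-spark3' are placeholder project names):
> > >>
> > > def baselineProjects = allprojects -
> > >     [project(':iceberg-spark2'), project(':iceberg-spark3')]
> > >
> > > configure(baselineProjects) {
> > >   // only these projects get the baseline checks applied
> > >   apply plugin: 'com.palantir.baseline-checkstyle'
> > >   ...
> > > }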
> > >>
> > > On Thu, Mar 26, 2020 at 6:23 AM Mass Dosage <ma...@gmail.com> wrote:
> > >>
> > >> We'd like to know how to do this too. We're working on the Hive
> > >> integration, and Hive requires older versions of many of the libraries
> > >> that Iceberg uses (Guava, Calcite and Avro being the most problematic).
> > >> We're going to need to shade some of these in the Iceberg modules we
> > >> depend on, but it would also be very useful to be able to override the
> > >> versions in the iceberg-hive and iceberg-mr modules so that they aren't
> > >> locked to the same versions as the rest of the projects.
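> > >>>
> > >> For the shading part, something like the Gradle shadow plugin's
> > >> relocate could be one way to do it; a rough sketch (the module and the
> > >> relocated package are illustrative, not what the build does today):
> > >>>
> > >> project(':iceberg-mr') {
> > >>   apply plugin: 'com.github.johnrengelman.shadow'
> > >>
> > >>   shadowJar {
> > >>     // relocate Guava so Hive can keep its own (older) version on the classpath
> > >>     relocate 'com.google.common', 'org.apache.iceberg.shaded.com.google.common'
> > >>   }
> > >> }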
> > >>>
> > >> On Thu, 26 Mar 2020 at 01:53, Saisai Shao <sa...@gmail.com> wrote:
> > >>>
> > >>> Hi Ryan,
> > >>>>
> > >>> As mentioned in the meeting, would you please point me to how to
> > >>> exclude some submodules from the consistent-versions plugin?
> > >>>>
> > >>> Thanks
> > >>> Saisai
> > >>>>
> > >>> On Wed, Mar 18, 2020 at 4:14 AM Anton Okolnychyi <ao...@apple.com.invalid> wrote:
> > >>>>
> > >>>> I am +1 on having spark-2 and spark-3 modules as well.
> > >>>>>
> > >>>> On 7 Mar 2020, at 15:03, RD <rd...@gmail.com> wrote:
> > >>>>>
> > >>>> I'm +1 to separate modules for spark-2 and spark-3, after the 0.8
> > >>>> release.
> > >>>> I think adopting Spark 3 would be a big change for organizations,
> > >>>> since it brings in Scala 2.12, which is binary incompatible with
> > >>>> previous Scala versions. Hence this adoption could take a lot of
> > >>>> time. I know in our company we have no near-term plans to move to
> > >>>> Spark 3.
> > >>>>>
> > >>>> -Best,
> > >>>> R.
> > >>>>>
> > >>>> On Thu, Mar 5, 2020 at 6:33 PM Saisai Shao <sa...@gmail.com> wrote:
> > >>>>>
> > >>>>> I was wondering if it is possible to limit the version lock plugin
> > >>>>> to only the Iceberg core-related subprojects; it seems the current
> > >>>>> consistent-versions plugin doesn't allow that. Are there other
> > >>>>> plugins that could provide similar functionality with more
> > >>>>> flexibility?
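> > >>>>>>
> > >>>>> One possible alternative is Gradle's built-in per-project dependency
> > >>>>> locking, which could be enabled for only the core modules, something
> > >>>>> like this (an untested sketch; the excluded module names are
> > >>>>> placeholders for the proposed Spark modules):
> > >>>>>>
> > >>>>> configure(subprojects -
> > >>>>>     [project(':iceberg-spark2'), project(':iceberg-spark3')]) {
> > >>>>>   // lock dependency versions only in these projects
> > >>>>>   dependencyLocking {
> > >>>>>     lockAllConfigurations()
> > >>>>>   }
> > >>>>> }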
> > >>>>>>
> > >>>>> Any suggestions on this?
> > >>>>>>
> > >>>>> Best regards,
> > >>>>> Saisai
> > >>>>>>
> > >>>>> On Thu, Mar 5, 2020 at 3:12 PM Saisai Shao <sa...@gmail.com> wrote:
> > >>>>>>
> > >>>>>> I think the requirement of supporting different versions should be
> > >>>>>> quite common. Iceberg is a table format that should be adapted to
> > >>>>>> different engines like Hive, Flink, and Spark, so supporting
> > >>>>>> different versions is a real problem; Spark is just one case, and
> > >>>>>> Hive or Flink could also be affected if their interfaces change
> > >>>>>> across major versions. Version locking may also cause problems when
> > >>>>>> several engines coexist in the same build: they transitively
> > >>>>>> introduce lots of dependencies which may conflict, it may be hard
> > >>>>>> to find one version that satisfies all of them, and usually those
> > >>>>>> dependencies are confined to a single module anyway.
> > >>>>>>>
> > >>>>>> So I think we should figure out a way to support such a scenario,
> > >>>>>> not just maintain branches one by one.
> > >>>>>>>
> > >>>>>> On Thu, Mar 5, 2020 at 2:53 AM Ryan Blue <rb...@netflix.com> wrote:
> > >>>>>>>
> > >>>>>>> I think the key is that this wouldn't be using the same published
> > >>>>>>> artifacts. This work would create a spark-2.4 artifact and a
> > >>>>>>> spark-3.0 artifact. (And possibly a spark-common artifact.)
> > >>>>>>>>
> > >>>>>>> It seems reasonable to me to have those in the same build instead
> > >>>>>>> of in separate branches, as long as the Spark dependencies are not
> > >>>>>>> leaked outside of the modules. That said, I'd rather have the
> > >>>>>>> additional checks that baseline provides in general, since this is
> > >>>>>>> a short-term problem. It would just be nice if we could have
> > >>>>>>> versions that are confined to a single module. The Nebula plugin
> > >>>>>>> that baseline uses claims to support that, but I couldn't get it
> > >>>>>>> to work.
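> > >>>>>>>>
> > >>>>>>> Without the lock plugin, plain Gradle can at least force a version
> > >>>>>>> inside a single module, for example (an illustrative sketch only;
> > >>>>>>> the module and the Guava version are made up for the example):
> > >>>>>>>>
> > >>>>>>> project(':iceberg-hive') {
> > >>>>>>>   configurations.all {
> > >>>>>>>     resolutionStrategy {
> > >>>>>>>       // pin an older Guava here without affecting other modules
> > >>>>>>>       force 'com.google.guava:guava:14.0.1'
> > >>>>>>>     }
> > >>>>>>>   }
> > >>>>>>> }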
> > >>>>>>>>
> > >>>>>>> On Wed, Mar 4, 2020 at 6:38 AM Saisai Shao <sa...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> I've thought a bit about this. I agree that, in general,
> > >>>>>>>> introducing different versions of the same dependencies could be
> > >>>>>>>> error-prone. But I think the case here should not lead to issues:
> > >>>>>>>>>
> > >>>>>>>> 1. These two sub-modules, spark-2 and spark-3, are isolated;
> > >>>>>>>> they don't depend on each other.
> > >>>>>>>> 2. They can be differentiated by name when generating jars, and
> > >>>>>>>> they will not be relied on by other modules in Iceberg.
> > >>>>>>>>>
> > >>>>>>>> So this dependency issue should not apply here. And in Maven it
> > >>>>>>>> could be achieved easily. Please correct me if I'm wrong.
> > >>>>>>>>>
> > >>>>>>>> Best regards,
> > >>>>>>>> Saisai
> > >>>>>>>>>
> > >>>>>>>> On Wed, Mar 4, 2020 at 10:01 AM Saisai Shao <sa...@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>> Thanks Matt,
> > >>>>>>>>>>
> > >>>>>>>>> If branching is the only choice, then we would potentially have
> > >>>>>>>>> two *master* branches until spark-3 is widely adopted. That would
> > >>>>>>>>> increase the maintenance burden and lead to inconsistency. I'm OK
> > >>>>>>>>> with the branching approach; I just think we should have a clear
> > >>>>>>>>> way to keep track of the two branches.
> > >>>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Saisai
> > >>>>>>>>>>
> > >>>>>>>>> On Wed, Mar 4, 2020 at 9:50 AM Matt Cheah <mc...@palantir.com.invalid> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> I think it’s generally dangerous and error-prone to try to
> > >>>>>>>>>> support two versions of the same library in the same build, in
> > >>>>>>>>>> the same published artifacts. This is the stance that Baseline
> > >>>>>>>>>> <https://github.com/palantir/gradle-baseline> + Gradle
> > >>>>>>>>>> Consistent Versions
> > >>>>>>>>>> <https://github.com/palantir/gradle-consistent-versions> take.
> > >>>>>>>>>> Gradle Consistent Versions is specifically opinionated towards
> > >>>>>>>>>> building against one version of a library across all modules in
> > >>>>>>>>>> the build.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> I would think that branching would be the best way to build and
> > >>>>>>>>>> publish against multiple versions of a dependency.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> -Matt Cheah
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> *From: *Saisai Shao <sa...@gmail.com>
> > >>>>>>>>>> *Reply-To: *"dev@iceberg.apache.org" <de...@iceberg.apache.org>
> > >>>>>>>>>> *Date: *Tuesday, March 3, 2020 at 5:45 PM
> > >>>>>>>>>> *To: *Iceberg Dev List <de...@iceberg.apache.org>
> > >>>>>>>>>> *Cc: *Ryan Blue <rb...@netflix.com>
> > >>>>>>>>>> *Subject: *Re: [Discuss] Merge spark-3 branch into master
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> I didn't realize that Gradle cannot support two different
> > >>>>>>>>>> versions in one build. I think I did something similar for Livy
> > >>>>>>>>>> to build Scala 2.10 and 2.11 jars simultaneously with Maven. I'm
> > >>>>>>>>>> not so familiar with Gradle, but I can take a shot at seeing if
> > >>>>>>>>>> there's some hacky way to make it work.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> Besides, are we saying that after the 0.8 release we will move
> > >>>>>>>>>> master to Spark 3 support, replacing Spark 2, or will we
> > >>>>>>>>>> maintain two branches for spark-2 and spark-3 and make two
> > >>>>>>>>>> releases? From my understanding, the adoption of Spark 3 may not
> > >>>>>>>>>> be so fast, and there are still lots of users who will stick
> > >>>>>>>>>> with Spark 2. Ideally, it might be better to support both
> > >>>>>>>>>> versions for the near future.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> Thanks
> > >>>>>>>>>>>
> > >>>>>>>>>> Saisai
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> On Wed, Mar 4, 2020 at 1:33 AM Mass Dosage <ma...@gmail.com> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>> +1 for a 0.8.0 release with Spark 2.4 and then moving on to
> > >>>>>>>>>> Spark 3.0 when it's ready.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> On Tue, 3 Mar 2020 at 16:32, Ryan Blue <rb...@netflix.com.invalid>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>> Thanks for bringing this up, Saisai. I tried to do this a
> > >>>>>>>>>> couple of months ago, but ran into a problem with dependency
> > >>>>>>>>>> locks. I couldn't get two different versions of Spark packages
> > >>>>>>>>>> in the build with baseline, but maybe I was missing something.
> > >>>>>>>>>> If you can get it working, I think it's a great idea to get this
> > >>>>>>>>>> into master.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> Otherwise, I was thinking about proposing a 0.8.0 release in
> > >>>>>>>>>> the next month or so based on Spark 2.4. Then we could merge the
> > >>>>>>>>>> branch into master and do another release for Spark 3.0 when
> > >>>>>>>>>> it's ready.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> rb
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> On Tue, Mar 3, 2020 at 6:07 AM Saisai Shao <sai.sai.s...@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>> Hi team,
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> I was thinking of merging the spark-3 branch into master. Also,
> > >>>>>>>>>> per the earlier discussion, we could make spark-2 and spark-3
> > >>>>>>>>>> coexist in two different sub-modules. With this, one build could
> > >>>>>>>>>> generate both spark-2 and spark-3 runtime jars, and users could
> > >>>>>>>>>> pick whichever they prefer.
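> > >>>>>>>>>>>
> > >>>>>>>>>> Roughly, the layout could look like this (just a sketch; the
> > >>>>>>>>>> module names and dependency versions are illustrative):
> > >>>>>>>>>>>
> > >>>>>>>>>> // settings.gradle
> > >>>>>>>>>> include 'iceberg-spark2'
> > >>>>>>>>>> include 'iceberg-spark3'
> > >>>>>>>>>>>
> > >>>>>>>>>> // build.gradle: each module gets its own Spark version
> > >>>>>>>>>> project(':iceberg-spark2') {
> > >>>>>>>>>>   dependencies {
> > >>>>>>>>>>     compileOnly 'org.apache.spark:spark-sql_2.11:2.4.5'
> > >>>>>>>>>>   }
> > >>>>>>>>>> }
> > >>>>>>>>>> project(':iceberg-spark3') {
> > >>>>>>>>>>   dependencies {
> > >>>>>>>>>>     compileOnly 'org.apache.spark:spark-sql_2.12:3.0.0'
> > >>>>>>>>>>   }
> > >>>>>>>>>> }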
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> One concern is that they share lots of common code in the
> > >>>>>>>>>> read/write path; keeping two copies consistent will increase the
> > >>>>>>>>>> maintenance overhead.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> So I'd like to hear your thoughts. Any suggestions on it?
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> Thanks
> > >>>>>>>>>>>
> > >>>>>>>>>> Saisai
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> --
> > >>>>>>>>>> Ryan Blue
> > >>>>>>>>>> Software Engineer
> > >>>>>>>>>> Netflix
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>> --
> > >>>>>>> Ryan Blue
> > >>>>>>> Software Engineer
> > >>>>>>> Netflix
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>
> > > --
> > > Ryan Blue
> > > Software Engineer
> > > Netflix
> > >>
> >
