Thank you for sharing ^ ^
> On April 22, 2020, at 1:40 PM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
> We're still facing the version constraint problem caused by the Gradle plugins :(
>
> jiantao yu <yujt...@gmail.com> wrote on Wed, Apr 22, 2020 at 12:08 PM:
>
> Hi Saisai,
> Would you please share your progress on merging the spark-3 branch into master?
> We are trying Iceberg with Spark SQL, which is only supported in Spark 3.
>
> On 2020/03/27 01:53:09, Saisai Shao <s...@gmail.com> wrote:
> > Thanks Ryan, let me give it a try.
> >
> > Best regards,
> > Saisai
> >
> > Ryan Blue <rb...@netflix.com.invalid> wrote on Fri, Mar 27, 2020 at 12:15 AM:
> >
> > > Here’s how it was done before:
> > > https://github.com/apache/incubator-iceberg/blob/867ec79a5c2f7619cb10546b5cc7f7bbc7d61621/build.gradle#L225-L244
> > >
> > > That defines a set of projects called baselineProjects and applies
> > > baseline like this:
> > >
> > > configure(baselineProjects) {
> > >   apply plugin: 'com.palantir.baseline-checkstyle'
> > >   ...
> > > }
> > >
> > > The baseline config has since been moved into baseline.gradle
> > > <https://github.com/apache/incubator-iceberg/blob/master/baseline.gradle>
> > > so changes should probably go into that file. Thanks for looking into
> > > this!
> > >
> > > On Thu, Mar 26, 2020 at 6:23 AM Mass Dosage <ma...@gmail.com> wrote:
> > >
> > >> We'd like to know how to do this too. We're working on the Hive
> > >> integration and Hive requires older versions of many of the libraries
> > >> that Iceberg uses (Guava, Calcite and Avro being the most problematic).
> > >> We're going to need to shade some of these in the Iceberg modules we
> > >> depend on, but it would also be very useful to be able to override the
> > >> versions in the iceberg-hive and iceberg-mr modules so that they aren't
> > >> locked to the same versions as the rest of the projects.
> > >>
> > >> On Thu, 26 Mar 2020 at 01:53, Saisai Shao <sa...@gmail.com> wrote:
> > >>
> > >>> Hi Ryan,
> > >>>
> > >>> As mentioned in the meeting, would you please point me to the way to
> > >>> exclude some submodules from the consistent-versions plugin?
> > >>>
> > >>> Thanks
> > >>> Saisai
> > >>>
> > >>> Anton Okolnychyi <ao...@apple.com.invalid> wrote on Wed, Mar 18, 2020 at 4:14 AM:
> > >>>
> > >>>> I am +1 on having spark-2 and spark-3 modules as well.
> > >>>>
> > >>>> On 7 Mar 2020, at 15:03, RD <rd...@gmail.com> wrote:
> > >>>>
> > >>>> I'm +1 to separate modules for spark-2 and spark-3, after the 0.8
> > >>>> release.
> > >>>> I think it would be a big change for organizations to adopt Spark 3,
> > >>>> since that brings in Scala 2.12, which is binary incompatible with
> > >>>> previous Scala versions. Hence this adoption could take a lot of time.
> > >>>> I know in our company we have no near-term plans to move to Spark 3.
> > >>>>
> > >>>> -Best,
> > >>>> R.
> > >>>>
> > >>>> On Thu, Mar 5, 2020 at 6:33 PM Saisai Shao <sa...@gmail.com> wrote:
> > >>>>
> > >>>>> I was wondering whether it is possible to limit the version lock
> > >>>>> plugin to only the Iceberg core-related subprojects; it seems the
> > >>>>> current consistent-versions plugin doesn't allow that. So I'm not
> > >>>>> sure whether there are other plugins that could provide similar
> > >>>>> functionality with more flexibility.
> > >>>>>
> > >>>>> Any suggestions on this?
> > >>>>>
> > >>>>> Best regards,
> > >>>>> Saisai
> > >>>>>
> > >>>>> Saisai Shao <sa...@gmail.com> wrote on Thu, Mar 5, 2020 at 3:12 PM:
> > >>>>>
> > >>>>>> I think the requirement to support different versions should be
> > >>>>>> quite common, as Iceberg is a table format that should be adapted
> > >>>>>> to different engines like Hive, Flink, and Spark. Supporting
> > >>>>>> different versions is a real problem. Spark is just one case; Hive
> > >>>>>> and Flink could also be affected if their interfaces change across
> > >>>>>> major versions. Also, version locking may cause problems when
> > >>>>>> several engines coexist in the same build, as they will
> > >>>>>> transitively introduce lots of dependencies that may conflict. It
> > >>>>>> may be hard to figure out one version that satisfies all of them,
> > >>>>>> and usually they are only confined to a single module.
> > >>>>>>
> > >>>>>> So I think we should figure out a way to support such a scenario,
> > >>>>>> not just maintain branches one by one.
> > >>>>>>
> > >>>>>> Ryan Blue <rb...@netflix.com> wrote on Thu, Mar 5, 2020 at 2:53 AM:
> > >>>>>>
> > >>>>>>> I think the key is that this wouldn't be using the same published
> > >>>>>>> artifacts. This work would create a spark-2.4 artifact and a
> > >>>>>>> spark-3.0 artifact. (And possibly a spark-common artifact.)
> > >>>>>>>
> > >>>>>>> It seems reasonable to me to have those in the same build instead
> > >>>>>>> of in separate branches, as long as the Spark dependencies are not
> > >>>>>>> leaked outside of the modules. That said, I'd rather have the
> > >>>>>>> additional checks that baseline provides in general since this is
> > >>>>>>> a short-term problem. It would just be nice if we could have
> > >>>>>>> versions that are confined to a single module. The Nebula plugin
> > >>>>>>> that baseline uses claims to support that, but I couldn't get it
> > >>>>>>> to work.
> > >>>>>>>
> > >>>>>>> On Wed, Mar 4, 2020 at 6:38 AM Saisai Shao <sa...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> I just thought a bit about this. I agree that generally
> > >>>>>>>> introducing different versions of the same dependencies could be
> > >>>>>>>> error-prone. But I think the case here should not lead to issues:
> > >>>>>>>>
> > >>>>>>>> 1. These two sub-modules, spark-2 and spark-3, are isolated;
> > >>>>>>>>    neither depends on the other.
> > >>>>>>>> 2. They can be differentiated by name when generating jars, and
> > >>>>>>>>    no other modules in Iceberg will rely on them.
> > >>>>>>>>
> > >>>>>>>> So this dependency issue should not apply here. And in Maven it
> > >>>>>>>> could be achieved easily. Please correct me if I'm wrong.
> > >>>>>>>>
> > >>>>>>>> Best regards,
> > >>>>>>>> Saisai
> > >>>>>>>>
> > >>>>>>>> Saisai Shao <sa...@gmail.com> wrote on Wed, Mar 4, 2020 at 10:01 AM:
> > >>>>>>>>
> > >>>>>>>>> Thanks Matt,
> > >>>>>>>>>
> > >>>>>>>>> If branching is the only choice, then we would potentially have
> > >>>>>>>>> two *master* branches until spark-3 is widely adopted. That will
> > >>>>>>>>> increase the maintenance burden and lead to inconsistency. I'm
> > >>>>>>>>> OK with the branching approach; I just think that we should have
> > >>>>>>>>> a clear way to keep track of the two branches.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Saisai
> > >>>>>>>>>
> > >>>>>>>>> Matt Cheah <mc...@palantir.com.invalid> wrote on Wed, Mar 4, 2020 at 9:50 AM:
> > >>>>>>>>>
> > >>>>>>>>>> I think it’s generally dangerous and error-prone to try to
> > >>>>>>>>>> support two versions of the same library in the same build, in
> > >>>>>>>>>> the same published artifacts. This is the stance that Baseline
> > >>>>>>>>>> <https://github.com/palantir/gradle-baseline> + Gradle
> > >>>>>>>>>> Consistent Versions
> > >>>>>>>>>> <https://github.com/palantir/gradle-consistent-versions> takes.
> > >>>>>>>>>> Gradle Consistent Versions is specifically opinionated towards
> > >>>>>>>>>> building against one version of a library across all modules in
> > >>>>>>>>>> the build.
> > >>>>>>>>>>
> > >>>>>>>>>> I would think that branching would be the best way to build and
> > >>>>>>>>>> publish against multiple versions of a dependency.
> > >>>>>>>>>>
> > >>>>>>>>>> -Matt Cheah
> > >>>>>>>>>>
> > >>>>>>>>>> *From: *Saisai Shao <sa...@gmail.com>
> > >>>>>>>>>> *Reply-To: *"dev@iceberg.apache.org" <de...@iceberg.apache.org>
> > >>>>>>>>>> *Date: *Tuesday, March 3, 2020 at 5:45 PM
> > >>>>>>>>>> *To: *Iceberg Dev List <de...@iceberg.apache.org>
> > >>>>>>>>>> *Cc: *Ryan Blue <rb...@netflix.com>
> > >>>>>>>>>> *Subject: *Re: [Discuss] Merge spark-3 branch into master
> > >>>>>>>>>>
> > >>>>>>>>>> I didn't realize that Gradle cannot support two different
> > >>>>>>>>>> versions in one build. I think I did such things for Livy to
> > >>>>>>>>>> build Scala 2.10 and 2.11 jars simultaneously with Maven. I'm
> > >>>>>>>>>> not so familiar with Gradle; I can take a shot to see if there
> > >>>>>>>>>> are some hacky ways to make it work.
> > >>>>>>>>>>
> > >>>>>>>>>> Besides, are we saying that we will move to spark-3 support
> > >>>>>>>>>> after the 0.8 release in the master branch to replace spark-2,
> > >>>>>>>>>> or will we maintain two branches for both spark-2 and spark-3
> > >>>>>>>>>> and make two releases? From my understanding, the adoption of
> > >>>>>>>>>> spark-3 may not be so fast, and there are still lots of users
> > >>>>>>>>>> who stick with spark-2. Ideally, it might be better to support
> > >>>>>>>>>> both versions in the near future.
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks
> > >>>>>>>>>> Saisai
> > >>>>>>>>>>
> > >>>>>>>>>> Mass Dosage <ma...@gmail.com> wrote on Wed, Mar 4, 2020 at 1:33 AM:
> > >>>>>>>>>>
> > >>>>>>>>>> +1 for a 0.8.0 release with Spark 2.4 and then move on to Spark
> > >>>>>>>>>> 3.0 when it's ready.
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, 3 Mar 2020 at 16:32, Ryan Blue <rb...@netflix.com.invalid>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for bringing this up, Saisai. I tried to do this a
> > >>>>>>>>>> couple of months ago, but ran into a problem with dependency
> > >>>>>>>>>> locks. I couldn't get two different versions of Spark packages
> > >>>>>>>>>> in the build with baseline, but maybe I was missing something.
> > >>>>>>>>>> If you can get it working, I think it's a great idea to get
> > >>>>>>>>>> this into master.
> > >>>>>>>>>>
> > >>>>>>>>>> Otherwise, I was thinking about proposing an 0.8.0 release in
> > >>>>>>>>>> the next month or so based on Spark 2.4. Then we could merge
> > >>>>>>>>>> the branch into master and do another release for Spark 3.0
> > >>>>>>>>>> when it's ready.
> > >>>>>>>>>>
> > >>>>>>>>>> rb
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, Mar 3, 2020 at 6:07 AM Saisai Shao
> > >>>>>>>>>> <sai.sai.s...@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Hi team,
> > >>>>>>>>>>
> > >>>>>>>>>> I was thinking of merging the spark-3 branch into master; also,
> > >>>>>>>>>> per the earlier discussion, we could make spark-2 and spark-3
> > >>>>>>>>>> coexist as two different sub-modules. With this, one build
> > >>>>>>>>>> could generate both spark-2 and spark-3 runtime jars, and users
> > >>>>>>>>>> could pick either one as they prefer.
> > >>>>>>>>>>
> > >>>>>>>>>> One concern is that they share lots of common code in the
> > >>>>>>>>>> read/write path; this will increase the maintenance overhead of
> > >>>>>>>>>> keeping the two copies consistent.
> > >>>>>>>>>>
> > >>>>>>>>>> So I'd like to hear your thoughts. Any suggestions?
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks
> > >>>>>>>>>> Saisai
> > >>>>>>>>>>
> > >>>>>>>>>> --
> > >>>>>>>>>> Ryan Blue
> > >>>>>>>>>> Software Engineer
> > >>>>>>>>>> Netflix
> > >>>>>>>>>>
> > >>>>>>> --
> > >>>>>>> Ryan Blue
> > >>>>>>> Software Engineer
> > >>>>>>> Netflix
> > >>>>>>>
> > > --
> > > Ryan Blue
> > > Software Engineer
> > > Netflix
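
For reference, the `configure(baselineProjects)` approach quoted from Ryan Blue's message could be sketched roughly as below. This is only a sketch under assumptions: the list `excludedFromBaseline` and the module name `iceberg-spark3` are hypothetical and do not appear in the thread, and the baseline plugin is assumed to already be on the buildscript classpath. It only narrows which subprojects get the baseline checkstyle plugin; it does not by itself lift the consistent-versions lock that the thread reports as still unresolved.

```groovy
// Root build.gradle (sketch). The module name below is hypothetical.
def excludedFromBaseline = ['iceberg-spark3']

// Select every subproject except the excluded ones.
def baselineProjects = subprojects.findAll { !(it.name in excludedFromBaseline) }

// Apply the baseline checkstyle plugin only to the selected subprojects.
configure(baselineProjects) {
  apply plugin: 'com.palantir.baseline-checkstyle'
}
```

Filtering by name keeps the exclusion list in one place, so a module can be moved in or out of the baseline set without touching the `configure` block.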