Thanks Ryan, let me give it a try.

Best regards,
Saisai

On Fri, Mar 27, 2020 at 12:15 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Here’s how it was done before:
> https://github.com/apache/incubator-iceberg/blob/867ec79a5c2f7619cb10546b5cc7f7bbc7d61621/build.gradle#L225-L244
>
> That defines a set of projects called baselineProjects and applies
> baseline like this:
>
> configure(baselineProjects) {
>   apply plugin: 'com.palantir.baseline-checkstyle'
>   ...
> }
>
> The baseline config has since been moved into baseline.gradle
> <https://github.com/apache/incubator-iceberg/blob/master/baseline.gradle>
> so changes should probably go into that file. Thanks for looking into this!
>
> On Thu, Mar 26, 2020 at 6:23 AM Mass Dosage <massdos...@gmail.com> wrote:
>
>> We'd like to know how to do this too. We're working on the Hive
>> integration and Hive requires older versions of many of the libraries that
>> Iceberg uses (Guava, Calcite and Avro being the most problematic).
>> We're going to need to shade some of these in the iceberg modules we depend
>> on but it would also be very useful to be able to override the versions in
>> the iceberg-hive and iceberg-mr modules so that they aren't locked to the
>> same versions as the rest of the projects.
>>
>> On Thu, 26 Mar 2020 at 01:53, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>
>>> Hi Ryan,
>>>
>>> As mentioned in the meeting, would you please point me to the way to
>>> exclude some submodules from the consistent-versions plugin?
>>>
>>> Thanks
>>> Saisai
>>>
>>> On Wed, Mar 18, 2020 at 4:14 AM Anton Okolnychyi <aokolnyc...@apple.com.invalid> wrote:
>>>
>>>> I am +1 on having spark-2 and spark-3 modules as well.
>>>>
>>>> On 7 Mar 2020, at 15:03, RD <rdsr...@gmail.com> wrote:
>>>>
>>>> I'm +1 to separate modules for spark-2 and spark-3, after the 0.8
>>>> release.
>>>> I think it would be a big change in organizations to adopt Spark-3
>>>> since that brings in Scala-2.12 which is binary incompatible with previous
>>>> Scala versions. Hence this adoption could take a lot of time.
>>>> I know in our company we have no near term plans to move to Spark 3.
>>>>
>>>> -Best,
>>>> R.
>>>>
>>>> On Thu, Mar 5, 2020 at 6:33 PM Saisai Shao <sai.sai.s...@gmail.com>
>>>> wrote:
>>>>
>>>>> I was wondering whether it is possible to limit the version lock plugin to
>>>>> only the Iceberg core related subprojects. It seems the current
>>>>> consistent-versions plugin doesn't allow that. So I'm not sure whether there
>>>>> are other plugins that could provide similar functionality with more
>>>>> flexibility?
>>>>>
>>>>> Any suggestions on this?
>>>>>
>>>>> Best regards,
>>>>> Saisai
>>>>>
>>>>> On Thu, Mar 5, 2020 at 3:12 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>>>>
>>>>>> I think the requirement of supporting different versions should be
>>>>>> quite common, as Iceberg is a table format that should adapt to
>>>>>> different engines like Hive, Flink, and Spark. Supporting different versions
>>>>>> is a real problem. Spark is just one case; Hive and Flink could hit the
>>>>>> same problem if their interfaces change across major versions. Version lock
>>>>>> may also cause problems when several engines coexist in the same build, as
>>>>>> they will transitively introduce lots of dependencies that may conflict.
>>>>>> It may be hard to find one version that satisfies all of them, and usually
>>>>>> those dependencies are confined to a single module.
>>>>>>
>>>>>> So I think we should figure out a way to support such a scenario, not
>>>>>> just maintain branches one by one.
>>>>>>
>>>>>> On Thu, Mar 5, 2020 at 2:53 AM Ryan Blue <rb...@netflix.com> wrote:
>>>>>>
>>>>>>> I think the key is that this wouldn't be using the same published
>>>>>>> artifacts. This work would create a spark-2.4 artifact and a spark-3.0
>>>>>>> artifact. (And possibly a spark-common artifact.)
>>>>>>>
>>>>>>> It seems reasonable to me to have those in the same build instead of
>>>>>>> in separate branches, as long as the Spark dependencies are not leaked
>>>>>>> outside of the modules.
>>>>>>> That said, I'd rather have the additional checks
>>>>>>> that baseline provides in general since this is a short-term problem. It
>>>>>>> would just be nice if we could have versions that are confined to a
>>>>>>> single module. The Nebula plugin that baseline uses claims to support that,
>>>>>>> but I couldn't get it to work.
>>>>>>>
>>>>>>> On Wed, Mar 4, 2020 at 6:38 AM Saisai Shao <sai.sai.s...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I just thought a bit about this. I agree that generally introducing
>>>>>>>> different versions of the same dependencies could be error prone. But I
>>>>>>>> think the case here should not lead to issues:
>>>>>>>>
>>>>>>>> 1. These two sub-modules spark-2 and spark-3 are isolated; neither
>>>>>>>> depends on the other.
>>>>>>>> 2. They can be differentiated by names when generating jars, and
>>>>>>>> they will not be relied on by other modules in Iceberg.
>>>>>>>>
>>>>>>>> So this dependency issue should not be the case here. And in Maven
>>>>>>>> it could be achieved easily. Please correct me if I'm wrong.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Saisai
>>>>>>>>
>>>>>>>> On Wed, Mar 4, 2020 at 10:01 AM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks Matt,
>>>>>>>>>
>>>>>>>>> If branching is the only choice, then we would potentially have
>>>>>>>>> two *master* branches until spark-3 is widely adopted. That will
>>>>>>>>> somewhat increase the maintenance burden and lead to inconsistency.
>>>>>>>>> IMO I'm OK with the branching way, I just think that we should have a
>>>>>>>>> clear way to keep track of the two branches.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Saisai
>>>>>>>>>
>>>>>>>>> On Wed, Mar 4, 2020 at 9:50 AM Matt Cheah <mch...@palantir.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> I think it’s generally dangerous and error-prone to try to
>>>>>>>>>> support two versions of the same library in the same build, in the
>>>>>>>>>> same published artifacts.
>>>>>>>>>> This is the stance that Baseline
>>>>>>>>>> <https://github.com/palantir/gradle-baseline> + Gradle
>>>>>>>>>> Consistent Versions
>>>>>>>>>> <https://github.com/palantir/gradle-consistent-versions> takes.
>>>>>>>>>> Gradle Consistent Versions is specifically opinionated towards
>>>>>>>>>> building against one version of a library across all modules in the build.
>>>>>>>>>>
>>>>>>>>>> I would think that branching would be the best way to build and
>>>>>>>>>> publish against multiple versions of a dependency.
>>>>>>>>>>
>>>>>>>>>> -Matt Cheah
>>>>>>>>>>
>>>>>>>>>> *From: *Saisai Shao <sai.sai.s...@gmail.com>
>>>>>>>>>> *Reply-To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>
>>>>>>>>>> *Date: *Tuesday, March 3, 2020 at 5:45 PM
>>>>>>>>>> *To: *Iceberg Dev List <dev@iceberg.apache.org>
>>>>>>>>>> *Cc: *Ryan Blue <rb...@netflix.com>
>>>>>>>>>> *Subject: *Re: [Discuss] Merge spark-3 branch into master
>>>>>>>>>>
>>>>>>>>>> I didn't realize that Gradle cannot support two different
>>>>>>>>>> versions in one build. I think I did such things for Livy to build
>>>>>>>>>> Scala 2.10 and 2.11 jars simultaneously with Maven. I'm not so familiar
>>>>>>>>>> with Gradle, but I can take a shot to see if there are some hacky ways to
>>>>>>>>>> make it work.
>>>>>>>>>>
>>>>>>>>>> Besides, are we saying that we will move to spark-3 support after
>>>>>>>>>> the 0.8 release in the master branch to replace Spark-2, or will we
>>>>>>>>>> maintain two branches for both spark-2 and spark-3 and make two releases?
>>>>>>>>>> From my understanding, the adoption of spark-3 may not be so fast, and
>>>>>>>>>> there are still lots of users who stick with spark-2. Ideally, it might be
>>>>>>>>>> better to support two versions in the near future.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> Saisai
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 4, 2020 at 1:33 AM Mass Dosage <massdos...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> +1 for a 0.8.0 release with Spark 2.4 and then move on to Spark
>>>>>>>>>> 3.0 when it's ready.
>>>>>>>>>>
>>>>>>>>>> On Tue, 3 Mar 2020 at 16:32, Ryan Blue <rb...@netflix.com.invalid>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Thanks for bringing this up, Saisai. I tried to do this a couple
>>>>>>>>>> of months ago, but ran into a problem with dependency locks. I
>>>>>>>>>> couldn't get two different versions of Spark packages in the build with
>>>>>>>>>> baseline, but maybe I was missing something. If you can get it working,
>>>>>>>>>> I think it's a great idea to get this into master.
>>>>>>>>>>
>>>>>>>>>> Otherwise, I was thinking about proposing an 0.8.0 release in the
>>>>>>>>>> next month or so based on Spark 2.4. Then we could merge the branch
>>>>>>>>>> into master and do another release for Spark 3.0 when it's ready.
>>>>>>>>>>
>>>>>>>>>> rb
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 3, 2020 at 6:07 AM Saisai Shao <
>>>>>>>>>> sai.sai.s...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi team,
>>>>>>>>>>
>>>>>>>>>> I was thinking of merging the spark-3 branch into master. Also, per
>>>>>>>>>> the earlier discussion, we could have spark-2 and spark-3 coexist
>>>>>>>>>> in 2 different sub-modules. With this, one build could generate both
>>>>>>>>>> spark-2 and spark-3 runtime jars, and users could pick either one as
>>>>>>>>>> they prefer.
>>>>>>>>>>
>>>>>>>>>> One concern is that they share lots of common code in the read/write
>>>>>>>>>> path, which will increase the maintenance overhead of keeping the
>>>>>>>>>> two copies consistent.
>>>>>>>>>>
>>>>>>>>>> So I'd like to hear your thoughts, any suggestions on it?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> Saisai
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ryan Blue
>>>>>>>>>> Software Engineer
>>>>>>>>>> Netflix
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Software Engineer
>>>>>>> Netflix
>
> --
> Ryan Blue
> Software Engineer
> Netflix
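
For reference, the baselineProjects approach Ryan describes at the top of the thread could be sketched roughly as below. This is only a minimal sketch under assumptions: the module paths in excludedFromBaseline are hypothetical placeholders, the baseline plugin is assumed to already be on the buildscript classpath, and it only controls where the baseline checkstyle plugin is applied. Whether the consistent-versions lock can be relaxed for those modules is left open in the thread and is not addressed here.

```groovy
// Root build.gradle: illustrative sketch, not the actual Iceberg build.

// Hypothetical module paths (assumption); replace with the subprojects
// that need their own dependency versions.
def excludedFromBaseline = [':iceberg-spark2', ':iceberg-spark3']

// Every subproject that is not explicitly excluded keeps the baseline checks.
def baselineProjects = subprojects.findAll { !(it.path in excludedFromBaseline) }

configure(baselineProjects) {
  // Same plugin id as in the snippet quoted in the thread.
  apply plugin: 'com.palantir.baseline-checkstyle'
}
```

Filtering by project path keeps the exclusion list in one place, so adding another engine-specific module later would only mean adding one more entry to the list.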