Hi Ryan,

As mentioned in the meeting, could you please point me to a way to exclude
some submodules from the consistent-versions plugin?

Thanks
Saisai

Anton Okolnychyi <aokolnyc...@apple.com.invalid> wrote on Wed, Mar 18, 2020 at 4:14 AM:

> I am +1 on having spark-2 and spark-3 modules as well.
>
> On 7 Mar 2020, at 15:03, RD <rdsr...@gmail.com> wrote:
>
> I'm +1 on separate modules for spark-2 and spark-3 after the 0.8 release.
> Adopting Spark 3 will be a big change for organizations, since it brings in
> Scala 2.12, which is binary incompatible with previous Scala versions, so
> adoption could take a long time. I know our company has no near-term plans
> to move to Spark 3.
>
> -Best,
> R.
>
> On Thu, Mar 5, 2020 at 6:33 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
>> I was wondering whether it is possible to limit the version lock plugin
>> to only the Iceberg core subprojects; the current consistent-versions
>> plugin doesn't seem to allow that. Are there other plugins that could
>> provide similar functionality with more flexibility?
>>
>> Any suggestions on this?
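>>
>> One alternative I've been wondering about (a sketch, untested in this
>> build): Gradle's built-in dependency locking is configured per project,
>> so in principle it could be enabled for the core subprojects only and
>> left off for the engine modules:
>>
>>     // in each core subproject's build.gradle (sketch)
>>     dependencyLocking {
>>         lockAllConfigurations()
>>     }
>>
>> The lock files would then be generated with
>> ./gradlew dependencies --write-locks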
>>
>> Best regards,
>> Saisai
>>
>> Saisai Shao <sai.sai.s...@gmail.com> wrote on Thu, Mar 5, 2020 at 3:12 PM:
>>
>>> I think the requirement of supporting different versions should be quite
>>> common. Iceberg is a table format that needs to be adapted to different
>>> engines like Hive, Flink, and Spark, so supporting multiple versions is a
>>> real problem: Spark is just one case, and Hive and Flink could also be
>>> affected if their interfaces change across major versions. Version
>>> locking may also cause problems when several engines coexist in the same
>>> build, because they transitively introduce lots of dependencies that may
>>> conflict; it may be hard to find one version that satisfies all of them,
>>> even though each engine's dependencies are usually confined to a single
>>> module.
>>>
>>> So I think we should figure out a way to support this scenario, not just
>>> maintain branches one by one.
>>>
>>> Ryan Blue <rb...@netflix.com> wrote on Thu, Mar 5, 2020 at 2:53 AM:
>>>
>>>> I think the key is that this wouldn't be using the same published
>>>> artifacts. This work would create a spark-2.4 artifact and a spark-3.0
>>>> artifact. (And possibly a spark-common artifact.)
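>>>>
>>>> For example, the module layout could look something like this (a
>>>> sketch; the module names are hypothetical):
>>>>
>>>>     // settings.gradle (sketch)
>>>>     include ':iceberg-spark'     // code shared by both versions
>>>>     include ':iceberg-spark2'    // builds against Spark 2.4
>>>>     include ':iceberg-spark3'    // builds against Spark 3.0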
>>>>
>>>> It seems reasonable to me to have those in the same build instead of in
>>>> separate branches, as long as the Spark dependencies are not leaked outside
>>>> of the modules. That said, I'd rather have the additional checks that
>>>> baseline provides in general since this is a short-term problem. It would
>>>> just be nice if we could have versions that are confined to a single
>>>> module. The Nebula plugin that baseline uses claims to support that, but I
>>>> couldn't get it to work.
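>>>>
>>>> For reference, here is roughly what that per-module idea might look
>>>> like with the Nebula dependency-lock plugin, assuming that is the
>>>> relevant one (a rough sketch; as noted, I couldn't get it to work):
>>>>
>>>>     // in a single module's build.gradle (sketch, untested)
>>>>     apply plugin: 'nebula.dependency-lock'
>>>>     // generate and save a per-module lock file with:
>>>>     // ./gradlew :iceberg-spark2:generateLock :iceberg-spark2:saveLock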
>>>>
>>>> On Wed, Mar 4, 2020 at 6:38 AM Saisai Shao <sai.sai.s...@gmail.com>
>>>> wrote:
>>>>
>>>>> I've thought about this a bit. I agree that introducing different
>>>>> versions of the same dependency is generally error prone, but I don't
>>>>> think the case here leads to that issue:
>>>>>
>>>>> 1. The two sub-modules, spark-2 and spark-3, are isolated; neither
>>>>> depends on the other.
>>>>> 2. Their jars can be differentiated by name, and no other Iceberg
>>>>> modules will depend on them.
>>>>>
>>>>> So the dependency issue shouldn't apply here, and in Maven this could
>>>>> be achieved easily. Please correct me if I'm wrong.
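>>>>>
>>>>> For example, a minimal sketch of the two isolated modules (versions
>>>>> and module names are illustrative, not a final choice):
>>>>>
>>>>>     // iceberg-spark2/build.gradle (sketch)
>>>>>     dependencies {
>>>>>         compileOnly "org.apache.spark:spark-sql_2.11:2.4.4"
>>>>>     }
>>>>>
>>>>>     // iceberg-spark3/build.gradle (sketch)
>>>>>     dependencies {
>>>>>         compileOnly "org.apache.spark:spark-sql_2.12:3.0.0-preview2"
>>>>>     }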
>>>>>
>>>>> Best regards,
>>>>> Saisai
>>>>>
>>>>> Saisai Shao <sai.sai.s...@gmail.com> wrote on Wed, Mar 4, 2020 at 10:01 AM:
>>>>>
>>>>>> Thanks Matt,
>>>>>>
>>>>>> If branching is the only choice, then we would effectively have two
>>>>>> *master* branches until spark-3 is widely adopted, which would
>>>>>> increase the maintenance burden and could lead to inconsistency. I'm
>>>>>> OK with the branching approach; I just think we need a clear way to
>>>>>> keep the two branches in sync.
>>>>>>
>>>>>> Best,
>>>>>> Saisai
>>>>>>
>>>>>> Matt Cheah <mch...@palantir.com.invalid> wrote on Wed, Mar 4, 2020 at 9:50 AM:
>>>>>>
>>>>>>> I think it’s generally dangerous and error-prone to try to support
>>>>>>> two versions of the same library in the same build and in the same
>>>>>>> published artifacts. This is the stance that Baseline
>>>>>>> <https://github.com/palantir/gradle-baseline> + Gradle Consistent
>>>>>>> Versions <https://github.com/palantir/gradle-consistent-versions>
>>>>>>> take. Gradle Consistent Versions is specifically opinionated towards
>>>>>>> building against one version of a library across all modules in the
>>>>>>> build.
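>>>>>>>
>>>>>>> Concretely, that opinion is expressed through a single versions.props
>>>>>>> file at the root of the build, which pins one version per library (or
>>>>>>> per glob) for every module. A sketch with illustrative entries:
>>>>>>>
>>>>>>>     # versions.props (sketch; entries are illustrative)
>>>>>>>     org.apache.spark:* = 2.4.4
>>>>>>>     com.google.guava:guava = 28.1-jre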
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I would think that branching would be the best way to build and
>>>>>>> publish against multiple versions of a dependency.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -Matt Cheah
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From: *Saisai Shao <sai.sai.s...@gmail.com>
>>>>>>> *Reply-To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>
>>>>>>> *Date: *Tuesday, March 3, 2020 at 5:45 PM
>>>>>>> *To: *Iceberg Dev List <dev@iceberg.apache.org>
>>>>>>> *Cc: *Ryan Blue <rb...@netflix.com>
>>>>>>> *Subject: *Re: [Discuss] Merge spark-3 branch into master
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I hadn't realized that Gradle cannot support two different versions
>>>>>>> in one build. I did something similar for Livy, building Scala 2.10
>>>>>>> and 2.11 jars simultaneously with Maven. I'm not very familiar with
>>>>>>> Gradle, but I can take a shot to see if there's some hacky way to
>>>>>>> make it work.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Besides, are we saying that after the 0.8 release, master will move
>>>>>>> to spark-3 support and replace spark-2, or that we will maintain two
>>>>>>> branches, for spark-2 and spark-3, and make two releases? From my
>>>>>>> understanding, spark-3 adoption may not be fast, and lots of users
>>>>>>> will stick with spark-2 for a while, so ideally we should support
>>>>>>> both versions for the near future.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Saisai
>>>>>>>
>>>>>>> Mass Dosage <massdos...@gmail.com> wrote on Wed, Mar 4, 2020 at 1:33 AM:
>>>>>>>
>>>>>>> +1 for a 0.8.0 release with Spark 2.4, then moving on to Spark 3.0
>>>>>>> when it's ready.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 3 Mar 2020 at 16:32, Ryan Blue <rb...@netflix.com.invalid>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Thanks for bringing this up, Saisai. I tried to do this a couple of
>>>>>>> months ago, but ran into a problem with dependency locks. I couldn't get
>>>>>>> two different versions of Spark packages in the build with baseline, but
>>>>>>> maybe I was missing something. If you can get it working, I think it's a
>>>>>>> great idea to get this into master.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Otherwise, I was thinking about proposing an 0.8.0 release in the
>>>>>>> next month or so based on Spark 2.4. Then we could merge the branch into
>>>>>>> master and do another release for Spark 3.0 when it's ready.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> rb
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 3, 2020 at 6:07 AM Saisai Shao <sai.sai.s...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi team,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I was thinking of merging the spark-3 branch into master; per the
>>>>>>> earlier discussion, we could have spark-2 and spark-3 coexist as two
>>>>>>> different sub-modules. With this, one build could generate both
>>>>>>> spark-2 and spark-3 runtime jars, and users could pick whichever
>>>>>>> they prefer.
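>>>>>>>
>>>>>>> A rough sketch of how the two runtime jars could be produced,
>>>>>>> assuming we keep using the shadow plugin for the runtime jars (module
>>>>>>> and jar names are illustrative):
>>>>>>>
>>>>>>>     // iceberg-spark2-runtime/build.gradle (sketch)
>>>>>>>     apply plugin: 'java'
>>>>>>>     apply plugin: 'com.github.johnrengelman.shadow'
>>>>>>>
>>>>>>>     shadowJar {
>>>>>>>         archiveBaseName = 'iceberg-spark2-runtime'
>>>>>>>     }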
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> One concern is that the two modules share lots of common code in the
>>>>>>> read/write path, which will increase the maintenance overhead of
>>>>>>> keeping the two copies consistent.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> So I'd like to hear your thoughts. Any suggestions?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Saisai
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Software Engineer
>>>>>>> Netflix
>>>>>>>
>>>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>>>
>>>
>
