I'm very much partial to the "First" option, as it's far less effort
for approximately the same value (in my opinion, but in light of the
enthusiasm above for hadoop2, I could be very wrong on my assessment
of the value).

I'm going to upload a patch to ACCUMULO-1402 soon (tiny polishing
left), to demonstrate a way to push redundant jars, with an extra
classifier (though I still have to build twice, to avoid
maven-invoker-plugin complexity) for hadoop2-compatible binaries. If
you don't mind, I'll tag you with a request to review that patch, as
I'd like more details about the classifier issues you mention, in
context.
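
For anyone curious about the mechanics, the shape of the approach is
roughly this: a second execution of the maven-jar-plugin attaches an
extra jar under the same GAV, distinguished only by the classifier.
This is a minimal sketch, not the actual ACCUMULO-1402 patch; the
second build pass against hadoop2 is what supplies the classes that
end up in the classified jar.

```xml
<!-- Sketch only (not the ACCUMULO-1402 patch itself): attach a second
     jar with a "hadoop2" classifier alongside the main artifact. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <execution>
      <id>hadoop2-jar</id>
      <goals>
        <goal>jar</goal>
      </goals>
      <configuration>
        <classifier>hadoop2</classifier>
      </configuration>
    </execution>
  </executions>
</plugin>
```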

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Tue, May 14, 2013 at 8:27 PM, Benson Margulies <bimargul...@gmail.com> wrote:
> Maven will malfunction in various entertaining ways if you try to
> change the GAV of the output of the build using a profile.
>
> Maven will malfunction in various entertaining ways if you use
> classifiers on real-live-JAR files that get used as
> real-live-dependencies, because it has no concept of a
> pom-per-classifier.
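>
> Concretely, a consumer of a classified jar declares something like this
> (coordinates illustrative), and Maven resolves transitive dependencies
> from the single POM shared by every classifier of that GAV:
>
> ```xml
> <!-- Illustrative: the hadoop2-classified jar is downloaded, but its
>      transitive dependencies come from the one unclassified POM. -->
> <dependency>
>   <groupId>org.apache.accumulo</groupId>
>   <artifactId>accumulo-core</artifactId>
>   <version>1.5.0</version>
>   <classifier>hadoop2</classifier>
> </dependency>
> ```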
>
> Where does this leave you/us? (I'm not sure that I've earned an 'us'
> recently around here.)
>
> First, I note that 'Apache releases are source releases'. So, one
> resort of scoundrels here would be to support only one hadoop in the
> convenience binaries that get pushed to Maven Central, and let other
> hadoop users take the source release and build for themselves.
>
> Second, I am reduced to suggesting an elaboration of the build in
> which some tool edits poms and runs builds. The maven-invoker-plugin
> could be used to run that, but a plain old script in a plain old
> language might be less painful.
>
> I appreciate that this may not be an appealing contribution to where
> things are, but it might be the best of the evil choices.
>
>
> On Tue, May 14, 2013 at 7:50 PM, John Vines <vi...@apache.org> wrote:
>> The compiled code is compiled code. There are no concerns about
>> dependency resolution. So I see no issues with using the profile to
>> define the GAV, if that is feasible.
>>
>> Sent from my phone, please pardon the typos and brevity.
>> On May 14, 2013 7:47 PM, "Christopher" <ctubb...@apache.org> wrote:
>>
>>> Response to Benson inline, but additional note here:
>>>
>>> It should be noted that the situation will be made worse by the
>>> solution I was considering for ACCUMULO-1402, which would move the
>>> accumulo artifacts, classified by the hadoop2 variant, into the
>>> profiles... meaning they will no longer resolve transitively, as they
>>> did before. I can go into details on that ticket, if needed.
>>>
>>> On Tue, May 14, 2013 at 7:41 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>> > On Tue, May 14, 2013 at 7:36 PM, Christopher <ctubb...@apache.org> wrote:
>>> >> Benson-
>>> >>
>>> >> They produce different byte-code. That's why we're even considering
>>> >> this. ACCUMULO-1402 is the ticket under which our intent is to add
>>> >> classifiers, so that they can be distinguished.
>>> >
>>> > whoops, missed that.
>>> >
>>> > Then how do people succeed in just fixing up their dependencies and using it?
>>>
>>> The specific differences are things like changes from abstract class
>>> to an interface. Apparently importing these does not produce
>>> compatible byte-code, even though the method signature looks the same.
>>>
>>> > In any case, speaking as a Maven-maven, classifiers are absolutely,
>>> > positively, a cure worse than the disease. If you want the details
>>> > just ask.
>>>
>>> Agreed. I just don't see a good alternative here.
>>>
>>> >>
>>> >> All-
>>> >>
>>> >> To Keith's point, I think perhaps all this concern is a non-issue...
>>> >> because as Keith points out, the dependencies in question are marked
>>> >> as "provided", and dependency resolution doesn't occur for provided
>>> >> dependencies anyway... so even if we leave off the profiles, we're in
>>> >> the same boat. Maybe not the boat we should be in... but certainly not
>>> >> a sinking one as I had first imagined. It's as afloat as it was
>>> >> before, when they were not in a profile, but still marked as
>>> >> "provided".
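>>> >>
>>> >> For reference, the declarations in question look roughly like this
>>> >> (version illustrative); "provided" scope is never propagated to
>>> >> consumers during transitive resolution, which is why the boat stays
>>> >> afloat either way:
>>> >>
>>> >> ```xml
>>> >> <!-- Sketch: a provided-scope dependency is compile-time only and is
>>> >>      not passed on transitively, so consumers must declare their own
>>> >>      Hadoop dependency regardless of profiles. -->
>>> >> <dependency>
>>> >>   <groupId>org.apache.hadoop</groupId>
>>> >>   <artifactId>hadoop-client</artifactId>
>>> >>   <version>1.0.4</version>
>>> >>   <scope>provided</scope>
>>> >> </dependency>
>>> >> ```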
>>> >>
>>> >> --
>>> >> Christopher L Tubbs II
>>> >> http://gravatar.com/ctubbsii
>>> >>
>>> >>
>>> >> On Tue, May 14, 2013 at 7:09 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>> >>> It just doesn't make very much sense to me to have two different GAVs
>>> >>> for the very same .class files, just to get different dependencies in
>>> >>> the poms. However, if someone really wanted that, I'd look to make
>>> >>> some scripting that created this downstream from the main build.
>>> >>>
>>> >>>
>>> >>> On Tue, May 14, 2013 at 6:16 PM, John Vines <vi...@apache.org> wrote:
>>> >>>> They're the same currently. I was requesting separate GAVs for hadoop 2.
>>> >>>> It's been on the mailing list and jira.
>>> >>>>
>>> >>>> Sent from my phone, please pardon the typos and brevity.
>>> >>>> On May 14, 2013 6:14 PM, "Keith Turner" <ke...@deenlo.com> wrote:
>>> >>>>
>>> >>>>> On Tue, May 14, 2013 at 5:51 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>> >>>>>
>>> >>>>> > I am a Maven developer, and I'm offering this advice based on my
>>> >>>>> > understanding of the reason why that generic advice is offered.
>>> >>>>> >
>>> >>>>> > If you have different profiles that _build different results_ but
>>> >>>>> > all deliver the same GAV, you have chaos.
>>> >>>>> >
>>> >>>>>
>>> >>>>> What GAV are we currently producing for hadoop 1 and hadoop 2?
>>> >>>>>
>>> >>>>>
>>> >>>>> >
>>> >>>>> > If you have different profiles that test against different
>>> >>>>> > versions of dependencies, but all deliver the same byte code at
>>> >>>>> > the end of the day, you don't have chaos.
>>> >>>>> >
>>> >>>>> >
>>> >>>>> >
>>> >>>>> > On Tue, May 14, 2013 at 5:48 PM, Christopher <ctubb...@apache.org> wrote:
>>> >>>>> > > I think it's interesting that Option 4 seems to be most
>>> >>>>> > > preferred... because it's the *only* option that is explicitly
>>> >>>>> > > advised against by the Maven developers (from the information
>>> >>>>> > > I've read). I can see its appeal, but I really don't think that
>>> >>>>> > > we should introduce an explicit problem for users (one that
>>> >>>>> > > applies even to users of the Hadoop version we directly build
>>> >>>>> > > against... not just those using Hadoop 2... I don't know if
>>> >>>>> > > that point was clear), just to partially support a version of
>>> >>>>> > > Hadoop that is still alpha and has never had a stable release.
>>> >>>>> > >
>>> >>>>> > > BTW, Option 4 was how I had achieved a solution for
>>> >>>>> > > ACCUMULO-1402, but I am reluctant to apply that patch, with
>>> >>>>> > > this issue outstanding, as it may exacerbate the problem.
>>> >>>>> > >
>>> >>>>> > > Another implication for Option 4 (the current "solution") is
>>> >>>>> > > for 1.6.0, with the planned accumulo-maven-plugin... because it
>>> >>>>> > > means that the accumulo-maven-plugin will need to be configured
>>> >>>>> > > like this:
>>> >>>>> > > <plugin>
>>> >>>>> > >   <groupId>org.apache.accumulo</groupId>
>>> >>>>> > >   <artifactId>accumulo-maven-plugin</artifactId>
>>> >>>>> > >   <dependencies>
>>> >>>>> > >    ... all the required hadoop 1 dependencies to make the plugin work,
>>> >>>>> > >    even though this version only works against hadoop 1 anyway...
>>> >>>>> > >   </dependencies>
>>> >>>>> > >   ...
>>> >>>>> > > </plugin>
>>> >>>>> > >
>>> >>>>> > > --
>>> >>>>> > > Christopher L Tubbs II
>>> >>>>> > > http://gravatar.com/ctubbsii
>>> >>>>> > >
>>> >>>>> > >
>>> >>>>> > > On Tue, May 14, 2013 at 5:42 PM, Christopher <ctubb...@apache.org> wrote:
>>> >>>>> > >> I think Option 2 is the best solution for "waiting until we
>>> >>>>> > >> have the time to solve the problem correctly", as it ensures
>>> >>>>> > >> that transitive dependencies work for the stable version of
>>> >>>>> > >> Hadoop, and using Hadoop2 becomes a simple documentation
>>> >>>>> > >> issue: how to apply the patch and rebuild. Option 4 doesn't
>>> >>>>> > >> wait... it explicitly introduces a problem for users.
>>> >>>>> > >>
>>> >>>>> > >> Option 1 is how I'm tentatively thinking about fixing it
>>> >>>>> > >> properly in 1.6.0.
>>> >>>>> > >>
>>> >>>>> > >>
>>> >>>>> > >> --
>>> >>>>> > >> Christopher L Tubbs II
>>> >>>>> > >> http://gravatar.com/ctubbsii
>>> >>>>> > >>
>>> >>>>> > >>
>>> >>>>> > >> On Tue, May 14, 2013 at 4:56 PM, John Vines <vi...@apache.org> wrote:
>>> >>>>> > >>> I'm an advocate of option 4. You say that it's ignoring the
>>> >>>>> > >>> problem, whereas I think it's waiting until we have the time
>>> >>>>> > >>> to solve the problem correctly. Your reasoning is about
>>> >>>>> > >>> standardizing on Maven conventions, but the other options,
>>> >>>>> > >>> while more 'correct' from a Maven standpoint, are a larger
>>> >>>>> > >>> headache for our user base and ourselves. In either case,
>>> >>>>> > >>> we're going to be breaking some sort of convention, and while
>>> >>>>> > >>> that's not good, we should pick the one that's less bad for
>>> >>>>> > >>> US. The important thing here, now, is that the poms work, and
>>> >>>>> > >>> we should go with the method that leaves minimal work for our
>>> >>>>> > >>> end users to utilize them.
>>> >>>>> > >>>
>>> >>>>> > >>> I do agree that 1. is the correct option in the long run.
>>> >>>>> > >>> More specifically, I think it boils down to having a single
>>> >>>>> > >>> module compatibility layer, which is how hbase deals with
>>> >>>>> > >>> this issue. But like you said, we don't have the time to
>>> >>>>> > >>> engineer a proper solution. So let sleeping dogs lie, and we
>>> >>>>> > >>> can revamp the whole system for 1.5.1 or 1.6.0 when we have
>>> >>>>> > >>> the cycles to do it right.
>>> >>>>> > >>>
>>> >>>>> > >>>
>>> >>>>> > >>> On Tue, May 14, 2013 at 4:40 PM, Christopher <ctubb...@apache.org> wrote:
>>> >>>>> > >>>
>>> >>>>> > >>>> So, I've run into a problem with ACCUMULO-1402 that requires
>>> >>>>> > >>>> a larger discussion about how Accumulo 1.5.0 should support
>>> >>>>> > >>>> Hadoop2.
>>> >>>>> > >>>>
>>> >>>>> > >>>> The problem is basically that profiles should not contain
>>> >>>>> > >>>> dependencies, because profiles don't get activated
>>> >>>>> > >>>> transitively. A slide deck by the Maven developers points
>>> >>>>> > >>>> this out as a bad practice... yet it's a practice we rely on
>>> >>>>> > >>>> for our current implementation of Hadoop2 support
>>> >>>>> > >>>> (http://www.slideshare.net/aheritier/geneva-jug-30th-march-2010-maven
>>> >>>>> > >>>> slide 80).
>>> >>>>> > >>>>
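>>> >>>>> > >>>> The pattern in question, schematically (a sketch, not our
>>> >>>>> > >>>> actual POM; version illustrative):
>>> >>>>> > >>>>
>>> >>>>> > >>>> ```xml
>>> >>>>> > >>>> <!-- Anti-pattern: dependencies declared inside a profile are
>>> >>>>> > >>>>      invisible to downstream consumers, because profiles are
>>> >>>>> > >>>>      never activated transitively. -->
>>> >>>>> > >>>> <profile>
>>> >>>>> > >>>>   <id>hadoop2</id>
>>> >>>>> > >>>>   <dependencies>
>>> >>>>> > >>>>     <dependency>
>>> >>>>> > >>>>       <groupId>org.apache.hadoop</groupId>
>>> >>>>> > >>>>       <artifactId>hadoop-client</artifactId>
>>> >>>>> > >>>>       <version>2.0.4-alpha</version>
>>> >>>>> > >>>>     </dependency>
>>> >>>>> > >>>>   </dependencies>
>>> >>>>> > >>>> </profile>
>>> >>>>> > >>>> ```
>>> >>>>> > >>>>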
>>> >>>>> > >>>> What this means is that even if we go through the work of
>>> >>>>> > >>>> publishing binary artifacts compiled against Hadoop2,
>>> >>>>> > >>>> neither our Hadoop1 binaries nor our Hadoop2 binaries will
>>> >>>>> > >>>> be able to transitively resolve any dependencies defined in
>>> >>>>> > >>>> profiles. This has significant implications for user code
>>> >>>>> > >>>> that depends on Accumulo Maven artifacts. Every user will
>>> >>>>> > >>>> essentially have to explicitly add Hadoop dependencies for
>>> >>>>> > >>>> every Accumulo artifact that depends on Hadoop, whether we
>>> >>>>> > >>>> depend on it directly or transitively (they'll have to peek
>>> >>>>> > >>>> into the profiles in our POMs and copy/paste the profile
>>> >>>>> > >>>> into their project). This becomes more complicated when we
>>> >>>>> > >>>> consider how users will try to use things like Instamo.
>>> >>>>> > >>>>
>>> >>>>> > >>>> There are workarounds, but none of them are really pleasant.
>>> >>>>> > >>>>
>>> >>>>> > >>>> 1. The best way to support both major Hadoop APIs is to
>>> >>>>> > >>>> have separate modules with separate dependencies directly in
>>> >>>>> > >>>> the POM. This is a fair amount of work, and in my opinion,
>>> >>>>> > >>>> would be too disruptive for 1.5.0. This solution also gets
>>> >>>>> > >>>> us separate binaries for separate supported versions, which
>>> >>>>> > >>>> is useful.
>>> >>>>> > >>>>
>>> >>>>> > >>>> 2. A second option, and I think the preferred one for
>>> >>>>> > >>>> 1.5.0, is to put a Hadoop2 patch in the branch's contrib
>>> >>>>> > >>>> directory (branches/1.5/contrib) that patches the POM files
>>> >>>>> > >>>> to support building against Hadoop2. (Acknowledgement to
>>> >>>>> > >>>> Keith for suggesting this solution.)
>>> >>>>> > >>>>
>>> >>>>> > >>>> 3. A third option is to fork Accumulo, and maintain two
>>> >>>>> > >>>> separate builds (a more traditional technique). This adds a
>>> >>>>> > >>>> merging nightmare for features/patches, but gets around some
>>> >>>>> > >>>> reflection hacks that we may have been motivated to do in
>>> >>>>> > >>>> the past. I'm not a fan of this option, particularly because
>>> >>>>> > >>>> I don't want to replicate the fork nightmare that has been
>>> >>>>> > >>>> the history of early Hadoop itself.
>>> >>>>> > >>>>
>>> >>>>> > >>>> 4. The last option is to do nothing and to continue to
>>> >>>>> > >>>> build with the separate profiles as we are, and make users
>>> >>>>> > >>>> discover and specify transitive dependencies entirely on
>>> >>>>> > >>>> their own. I think this is the worst option, as it
>>> >>>>> > >>>> essentially amounts to "ignore the problem".
>>> >>>>> > >>>>
>>> >>>>> > >>>> At the very least, it does not seem reasonable to complete
>>> >>>>> > >>>> ACCUMULO-1402 for 1.5.0, given the complexity of this issue.
>>> >>>>> > >>>>
>>> >>>>> > >>>> Thoughts? Discussion? Vote on option?
>>> >>>>> > >>>>
>>> >>>>> > >>>> --
>>> >>>>> > >>>> Christopher L Tubbs II
>>> >>>>> > >>>> http://gravatar.com/ctubbsii
>>> >>>>> > >>>>
>>> >>>>> >
>>> >>>>>
>>>
