Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-08 Thread Christopher
On Fri, Jul 8, 2016 at 5:05 PM Sean Busbey  wrote:

> On Fri, Jul 8, 2016 at 3:40 PM, Christopher  wrote:
> > On Fri, Jul 8, 2016 at 11:20 AM Sean Busbey  wrote:
> >> Would we be bumping the Hadoop version while incrementing our minor
> >> version number or our major version number?
> >>
> >>
> >>
> > Minor only, because it's not a breaking change necessarily, and it's
> > unrelated to API. It'd still be reasonable for somebody to patch the 1.x
> > version to use the earlier Hadoop/HTrace versions easily.
> >
> > Specifically, I was thinking for 1.8.0. Since H2.8 isn't out yet, that'd
> > mean either no change in 1.8.0, or a change to make it sync up with H2.7.
>
> My only concern would be that updating our listed Hadoop dependency
> version would make it easy for someone to accidentally rely on a
> Hadoop API call that wasn't in earlier versions, which would then make
> it harder for an interested person to patch their 1.y version to use
> the earlier Hadoop version.
>
> HBase checks compilation against different hadoop versions in their
> precommit checks. We could add something like that to our nightly
> builds maybe?
>
> Now that we're discussing it, I can't actually remember if we ever
> documented what version(s) of Hadoop we expect to work with. So maybe
> updating to the latest minor release of 2.y on each Accumulo 1.y minor
> release can just be our new thing.
>
> --
> busbey
>

I don't know that we'd have to update every time... but we can certainly
make it a point to consider prior to releasing.

Personally, I'm okay with newer versions of Accumulo requiring newer
versions of Hadoop, and using newer APIs which don't work on older Hadoops.
What we release is a baseline anyway... if users have specific needs for
specific deployments, they may have to do some
backporting/patching/dependency convergence/integration, and I think that's
okay. We can even try to help them along on the mailing lists when this
occurs. I don't think it's reasonable for us to try to make long-term
guarantees about being able to run on such a wide range of versions of
Hadoop. It's just not tenable to do that sort of thing upstream. We can be
cognizant and helpful, but sometimes it's easier to keep development going
by moving to newer deps.


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-08 Thread Josh Elser

Sean Busbey wrote:
> On Fri, Jul 8, 2016 at 3:40 PM, Christopher wrote:
>> On Fri, Jul 8, 2016 at 11:20 AM Sean Busbey wrote:
>>> Would we be bumping the Hadoop version while incrementing our minor
>>> version number or our major version number?
>>
>> Minor only, because it's not a breaking change necessarily, and it's
>> unrelated to API. It'd still be reasonable for somebody to patch the 1.x
>> version to use the earlier Hadoop/HTrace versions easily.
>>
>> Specifically, I was thinking for 1.8.0. Since H2.8 isn't out yet, that'd
>> mean either no change in 1.8.0, or a change to make it sync up with H2.7.
>
> My only concern would be that updating our listed Hadoop dependency
> version would make it easy for someone to accidentally rely on a
> Hadoop API call that wasn't in earlier versions, which would then make
> it harder for an interested person to patch their 1.y version to use
> the earlier Hadoop version.
>
> HBase checks compilation against different hadoop versions in their
> precommit checks. We could add something like that to our nightly
> builds maybe?
>
> Now that we're discussing it, I can't actually remember if we ever
> documented what version(s) of Hadoop we expect to work with. So maybe
> updating to the latest minor release of 2.y on each Accumulo 1.y minor
> release can just be our new thing.


It was 2.2.0 for the longest time. I feel like we switched it to 
2.6.something when we ran into some issues with UGI+Kerberos.


1.8.0 seems to be good timing for this (which is likely Christopher's
reason for bringing it up now).


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-08 Thread Sean Busbey
On Fri, Jul 8, 2016 at 3:40 PM, Christopher  wrote:
> On Fri, Jul 8, 2016 at 11:20 AM Sean Busbey  wrote:
>> Would we be bumping the Hadoop version while incrementing our minor
>> version number or our major version number?
>>
>>
>>
> Minor only, because it's not a breaking change necessarily, and it's
> unrelated to API. It'd still be reasonable for somebody to patch the 1.x
> version to use the earlier Hadoop/HTrace versions easily.
>
> Specifically, I was thinking for 1.8.0. Since H2.8 isn't out yet, that'd
> mean either no change in 1.8.0, or a change to make it sync up with H2.7.

My only concern would be that updating our listed Hadoop dependency
version would make it easy for someone to accidentally rely on a
Hadoop API call that wasn't in earlier versions, which would then make
it harder for an interested person to patch their 1.y version to use
the earlier Hadoop version.

HBase checks compilation against different hadoop versions in their
precommit checks. We could add something like that to our nightly
builds maybe?
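
A rough sketch of what such a nightly check could look like, in Python. This is only an illustration, not how HBase does it: the version list is made up for the example, and it assumes the build picks up the Hadoop version from a `hadoop.version` Maven property, which would need to be confirmed against the actual pom.

```python
import subprocess

# Illustrative list of Hadoop versions to compile against (an assumption,
# not a list agreed on in this thread).
HADOOP_VERSIONS = ["2.2.0", "2.6.4", "2.7.2"]


def compile_command(hadoop_version):
    """Build the Maven command line for one Hadoop version.

    Assumes the pom reads the Hadoop version from a `hadoop.version`
    property; change the property name if the build uses a different one.
    """
    return ["mvn", "-q", "clean", "compile",
            "-Dhadoop.version=" + hadoop_version]


def check_all(versions, run=subprocess.call):
    """Run the compile once per version and return the versions that failed.

    `run` takes a command list and returns its exit code; it defaults to
    subprocess.call and can be swapped out to dry-run the loop.
    """
    failed = []
    for version in versions:
        if run(compile_command(version)) != 0:
            failed.append(version)
    return failed
```

A nightly job could call `check_all(HADOOP_VERSIONS)` from the source root and fail the build if the returned list is non-empty.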

Now that we're discussing it, I can't actually remember if we ever
documented what version(s) of Hadoop we expect to work with. So maybe
updating to the latest minor release of 2.y on each Accumulo 1.y minor
release can just be our new thing.
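
If that became the rule, picking the version to depend on is mechanical. A small sketch, assuming plain dotted numeric version strings; the function names and the sample list are hypothetical and not part of any existing tooling:

```python
def parse(version):
    """Turn '2.7.2' into (2, 7, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_in_line(released, major):
    """Return the highest released version with the given major number.

    Returns None when no release in that major line is in the list.
    """
    candidates = [v for v in released if parse(v)[0] == major]
    return max(candidates, key=parse) if candidates else None
```

For example, `latest_in_line(["2.2.0", "2.6.4", "2.7.2"], 2)` gives `"2.7.2"`. Comparing parsed tuples rather than raw strings keeps a version like 2.10.0 ordered after 2.7.2.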

-- 
busbey


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-08 Thread Christopher
On Fri, Jul 8, 2016 at 11:20 AM Sean Busbey  wrote:

> On Thu, Jul 7, 2016 at 4:42 PM, Christopher  wrote:
> > Ah, my mistake. I thought it was 2.7 and later. Well, then I guess the
> > question is whether we should bump to 2.8, then. I'm not a fan of the shim
> > layer. I'd rather provide support for downstream packagers trying to
> > backport for HTrace3, if anybody ends up requiring that, than provide a
> > shim to preserve use of the older HTrace.
> >
>
> Hadoop 2.8 isn't out yet, though it now has no blockers listed in
> JIRA. We could ask the Hadoop community what their current thoughts
> are on timing. Hadoop 3 has a release manager that has said an initial
> alpha release is "close", so maybe we'd be dealing with that first.
>
>
Oh, I see. Well, in that case, I think we should stick with whatever ships
with 2.7, and if it's not the same as 2.6, we should bump our dependency on
Hadoop to 2.7 to stay in sync.


>
> > On Thu, Jul 7, 2016 at 5:30 PM Billie Rinaldi 
> > wrote:
> >
> >> I'm in favor of bumping our Hadoop version to 2.7. We are already on the
> >> same htrace version as Hadoop 2.7. (The discussion in ACCUMULO-4171 is
> >> relevant to Hadoop 2.8 and later.)
> >>
> >> Billie
> >>
> >> On Thu, Jul 7, 2016 at 2:20 PM, Christopher wrote:
> >>
> >> > Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
> >> > the opinion that we should probably bump our Hadoop version to 2.7 and
> >> > HTrace version to what Hadoop is using, to keep them in sync.
> >> >
> >> > Does anybody disagree?
> >> >
> >>
>
> Would we be bumping the Hadoop version while incrementing our minor
> version number or our major version number?
>
>
>
Minor only, because it's not a breaking change necessarily, and it's
unrelated to API. It'd still be reasonable for somebody to patch the 1.x
version to use the earlier Hadoop/HTrace versions easily.

Specifically, I was thinking for 1.8.0. Since H2.8 isn't out yet, that'd
mean either no change in 1.8.0, or a change to make it sync up with H2.7.


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-08 Thread Sean Busbey
On Fri, Jul 8, 2016 at 10:19 AM, Sean Busbey  wrote:
> On Thu, Jul 7, 2016 at 4:42 PM, Christopher  wrote:
>> Ah, my mistake. I thought it was 2.7 and later. Well, then I guess the
>> question is whether we should bump to 2.8, then. I'm not a fan of the shim
>> layer. I'd rather provide support for downstream packagers trying to
>> backport for HTrace3, if anybody ends up requiring that, than provide a
>> shim to preserve use of the older HTrace.
>>
>
> Hadoop 2.8 isn't out yet, though it now has no blockers listed in
> JIRA. We could ask the Hadoop community what their current thoughts
> are on timing. Hadoop 3 has a release manager that has said an initial
> alpha release is "close", so maybe we'd be dealing with that first.
>

I take this back: it looks like HADOOP-12893 was reopened and is
blocking releases again, at least for the 2.y.z releases.

(That issue tracks getting LICENSE/NOTICE fixed for Hadoop.)

-- 
busbey


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-08 Thread Sean Busbey
On Thu, Jul 7, 2016 at 4:42 PM, Christopher  wrote:
> Ah, my mistake. I thought it was 2.7 and later. Well, then I guess the
> question is whether we should bump to 2.8, then. I'm not a fan of the shim
> layer. I'd rather provide support for downstream packagers trying to
> backport for HTrace3, if anybody ends up requiring that, than provide a
> shim to preserve use of the older HTrace.
>

Hadoop 2.8 isn't out yet, though it now has no blockers listed in
JIRA. We could ask the Hadoop community what their current thoughts
are on timing. Hadoop 3 has a release manager that has said an initial
alpha release is "close", so maybe we'd be dealing with that first.


> On Thu, Jul 7, 2016 at 5:30 PM Billie Rinaldi 
> wrote:
>
>> I'm in favor of bumping our Hadoop version to 2.7. We are already on the
>> same htrace version as Hadoop 2.7. (The discussion in ACCUMULO-4171 is
>> relevant to Hadoop 2.8 and later.)
>>
>> Billie
>>
>> On Thu, Jul 7, 2016 at 2:20 PM, Christopher  wrote:
>>
>> > Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
>> > the opinion that we should probably bump our Hadoop version to 2.7 and
>> > HTrace version to what Hadoop is using, to keep them in sync.
>> >
>> > Does anybody disagree?
>> >
>>

Would we be bumping the Hadoop version while incrementing our minor
version number or our major version number?


-- 
busbey


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-07 Thread Christopher
I'm sure I know some people trying to use Accumulo+HDFS tracing, and it's
going to cause a problem no matter what, because Hadoop and Accumulo aren't
always upgraded at the same time. I just want to make sure it gets better
at some point, if both are sufficiently up-to-date.

Backporting patches to support specific users in custom environments isn't
a big deal, I think, so long as those backports don't have to be maintained
indefinitely, and the conflict will be resolved at some point in the
roadmap.

On Thu, Jul 7, 2016 at 6:07 PM Billie Rinaldi 
wrote:

> Ah, that makes more sense. I would be fine with bumping the htrace
> dependency to match the most recent version of Hadoop that we support and
> not doing a shim layer. We might want to check in with any users who are
> using the Accumulo+HDFS tracing to see if this would be a problem for them.
> I am not sure if anyone is using it or not.
>
> Billie
>
> On Thu, Jul 7, 2016 at 2:42 PM, Christopher  wrote:
>
> > Ah, my mistake. I thought it was 2.7 and later. Well, then I guess the
> > question is whether we should bump to 2.8, then. I'm not a fan of the shim
> > layer. I'd rather provide support for downstream packagers trying to
> > backport for HTrace3, if anybody ends up requiring that, than provide a
> > shim to preserve use of the older HTrace.
> >
> > On Thu, Jul 7, 2016 at 5:30 PM Billie Rinaldi 
> > wrote:
> >
> > > I'm in favor of bumping our Hadoop version to 2.7. We are already on the
> > > same htrace version as Hadoop 2.7. (The discussion in ACCUMULO-4171 is
> > > relevant to Hadoop 2.8 and later.)
> > >
> > > Billie
> > >
> > > On Thu, Jul 7, 2016 at 2:20 PM, Christopher wrote:
> > >
> > > > Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
> > > > the opinion that we should probably bump our Hadoop version to 2.7 and
> > > > HTrace version to what Hadoop is using, to keep them in sync.
> > > >
> > > > Does anybody disagree?
> > > >
> > >
> >
>


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-07 Thread Billie Rinaldi
Ah, that makes more sense. I would be fine with bumping the htrace
dependency to match the most recent version of Hadoop that we support and
not doing a shim layer. We might want to check in with any users who are
using the Accumulo+HDFS tracing to see if this would be a problem for them.
I am not sure if anyone is using it or not.

Billie

On Thu, Jul 7, 2016 at 2:42 PM, Christopher  wrote:

> Ah, my mistake. I thought it was 2.7 and later. Well, then I guess the
> question is whether we should bump to 2.8, then. I'm not a fan of the shim
> layer. I'd rather provide support for downstream packagers trying to
> backport for HTrace3, if anybody ends up requiring that, than provide a
> shim to preserve use of the older HTrace.
>
> On Thu, Jul 7, 2016 at 5:30 PM Billie Rinaldi 
> wrote:
>
> > I'm in favor of bumping our Hadoop version to 2.7. We are already on the
> > same htrace version as Hadoop 2.7. (The discussion in ACCUMULO-4171 is
> > relevant to Hadoop 2.8 and later.)
> >
> > Billie
> >
> > On Thu, Jul 7, 2016 at 2:20 PM, Christopher  wrote:
> >
> > > Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
> > > the opinion that we should probably bump our Hadoop version to 2.7 and
> > > HTrace version to what Hadoop is using, to keep them in sync.
> > >
> > > Does anybody disagree?
> > >
> >
>


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-07 Thread Christopher
Ah, my mistake. I thought it was 2.7 and later. Well, then I guess the
question is whether we should bump to 2.8, then. I'm not a fan of the shim
layer. I'd rather provide support for downstream packagers trying to
backport for HTrace3, if anybody ends up requiring that, than provide a
shim to preserve use of the older HTrace.

On Thu, Jul 7, 2016 at 5:30 PM Billie Rinaldi 
wrote:

> I'm in favor of bumping our Hadoop version to 2.7. We are already on the
> same htrace version as Hadoop 2.7. (The discussion in ACCUMULO-4171 is
> relevant to Hadoop 2.8 and later.)
>
> Billie
>
> On Thu, Jul 7, 2016 at 2:20 PM, Christopher  wrote:
>
> > Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
> > the opinion that we should probably bump our Hadoop version to 2.7 and
> > HTrace version to what Hadoop is using, to keep them in sync.
> >
> > Does anybody disagree?
> >
>


Re: [DISCUSS] Htrace4, Hadoop 2.7

2016-07-07 Thread Billie Rinaldi
I'm in favor of bumping our Hadoop version to 2.7. We are already on the
same htrace version as Hadoop 2.7. (The discussion in ACCUMULO-4171 is
relevant to Hadoop 2.8 and later.)

Billie

On Thu, Jul 7, 2016 at 2:20 PM, Christopher  wrote:

> Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
> the opinion that we should probably bump our Hadoop version to 2.7 and
> HTrace version to what Hadoop is using, to keep them in sync.
>
> Does anybody disagree?
>


[DISCUSS] Htrace4, Hadoop 2.7

2016-07-07 Thread Christopher
Thinking about https://issues.apache.org/jira/browse/ACCUMULO-4171, I'm of
the opinion that we should probably bump our Hadoop version to 2.7 and
HTrace version to what Hadoop is using, to keep them in sync.

Does anybody disagree?