Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Joey Echeverria
Cloudera announced last night our support for Accumulo 1.4.3 on CDH4:

http://www.slideshare.net/JoeyEcheverria/apache-accumulo-and-cloudera

This required back porting about 11 patches in whole or in part from the
1.5 line on top of 1.4.3. Our release is still in a semi-private beta, but
when it's fully public it will be downloadable along with all of the extra
patches that we committed.

My question is whether the community would be interested in us contributing
those back ports upstream.

Depending on how we label support for Hadoop 2.0, I believe this could
violate the previously agreed-upon rule of no feature back ports to 1.4.3.

Thoughts?

-Joey


Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Eric Newton
 My question is whether the community would be interested in us contributing
 those back ports upstream.

Yes, please.


Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Joey Echeverria
We have both the unit tests and the full system test suite hooked up to a
Jenkins build server.

A couple of tests still fail intermittently in the full system test due to
timeouts. We're working on those, which is why our current release is just
a beta.

There are no API changes or Accumulo behavior changes. You can use
unmodified 1.4.x clients with our release of the server daemons.

-Joey


On Fri, Jul 26, 2013 at 11:45 AM, Keith Turner ke...@deenlo.com wrote:

 What testing has been done?  It would be nice to run Accumulo's full test
 suite against 1.4.3+CDH4.

 Are there any Accumulo API changes or Accumulo behavior changes?




-- 
Joey Echeverria
Director, Federal FTS
Cloudera, Inc.


Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Keith Turner
On Fri, Jul 26, 2013 at 12:24 PM, Joey Echeverria j...@cloudera.com wrote:

 We have both the unit tests and the full system test suite hooked up to a
 Jenkins build server.


If these patches are going to be included in 1.4.4 or 1.4.5, I would like
to see the following tests run using CDH4 on a cluster of at least 5 nodes.
More nodes would be better.

  * Unit tests
  * Functional tests
  * 24 hr continuous ingest + verification
  * 24 hr continuous ingest + verification + agitation
  * 24 hr random walk
  * 24 hr random walk + agitation

I may be able to assist with this, but I cannot make any promises.



 There are still a couple of tests that fail periodically with the full
 system test due to timeouts. We're working on those which is why our
 current release is just a beta.

 There are no API changes or Accumulo behavior changes. You can use
 unmodified 1.4.x clients with our release of the server daemons.


Great.  I think this would be a good patch for 1.4.  I assume that if a
user stays with Hadoop 1 there are no dependency changes?





Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Joey Echeverria
 If these patches are going to be included in 1.4.4 or 1.4.5, I would like
 to see the following tests run using CDH4 on a cluster of at least 5 nodes.
 More nodes would be better.

   * Unit tests
   * Functional tests
   * 24 hr continuous ingest + verification
   * 24 hr continuous ingest + verification + agitation
   * 24 hr random walk
   * 24 hr random walk + agitation

 I may be able to assist with this, but I cannot make any promises.

Sure thing. Is there already a write-up on running this full battery
of tests? I have a 10 node cluster that I can use for this.


 Great.  I think this would be a good patch for 1.4.  I assume that if a
 user stays with Hadoop 1 there are no dependency changes?

Yup. It works the same way as 1.5 where all of the dependency changes
are in a Hadoop 2.0 profile.

-Joey


Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Billie Rinaldi
On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:

 Yup. It works the same way as 1.5 where all of the dependency changes
 are in a Hadoop 2.0 profile.


In 1.5.0, we gave up on compatibility with Hadoop 0.20 (and early versions
of 1.0) to make the compatibility requirements simpler; we ended up without
dependency changes in the Hadoop version profiles.  Will 1.4 still work
with 0.20 with these patches?  If there are dependency changes in the
profiles, 1.4 would have to be compiled against a Hadoop version compatible
with the running version of Hadoop, correct?  We had some trouble in the
1.5 release process figuring out how to provide multiple binary
artifacts (each compiled against a different version of Hadoop) for the
same release.  Just something we should consider before we are in the midst
of releasing 1.4.4.
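
For illustration, building one binary artifact per Hadoop line might look
like the sketch below. It assumes the 1.4 back port reuses the build
properties from the 1.5 line (hadoop.profile to select the Hadoop 2.0
profile, hadoop.version to pick a Hadoop 1.x or 0.20 release); the thread
does not confirm these property names for 1.4, and build_cmd is a
hypothetical helper. It only prints the Maven command lines and does not
run them.

```shell
#!/bin/sh
# Print the Maven command line for one Hadoop line.
# Assumption: property names follow the 1.5 build (unconfirmed for 1.4).
build_cmd() {
  case "$1" in
    2.0) echo "mvn clean package -Dhadoop.profile=2.0" ;;
    *)   echo "mvn clean package -Dhadoop.version=$1" ;;
  esac
}

# One build per Hadoop version that a release would need to cover.
# The version list is illustrative only.
for v in 0.20.205.0 1.0.4 2.0; do
  build_cmd "$v"
done
```

Each printed command would produce a separate set of binaries, which is the
multiple-artifact problem described above.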

Billie





Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread Keith Turner
On Fri, Jul 26, 2013 at 2:33 PM, Joey Echeverria j...@cloudera.com wrote:


 Sure thing. Is there already a write-up on running this full battery
 of tests? I have a 10 node cluster that I can use for this.


There are some instructions.

test/system/continuous/README
test/system/randomwalk/README

Continuous ingest has a lot of options.  For release testing we do
something like the following.

  # configure first; you may need to adjust max mappers and max reducers to
  # make the MapReduce job run faster
  start-ingest.sh
  start-walker.sh
  # wait 24 hr
  stop-ingest.sh
  stop-walker.sh
  run-verify.sh

The continuous dir has scripts for starting and stopping the agitator.
We also use these scripts to agitate while running the random walk test.
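
As a rough illustration only, the 24 hr continuous ingest sequence above
could be wrapped in one script. This is a sketch, not something in the
Accumulo tree: the run_step and wait_step helpers and the CONTINUOUS_DIR,
RUN_SECONDS, and DRY_RUN variables are hypothetical names, and it assumes
the scripts are run from inside the continuous directory after it has been
configured. By default it only prints the steps.

```shell
#!/bin/sh
# Hypothetical knobs, not part of Accumulo's test scripts.
CONTINUOUS_DIR="${CONTINUOUS_DIR:-test/system/continuous}"
RUN_SECONDS="${RUN_SECONDS:-86400}"   # 24 hr
DRY_RUN="${DRY_RUN:-1}"               # 1 = print the steps, do not run them

# Run one script from the continuous directory, or print it in dry-run mode.
run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $1 (in $CONTINUOUS_DIR)"
  else
    ( cd "$CONTINUOUS_DIR" && "./$1" ) || { echo "step failed: $1" >&2; return 1; }
  fi
}

# Wait for the test duration, or print it in dry-run mode.
wait_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would sleep: $RUN_SECONDS seconds"
  else
    sleep "$RUN_SECONDS"
  fi
}

# Same order as the steps listed above; stops at the first failing step.
continuous_ingest_test() {
  run_step start-ingest.sh &&
  run_step start-walker.sh &&
  wait_step &&
  run_step stop-ingest.sh &&
  run_step stop-walker.sh &&
  run_step run-verify.sh
}

continuous_ingest_test
```

Setting DRY_RUN=0 would actually execute the scripts. For the agitation
runs, the agitator would be started before the wait and stopped after it.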

For random walk we use the All.xml graph, configure it to log errors to
NFS, and run a walker on each node.  We look in NFS for walkers that died
or got stuck.  The random walk framework will log a message if a node in
the graph gets stuck.  It will also log a message when it gets unstuck.






Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-07-26 Thread dlmarion


 Will 1.4 still work with 0.20 with these patches?

Great point, Billie.


