Hadoop 2.0 Support for Accumulo 1.4 Branch
Cloudera announced last night our support for Accumulo 1.4.3 on CDH4:

http://www.slideshare.net/JoeyEcheverria/apache-accumulo-and-cloudera

This required back porting about 11 patches, in whole or in part, from the 1.5 line on top of 1.4.3. Our release is still in a semi-private beta, but when it's fully public it will be downloadable along with all of the extra patches that we committed.

My question is if the community would be interested in us pulling those back ports upstream?

I believe this would violate the previously agreed upon rule of no feature back ports to 1.4.3, depending on how we label support for Hadoop 2.0.

Thoughts?

-Joey
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
> My question is if the community would be interested in us pulling those back ports upstream?

Yes, please.
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
We have both the unit tests and the full system test suite hooked up to a Jenkins build server. There are still a couple of tests that fail periodically with the full system test due to timeouts. We're working on those, which is why our current release is just a beta.

There are no API changes or Accumulo behavior changes. You can use unmodified 1.4.x clients with our release of the server daemons.

-Joey

On Fri, Jul 26, 2013 at 11:45 AM, Keith Turner ke...@deenlo.com wrote:
> On Fri, Jul 26, 2013 at 11:02 AM, Joey Echeverria j...@cloudera.com wrote:
> > My question is if the community would be interested in us pulling those back ports upstream?
>
> What testing has been done? It would be nice to run Accumulo's full test suite against 1.4.3+CDH4. Are there any Accumulo API changes or Accumulo behavior changes?

--
Joey Echeverria
Director, Federal FTS
Cloudera, Inc.
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
On Fri, Jul 26, 2013 at 12:24 PM, Joey Echeverria j...@cloudera.com wrote:
> We have both the unit tests and the full system test suite hooked up to a Jenkins build server.

If these patches are going to be included with 1.4.4 or 1.4.5, I would like to see the following tests run using CDH4 on at least a 5 node cluster. More nodes would be better.

* unit test
* Functional test
* 24 hr Continuous ingest + verification
* 24 hr Continuous ingest + verification + agitation
* 24 hr Random walk
* 24 hr Random walk + agitation

I may be able to assist with this, but I cannot make any promises.

> There are still a couple of tests that fail periodically with the full system test due to timeouts. We're working on those which is why our current release is just a beta.
>
> There are no API changes or Accumulo behavior changes. You can use unmodified 1.4.x clients with our release of the server daemons.

Great. I think this would be a good patch for 1.4. I assume that if a user stays with Hadoop 1 there are no dependency changes?
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
> If these patches are going to be included with 1.4.4 or 1.4.5, I would like to see the following test run using CDH4 on at least a 5 node cluster. More nodes would be better.
>
> * unit test
> * Functional test
> * 24 hr Continuous ingest + verification
> * 24 hr Continuous ingest + verification + agitation
> * 24 hr Random walk
> * 24 hr Random walk + agitation
>
> I may be able to assist with this, but I can not make any promises.

Sure thing. Is there already a write-up on running this full battery of tests? I have a 10 node cluster that I can use for this.

> Great. I think this would be a good patch for 1.4. I assume that if a user stays with Hadoop 1 there are no dependency changes?

Yup. It works the same way as 1.5 where all of the dependency changes are in a Hadoop 2.0 profile.

-Joey
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:
> > Great. I think this would be a good patch for 1.4. I assume that if a user stays with Hadoop 1 there are no dependency changes?
>
> Yup. It works the same way as 1.5 where all of the dependency changes are in a Hadoop 2.0 profile.

In 1.5.0, we gave up on compatibility with Hadoop 0.20 (and early versions of 1.0) to make the compatibility requirements simpler; we ended up without dependency changes in the Hadoop version profiles. Will 1.4 still work with 0.20 with these patches?

If there are dependency changes in the profiles, 1.4 would have to be compiled against a Hadoop version compatible with the running version of Hadoop, correct? We had some trouble in the 1.5 release process with figuring out how to provide multiple binary artifacts (each compiled against a different version of Hadoop) for the same release. Just something we should consider before we are in the midst of releasing 1.4.4.

Billie
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
On Fri, Jul 26, 2013 at 2:33 PM, Joey Echeverria j...@cloudera.com wrote:
> Sure thing. Is there already a write-up on running this full battery of tests? I have a 10 node cluster that I can use for this.

There are some instructions:

  test/system/continuous/README
  test/system/randomwalk/README

Continuous ingest has a lot of options. For release testing we do something like the following.

  # configure (may need to adjust max mappers and max reducers to make the map reduce job run faster)
  start-ingest.sh
  start-walker.sh
  # sleep 24hr
  stop-ingest.sh
  stop-walker.sh
  run-verify.sh

The continuous dir has scripts for starting and stopping the agitator. We also use these scripts to agitate while running the random walk test.

For random walk we use the All.xml graph, configure it to log errors to NFS, and run a walker on each node. We look in NFS for walkers that died or got stuck. The random walk framework will log a message if a node in the graph gets stuck. It will also log a message when it gets unstuck.
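
The continuous ingest sequence described above could be wrapped roughly as follows. This is only a sketch, not part of the Accumulo test suite: the script names come from the message, while the `CONTINUOUS_DIR`, `RUN_SECONDS` and `DRY_RUN` variables and the `run_step` and `continuous_ingest_run` functions are hypothetical names introduced for illustration. It also assumes the scripts live in test/system/continuous and take no arguments, and it skips the configuration step entirely.

```shell
#!/bin/sh
# Sketch of the 24 hr continuous ingest + verification run.
# Assumptions (not from the thread): the three variables below are
# hypothetical knobs, and the scripts are assumed to need no arguments.

CONTINUOUS_DIR="${CONTINUOUS_DIR:-test/system/continuous}"
RUN_SECONDS="${RUN_SECONDS:-86400}"   # 24 hours
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the steps

run_step() {
    # Print the step; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

continuous_ingest_run() {
    # Start ingest and walkers; give up early if either fails to start.
    run_step "$CONTINUOUS_DIR/start-ingest.sh" || return 1
    run_step "$CONTINUOUS_DIR/start-walker.sh" || return 1
    # Let the cluster run under load for the configured duration.
    run_step sleep "$RUN_SECONDS"
    # Stop the load, then verify what was ingested.
    run_step "$CONTINUOUS_DIR/stop-ingest.sh"
    run_step "$CONTINUOUS_DIR/stop-walker.sh"
    run_step "$CONTINUOUS_DIR/run-verify.sh"
}
```

With the default `DRY_RUN=1`, calling `continuous_ingest_run` only prints the six steps in order, which is a cheap way to check paths before committing a cluster for a day. Setting `DRY_RUN=0` runs them for real. Agitation would be started and stopped separately with the agitator scripts in the same directory.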
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
> Will 1.4 still work with 0.20 with these patches?

Great point, Billie.

----- Original Message -----
From: Billie Rinaldi billie.rina...@gmail.com
To: dev@accumulo.apache.org
Sent: Friday, July 26, 2013 3:02:41 PM
Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

> In 1.5.0, we gave up on compatibility with 0.20 (and early versions of 1.0) to make the compatibility requirements simpler; we ended up without dependency changes in the hadoop version profiles. Will 1.4 still work with 0.20 with these patches? If there are dependency changes in the profiles, 1.4 would have to be compiled against a hadoop version compatible with the running version of hadoop, correct? We had some trouble in the 1.5 release process with figuring out how to provide multiple binary artifacts (each compiled against a different version of hadoop) for the same release. Just something we should consider before we are in the midst of releasing 1.4.4.
>
> Billie