Re: Contribute Examples/Exercises
Sean,

My original thought here was that we could adapt these to fit the mold of the Accumulo examples that are shipped with core. Does that make any sense? Either way, the contrib approach seems reasonable as well, just not what I first thought.

-d

On Tue, Nov 11, 2014 at 12:18 PM, Sean Busbey bus...@cloudera.com wrote:

I think these would be an excellent addition, given the improvements David suggested. Since they span a range of Accumulo uses, they are probably best served as a contrib repository[1] rather than an Accumulo version-specific example.

In any case this will involve adopting a non-trivial code base developed outside of the project; we'll need to go through the ASF Incubator[2]. Before that can happen we'll need to call a vote[3].

[1]: http://accumulo.apache.org/contrib.html
[2]: http://incubator.apache.org/faq.html#proposed_new_codebase
     http://incubator.apache.org/ip-clearance/index.html
[3]: http://accumulo.apache.org/bylaws.html#actions

On Tue, Nov 11, 2014 at 10:38 AM, Josh Elser josh.el...@gmail.com wrote:

I've given a quick glance over them -- they look like they'd be a great addition! We'd have to figure out some mechanism to distribute the exercises (as we can't compile them), but that's a manageable problem.

If you want to open an issue on JIRA, that'd be the first step to get these into the codebase. Some things to think about meanwhile:

* Check out the coding practices and code formatting guidelines - http://accumulo.apache.org/source.html#coding-practices
* Add ASL headers to the files
* Figure out where might be a good place to include these in the Accumulo tree - maybe examples/training?
* Consider what documentation would be needed for someone to guide themselves through these examples
* Look into redistribution rights on the included twitter.json file. I'm not sure what Twitter's terms of service are. It may be easier to write a script that will generate some example tweets. It keeps us from being liable for what those tweets contain and also prevents us from having to distribute a big blob.

Thanks again!

David Medinets wrote:

Can you add descriptions of the exercises to the README file? Many people finding that page would move to the next one with details to catch their interest.

On Tue, Nov 11, 2014 at 9:49 AM, Chris Rigano chris.p.rig...@gmail.com wrote:

I believe they would be of benefit.

On Mon, Nov 10, 2014 at 12:43 PM, Adam J. Shook adamjsh...@gmail.com wrote:

I had just finished finalizing the training materials for a basic and advanced Accumulo class my company ClearEdge IT offers. I think some of the more advanced code tutorials would be valuable to contribute to the Accumulo examples library.

The examples all work with status updates from Twitter and include topics such as:

- Basic Reading/Writing
- Indexing tweets and creating a program to retrieve tweets based on given search terms
- Bulk ingestion of the tweets
- Using MapReduce to build a geo-index table for the tweets with latitude/longitude information via z-points
- Leveraging the geo-index to retrieve tweets from a given lat/long bounding box
- Custom iterators such as filters and combiners

If this is something the community would be interested in, please take the time to review them at https://github.com/adamjshook/accumulo-training and let me know if there are any you think would be worth contributing. I'd be happy to take the time to massage them to meet the standards.

Cheers,
--Adam

--
Sean
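
Editorial sketch: Josh's suggestion above, generating example tweets with a script instead of redistributing twitter.json, might look roughly like the following. This is only an illustration under stated assumptions: the field names (id, user, text, lat, lon), the word list, and the function names are all hypothetical and are not taken from the training repository or from Twitter's actual JSON format.

```python
import json
import random

# Hypothetical vocabulary; any word list would do.
WORDS = ["accumulo", "iterator", "tablet", "scan",
         "ingest", "index", "range", "mutation"]


def make_tweet(rng, tweet_id):
    """Build one synthetic status update as a plain dict.

    The field names are illustrative assumptions; they are not
    guaranteed to match Twitter's JSON or the training materials.
    """
    text = " ".join(rng.choice(WORDS) for _ in range(rng.randint(3, 8)))
    return {
        "id": tweet_id,
        "user": "user{:04d}".format(rng.randint(0, 9999)),
        "text": text,
        # Random point so geo-index exercises still have coordinates.
        "lat": round(rng.uniform(-90.0, 90.0), 5),
        "lon": round(rng.uniform(-180.0, 180.0), 5),
    }


def generate_tweets(count, seed=42):
    """Return `count` synthetic tweets; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [make_tweet(rng, i) for i in range(count)]


if __name__ == "__main__":
    # One JSON object per line, similar in spirit to a tweet dump.
    for tweet in generate_tweets(5):
        print(json.dumps(tweet))
```

Because the data is generated from a seed, nothing needs to be redistributed and every student sees the same input, which addresses both the licensing concern and the "big blob" concern raised above.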
Re: Contribute Examples/Exercises
I'll sound a note of caution. I love examples, but would not want to adversely affect the build times of Accumulo. Perhaps a link to the examples GitHub page at http://accumulo.apache.org/papers.html would suffice?
Re: Contribute Examples/Exercises
+1

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed, Nov 12, 2014 at 12:57 PM, David Medinets david.medin...@gmail.com wrote:

I'll sound a note of caution. I love examples, but would not want to adversely affect the build times of Accumulo. Perhaps a link to the examples GitHub page at http://accumulo.apache.org/papers.html would suffice?
Re: Contribute Examples/Exercises
Adapting to the existing examples makes a bit more sense to me. If they're a contrib, I think they'd be better served as an externally maintained contrib. If we accept them into the project as an internal contrib, that kind of reflects a willingness and an obligation to maintain them, and that's less likely as an internal contrib than as part of the build examples or as an externally linked contrib, I think (from observing the state of the existing internal contrib repos).

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed, Nov 12, 2014 at 11:18 AM, Donald Miner dmi...@clearedgeit.com wrote:

Sean,

My original thought here was that we could adapt these to fit the mold of the Accumulo examples that are shipped with core. Does that make any sense? Either way, the contrib approach seems reasonable as well, just not what I first thought.

-d
Re: Contribute Examples/Exercises
Personally, I didn't really think that this contribution was in the spirit of what the new codebase adoption guidelines were meant to cover. Some extra examples which leverage what Accumulo already does seem more like improvements for new Accumulo users than anything else.

Donald Miner wrote:

Sean,

My original thought here was that we could adapt these to fit the mold of the Accumulo examples that are shipped with core. Does that make any sense? Either way, the contrib approach seems reasonable as well, just not what I first thought.

-d
Re: Contribute Examples/Exercises
My first glance at the code didn't raise any concerns about significantly impacting the build time.

David Medinets wrote:

I'll sound a note of caution. I love examples, but would not want to adversely affect the build times of Accumulo. Perhaps a link to the examples GitHub page at http://accumulo.apache.org/papers.html would suffice?
Re: Contribute Examples/Exercises
On Wed, Nov 12, 2014 at 12:31 PM, Josh Elser josh.el...@gmail.com wrote:

Personally, I didn't really think that this contribution was in the spirit of what the new codebase adoption guidelines were meant to cover. Some extra examples which leverage what Accumulo already does seem more like improvements for new Accumulo users than anything else.

It's content developed outside of the project list. That's all it takes to require the trip through the Incubator checks as far as the ASF guidelines are concerned.

--
Sean
Re: Contribute Examples/Exercises
Sean Busbey wrote:

On Wed, Nov 12, 2014 at 12:31 PM, Josh Elser josh.el...@gmail.com wrote:

Personally, I didn't really think that this contribution was in the spirit of what the new codebase adoption guidelines were meant to cover. Some extra examples which leverage what Accumulo already does seem more like improvements for new Accumulo users than anything else.

It's content developed outside of the project list. That's all it takes to require the trip through the Incubator checks as far as the ASF guidelines are concerned.

From http://incubator.apache.org/ip-clearance/index.html:

"From time to time, an external codebase is brought into the ASF that is not a separate incubating project but still represents a substantial contribution that was not developed within the ASF's source control system and on our public mailing lists."

Not to look a gift-horse in the mouth (it is great work), but I don't see these examples as substantial. I haven't found guidelines yet that better clarify the definition of substantial.
Re: Contribute Examples/Exercises
I'll take that as you disagree with my consideration of substantial. Thanks.

Mike Drob wrote:

The proposed contribution is a collection of 11 examples. It's clearly non-trivial, which is probably enough to be considered substantial.

On Wed, Nov 12, 2014 at 12:58 PM, Josh Elser josh.el...@gmail.com wrote:

Not to look a gift-horse in the mouth (it is great work), but I don't see these examples as substantial. I haven't found guidelines yet that better clarify the definition of substantial.
Re: Contribute Examples/Exercises
+1 for adding the examples to contrib. I was, myself, reading over this email wondering how a set of 11 separate examples on the use of Accumulo would fit into the core codebase, especially as more are contributed over time. I like the idea of giving community members an outlet for contributing examples that they've built so that we can continue to foster that without having to fit them in the core codebase. It just seems more maintainable.

On Wed, Nov 12, 2014 at 2:19 PM, Josh Elser josh.el...@gmail.com wrote:

I'll take that as you disagree with my consideration of substantial. Thanks.
Re: Contribute Examples/Exercises
My worry with a contrib module is that, historically, code which moves to a contrib is just one step away from the grave. I think there's precedent for keeping them in core (as Christopher had mentioned, next to examples/simple) which would benefit people externally (more "how do I do X" examples) and internally (keep devs honest about how our APIs are implemented).

Bringing the examples into the core also encourages us to grow the community which has been stagnant with respect to new committers for about 9 months now.

Corey Nolet wrote:

+1 for adding the examples to contrib. I was, myself, reading over this email wondering how a set of 11 separate examples on the use of Accumulo would fit into the core codebase, especially as more are contributed over time. I like the idea of giving community members an outlet for contributing examples that they've built so that we can continue to foster that without having to fit them in the core codebase. It just seems more maintainable.
Re: Contribute Examples/Exercises
For what it's worth, my intention was to update the examples to fit the ASF and Accumulo standards and add them as examples of advanced usage/patterns. I don't expect all 11 to be contributed, as over half of them are fairly simple examples showing off the basic Accumulo usage -- examples that are already in the core.

This thread was just a means to see if it was worth the effort to massage a selection of the examples into the ASF standards for the core examples library.

--Adam

On Wed, Nov 12, 2014 at 2:54 PM, Josh Elser josh.el...@gmail.com wrote:

My worry with a contrib module is that, historically, code which moves to a contrib is just one step away from the grave. I think there's precedent for keeping them in core (as Christopher had mentioned, next to examples/simple) which would benefit people externally (more "how do I do X" examples) and internally (keep devs honest about how our APIs are implemented).

Bringing the examples into the core also encourages us to grow the community which has been stagnant with respect to new committers for about 9 months now.
Re: Contribute Examples/Exercises
Josh,

"My worry with a contrib module is that, historically, code which moves to a contrib is just one step away from the grave."

You do have a good point. My hope was that this could be the beginning of our changing history so that we could begin to encourage the community to contribute their own source directly and give them an outlet for doing so. I understand that's also the intent of hosting open source repos under ASF to begin with, so I'm partial to either outcome.

"I think there's precedent for keeping them in core (as Christopher had mentioned, next to examples/simple) which would benefit people externally (more 'how do I do X' examples) and internally (keep devs honest about how our APIs are implemented)."

I would think that would just require keeping the repos up to date as versions change so they wouldn't get out of date and possibly releasing them w/ our other releases.

Wherever they end up living, thank you Adam for the contributions!