Re: Contribute Examples/Exercises

2014-11-12 Thread Donald Miner
Sean,

My original thought here was that we could adapt these to fit the mold of
the Accumulo examples that are shipped with core. Does that make any sense?

Either way, the contrib approach seems reasonable as well, just not what I
first thought.

-d

On Tue, Nov 11, 2014 at 12:18 PM, Sean Busbey bus...@cloudera.com wrote:

 I think these would be an excellent addition, given the improvements David
 suggested. Since they span a range of Accumulo uses, they are probably best
 served as a contrib repository[1] rather than an Accumulo version-specific
 example.

 In any case this will involve adopting a non-trivial code base developed
 outside of the project; we'll need to go through the ASF Incubator[2].
 Before that can happen we'll need to call a vote[3].

 [1]: http://accumulo.apache.org/contrib.html
 [2]: http://incubator.apache.org/faq.html#proposed_new_codebase
   http://incubator.apache.org/ip-clearance/index.html
 [3]: http://accumulo.apache.org/bylaws.html#actions

 On Tue, Nov 11, 2014 at 10:38 AM, Josh Elser josh.el...@gmail.com wrote:

  I've given a quick glance over them -- they look like they'd be a great
  addition!
 
  We'd have to figure out some mechanism to distribute the exercises (as we
  can't compile them), but that's a manageable problem.
 
  If you want to open an issue on JIRA, that'd be the first step to get
  these into the codebase. Some things to think about meanwhile:
 
  * Check out the coding practices and code formatting guidelines -
  http://accumulo.apache.org/source.html#coding-practices
  * Add ASL headers to the files
  * Figure out where might be a good place to include these in the Accumulo
  tree  - maybe examples/training?
  * Consider what documentation would be needed for someone to guide
  themselves through these examples
  * Look into redistribution rights on the included twitter.json file. I'm
  not sure what Twitter's terms of service are. It may be easier to write a
  script that will generate some example tweets. It keeps us from being
  liable for what those tweets contain and also prevents us from having to
  distribute a big blob.
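The script idea in the last bullet could be sketched roughly as follows. This is a minimal illustration, not part of the proposed contribution: the field names (`id`, `user`, `text`, `lat`, `lon`), the word list, and the function names are all assumptions made for the example and do not reflect Twitter's actual JSON schema or the contents of twitter.json.

```python
import json
import random

# Hypothetical vocabulary; any word list would do.
WORDS = ["accumulo", "hadoop", "iterator", "tablet",
         "scan", "ingest", "index", "zookeeper"]


def make_tweet(rng, tweet_id):
    """Build one synthetic, tweet-shaped record (field names are illustrative)."""
    text = " ".join(rng.choice(WORDS) for _ in range(rng.randint(3, 8)))
    return {
        "id": tweet_id,
        "user": "user%03d" % rng.randint(0, 999),
        "text": text,
        # Uniform random coordinates; real tweets are not distributed this way.
        "lat": round(rng.uniform(-90.0, 90.0), 6),
        "lon": round(rng.uniform(-180.0, 180.0), 6),
    }


def generate_tweets(count, seed=42):
    """Return `count` synthetic tweets; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [make_tweet(rng, i) for i in range(count)]


if __name__ == "__main__":
    # Emit one JSON object per line, a common layout for bulk-ingest input.
    for tweet in generate_tweets(5):
        print(json.dumps(tweet))
```

Seeding the generator keeps the exercise data reproducible across runs, so nothing needs to be checked in beyond the script itself.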
 
  Thanks again!
 
 
  David Medinets wrote:
 
  Can you add descriptions of the exercises to the README file? Without
  details to catch their interest, many people finding that page would
  move on to the next one.
 
  On Tue, Nov 11, 2014 at 9:49 AM, Chris Rigano chris.p.rig...@gmail.com
  wrote:
 
  I believe they would be of benefit.
 
  On Mon, Nov 10, 2014 at 12:43 PM, Adam J. Shook adamjsh...@gmail.com
  wrote:
 
  I have just finished finalizing the training materials for a basic and
  advanced Accumulo class my company ClearEdge IT offers.  I think some of
  the more advanced code tutorials would be valuable to contribute to the
  Accumulo examples library.
 
  The examples all work with status updates from Twitter and include topics
  such as:
 
  - Basic Reading/Writing
  - Indexing tweets and creating a program to retrieve tweets based on
    given search terms
  - Bulk ingestion of the tweets
  - Using MapReduce to build a geo-index table for the tweets with
    latitude/longitude information via z-points
  - Leveraging the geo-index to retrieve tweets from a given lat/long
    bounding box
  - Custom iterators such as filters and combiners
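
For readers unfamiliar with the geo-index item above: assuming "z-points" refers to Z-order (Morton) curve values, the idea is to quantize latitude and longitude to integers and interleave their bits into a single sortable key. The sketch below is an illustration under that assumption only; the function names, the 16-bit resolution, and the hex row-ID layout are hypothetical and are not taken from the training repository.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return int((clamped - lo) / (hi - lo) * cells)


def interleave(x, y, bits):
    """Interleave bits: x occupies even bit positions, y the odd ones."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z_point(lat, lon, bits=16):
    """Z-order value for a latitude/longitude pair at the given resolution."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    return interleave(x, y, bits)


def z_row_id(lat, lon, bits=16):
    """Fixed-width hex string, so lexicographic order matches numeric order."""
    width = (2 * bits + 3) // 4
    return format(z_point(lat, lon, bits), "0%dx" % width)
```

Because a sorted key-value store orders rows lexicographically, nearby points tend to land in nearby rows. A bounding-box lookup over such keys still has to filter its results, since the z-range between two corners of a box also covers points outside the box.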
 
  If this is something the community would be interested in, please take the
  time to review them at https://github.com/adamjshook/accumulo-training and
  let me know if there are any you think would be worth contributing. I'd be
  happy to take the time to massage them to meet the standards.
 
  Cheers,
  --Adam
 
 


 --
 Sean



Re: Contribute Examples/Exercises

2014-11-12 Thread David Medinets
I'll sound a note of caution. I love examples, but would not want to
adversely affect the build times of Accumulo. Perhaps a link to the
examples GitHub page at http://accumulo.apache.org/papers.html would
suffice?


Re: Contribute Examples/Exercises

2014-11-12 Thread Christopher
+1


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii




Re: Contribute Examples/Exercises

2014-11-12 Thread Christopher
Adapting to the existing examples makes a bit more sense to me. If they're
a contrib, I think they'd be better served as an externally maintained
contrib. Accepting them into the project as an internal contrib implies a
willingness and an obligation to maintain them, and, judging by the state
of the existing internal contrib repos, that maintenance is less likely to
happen there than as part of the build examples or as an externally linked
contrib.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii




Re: Contribute Examples/Exercises

2014-11-12 Thread Josh Elser
Personally, I didn't really think that this contribution was in the 
spirit of what the new codebase adoption guidelines were meant to cover.


Some extra examples which leverage what Accumulo already does seem more 
like improvements for new Accumulo users than anything else.







Re: Contribute Examples/Exercises

2014-11-12 Thread Josh Elser
My first glance at the code didn't raise any concerns about 
significantly impacting the build time.


David Medinets wrote:

I'll sound a note of caution. I love examples, but would not want to
adversely affect the build times of Accumulo. Perhaps a link to the
examples GitHub page at http://accumulo.apache.org/papers.html would
suffice?


Re: Contribute Examples/Exercises

2014-11-12 Thread Sean Busbey
On Wed, Nov 12, 2014 at 12:31 PM, Josh Elser josh.el...@gmail.com wrote:

 Personally, I didn't really think that this contribution was in the spirit
 of what the new codebase adoption guidelines were meant to cover.

 Some extra examples which leverage what Accumulo already does seems more
 like improvements for new Accumulo users than anything else.


It's content developed outside of the project list. That's all it takes to
require the trip through the Incubator checks as far as the ASF guidelines
are concerned.


-- 
Sean


Re: Contribute Examples/Exercises

2014-11-12 Thread Josh Elser







From http://incubator.apache.org/ip-clearance/index.html:


"From time to time, an external codebase is brought into the ASF that is 
not a separate incubating project but still represents a substantial 
contribution that was not developed within the ASF's source control 
system and on our public mailing lists."



Not to look a gift-horse in the mouth (it is great work), but I don't 
see these examples as substantial. I haven't found guidelines yet that 
better clarify the definition of substantial.


Re: Contribute Examples/Exercises

2014-11-12 Thread Josh Elser
I'll take that to mean you disagree with what I consider substantial. 
Thanks.


Mike Drob wrote:

The proposed contribution is a collection of 11 examples. It's clearly
non-trivial, which is probably enough to be considered substantial.






Re: Contribute Examples/Exercises

2014-11-12 Thread Corey Nolet
+1 for adding the examples to contrib.

I was, myself, reading over this email wondering how a set of 11 separate
examples on the use of Accumulo would fit into the core codebase,
especially as more are contributed over time. I like the idea of giving
community members an outlet for contributing examples that they've built so
that we can continue to foster that without having to fit them in the core
codebase. It just seems more maintainable.







Re: Contribute Examples/Exercises

2014-11-12 Thread Josh Elser
My worry with a contrib module is that, historically, code which 
moves to a contrib is just one step away from the grave. I think there's 
precedent for keeping them in core (as Christopher had mentioned, next 
to examples/simple) which would benefit people externally (more "how do 
I do X" examples) and internally (keep devs honest about how our APIs 
are implemented).


Bringing the examples into the core also encourages us to grow the 
community which has been stagnant with respect to new committers for 
about 9 months now.







Re: Contribute Examples/Exercises

2014-11-12 Thread Adam J. Shook
For what it's worth, my intention was to update the examples to fit the ASF
and Accumulo standards and add them as examples of advanced
usage/patterns.  I don't expect all 11 to be contributed, as over half of
them are fairly simple examples showing off the basic Accumulo usage --
examples that are already in the core.  This thread was just a means to see
if it was worth the effort to massage a selection of the examples into the
ASF standards for the core examples library.

--Adam






Re: Contribute Examples/Exercises

2014-11-12 Thread Corey Nolet
Josh,

 My worry with a contrib module is that, historically, code which
moves to a contrib is just one step away from the grave.

You do have a good point. My hope was that this could be the beginning of
our changing history so that we could begin to encourage the community to
contribute their own source directly and give them an outlet for doing so.
I understand that's also the intent of hosting open source repos under ASF
to begin with, so I'm open to either outcome.

 I think there's precedent for keeping them in core (as Christopher had
mentioned, next to examples/simple) which would benefit people externally
(more "how do I do X" examples) and internally (keep devs honest about how
our APIs are implemented).

I would think that would just require keeping the repos up to date as
versions change so they wouldn't get out of date and possibly releasing
them w/ our other releases.


Wherever they end up living, thank you Adam for the contributions!


