Re: Any objections to git hosting for Incubator projects?

2011-12-07 Thread Leo Neumeyer
Also in S4, everyone agrees. It was discussed in the dev mailing list.

-leo

On Mon, Dec 5, 2011 at 2:02 PM, Ross Gardler rgard...@opendirective.com wrote:
 Sent from my mobile device, please forgive errors and brevity.
 On Dec 5, 2011 10:21 AM, Mark Struberg strub...@yahoo.de wrote:

 +1 for DeltaSpike
 I think the other requests over at asf-infra also did come from Mentors
 (as far as I have seen).


 Correct for Callback. My proposal links to the dev list thread in which all
 mentors agree to help.

 Ross


 LieGrue,
 strub


 - Original Message -
  From: Bertrand Delacretaz bdelacre...@apache.org
  To: general@incubator.apache.org; Joe Schaefer joe_schae...@yahoo.com
  Cc:
  Sent: Monday, December 5, 2011 9:57 AM
  Subject: Re: Any objections to git hosting for Incubator projects?
 
  Hi Joe,
 
  On Sat, Dec 3, 2011 at 6:46 PM, Joe Schaefer joe_schae...@yahoo.com
  wrote:
   So earlier this week infrastructure put out an
   RFP regarding early adoption of git hosting at
   the ASF and 3 Incubator projects have responded:
   callback, s4, and deltaspike...
 
  Very cool.
 
 
   Unless there are formal objections to such submissions
   infrastructure will evaluate their proposals just
   as if they came from the IPMC itself
 
  I'm ok as long as you have evidence (via messages or votes on public
  mailing lists) that those podlings' mentors support those requests.
 
  -Bertrand
 
  -
  To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
  For additional commands, e-mail: general-h...@incubator.apache.org
 





-- 

Leo Neumeyer (@leoneu)




Re: Git to SVN

2011-11-09 Thread Leo Neumeyer
cross-posting to incubator in case someone there also had to go through the
git to svn migration and can share their experience.

Is there a process we can follow to put s4 in the beta git program? I know
this can create additional work for the infra folks, but it seems we may
already have a beta system, so perhaps we can just try it and help improve it.
We all benefit.

-leo

On Tue, Nov 8, 2011 at 11:48 PM, Bruce Robbins
bruce.robbins...@gmail.com wrote:

 I am the fellow trying to convert the S4 git repository to SVN with
 history preserved. I gave the following a whirl:

 http://code.google.com/p/support/wiki/ImportingFromGit
 http://sandrotosi.blogspot.com/2010/02/migrate-git-repo-to-svn-one.html

 http://blog.paulbetts.org/index.php/2007/12/02/moving-git-repositories-to-svntfs/

 The first two basically had the same issue: it required me to manually
 resolve all merges that had been performed in the history (and we have
 many). The third one just produced a series of error messages.

 I have not tried the Perl script yet. I will contact Jukka Zitting.

 However, I agree with Jeremy: it would be great if we didn't have to
 preserve history from git to svn, and then sometime later from svn back to
 git.

 On Tue, Nov 8, 2011 at 6:08 AM, Jeremy Thomerson 
 jer...@thomersonfamily.com wrote:

 On Tue, Nov 8, 2011 at 12:26 AM, Paul Davis paul.joseph.da...@gmail.com
 wrote:

   Since so many projects seem to be migrating from svn to git, is there
   any way we can continue using git for this project? (I mean use git only,
   not as a mirror.)
 
  While I can't comment on best practices for Git-to-SVN moving, I'd
  just like to chime in to note that "so many projects" is not a good
  description of the current state of Git support at the ASF. Currently
  there is a single project using it in a limited manner as a test for
  wider support. There is no proposed time table for if/when Git support
  will be expanded so I would proceed as if it doesn't exist.
 

 Is it possible that this project could be added as a beta tester of Git?
 It really doesn't make sense to make them migrate git-svn-git.  If the
 Git experiment fails, then they do a single migration.  If it succeeds, no
 migration necessary.

 Jeremy Thomerson





-- 

Leo Neumeyer (@leoneu)


Re: Git to SVN

2011-11-09 Thread Leo Neumeyer
I'm curious if you also discussed using a hosted service like GitHub for
the projects. Seems to me that it would save us so much in resources and
time to take advantage of their free accounts for open source projects and
they seem to be doing a pretty good job. Perhaps there are concerns about
relying on a third party but for an organization powered by volunteers,
using a free service like this could be a great benefit. Perhaps there are
good reasons why this cannot be done, but I was wondering if it was discussed
at all.

-leo

On Wed, Nov 9, 2011 at 10:46 AM, Christian Grobmeier grobme...@gmail.com wrote:

 On Wed, Nov 9, 2011 at 7:38 PM, Jeremy Thomerson
 jer...@thomersonfamily.com wrote:
 
  On Wed, Nov 9, 2011 at 1:30 PM, Christian Grobmeier grobme...@gmail.com
 
  wrote:
 
  In addition you can ask on wave-dev. They have finally decided to
  lose history and do an initial checkin because it was not easily
  possible to convert.
 
  Why are we doing this on any project when we potentially have a git
  solution around the corner?  Why not add these projects to the git
  experiment and then only take this drastic action if the git experiment
  fails?

 This discussion was already held a while ago. There are many
 questions to resolve first.

 Cheers,
 Christian
 
  Jeremy Thomerson
 



 --
 http://www.grobmeier.de




-- 

Leo Neumeyer (@leoneu)


Re: [VOTE] S4 to join the Incubator

2011-09-28 Thread Leo Neumeyer
 with ZooKeeper. He has been an active
 contributor to Hadoop.
 * Flavio Junqueira has a background in distributed computing. He is a
 committer of ZooKeeper, a ZooKeeper PMC member, and a committer of
 BookKeeper;
 * Matthieu Morel has extensive background in distributed systems; he
 likes theory and loves to implement things. He has been the main
 designer and implementor of S4 checkpointing.
 * Anish Nair has been the project’s main customer. With his background
 in natural language processing and algorithms he developed the
 applications that drove the S4 design, including processing of social
 feeds and real-time recommendation engines.
 * Leo Neumeyer has a background in signal processing and statistical
 modeling but has been advocating clean simple software design
 throughout his career. At Yahoo! he conceived and championed the S4
 project as a solution to improve monetization in search advertising.
 * Bruce Robbins has been the main S4 developer, taking the concept
 from idea to releases. Bruce's engineering experience ranges from
 programming mainframe computers to assembly code.
 
 === Alignment ===
 
 S4 brings stream processing capabilities that complement Hadoop's
 batch processing capabilities.
 
 == Known Risks ==
 
 === Orphaned Products ===
 
 S4 has been used in production at Yahoo! and is being evaluated by
 other organizations. The developers have continued to support the
 project on their own time. We believe that adoption will increase
 significantly as more tools and documentation become available. As the
 project evolves, we may see new ideas that we may want to adopt or, if
 it makes sense and is practical, we may want to merge two or more open
 source projects. We believe that there is a clear need to have a well
 supported open source stream processing platform and therefore, there
 is low risk of the project becoming orphaned. However, we are open to
 combining projects in order to have fewer projects with a more active
 community. Ultimately, this will be decided by the design ideas, the
 implementation quality, and the adoption.
 
 === Inexperience with Open Source ===
 
 The S4 code was open sourced by Yahoo! under Apache 2.0 license. One
 committer of the S4 project, Flavio Junqueira, is intimately familiar
 with the Apache model for open-source development and is experienced
 with working with new contributors.  Flavio is both a committer and a PMC
 member for ZooKeeper. The other developers have had experience as
 contributors in other open-source projects. Most of the original S4
 developers continue to be committers.
 
 === Homogeneous Developers ===
 
 The initial set of committers for S4 represent four different
 companies: A9, LinkedIn, Quantbench, and Yahoo!. This set is diverse
 enough for a starting project.
 
 === Reliance on Salaried Developers ===
 
 Some committers are contributing as part of their jobs, but as we move
 to a more diverse set of developers we expect a good mix of salaried
 and volunteer time.
 
 === Relationships with Other Apache Projects ===
 
 S4 relies on the following Apache projects:
 
 * BCEL (bytecode generation library)
 * commons cli (command line interface)
 * commons logging (needed by some other dependency)
 * log4j
 * commons jexl (expression processing)
 * zookeeper
 * Maven and its usual plug-ins (build time only)
 
 Compared to existing projects, S4 complements existing functionality
 in a few ways summarized below:
 * Flume: S4 processes streams in a distributed fashion and enables
 applications to form arbitrary graphs of processing elements. Flume
 focuses on accumulating streams of logs in a centralized repository for
 batch processing;
 * Kafka: Kafka is a pub/sub messaging layer that interposes
 generation of events and processing, while S4 itself forwards events
 and processes them in a stream fashion.
 * Hadoop: Hadoop focuses on batch processing of large data sets,
 while S4 is a platform for stream processing of events. We would like
 to implement extensions that enable processing in both platforms with
 the same code.
 
 === An Excessive Fascination with the Apache Brand ===
 
 The project has already received a significant amount of attention and
 so far has been associated with Yahoo!. We would like, however, to
 foster the development of a community around S4 that evolves
 independently of the interests of a single company. Given the reliance
 of S4 on some Apache projects and the principles promoted by the
 foundation, we find it a suitable home for the project.
 
 == Documentation ==
 
 * S4 Website: http://s4.io
 * S4 documentation: http://docs.s4.io/
 * S4 Forum: http://groups.google.com/group/s4-project/topics
 * S4 Mailing list (with archives): http://groups.google.com/group/s4-project
 
 == Source and Intellectual Property Submission Plan ==
 
 The S4 source code is already licensed under Apache Software License
 2.0

Re: [VOTE] S4 to join the Incubator

2011-09-26 Thread Leo Neumeyer
 use Hadoop by segmenting the input stream into data batches.
 This solution is not efficient, results in high latency, and
 introduces unnecessary complexity.
 
 The S4 design is primarily driven by large scale applications for data
 mining and machine learning in a production environment. We think that
 the S4 design is surprisingly flexible and lends itself to run in
 large clusters built with commodity hardware.
 
 S4 enables application programmers to focus more on the application
 and less on the infrastructure. S4 also provides a consistent graph
 oriented programming model that, if widely adopted, will facilitate
 sharing of basic components across developers.
 
 == Initial Goals ==
 
 The basic S4 infrastructure is complete and can be used in real-world
 applications. However, many additional components need to be developed
 and improved. Some areas we hope to focus on in Apache:
 
  * Add a reliable communication protocol option to the communication
 layer for low bandwidth control messages that require guaranteed
 delivery.
  * Higher-performance serialization and inter-node communication.
  * Functionality to save the state of PEs at runtime transparently and
 restore it at startup.
  * Intelligent load shedding strategies.
  * Dynamic load balancing to make it possible to add and remove nodes
 from the cluster without data loss.
  * Dynamic application loading and unloading.
  * Migration to a pure object-oriented design that takes advantage of
 Java static typing using Generics in the framework code. (Keep it
 simple for the application developer.)
  * Eliminate string identifiers and XML configuration.
  * Adopt JSR 330 (Dependency Injection for Java).
  * Add real-time query support.
  * Add a cluster management system.
 
 Clearly this is a long list but sets the high level roadmap for the project.
 
 == Current Status ==
 
 The project has been under development at Yahoo! since late 2008, and
 it was open sourced in October 2010. Since then we have received
 patches from developers, started a discussion forum, and improved the
 documentation.
 
 === Meritocracy ===
 
 The S4 project was initially developed at Yahoo! Labs, a
 research-oriented organization that values original ideas and
 individual contributions. The design evolved in a bottom up fashion,
 where decisions were driven by the application and the long-term
 viability and flexibility of the platform. Once the project became
 open-source it continued to be managed by those who were actively
 doing the work.
 
 === Community ===
 
 S4 is currently in use internally at Yahoo!, and since it was released
 as an open source project it has received positive feedback and
 contributions from developers.
 
 === Core Developers ===
 
 S4 developers span a few companies and work on a voluntary basis. We
 expect to have developers from other organizations joining the team in
 the next few months, especially if S4 joins the Apache Incubator
 project. Being an Apache Incubator project is likely to attract the
 attention of more talented developers.
 
 One interesting aspect of the current group of developers is the
 diverse background:
 
  * Kishore Gopalakrishna was the main developer of the communication
 layer and the integration with ZooKeeper. He has been an active
 contributor to Hadoop.
  * Flavio Junqueira has a background in distributed computing. He is a
 committer of ZooKeeper, a ZooKeeper PMC member, and a committer of
 BookKeeper;
  * Matthieu Morel has extensive background in distributed systems; he
 likes theory and loves to implement things. He has been the main
 designer and implementor of S4 checkpointing.
  * Anish Nair has been the project’s main customer. With his background
 in natural language processing and algorithms he developed the
 applications that drove the S4 design, including processing of social
 feeds and real-time recommendation engines.
  * Leo Neumeyer has a background in signal processing and statistical
 modeling but has been advocating clean simple software design
 throughout his career. At Yahoo! he conceived and championed the S4
 project as a solution to improve monetization in search advertising.
  * Bruce Robbins has been the main S4 developer, taking the concept
 from idea to releases. Bruce's engineering experience ranges from
 programming mainframe computers to assembly code.
 
 === Alignment ===
 
 S4 brings stream processing capabilities that complement Hadoop's
 batch processing capabilities.
 
 == Known Risks ==
 
 === Orphaned Products ===
 
 S4 has been used in production at Yahoo! and is being evaluated by
 other organizations. The developers have continued to support the
 project on their own time. We believe that adoption will increase
 significantly as more tools and documentation become available. As the
 project evolves, we may see new ideas that we may want to adopt or, if
 it makes sense and is practical, we may want to merge two or more open
 source projects. We believe

Re: [PROPOSAL] S4 for the Apache Incubator

2011-09-15 Thread Leo Neumeyer
Phil and all,

Great discussion and so happy you want to join the team. No need to apologize !!

My feeling is that if someone wants to join the project as a contributor and 
has technical merit he or she will become a committer pretty quickly. I think 
that having a minimal protocol is useful to make sure people get to know each 
other and the project. In fact, the current policy seems good to me: 
http://incubator.apache.org/guides/participation.html
I love the DO-ocracy concept, and it seems to be the best way to become a
committer.

So I propose that those who are interested and can volunteer some time start
thinking about how to contribute. If the project is accepted we will discuss the
details on the mailing list.

Thanks again!
-leo

On Sep 15, 2011, at 5:54 PM, Phillip Rhodes wrote:

 On Thu, Sep 15, 2011 at 5:34 PM, Flavio Junqueira f...@s4.io wrote:
 
 I have read the guide to participation:
 
http://incubator.apache.org/guides/participation.html
 
 and I understand from there that people shouldn't simply jump in as an
 initial committer without a short introduction and without acknowledgment
 from the proposer.
 
 Since I was one of those people, let me issue a mea culpa here.  Despite
 having read the participation guidelines (more than once even) I apparently
 slipped into a bit of a conditioned response, from observed behavior.  For
 better or worse, it has become not uncommon (in my experience) to see
 people simply jump in and add themselves.  In retrospect, yes, it probably
 is a bit rude, and I apologize for my part in this.
 
 I suppose it's just what Roy said in 2006: everybody
 saw a certain process appearing to happen, assumed it was policy and
 didn't give it any further thought.  Guess I'm guilty of that.
 
 
 Our expectation when we submitted the proposal was that the initial set of
 committers would comprise the people who have initially contributed to get
 the current code to this stage, and we were not expecting arbitrary requests
 to join the initial list of committers.
 
 While jumping in is - as we've already established - in bad taste, I
 *think* that (most|any|some) projects entering incubation should expect
 such requests. Part of the focus of the incubator, as I've understood it,
 is to promote sufficient diversity in the community and the team, that no
 one block of people can kill the project by dropping out or whatever.
 Having new initial committers that have no outstanding connection to the
 project is one way to achieve that. In this regard, the incubation period
 is radically different from other times in the project lifecycle.
 Or, again, that has been my understanding.
 
 Then again, maybe it only appears that way because some projects make
 it a point to appeal to people *to* join in as initial committers.
 
 Of course, as a potential Apache
 project (now potentially incubator, but looking forward to being TLP in the
 future), we are ready to work towards building a community, which includes
 granting the status of committer to contributors. However, we'd like new
 committers to earn their status by showing commitment to the community and
 demonstrating technical merit.
 
 Absolutely, and entering the incubator is the only time - AFAIK - that
 projects here tend to take a slightly different stance.  It's all about
 seeding the initial pool before the project gets underway.   That said,
 I'm not sure projects are required to accept any additional initial
 committers beyond what the proposer suggests.
 
 
 For my own part, I'll just say that I'm excited about S4, very happy
 to volunteer to help, and if you guys want me, I'm in.  If not, take me
 off the list and it'll all be cool.  FSM knows, I have plenty of stuff
 to keep me occupied already.  ;-)
 
 As far as introduction goes...  Well, I founded Fogbeam Labs, started
 the ScrewPile project to develop an OSS suite of Enterprise Knowledge
 Management software. I've been a professional software engineer for the
 past 12-13 years, working mostly in Java, but some C, C++, Python
 and Groovy as well.  If anyone wants to know more about me, just ask,
 or see:  https://plus.google.com/u/1/114301088526097505896/about
 
 
 Cheers,
 
 
 Phil
 
 





[PROPOSAL] S4 for the Apache Incubator

2011-09-14 Thread Leo Neumeyer
Dear all,

I would like to propose S4 to be an Apache Incubator project.  S4 is a
distributed streaming platform written in Java.

Here is a link to the proposal in the Incubator wiki:
http://wiki.apache.org/incubator/S4Proposal

Thanks,
Leo Neumeyer

http://s4.io
http://twitter.com/leoneu
