Re: [VOTE] accept Pig into Incubator
+1

On 25/09/2007, Doug Cutting [EMAIL PROTECTED] wrote:

I would like to call the Incubator PMC to vote to incubate the proposed Pig project. Discussion on this list evidenced broad interest in this project, which bodes well for its ability to build a diverse developer community.

http://wiki.apache.org/incubator/PigProposal

+1

Doug

---

= Proposal for Pig Project =

== Abstract ==

Pig is a platform for analyzing large data sets.

== Proposal ==

The Pig project consists of high-level languages for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.

At the present time, Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties:

 1. ''Ease of programming''. It is trivial to achieve parallel execution of simple, embarrassingly parallel data analysis tasks. Complex tasks composed of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
 2. ''Optimization opportunities''. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
 3. ''Extensibility''. Users can create their own functions to do special-purpose processing.

== Background ==

Pig started as a research project at Yahoo! in May of 2006 to combine ideas in parallel databases and distributed computing. The first internal release took place in July 2006. The first release was a simple front-end to the Hadoop Map/Reduce framework. The following releases added new features and evolved the language based on user feedback. In July 2007, Pig was taken over by a development team, and the first production version is due to be released on 9/28/07. Since its inception, we have observed a steady growth of the user community within Yahoo!.

In April 2007, Pig was released under a BSD-type license. Several external parties are using this version and have expressed interest in collaborating on its development.

== Rationale ==

In an information-centric world, innovation is driven by ad-hoc analysis of large data sets. For example, search engine companies routinely deploy and refine services based on analyzing the recorded behavior of users, publishers, and advertisers. The rate of innovation depends on the efficiency with which data can be analyzed.

To analyze large data sets efficiently, one needs parallelism. The cheapest and most scalable form of parallelism is cluster computing. Unfortunately, programming for a cluster computing environment is difficult and time-consuming. Pig makes it easy to harness the power of cluster computing for ad-hoc data analysis.

While other languages exist that try to achieve the same goals, we believe that Pig provides more flexibility and gives more control to the end user.

SQL typically requires (1) importing data from a user's preferred format into a database system's internal format, (2) well-structured, normalized data with a declared schema, and (3) programs expressed in declarative SELECT-FROM-WHERE blocks. In contrast, Pig Latin facilitates (1) interoperability, i.e. data may be read/written in a format accepted by other applications such as text editors or graph generators, (2) flexibility, i.e. data may be loosely structured or have structure that is defined operationally, and (3) adoption by programmers who find procedural programming more natural than declarative programming.

Sawzall is a scripting language used at Google on top of Map-Reduce. A Sawzall program has a fairly rigid structure consisting of a filtering phase (the map step) followed by an aggregation phase (the reduce step). Furthermore, only the filtering phase can be written by the user, and only a pre-built set of aggregations is available (new ones are non-trivial to add). While Pig Latin has similar higher-level primitives like filtering and aggregation, an arbitrary number of them can be flexibly chained together in a Pig Latin program, and all primitives can use user-defined functions with equal ease. Further, Pig Latin has additional primitives, such as cogrouping, that allow operations such as joins (which require multiple programs in Sawzall) to be written in a single line in Pig Latin. Further, Pig Latin is designed to be embedded into other languages, and can use functions written in other languages. Thus, in contrast to Sawzall, it directly caters to a
Re: [VOTE] accept Pig into Incubator
On Wednesday 26 September 2007 01:20, Doug Cutting wrote: I would like to call the Incubator PMC to vote to incubate the proposed Pig project. Discussion on this list evidenced broad interest in this project, which bodes well for its ability to build a diverse developer community. http://wiki.apache.org/incubator/PigProposal +1 Cheers -- Niclas Hedhman, Software Developer I live here; http://tinyurl.com/2qq9er I work here; http://tinyurl.com/2ymelc I relax here; http://tinyurl.com/2cgsug - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
Noel J. Bergman wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! Ahh, now I understand why you've been trying to get me on the mail list :) The biggest departure I know of was not in the incubator; it was the implementation of bits of WS-RF that HP was doing under WS (what was it, Apache Muse?). Suddenly corporate priorities got changed and all FTEs got reassigned to something else. It just sat there for a while before IBM took up the challenge with a port to Axis2. Similarly, there was a bit of stutter in Axis1 when the IBM team suddenly dropped off the net. There were lots of other active developers, but there were whole swathes of things like Java-to-WSDL code that came from IBM and which the others suddenly needed to learn, because until then that area had been well covered by the IBM folk, but not outstandingly well documented. At least they provided lots of tests, which does make it easier for others to take on the maintenance task: it reduces the amount of damage done while learning. It seems to me, then, that the problem is more than just in-incubator. -steve
Re: [VOTE] accept Pig into Incubator
On Sep 25, 2007, at 1:20 PM, Doug Cutting wrote: I would like to call the Incubator PMC to vote to incubate the proposed Pig project. Discussion on this list evidenced broad interest in this project, which bodes well for its ability to build a diverse developer community. http://wiki.apache.org/incubator/PigProposal +1 +1
Lucene4C proposal
Hi All! We're interested in following the work on the Lucene4C project at http://incubator.apache.org/lucene4c/ What are the requirements? Best Regards - INEODEV Team S.A.R.L INEODEV 9, rue Med Sekir, H.Dey, Alger, Algérie Mobile: +213 70 493773 Fax: +213 17 036623 [EMAIL PROTECTED] - This e-mail may contain confidential information and is intended solely for the addressee, and any disclosure of this information is strictly prohibited and may be unlawful. If you have received this e-mail by mistake, please notify us immediately and delete this mail.
Interested in contributing to Lokahi
Hi, I would like to commit to Lokahi, but I am unable to find relevant information. Kindly let me know if and how I can. -- Thanks and Regards, Sonal
Re: Interested in contributing to Lokahi
Sonal, On 9/26/07, Sonal Goyal [EMAIL PROTECTED] wrote: I would like to commit to Lokahi, but I am unable to find relevant information. Kindly let me know if and how I can. The first thing to do is join the lokahi-dev mailing list. If you haven't already read the introduction for new contributors at http://www.apache.org/dev/#committers, please do. After that, start scratching your itch by sending in patches attached to JIRA issues at https://issues.apache.org/jira/browse/LOKAHI. Yoav
Re: Interested in contributing to Lokahi
There seem to be no JIRA issues opened? On 9/26/07, Yoav Shapira [EMAIL PROTECTED] wrote: Sonal, On 9/26/07, Sonal Goyal [EMAIL PROTECTED] wrote: I would like to commit to Lokahi, but I am unable to find relevant information. Kindly let me know if and how I can. The first thing to do is join the lokahi-dev mailing list. If you haven't already read the introduction for new contributors at http://www.apache.org/dev/#committers, please do. After that, start scratching your itch by sending in patches attached to JIRA issues at https://issues.apache.org/jira/browse/LOKAHI. Yoav -- Luciano Resende Apache Tuscany Committer http://people.apache.org/~lresende http://lresende.blogspot.com/
Re: Interested in contributing to Lokahi
Hi, On 9/26/07, Luciano Resende [EMAIL PROTECTED] wrote: There seem to be no JIRA issues opened? That's right. It's strange, but reflective of the low activity level on the project at this time. As people contribute more and use the product more, I'm sure new JIRA issues will be opened. Yoav
Re: Interested in contributing to Lokahi
Any release plans? On 9/26/07, Yoav Shapira [EMAIL PROTECTED] wrote: Hi, On 9/26/07, Luciano Resende [EMAIL PROTECTED] wrote: There seem to be no JIRA issues opened? That's right. It's strange, but reflective of the low activity level on the project at this time. As people contribute more and use the product more, I'm sure new JIRA issues will be opened. Yoav -- Luciano Resende Apache Tuscany Committer http://people.apache.org/~lresende http://lresende.blogspot.com/
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On 9/25/07, Guillaume Nodet [EMAIL PROTECTED] wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. +1 There's more of an issue IMO with projects that don't come thru the incubator, since they don't have to meet the Incubator's stringent graduation requirement. As an example - Tapestry was pushed out to a TLP from Jakarta, but the following blog from a Tapestry committer doesn't make good reading from a community PoV: http://agileskills2.org/blog/2007/09/my_thoughts_on_the_differences.html Niall On 9/25/07, Noel J. Bergman [EMAIL PROTECTED] wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! --- Noel
RE: Interested in contributing to Lokahi
Any release plans? That would best be asked on the project's own mailing list. :-) Right now, they are discussing a release after merging MySQL support, because right now Lokahi requires Oracle, which is a non-starter for most users. --- Noel
Re: Lucene4C proposal
What do you have in mind with lucene4c? That project was closed down a while ago. There is also the goal of Lucy http://lucene.apache.org/lucy/ that hasn't materialized yet either. Erik On Sep 26, 2007, at 9:10 AM, [EMAIL PROTECTED] wrote: Hi All! We're interested in following the work on the Lucene4C project at http://incubator.apache.org/lucene4c/ What are the requirements? Best Regards - INEODEV Team S.A.R.L INEODEV 9, rue Med Sekir, H.Dey, Alger, Algérie Mobile: +213 70 493773 Fax: +213 17 036623 [EMAIL PROTECTED]
Re: [VOTE] accept Pig into Incubator
I would like to express my interest in using and contributing to the Pig project. +1 Bhupesh Nigel Daley-2 wrote: On Sep 25, 2007, at 10:26 AM, Doug Cutting wrote: I would like to call the Incubator PMC to vote to incubate the proposed Pig project. Discussion on this list evidenced broad interest in this project, which bodes well for its ability to build a diverse developer community. http://wiki.apache.org/incubator/PigProposal +1 Nigel
Re: [VOTE] accept Pig into Incubator
On 9/25/07, Doug Cutting [EMAIL PROTECTED] wrote: I would like to call the Incubator PMC to vote to incubate the proposed Pig project. +1 -- Gianugo Rabellino Sourcesense, making sense of Open Source: http://www.sourcesense.com (blogging at http://www.rabellino.it/blog/)