Re: Cleaning up documentation from website

2011-05-12 Thread Eli Collins
On Wednesday, May 11, 2011, Ian Holsman wrote: > I think we need 0.18 docs as some people are still running this. > other than that (and what is 'current'.. it should point to the stable > release imho) I'm good with it. There are some non-trivial 0.19 clusters running too. What is our defi

Re: OT: anyone else going to berlin buzzwords?

2011-05-12 Thread Doug Cutting
I'll be there, but only for the conference on Monday and Tuesday. Doug On 05/11/2011 12:31 PM, Bernd Fondermann wrote: > On Wed, May 11, 2011 at 11:08, Steve Loughran wrote: >> On 11/05/2011 00:53, Ian Holsman wrote: >>> >>> If so.. I'll be there.. let's catch up. >>> >>> http://www.berlinbuzzwo

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Steve Loughran
On 11/05/2011 22:24, Eric Baldeschwieler wrote: This is a really interesting topic! I completely agree that we need to get ahead of this. I would be really interested in learning of any experience other apache projects, such as apache or tomcat have with these issues. I don't know about apa

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Steve Loughran
On 12/05/2011 03:26, M. C. Srivas wrote: While the HCK is a great idea to check quickly if an implementation is "compliant", we still need a written specification to define what is meant by compliance, something akin to a set of RFC's, or a set of docs like the IEEE POSIX specifications. For

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Steve Loughran
On 12/05/2011 00:20, Aaron Kimball wrote: What does it mean to "implement" those interfaces? I'm +1 for a TCK-based definition. In addition to statically implementing a set of interfaces, each interface also implicitly includes a set of acceptable inputs and predicted outputs (or ranges of output

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Segel, Mike
While IANAL... As long as any implementation follows Apache's license regarding derivative works, it's fair game. (this is my understanding YMMV) The APL is very liberal in what one can do with a derivative work... Surely Apache has some lawyers who can summarize what is allowable when talking

Re: OT: anyone else going to berlin buzzwords?

2011-05-12 Thread Ted Dunning
I will be there Monday-Thursday. Monday sounds good for food and beverages. On Thu, May 12, 2011 at 2:13 AM, Doug Cutting wrote: > I'll be there, but only for the conference on Monday and Tuesday. > > Doug > > On 05/11/2011 12:31 PM, Bernd Fondermann wrote: > > On Wed, May 11, 2011 at 11:08, St

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Milind Bhandarkar
HCK and written specifications are not mutually exclusive. However, given the evolving nature of Hadoop APIs, functional tests need to evolve as well, and having them tied to a "current stable" version is easier to do than it is to tie the written specifications. - milind -- Milind Bhandarkar mb

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Allen Wittenauer
On May 12, 2011, at 2:23 AM, Steve Loughran wrote: > I think Sun NFS might be a good example of a similar de facto standard, or MS > SMB - it is up to others to show they are compatible with what is effectively > the reference implementation. Being closed source, there is no option for > anyone to in

Re: OT: anyone else going to berlin buzzwords?

2011-05-12 Thread Bernd Fondermann
I will try to determine a location nearby the buzz, make a reservation and post to the appropriate lists. Bernd On Thu, May 12, 2011 at 17:44, Ted Dunning wrote: > I will be there Monday-Thursday. > > Monday sounds good for food and beverages. > > On Thu, May 12, 2011 at 2:13 AM, Doug Cutting

Re: OT: anyone else going to berlin buzzwords?

2011-05-12 Thread Ted Dunning
Actually, I just heard from Isabel that there will be a Buzzwords barbecue that evening. Should we all just meet there and then adjourn for beers after dinner? On Thu, May 12, 2011 at 11:02 AM, Bernd Fondermann < bernd.fonderm...@googlemail.com> wrote: > I will try to determine a location nearby

Re: OT: anyone else going to berlin buzzwords?

2011-05-12 Thread Bernd Fondermann
On Thu, May 12, 2011 at 20:12, Ted Dunning wrote: > Actually, I just heard from Isabel that there will be a Buzzwords barbecue > that evening. Sorry about that, I checked the programme and pinged Isabel to make sure we don't get in conflict with the conf. And I definitely don't want to miss tha

Apache Hadoop Hackathon: 5/18 in Palo Alto and San Francisco

2011-05-12 Thread Jeff Hammerbacher
Hey, Thanks to everyone who came out for the Apache Hadoop Hackathon yesterday in Palo Alto and San Francisco. We had 35 people sign up from a great cross section of companies: Yahoo!, Cloudera, Facebook, Apple, Twitter, Foursquare, AOL, Ngmoco, StumbleUpon, Trend Micro, Conviva, and more. We had

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Konstantin Boudnik
TCK (or JCK initially) was done as a tool to basically compare Java Lang specs with a particular implementation including but not limited to an extensive suite of say compiler tests. So I assume before we can embark on any sort of HCK suite some formal specs would have to be defined. It's rather h

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Konstantin Boudnik
On Thu, May 12, 2011 at 09:45, Milind Bhandarkar wrote: > HCK and written specifications are not mutually exclusive. However, given > the evolving nature of Hadoop APIs, functional tests need to evolve as I would actually expand it to 'functional and system tests' because the latter are capable of va

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Milind Bhandarkar
The problem with (only) specs is that they are written in natural language, and subject to human interpretation, and since humans are bad at natural language interpretation, this gives rise to something called standards bodies and lawyers, and that has never been good for anyone in the past ;-) No

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Milind Bhandarkar
Cos, Can you give me an example of a "system test" that is not a functional test ? My assumption was that the functionality being tested is specific to a component, and that inter-component interactions (that's what you meant, right?) would be taken care by the public interface and semantics of a

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Ted Dunning
Did anybody propose natural language only specifications? On Thu, May 12, 2011 at 8:37 PM, Milind Bhandarkar wrote: > The problem with (only) specs is that they are written in natural > language, and subject to human interpretation, >

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Milind Bhandarkar
Ok, my mistake. They have only asked for documented specifications. I may have been influenced by all the specifications I have read. All of them were in English, which is characterized as a natural language. But then, if you are proposing a specification in a non-natural-language, isn't that call

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Eric Baldeschwieler
label: print "+1"; goto label; I could not agree more with everything you said, Steve! The Apache Hadoop project should own the definition of Apache Hadoop. Hadoop is far from done. The interfaces need to keep evolving to get to a place where we can be proud of them. I support "vendors" buil

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Ted Dunning
I would say that an English spec with associated test suite is a middle ground. On Thu, May 12, 2011 at 9:52 PM, Milind Bhandarkar wrote: > Ok, my mistake. They have only asked for documented specifications. I may > have been influenced by all the specifications I have read. All of them > were i

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Konstantin Boudnik
The way it was done in JCK was specs written in a somewhat formalized language and a tool (called testgen, written in Perl if I remember correctly) which dynamically generated a lot of lang tests. I think this is the middle ground Milind has mentioned. BTW, it was a _huge_ effort: Sun had

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Doug Cutting
Certification seems like mission creep. Our mission is to produce open-source software. If we wish to produce testing software, that seems fine. But running a certification program for non-open-source software seems like a different task. The Hadoop mark should only be used to refer to open-sou

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Konstantin Boudnik
On Thu, May 12, 2011 at 20:40, Milind Bhandarkar wrote: > Cos, > > Can you give me an example of a "system test" that is not a functional > test ? My assumption was that the functionality being tested is specific > to a component, and that inter-component interactions (that's what you > meant, rig

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Milind Bhandarkar
Sure. As I said before, they are not mutually exclusive. Just stating my experience that specs without a test suite are of no use. If I were to prioritize, I would give priority to a TCK over natural-language specs. That's all. So far, I have seen many replacements for HDFS as InputFormat and Outp