On May 24, 2011, at 9:23 AM, Steve Loughran wrote:
I've drafted a policy on the wiki based on this discussion.
http://wiki.apache.org/hadoop/Defining%20Hadoop
Others need to look at, edit, etc, then we can vote on whether to take it
into the managed documentation.
I think it looks
I've drafted a policy on the wiki based on this discussion.
http://wiki.apache.org/hadoop/Defining%20Hadoop
Others need to look at, edit, etc, then we can vote on whether to take
it into the managed documentation.
Agree.
On May 12, 2011, at 11:16 PM, Doug Cutting wrote:
Certification seems like mission creep. Our mission is to produce
open-source software. If we wish to produce testing software, that
seems fine. But running a certification program for non-open-source
software seems like a different task.
On 05/17/2011 07:53 PM, Matthew Foley wrote:
And this statement of permission in the publicly available FAQ constitutes a
license,
so it is imprecise to say that ASF doesn't license its trademarks. :-)
That's not the way I interpret it. I believe that a license would be
required to permit a
To: general@hadoop.apache.org
Cc: Matthew Foley
Subject: Re: Defining Hadoop Compatibility -revisiting-
TESS only has registered trademarks -- that's the kind of trademark you put
an (R) next to.
But you can have an ordinary unregistered trademark -- the kind you put a tm
next to --
just
Matt,
Have you read Apache's trademark policy page?
http://www.apache.org/foundation/marks/
Apache does not generally license its trademarks. Constructions like
"Acme Foo powered by Apache Bar" are generally permitted, as they are not
deemed to create confusion about the origin of Bar.
Cheers,
On 13/05/11 05:52, Milind Bhandarkar wrote:
Ok, my mistake. They have only asked for documented specifications. I may
have been influenced by all the specifications I have read. All of them
were in English, which is characterized as a natural language.
But then, if you are proposing a
On 13/05/11 23:57, Allen Wittenauer wrote:
On May 13, 2011, at 3:53 PM, Ted Dunning wrote:
But "distribution Z includes X" kind of implies the existence of some Y such
that X != Y, Y != empty-set and X + Y = Z, at least in common usage.
Isn't that the same as a non-trunk change?
So doesn't this
On 13/05/11 23:16, Doug Cutting wrote:
On 05/14/2011 12:13 AM, Allen Wittenauer wrote:
So what do we do about companies that release a product that says "includes
Apache Hadoop" but includes patches that aren't committed to trunk?
We yell at them to get those patches into trunk already. This
On 13/05/11 07:16, Doug Cutting wrote:
Certification seems like mission creep. Our mission is to produce
open-source software. If we wish to produce testing software, that
seems fine. But running a certification program for non-open-source
software seems like a different task.
+1
That
But Cloudera's release is a bit murky.
The math example is a bit flawed...
X represents the set of stable releases.
Y represents the set of available patches.
C represents the set of Cloudera releases.
So if C contains a release X(n) plus a set of patches that is contained in Y,
Then does it
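The set framing above can be made concrete with a toy sketch. The release versions and patch IDs below are invented; only the X / Y / C relationships come from the example:

```python
# Toy sketch of the set framing above. Release versions and patch IDs
# are invented; only the X / Y / C relationships come from the example.

X = {"0.20.2", "0.20.203"}          # the set of stable Apache releases
Y = {"H-6605", "H-7001", "H-7002"}  # the set of available patches

# A hypothetical vendor release: one stable release plus a patch set.
C = {("0.20.203", frozenset({"H-6605", "H-7001"}))}

def patches_all_upstream(release):
    """True if the base is a stable release and every bundled patch is in Y."""
    base, patches = release
    return base in X and patches <= Y

print(all(patches_all_upstream(r) for r in C))  # True for this toy data
```

Under this framing, the question in the thread is what happens when a release's patch set is *not* a subset of Y, i.e. contains changes that never went upstream.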
On Mon, May 16, 2011 at 10:19 AM, Allen Wittenauer a...@apache.org wrote:
On May 16, 2011, at 5:00 AM, Segel, Mike wrote:
X represents the set of stable releases.
Y represents the set of available patches.
C represents the set of Cloudera releases.
So if C contains a release X(n) plus a set
On May 16, 2011, at 2:09 PM, Eli Collins wrote:
Allen,
There are a few things in the Hadoop in CDH that are not in trunk,
branch-20-security, or branch-20-append. The stuff in this category
is not major (e.g., HADOOP-6605, better JAVA_HOME detection).
But that's my point: when is it no
Does "Hadoop compatibility" and the ability to say "includes Apache
Hadoop" only apply when we're talking about MR and HDFS APIs?
It is confusing, isn't it?
We could go down the route Java did and say that the APIs are 'Hadoop' and
ours is just a reference implementation of it. (but
On trademarks, what about the phrase "New distribution for Apache
Hadoop"? I've seen that used, and it's something that replaces most of the
stack. I believe "Apache Hadoop" is trademarked in this context, even if
"Hadoop" alone isn't.
"Compatible with Apache Hadoop" is a smaller issue, defining some
We have the following method coverage:
Common ~60%
HDFS ~80%
MR ~70%
(better analysis will be available after our projects are connected to
Sonar, I think).
While method coverage isn't a completely adequate answer to your
question, I'd say there is a possibility to sneak in some
My understanding is that a history of defending your trademark is more
important than registration. Apache does defend Hadoop.
---
E14 - typing on glass
On May 16, 2011, at 6:52 PM, Segel, Mike mse...@navteq.com wrote:
Let me clarify...
I searched on Hadoop as a term in any TM.
Nothing
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via
Tom White)
--- On Mon, 5/16/11, Scott Carey sc...@richrelevance.com wrote:
From: Scott Carey sc...@richrelevance.com
Subject: Re: Defining Hadoop Compatibility -revisiting-
To: general@hadoop.apache.org general@hadoop.apache.org
Cc
Interesting point! I can see a future where there are many folks mixing and
matching Hadoop and non-Hadoop components. Swapping out HDFS seems
particularly popular.
On May 13, 2011, at 4:17 PM, Ian Holsman wrote:
...
I think that's a great idea.
Maybe we should also create names/marks
Good point.
On May 12, 2011, at 11:16 PM, Doug Cutting wrote:
Certification seems like mission creep. Our mission is to produce
open-source software. If we wish to produce testing software, that
seems fine. But running a certification program for non-open-source
software seems like a different task.
Good point.
Tests are a must for the Hadoop community to meet its own goals (quality and
backwards compatibility). Writing detailed specs for something that is
evolving this quickly is challenging. Also in a lot of cases, documenting the
current APIs to POSIX-like detail will mainly
The way it was done in the JCK was a spec written in a somewhat
formalized language, and a tool (called testgen, written in Perl if I
remember correctly) which dynamically generated a lot of language
tests. I think this is the middle ground Milind has mentioned.
BTW, it was a _huge_ effort: Sun had
Certification seems like mission creep. Our mission is to produce
open-source software. If we wish to produce testing software, that
seems fine. But running a certification program for non-open-source
software seems like a different task.
The Hadoop mark should only be used to refer to
On Thu, May 12, 2011 at 20:40, Milind Bhandarkar
mbhandar...@linkedin.com wrote:
Cos,
Can you give me an example of a system test that is not a functional
test ? My assumption was that the functionality being tested is specific
to a component, and that inter-component interactions (that's
Sure. As I said before, they are not mutually exclusive. Just stating my
experience that specs without a test suite are of no use. If I were to
prioritize, I would give priority to a TCK over natural-language specs.
That's all.
So far, I have seen many replacements for HDFS as InputFormat and
Cos,
I remember the issues about the inter-component interactions at that
point when you were part of the Yahoo Hadoop FIT team (I was on the other
side of the same floor, remember? ;-)
Things like: Can Pig take full URIs as input, and so work with viewfs?
Can the local jobtracker still use HDFS
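A minimal sketch of the kind of cross-component matrix being described. A real run would drive Pig, the jobtracker, and so on; here a plain URI parse stands in for each component's handling of a full URI, and all names are illustrative:

```python
from urllib.parse import urlparse

# Toy cross-component matrix. A real run would drive Pig, the local
# jobtracker, etc.; here a URI parse stands in for each component's
# handling of a full URI. All names below are illustrative.

def accepts_full_uri(uri):
    """Stand-in check: the 'component' accepts a scheme-qualified URI."""
    parsed = urlparse(uri)
    return bool(parsed.scheme) and bool(parsed.path)

components = ["Pig", "local jobtracker"]
schemes = ["hdfs", "viewfs", "file"]

results = {
    (comp, scheme): accepts_full_uri(f"{scheme}://cluster/user/data")
    for comp in components
    for scheme in schemes
}
print(all(results.values()))  # every (component, scheme) pair passed
```

The point of the matrix shape is that adding one filesystem scheme exercises it against every component at once, which is exactly where the inter-component surprises show up.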
On Tue, May 10, 2011 at 3:29 AM, Steve Loughran ste...@apache.org wrote:
I think we should revisit this issue before people with their own agendas
define what compatibility with Apache Hadoop is for us
I agree completely. As you point out, this week we've had a flood of
products calling
Key seems to be how one would interpret "version". Replace it with a synonym
like "variant" and this may be the intent.
On 5/13/11 9:50 AM, Doug Cutting cutt...@gmail.com wrote:
Yes, but there's an "is" earlier in the sentence.
Doug
On May 13, 2011 3:44 PM, Ted Dunning tdunn...@maprtech.com
On May 13, 2011, at 1:53 AM, Doug Cutting wrote:
Here "certified" is probably just intended to mean that the software
uses a certified open source license, e.g., one listed at
http://www.opensource.org/licenses/. However, they should say that this
includes or contains the various Apache products,
On Fri, May 13, 2011 at 00:11, Milind Bhandarkar
mbhandar...@linkedin.com wrote:
Cos,
I remember the issues about the inter-component interactions at that
point when you were part of the Yahoo Hadoop FIT team (I was on the other
side of the same floor, remember? ;-)
Vaguely ;) Of course I
On 05/13/2011 07:28 PM, Allen Wittenauer wrote:
If it has a modified version of Hadoop (i.e., not an actual Apache
release or patches which have never been committed to trunk), are
they allowed to say "includes Apache Hadoop"?
No. Those are the two cases we permit. We used to say that it was
On May 13, 2011, at 2:55 PM, Doug Cutting wrote:
On 05/13/2011 07:28 PM, Allen Wittenauer wrote:
If it has a modified version of Hadoop (i.e., not an actual Apache
release or patches which have never been committed to trunk), are
they allowed to say "includes Apache Hadoop"?
No. Those are
On 05/14/2011 12:13 AM, Allen Wittenauer wrote:
So what do we do about companies that release a product that says "includes
Apache Hadoop" but includes patches that aren't committed to trunk?
We yell at them to get those patches into trunk already. This policy
was clarified after that product
On May 13, 2011, at 3:16 PM, Doug Cutting wrote:
On 05/14/2011 12:13 AM, Allen Wittenauer wrote:
So what do we do about companies that release a product that says "includes
Apache Hadoop" but includes patches that aren't committed to trunk?
We yell at them to get those patches into trunk
On 05/14/2011 12:17 AM, Allen Wittenauer wrote:
... and if those patches are rejected by the community?
It would be very strange, since they've mostly been released in 203,
although they have not yet been committed to trunk.
Doug
On May 13, 2011, at 2:55 PM, Doug Cutting wrote:
On 05/13/2011 07:28 PM, Allen Wittenauer wrote:
If it has a modified version of Hadoop (i.e., not an actual Apache
release or patches which have never been committed to trunk), are
they allowed to say "includes Apache Hadoop"?
No. Those are
But "distribution Z includes X" kind of implies the existence of some Y such
that X != Y, Y != empty-set and X + Y = Z, at least in common usage.
Isn't that the same as a non-trunk change?
So doesn't this mean that your question reduces to the question of what
happens when non-Apache changes are made
On May 13, 2011, at 3:53 PM, Ted Dunning wrote:
But "distribution Z includes X" kind of implies the existence of some Y such
that X != Y, Y != empty-set and X + Y = Z, at least in common usage.
Isn't that the same as a non-trunk change?
So doesn't this mean that your question reduces to the
On May 14, 2011, at 12:41 AM, Owen O'Malley wrote:
On Tue, May 10, 2011 at 3:29 AM, Steve Loughran ste...@apache.org wrote:
I think we should revisit this issue before people with their own agendas
define what compatibility with Apache Hadoop is for us
I agree completely. As you point
On 12/05/2011 03:26, M. C. Srivas wrote:
While the HCK is a great idea to check quickly if an implementation is
compliant, we still need a written specification to define what is meant
by compliance, something akin to a set of RFCs, or a set of docs like the
IEEE POSIX specifications.
For
On 12/05/2011 00:20, Aaron Kimball wrote:
What does it mean to implement those interfaces? I'm +1 for a TCK-based
definition. In addition to statically implementing a set of interfaces, each
interface also implicitly includes a set of acceptable inputs and predicted
outputs (or ranges of
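The point that an interface implicitly carries acceptable inputs and predicted outputs, beyond its static signature, can be sketched with an invented toy interface (nothing here is a Hadoop API):

```python
# Two classes share a method signature, but only one honours the
# implied contract. A signature check cannot tell them apart; a
# behavioural, TCK-style check can. Everything here is invented.

class GoodSplitter:
    def split(self, text):
        return text.split(",")

class BadSplitter:
    def split(self, text):
        return [text]  # same signature, wrong behaviour

def conforms(impl):
    """A one-case 'TCK': an acceptable input must give the predicted output."""
    return impl.split("a,b,c") == ["a", "b", "c"]

print(conforms(GoodSplitter()), conforms(BadSplitter()))  # True False
```

A real HCK would be this idea scaled up: many inputs per interface, including edge cases, with the predicted outputs pinned down.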
While IANAL...
As long as any implementation follows Apache's license regarding derivative
works, it's fair game. (This is my understanding; YMMV.)
The APL is very liberal in what one can do with a derivative work...
Surely Apache has some lawyers who can summarize what is allowable when
HCK and written specifications are not mutually exclusive. However, given
the evolving nature of Hadoop APIs, functional tests need to evolve as
well, and tying them to the current stable version is easier than doing
the same for written specifications.
- milind
--
Milind Bhandarkar
On May 12, 2011, at 2:23 AM, Steve Loughran wrote:
I think Sun NFS might be a good example of a similar de facto standard, or MS
SMB - it is up to others to show they are compatible with what is effectively
the reference implementation. Being closed source, there is no option for
anyone to
The TCK (or JCK, initially) was done as a tool to basically compare the
Java language specs with a particular implementation, including but not
limited to an extensive suite of, say, compiler tests.
So I assume that before we can embark on any sort of HCK suite, some formal
specs would have to be defined. It's rather
On Thu, May 12, 2011 at 09:45, Milind Bhandarkar
mbhandar...@linkedin.com wrote:
HCK and written specifications are not mutually exclusive. However, given
the evolving nature of Hadoop APIs, functional tests need to evolve as
I would actually expand it to 'functional and system tests' because
The problem with (only) specs is that they are written in natural
language, and subject to human interpretation, and since humans are bad at
natural language interpretation, this gives rise to something called
standards bodies and lawyers, and that has never been good for anyone in
the past ;-)
Cos,
Can you give me an example of a system test that is not a functional
test ? My assumption was that the functionality being tested is specific
to a component, and that inter-component interactions (that's what you
meant, right?) would be taken care of by the public interface and semantics
of a
label:
print +1;
goto label;
I could not agree more with everything you said, Steve! The Apache Hadoop
project should own the definition of Apache Hadoop. Hadoop is far from done.
The interfaces need to keep evolving to get to a place where we can be proud of
them.
I support vendors
I would say that an English spec with associated test suite is a middle
ground.
On Thu, May 12, 2011 at 9:52 PM, Milind Bhandarkar mbhandar...@linkedin.com
wrote:
Ok, my mistake. They have only asked for documented specifications. I may
have been influenced by all the specifications I have
This is a really interesting topic! I completely agree that we need to get
ahead of this.
I would be really interested in learning of any experience other Apache
projects, such as httpd or Tomcat, have with these issues.
---
E14 - typing on glass
On May 10, 2011, at 6:31 AM, Steve Loughran
As a specific example of how these are important, over in Mahout-land we
have been wrestling with determining just what it means to have dependencies
in the lib directory inside a jar. This isn't documented, behaves
differently in different versions of Hadoop and means that some Mahout
programs
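The lib-inside-a-jar layout being wrestled with can be poked at with nothing but zip tooling, since a jar is a zip file. This builds a toy jar in memory; the entry names are invented:

```python
import io
import zipfile

# A jar is a zip file, so the 'dependencies in lib/' convention means
# nested jars stored under a lib/ entry. Build a toy jar in memory and
# list what a launcher would have to unpack. Entry names are invented.

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("org/example/Job.class", b"\xca\xfe\xba\xbe")
    jar.writestr("lib/guava.jar", b"nested jar bytes")
    jar.writestr("lib/mahout-math.jar", b"nested jar bytes")

with zipfile.ZipFile(buf) as jar:
    nested = sorted(n for n in jar.namelist() if n.startswith("lib/"))
print(nested)  # ['lib/guava.jar', 'lib/mahout-math.jar']
```

Because nothing specifies how, or whether, a given Hadoop version unpacks those nested entries onto the classpath, behaviour differing across versions is exactly what one would expect.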