Re: [DISCUSSION] Pig.next

Eric Lubow Thu, 03 Mar 2011 12:04:38 -0800

Coming from a user's perspective, I would have the following to say:

Anyone using Hadoop already understands that a 1.0 label doesn't mean much
for software that is already in use (which Pig obviously is).  What 1.0 could
do for someone like me is let me go to Amazon and say, look, Pig is at 1.0
and you are still offering 0.6 on EMR.  Having Pig on something like EMR is
what allows more widespread adoption because it lowers the barrier to entry.


I am not an expert at any of this stuff (in fact, I don't even know Java),
but I am able to use Hadoop and then train others to write MR jobs with a
fair amount of ease because of a query language like Pig.  Tagging it with
1.0 might make a statement to larger organizations, but most smaller
companies and startups just want to know it's usable.  And since there is no
alpha or beta attached anywhere, that's good enough for most.

The only caveat is that I am working off of Pig 0.6 because all my data is
in S3 and I use Elastic Map Reduce for my jobs.

The only other thing I would say is that if Pig goes 1.0, can it get a new
logo? I know there are a lot of +1s for this so I figured I would throw my
+1 here too.

-e

On Thu, Mar 3, 2011 at 13:43, Alan Gates <ga...@yahoo-inc.com> wrote:

> I agree that there will probably need to be several 0.9.x releases as the
> new optimization and parser work mature.  As a consequence of this it may be
> longer between 0.9 and Pig.next than there has been between the last few
> releases.  That only delays the question of what we call Pig.next, it does
> not answer it.
>
> To me, declaring 1.0 would mean the following things:
>
> 1) Pig is ready for production use, at least by the brave.
> 2) It is still rough around the edges, you do not get a smooth product
> until 2.0 or later.
> 3) We will not make non-backward compatible changes to interfaces we have
> declared stable.
>
> Pig is in use in production in multiple places, I do not think anyone will
> argue that it is not rough around the edges, and because we have users who
> run tens of thousands of Pig jobs daily, non-backward compatible changes are
> impossible anyway.
>
> As for waiting for Hadoop to go 1.0, that is like waiting for Congress to
> fix social security.  I am sure they will get there, but I may be retired
> first.  In all seriousness, the Hadoop project has not been moving with
> speed or agility over the last few years, and I do not think waiting for
> them to do something is a good idea.  Nor do I see it as necessary.  Before
> we could go 1.0, would we insist that every jar we import is >= 1.0?  Yes, we
> are bound more tightly to Hadoop than we are to log4j.  But we are still our
> own project.  1.0 is a claim we are making about ourselves, not about the
> platform we run on.  We should choose our release numbering in a way that
> sends a clear message to our users, and let those same users evaluate Hadoop
> separately.
>
> Also the argument that we should not go 1.0 because we are changing a lot
> of things is bogus.  We are always changing a lot of things.  If 1.0 means
> we will not make any major changes, then we will not get there until we go
> into some kind of maintenance mode where we deem the majority of the work
> to have been done.  I hope I have retired before we reach that state.
>
> My perspective on what 1.0 means obviously comes from a developer inside
> the project.  I would be interested in hearing from users and anyone with a
> more marketing oriented perspective on what message 1.0 would send to
> (potential) pig users.
>
> Alan.
>
> On Mar 2, 2011, at 6:31 PM, Dmitriy Ryaboy wrote:
>
>> I am worried that the new optimization plan work has not had a chance to
>> settle in, and we are releasing a brand new parser for the language in 0.9.
>> Those are pretty significant changes, if the idea behind calling something a
>> "1.0" is stability, we may want to give them a release to mature a bit. Of
>> course we can just release 0.9x for a while until we feel this stuff has
>> been tested in a wide enough variety of installations / hadoop
>> configurations / use cases.
>>
>> D
>>
>> On Wed, Mar 2, 2011 at 4:52 PM, Olga Natkovich <ol...@yahoo-inc.com>
>> wrote:
>>
>>> Pig Users and Developers,
>>>
>>> We are starting to plan the work after Pig 0.9. One thing we need to decide
>>> is what name/number to give to the next release: Pig 0.10 or Pig 1.0.
>>>
>>> I believe that we are ready to declare 1.0. Here are my reasons:
>>>
>>> (1)     We are mature enough and produce good quality releases
>>> (2)     Our interfaces no longer change in major ways
>>> (3)     We have a growing user community and we want the newcomers to know
>>> that our releases are stable
>>> (4)     If the next release is 0.10 and we decide that we should switch on
>>> the following release, going from 0.10 to 1.0 will generate a lot of
>>> confusion.
>>>
>>> I wanted to start this conversation and see what others think before
>>> deciding if it is worthwhile to call a vote.
>>>
>>> Olga
>>>
>>>
>
Eric Lubow e: eric.lu...@gmail.com w: eric.lubow.org
