Dmitriy,
I think what you are saying is something similar to alpha/beta releases.
(maybe beta1, beta2 .. is better).
So the first release could be 1.0.0_beta1. This scheme will be easier for
users to understand.
But I am not sure what the criteria for promoting a release from betaX
to general release should be.
Thanks,
Thejas
On 10/24/11 5:38 PM, Dmitriy Ryaboy wrote:
To be a little more concrete about what I am saying here -- I don't think we
should put a "1.0" label on any *.0 release. 0.8.1 is pretty solid; 0.9.0
has some holes, 0.9.1 is better. If we put 1.0 on what is currently being
thought of as 0.10, it will have some stability / usability issues (things
tend to show up after we make a release and people in the wild start trying
it), and those issues will make a poor impression on those who expect 1.0 to
be shiny and polished after so much time. I'm in favor of waiting a couple
of dot releases, promoting a stabilized release into 1.0, and going from
there. So, pictorially:
-- trunk --- 0.11-dev ----------0.12-dev------------------| 1.2-dev!
        \                  \
         \                  \ ---------------- 0.11.0 --------------------| 1.1.0!
          \
           \------- 0.10.0 ------- 0.10.1 ------- 0.10.2 --------| 1.0.0 !!
On Mon, Oct 24, 2011 at 12:43 PM, Dmitriy Ryaboy<[email protected]> wrote:
I am good with Scheme 2.
We are finding a fair number of issues trying to move from Pig 0.8.1 to
0.9, and I don't think those issues are fixed in 0.10, either. I'm not sure
that this "stabilization" process has happened yet.
D
On Mon, Oct 24, 2011 at 11:59 AM, Daniel Dai<[email protected]>wrote:
Yes, we need a versioning scheme. There are two versioning schemes I can
think of:
Scheme 1:
<major>.<patch>
<major> will be the feature-rich release every 3 months
<patch> will be the bug fix release when necessary
Nov release will be 1.0, Feb release will be 2.0. There will be 1.1, 2.1,
etc. for bug fixes.
Scheme 2:
<major>.<minor>.<patch>
Most of our 3-month releases will be counted as <minor> releases unless
there are major user-facing/disruptive changes.
Nov release will be 1.0.0, Feb release will be 1.1.0. There will be 1.0.1,
1.1.1 etc for bug fixes.
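For illustration only, here is a minimal Python sketch of how Scheme 2 version
strings could be parsed, ordered, and bumped. The names Version, parse_version
and bump are hypothetical, and resetting the lower components to zero on a bump
is an assumption taken from common practice, not something decided in this
thread.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A <major>.<minor>.<patch> version; tuples compare field by field."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse_version(text: str) -> Version:
    """Parse a string such as '1.0.1' into a Version."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected <major>.<minor>.<patch>, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return Version(major, minor, patch)


def bump(version: Version, kind: str) -> Version:
    """Return the next version for a 'major', 'minor' or 'patch' release.

    Assumption: lower components reset to zero when a higher one changes.
    """
    if kind == "major":
        return Version(version.major + 1, 0, 0)
    if kind == "minor":
        return Version(version.major, version.minor + 1, 0)
    if kind == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown release kind: {kind!r}")
```

Under these assumptions, bump(parse_version("1.0.0"), "minor") gives 1.1.0 (the
Feb release in the example above) and bump(parse_version("1.0.0"), "patch")
gives 1.0.1 (a bug fix release).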
I personally prefer Scheme 2; increasing the major version too frequently
might be confusing to users. How do other folks feel?
Daniel
On Sat, Oct 22, 2011 at 2:31 AM, Gianmarco De Francisci Morales<
[email protected]> wrote:
Hi,
just my 2 cents.
I think the issue here is not 1.0 vs 0.10, but what's the versioning scheme
we want to use for Pig.
Up to now it has been just an increasing number after a '0.' prefix, changed
when the community felt it was time. I think this works well for a small
project, but it is somewhat fuzzy.
I like the idea of having <major>.<minor>.<patch> versions like many other
projects. It's a very clear and almost standard way of versioning a piece of
software. It has clear rules on when to change each of the numbers, and lets
the user get an idea of backward compatibility at a glance.
So, to conclude, I am in favor of going 1.0 (or 1.0.0) as long as we decide
a clear versioning policy (whichever it is).
So that the 1.0 milestone would mark the beginning of our new policy.
Cheers,
--
Gianmarco
On Fri, Oct 21, 2011 at 23:10,<[email protected]> wrote:
If one were to rewrite input and output formats to use the webhdfs://
APIs, this would not be an issue, right ?
- milind
On 10/21/11 1:50 PM, "Santhosh Srinivasan"<[email protected]> wrote:
If I was not clear in my earlier email, I apologize for the lack of
clarity. I am no longer in favour of waiting for Hadoop API stability
across Hadoop versions. It's a pipe dream.
When we had PigInputFormat and PigOutputFormat, your reasoning would be
spot on. I am concerned about the following. Our tight integration with
Hadoop due to the use of Input and Output format might lead to a break in
backward compatibility. I am not sure if the comparison with that of Java
is valid. Probably a majority of the users don't use JNI. It's very hard
to use Pig without writing custom load and store functions. The default
load and store don't suffice for a majority of use cases that I have
observed.
I am trying to get all factors that might influence this decision. From
the few emails that have been exchanged since yesterday, we have the
following factors:
1. Hadoop 0.20.205 (support for Append)
2. Hadoop 0.22
3. Hadoop 0.23
4. Maturity of the new parser
5. Stability of the new logical plan
6. Other components in the ecosystem.
- Avro (1.5.4, 1.4.1, ...)
- Cassandra (1.0.0, 0.8.7, ...)
- Chukwa (0.4.0, 0.3.0, ...)
- Hama (0.3.0, 0.2.0, ...)
- Hbase (0.90.4, 0.90.3, 0.90.2, 0.90.1, ...)
- Hive (Releases - 0.7.1, 0.7.0, 0.6.0, ...)
- Zookeeper (3.3.3, 3.3.2, 3.2.2, 3.1.2, ...)
Santhosh
-----Original Message-----
From: Thejas Nair [mailto:[email protected]]
Sent: Friday, October 21, 2011 11:22 AM
To: [email protected]
Subject: Re: Next Pig release proposal
Santhosh,
I thought you meant API stability for hadoop across major versions, but I
guess you are referring to stability within 0.23 versions. But the argument
applies to that as well: if 0.23.1 is not compatible with 0.23.0, we need
to call the release for 0.23.1 'pig 1.x for 0.23.1 api'.
We just need to communicate to the users that the
InputFormat/OutputFormat APIs (and anything else we expose from
hadoop) depend on the hadoop version they are using.
I think it is just like different JNI libraries that you would write for
different OS. But the java version remains the same across OSs.
-Thejas
On 10/21/11 10:59 AM, Santhosh Srinivasan wrote:
Thejas,
I guess you did not read my email completely. You are referring to the
premise without examining the conclusion. I am repasting my entire email
to avoid confusion (I hate truncated references). If you could respond
again, it will bring us onto the same page.
<email>
Ref: http://tinyurl.com/4ng8upa (last discussion on 1.0)
How far have we progressed from our last discussion in March? There was
no consensus on the 1.0 release. Opinions ranged from having more
releases to bake in the maturity of the new parser and logical plan
changes to compatibility with Hadoop API (was compared to Social
Security - a very hot topic these days).
My concerns were around Hadoop API stability. I have heard that the
APIs will not be stable for at least 1 year. This is taking me away from
the Hadoop API stability factor (They passed healthcare in that
duration. Really!) Do we want compatibility with 0.23 as a gating
factor? I am not sure this is anywhere close to getting done in the near
future. Will we support append (0.20.205)?
Btw, Hbase has been doing 0.90.1, 0.90.2, etc. So we can take a look at
this option too.
Santhosh
-----Original Message-----
From: Olga Natkovich [mailto:[email protected]]
Sent: Thursday, October 20, 2011 4:40 PM
To: [email protected]
Subject: Next Pig release proposal
Hi,
Here is what I propose we do for the next Pig release:
(1) Branch early next week - we have major features and many bug
fixes in and will be fixing remaining bugs on the branch
(2) Publish the release by 11/15 - that will give us a couple of
weeks to stabilize the branch and get last minute bug fixes in
(3) Make this release a 1.0 release. Reasons to go for 1.0 and not
0.10
a. This release has minimal number of features and was focused on
code stabilization and bug fixes. We believe it will be a stable
release
<email/>
Thanks,
Santhosh
-----Original Message-----
From: Thejas Nair [mailto:[email protected]]
Sent: Friday, October 21, 2011 10:45 AM
To: [email protected]
Subject: Re: Next Pig release proposal
On 10/20/11 4:58 PM, Santhosh Srinivasan wrote:
Ref: http://tinyurl.com/4ng8upa (last discussion on 1.0)
How far have we progressed from our last discussion in March? There
was no consensus on the 1.0 release. Opinions ranged from having more
releases to bake in the maturity of the new parser and logical plan
changes to compatibility with Hadoop API (was compared to Social
Security - a very hot topic these days).
My concerns were around Hadoop API stability.
Over the next year or so, there are going to be two API versions of
hadoop to be supported - 0.20.x APIs and 0.23 APIs, as we will have
a userbase on both.
I think it is just a matter of releasing pig 1.0 for 0.20.x APIs and
1.0 for 0.23.x APIs. We will have to come up with a numbering scheme
that reflects 'for hadoop version X' in our pig releases, regardless of
it being 0.10 or 1.0.
As there will be support for different APIs of hadoop in pig releases,
I don't see a reason why the hadoop api stability should stop pig from
going 1.0.
-Thejas