All,

I'm glad to see a plan to release master before the incompatible changes.
It will be great to have a release that uses the separated out ORC code
base.

As I previously discussed on the list, I've been building a proposed 2.2
branch, which builds on 2.1.1 and cherry-picks a set of changes. The
patches were chosen to line up with the efforts of the QA team at
Hortonworks. You can look at the patches at
https://github.com/omalley/hive/tree/branch-2.2 .

The branch is still pretty rough, but I don't expect many large code
changes. I'd like to propose that we take my branch as branch-2.2 and set
up Pengcheng's branch as branch-2.3. We should also consider the packaging
changes (shading protobuf, guava, and kryo) that would make integration
with Spark easier in branch-2.3.

Thanks,
   Owen


On Thu, Mar 23, 2017 at 11:15 PM, Ashutosh Chauhan <hashut...@apache.org>
wrote:

> So, I just pushed branch-2 from current tip.
> Now, let's get the 2.2 release out as soon as possible.
>
> Thanks,
> Ashutosh
>
> On Thu, Mar 23, 2017 at 9:46 PM, Ashutosh Chauhan <hashut...@apache.org>
> wrote:
>
> > Cutting a branch should not slow down a 2.2 release. If anything, this
> > should help stabilize the release faster, since the branch won't get any
> > new potentially destabilizing changes, only patches to fix existing
> > known issues.
> >
> > On Thu, Mar 23, 2017 at 8:22 AM, Eugene Koifman <ekoif...@hortonworks.com>
> > wrote:
> >
> >> +1 to make a release first
> >>
> >> On 3/22/17, 2:06 PM, "Sergey Shelukhin" <ser...@hortonworks.com> wrote:
> >>
> >>     Hmm.. should we release these first, and then cut branch-2?
> >>     Otherwise during the releases, the patches for 2.2/2.3 will need to
> >>     go to 3 (4?) places (master, branch-2, branch-2.2, branch-2.3?).
> >>     There’s no rush to cut the branch if everything in 2.2/2.3 has to go
> >>     to 3.0 anyway.
> >>
> >>     On 17/3/22, 13:53, "Pengcheng Xiong" <pxi...@apache.org> wrote:
> >>
> >>     >I would like to work as the Release Manager if possible. As Owen
> >>     >points out, he is working on 2.2 and I will work on 2.3. Thanks.
> >>     >
> >>     >On Wed, Mar 22, 2017 at 1:32 PM, Ashutosh Chauhan <hashut...@apache.org>
> >>     >wrote:
> >>     >
> >>     >> Unless there is more feedback, I plan to cut branch-2 in a day or
> >>     >> two from current master. As multiple people have suggested on this
> >>     >> thread, we should do a 2.2 release soon. Currently there are 177 issues
> >>     >> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE%20AND%20resolution%20%3D%20Unresolved%20AND%20cf%5B12310320%5D%20%3D%202.2.0%20ORDER%20BY%20priority%20DESC>
> >>     >> targeted for the 2.2 release. We can use branch-2 to land these
> >>     >> patches and for additional stabilization efforts. Any volunteers for
> >>     >> Release Manager to drive the 2.2 release?
> >>     >>
> >>     >> Thanks,
> >>     >> Ashutosh
> >>     >>
> >>     >> On Fri, Mar 10, 2017 at 4:23 PM, Ashutosh Chauhan <hashut...@apache.org>
> >>     >> wrote:
> >>     >>
> >>     >> > I hear what you are saying. Let's begin with the 3 concerns:
> >>     >> >
> >>     >> > - How will we keep the community motivated on fixing both master
> >>     >> > and branch-2?
> >>     >> > Until we do a stable release from master, stable releases can come
> >>     >> > only from branch-2. If a contributor wants to see their fix reach
> >>     >> > users on a stable line quickly, they would have to have a fix on
> >>     >> > branch-2. Also, a release manager can pick whatever fixes she wants,
> >>     >> > so even if a contributor doesn't commit a fix on branch-2, a release
> >>     >> > manager who wants to do a release containing a set of fixes can
> >>     >> > always do so.
> >>     >> >
> >>     >> > - *Harder cherry-picks between master and branch-2*.
> >>     >> > That is certainly possible. But the hope is that we keep branch-2
> >>     >> > stable, so we don't backport large features which may run into this
> >>     >> > issue. Smaller, focused bug-fix backports should be possible.
> >>     >> >
> >>     >> >
> >>     >> > - *Removal of MR2 on the master branch*.
> >>     >> > This is something I personally would like to see. But the exact
> >>     >> > timing of it will be decided by the community. I am certainly not
> >>     >> > saying that as soon as branch-2 is created, we should remove MR2
> >>     >> > on master.
> >>     >> >
> >>     >> > I would also say that in the end the ASF is a volunteer
> >>     >> > organization; we can't force people to adopt one branch or another.
> >>     >> > It's up to the contributors which jiras they work on and when and
> >>     >> > where they commit them.
> >>     >> > By not creating a branch-2, the only thing we can guarantee is that
> >>     >> > the rate of development on master remains slow, because we don't
> >>     >> > want to start making backward incompatible changes without
> >>     >> > explicitly acknowledging that.
> >>     >> >
> >>     >> > Thanks,
> >>     >> > Ashutosh
> >>     >> >
> >>     >> > On Thu, Mar 9, 2017 at 12:01 PM, Sergio Pena <sergio.p...@cloudera.com>
> >>     >> > wrote:
> >>     >> >
> >>     >> >> Hey Ashutosh, thanks for soliciting feedback on this.
> >>     >> >>
> >>     >> >> I like the idea you're proposing; maintaining compatibility and at
> >>     >> >> the same time adding newer features to Hive consumes a lot of
> >>     >> >> development time and effort.
> >>     >> >>
> >>     >> >> However, I think some users and companies have just started to use
> >>     >> >> the Hive 2.x branch as their main major upgrade of Hive (possibly
> >>     >> >> because they were waiting for stabilization and testing upgrades),
> >>     >> >> and cutting this major branch after just 1 year of life might make
> >>     >> >> it look like we will forget about the quality of Hive 2.x, as we
> >>     >> >> did with branch-1.
> >>     >> >>
> >>     >> >> The latest Hive 1.x version was 1.2, and its development stopped
> >>     >> >> because of new features in Hive 2.x.
> >>     >> >> The latest Hive 2.x version is 2.1, and we want to create Hive 3.x
> >>     >> >> because of newer features and incompatibilities.
> >>     >> >> Will Hive 3.x have the same future after 3.1 is released?
> >>     >> >>
> >>     >> >> I'm also concerned about these three things:
> >>     >> >>
> >>     >> >>    - *Branch-2 quality commitment*.
> >>     >> >>    How will we keep the community motivated on fixing both master
> >>     >> >>    and branch-2?
> >>     >> >>    - *Harder cherry-picks between master and branch-2*.
> >>     >> >>    Because master will be incompatible by nature, cherry-picks to
> >>     >> >>    branch-2 will be harder.
> >>     >> >>    - *Removal of MR2 on the master branch*.
> >>     >> >>    This was marked as deprecated just last year, but MR2 is still
> >>     >> >>    an engine that is used by several users.
> >>     >> >>
> >>     >> >> I accept that the end of life of major versions will come at some
> >>     >> >> point, and these concerns will expire,
> >>     >> >> but Hive 2.x is kind of young, isn't it?
> >>     >> >>
> >>     >> >> Should we try to stabilize the Hive 2.x line first, and have a few
> >>     >> >> more releases before starting to work on Hive 3.0?
> >>     >> >> Should we add more test coverage to Hive jenkins jobs to validate
> >>     >> >> Hive 2.x quality?
> >>     >> >> Should we agree on a date when we should drop community support
> >>     >> >> for Hive versions, to let users know about this?
> >>     >> >>
> >>     >> >> Again, I like your proposal, but I'm afraid that users who just
> >>     >> >> upgraded to 2.x won't get any more features and improvements
> >>     >> >> because they will be developed on 3.0.
> >>     >> >>
> >>     >> >> - Sergio
> >>     >> >>
> >>     >> >>
> >>     >> >>
> >>     >> >> On Mon, Mar 6, 2017 at 1:24 PM, Ashutosh Chauhan <ashutosh.chau...@gmail.com>
> >>     >> >> wrote:
> >>     >> >>
> >>     >> >> > The way it helps shed debt is that devs can now refactor without
> >>     >> >> > fear of breaking some rarely used features. The way it helps add
> >>     >> >> > features faster is that, since the codebase is lean and easier
> >>     >> >> > to reason about, it's much easier to add new features.
> >>     >> >> >
> >>     >> >> > More importantly though, it also helps users because we are
> >>     >> >> > setting the expectation from the dev community. They can expect
> >>     >> >> > future releases of 2.x to be backward compatible. At the same
> >>     >> >> > time, whenever they decide to upgrade, they only need to test
> >>     >> >> > their application once against 3.x, as opposed to continuous
> >>     >> >> > breakage of one form or another if we continue to make
> >>     >> >> > incompatible changes in master without branching for 2.x.
> >>     >> >> >
> >>     >> >> > Thanks,
> >>     >> >> > Ashutosh
> >>     >> >> >
> >>     >> >> > On Sat, Mar 4, 2017 at 10:19 AM, Edward Capriolo <edlinuxg...@gmail.com>
> >>     >> >> > wrote:
> >>     >> >> >
> >>     >> >> > > Also i dont follow how we remove
> >>     >> >> > >
> >>     >> >> > > On Saturday, March 4, 2017, Edward Capriolo <edlinuxg...@gmail.com>
> >>     >> >> > > wrote:
> >>     >> >> > >
> >>     >> >> > > >
> >>     >> >> > > >
> >>     >> >> > > > On Fri, Mar 3, 2017 at 8:46 PM, Thejas Nair <thejas.n...@gmail.com>
> >>     >> >> > > > wrote:
> >>     >> >> > > >
> >>     >> >> > > >> +1
> >>     >> >> > > >> There are some features that are incomplete and that I would
> >>     >> >> > > >> not recommend for any real production use. The 'legacy
> >>     >> >> > > >> authorization mode' is a great example of that -
> >>     >> >> > > >> https://cwiki.apache.org/confluence/display/Hive/Hive+Default+Authorization+-+Legacy+Mode
> >>     >> >> > > >> . It is an inherently insecure mode that nobody should be using.
> >>     >> >> > > >>
> >>     >> >> > > >> There is also potential to clean up the thrift api. However,
> >>     >> >> > > >> there are many users of this api, so we would need to go the
> >>     >> >> > > >> route of deprecating it and then removing it after a couple of
> >>     >> >> > > >> releases or so.
> >>     >> >> > > >>
> >>     >> >> > > >> I am sure there are many other candidates. We will have to
> >>     >> >> > > >> evaluate each of those features on the risk/benefit of keeping
> >>     >> >> > > >> them and arrive at a decision.
> >>     >> >> > > >>
> >>     >> >> > > >> Also, +1 on getting a 2.2 release out before we branch.
> >>     >> >> > > >>
> >>     >> >> > > >>
> >>     >> >> > > >>
> >>     >> >> > > >> On Fri, Mar 3, 2017 at 1:50 PM, Ashutosh Chauhan <hashut...@apache.org>
> >>     >> >> > > >> wrote:
> >>     >> >> > > >>
> >>     >> >> > > >> > Hi all,
> >>     >> >> > > >> >
> >>     >> >> > > >> > The Hive project has come a long way. With widespread adoption
> >>     >> >> > > >> > also come expectations: expectations of being backward
> >>     >> >> > > >> > compatible and not breaking things. However, that doesn't come
> >>     >> >> > > >> > free of cost and results in a lot of legacy code which can't be
> >>     >> >> > > >> > refactored without fear of breaking things. As a result, the
> >>     >> >> > > >> > project has accumulated a lot of debt over time. At the same
> >>     >> >> > > >> > time, there are also a lot of features which have seen little
> >>     >> >> > > >> > uptake. We may want to drop some of those.
> >>     >> >> > > >> >
> >>     >> >> > > >> > In order to move forward and shed that debt, we may need a major
> >>     >> >> > > >> > version release which allows us to make backward incompatible
> >>     >> >> > > >> > changes and drop rarely used features. At the same time, there
> >>     >> >> > > >> > are lots of users who are consuming the currently released 2.1,
> >>     >> >> > > >> > 2.2 branches, and we expect them to stay on those for some time.
> >>     >> >> > > >> > So, I propose that we create branch-2 from current tip, do
> >>     >> >> > > >> > future 2.x releases from that branch, and keep it backward
> >>     >> >> > > >> > compatible. This will allow devs to land breaking changes on
> >>     >> >> > > >> > master and pave the way to release hive 3.0 in the future.
> >>     >> >> > > >> >
> >>     >> >> > > >> > Of course, each specific incompatible change and feature drop,
> >>     >> >> > > >> > even on master, needs to be evaluated on its own merit on the
> >>     >> >> > > >> > corresponding jira. This email is just a solicitation of
> >>     >> >> > > >> > feedback for creating branch-2 and allowing breaking changes in
> >>     >> >> > > >> > master. Thoughts?
> >>     >> >> > > >> >
> >>     >> >> > > >> > Thanks,
> >>     >> >> > > >> > Ashutosh
> >>     >> >> > > >> >
> >>     >> >> > > >>
> >>     >> >> > > >
> >>     >> >> > > > One of the challenges of the developers conducting the
> >>     >> >> > > > risk-benefit analysis is that the developers are mostly focused
> >>     >> >> > > > on new features, but there are deployments of hive that are 5+
> >>     >> >> > > > years old, and people that rely on the features are not on the
> >>     >> >> > > > mailing list.
> >>     >> >> > > >
> >>     >> >> > > > For example I developed and use this frequently:
> >>     >> >> > > >
> >>     >> >> > > > https://community.hortonworks.com/articles/8861/apache-hive-groovy-udf-examples.html
> >>     >> >> > > >
> >>     >> >> > > > My career went away from hive for a while. I was quite surprised
> >>     >> >> > > > to find out that, in the cli->beeline move, it was more or less
> >>     >> >> > > > decided not to port it. I learned of this the first time I was
> >>     >> >> > > > forced to work in a hive-server-only environment and it did not
> >>     >> >> > > > work.
> >>     >> >> > > >
> >>     >> >> > > > Now I have to go and spend time adding this back so I don't have
> >>     >> >> > > > to work around it not being there.
> >>     >> >> > > >
> >>     >> >> > > > What we should continue doing is making code that is modular. We
> >>     >> >> > > > need to break hard dependencies like ThriftSerde or OrcSerde
> >>     >> >> > > > being "native" and having to be linked to the metastore, and
> >>     >> >> > > > move them out into proper submodules. There is too much code
> >>     >> >> > > > that only works for one implementation of a serde etc.
> >>     >> >> > > >
> >>     >> >> > > >
> >>     >> >> > > >
> >>     >> >> > >
> >>     >> >> > > I would like a timeline to understand this. It sounds as if
> >>     >> >> > > master is not releasable currently, so already broken in a way.
> >>     >> >> > > We make a branch and aggressively break it more?
> >>     >> >> > >
> >>     >> >> > > I'm not following how this branching policy makes adding
> >>     >> >> > > features faster or how it helps shed debt faster.
> >>     >> >> > >
> >>     >> >> > >
> >>     >> >> > > --
> >>     >> >> > > Sorry this was sent from mobile. Will do less grammar and
> >>     >> >> > > spell check than usual.
> >>     >> >> > >
> >>     >> >> >
> >>     >> >>
> >>     >> >
> >>     >> >
> >>     >>
> >>
> >>
> >>
> >>
> >
>
