All, I'm glad to see a plan to release master before the incompatible changes. It will be great to have a release that uses the separated out ORC code base.
As I had previously discussed on list, I've been working on building a proposed 2.2 branch, which builds on 2.1.1 and cherry-picks a bunch of changes. The list of patches was picked to synergize with the efforts of the QA team at Hortonworks. You can look at the patches on https://github.com/omalley/hive/tree/branch-2.2 . The branch is still pretty rough, but I don't expect many large code changes.

I'd like to propose that we take my branch as branch-2.2 and set up Pengcheng's branch as branch-2.3. We should also consider the packaging changes (shading protobuf, guava, and kryo) that would make integration with Spark easier in branch-2.3.

Thanks,
Owen

On Thu, Mar 23, 2017 at 11:15 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:

> So, I just pushed branch-2 from current tip.
> Now, let's get 2.2 release out as soon as possible.
>
> Thanks,
> Ashutosh
>
> On Thu, Mar 23, 2017 at 9:46 PM, Ashutosh Chauhan <hashut...@apache.org>
> wrote:
>
> > Cutting a branch should not slow down a 2.2 release. If anything, this
> > should help in achieving stabilization faster for release since branch
> > won't get any new potentially destabilizing changes but only patches to
> > fix existing known issues.
> >
> > On Thu, Mar 23, 2017 at 8:22 AM, Eugene Koifman <ekoif...@hortonworks.com>
> > wrote:
> >
> >> +1 to make a release first
> >>
> >> On 3/22/17, 2:06 PM, "Sergey Shelukhin" <ser...@hortonworks.com> wrote:
> >>
> >> Hmm.. should we release these first, and then cut branch-2?
> >> Otherwise during the releases, the patches for 2.2/2.3 will need to
> >> go to 3 (4?) places (master, branch-2, branch-2.2, branch-2.3?).
> >> There’s no rush to cut the branch if everything in 2.2/2.3 has to go
> >> to 3.0 anyway.
> >>
> >> On 17/3/22, 13:53, "Pengcheng Xiong" <pxi...@apache.org> wrote:
> >>
> >> >I would like to work as the Release Manager if possible. As Owen
> >> >points out, he is working on 2.2 and I will work on 2.3. Thanks.
> >> >
> >> >On Wed, Mar 22, 2017 at 1:32 PM, Ashutosh Chauhan <hashut...@apache.org>
> >> >wrote:
> >> >
> >> >> Unless there is more feedback, I plan to cut branch-2 in a day or two
> >> >> from current master. As multiple people have suggested on this thread,
> >> >> we should do a 2.2 release soon. Currently there are 177 issues
> >> >> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE%20AND%20resolution%20%3D%20Unresolved%20AND%20cf%5B12310320%5D%20%3D%202.2.0%20ORDER%20BY%20priority%20DESC>
> >> >> targeted for 2.2 release. We can use branch-2 to land these patches
> >> >> and for additional stabilization efforts. Any volunteer for Release
> >> >> Manager driving 2.2 release?
> >> >>
> >> >> Thanks,
> >> >> Ashutosh
> >> >>
> >> >> On Fri, Mar 10, 2017 at 4:23 PM, Ashutosh Chauhan <hashut...@apache.org>
> >> >> wrote:
> >> >>
> >> >> > I hear what you are saying. Let's begin with 3 concerns:
> >> >> >
> >> >> > - How will we keep the community motivated on fixing both master and
> >> >> > branch-2?
> >> >> > Until we do a stable release from master, stable releases can come
> >> >> > only from branch-2. If a contributor wants to see their fix reach
> >> >> > users on a stable line quickly they would have to have a fix on
> >> >> > branch-2. Also, a release manager can pick whatever fixes she wants,
> >> >> > so even if a contributor doesn't commit it on branch-2, a release
> >> >> > manager who wants to do a release containing a set of fixes can
> >> >> > always do so.
> >> >> >
> >> >> > - *Harder cherry-picks between master and branch-2*.
> >> >> > That is certainly possible. But the hope is that we keep branch-2
> >> >> > stable, so we don't backport large features which may run into this
> >> >> > issue. Smaller focussed bug fix backports should be possible.
> >> >> >
> >> >> > - *Removal of MR2 on the master branch*.
> >> >> > This is something I personally would like to see. But exact timing of
> >> >> > it will be decided by community. I am certainly not saying that as
> >> >> > soon as branch-2 is created, let's remove MR2 on master.
> >> >> >
> >> >> > I would also say that in the end ASF is a volunteer organization, we
> >> >> > can't force people to adopt one branch or another. It's up to the
> >> >> > contributors what jiras they work on and when and where they commit
> >> >> > it.
> >> >> > By not creating a branch-2, the only thing we can guarantee is that
> >> >> > the rate of development on master will remain slow, because we don't
> >> >> > want to start doing backward incompatible changes without explicitly
> >> >> > acknowledging that.
> >> >> >
> >> >> > Thanks,
> >> >> > Ashutosh
> >> >> >
> >> >> > On Thu, Mar 9, 2017 at 12:01 PM, Sergio Pena <sergio.p...@cloudera.com>
> >> >> > wrote:
> >> >> >
> >> >> >> Hey Ashutosh, thanks for soliciting feedback on this.
> >> >> >>
> >> >> >> I like the idea you're proposing; maintaining compatibility and at
> >> >> >> the same time adding newer features to Hive consumes a lot of
> >> >> >> development time and effort.
> >> >> >>
> >> >> >> However, I think some users and companies have just started to use
> >> >> >> Hive 2.x branch as their main major upgrade on Hive (possibly due to
> >> >> >> waiting for stabilization and testing upgrades), but cutting this
> >> >> >> major branch that just has 1 year of life might make us look like we
> >> >> >> will forget about the quality of Hive 2.x as we did with branch-1.
> >> >> >>
> >> >> >> Hive 1.x latest version was 1.2, and its development stopped because
> >> >> >> of new features on Hive 2.x
> >> >> >> Hive 2.x latest version is 2.1, and we want to create Hive 3.x
> >> >> >> because of newer features and incompatibilities.
> >> >> >> Will Hive 3.x have the same future after 3.1 is released?
> >> >> >>
> >> >> >> What I'm also concerned about are these three things:
> >> >> >>
> >> >> >> - *Branch-2 quality commitment*.
> >> >> >> How will we keep the community motivated on fixing both master and
> >> >> >> branch-2?
> >> >> >> - *Harder cherry-picks between master and branch-2*.
> >> >> >> Because master will be incompatible by nature, cherry-picks to
> >> >> >> branch-2 will be harder.
> >> >> >> - *Removal of MR2 on the master branch*.
> >> >> >> This was marked as deprecated just last year, but MR2 is still an
> >> >> >> engine that is used by several users.
> >> >> >>
> >> >> >> I accept that the end of life of major versions will come at some
> >> >> >> point, and these concerns will expire,
> >> >> >> but Hive 2.x is kind of young, isn't it?
> >> >> >>
> >> >> >> Should we try to stabilize the Hive 2.x line first, and have a few
> >> >> >> more releases before starting to work on Hive 3.0?
> >> >> >> Should we add more test coverage to Hive jenkins jobs to validate
> >> >> >> Hive 2.x quality?
> >> >> >> Should we agree on a date about when we should drop community
> >> >> >> support on Hive versions to let users know about this?
> >> >> >>
> >> >> >> Again, I like your proposal, but I'm afraid that users who just
> >> >> >> upgraded to 2.x won't have any more features and improvements
> >> >> >> because they will be developed on 3.0.
> >> >> >>
> >> >> >> - Sergio
> >> >> >>
> >> >> >> On Mon, Mar 6, 2017 at 1:24 PM, Ashutosh Chauhan
> >> >> >> <ashutosh.chau...@gmail.com> wrote:
> >> >> >>
> >> >> >> > The way it helps shedding debt is because dev can now do
> >> >> >> > refactoring without fear of breaking some rarely used features.
> >> >> >> > The way that helps in adding features faster is that, since the
> >> >> >> > codebase is lean and easier to reason about, it's much easier to
> >> >> >> > add new features.
> >> >> >> >
> >> >> >> > More importantly though, it also helps users because we are
> >> >> >> > setting the expectation from dev community. They can expect
> >> >> >> > future releases of 2.x to be backward compatible. At the same
> >> >> >> > time whenever they decide to upgrade they only need to test their
> >> >> >> > application once against 3.x as opposed to continuous breakage of
> >> >> >> > one form or another if we continue to make incompatible changes
> >> >> >> > in master without branching for 2.x
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> > Ashutosh
> >> >> >> >
> >> >> >> > On Sat, Mar 4, 2017 at 10:19 AM, Edward Capriolo
> >> >> >> > <edlinuxg...@gmail.com> wrote:
> >> >> >> >
> >> >> >> > > Also i dont follow how we remove
> >> >> >> > >
> >> >> >> > > On Saturday, March 4, 2017, Edward Capriolo
> >> >> >> > > <edlinuxg...@gmail.com> wrote:
> >> >> >> > >
> >> >> >> > > > On Fri, Mar 3, 2017 at 8:46 PM, Thejas Nair
> >> >> >> > > > <thejas.n...@gmail.com> wrote:
> >> >> >> > > >
> >> >> >> > > >> +1
> >> >> >> > > >> There are some features that are incomplete and that I
> >> >> >> > > >> would not recommend for any real production use. The
> >> >> >> > > >> 'legacy authorization mode' is a great example of that -
> >> >> >> > > >> https://cwiki.apache.org/confluence/display/Hive/Hive+Default+Authorization+-+Legacy+Mode
> >> >> >> > > >> . It is an inherently insecure mode that nobody should be
> >> >> >> > > >> using.
> >> >> >> > > >>
> >> >> >> > > >> There is also potential for cleanup of the thrift api.
> >> >> >> > > >> However, there are many users of this api, we would need to
> >> >> >> > > >> go the deprecation then remove after couple of releases
> >> >> >> > > >> route or so for that.
> >> >> >> > > >>
> >> >> >> > > >> I am sure there are many other candidates. We will have to
> >> >> >> > > >> evaluate each of those features on the risk/benefit of
> >> >> >> > > >> keeping them and arriving at a decision.
> >> >> >> > > >>
> >> >> >> > > >> Also, +1 on getting a 2.2 release out before we branch.
> >> >> >> > > >>
> >> >> >> > > >> On Fri, Mar 3, 2017 at 1:50 PM, Ashutosh Chauhan
> >> >> >> > > >> <hashut...@apache.org> wrote:
> >> >> >> > > >>
> >> >> >> > > >> > Hi all,
> >> >> >> > > >> >
> >> >> >> > > >> > Hive project has come a long way. With wide-spread
> >> >> >> > > >> > adoption also comes expectations. Expectation of being
> >> >> >> > > >> > backward compatible and not breaking things. However that
> >> >> >> > > >> > doesn't come free of cost and results in lot of legacy
> >> >> >> > > >> > code which can't be refactored without fear of breaking
> >> >> >> > > >> > things. As a result project has accumulated lot of debt
> >> >> >> > > >> > over time.
> >> >> >> > > >> > At the same time there are also lot of features which
> >> >> >> > > >> > have seen little uptake. We may want to drop some of
> >> >> >> > > >> > those.
> >> >> >> > > >> >
> >> >> >> > > >> > In order to move forward and shed that debt we may need a
> >> >> >> > > >> > major version release which allows us to make backward
> >> >> >> > > >> > incompatible changes and drop rarely used features. At
> >> >> >> > > >> > the same time there are lots of users which are consuming
> >> >> >> > > >> > currently released 2.1, 2.2 branches and expect them to
> >> >> >> > > >> > stay on it for some time. So, I propose that we create
> >> >> >> > > >> > branch-2 from current tip and do future 2.x releases from
> >> >> >> > > >> > that branch and keep it backward compatible. This will
> >> >> >> > > >> > allow devs to land breaking changes on master and pave
> >> >> >> > > >> > way to release hive 3.0 in future.
> >> >> >> > > >> >
> >> >> >> > > >> > Of course, each specific incompatible change and feature
> >> >> >> > > >> > drop even on master needs to be evaluated on its own
> >> >> >> > > >> > merit on corresponding jira. This email is just a
> >> >> >> > > >> > solicitation of feedback for creating branch-2 and
> >> >> >> > > >> > allowing breaking changes in master. Thoughts?
> >> >> >> > > >> >
> >> >> >> > > >> > Thanks,
> >> >> >> > > >> > Ashutosh
> >> >> >> > > >> >
> >> >> >> > > >>
> >> >> >> > > >
> >> >> >> > > > One of the challenges of the developers conducting the
> >> >> >> > > > risk-benefit analysis is that the developers are mostly
> >> >> >> > > > focused on new features, but there are deployments of hive
> >> >> >> > > > that are 5+ years old and people that rely on the features
> >> >> >> > > > are not on the mailing list.
> >> >> >> > > >
> >> >> >> > > > For example I developed and use this frequently:
> >> >> >> > > >
> >> >> >> > > > https://community.hortonworks.com/articles/8861/apache-hive-groovy-udf-examples.html
> >> >> >> > > >
> >> >> >> > > > My career went away from hive for a while. I was quite
> >> >> >> > > > surprised to find out that, in the cli->beeline move, it was
> >> >> >> > > > more or less decided not to port it. I learned of this the
> >> >> >> > > > first time I was forced to work in a hive server only
> >> >> >> > > > environment and it did not work.
> >> >> >> > > >
> >> >> >> > > > Now I have to go and spend time adding this back so I don't
> >> >> >> > > > have to work around it not being there.
> >> >> >> > > >
> >> >> >> > > > What we should do/continue doing is making code that is
> >> >> >> > > > modular; we need to break hard dependencies like ThriftSerde
> >> >> >> > > > or OrcSerde being "native" and having to be linked to the
> >> >> >> > > > metastore, and move them out into proper submodules. There
> >> >> >> > > > is too much code that only works for one implementation of a
> >> >> >> > > > serde etc.
> >> >> >> > > >
> >> >> >> > >
> >> >> >> > > I would like a timeline to understand this.
> >> >> >> > > It sounds as if master is not releasable currently, so already
> >> >> >> > > broken in a way. We make a branch and aggressively break it
> >> >> >> > > more?
> >> >> >> > >
> >> >> >> > > I'm not following how this branching policy makes adding
> >> >> >> > > features faster or how it helps shed debt faster.
> >> >> >> > >
> >> >> >> > > --
> >> >> >> > > Sorry this was sent from mobile. Will do less grammar and spell
> >> >> >> > > check than usual.