Thanks, David. FWIW, Uber is still running Hive on Spark (2.3.4) at a very large scale in production right now, and I don't think we have any plans to change that soon.

On Tue, Jul 21, 2020 at 11:28 AM David <dam6...@gmail.com> wrote:

> Hello,
>
> Thanks for the feedback.
>
> Just a quick recap: I did propose this @dev and I received unanimous +1's
> from the community. After a couple months, I created the PR.
>
> Certainly open to discussion, but there hasn't been any discussion thus far
> because there have been no objections until this point.
>
> HoS has low adoption, heavy technical debt, and the manner in which its
> build process is setup is impeding some other work that is not even related
> to HoS.
>
> We can deprecate in Hive 3.x and remove in Hive 4.x. The plan would be to
> use Tez moving forward.
>
> My point about the vendor's move to Tez is that HoS adoption is very low,
> it's only going lower, and while I don't know the specifics of it, there
> must be some migration plan in place there (i.e., it must be possible to do
> it already).
>
> Thanks,
> David
>
> On Tue, Jul 21, 2020 at 12:23 PM Xuefu Zhang <xu...@apache.org> wrote:
>
> > Hi David,
> >
> > While a vendor may not support a component in an open source project,
> > removing it or not is a decision by and for the community. I certainly
> > understand that the vendor you mentioned has contributed a great deal
> > (including my personal effort while working there), it's not up to the
> > vendor to make a call like what is proposed here.
> >
> > As a community, we should have gone through a thorough discussion and
> > reached a consensus before actually making such a big change, in my
> > opinion.
> >
> > Thanks,
> > Xuefu
> >
> > On Tue, Jul 21, 2020 at 8:49 AM David <dam6...@gmail.com> wrote:
> >
> > > Hey,
> > >
> > > Thanks for the input.
> > >
> > > FYI. Cloudera (Cloudera + Hortonworks) have removed HoS from their
> > > latest offering.
> > >
> > > "Tez is now the only supported execution engine, existing queries that
> > > change execution mode to Spark or MapReduce within a session, for
> > > example, fail."
> > >
> > > https://docs.cloudera.com/cdp/latest/upgrade-post/topics/ug_hive_configuration_changes.html
> > >
> > > So I don't know who will be supporting this feature moving forward, but
> > > there has been a lot of work done to make this change as painless as
> > > possible. Simply set the engine to 'tez' and remove the HoS-related
> > > settings should address many use cases.
> > >
> > > Thanks.
> > >
> > > On Tue, Jul 21, 2020 at 11:36 AM Xuefu Z <usxu...@gmail.com> wrote:
> > >
> > > > Sorry for chiming in late. However, I don't think we should remove
> > > > Hive on Spark just because of a technical problem. This is rather a
> > > > big decision that we need to be careful about. There are users that
> > > > will be left high and dry by this move.
> > > >
> > > > If the community decides to desupport and eventually remove it, I
> > > > think we need to have a due process. We also need a deprecation plan
> > > > if that's we decide to do. Before that, I'm -1 on this proposal.
> > > >
> > > > Thanks,
> > > > Xuefu
> > > >
> > > > On Tue, Jul 21, 2020 at 7:57 AM David <dam6...@gmail.com> wrote:
> > > >
> > > > > Hello Team,
> > > > >
> > > > > https://github.com/apache/hive/pull/1285
> > > > >
> > > > > Thanks.
> > > > >
> > > > > On Wed, Jun 3, 2020 at 11:49 PM Gopal V <gop...@apache.org> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Cheers,
> > > > > > Gopal
> > > > > >
> > > > > > On 6/3/20 7:48 PM, Jesus Camacho Rodriguez wrote:
> > > > > > > +1
> > > > > > >
> > > > > > > -Jesús
> > > > > > >
> > > > > > > On Wed, Jun 3, 2020 at 1:58 PM Alan Gates <alanfga...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> +1.
> > > > > > >>
> > > > > > >> Alan.
> > > > > > >>
> > > > > > >> On Wed, Jun 3, 2020 at 1:40 PM Prasanth Jayachandran
> > > > > > >> <pjayachand...@cloudera.com.invalid> wrote:
> > > > > > >>
> > > > > > >>> +1
> > > > > > >>>
> > > > > > >>>> On Jun 3, 2020, at 1:38 PM, Ashutosh Chauhan
> > > > > > >>>> <hashut...@apache.org> wrote:
> > > > > > >>>>
> > > > > > >>>> +1
> > > > > > >>>>
> > > > > > >>>> On Wed, Jun 3, 2020 at 1:23 PM David Mollitor
> > > > > > >>>> <dam6...@gmail.com> wrote:
> > > > > > >>>>
> > > > > > >>>>> Hello Gang,
> > > > > > >>>>>
> > > > > > >>>>> I have spent some time working on upgrading Avro (far less
> > > > > > >>>>> than others):
> > > > > > >>>>>
> > > > > > >>>>> https://issues.apache.org/jira/browse/HIVE-21737
> > > > > > >>>>>
> > > > > > >>>>> This should be a relatively easy thing to do, but is blocked
> > > > > > >>>>> by Hive-on-Spark. HoS has a weird thing where it downloads
> > > > > > >>>>> some cloud-storage-hosted file of Spark-Hadoop as part of
> > > > > > >>>>> its maven run.
> > > > > > >>>>>
> > > > > > >>>>> Since HoS is not going to receive updates from the major
> > > > > > >>>>> vendors, is it time to simply remove it?
> > > > > > >>>>>
> > > > > > >>>>> Tests are currently disabled:
> > > > > > >>>>> https://issues.apache.org/jira/browse/HIVE-23137
> > > > > > >>>>>
> > > > > > >>>>> Thanks.
> > > >
> > > > --
> > > > Xuefu Zhang
> > > >
> > > > "In Honey We Trust!"