Re: Any plan for new hive 3 or 4 release?

2021-02-23 Thread Mass Dosage
I would love to see a Hive 3.1 release which is capable of being used on
Java 11 like Hive 2 is.

What is the main difference going to be between Hive 3 and 4? The removal
of MR?

On Mon, 22 Feb 2021 at 16:46, Zoltan Haindrich  wrote:

> Hey Michel!
>
> Yes, it has been a long time since we had a release; we have quite a few new
> features in master.
> I think we have been scaring people for some time now that we will be
> dropping MR support...I think we should do that.
>
> I would really like to see a new Hive release in the near future as well -
> there is no way for users to even try out new features.
> I was planning to add nightly builds to package the latest master's state
> into a deployable artifact - I think a service like that may help pretest our
> next release; I think it won't take much to do it so I'll probably throw it
> together in the next couple days!
>
> cheers,
> Zoltan
>
> On 2/21/21 2:27 PM, Michel Sumbul wrote:
> > Hi Guys,
> >
> > If I'm not wrong, the last release of Hive 3.x is 18 months old.
> > I wanted to ask if you had any roadmap / plan to release a new version of
> > Hive 3.x or Hive 4?
> >
> > Thanks,
> > Michel
> >
>


Re: [EXTERNAL] Hive meetup

2021-02-23 Thread Mass Dosage
I'm interested, I'd like to propose talking about future releases and
making these more regular as well as the absolute pain that the Hive build
is with all its flaky unit tests. I know some work has been done on this in
the past but I think it's a huge barrier to new developers, especially
casual ones who want to fix a small bug but can never get all the tests to
pass. Hive-Iceberg is another good topic.

On Tue, 23 Feb 2021 at 11:20, Peter Vary  wrote:

> +1 for the meetup
>
> If the team is interested, we can talk about Hive-Iceberg integration
>
> Thanks,
> Peter
>
> > On Feb 23, 2021, at 04:34, Aasha  wrote:
> >
> > +1
> >
> >> On 22-Feb-2021, at 11:54 PM, Matt McCline wrote:
> >>
> >> Definitely interested.
> >>
> >> -----Original Message-----
> >> From: Zoltan Haindrich 
> >> Sent: Monday, February 22, 2021 10:17 AM
> >> To: dev@hive.apache.org
> >> Subject: [EXTERNAL] Hive meetup
> >>
> >> Hey All!
> >>
> >> It was quite some time ago when we had a meetup - and in these covid
> >> times it would be online-only anyway :) We were mentioning this lately
> >> here and there at Cloudera.
> >> I think we could have a few talks spanning 2-3 hours or so.
> >>
> >> Is there any interest in it?
> >>
> >> I would be happy to talk about how hive-test-kube works and how
> >> hive-dev-box is employed during testing.
> >>
> >> cheers,
> >> Zoltan
>
>


Re: Hive meetup

2021-03-02 Thread Mass Dosage
I have a strong preference for the week delay to the 17th you suggest as
the 10th is my daughter's birthday ;)

Thanks,

Adrian

On Tue, 2 Mar 2021 at 11:53, Zoltan Haindrich  wrote:

> Hey All!
>
> Cool I think then we could mark 1700 UTC as a starting time!
>
> I don't know if March 10 is still good for everyone (especially for
> those who will be presenting).
> To avoid rushing things I might be inclined to postpone it by one week -
> to March 17.
>
> To assemble a schedule/etc I've opened a drive document - feel free to
> extend/update/etc!
>
> https://docs.google.com/document/d/12jaWa7e6jvVjUaxoMWNJcjvTjnNoqwdCAMyswY1OiUg/edit?usp=sharing
>
> After around Thursday - I'll try to post the resulting document to
> meetup.com if possible :)
>
> cheers,
> Zoltan
>
> On 2/25/21 4:08 PM, Sankar Hariappan wrote:
> > Thanks Zoltan for the initiative!
> > UTC 5 pm should work for India folks.
> >
> > Thanks,
> > Sankar
> >
> > -----Original Message-----
> > From: Stamatis Zampetakis 
> > Sent: 25 February 2021 20:35
> > To: dev 
> > Subject: [EXTERNAL] Re: Hive meetup
> >
> > Great initiative Zoltan!
> >
> > I would be very happy to learn more about Hive and the subjects outlined
> > so far seem very interesting.
> >
> > Looking forward,
> > Stamatis
> >
> > On Thu, Feb 25, 2021 at 9:51 AM Zoltan Haindrich  wrote:
> >
> >> Hey All!
> >>
> >> Thank you for responding back - It's great to see that there is demand
> >> for this thing :)
> >>
> >>   > However in one of the next ones we could have talks about particular
> >>   > features, integrations, improvements etc.
> >>   > Would be more than happy to talk about the latest ORC upgrade and Hive's
> >>   > lazy decoding feature.
> >>
> >> That would be great! I don't think the current content is already
> >> fixed...but I think it's important to not overload the meetup with
> >> content - because this will be a meeting after a long time I think
> >> there will be some open discussions around some further topics as well...
> >> So as you've suggested: I think making it more frequent would be
> >> really valuable.
> >>
> >> Does anyone have any preference on the date?  I just picked March 10
> >> but I don't have any hard preference.
> >>
> >> Setting the time would be more difficult - because this will be a 2-3
> >> hour session and we seem to be scattered all around the globe...
> >> I was thinking UTC 5pm or 6pm
> >>
> >> https://www.timeanddate.com/worldclock/meetingtime.html?iso=20210310&p1=50&p2=137&p3=136&p4=70&p5=176
> >>
> >> Please feel free to suggest other dates/times/etc!
> >>
> >> cheers,
> >> Zoltan
> >>
> >> On 2/24/21 10:57 AM, Panos Garefalakis wrote:
> >>> Great idea indeed! I wish we could make these somewhat regular.
> >>> Seems like we already have a few topics to discuss:
> >>> * New testing infra
> >>> * Iceberg integration
> >>> * Release process
> >>>
> >>>
> >>> Cheers,
> >>> Panagiotis

Re: HIVE-2.4 release plans

2020-01-03 Thread Mass Dosage
+1 for this, or for a Hive 2.3.7 release. We are blocked from releasing
some of our projects which use Hive 2.3.x on Java >8 due to
https://issues.apache.org/jira/browse/HIVE-21508 which we helped get merged
but it hasn't been released yet. Similarly we'd like to be able to use some
Parquet related functionality which didn't work but is now fixed via
https://issues.apache.org/jira/browse/HIVE-22249 and also merged and ready
to be released.

Thanks,

Adrian

On Wed, 11 Dec 2019 at 15:25, Oleksiy S wrote:

> Hi all.
>
> Are there any plans for Hive-2.4 release?
>
> --
> Oleksiy
>


Re: HIVE-21508 and Hive 2.3.7 question

2020-02-11 Thread Mass Dosage
+1.

At Expedia Group we are big users of Hive and are also experiencing issues
with not being able to use Hive 2.3.x on Java >8 which is starting to
seriously impact some of our applications which require Java 11. We worked
on HIVE-21508 in order to get it merged into the various branches and have
been asking for a Hive 2.3.7 release for months with no replies to our
questions on this mailing list.

Could someone from the Hive community please answer and let us know if
there is the possibility of a Hive 2.3.7 release? I've seen at least two
other requests for this on the list over the past few months.

If not we will be forced to fork the current 2.3 branch and release our own
version of Hive 2.3.7 to Maven Central (with a different group id) so that
we can use it (it sounds like this would be useful to others out there
too). We'd really rather not do this but I don't see any other solutions.

Thanks,

Adrian
-- 
Adrian Woodhead
Principal Engineer
Expedia Group - 407 St John Street, London, EC1V 4EX


On Thu, 30 Jan 2020 at 07:34, Hyukjin Kwon  wrote:

> Hi Hive dev team,
>
> As informed earlier, I, Yuming and many people from Spark dev have made
> huge efforts to let Spark use an official Hive release. Thanks Alan and all
> Hive dev for all the efforts for Hive 2.3.6 to make Spark support JDK 11.
>
> A few months ago, an unexpected problem was found. Spark throws
> ClassCastException when initializing HiveMetaStoreClient.
> Please see SPARK-29245 <https://issues.apache.org/jira/browse/SPARK-29245>
> for more details. This has been fixed by HIVE-21508
> <https://issues.apache.org/jira/browse/HIVE-21508>.
> We postponed the Hive release request to the Spark code freeze schedule to
> avoid multiple requests.
>
> Spark is going to freeze code on 31st January (tomorrow), and I currently
> foresee the RC starting around March. So, this will hopefully be the last
> request for a Hive release for Spark 3.0.
>
> I was wondering if we could release Hive 2.3.7 soon so Spark can use it.
>
> Thanks.
>


Re: HIVE-21508 and Hive 2.3.7 question

2020-03-31 Thread Mass Dosage
Hey all,

We've made some progress on this and are getting closer to a 2.3.7 release.
Alan has identified 2 tests failing on the 2.3 branch that are fixed in
newer versions of Hive so he is proposing to backport the fixes for them.
The ticket for that is https://issues.apache.org/jira/browse/HIVE-23086 if
you want to watch it and vote it up. Hopefully we can get that merged soon
and then we'll be good to go.

Thanks,

Adrian

On Sun, 8 Mar 2020 at 02:41, Hyukjin Kwon  wrote:

> Thank you so much, Alan and all.
>
> On Sun, Mar 8, 2020 at 10:36 AM, Yuming Wang wrote:
>
>> Great, thank you Alan and Adrian.
>>
>> On Sun, Mar 8, 2020 at 8:13 AM Alan Gates  wrote:
>>
>>> I'm working with Adrian on getting a 2.3.7 release out.  That will pick
>>> up everything that is already on the 2.3 branch.
>>>
>>> Alan.
>>>
>>> On Sat, Mar 7, 2020 at 6:02 AM Yuming Wang  wrote:
>>>
>>>> Hi Alan and Owen,
>>>>
>>>> Are there any plans to release Hive 2.3.7 or Hive 2.4.0? It may be the
>>>> only one that supports Java 11. Hive 3.x can not support it because of
>>>> HIVE-22097 <https://issues.apache.org/jira/browse/HIVE-22097>.


Re: HIVE-21508 and Hive 2.3.7 question

2020-04-01 Thread Mass Dosage
I think, given that we're so close to potentially cutting a 2.3.7 release
(see Alan's separate post to the mailing list) that we shouldn't add
anything else at this stage. This could potentially be of interest for
a 2.3.8 or 2.4.0 release if the rest of the Hive community agrees.

Thanks,

Adrian

On Tue, 31 Mar 2020 at 13:24, David Mollitor  wrote:

> Hello Team,
>
> Just to throw one more thing in there, a while ago I put a good chunk of
> time into shoring up the ZK Lock Manager because I worked with a lot of
> folks on locking issues. HDP/CLDR moved away from ZK and is using an RDBMS
> and therefore never paid it much mind. Any interest in rolling it into Hive
> 2?
>
> HIVE-21469
>


[jira] [Created] (HIVE-15965) Metastore incorrectly re-uses a broken database connection

2017-02-17 Thread Mass Dosage (JIRA)
Mass Dosage created HIVE-15965:
--

         Summary: Metastore incorrectly re-uses a broken database connection
             Key: HIVE-15965
             URL: https://issues.apache.org/jira/browse/HIVE-15965
         Project: Hive
      Issue Type: Bug
      Components: Metastore
Affects Versions: storage-2.2.0
        Reporter: Mass Dosage


*Background*
In our setup we have a shared standalone MetaStore server running on EMR that 
is accessed by various clients (Hive CLI, HiveServer2, Spark etc.) and connects 
to an external MariaDB database for the MetaStore DB. It came to our attention 
that MetaStore (or rather the underlying DataNucleus / BoneCP combo) will keep 
re-using the same DB connections even when those get suddenly closed for a 
reason that renders them unusable.

For instance, due to a bug in the MariaDB JDBC driver v1.3.6 (see
https://jira.mariadb.org/browse/CONJ-270), a huge query including over 8
thousand parameter placeholders (e.g. partition IDs in case of a
{{get_partitions_by_expr}} function call) will yield a
{{java.nio.BufferOverflowException}} and cause the SQL connection to be
closed by the driver itself.

This will ultimately result in the abortion of all further MetaStore Thrift 
calls due to the failure of {{bonecp.ConnectionHandle.prepareStatement()}}.

Such scenarios will then be caught by DataNucleus and translated to an
appropriate {{JDOException}}, only to be "ignored" by the MetaStore.
{{RetryingHMSHandler}} will, of course, continue retrying the failing
operation, but this is already pointless by that time since the retries will
invariably fail as long as the SQL connection remains closed. Please see the
attached MetaStore log [^hive.log] for details (captured from Hive 2.1.1
running on Windows in Eclipse IDE).

 *Proposed behavior*

We suggest that MetaStore should automatically renew the DB connection whenever:

* The connection gets closed by one of the underlying frameworks (DataNucleus, 
BoneCP, JDBC driver); or
* Query timeout is detected.

This feature should be optional and configurable (disabled by default for 
backward compatibility). Reconnection failures could probably be treated as 
fatal errors and cause the immediate termination of MetaStore.
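
To make the proposal concrete, here is a rough sketch of the renewal logic in
plain Java. It is only an illustration under stated assumptions: {{Conn}},
{{RenewingExecutor}} and {{FakeConn}} are hypothetical names invented for this
example and are not Hive, DataNucleus or BoneCP classes. It covers only the
"connection was closed" case; query timeout detection is left out.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a pooled DB connection; not a real Hive or BoneCP type. */
interface Conn {
    boolean isClosed();
    String query(String sql);
}

/** Runs queries and, if enabled, swaps in a fresh connection when the current one is closed. */
final class RenewingExecutor {
    private final Supplier<Conn> factory; // opens a new connection
    private final boolean renewEnabled;   // the proposal suggests "disabled" as the default
    private Conn current;
    private int renewals = 0;

    RenewingExecutor(Supplier<Conn> factory, boolean renewEnabled) {
        this.factory = factory;
        this.renewEnabled = renewEnabled;
        this.current = factory.get();
    }

    String run(String sql) {
        if (current.isClosed()) {
            if (!renewEnabled) {
                // Without renewal, every retry keeps hitting the dead connection.
                throw new IllegalStateException("connection is closed and renewal is disabled");
            }
            try {
                current = factory.get();
            } catch (RuntimeException e) {
                // The proposal treats a failed reconnection as a fatal error.
                throw new IllegalStateException("fatal: could not renew DB connection", e);
            }
            renewals++;
        }
        return current.query(sql);
    }

    int renewals() {
        return renewals;
    }
}

/** Test double whose connection can be closed on demand. */
final class FakeConn implements Conn {
    private boolean closed = false;

    void close() {
        closed = true;
    }

    @Override
    public boolean isClosed() {
        return closed;
    }

    @Override
    public String query(String sql) {
        if (closed) {
            throw new IllegalStateException("closed");
        }
        return "ok:" + sql;
    }
}
```

With renewal enabled, a query issued after the connection has been closed
transparently gets a new connection; with it disabled, the call fails, which
roughly mirrors the behaviour described in the background section.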




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)