Re: Should we consider Spark3 support for Hive on Spark

2022-08-24 Thread Jan Fili
Yes exactly.

This is what is recommended, because Hive on Spark attracts little interest these days.
However, there is nothing preventing you from doing it.

It is important to me because I sit here doing grassroots work on marrying Hive onto
Kafka Streams.

Owen O'Malley  wrote on Wed., 24 Aug 2022, 18:51:

> Hive on Spark is not recommended. The recommended path is to use either
> Tez or LLAP. If you already are using Spark 3, it would be far easier to
> use Spark SQL.
>
> .. Owen
>
> On Wed, Aug 24, 2022 at 3:46 AM Fred Bai 
> wrote:
>
>> Hi everyone:
>>
>> Do we have any support for Hive on Spark? I need Hive on Spark, but my
>> Spark version is 3.X.
>>
>> I found that Hive is incompatible with Spark 3, and I had to modify a lot of
>> code to make it compatible.
>>
>> Has Hive on Spark been deprecated?
>>
>> Also, Hive on Spark is very slow when jobs execute.
>>
>


Re: Should we consider Spark3 support for Hive on Spark

2022-08-24 Thread Owen O'Malley
Hive on Spark is not recommended. The recommended path is to use either Tez
or LLAP. If you already are using Spark 3, it would be far easier to use
Spark SQL.
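
For anyone taking that path, here is a minimal sketch of what the Spark SQL
route can look like (assumptions: Spark 3 is already pointed at the Hive
metastore, e.g. hive-site.xml is on Spark's conf path; the database/table
names and the Thrift Server host/port below are placeholders):

# run the same HiveQL directly through the Spark SQL CLI
spark-sql -e "SELECT date_key, COUNT(*) FROM some_db.some_table GROUP BY date_key"

# or go through the Spark Thrift Server with beeline
beeline -u "jdbc:hive2://sts-host:10001/default" -e "SELECT COUNT(*) FROM some_db.some_table"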

.. Owen

On Wed, Aug 24, 2022 at 3:46 AM Fred Bai  wrote:

> Hi everyone:
>
> Do we have any support for Hive on Spark? I need Hive on Spark, but my
> Spark version is 3.X.
>
> I found that Hive is incompatible with Spark 3, and I had to modify a lot of
> code to make it compatible.
>
> Has Hive on Spark been deprecated?
>
> Also, Hive on Spark is very slow when jobs execute.
>


Re: Should we consider Spark3 support for Hive on Spark

2022-08-24 Thread hernan saab via user
Do you honestly believe that a non-Apache community dev can just fork Hive,
modify the code, and make it work with any version of Spark? Is that what you
are suggesting? Please let us know if that is the case.


Sent from Yahoo Mail for iPad


On Wednesday, August 24, 2022, 6:13 AM, Jan Fili  wrote:

Can always fork to get things going ;)

*sorry for spam*

On Wed., 24 Aug 2022 at 06:34, hernan saab via user  wrote:
>
>
> Hey Fred,
>
> Contrary to what you may perceive from the Hive docs, what you are trying to
> do is not plug-and-play.
> Only Apache committers can do what you are trying to do.
> Use canned solutions such as confluence or AWS EMR and save yourself weeks of
> wasted effort.
>
> Hernán
> On Tuesday, August 23, 2022 at 08:46:30 PM PDT, Fred Bai 
>  wrote:
>
>
> Hi everyone:
>
> Do we have any support for Hive on Spark? I need Hive on Spark, but my
> Spark version is 3.X.
>
> I found that Hive is incompatible with Spark 3, and I had to modify a lot of
> code to make it compatible.
>
> Has Hive on Spark been deprecated?
>
> Also, Hive on Spark is very slow when jobs execute.





Re: Should we consider Spark3 support for Hive on Spark

2022-08-24 Thread Jan Fili
Can always fork to get things going ;)

*sorry for spam*

On Wed., 24 Aug 2022 at 06:34, hernan saab via user  wrote:
>
>
> Hey Fred,
>
> Contrary to what you may perceive from the Hive docs, what you are trying to
> do is not plug-and-play.
> Only Apache committers can do what you are trying to do.
> Use canned solutions such as confluence or AWS EMR and save yourself weeks of
> wasted effort.
>
> Hernán
> On Tuesday, August 23, 2022 at 08:46:30 PM PDT, Fred Bai 
>  wrote:
>
>
> Hi everyone:
>
> Do we have any support for Hive on Spark? I need Hive on Spark, but my
> Spark version is 3.X.
>
> I found that Hive is incompatible with Spark 3, and I had to modify a lot of
> code to make it compatible.
>
> Has Hive on Spark been deprecated?
>
> Also, Hive on Spark is very slow when jobs execute.


Re: Should we consider Spark3 support for Hive on Spark

2022-08-23 Thread hernan saab via user
 
Hey Fred,
Contrary to what you may perceive from the Hive docs, what you are trying to do
is not plug-and-play. Only Apache committers can do what you are trying to do.
Use canned solutions such as confluence or AWS EMR and save yourself weeks of
wasted effort.

Hernán

On Tuesday, August 23, 2022 at 08:46:30 PM PDT, Fred Bai wrote:

Hi everyone:

Do we have any support for Hive on Spark? I need Hive on Spark, but my Spark
version is 3.X.

I found that Hive is incompatible with Spark 3, and I had to modify a lot of
code to make it compatible.

Has Hive on Spark been deprecated?

Also, Hive on Spark is very slow when jobs execute.

Should we consider Spark3 support for Hive on Spark

2022-08-23 Thread Fred Bai
Hi everyone:

Do we have any support for Hive on Spark? I need Hive on Spark, but my
Spark version is 3.X.

I found that Hive is incompatible with Spark 3, and I had to modify a lot of
code to make it compatible.

Has Hive on Spark been deprecated?

Also, Hive on Spark is very slow when jobs execute.


Re: Time to Remove Hive-on-Spark

2022-04-12 Thread Peter Vary
+1 from my side too.

I have created a PR against the current branch.
It still needs some work, and as many reviews as possible, because it is quite
big and I might have made some mistakes:
https://issues.apache.org/jira/browse/HIVE-26134
https://github.com/apache/hive/pull/3201

Thanks,
Peter

On Thu, 10 Feb 2022 at 17:43, Zoltan Haindrich  wrote:

> Hey,
>
> I think there is no real interest in this feature; we don't have
> users/contributors backing it - the last development was around October 2018,
> and there have been ~2 bugfix commits ever
> since... we should stop carrying dead weight... another 2 weeks went by
> since Stamatis reminded us that after 1.5 years(!) nothing has
> changed.
>
> +1 on removing it
>
> cheers,
> Zoltan
>
> you may inspect some of the recent changes with:
> git log -c `find . -type f -path '**/spark/**'|grep -v xml|grep -v
> properties|grep -v q.out`
>
>
> On 1/28/22 2:32 PM, Stamatis Zampetakis wrote:
> > Hi team,
> >
> > Almost one year has passed since the last exchange in this discussion and
> > if I am not wrong there has been no effort to revive Hive-on-Spark. To be
> > more precise, I don't think I have seen any Spark related JIRA for quite
> > some time now and although I don't want to rush into conclusions, there
> > does not seem to be any community member involved in maintaining or adding
> > new features in this part of the code.
> >
> > Keeping dead code in the repository does not do any good to the project and
> > puts a non-negligible burden to future maintainers.
> >
> > Clearly, we cannot make a new Hive release where a major feature is
> > completely untested so either someone commits to re-enable/fix the
> > respective tests soon or we move forward the work started by David and drop
> > support for Hive-on-Spark.
> >
> > I would like to ask the community if there is anyone who can take up this
> > maintenance task and enable/fix Spark related tests in the next month or so?
> >
> > Best,
> > Stamatis
> >
> > On Sat, Feb 27, 2021 at 4:17 AM Edward Capriolo 
> > wrote:
> >
> >> I do not know how it works for most of the world. But in Cloudera, where
> >> the Tez options were never popular, hive-on-spark represents a solid way
> >> to get things done with lower latency for small datasets.
> >>
> >> As for the Spark adoption: you know, a while ago I came up with some ways
> >> to make Hive more Spark-like. One of them was that I found a way to make
> >> "compile" a Hive keyword so folks could build UDFs on the fly. It was such
> >> an uphill climb. Folks found a way to make it disabled by default for
> >> security. Then later, when things moved from the CLI to Beeline, it was
> >> like the ONLY thing that I found not ported. It was extremely frustrating.
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Mon, Jul 27, 2020 at 3:19 PM David  wrote:
> >>
> >>> Hello  Xuefu,
> >>>
> >>> I am not part of the Cloudera Hive product team,  though I volunteer to
> >>> work on small projects from time to time.  Perhaps someone from that team
> >>> can chime in with some of their thoughts, but personally, I think that in
> >>> the long run, there will be more of a merge between Hive-on-Spark and other
> >>> Spark-native offerings.  I'm not sure what the differentiation will be
> >>> going forward.  With that said, are there any developers on this mailing
> >>> list who are willing to take on the maintenance effort of keeping HoS
> >>> moving forward?
> >>>
> >>> http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/
> >>>
> >>>
> >>
> https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html
> >>>
> >>>
> >>> Thanks.
> >>>
> >>> On Thu, Jul 23, 2020 at 12:35 PM Xuefu Zhang  wrote:
> >>>
> >>>> Previous reasoning seemed to suggest a lack of user adoption. Now we are
> >>>> concerned about ongoing maintenance effort. Both are valid considerations.
> >>>> However, I think we should have ways to find out the answers. Therefore, I
> >>>> suggest the following be carried out:
> >>>>
> >>>> 1. Send out the proposal (removing Hive on Spark) to users including
> >>>

Re: Time to Remove Hive-on-Spark

2022-02-10 Thread Zoltan Haindrich

Hey,

I think there is no real interest in this feature; we don't have users/contributors backing it - the last development was around October 2018, and there have been ~2 bugfix commits ever
since... we should stop carrying dead weight... another 2 weeks went by since Stamatis reminded us that after 1.5 years(!) nothing has changed.


+1 on removing it

cheers,
Zoltan

you may inspect some of the recent changes with:
git log -c `find . -type f -path '**/spark/**'|grep -v xml|grep -v 
properties|grep -v q.out`


On 1/28/22 2:32 PM, Stamatis Zampetakis wrote:

Hi team,

Almost one year has passed since the last exchange in this discussion and
if I am not wrong there has been no effort to revive Hive-on-Spark. To be
more precise, I don't think I have seen any Spark related JIRA for quite
some time now and although I don't want to rush into conclusions, there
does not seem to be any community member involved in maintaining or adding
new features in this part of the code.

Keeping dead code in the repository does not do any good to the project and
puts a non-negligible burden to future maintainers.

Clearly, we cannot make a new Hive release where a major feature is
completely untested so either someone commits to re-enable/fix the
respective tests soon or we move forward the work started by David and drop
support for Hive-on-Spark.

I would like to ask the community if there is anyone who can take up this
maintenance task and enable/fix Spark related tests in the next month or so?

Best,
Stamatis

On Sat, Feb 27, 2021 at 4:17 AM Edward Capriolo 
wrote:


I do not know how it works for most of the world. But in Cloudera, where the
Tez options were never popular, hive-on-spark represents a solid way to get
things done with lower latency for small datasets.

As for the Spark adoption: you know, a while ago I came up with some ways to
make Hive more Spark-like. One of them was that I found a way to make "compile"
a Hive keyword so folks could build UDFs on the fly. It was such an
uphill climb. Folks found a way to make it disabled by default for security.
Then later, when things moved from the CLI to Beeline, it was like the ONLY thing
that I found not ported. It was extremely frustrating.






On Mon, Jul 27, 2020 at 3:19 PM David  wrote:


Hello  Xuefu,

I am not part of the Cloudera Hive product team,  though I volunteer to
work on small projects from time to time.  Perhaps someone from that team
can chime in with some of their thoughts, but personally, I think that in
the long run, there will be more of a merge between Hive-on-Spark and other
Spark-native offerings.  I'm not sure what the differentiation will be
going forward.  With that said, are there any developers on this mailing
list who are willing to take on the maintenance effort of keeping HoS
moving forward?

http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/



https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html



Thanks.

On Thu, Jul 23, 2020 at 12:35 PM Xuefu Zhang  wrote:


Previous reasoning seemed to suggest a lack of user adoption. Now we are
concerned about ongoing maintenance effort. Both are valid considerations.
However, I think we should have ways to find out the answers. Therefore, I
suggest the following be carried out:

1. Send out the proposal (removing Hive on Spark) to users including
user@hive.apache.org and get their feedback.
2. Ask if any developers on this mailing list are willing to take on the
maintenance effort.

I'm concerned about user impact because I can still see issues being
reported on HoS from time to time. I'm more concerned about the future of
Hive if we narrow Hive neutrality on execution engines, which will possibly
force more Hive users to migrate to other alternatives such as Spark SQL,
which is already eroding Hive's user base.

Being open and neutral used to be Hive's most admired strengths.

Thanks,
Xuefu


On Wed, Jul 22, 2020 at 8:46 AM Alan Gates  wrote:


An important point here is I don't believe David is proposing to remove
Hive on Spark from the 2 or 3 lines, but only from trunk.  Continuing to
support it in existing 2 and 3 lines makes sense, but since no one has
maintained it on trunk for some time and it does not work with many of the
newer features it should be removed from trunk.

Alan.

On Tue, Jul 21, 2020 at 4:10 PM Chao Sun  wrote:


Thanks David. FWIW Uber is still running Hive on Spark (2.3.4) on a very
large scale in production right now and I don't think we have any plan to
change it soon.



On Tue, Jul 21, 2020 at 11:28 AM David  wrote:


Hello,

Thanks for the feedback.

Just a quick recap: I did propose this @dev and I received unanimous +1's
from the community.  After a couple months, I created the PR.

Certainly open to discussion, but there hasn't been any discussion thus far
because there have been no objectio

Re: Time to Remove Hive-on-Spark

2022-01-28 Thread Stamatis Zampetakis
Hi team,

Almost one year has passed since the last exchange in this discussion and
if I am not wrong there has been no effort to revive Hive-on-Spark. To be
more precise, I don't think I have seen any Spark related JIRA for quite
some time now and although I don't want to rush into conclusions, there
does not seem to be any community member involved in maintaining or adding
new features in this part of the code.

Keeping dead code in the repository does not do any good to the project and
puts a non-negligible burden to future maintainers.

Clearly, we cannot make a new Hive release where a major feature is
completely untested so either someone commits to re-enable/fix the
respective tests soon or we move forward the work started by David and drop
support for Hive-on-Spark.

I would like to ask the community if there is anyone who can take up this
maintenance task and enable/fix Spark related tests in the next month or so?

Best,
Stamatis

On Sat, Feb 27, 2021 at 4:17 AM Edward Capriolo 
wrote:

> I do not know how it works for most of the world. But in Cloudera, where the
> Tez options were never popular, hive-on-spark represents a solid way to get
> things done with lower latency for small datasets.
>
> As for the Spark adoption: you know, a while ago I came up with some ways to
> make Hive more Spark-like. One of them was that I found a way to make "compile"
> a Hive keyword so folks could build UDFs on the fly. It was such an
> uphill climb. Folks found a way to make it disabled by default for security.
> Then later, when things moved from the CLI to Beeline, it was like the ONLY thing
> that I found not ported. It was extremely frustrating.
>
>
>
>
>
>
> On Mon, Jul 27, 2020 at 3:19 PM David  wrote:
>
> > Hello  Xuefu,
> >
> > I am not part of the Cloudera Hive product team,  though I volunteer to
> > work on small projects from time to time.  Perhaps someone from that team
> > can chime in with some of their thoughts, but personally, I think that in
> > the long run, there will be more of a merge between Hive-on-Spark and other
> > Spark-native offerings.  I'm not sure what the differentiation will be
> > going forward.  With that said, are there any developers on this mailing
> > list who are willing to take on the maintenance effort of keeping HoS
> > moving forward?
> >
> > http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/
> >
> >
> https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html
> >
> >
> > Thanks.
> >
> > On Thu, Jul 23, 2020 at 12:35 PM Xuefu Zhang  wrote:
> >
> > > Previous reasoning seemed to suggest a lack of user adoption. Now we are
> > > concerned about ongoing maintenance effort. Both are valid considerations.
> > > However, I think we should have ways to find out the answers. Therefore, I
> > > suggest the following be carried out:
> > >
> > > 1. Send out the proposal (removing Hive on Spark) to users including
> > > user@hive.apache.org and get their feedback.
> > > 2. Ask if any developers on this mailing list are willing to take on the
> > > maintenance effort.
> > >
> > > I'm concerned about user impact because I can still see issues being
> > > reported on HoS from time to time. I'm more concerned about the future of
> > > Hive if we narrow Hive neutrality on execution engines, which will possibly
> > > force more Hive users to migrate to other alternatives such as Spark SQL,
> > > which is already eroding Hive's user base.
> > >
> > > Being open and neutral used to be Hive's most admired strengths.
> > >
> > > Thanks,
> > > Xuefu
> > >
> > >
> > > On Wed, Jul 22, 2020 at 8:46 AM Alan Gates  wrote:
> > >
> > > > An important point here is I don't believe David is proposing to remove
> > > > Hive on Spark from the 2 or 3 lines, but only from trunk.  Continuing to
> > > > support it in existing 2 and 3 lines makes sense, but since no one has
> > > > maintained it on trunk for some time and it does not work with many of
> > > > the newer features it should be removed from trunk.
> > > >
> > > > Alan.
> > > >
> > > > On Tue, Jul 21, 2020 at 4:10 PM Chao Sun  wrote:
> > > >
> > > > > Thanks David. FWIW Uber is still running Hive on Spark (2.3.4) on a
> > > > > very large scale in production right now and I don't think we hav

hive on spark submit to yarn pools?

2021-11-03 Thread igyu
Hive on Spark + Sentry.

jdbc:hive2://hiveser:1/;user=ajxtj;password=123456;hive.server2.proxy.user=jztwk

 pro.put("hiveconf:spark.yarn.queue","root.jzyc");


I am using the YARN pool root.jzyc,
but only the hive and ajxtj users are allowed to use that pool.

So I want the jztwk proxy user to submit to root.jzyc,
but right now the application submits to root.jzyc as the hive user.
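
A rough sketch of one way to try this (assumptions: the HiveServer2 host/port
below are placeholders, hive.server2.enable.doAs=true on the server so that
queries run as the proxy user, and spark.yarn.queue is allowed by the server's
configuration whitelist):

# connect as ajxtj, impersonate jztwk, and point Hive on Spark at root.jzyc
beeline -n ajxtj -p 123456 \
  -u "jdbc:hive2://hiveser:10000/default;hive.server2.proxy.user=jztwk?spark.yarn.queue=root.jzyc"

# or set the queue after connecting (a new Spark session may be needed to pick it up)
set spark.yarn.queue=root.jzyc;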



igyu


Re: Removing Hive-on-Spark

2020-07-27 Thread David
Hello Stephen,

Thanks for your interest.  Can you please elaborate a bit more on your
question?

Thanks.

On Mon, Jul 27, 2020 at 4:11 PM Stephen Boesch  wrote:

> Why would it be this way instead of the other way around?
>
> On Mon, 27 Jul 2020 at 12:27, David  wrote:
>
>> Hello Hive Users.
>>
>> I am interested in gathering some feedback on the adoption of
>> Hive-on-Spark.
>>
>> Does anyone care to volunteer their usage information and would you be
>> open to removing it in favor of Hive-on-Tez in subsequent releases of Hive?
>>
>> If you are on MapReduce still, would you be open to migrating to Tez?
>>
>> Thanks.
>>
>


Re: Removing Hive-on-Spark

2020-07-27 Thread Stephen Boesch
Why would it be this way instead of the other way around?

On Mon, 27 Jul 2020 at 12:27, David  wrote:

> Hello Hive Users.
>
> I am interested in gathering some feedback on the adoption of
> Hive-on-Spark.
>
> Does anyone care to volunteer their usage information and would you be
> open to removing it in favor of Hive-on-Tez in subsequent releases of Hive?
>
> If you are on MapReduce still, would you be open to migrating to Tez?
>
> Thanks.
>


Removing Hive-on-Spark

2020-07-27 Thread David
Hello Hive Users.

I am interested in gathering some feedback on the adoption of Hive-on-Spark.

Does anyone care to volunteer their usage information and would you be open
to removing it in favor of Hive-on-Tez in subsequent releases of Hive?

If you are on MapReduce still, would you be open to migrating to Tez?

Thanks.


About the Hive on Spark 3.x upgrade plan

2020-05-14 Thread 王嘉廉
Hello,
May I ask about the Hive on Spark 3.x upgrade plan? 
I found that the newest dependent Spark version on the master branch is 2.4.5.


Thanks,
--- wjl






Re: Running Hive on Spark

2019-03-13 Thread Rajesh Balamohan
"Hive on Spark" uses Spark purely as execution engine. It would not get the
benefits of codegen and other optimizations of Spark.

If it is mainly for testing, OOTB parameters should work without issues.

However, Tez has lot better edge than Hive on Spark.

Some of the areas where Hive on Spark needs to catch up are,

* No support for auto reduce parallelism.
* Not full dynamic partition pruning is supported.
* Fetchers can start only when all mappers are complete. This can be a huge
painpoint in lot of cases.
* Have to specify CombinedInputFormat for tackling small files, but that
has issues in splitting.

~Rajesh.B

On Tue, Mar 12, 2019 at 2:25 PM Daniel Mateus Pires 
wrote:

> Hi Rajesh,
>
> I'm trying to further my understanding of the various interactions and
> set-ups for Hive + Spark
>
> My understanding so far is that running queries against the
> SparkThriftServer uses the SparkSQL engine whereas the HiveServer2 + Hive +
> Spark execution engine uses Hive primitives and only uses Spark for the
> actual computations
>
> I get your question about "why would I do that?" But my goal right now is
> to understand "what does it mean if I do that"
>
> Best regards
> Daniel
>
> On Tue 12 Mar 2019, 02:21 Rajesh Balamohan,  wrote:
>
>> Not sure why you are using SparkThriftServer. OOTB HiveServer2 would be
>> good enough for this.
>>
>> Is there any specific reason for moving from tez to spark as execution
>> engine?
>>
>> ~Rajesh.B
>>
>> On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires 
>> wrote:
>>
>>> Hi there,
>>>
>>> I would like to run Hive using Spark as the execution engine and I'm
>>> pretty confused with the set up.
>>>
>>> For reference I'm using AWS EMR.
>>>
>>> First, I'm confused at the difference between running Hive with Spark as
>>> its execution engine sending queries to Hive using HiveServer2 (Thrift),
>>> and using the SparkThriftServer (I thought it was built on top of
>>> HiveServer2) ? Could I read more about the differences somewhere ?
>>>
>>> I followed the following docs:
>>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>> and after changing the execution engine from the EMR default (tez) to
>>> spark, I can see the difference on the HiveServer2 UI at port 10002 where
>>> now the steps show "spark" as the execution engine.
>>>
>>> However I've set up the following config to get the Spark History Server
>>> displaying queries coming through JDBC and I can see queries sent to the
>>> SparkThriftServer (port 10001) but not to the HiveServer2 with execution
>>> engine of Spark (port 1)
>>>
>>> set spark.eventLog.enabled=true;
>>> set spark.master=localhost:18080;
>>> set spark.eventLog.dir=hdfs:///var/log/spark/apps;
>>> set spark.executor.memory=512m;
>>> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>>>
>>> Thanks!
>>>
>>


Re: Running Hive on Spark

2019-03-12 Thread Daniel Mateus Pires
Hi Rajesh,

I'm trying to further my understanding of the various interactions and
set-ups for Hive + Spark

My understanding so far is that running queries against the
SparkThriftServer uses the SparkSQL engine whereas the HiveServer2 + Hive +
Spark execution engine uses Hive primitives and only uses Spark for the
actual computations

I get your question about "why would I do that?" But my goal right now is
to understand "what does it mean if I do that"

Best regards
Daniel

On Tue 12 Mar 2019, 02:21 Rajesh Balamohan,  wrote:

> Not sure why you are using SparkThriftServer. OOTB HiveServer2 would be
> good enough for this.
>
> Is there any specific reason for moving from tez to spark as execution
> engine?
>
> ~Rajesh.B
>
> On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires 
> wrote:
>
>> Hi there,
>>
>> I would like to run Hive using Spark as the execution engine and I'm
>> pretty confused with the set up.
>>
>> For reference I'm using AWS EMR.
>>
>> First, I'm confused at the difference between running Hive with Spark as
>> its execution engine sending queries to Hive using HiveServer2 (Thrift),
>> and using the SparkThriftServer (I thought it was built on top of
>> HiveServer2) ? Could I read more about the differences somewhere ?
>>
>> I followed the following docs:
>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>> and after changing the execution engine from the EMR default (tez) to
>> spark, I can see the difference on the HiveServer2 UI at port 10002 where
>> now the steps show "spark" as the execution engine.
>>
>> However I've set up the following config to get the Spark History Server
>> displaying queries coming through JDBC and I can see queries sent to the
>> SparkThriftServer (port 10001) but not to the HiveServer2 with execution
>> engine of Spark (port 1)
>>
>> set spark.eventLog.enabled=true;
>> set spark.master=localhost:18080;
>> set spark.eventLog.dir=hdfs:///var/log/spark/apps;
>> set spark.executor.memory=512m;
>> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>>
>> Thanks!
>>
>


Re: Running Hive on Spark

2019-03-11 Thread Rajesh Balamohan
Not sure why you are using SparkThriftServer. OOTB HiveServer2 would be
good enough for this.

Is there any specific reason for moving from tez to spark as execution
engine?

~Rajesh.B

On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires 
wrote:

> Hi there,
>
> I would like to run Hive using Spark as the execution engine and I'm
> pretty confused with the set up.
>
> For reference I'm using AWS EMR.
>
> First, I'm confused at the difference between running Hive with Spark as
> its execution engine sending queries to Hive using HiveServer2 (Thrift),
> and using the SparkThriftServer (I thought it was built on top of
> HiveServer2) ? Could I read more about the differences somewhere ?
>
> I followed the following docs:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
> and after changing the execution engine from the EMR default (tez) to
> spark, I can see the difference on the HiveServer2 UI at port 10002 where
> now the steps show "spark" as the execution engine.
>
> However I've set up the following config to get the Spark History Server
> displaying queries coming through JDBC and I can see queries sent to the
> SparkThriftServer (port 10001) but not to the HiveServer2 with execution
> engine of Spark (port 1)
>
> set spark.eventLog.enabled=true;
> set spark.master=localhost:18080;
> set spark.eventLog.dir=hdfs:///var/log/spark/apps;
> set spark.executor.memory=512m;
> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>
> Thanks!
>


Running Hive on Spark

2019-03-11 Thread Daniel Mateus Pires
Hi there,

I would like to run Hive using Spark as the execution engine and I'm pretty
confused with the set up.

For reference I'm using AWS EMR.

First, I'm confused about the difference between running Hive with Spark as
its execution engine, sending queries to Hive using HiveServer2 (Thrift),
and using the SparkThriftServer (I thought it was built on top of
HiveServer2)? Could I read more about the differences somewhere?

I followed the following docs:
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
and after changing the execution engine from the EMR default (tez) to
spark, I can see the difference on the HiveServer2 UI at port 10002 where
now the steps show "spark" as the execution engine.

However I've set up the following config to get the Spark History Server
displaying queries coming through JDBC and I can see queries sent to the
SparkThriftServer (port 10001) but not to the HiveServer2 with execution
engine of Spark (port 1)

set spark.eventLog.enabled=true;
set spark.master=localhost:18080;
set spark.eventLog.dir=hdfs:///var/log/spark/apps;
set spark.executor.memory=512m;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;

Thanks!


Re: Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sachin janani
Yes, I built it the same way as you suggested, but no luck.


Regards,
Sachin Janani

On Tue, Jun 19, 2018 at 7:13 PM, Sahil Takiar 
wrote:

> You should be building Spark without Hive. For Spark 2.3.0, the command is:
>
> ./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz
> "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided,orc-provided
> "
>
> If you check the distribution after running the command, it shouldn't
> contain any Hive jars.
>
> On Tue, Jun 19, 2018 at 7:18 AM, Sachin janani  > wrote:
>
>> It shows following exception :
>>
>>
>>
>>
>> java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>>     at org.apache.spark.sql.hive.HiveUtils$.formatTimeVarsForHiveClient(HiveUtils.scala:205)
>>     at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:286)
>>     at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
>>     at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
>>
>>
>> After looking at jira SPARK-13446
>> <https://issues.apache.org/jira/browse/SPARK-13446> it seems that it is
>> fixed but as per the source code it is not. So I resolved by changing spark
>> code and rebuilding the spark binaries again but now it shows new error
>> NoSuchMethodError. As per my preliminary investigation it seems that Spark
>> is build with Hive 1.2.1 which is causing this issues. Can you please let
>> me know if i am missing anything?
>>
>>
>> Regards,
>> Sachin Janani
>>
>> On Tue, Jun 19, 2018 at 5:38 PM, Sahil Takiar 
>> wrote:
>>
>>> I updated the doc to reflect that Hive 3.0.0 works with Spark 2.3.0.
>>> What issues are you seeing?
>>>
>>> On Tue, Jun 19, 2018 at 7:03 AM, Sachin janani <
>>> sachin.janani...@gmail.com> wrote:
>>>
>>>> This is the same link which I followed. As per this link for
>>>> spark-2.3.0 we need to use hive master instead of hive 3.0.0. Also we
>>>> need to custom build spark without hive dependencies but after trying
>>>> all this it shows some compatibility issues.
>>>>
>>>>
>>>> Regards,
>>>> Sachin Janani
>>>>
>>>> On Tue, Jun 19, 2018 at 5:02 PM, Sahil Takiar 
>>>> wrote:
>>>> > Yes, Hive 3.0.0 works with Spark 2.3.0 - this section of the wiki has
>>>> > details on which Hive releases support which Spark versions.
>>>> >
>>>> > On Tue, Jun 19, 2018 at 5:59 AM, Sachin janani <
>>>> sachin.janani...@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> Hi,
>>>> >> I am trying to run hive on spark by following the steps mentioned
>>>> >> here-
>>>> >> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spa
>>>> rk%3A+Getting+Started
>>>> >> , but getting many compatibility issues like NoSuchMethodError,
>>>> >> NoSuchFieldException etc. So just need to know if it works and
>>>> whether
>>>> >> someone tried it out,
>>>> >>
>>>> >>
>>>> >> Thanks and Regards,
>>>> >> --
>>>> >> Sachin Janani
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Sahil Takiar
>>>> > Software Engineer
>>>> > takiar.sa...@gmail.com | (510) 673-0309
>>>>
>>>>
>>>>
>>>> --
>>>> Sachin Janani
>>>>
>>>
>>>
>>>
>>> --
>>> Sahil Takiar
>>> Software Engineer
>>> takiar.sa...@gmail.com | (510) 673-0309
>>>
>>
>>
>>
>> --
>> *Sachin Janani*
>>
>>
>
>
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309
>



-- 
*Sachin Janani*


Re: Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sahil Takiar
You should be building Spark without Hive. For Spark 2.3.0, the command is:

./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz
"-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided,orc-provided
"

If you check the distribution after running the command, it shouldn't
contain any Hive jars.
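
A quick way to sanity-check that (a sketch; the exact tarball name depends on
the --name flag and the Spark version used in the build):

# list the contents of the generated distribution and look for Hive artifacts
tar -tzf spark-2.3.0-bin-hadoop2-without-hive.tgz | grep -i hive
# an empty result means no Hive jars made it into the distribution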

On Tue, Jun 19, 2018 at 7:18 AM, Sachin janani 
wrote:

> It shows following exception :
>
>
>
>
> java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>     at org.apache.spark.sql.hive.HiveUtils$.formatTimeVarsForHiveClient(HiveUtils.scala:205)
>     at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:286)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
>
>
> After looking at JIRA SPARK-13446
> <https://issues.apache.org/jira/browse/SPARK-13446> it seems that it is
> fixed, but as per the source code it is not. So I resolved it by changing the Spark
> code and rebuilding the Spark binaries, but now it shows a new error,
> NoSuchMethodError. As per my preliminary investigation, it seems that Spark
> is built with Hive 1.2.1, which is causing these issues. Can you please let
> me know if I am missing anything?
>
>
> Regards,
> Sachin Janani
>
> On Tue, Jun 19, 2018 at 5:38 PM, Sahil Takiar 
> wrote:
>
>> I updated the doc to reflect that Hive 3.0.0 works with Spark 2.3.0. What
>> issues are you seeing?
>>
>> On Tue, Jun 19, 2018 at 7:03 AM, Sachin janani <
>> sachin.janani...@gmail.com> wrote:
>>
>>> This is the same link which I followed. As per this link for
>>> spark-2.3.0 we need to use hive master instead of hive 3.0.0. Also we
>>> need to custom build spark without hive dependencies but after trying
>>> all this it shows some compatibility issues.
>>>
>>>
>>> Regards,
>>> Sachin Janani
>>>
>>> On Tue, Jun 19, 2018 at 5:02 PM, Sahil Takiar 
>>> wrote:
>>> > Yes, Hive 3.0.0 works with Spark 2.3.0 - this section of the wiki has
>>> > details on which Hive releases support which Spark versions.
>>> >
>>> > On Tue, Jun 19, 2018 at 5:59 AM, Sachin janani <
>>> sachin.janani...@gmail.com>
>>> > wrote:
>>> >>
>>> >> Hi,
>>> >> I am trying to run hive on spark by following the steps mentioned
>>> >> here-
>>> >> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spa
>>> rk%3A+Getting+Started
>>> >> , but getting many compatibility issues like NoSuchMethodError,
>>> >> NoSuchFieldException etc. So just need to know if it works and whether
>>> >> someone tried it out,
>>> >>
>>> >>
>>> >> Thanks and Regards,
>>> >> --
>>> >> Sachin Janani
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Sahil Takiar
>>> > Software Engineer
>>> > takiar.sa...@gmail.com | (510) 673-0309
>>>
>>>
>>>
>>> --
>>> Sachin Janani
>>>
>>
>>
>>
>> --
>> Sahil Takiar
>> Software Engineer
>> takiar.sa...@gmail.com | (510) 673-0309
>>
>
>
>
> --
> *Sachin Janani*
>
>



-- 
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309


Re: Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sachin janani
It shows following exception :




java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
    at org.apache.spark.sql.hive.HiveUtils$.formatTimeVarsForHiveClient(HiveUtils.scala:205)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:286)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)


After looking at JIRA SPARK-13446
<https://issues.apache.org/jira/browse/SPARK-13446> it seems that it is
fixed, but as per the source code it is not. So I resolved it by changing the Spark
code and rebuilding the Spark binaries, but now it shows a new error,
NoSuchMethodError. As per my preliminary investigation, it seems that Spark
is built with Hive 1.2.1, which is causing these issues. Can you please let
me know if I am missing anything?


Regards,
Sachin Janani

On Tue, Jun 19, 2018 at 5:38 PM, Sahil Takiar 
wrote:

> I updated the doc to reflect that Hive 3.0.0 works with Spark 2.3.0. What
> issues are you seeing?
>
> On Tue, Jun 19, 2018 at 7:03 AM, Sachin janani  > wrote:
>
>> This is the same link which I followed. As per this link for
>> spark-2.3.0 we need to use hive master instead of hive 3.0.0. Also we
>> need to custom build spark without hive dependencies but after trying
>> all this it shows some compatibility issues.
>>
>>
>> Regards,
>> Sachin Janani
>>
>> On Tue, Jun 19, 2018 at 5:02 PM, Sahil Takiar 
>> wrote:
>> > Yes, Hive 3.0.0 works with Spark 2.3.0 - this section of the wiki has
>> > details on which Hive releases support which Spark versions.
>> >
>> > On Tue, Jun 19, 2018 at 5:59 AM, Sachin janani <
>> sachin.janani...@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >> I am trying to run hive on spark by following the steps mentioned
>> >> here-
>> >> https://cwiki.apache.org/confluence/display/Hive/Hive+on+
>> Spark%3A+Getting+Started
>> >> , but getting many compatibility issues like NoSuchMethodError,
>> >> NoSuchFieldException etc. So just need to know if it works and whether
>> >> someone tried it out,
>> >>
>> >>
>> >> Thanks and Regards,
>> >> --
>> >> Sachin Janani
>> >
>> >
>> >
>> >
>> > --
>> > Sahil Takiar
>> > Software Engineer
>> > takiar.sa...@gmail.com | (510) 673-0309
>>
>>
>>
>> --
>> Sachin Janani
>>
>
>
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309
>



-- 
*Sachin Janani*


Re: Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sahil Takiar
I updated the doc to reflect that Hive 3.0.0 works with Spark 2.3.0. What
issues are you seeing?

On Tue, Jun 19, 2018 at 7:03 AM, Sachin janani 
wrote:

> This is the same link that I followed. As per this link, for
> Spark 2.3.0 we need to use Hive master instead of Hive 3.0.0. We also
> need to custom-build Spark without Hive dependencies, but after trying
> all of this it still shows some compatibility issues.
>
>
> Regards,
> Sachin Janani
>
> On Tue, Jun 19, 2018 at 5:02 PM, Sahil Takiar 
> wrote:
> > Yes, Hive 3.0.0 works with Spark 2.3.0 - this section of the wiki has
> > details on which Hive releases support which Spark versions.
> >
> > On Tue, Jun 19, 2018 at 5:59 AM, Sachin janani <
> sachin.janani...@gmail.com>
> > wrote:
> >>
> >> Hi,
> >> I am trying to run hive on spark by following the steps mentioned
> >> here-
> >> https://cwiki.apache.org/confluence/display/Hive/Hive+
> on+Spark%3A+Getting+Started
> >> , but getting many compatibility issues like NoSuchMethodError,
> >> NoSuchFieldException etc. So just need to know if it works and whether
> >> someone tried it out,
> >>
> >>
> >> Thanks and Regards,
> >> --
> >> Sachin Janani
> >
> >
> >
> >
> > --
> > Sahil Takiar
> > Software Engineer
> > takiar.sa...@gmail.com | (510) 673-0309
>
>
>
> --
> Sachin Janani
>



-- 
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309


Re: Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sachin janani
This is the same link that I followed. As per this link, for
Spark 2.3.0 we need to use Hive master instead of Hive 3.0.0. We also
need to custom-build Spark without Hive dependencies, but after trying
all of this it still shows some compatibility issues.


Regards,
Sachin Janani

On Tue, Jun 19, 2018 at 5:02 PM, Sahil Takiar  wrote:
> Yes, Hive 3.0.0 works with Spark 2.3.0 - this section of the wiki has
> details on which Hive releases support which Spark versions.
>
> On Tue, Jun 19, 2018 at 5:59 AM, Sachin janani 
> wrote:
>>
>> Hi,
>> I am trying to run hive on spark by following the steps mentioned
>> here-
>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>> , but getting many compatibility issues like NoSuchMethodError,
>> NoSuchFieldException etc. So just need to know if it works and whether
>> someone tried it out,
>>
>>
>> Thanks and Regards,
>> --
>> Sachin Janani
>
>
>
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309



-- 
Sachin Janani


Re: Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sahil Takiar
Yes, Hive 3.0.0 works with Spark 2.3.0 - this
<https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-VersionCompatibility>
section of the wiki has details on which Hive releases support which Spark
versions.

On Tue, Jun 19, 2018 at 5:59 AM, Sachin janani 
wrote:

> Hi,
> I am trying to run Hive on Spark by following the steps mentioned here:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
> but I am getting many compatibility issues like NoSuchMethodError,
> NoSuchFieldException, etc. So I just need to know whether it works and whether
> someone has tried it out.
>
>
> Thanks and Regards,
> --
> Sachin Janani
>



-- 
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309


Does Hive on Spark work with Spark 2.3.0?

2018-06-19 Thread Sachin janani
Hi,
I am trying to run Hive on Spark by following the steps mentioned here:
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
but I am getting many compatibility issues like NoSuchMethodError,
NoSuchFieldException, etc. So I just need to know whether it works and whether
someone has tried it out.


Thanks and Regards,
-- 
Sachin Janani


Re: hive on spark - why is it so hard?

2017-10-02 Thread Jörn Franke
You should try with TEZ+LLAP.

Additionally, you will need to compare different configurations.

Finally, a generic comparison is meaningless on its own:
you should use the queries, data, and file formats that your users will actually be using later.

> On 2. Oct 2017, at 03:06, Stephen Sprague  wrote:
> 
> so...  i made some progress after much copying of jar files around (as 
> alluded to by Gopal previously on this thread).
> 
> 
> following the instructions here: 
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
> 
> and doing this as instructed will leave off about a dozen or so jar files 
> that spark'll need:
>   ./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz 
> "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"
> 
> i ended copying the missing jars to $SPARK_HOME/jars but i would have 
> preferred to just add a path(s) to the spark class path but i did not find 
> any effective way to do that. In hive you can specify HIVE_AUX_JARS_PATH but 
> i don't see the analagous var in spark - i don't think it inherits the hive 
> classpath.
> 
> anyway a simple query is now working under Hive On Spark so i think i might 
> be over the hump.  Now its a matter of comparing the performance with Tez.
> 
> Cheers,
> Stephen.
> 
> 
>> On Wed, Sep 27, 2017 at 9:37 PM, Stephen Sprague  wrote:
>> ok.. getting further.  seems now i have to deploy hive to all nodes in the 
>> cluster - don't think i had to do that before but not a big deal to do it 
>> now.
>> 
>> for me:
>> HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/
>> SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6
>> 
>> on all three nodes now.
>> 
>> i started spark master on the namenode and i started spark slaves (2) on two 
>> datanodes of the cluster. 
>> 
>> so far so good.
>> 
>> now i run my usual test command.
>> 
>> $ hive --hiveconf hive.root.logger=DEBUG,console -e 'set 
>> hive.execution.engine=spark; select date_key, count(*) from 
>> fe_inventory.merged_properties_hist group by 1 order by 1;'
>> 
>> i get a little further now and find the stderr from the Spark Web UI 
>> interface (nice) and it reports this:
>> 
>> 17/09/27 20:47:35 INFO WorkerWatcher: Successfully connected to 
>> spark://Worker@172.19.79.127:40145
>> Exception in thread "main" java.lang.reflect.InvocationTargetException
>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>  at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>  at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>  at java.lang.reflect.Method.invoke(Method.java:483)
>>  at 
>> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
>>  at 
>> org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
>> Caused by: java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS
>>  at 
>> org.apache.hive.spark.client.rpc.RpcConfiguration.(RpcConfiguration.java:47)
>>  at 
>> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:134)
>>  at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:516)
>>  ... 6 more
>> 
>> 
>> searching around the internet i find this is probably a compatibility issue.
>> 
>> i know. i know. no surprise here.  
>> 
>> so i guess i just got to the point where everybody else is... build spark 
>> w/o hive. 
>> 
>> lemme see what happens next.
>> 
>> 
>> 
>> 
>> 
>>> On Wed, Sep 27, 2017 at 7:41 PM, Stephen Sprague  wrote:
>>> thanks.  I haven't had a chance to dig into this again today but i do 
>>> appreciate the pointer.  I'll keep you posted.
>>> 
>>>> On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar  
>>>> wrote:
>>>> You can try increasing the value of hive.spark.client.connect.timeout. 
>>>> Would also suggest taking a look at the HoS Remote Driver logs. The driver 
>>>> gets launched in a YARN container (assuming you are running Spark in 
>>>> yarn-client mode), so you just have to find the logs for that container.
>>>> 
>>>> --Sahil
>>>> 
>>>>> On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague  
>>>>> wrote:
>>>>> i _seem_ to be getting closer.  Maybe its just wishful thinking.   Here's 
>>>>> where i'm at now.
>>>>> 
>>>>> 2017-09-26T21:10:38,8

Re: hive on spark - why is it so hard?

2017-10-01 Thread Stephen Sprague
so...  i made some progress after much copying of jar files around (as
alluded to by Gopal previously on this thread).


following the instructions here:
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

and doing this as instructed will leave off about a dozen or so jar files
that spark'll need:
  ./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz
"-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"

i ended copying the missing jars to $SPARK_HOME/jars but i would have
preferred to just add a path(s) to the spark class path but i did not find
any effective way to do that. In hive you can specify HIVE_AUX_JARS_PATH
but i don't see the analagous var in spark - i don't think it inherits the
hive classpath.
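
One alternative to copying jars around, assuming the missing jars come from the
"hadoop-provided" build, is Spark's SPARK_DIST_CLASSPATH hook, which appends an
external classpath to every Spark process (a sketch; the Hive lib path is only
an example):

# in $SPARK_HOME/conf/spark-env.sh on each node
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# extra entries can be appended the same way, e.g.
# export SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:/usr/lib/apache-hive-2.3.0-bin/lib/*"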

anyway a simple query is now working under Hive On Spark so i think i might
be over the hump.  Now its a matter of comparing the performance with Tez.

Cheers,
Stephen.


On Wed, Sep 27, 2017 at 9:37 PM, Stephen Sprague  wrote:

> ok.. getting further.  seems now i have to deploy hive to all nodes in the
> cluster - don't think i had to do that before but not a big deal to do it
> now.
>
> for me:
> HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/
> SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6
>
> on all three nodes now.
>
> i started spark master on the namenode and i started spark slaves (2) on
> two datanodes of the cluster.
>
> so far so good.
>
> now i run my usual test command.
>
> $ hive --hiveconf hive.root.logger=DEBUG,console -e 'set
> hive.execution.engine=spark; select date_key, count(*) from
> fe_inventory.merged_properties_hist group by 1 order by 1;'
>
> i get a little further now and find the stderr from the Spark Web UI
> interface (nice) and it reports this:
>
> 17/09/27 20:47:35 INFO WorkerWatcher: Successfully connected to 
> spark://Worker@172.19.79.127:40145
> Exception in thread "main" java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:483)
>   at 
> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
>   at 
> org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> Caused by: java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS
>   at 
> org.apache.hive.spark.client.rpc.RpcConfiguration.(RpcConfiguration.java:47)
>   at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:134)
>   at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:516)
>   ... 6 more
>
>
>
> searching around the internet i find this is probably a compatibility
> issue.
>
> i know. i know. no surprise here.
>
> so i guess i just got to the point where everybody else is... build spark
> w/o hive.
>
> lemme see what happens next.
>
>
>
>
>
> On Wed, Sep 27, 2017 at 7:41 PM, Stephen Sprague 
> wrote:
>
>> thanks.  I haven't had a chance to dig into this again today but i do
>> appreciate the pointer.  I'll keep you posted.
>>
>> On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar 
>> wrote:
>>
>>> You can try increasing the value of hive.spark.client.connect.timeout.
>>> Would also suggest taking a look at the HoS Remote Driver logs. The driver
>>> gets launched in a YARN container (assuming you are running Spark in
>>> yarn-client mode), so you just have to find the logs for that container.
>>>
>>> --Sahil
>>>
>>> On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague 
>>> wrote:
>>>
>>>> i _seem_ to be getting closer.  Maybe its just wishful thinking.
>>>> Here's where i'm at now.
>>>>
>>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>>> 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
>>>> CreateSubmissionResponse:
>>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
>>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>>>   "action" : "CreateSubmissionResponse",
>>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>>>   "message" : "Driver successfully submitted as 
>>>> driver-20170926211038-0003",
>>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImp

Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
ion.*
>>> java.util.concurrent.ExecutionException: 
>>> java.util.concurrent.TimeoutException:
>>> Timed out waiting for client connection.
>>> at 
>>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>> ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>>> at 
>>> org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:108)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>> at 
>>> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.c
>>> reateRemoteClient(RemoteHiveSparkClient.java:101)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<
>>> init>(RemoteHiveSparkClient.java:97) [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.
>>> createHiveSparkClient(HiveSparkClientFactory.java:73)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImp
>>> l.open(SparkSessionImpl.java:62) [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionMan
>>> agerImpl.getSession(SparkSessionManagerImpl.java:115)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSpark
>>> Session(SparkUtilities.java:126) [hive-exec-2.3.0.jar:2.3.0]
>>> at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerPar
>>> allelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:236)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>>
>>>
>>> i'll dig some more tomorrow.
>>>
>>> On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague 
>>> wrote:
>>>
>>>> oh. i missed Gopal's reply.  oy... that sounds foreboding.  I'll keep
>>>> you posted on my progress.
>>>>
>>>> On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan <
>>>> gop...@apache.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
>>>>> spark session: org.apache.hadoop.hive.ql.metadata.HiveException:
>>>>> Failed to create spark client.
>>>>>
>>>>> I get inexplicable errors with Hive-on-Spark unless I do a three step
>>>>> build.
>>>>>
>>>>> Build Hive first, use that version to build Spark, use that Spark
>>>>> version to rebuild Hive.
>>>>>
>>>>> I have to do this to make it work because Spark contains Hive jars and
>>>>> Hive contains Spark jars in the class-path.
>>>>>
>>>>> And specifically I have to edit the pom.xml files, instead of passing
>>>>> in params with -Dspark.version, because the installed pom files don't get
>>>>> replacements from the build args.
>>>>>
>>>>> Cheers,
>>>>> Gopal
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Sahil Takiar
>> Software Engineer at Cloudera
>> takiar.sa...@gmail.com | (510) 673-0309
>>
>
>


Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
thanks.  I haven't had a chance to dig into this again today but i do
appreciate the pointer.  I'll keep you posted.

On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar 
wrote:

> You can try increasing the value of hive.spark.client.connect.timeout.
> Would also suggest taking a look at the HoS Remote Driver logs. The driver
> gets launched in a YARN container (assuming you are running Spark in
> yarn-client mode), so you just have to find the logs for that container.
>
> --Sahil
>
> On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague 
> wrote:
>
>> i _seem_ to be getting closer.  Maybe its just wishful thinking.   Here's
>> where i'm at now.
>>
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
>> CreateSubmissionResponse:
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "action" : "CreateSubmissionResponse",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "message" : "Driver successfully submitted as driver-20170926211038-0003",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "serverSparkVersion" : "2.2.0",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "submissionId" : "driver-20170926211038-0003",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "success" : true
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: }
>> 2017-09-26T21:10:45,701 DEBUG [IPC Client (425015667) connection to
>> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
>> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.1
>> 9.73.136:8020 from dwr: closed
>> 2017-09-26T21:10:45,702 DEBUG [IPC Client (425015667) connection to
>> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
>> Clien
>> t (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
>> from dwr: stopped, remaining connections 0
>> 2017-09-26T21:12:06,719 ERROR [2337b36e-86ca-47cd-b1ae-f0b32571b97e
>> main] client.SparkClientImpl: Timed out waiting for client to connect.
>> *Possible reasons include network issues, errors in remote driver or the
>> cluster has no available resources, etc.*
>> *Please check YARN or Spark driver's logs for further information.*
>> java.util.concurrent.ExecutionException: 
>> java.util.concurrent.TimeoutException:
>> Timed out waiting for client connection.
>> at 
>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>> ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>> at 
>> org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:108)
>> [hive-exec-2.3.0.jar:2.3.0]
>> at 
>> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>> [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.c
>> reateRemoteClient(RemoteHiveSparkClient.java:101)
>> [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<
>> init>(RemoteHiveSparkClient.java:97) [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.
>> createHiveSparkClient(HiveSparkClientFactory.java:73)
>> [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImp
>> l.open(SparkSessionImpl.java:62) [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionMan
>> agerImpl.getSession(SparkSessionManagerImpl.java:115)
>> [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSpark
>> Session(SparkUtilities.java:126) [hive-exec-2.3.0.jar:2.3.0]
>> at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerPar
>> allelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:236)
>> [hive-exec-2.3.0.jar:2.3.0]
>>
>>
>> i'll dig some more tomorrow.
>>
>> On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague 
>> wrote:
>>
>>> oh. i missed Gopal's reply.  oy... that sounds foreboding.  I'll keep
>>> you posted on my progress.
>>>
>>> On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan >> > wrote:
>>>
>>>> Hi,
>>>>

Re: hive on spark - why is it so hard?

2017-09-27 Thread Sahil Takiar
You can try increasing the value of hive.spark.client.connect.timeout.
Would also suggest taking a look at the HoS Remote Driver logs. The driver
gets launched in a YARN container (assuming you are running Spark in
yarn-client mode), so you just have to find the logs for that container.
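
For illustration, a rough sketch of both suggestions; the timeout value, the query, and the
application id are placeholders rather than values from this thread:

  # bump the client connect timeout for one run and retry the failing query
  hive --hiveconf hive.execution.engine=spark \
       --hiveconf hive.spark.client.connect.timeout=30000ms \
       -e 'select count(*) from some_table;'

  # then pull the Remote Driver's YARN container logs for that application
  yarn logs -applicationId application_1506123456789_0001 | less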

--Sahil

On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague  wrote:

> i _seem_ to be getting closer.  Maybe its just wishful thinking.   Here's
> where i'm at now.
>
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
> CreateSubmissionResponse:
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "action" : "CreateSubmissionResponse",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "message" : "Driver successfully submitted as driver-20170926211038-0003",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "serverSparkVersion" : "2.2.0",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "submissionId" : "driver-20170926211038-0003",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "success" : true
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: }
> 2017-09-26T21:10:45,701 DEBUG [IPC Client (425015667) connection to
> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.
> 19.73.136:8020 from dwr: closed
> 2017-09-26T21:10:45,702 DEBUG [IPC Client (425015667) connection to
> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
> Clien
> t (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
> from dwr: stopped, remaining connections 0
> 2017-09-26T21:12:06,719 ERROR [2337b36e-86ca-47cd-b1ae-f0b32571b97e main]
> client.SparkClientImpl: Timed out waiting for client to connect.
> *Possible reasons include network issues, errors in remote driver or the
> cluster has no available resources, etc.*
> *Please check YARN or Spark driver's logs for further information.*
> java.util.concurrent.ExecutionException: 
> java.util.concurrent.TimeoutException:
> Timed out waiting for client connection.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
> at 
> org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:108)
> [hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
> [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.
> createRemoteClient(RemoteHiveSparkClient.java:101)
> [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.exec.spark.
> RemoteHiveSparkClient.(RemoteHiveSparkClient.java:97)
> [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.
> createHiveSparkClient(HiveSparkClientFactory.java:73)
> [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.exec.spark.session.
> SparkSessionImpl.open(SparkSessionImpl.java:62)
> [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.exec.spark.session.
> SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115)
> [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.
> getSparkSession(SparkUtilities.java:126) [hive-exec-2.3.0.jar:2.3.0]
> at org.apache.hadoop.hive.ql.optimizer.spark.
> SetSparkReducerParallelism.getSparkMemoryAndCores(
> SetSparkReducerParallelism.java:236) [hive-exec-2.3.0.jar:2.3.0]
>
>
> i'll dig some more tomorrow.
>
> On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague 
> wrote:
>
>> oh. i missed Gopal's reply.  oy... that sounds foreboding.  I'll keep you
>> posted on my progress.
>>
>> On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan 
>> wrote:
>>
>>> Hi,
>>>
>>> > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
>>> spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed
>>> to create spark client.
>>>
>>> I get inexplicable errors with Hive-on-Spark unless I do a three step
>>> build.
>>>
>>> Build Hive first, use that version to build Spark, use that Spark
>>> version to rebuild Hive.
>>>
>>> I have to do this to make it work because Spark contains Hive jars and
>>> Hive contains Spark jars in the class-path.
>>>
>>> And specifically I have to edit the pom.xml files, instead of passing in
>>> params with -Dspark.version, because the installed pom files don't get
>>> replacements from the build args.
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>


-- 
Sahil Takiar
Software Engineer at Cloudera
takiar.sa...@gmail.com | (510) 673-0309


Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
i _seem_ to be getting closer.  Maybe it's just wishful thinking.   Here's
where i'm at now.

2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
CreateSubmissionResponse:
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
"action" : "CreateSubmissionResponse",
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
"message" : "Driver successfully submitted as driver-20170926211038-0003",
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
"serverSparkVersion" : "2.2.0",
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
"submissionId" : "driver-20170926211038-0003",
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
"success" : true
2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: }
2017-09-26T21:10:45,701 DEBUG [IPC Client (425015667) connection to
dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
from dwr: closed
2017-09-26T21:10:45,702 DEBUG [IPC Client (425015667) connection to
dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC Clien
t (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
from dwr: stopped, remaining connections 0
2017-09-26T21:12:06,719 ERROR [2337b36e-86ca-47cd-b1ae-f0b32571b97e main]
client.SparkClientImpl: Timed out waiting for client to connect.
*Possible reasons include network issues, errors in remote driver or the
cluster has no available resources, etc.*
*Please check YARN or Spark driver's logs for further information.*
java.util.concurrent.ExecutionException:
java.util.concurrent.TimeoutException: Timed out waiting for client
connection.
at
io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
~[netty-all-4.0.29.Final.jar:4.0.29.Final]
at
org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:108)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:101)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:97)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:73)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:126)
[hive-exec-2.3.0.jar:2.3.0]
at
org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:236)
[hive-exec-2.3.0.jar:2.3.0]


i'll dig some more tomorrow.
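
One way to follow up on a submission id like driver-20170926211038-0003 is to ask the Spark
master for the driver's status; the master URL and REST port below are assumptions, not values
from this thread:

  # query the standalone master's REST endpoint for the submitted driver's state
  spark-submit --master spark://dwrdevnn1.sv2.trulia.com:6066 \
               --status driver-20170926211038-0003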

On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague  wrote:

> oh. i missed Gopal's reply.  oy... that sounds foreboding.  I'll keep you
> posted on my progress.
>
> On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan 
> wrote:
>
>> Hi,
>>
>> > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
>> spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed
>> to create spark client.
>>
>> I get inexplicable errors with Hive-on-Spark unless I do a three step
>> build.
>>
>> Build Hive first, use that version to build Spark, use that Spark version
>> to rebuild Hive.
>>
>> I have to do this to make it work because Spark contains Hive jars and
>> Hive contains Spark jars in the class-path.
>>
>> And specifically I have to edit the pom.xml files, instead of passing in
>> params with -Dspark.version, because the installed pom files don't get
>> replacements from the build args.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>


Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
oh. i missed Gopal's reply.  oy... that sounds foreboding.  I'll keep you
posted on my progress.

On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan 
wrote:

> Hi,
>
> > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
> spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed
> to create spark client.
>
> I get inexplicable errors with Hive-on-Spark unless I do a three step
> build.
>
> Build Hive first, use that version to build Spark, use that Spark version
> to rebuild Hive.
>
> I have to do this to make it work because Spark contains Hive jars and
> Hive contains Spark jars in the class-path.
>
> And specifically I have to edit the pom.xml files, instead of passing in
> params with -Dspark.version, because the installed pom files don't get
> replacements from the build args.
>
> Cheers,
> Gopal
>
>
>


Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
l(CalcitePlanner.java:286)
>> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze
>> (BaseSemanticAnalyzer.java:258)
>> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:511)
>> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java
>> :1316)
>> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1456)
>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236)
>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1226)
>> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriv
>> er.java:233)
>> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.
>> java:184)
>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.
>> java:403)
>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.
>> java:336)
>> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver
>> .java:787)
>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>> ssorImpl.java:62)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:483)
>> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>>
>>
>> I bugs me that that class is in spark-core_2.11-2.2.0.jar yet so
>> seemingly out of reach. :(
>>
>>
>>
>> On Tue, Sep 26, 2017 at 2:44 PM, Sahil Takiar 
>> wrote:
>>
>>> Hey Stephen,
>>>
>>> Can you send the full stack trace for the NoClassDefFoundError? For Hive
>>> 2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions
>>> of Spark, but we only test with Spark 2.0.0.
>>>
>>> --Sahil
>>>
>>> On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague 
>>> wrote:
>>>
>>>> * i've installed hive 2.3 and spark 2.2
>>>>
>>>> * i've read this doc plenty of times -> https://cwiki.apache.org/confl
>>>> uence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>>>
>>>> * i run this query:
>>>>
>>>>hive --hiveconf hive.root.logger=DEBUG,console -e 'set
>>>> hive.execution.engine=spark; select date_key, count(*) from
>>>> fe_inventory.merged_properties_hist group by 1 order by 1;'
>>>>
>>>>
>>>> * i get this error:
>>>>
>>>> *   Exception in thread "main" java.lang.NoClassDefFoundError:
>>>> org/apache/spark/scheduler/SparkListenerInterface*
>>>>
>>>>
>>>> * this class in:
>>>>   /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>>>>
>>>> * i have copied all the spark jars to hdfs://dwrdevnn1/spark-2.2-jars
>>>>
>>>> * i have updated hive-site.xml to set spark.yarn.jars to it.
>>>>
>>>> * i see this is the console:
>>>>
>>>> 2017-09-26T13:34:15,505  INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3
>>>> main] spark.HiveSparkClientFactory: load spark property from hive
>>>> configuration (spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.co
>>>> m:8020/spark-2.2-jars/*).
>>>>
>>>> * i see this on the console
>>>>
>>>> 2017-09-26T14:04:45,678  INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3
>>>> main] client.SparkClientImpl: Running client driver with argv:
>>>> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file
>>>> /tmp/spark-submit.6105784757200912217.properties --class
>>>> org.apache.hive.spark.client.RemoteDriver
>>>> /usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar --remote-host
>>>> dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf
>>>> hive.spark.client.connect.timeout=1000 --conf
>>>> hive.spark.client.server.connect.timeout=9 --conf
>>>> hive.spark.client.channel.log.level=null --conf
>>>> hive.spark.client.rpc.max.size=52428800 --conf
>>>> hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
>>>> --conf hive.spark.client.rpc.server.address=null
>>>>
>>>> * i even print out CLASSPATH in this script:
>>>> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit
>>>>
>>>> and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>>>> is in it.
>>>>
>>>> ​so i ask... what am i missing?
>>>>
>>>> thanks,
>>>> Stephen​
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Sahil Takiar
>>> Software Engineer at Cloudera
>>> takiar.sa...@gmail.com | (510) 673-0309
>>>
>>
>>
>
>
> --
> Sahil Takiar
> Software Engineer at Cloudera
> takiar.sa...@gmail.com | (510) 673-0309
>


Re: hive on spark - why is it so hard?

2017-09-26 Thread Gopal Vijayaraghavan
Hi,

> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a spark 
> session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create 
> spark client.
 
I get inexplicable errors with Hive-on-Spark unless I do a three step build.

Build Hive first, use that version to build Spark, use that Spark version to 
rebuild Hive.

I have to do this to make it work because Spark contains Hive jars and Hive 
contains Spark jars in the class-path.

And specifically I have to edit the pom.xml files, instead of passing in params 
with -Dspark.version, because the installed pom files don't get replacements 
from the build args.
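
A rough sketch of that three-step sequence; the directory layout, the exact mvn flags, and the
pom properties to edit are assumptions for illustration rather than Gopal's exact commands:

  # 1) build Hive
  cd ~/src/hive && mvn clean install -DskipTests

  # 2) build Spark; first edit its pom.xml so the Hive version it references
  #    matches the Hive built in step 1 (the pom edit Gopal describes)
  cd ~/src/spark && ./build/mvn clean package -DskipTests

  # 3) edit Hive's pom.xml so spark.version matches the Spark built in step 2,
  #    then rebuild Hive against it
  cd ~/src/hive && mvn clean install -DskipTests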

Cheers,
Gopal




Re: hive on spark - why is it so hard?

2017-09-26 Thread Sahil Takiar
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
>
> I bugs me that that class is in spark-core_2.11-2.2.0.jar yet so seemingly
> out of reach. :(
>
>
>
> On Tue, Sep 26, 2017 at 2:44 PM, Sahil Takiar 
> wrote:
>
>> Hey Stephen,
>>
>> Can you send the full stack trace for the NoClassDefFoundError? For Hive
>> 2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions
>> of Spark, but we only test with Spark 2.0.0.
>>
>> --Sahil
>>
>> On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague 
>> wrote:
>>
>>> * i've installed hive 2.3 and spark 2.2
>>>
>>> * i've read this doc plenty of times -> https://cwiki.apache.org/confl
>>> uence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>>
>>> * i run this query:
>>>
>>>hive --hiveconf hive.root.logger=DEBUG,console -e 'set
>>> hive.execution.engine=spark; select date_key, count(*) from
>>> fe_inventory.merged_properties_hist group by 1 order by 1;'
>>>
>>>
>>> * i get this error:
>>>
>>> *   Exception in thread "main" java.lang.NoClassDefFoundError:
>>> org/apache/spark/scheduler/SparkListenerInterface*
>>>
>>>
>>> * this class in:
>>>   /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>>>
>>> * i have copied all the spark jars to hdfs://dwrdevnn1/spark-2.2-jars
>>>
>>> * i have updated hive-site.xml to set spark.yarn.jars to it.
>>>
>>> * i see this is the console:
>>>
>>> 2017-09-26T13:34:15,505  INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3
>>> main] spark.HiveSparkClientFactory: load spark property from hive
>>> configuration (spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.co
>>> m:8020/spark-2.2-jars/*).
>>>
>>> * i see this on the console
>>>
>>> 2017-09-26T14:04:45,678  INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3
>>> main] client.SparkClientImpl: Running client driver with argv:
>>> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file
>>> /tmp/spark-submit.6105784757200912217.properties --class
>>> org.apache.hive.spark.client.RemoteDriver 
>>> /usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar
>>> --remote-host dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf
>>> hive.spark.client.connect.timeout=1000 --conf
>>> hive.spark.client.server.connect.timeout=9 --conf
>>> hive.spark.client.channel.log.level=null --conf
>>> hive.spark.client.rpc.max.size=52428800 --conf
>>> hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
>>> --conf hive.spark.client.rpc.server.address=null
>>>
>>> * i even print out CLASSPATH in this script:
>>> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit
>>>
>>> and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>>> is in it.
>>>
>>> ​so i ask... what am i missing?
>>>
>>> thanks,
>>> Stephen​
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Sahil Takiar
>> Software Engineer at Cloudera
>> takiar.sa...@gmail.com | (510) 673-0309
>>
>
>


-- 
Sahil Takiar
Software Engineer at Cloudera
takiar.sa...@gmail.com | (510) 673-0309


Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
stack trace for the NoClassDefFoundError? For Hive
> 2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions
> of Spark, but we only test with Spark 2.0.0.
>
> --Sahil
>
> On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague 
> wrote:
>
>> * i've installed hive 2.3 and spark 2.2
>>
>> * i've read this doc plenty of times -> https://cwiki.apache.org/confl
>> uence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>
>> * i run this query:
>>
>>hive --hiveconf hive.root.logger=DEBUG,console -e 'set
>> hive.execution.engine=spark; select date_key, count(*) from
>> fe_inventory.merged_properties_hist group by 1 order by 1;'
>>
>>
>> * i get this error:
>>
>> *   Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/spark/scheduler/SparkListenerInterface*
>>
>>
>> * this class in:
>>   /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>>
>> * i have copied all the spark jars to hdfs://dwrdevnn1/spark-2.2-jars
>>
>> * i have updated hive-site.xml to set spark.yarn.jars to it.
>>
>> * i see this is the console:
>>
>> 2017-09-26T13:34:15,505  INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3
>> main] spark.HiveSparkClientFactory: load spark property from hive
>> configuration (spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.co
>> m:8020/spark-2.2-jars/*).
>>
>> * i see this on the console
>>
>> 2017-09-26T14:04:45,678  INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3
>> main] client.SparkClientImpl: Running client driver with argv:
>> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file
>> /tmp/spark-submit.6105784757200912217.properties --class
>> org.apache.hive.spark.client.RemoteDriver 
>> /usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar
>> --remote-host dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf
>> hive.spark.client.connect.timeout=1000 --conf
>> hive.spark.client.server.connect.timeout=9 --conf
>> hive.spark.client.channel.log.level=null --conf
>> hive.spark.client.rpc.max.size=52428800 --conf
>> hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
>> --conf hive.spark.client.rpc.server.address=null
>>
>> * i even print out CLASSPATH in this script:
>> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit
>>
>> and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar is
>> in it.
>>
>> ​so i ask... what am i missing?
>>
>> thanks,
>> Stephen​
>>
>>
>>
>>
>>
>>
>
>
> --
> Sahil Takiar
> Software Engineer at Cloudera
> takiar.sa...@gmail.com | (510) 673-0309
>


Re: hive on spark - why is it so hard?

2017-09-26 Thread Sahil Takiar
Hey Stephen,

Can you send the full stack trace for the NoClassDefFoundError? For Hive
2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions
of Spark, but we only test with Spark 2.0.0.
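
For reference, one way to check which Spark version a given Hive release was built against is
to look at the spark.version property in the Hive source tree's root pom (a sketch; assumes a
source checkout of the matching Hive release):

  grep -m1 '<spark.version>' pom.xml
  # for Hive 2.3.0 this is expected to show 2.0.0, matching the statement above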

--Sahil

On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague  wrote:

> * i've installed hive 2.3 and spark 2.2
>
> * i've read this doc plenty of times -> https://cwiki.apache.org/
> confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>
> * i run this query:
>
>hive --hiveconf hive.root.logger=DEBUG,console -e 'set
> hive.execution.engine=spark; select date_key, count(*) from
> fe_inventory.merged_properties_hist group by 1 order by 1;'
>
>
> * i get this error:
>
> *   Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/spark/scheduler/SparkListenerInterface*
>
>
> * this class in:
>   /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>
> * i have copied all the spark jars to hdfs://dwrdevnn1/spark-2.2-jars
>
> * i have updated hive-site.xml to set spark.yarn.jars to it.
>
> * i see this is the console:
>
> 2017-09-26T13:34:15,505  INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3 main]
> spark.HiveSparkClientFactory: load spark property from hive configuration
> (spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*
> ).
>
> * i see this on the console
>
> 2017-09-26T14:04:45,678  INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main]
> client.SparkClientImpl: Running client driver with argv:
> /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file
> /tmp/spark-submit.6105784757200912217.properties --class
> org.apache.hive.spark.client.RemoteDriver 
> /usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar
> --remote-host dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf
> hive.spark.client.connect.timeout=1000 --conf 
> hive.spark.client.server.connect.timeout=9
> --conf hive.spark.client.channel.log.level=null --conf
> hive.spark.client.rpc.max.size=52428800 --conf
> hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
> --conf hive.spark.client.rpc.server.address=null
>
> * i even print out CLASSPATH in this script: /usr/lib/spark-2.2.0-bin-
> hadoop2.6/bin/spark-submit
>
> and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar is
> in it.
>
> ​so i ask... what am i missing?
>
> thanks,
> Stephen​
>
>
>
>
>
>


-- 
Sahil Takiar
Software Engineer at Cloudera
takiar.sa...@gmail.com | (510) 673-0309


hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
* i've installed hive 2.3 and spark 2.2

* i've read this doc plenty of times ->
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

* i run this query:

   hive --hiveconf hive.root.logger=DEBUG,console -e 'set
hive.execution.engine=spark; select date_key, count(*) from
fe_inventory.merged_properties_hist group by 1 order by 1;'


* i get this error:

*   Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/spark/scheduler/SparkListenerInterface*


* this class is in:
  /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar

* i have copied all the spark jars to hdfs://dwrdevnn1/spark-2.2-jars

* i have updated hive-site.xml to set spark.yarn.jars to it.

* i see this in the console:

2017-09-26T13:34:15,505  INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3 main]
spark.HiveSparkClientFactory: load spark property from hive configuration
(spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*).

* i see this on the console

2017-09-26T14:04:45,678  INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main]
client.SparkClientImpl: Running client driver with argv:
/usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file
/tmp/spark-submit.6105784757200912217.properties --class
org.apache.hive.spark.client.RemoteDriver
/usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar --remote-host
dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf
hive.spark.client.connect.timeout=1000 --conf
hive.spark.client.server.connect.timeout=9 --conf
hive.spark.client.channel.log.level=null --conf
hive.spark.client.rpc.max.size=52428800 --conf
hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
--conf hive.spark.client.rpc.server.address=null

* i even print out CLASSPATH in this script:
/usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit

and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar is in
it.

​so i ask... what am i missing?

thanks,
Stephen​
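
For reference, the staging and configuration steps listed above correspond roughly to the
following; paths are the ones mentioned in this message and should be treated as illustrative:

  # copy the Spark jars into HDFS so the YARN containers can find them
  hdfs dfs -mkdir -p /spark-2.2-jars
  hdfs dfs -put /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/*.jar /spark-2.2-jars/

  # point Hive at them (spark.yarn.jars can also go in hive-site.xml)
  hive --hiveconf hive.execution.engine=spark \
       --hiveconf 'spark.yarn.jars=hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*' \
       -e 'select date_key, count(*) from fe_inventory.merged_properties_hist group by 1 order by 1;'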


Re: Hive on Spark

2017-08-22 Thread Vihang Karajgaonkar
Xuefu is planning to give a talk on Hive-on-Spark @Uber at the user meetup
this week. We can check if he can share the presentation on this list for
folks who can't attend the meetup.

https://www.meetup.com/Hive-User-Group-Meeting/events/242210487/


On Mon, Aug 21, 2017 at 11:44 PM, peter zhang 
wrote:

> Hi All,
> Has anybody used hive on spark in your production environment? How
> does it's the stability and performance compared with spark sql?
> Hope anybody can share your experience.
>
> Thanks in advance!
>


Hive on Spark

2017-08-21 Thread peter zhang
Hi All,
Has anybody used hive on spark in your production environment? How
is its stability and performance compared with spark sql?
Hope somebody can share their experience.

Thanks in advance!


Re: hive on spark - version question

2017-03-18 Thread yuxh
I met the same problem; it seems JavaSparkListener has been deleted in Spark 2.
But I have seen someone using hive 1.2.1 with spark 2 OK. I haven't tried it yet.




------ Original message ------
From: "Stephen Sprague"
Sent: 2017-03-18 (Sat) 2:33
To: "user@hive.apache.org"
Subject: Re: hive on spark - version question



:(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will work with 
Hive 2.1.0 ?  That JavaSparkListener class looks like a deal breaker to me, 
alas.


thanks in advance.


Cheers,

Stephen.



On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague  wrote:
hi guys,

wondering where we stand with Hive On Spark these days?


i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental versions) 
and running up against this class not found:

java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener



searching the Cyber i find this:
1. 
http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive


which pretty much describes my situation too and it references this:



2. https://issues.apache.org/jira/browse/SPARK-17563



which indicates a "won't fix" - but does reference this:



3. https://issues.apache.org/jira/browse/HIVE-14029


which looks to be fixed in hive 2.2 - which is not released yet.



so if i want to use spark 2.1.0 with hive am i out of luck - until hive 2.2?


thanks,

Stephen.

Re: hive on spark - version question

2017-03-17 Thread Stephen Sprague
yeah but... is the glass half-full or half-empty?  sure this might suck but
keep your head high, bro! Lots of it (hive) does work. :)


On Fri, Mar 17, 2017 at 2:25 PM, hernan saab 
wrote:

> Stephan,
>
> Thanks for the response.
>
> The one thing that I don't appreciate from those who promote and DOCUMENT
> spark on hive is that, seemingly, there is absolutely no evidence seen that
> says that hive on spark WORKS.
> As a matter of fact, after a lot of pain, I noticed it is not supported by
> just about anybody.
>
> If someone dares to document Hive on Spark (see link
> https://cwiki.apache.org/confluence/display/Hive/Hive+
> on+Spark%3A+Getting+Started)  why can't they have the decency to mention
> what specific combo of Hadoop/Spark/Hive versions used that works? Have a
> git repo included in a doc with all the right versions and libraries. Why
> not? We can start from there and progressively use newer libraries in case
> the doc becomes stale. I am not really asking much, I just want to know
> what the documenter used to claim that Hive on Spark works, that's it.
>
> Clearly, for most cases, this setup is broken and it misleads people to
> waste time on a broken setup.
>
> I love this tech. But I do notice that there is some mean spirited or very
> negligent actions made by the apache development community. Documenting
> hive on spark while knowing it won't work for most cases means apache
> developers don't give a crap about the time wasted by people like us.
>
>
>
>
> On Friday, March 17, 2017 1:14 PM, Edward Capriolo 
> wrote:
>
>
>
>
> On Fri, Mar 17, 2017 at 2:56 PM, hernan saab  > wrote:
>
> I have been in a similar world of pain. Basically, I tried to use an
> external Hive to have user access controls with a spark engine.
> At the end, I realized that it was a better idea to use apache tez instead
> of a spark engine for my particular case.
>
> But the journey is what I want to share with you.
> The big data apache tools and libraries such as Hive, Tez, Spark, Hadoop ,
> Parquet etc etc are not interchangeable as we would like to think. There
> are very limited combinations for very specific versions. This is why tools
> like Ambari can be useful. Ambari sets a path of combos of versions known
> to work and the dirty work is done under the UI.
>
> More often than not, when you try a version that few people tried, you
> will get error messages that will derailed you and cause you to waste a lot
> of time.
>
> In addition, this group, as well as many other apache big data user
> groups,  provides extremely poor support for users. The answers you usually
> get are not even hints to a solution. Their answers usually translate to
> "there is nothing I am willing to do about your problem. If I did, I should
> get paid" in many cryptic ways.
>
> If you ask your question to the Spark group they will take you to the Hive
> group and viceversa (I can almost guarantee it based on previous
> experiences)
>
> But in hindsight, people who work on this kinds of things typically make
> more money that the average developers. If you make more $$s it makes sense
> learning this stuff is supposed to be harder.
>
> Conclusion, don't try it. Or try using Tez/Hive instead of Spark/Hive  if
> you are querying large files.
>
>
>
> On Friday, March 17, 2017 11:33 AM, Stephen Sprague 
> wrote:
>
>
> :(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will work
> with Hive 2.1.0 ?  That JavaSparkListener class looks like a deal breaker
> to me, alas.
>
> thanks in advance.
>
> Cheers,
> Stephen.
>
> On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague 
> wrote:
>
> hi guys,
> wondering where we stand with Hive On Spark these days?
>
> i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental
> versions) and running up against this class not found:
>
> java.lang. NoClassDefFoundError: org/apache/spark/ JavaSparkListener
>
>
> searching the Cyber i find this:
> 1. http://stackoverflow.com/ questions/41953688/setting-
> spark-as-default-execution- engine-for-hive
> <http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive>
>
> which pretty much describes my situation too and it references this:
>
>
> 2. https://issues.apache.org/ jira/browse/SPARK-17563
> <https://issues.apache.org/jira/browse/SPARK-17563>
>
> which indicates a "won't fix" - but does reference this:
>
>
> 3. https://issues.apache.org/ jira/browse/HIVE-14029
> <https://issues.apache.org/jira/browse/HIVE-14029>
>
> which looks to be fixed in hiv

Re: hive on spark - version question

2017-03-17 Thread hernan saab
Stephan,
Thanks for the response.
The one thing that I don't appreciate from those who promote and DOCUMENT spark 
on hive is that, seemingly, there is absolutely no evidence seen that says that 
hive on spark WORKS. As a matter of fact, after a lot of pain, I noticed it is 
not supported by just about anybody.
If someone dares to document Hive on Spark (see link 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
  why can't they have the decency to mention what specific combo of 
Hadoop/Spark/Hive versions used that works? Have a git repo included in a doc 
with all the right versions and libraries. Why not? We can start from there and 
progressively use newer libraries in case the doc becomes stale. I am not 
really asking much, I just want to know what the documenter used to claim that 
Hive on Spark works, that's it.
Clearly, for most cases, this setup is broken and it misleads people to waste 
time on a broken setup.
I love this tech. But I do notice that there is some mean spirited or very 
negligent actions made by the apache development community. Documenting hive on 
spark while knowing it won't work for most cases means apache developers don't 
give a crap about the time wasted by people like us.

 

On Friday, March 17, 2017 1:14 PM, Edward Capriolo  
wrote:
 

 

On Fri, Mar 17, 2017 at 2:56 PM, hernan saab  
wrote:

I have been in a similar world of pain. Basically, I tried to use an external 
Hive to have user access controls with a spark engine.At the end, I realized 
that it was a better idea to use apache tez instead of a spark engine for my 
particular case.
But the journey is what I want to share with you.The big data apache tools and 
libraries such as Hive, Tez, Spark, Hadoop , Parquet etc etc are not 
interchangeable as we would like to think. There are very limited combinations 
for very specific versions. This is why tools like Ambari can be useful. Ambari 
sets a path of combos of versions known to work and the dirty work is done 
under the UI. 
More often than not, when you try a version that few people tried, you will get 
error messages that will derailed you and cause you to waste a lot of time.
In addition, this group, as well as many other apache big data user groups,  
provides extremely poor support for users. The answers you usually get are not 
even hints to a solution. Their answers usually translate to "there is nothing 
I am willing to do about your problem. If I did, I should get paid" in many 
cryptic ways.
If you ask your question to the Spark group they will take you to the Hive 
group and viceversa (I can almost guarantee it based on previous experiences)
But in hindsight, people who work on this kinds of things typically make more 
money that the average developers. If you make more $$s it makes sense learning 
this stuff is supposed to be harder.
Conclusion, don't try it. Or try using Tez/Hive instead of Spark/Hive  if you 
are querying large files.
 

On Friday, March 17, 2017 11:33 AM, Stephen Sprague  
wrote:
 

 :(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will work 
with Hive 2.1.0 ?  That JavaSparkListener class looks like a deal breaker to 
me, alas.

thanks in advance.

Cheers,
Stephen.

On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague  wrote:

hi guys,
wondering where we stand with Hive On Spark these days?

i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental versions) 
and running up against this class not found:

java.lang. NoClassDefFoundError: org/apache/spark/ JavaSparkListener


searching the Cyber i find this:
    1. http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive

    which pretty much describes my situation too and it references this:


    2. https://issues.apache.org/jira/browse/SPARK-17563

    which indicates a "won't fix" - but does reference this:


    3. https://issues.apache.org/jira/browse/HIVE-14029

    which looks to be fixed in hive 2.2 - which is not released yet.


so if i want to use spark 2.1.0 with hive am i out of luck - until hive 2.2?

thanks,
Stephen.





   

Stephan,  
I understand some of your frustration.  Remember that many in open source are 
volunteering their time. This is why if you pay a vendor for support of some 
software you might pay 50K a year or $200.00 an hour. If I was your 
vendor/consultant I would have started the clock 10 minutes ago just to answer 
this email :). The only "pay" I ever got from Hive is that I can use it as a 
resume bullet point, and I wrote a book which pays me royalties.
As it relates specifically to your problem, when you see the trends you are 
seeing it probably means you are in a minority of the user base. Either your 
doing something no one else is doing, you are too cutting edge, or no one has 
an easy solution. Hive is making the move from the classic MapReduce, two 

Re: hive on spark - version question

2017-03-17 Thread Stephen Sprague
thanks for the comments and for sure all relevant. And yeah I feel the pain
just like the next guy but that's the part of the opensource "life style"
you subscribe to when using it.

The upside payoff has gotta be worth the downside risk - or else forget
about it right? Here in the Hive world in my experience anyway it's been
great.  Gotta roll with it, be courteous, be persistent and sometimes
things just work out.

Getting back to Spark and Tez, yes by all means i'm a big Tez user already so
i was hoping to see what Spark brought to the table and i didn't want to diddle
around with Spark < 2.0.   That's cool. I can live with that not being
nailed down yet. I'll just wait for hive 2.2 and rattle the cage again! ha!


All good!

Cheers,
Stephen.

On Fri, Mar 17, 2017 at 1:14 PM, Edward Capriolo 
wrote:

>
>
> On Fri, Mar 17, 2017 at 2:56 PM, hernan saab  > wrote:
>
>> I have been in a similar world of pain. Basically, I tried to use an
>> external Hive to have user access controls with a spark engine.
>> At the end, I realized that it was a better idea to use apache tez
>> instead of a spark engine for my particular case.
>>
>> But the journey is what I want to share with you.
>> The big data apache tools and libraries such as Hive, Tez, Spark, Hadoop
>> , Parquet etc etc are not interchangeable as we would like to think. There
>> are very limited combinations for very specific versions. This is why tools
>> like Ambari can be useful. Ambari sets a path of combos of versions known
>> to work and the dirty work is done under the UI.
>>
>> More often than not, when you try a version that few people tried, you
>> will get error messages that will derailed you and cause you to waste a lot
>> of time.
>>
>> In addition, this group, as well as many other apache big data user
>> groups,  provides extremely poor support for users. The answers you usually
>> get are not even hints to a solution. Their answers usually translate to
>> "there is nothing I am willing to do about your problem. If I did, I should
>> get paid" in many cryptic ways.
>>
>> If you ask your question to the Spark group they will take you to the
>> Hive group and viceversa (I can almost guarantee it based on previous
>> experiences)
>>
>> But in hindsight, people who work on this kinds of things typically make
>> more money that the average developers. If you make more $$s it makes sense
>> learning this stuff is supposed to be harder.
>>
>> Conclusion, don't try it. Or try using Tez/Hive instead of Spark/Hive  if
>> you are querying large files.
>>
>>
>>
>> On Friday, March 17, 2017 11:33 AM, Stephen Sprague 
>> wrote:
>>
>>
>> :(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will
>> work with Hive 2.1.0 ?  That JavaSparkListener class looks like a deal
>> breaker to me, alas.
>>
>> thanks in advance.
>>
>> Cheers,
>> Stephen.
>>
>> On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague 
>> wrote:
>>
>> hi guys,
>> wondering where we stand with Hive On Spark these days?
>>
>> i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental
>> versions) and running up against this class not found:
>>
>> java.lang. NoClassDefFoundError: org/apache/spark/ JavaSparkListener
>>
>>
>> searching the Cyber i find this:
>> 1. http://stackoverflow.com/ questions/41953688/setting-
>> spark-as-default-execution- engine-for-hive
>> <http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive>
>>
>> which pretty much describes my situation too and it references this:
>>
>>
>> 2. https://issues.apache.org/ jira/browse/SPARK-17563
>> <https://issues.apache.org/jira/browse/SPARK-17563>
>>
>> which indicates a "won't fix" - but does reference this:
>>
>>
>> 3. https://issues.apache.org/ jira/browse/HIVE-14029
>> <https://issues.apache.org/jira/browse/HIVE-14029>
>>
>> which looks to be fixed in hive 2.2 - which is not released yet.
>>
>>
>> so if i want to use spark 2.1.0 with hive am i out of luck - until hive
>> 2.2?
>>
>> thanks,
>> Stephen.
>>
>>
>>
>>
>>
> Stephan,
>
> I understand some of your frustration.  Remember that many in open source
> are volunteering their time. This is why if you pay a vendor for support of
> some software you might pay 50K a year or $200.00 an hour. If I was your
> vendor/consultant I would have starte

Re: hive on spark - version question

2017-03-17 Thread Edward Capriolo
On Fri, Mar 17, 2017 at 2:56 PM, hernan saab 
wrote:

> I have been in a similar world of pain. Basically, I tried to use an
> external Hive to have user access controls with a spark engine.
> At the end, I realized that it was a better idea to use apache tez instead
> of a spark engine for my particular case.
>
> But the journey is what I want to share with you.
> The big data apache tools and libraries such as Hive, Tez, Spark, Hadoop ,
> Parquet etc etc are not interchangeable as we would like to think. There
> are very limited combinations for very specific versions. This is why tools
> like Ambari can be useful. Ambari sets a path of combos of versions known
> to work and the dirty work is done under the UI.
>
> More often than not, when you try a version that few people tried, you
> will get error messages that will derailed you and cause you to waste a lot
> of time.
>
> In addition, this group, as well as many other apache big data user
> groups,  provides extremely poor support for users. The answers you usually
> get are not even hints to a solution. Their answers usually translate to
> "there is nothing I am willing to do about your problem. If I did, I should
> get paid" in many cryptic ways.
>
> If you ask your question to the Spark group they will take you to the Hive
> group and viceversa (I can almost guarantee it based on previous
> experiences)
>
> But in hindsight, people who work on this kinds of things typically make
> more money that the average developers. If you make more $$s it makes sense
> learning this stuff is supposed to be harder.
>
> Conclusion, don't try it. Or try using Tez/Hive instead of Spark/Hive  if
> you are querying large files.
>
>
>
> On Friday, March 17, 2017 11:33 AM, Stephen Sprague 
> wrote:
>
>
> :(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will work
> with Hive 2.1.0 ?  That JavaSparkListener class looks like a deal breaker
> to me, alas.
>
> thanks in advance.
>
> Cheers,
> Stephen.
>
> On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague 
> wrote:
>
> hi guys,
> wondering where we stand with Hive On Spark these days?
>
> i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental
> versions) and running up against this class not found:
>
> java.lang. NoClassDefFoundError: org/apache/spark/ JavaSparkListener
>
>
> searching the Cyber i find this:
> 1. http://stackoverflow.com/ questions/41953688/setting-
> spark-as-default-execution- engine-for-hive
> <http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive>
>
> which pretty much describes my situation too and it references this:
>
>
> 2. https://issues.apache.org/ jira/browse/SPARK-17563
> <https://issues.apache.org/jira/browse/SPARK-17563>
>
> which indicates a "won't fix" - but does reference this:
>
>
> 3. https://issues.apache.org/ jira/browse/HIVE-14029
> <https://issues.apache.org/jira/browse/HIVE-14029>
>
> which looks to be fixed in hive 2.2 - which is not released yet.
>
>
> so if i want to use spark 2.1.0 with hive am i out of luck - until hive
> 2.2?
>
> thanks,
> Stephen.
>
>
>
>
>
Stephan,

I understand some of your frustration.  Remember that many in open source
are volunteering their time. This is why if you pay a vendor for support of
some software you might pay 50K a year or $200.00 an hour. If I was your
vendor/consultant I would have started the clock 10 minutes ago just to
answer this email :). The only "pay" I ever got from Hive is that I can use
it as a resume bullet point, and I wrote a book which pays me royalties.

As it relates specifically to your problem, when you see the trends you are
seeing it probably means you are in a minority of the user base. Either
you're doing something no one else is doing, you are too cutting edge, or no
one has an easy solution. Hive is making the move from the classic
MapReduce; two other execution engines have been made, Tez and HiveOnSpark.
Because we are open source we allow people to "scratch an itch"; that is the
Apache way. From time to time it means something that was added stops being
viable because of lack of support.

I agree with your final assessment, which is that Tez is the most viable engine
for Hive. This is by no means a put-down of the HiveOnSpark work, and it
does not mean it will never be the most viable. By the same token, if the
versions fall out of sync and all that exists is complaints, the viability
speaks for itself.

Remember that keeping two fast moving things together is no easy chore. I
used to run Hive + cassandra. Seems easy, crap two versions of common CLI,
shade one version everything works, crap new hive release has different
versions of thrift, shade + patch, crap now one of the other dependencies
is incompatible fork + shade + patch. At some point you have to say to
yourself if I can not make critical mass of this solution such that I am
the only one doing/patching it then I give up and find some other way to do
it.


Re: hive on spark - version question

2017-03-17 Thread hernan saab
I have been in a similar world of pain. Basically, I tried to use an external
Hive to have user access controls with a spark engine. At the end, I realized
that it was a better idea to use apache tez instead of a spark engine for my
particular case.

But the journey is what I want to share with you. The big data apache tools and
libraries such as Hive, Tez, Spark, Hadoop, Parquet, etc. are not
interchangeable as we would like to think. There are very limited combinations
of very specific versions. This is why tools like Ambari can be useful. Ambari
sets a path of combos of versions known to work, and the dirty work is done
under the UI.

More often than not, when you try a version that few people have tried, you will
get error messages that will derail you and cause you to waste a lot of time.

In addition, this group, as well as many other apache big data user groups,
provides extremely poor support for users. The answers you usually get are not
even hints to a solution. Their answers usually translate to "there is nothing
I am willing to do about your problem. If I did, I should get paid" in many
cryptic ways.

If you ask your question to the Spark group they will take you to the Hive
group and vice versa (I can almost guarantee it based on previous experiences).

But in hindsight, people who work on these kinds of things typically make more
money than the average developer. If you make more $$s, it makes sense that
learning this stuff is supposed to be harder.

Conclusion: don't try it. Or try using Tez/Hive instead of Spark/Hive if you
are querying large files.
 

On Friday, March 17, 2017 11:33 AM, Stephen Sprague  
wrote:
 

 :(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will work 
with Hive 2.1.0 ?  That JavaSparkListener class looks like a deal breaker to 
me, alas.

thanks in advance.

Cheers,
Stephen.

On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague  wrote:

hi guys,
wondering where we stand with Hive On Spark these days?

i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental versions) 
and running up against this class not found:

java.lang. NoClassDefFoundError: org/apache/spark/ JavaSparkListener


searching the Cyber i find this:
    1. http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive

    which pretty much describes my situation too and it references this:


    2. https://issues.apache.org/jira/browse/SPARK-17563

    which indicates a "won't fix" - but does reference this:


    3. https://issues.apache.org/jira/browse/HIVE-14029

    which looks to be fixed in hive 2.2 - which is not released yet.


so if i want to use spark 2.1.0 with hive am i out of luck - until hive 2.2?

thanks,
Stephen.





   

Re: hive on spark - version question

2017-03-17 Thread Stephen Sprague
:(  gettin' no love on this one.   any SME's know if Spark 2.1.0 will work
with Hive 2.1.0 ?  That JavaSparkListener class looks like a deal breaker
to me, alas.

thanks in advance.

Cheers,
Stephen.

On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague 
wrote:

> hi guys,
> wondering where we stand with Hive On Spark these days?
>
> i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental
> versions) and running up against this class not found:
>
> java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener
>
>
> searching the Cyber i find this:
> 1. http://stackoverflow.com/questions/41953688/setting-
> spark-as-default-execution-engine-for-hive
>
> which pretty much describes my situation too and it references this:
>
>
> 2. https://issues.apache.org/jira/browse/SPARK-17563
>
> which indicates a "won't fix" - but does reference this:
>
>
> 3. https://issues.apache.org/jira/browse/HIVE-14029
>
> which looks to be fixed in hive 2.2 - which is not released yet.
>
>
> so if i want to use spark 2.1.0 with hive am i out of luck - until hive
> 2.2?
>
> thanks,
> Stephen.
>
>


hive on spark - version question

2017-03-13 Thread Stephen Sprague
hi guys,
wondering where we stand with Hive On Spark these days?

i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental
versions) and running up against this class not found:

java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener


searching the Cyber i find this:
1.
http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive

which pretty much describes my situation too and it references this:


2. https://issues.apache.org/jira/browse/SPARK-17563

which indicates a "won't fix" - but does reference this:


3. https://issues.apache.org/jira/browse/HIVE-14029

which looks to be fixed in hive 2.2 - which is not released yet.


so if i want to use spark 2.1.0 with hive am i out of luck - until hive 2.2?

thanks,
Stephen.
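
A quick way to confirm the class-level mismatch described above is to look for the class
inside the Spark distribution's jars; the install path below is an assumption, adjust it to
the local layout:

  # HIVE-14029 tracks replacing JavaSparkListener, which Spark 2.x no longer ships
  unzip -l /usr/lib/spark-2.1.0-bin-hadoop2.6/jars/spark-core_2.11-2.1.0.jar \
      | grep JavaSparkListener || echo "JavaSparkListener not present in this Spark build"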


Re: Need inputs on configuring hive timeout + hive on spark : Job hasn't been submitted after 61s. Aborting it.

2017-02-18 Thread Ian Cook
Naresh,

The properties hive.spark.job.monitor.timeout and
hive.spark.client.server.connect.timeout in hive-site.xml control Hive on Spark
timeouts. Details at
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark
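
A sketch of how those might be set for a single run; the values are illustrative only (the 61s
in the error above appears to line up with the 60s default of the job monitor timeout), and
both properties can equally go in hive-site.xml:

  # placeholder script name and values; raise the monitor and connect timeouts for this run
  hive --hiveconf hive.spark.job.monitor.timeout=180s \
       --hiveconf hive.spark.client.server.connect.timeout=300000ms \
       -f my_queries.hql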

Ian Cook
Cloudera

On Thu, Feb 16, 2017 at 2:24 PM, naresh gundla 
wrote:

> Hello,
>
>
> i am facing this issue "Job hasn't been submitted after 61s. Aborting it."
> when i am running multiple hive queries.
>
> Details: (Hive on Spark)
> I am using spark dynamic allocation and external shuffle service (yarn)
>
> Assume one queries is using all of the resources in the cluster and when
> the new querie launched then it throws with this error in hive log
>
> 2017-02-16 06:12:59,166 INFO  [main]: status.SparkJobMonitor
> (RemoteSparkJobMonitor.java:startMonitor(67)) -* Job hasn't been
> submitted after 61s. Aborting it.*
> 2017-02-16 06:12:59,166 ERROR [main]: status.SparkJobMonitor
> (SessionState.java:printError(960)) - Status: SENT
> 2017-02-16 06:12:59,167 INFO  [main]: log.PerfLogger
> (PerfLogger.java:PerfLogEnd(148)) -  start=1487254318158 end=1487254379167 duration=61009
> from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
> 2017-02-16 06:12:59,183 ERROR [main]: ql.Driver
> (SessionState.java:printError(960)) - FAILED: Execution Error, return
> code 2 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> 2017-02-16 06:12:59,184 INFO  [main]: log.PerfLogger
> (PerfLogger.java:PerfLogEnd(148)) -  start=1487254317999 end=1487254379184 duration=61185
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-16 06:12:59,184 INFO  [main]: log.PerfLogger
> (PerfLogger.java:PerfLogBegin(121)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-16 06:12:59,184 INFO  [main]: log.PerfLogger
> (PerfLogger.java:PerfLogEnd(148)) -  start=1487254379184 end=1487254379184 duration=0
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-16 06:12:59,201 INFO  [main]: log.PerfLogger
> (PerfLogger.java:PerfLogBegin(121)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-16 06:12:59,202 INFO  [main]: log.PerfLogger
> (PerfLogger.java:PerfLogEnd(148)) -  start=1487254379201 end=1487254379202 duration=1
> from=org.apache.hadoop.hive.ql.Driver>
>
> Is there any parameter to config , that the the query should wait until it
> get the requried resources and it should not fail.
>
>
> Thanks,
> Naresh
>


Need inputs on configuring hive timeout + hive on spark : Job hasn't been submitted after 61s. Aborting it.

2017-02-16 Thread naresh gundla
Hello,


i am facing this issue "Job hasn't been submitted after 61s. Aborting it."
when i am running multiple hive queries.

Details: (Hive on Spark)
I am using spark dynamic allocation and external shuffle service (yarn)
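
The settings referred to on the previous line typically look like the following; the values
are illustrative assumptions, not the poster's actual configuration:

  # dynamic allocation with the YARN external shuffle service
  # (can be set via hive-site.xml, spark-defaults.conf, or --hiveconf)
  hive --hiveconf spark.dynamicAllocation.enabled=true \
       --hiveconf spark.shuffle.service.enabled=true \
       --hiveconf spark.dynamicAllocation.minExecutors=1 \
       --hiveconf spark.dynamicAllocation.maxExecutors=20 \
       -f queries.hql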

Assume one query is using all of the resources in the cluster, and when
a new query is launched it throws this error in the hive log

2017-02-16 06:12:59,166 INFO  [main]: status.SparkJobMonitor
(RemoteSparkJobMonitor.java:startMonitor(67)) -* Job hasn't been submitted
after 61s. Aborting it.*
2017-02-16 06:12:59,166 ERROR [main]: status.SparkJobMonitor
(SessionState.java:printError(960)) - Status: SENT
2017-02-16 06:12:59,167 INFO  [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(148)) - 
2017-02-16 06:12:59,183 ERROR [main]: ql.Driver
(SessionState.java:printError(960)) - FAILED: Execution Error, return code
2 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
2017-02-16 06:12:59,184 INFO  [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(148)) - 
2017-02-16 06:12:59,184 INFO  [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(121)) - 
2017-02-16 06:12:59,184 INFO  [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(148)) - 
2017-02-16 06:12:59,201 INFO  [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(121)) - 
2017-02-16 06:12:59,202 INFO  [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(148)) - 

Is there any parameter to configure so that the query waits until it
gets the required resources instead of failing?


Thanks,
Naresh


hive on spark ,three tables(one is small, others are big),cannot go mapjoin

2017-01-03 Thread Maria
Hi all,
   I have a question.
My test HQL is:
"select tmp.src_ip,c.to_ip from (select a.src_ip,b.appid from small_tbl a join 
im b on a.src_ip=b.src_ip) tmp join email c on tmp.appid=c.appid"; im and 
email are big tables.
With hive.execution.engine=mr, the execution plan generates two mapjoin stages; 
with hive.execution.engine=spark, the execution plan generates one map join and 
one common join. That is to say,
"(select a.src_ip,b.appid from small_tbl a join im b on a.src_ip=b.src_ip)" goes 
through mapjoin, and its result "tmp" has only 10 rows, BUT "tmp join email" cannot 
go through mapjoin.
I debugged the code:


In hive-on-spark:
(1) (select a.src_ip,b.appid from small_tbl a join im b on a.src_ip=b.src_ip) 
->>>  MapWork.getMapredLocalWork() is OK; there is one 
MapRedLocalWork object.
(2) For the result of the previous stage, named 'tmp', joined with email, 
MapWork.getMapredLocalWork() is null.


Why can hive on spark not use mapjoin in this case? Thank you.
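
A hedged sketch of the settings that usually decide whether that second join
converts to a mapjoin. On the Spark engine the conversion is typically a
compile-time decision driven by the planner's size estimate for "tmp", not by
its actual 10 rows, so the estimate has to fall under the threshold below.
Parameter names are standard Hive settings; the values are illustrative:

set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask=true;
-- combined small-table size, in bytes, under which a mapjoin is generated
set hive.auto.convert.join.noconditionaltask.size=10000000;
-- column statistics on im and small_tbl help the planner estimate tmp's size
set hive.stats.fetch.column.stats=true;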







Re: please give me the permission to update the wiki of hive on spark

2017-01-03 Thread Lefty Leverenz
Done.  Welcome to the Hive wiki team, Kelly, and happy new year!

-- Lefty


On Mon, Jan 2, 2017 at 5:40 PM, Zhang, Liyun  wrote:

> Hi
>
>   I want to update the Hive on Spark wiki
> <https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started>
> because of HIVE-8373. My Confluence
> <https://cwiki.apache.org/confluence/signup.action> username is kellyzly;
> please grant me the privilege to update the wiki.
>
>
>
>
>
> Best Regards
>
> Kelly Zhang/Zhang,Liyun
>
>
>


please give me the permission to update the wiki of hive on spark

2017-01-02 Thread Zhang, Liyun
Hi

  I want to update the Hive on Spark wiki
<https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started>
because of HIVE-8373. My
Confluence <https://cwiki.apache.org/confluence/signup.action> username is
kellyzly; please grant me the privilege to update the wiki.


Best Regards
Kelly Zhang/Zhang,Liyun



RE: When Hive on Spark will support Spark 2.0?

2016-12-07 Thread Joaquin Alzola
The version that will support Spark 2.0 is Hive 2.2.

It is not yet known when this is going to be released.

-Original Message-
From: baipeng [mailto:b...@meitu.com]
Sent: 07 December 2016 08:04
To: user@hive.apache.org
Subject: When Hive on Spark will support Spark 2.0?

Does anyone know when Hive will release a version that supports Spark 2.0? Right 
now Hive 2.1.0 only supports Spark 1.6.


When Hive on Spark will support Spark 2.0?

2016-12-07 Thread baipeng
Does anyone know when Hive will release a version that supports Spark 2.0? Right 
now Hive 2.1.0 only supports Spark 1.6.


RE: Hive on Spark not working

2016-11-29 Thread Joaquin Alzola
Being unable to integrate Hive with Spark separately, I just started the Thrift 
Server directly on Spark.
Now it is working as expected.
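
For readers following the same route, the workaround above amounts to running the
Thrift Server that ships with Spark and querying it over the HiveServer2 JDBC
protocol. A rough sketch; the host, port and master URL are placeholders taken
from this thread's setup:

$SPARK_HOME/sbin/start-thriftserver.sh --master spark://172.16.173.31:7077
beeline -u jdbc:hive2://localhost:10000 -e "select count(*) from employee"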

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: 29 November 2016 11:12
To: user 
Subject: Re: Hive on Spark not working

Hive on Spark engine only works with Spark 1.3.1.


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 29 November 2016 at 07:56, Furcy Pin 
mailto:furcy@flaminem.com>> wrote:
ClassNotFoundException generally means that jars are missing from your class 
path.

You probably need to link the spark jar to $HIVE_HOME/lib
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive

On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola 
mailto:joaquin.alz...@lebara.com>> wrote:
Hi Guys

No matter what I do, when I execute “select count(*) from employee” I get 
the following output in the logs:
It is quite funny because if I put hive.execution.engine=mr the output is 
correct. If I put hive.execution.engine=spark then I get the errors below.
If I run the query directly through spark-shell it works great.
+---+
|_c0|
+---+
|1005635|
+---+
So there has to be a problem from hive to spark.

It seems the RPC(??) connection is not set up. Can somebody guide me on what 
to look for?
spark.master=spark://172.16.173.31:7077<http://172.16.173.31:7077>
hive.execution.engine=spark
spark.executor.extraClassPath
/mnt/spark/lib/spark-1.6.2-yarn-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar

Hive2.0.1--> Spark 1.6.2 –> Hadoop – 2.6.5 --> Scala 2.10

2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher 
(RpcDispatcher.java:handleError(142)) - Received error 
message:io.netty.handler.codec.DecoderException: 
java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.hiv

RE: Hive on Spark not working

2016-11-29 Thread Joaquin Alzola
Hi Mich,

I read in some older post that you made it work with the configuration I have 
as well:
Hive 2.0.1 --> Spark 1.6.2 --> Hadoop 2.6.5 --> Scala 2.10
Or did you only make it work with Hive 1.2.1 --> Spark 1.3.1 --> etc.?

BR

Joaquin

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: 29 November 2016 11:12
To: user 
Subject: Re: Hive on Spark not working

Hive on Spark engine only works with Spark 1.3.1.


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 29 November 2016 at 07:56, Furcy Pin 
mailto:furcy@flaminem.com>> wrote:
ClassNotFoundException generally means that jars are missing from your class 
path.

You probably need to link the spark jar to $HIVE_HOME/lib
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive

On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola 
mailto:joaquin.alz...@lebara.com>> wrote:
Hi Guys

No matter what I do, when I execute “select count(*) from employee” I get 
the following output in the logs:
It is quite funny because if I put hive.execution.engine=mr the output is 
correct. If I put hive.execution.engine=spark then I get the errors below.
If I run the query directly through spark-shell it works great.
+---+
|_c0|
+---+
|1005635|
+---+
So there has to be a problem from hive to spark.

It seems the RPC(??) connection is not set up. Can somebody guide me on what 
to look for?
spark.master=spark://172.16.173.31:7077<http://172.16.173.31:7077>
hive.execution.engine=spark
spark.executor.extraClassPath
/mnt/spark/lib/spark-1.6.2-yarn-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar

Hive2.0.1--> Spark 1.6.2 –> Hadoop – 2.6.5 --> Scala 2.10

2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher 
(RpcDispatcher.java:handleError(142)) - Received error 
message:io.netty.handler.codec.DecoderException: 
java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at jav

Re: Hive on Spark not working

2016-11-29 Thread Mich Talebzadeh
Hive on Spark engine only works with Spark 1.3.1.

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 29 November 2016 at 07:56, Furcy Pin  wrote:

> ClassNotFoundException generally means that jars are missing from your
> class path.
>
> You probably need to link the spark jar to $HIVE_HOME/lib
> https://cwiki.apache.org/confluence/display/Hive/Hive+
> on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive
>
> On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola  > wrote:
>
>> Hi Guys
>>
>>
>>
>> No matter what I do that when I execute “select count(*) from employee” I
>> get the following output on the logs:
>>
>> It is quiet funny because if I put hive.execution.engine=mr the output is
>> correct. If I put hive.execution.engine=spark then I get the bellow errors.
>>
>> If I do the search directly through spark-shell it work great.
>>
>> +---+
>>
>> |_c0|
>>
>> +---+
>>
>> |1005635|
>>
>> +---+
>>
>> So there has to be a problem from hive to spark.
>>
>>
>>
>> Seems as the RPC(??) connection is not setup …. Can somebody guide me on
>> what to look for.
>>
>> spark.master=spark://172.16.173.31:7077
>>
>> hive.execution.engine=spark
>>
>> spark.executor.extraClassPath/mnt/spark/lib/spark-1.6.2-yar
>> n-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar
>>
>>
>>
>> Hive 2.0.1 --> Spark 1.6.2 --> Hadoop 2.6.5 --> Scala 2.10
>>
>>
>>
>> 2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher
>> (RpcDispatcher.java:handleError(142)) - Received error
>> message:io.netty.handler.codec.DecoderException:
>> java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
>>
>> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteT
>> oMessageDecoder.java:358)
>>
>> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(Byte
>> ToMessageDecoder.java:230)
>>
>> at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteTo
>> MessageCodec.java:103)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannel
>> Read(AbstractChannelHandlerContext.java:308)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRe
>> ad(AbstractChannelHandlerContext.java:294)
>>
>> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(Ch
>> annelInboundHandlerAdapter.java:86)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannel
>> Read(AbstractChannelHandlerContext.java:308)
>>
>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRe
>> ad(AbstractChannelHandlerContext.java:294)
>>
>> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(Defa
>> ultChannelPipeline.java:846)
>>
>> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.
>> read(AbstractNioByteChannel.java:131)
>>
>> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEven
>> tLoop.java:511)
>>
>> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimiz
>> ed(NioEventLoop.java:468)
>>
>> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEve
>> ntLoop.java:382)
>>
>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>
>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(
>> SingleThreadEventExecutor.java:111)
>>
>> at java.lang.Thread.run(Thread.java:745)
>>
>> Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/J
>> ob
>>
>> at java.lang.ClassLoader.defineClass1(Native Method)
>>
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>
>> at java.security.SecureClassLoader.defineClass(SecureClassLoade
>> r.java:142)
>>
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>
>> at java.net.URLClassLoader.access$100(URLCl

Re: Hive on Spark not working

2016-11-28 Thread Furcy Pin
ClassNotFoundException generally means that jars are missing from your
class path.

You probably need to link the spark jar to $HIVE_HOME/lib
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive
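
A rough sketch of that wiring, with placeholder paths. For Hive releases before
2.2, the getting-started page describes making the Spark assembly jar (from a
Spark build without Hive jars) visible to Hive and telling Hive where Spark
lives; the jar name varies with the Spark build:

# placeholder paths; use the actual Spark install directory and assembly jar name
ln -s /mnt/spark/lib/spark-assembly-1.6.2-hadoop2.6.0.jar $HIVE_HOME/lib/
# in hive-env.sh (or set spark.home in hive-site.xml / the Hive session)
export SPARK_HOME=/mnt/spark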

On Tue, Nov 29, 2016 at 2:03 AM, Joaquin Alzola 
wrote:

> Hi Guys
>
>
>
> No matter what I do that when I execute “select count(*) from employee” I
> get the following output on the logs:
>
> It is quiet funny because if I put hive.execution.engine=mr the output is
> correct. If I put hive.execution.engine=spark then I get the bellow errors.
>
> If I do the search directly through spark-shell it work great.
>
> +---+
>
> |_c0|
>
> +---+
>
> |1005635|
>
> +---+
>
> So there has to be a problem from hive to spark.
>
>
>
> Seems as the RPC(??) connection is not setup …. Can somebody guide me on
> what to look for.
>
> spark.master=spark://172.16.173.31:7077
>
> hive.execution.engine=spark
>
> spark.executor.extraClassPath/mnt/spark/lib/spark-1.6.2-
> yarn-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar
>
>
>
> Hive 2.0.1 --> Spark 1.6.2 --> Hadoop 2.6.5 --> Scala 2.10
>
>
>
> 2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher
> (RpcDispatcher.java:handleError(142)) - Received error
> message:io.netty.handler.codec.DecoderException: 
> java.lang.NoClassDefFoundError:
> org/apache/hive/spark/client/Job
>
> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(
> ByteToMessageDecoder.java:358)
>
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(
> ByteToMessageDecoder.java:230)
>
> at io.netty.handler.codec.ByteToMessageCodec.channelRead(
> ByteToMessageCodec.java:103)
>
> at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:308)
>
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(
> AbstractChannelHandlerContext.java:294)
>
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(
> ChannelInboundHandlerAdapter.java:86)
>
> at io.netty.channel.AbstractChannelHandlerContext.
> invokeChannelRead(AbstractChannelHandlerContext.java:308)
>
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(
> AbstractChannelHandlerContext.java:294)
>
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(
> DefaultChannelPipeline.java:846)
>
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(
> AbstractNioByteChannel.java:131)
>
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(
> NioEventLoop.java:511)
>
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(
> NioEventLoop.java:468)
>
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(
> NioEventLoop.java:382)
>
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.
> run(SingleThreadEventExecutor.java:111)
>
> at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/
> Job
>
> at java.lang.ClassLoader.defineClass1(Native Method)
>
> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>
> at java.security.SecureClassLoader.defineClass(
> SecureClassLoader.java:142)
>
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>
> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>
> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>
> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>
> at java.lang.Class.forName0(Native Method)
>
> at java.lang.Class.forName(Class.java:348)
>
> at org.apache.hive.com.esotericsoftware.kryo.util.
> DefaultClassResolver.readName(DefaultClassResolver.java:154)
>
> at org.apache.hive.com.esotericsoftware.kryo.util.
> DefaultClassResolver.readClass(DefaultClassResolver.java:133)
>
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.
> readClass(Kryo.java:670)
>
>

Hive on Spark not working

2016-11-28 Thread Joaquin Alzola
Hi Guys

No matter what I do, when I execute "select count(*) from employee" I get 
the following output in the logs:
It is quite funny because if I put hive.execution.engine=mr the output is 
correct. If I put hive.execution.engine=spark then I get the errors below.
If I run the query directly through spark-shell it works great.
+---+
|_c0|
+---+
|1005635|
+---+
So there has to be a problem from hive to spark.

It seems the RPC(??) connection is not set up. Can somebody guide me on what 
to look for?
spark.master=spark://172.16.173.31:7077
hive.execution.engine=spark
spark.executor.extraClassPath
/mnt/spark/lib/spark-1.6.2-yarn-shuffle.jar:/mnt/hive/lib/hive-exec-2.0.1.jar

Hive 2.0.1 --> Spark 1.6.2 --> Hadoop 2.6.5 --> Scala 2.10

2016-11-29T00:35:11,099 WARN  [RPC-Handler-2]: rpc.RpcDispatcher 
(RpcDispatcher.java:handleError(142)) - Received error 
message:io.netty.handler.codec.DecoderException: 
java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
at 
io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at 
org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:97)
at 
io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42)
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:327)
... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.hive.spark.client.Job
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 39 more

Re: Hive on Spark - Mesos

2016-09-15 Thread Mich Talebzadeh
Sorry, on YARN only, but I gather it should work with Mesos. I don't think
that comes into it.

The issue is the compatibility of Spark assembly library with Hive.

HTH

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 15 September 2016 at 22:41, John Omernik  wrote:

> Did you run it on Mesos? Your presentation doesn't mention Mesos at all...
>
> John
>
>
> On Thu, Sep 15, 2016 at 4:20 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Yes you can. Hive on Spark, meaning Hive using Spark as its execution
>> engine, works fine. The version that I managed to make work is any Hive
>> version > 1.2 with Spark 1.3.1.
>>
>> You  need to build Spark from the source code excluding Hive libraries.
>>
>> Check my attached presentation.
>>
>>  HTH
>>
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>> On 15 September 2016 at 22:10, John Omernik  wrote:
>>
>>> Hey all, I was experimenting with some bleeding edge Hive.  (2.1) and
>>> trying to get it to run on bleeding edge Spark (2.0).
>>>
>>> Spark is working fine, I can query the data all is setup, however, I
>>> can't get Hive on Spark to work. I understand it's not really a thing (Hive
>>> on Spark on Mesos) but I am thinking... why not?  Thus I am posting here.
>>> (I.e. is there some reason why this shouldn't work other than it just
>>> hasn't been attempted?)
>>>
>>> The error I am getting is odd.. (see below) not sure why that would pop
>>> up, everything seems right other wise... any help would be appreciated.
>>>
>>> John
>>>
>>>
>>>
>>>
>>> at java.lang.ClassLoader.defineClass1(Native Method)
>>>
>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>>>
>>> at java.security.SecureClassLoader.defineClass(SecureClassLoade
>>> r.java:142)
>>>
>>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>>
>>> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>>>
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>>>
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>>>
>>> at java.security.AccessController.doPrivileged(Native Method)
>>>
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>>>
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>
>>> at java.lang.Class.forName0(Native Method)
>>>
>>> at java.lang.Class.forName(Class.java:348)
>>>
>>> at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy
>>> $SparkSubmit$$runMain(SparkSubmit.scala:686)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit
>>> .scala:185)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
>>>
>>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> Caused by: java.lang.ClassNotFoundException:
>>> org.apache.spark.JavaSparkListener
>>>
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>
>>> ... 20 more
>>>
>>>
>>> at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcS
>>> erver.java:179)
>>>
>>> at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClie
>>> ntImpl.java:465)
>>>
>>
>>
>


Re: Hive on Spark - Mesos

2016-09-15 Thread John Omernik
Did you run it on Mesos? Your presentation doesn't mention Mesos at all...

John


On Thu, Sep 15, 2016 at 4:20 PM, Mich Talebzadeh 
wrote:

> Yes you can. Hive on Spark, meaning Hive using Spark as its execution
> engine, works fine. The version that I managed to make work is any Hive
> version > 1.2 with Spark 1.3.1.
>
> You  need to build Spark from the source code excluding Hive libraries.
>
> Check my attached presentation.
>
>  HTH
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 15 September 2016 at 22:10, John Omernik  wrote:
>
>> Hey all, I was experimenting with some bleeding edge Hive.  (2.1) and
>> trying to get it to run on bleeding edge Spark (2.0).
>>
>> Spark is working fine, I can query the data all is setup, however, I
>> can't get Hive on Spark to work. I understand it's not really a thing (Hive
>> on Spark on Mesos) but I am thinking... why not?  Thus I am posting here.
>> (I.e. is there some reason why this shouldn't work other than it just
>> hasn't been attempted?)
>>
>> The error I am getting is odd.. (see below) not sure why that would pop
>> up, everything seems right other wise... any help would be appreciated.
>>
>> John
>>
>>
>>
>>
>> at java.lang.ClassLoader.defineClass1(Native Method)
>>
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>>
>> at java.security.SecureClassLoader.defineClass(SecureClassLoade
>> r.java:142)
>>
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>
>> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>>
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>>
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>>
>> at java.security.AccessController.doPrivileged(Native Method)
>>
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>
>> at java.lang.Class.forName0(Native Method)
>>
>> at java.lang.Class.forName(Class.java:348)
>>
>> at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
>>
>> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy
>> $SparkSubmit$$runMain(SparkSubmit.scala:686)
>>
>> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit
>> .scala:185)
>>
>> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
>>
>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
>>
>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.JavaSparkListener
>>
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>
>> ... 20 more
>>
>>
>> at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcS
>> erver.java:179)
>>
>> at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClie
>> ntImpl.java:465)
>>
>
>


Hive on Spark - Mesos

2016-09-15 Thread John Omernik
Hey all, I was experimenting with some bleeding edge Hive (2.1) and
trying to get it to run on bleeding edge Spark (2.0).

Spark is working fine, I can query the data and everything is set up; however,
I can't get Hive on Spark to work. I understand it's not really a thing (Hive on
Spark on Mesos) but I am thinking... why not?  Thus I am posting here.
(I.e. is there some reason why this shouldn't work other than it just
hasn't been attempted?)

The error I am getting is odd (see below). I'm not sure why that would pop up;
everything seems right otherwise... any help would be appreciated.

John




at java.lang.ClassLoader.defineClass1(Native Method)

at java.lang.ClassLoader.defineClass(ClassLoader.java:760)

at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)

at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)

at java.net.URLClassLoader.access$100(URLClassLoader.java:73)

at java.net.URLClassLoader$1.run(URLClassLoader.java:368)

at java.net.URLClassLoader$1.run(URLClassLoader.java:362)

at java.security.AccessController.doPrivileged(Native Method)

at java.net.URLClassLoader.findClass(URLClassLoader.java:361)

at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)

at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

at java.lang.Class.forName0(Native Method)

at java.lang.Class.forName(Class.java:348)

at org.apache.spark.util.Utils$.classForName(Utils.scala:225)

at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:686)

at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)

at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)

at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)

at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Caused by: java.lang.ClassNotFoundException:
org.apache.spark.JavaSparkListener

at java.net.URLClassLoader.findClass(URLClassLoader.java:381)

at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)

at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

... 20 more


at
org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179)

at
org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:465)


Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Benjamin Schaff
Hi,

Thanks for the answer.

I am running on a custom build of Spark 1.6.2, i.e. the one described in the
Hive documentation, built without the Hive jars.
I set it up in hive-env.sh.

I created the istari table as in the documentation and ran an INSERT on it,
then a GROUP BY.
Everything ran correctly on the Spark standalone cluster, with no exception anywhere.

Do you have any other suggestions?

Thanks.
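
One hedged guess worth ruling out, since hc__member is transactional: the
vectorized ORC reader and uncompacted ACID delta files have not always mixed
well, so it may be useful to check whether the error survives with vectorization
off and after a major compaction. Illustrative settings only, not a confirmed fix:

set hive.vectorized.execution.enabled=false;
-- the reading session also needs the ACID transaction manager enabled
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- rewrite the deltas into a base file so the query no longer reads raw deltas
alter table hc__member compact 'major';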

On Wed, 14 Sept 2016 at 13:55, Mich Talebzadeh 
wrote:

> Hi,
>
> You are using Hive 2. What is the Spark version that runs as Hive
> execution engine?
>
> I cannot see spark.home in your hive-site.xml so I cannot figure it out.
>
> BTW you are using Spark standalone as the mode. I tend to use yarn-client.
>
> Now back to the above issue. Do other queries work OK with Hive on Spark?
>
> Some of those perf parameters can be set up in Hive session itself or
> through init file
>
>  set spark.home=/usr/lib/spark-1.6.2-bin-hadoop2.6;
> set spark.master=yarn;
> set spark.deploy.mode=client;
> set spark.executor.memory=8g;
> set spark.driver.memory=8g;
> set spark.executor.instances=6;
> set spark.ui.port=;
>
>
> HTH
>
>
>
>
>
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 14 September 2016 at 18:28, Benjamin Schaff 
> wrote:
>
>> Hi,
>>
>> After several days trying to figure out the problem I'm stuck with a
>> class cast exception when running a query with hive on spark on orc tables
>> that I updated with the streaming mutation api of hive 2.0.
>>
>> The context is the following:
>>
>> For hive:
>>
>> The version is the latest available from the website 2.1
>> I created some scala code to insert data into an orc table with the
>> streaming mutation api followed the example provided somewhere in the hive
>> repository.
>>
>> The table looks like that:
>>
>> ++--+
>> |   createtab_stmt   |
>> ++--+
>> | CREATE TABLE `hc__member`( |
>> |   `rdv_core__key` bigint,  |
>> |   `rdv_core__domainkey` string,|
>> |   `rdftypes` array,|
>> |   `rdv_org__firstname` string, |
>> |   `rdv_org__middlename` string,|
>> |   `rdv_org__lastname` string,  |
>> |   `rdv_org__gender` string,|
>> |   `rdv_org__city` string,  |
>> |   `rdv_org__state` string, |
>> |   `rdv_org__countrycode` string,   |
>> |   `rdv_org__addresslabel` string,  |
>> |   `rdv_org__zip` string)   |
>> | CLUSTERED BY ( |
>> |   rdv_core__key)   |
>> | INTO 24 BUCKETS|
>> | ROW FORMAT SERDE   |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
>> | STORED AS INPUTFORMAT  |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'|
>> | OUTPUTFORMAT   |
>> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'   |
>> | LOCATION   |
>> |   'hdfs://hmaster:8020/user/hive/warehouse/hc__member' |
>> | TBLPROPERTIES (|
>> |   'COLUMN_STATS_ACCURATE'

Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Mich Talebzadeh
Hi,

You are using Hive 2. What is the Spark version that runs as Hive execution
engine?

I cannot see spark.home in your hive-site.xml so I cannot figure it out.

BTW you are using Spark standalone as the mode. I tend to use yarn-client.

Now back to the above issue. Do other queries work OK with Hive on Spark?

Some of those perf parameters can be set up in Hive session itself or
through init file

 set spark.home=/usr/lib/spark-1.6.2-bin-hadoop2.6;
set spark.master=yarn;
set spark.deploy.mode=client;
set spark.executor.memory=8g;
set spark.driver.memory=8g;
set spark.executor.instances=6;
set spark.ui.port=;


HTH








Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 14 September 2016 at 18:28, Benjamin Schaff 
wrote:

> Hi,
>
> After several days trying to figure out the problem I'm stuck with a class
> cast exception when running a query with hive on spark on orc tables that I
> updated with the streaming mutation api of hive 2.0.
>
> The context is the following:
>
> For hive:
>
> The version is the latest available from the website 2.1
> I created some scala code to insert data into an orc table with the
> streaming mutation api followed the example provided somewhere in the hive
> repository.
>
> The table looks like that:
>
> ++--+
> |   createtab_stmt   |
> ++--+
> | CREATE TABLE `hc__member`( |
> |   `rdv_core__key` bigint,  |
> |   `rdv_core__domainkey` string,|
> |   `rdftypes` array,|
> |   `rdv_org__firstname` string, |
> |   `rdv_org__middlename` string,|
> |   `rdv_org__lastname` string,  |
> |   `rdv_org__gender` string,|
> |   `rdv_org__city` string,  |
> |   `rdv_org__state` string, |
> |   `rdv_org__countrycode` string,   |
> |   `rdv_org__addresslabel` string,  |
> |   `rdv_org__zip` string)   |
> | CLUSTERED BY ( |
> |   rdv_core__key)   |
> | INTO 24 BUCKETS|
> | ROW FORMAT SERDE   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
> | STORED AS INPUTFORMAT  |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'|
> | OUTPUTFORMAT   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'   |
> | LOCATION   |
> |   'hdfs://hmaster:8020/user/hive/warehouse/hc__member' |
> | TBLPROPERTIES (|
> |   'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',|
> |   'compactor.mapreduce.map.memory.mb'='2048',  |
> |   'compactorthreshold.hive.compactor.delta.num.threshold'='4', |
> |   'compactorthreshold.hive.compactor.delta.pct.threshold'='0.5',   |
> |   'numFiles'='0',  |
> |   'numRows'='0',   |
> |   'rawDataSize'='0',   |
> |   'totalSize'='0', |
> |   'transactional'='true',  |
> |   'transient_lastDdlTime'='14

Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Benjamin Schaff
Hi,

After several days of trying to figure out the problem, I'm stuck with a class
cast exception when running a query with Hive on Spark against ORC tables that I
updated with the streaming mutation API of Hive 2.0.

The context is the following:

For Hive:

The version is the latest available from the website, 2.1.
I created some Scala code to insert data into an ORC table with the
streaming mutation API, following the example provided somewhere in the Hive
repository.

The table looks like that:

++--+
|   createtab_stmt   |
++--+
| CREATE TABLE `hc__member`( |
|   `rdv_core__key` bigint,  |
|   `rdv_core__domainkey` string,|
|   `rdftypes` array,|
|   `rdv_org__firstname` string, |
|   `rdv_org__middlename` string,|
|   `rdv_org__lastname` string,  |
|   `rdv_org__gender` string,|
|   `rdv_org__city` string,  |
|   `rdv_org__state` string, |
|   `rdv_org__countrycode` string,   |
|   `rdv_org__addresslabel` string,  |
|   `rdv_org__zip` string)   |
| CLUSTERED BY ( |
|   rdv_core__key)   |
| INTO 24 BUCKETS|
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'|
| OUTPUTFORMAT   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'   |
| LOCATION   |
|   'hdfs://hmaster:8020/user/hive/warehouse/hc__member' |
| TBLPROPERTIES (|
|   'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',|
|   'compactor.mapreduce.map.memory.mb'='2048',  |
|   'compactorthreshold.hive.compactor.delta.num.threshold'='4', |
|   'compactorthreshold.hive.compactor.delta.pct.threshold'='0.5',   |
|   'numFiles'='0',  |
|   'numRows'='0',   |
|   'rawDataSize'='0',   |
|   'totalSize'='0', |
|   'transactional'='true',  |
|   'transient_lastDdlTime'='1473792939')|
++--+

The hive site looks like that:


 
hive.execution.engine
spark
  
  
spark.master
spark://hmaster:7077
  
  
spark.eventLog.enabled
false
  
  
spark.executor.memory
12g
  
  
spark.serializer
org.apache.spark.serializer.KryoSerializer
  
  
mapreduce.input.fileinputformat.split.maxsize
75000
  
  
hive.vectorized.execution.enabled
true
  
  
hive.cbo.enable
true
  
  
hive.optimize.reducededuplication.min.reducer
4
  
  
hive.optimize.reducededuplication
true
  
  
hive.orc.splits.include.file.footer
false
  
  
hive.merge.mapfiles
true
  
  
hive.merge.sparkfiles
true
  
  
hive.merge.smallfiles.avgsize
1600
  
  
hive.merge.size.per.task
25600
  
  
hive.merge.orcfile.stripe.level
true
  
  
hive.auto.convert.join
true
  
  
hive.auto.convert.join.noconditionaltask
true
  
  
hive.auto.convert.join.noconditionaltask.size
894435328
  
  
hive.optimize.bucketmapjoin.sortedmerge
false
  
  
hive.map.aggr.hash.percentmemory
0.5
  
  
hive.map.aggr
true
  
  
hive.optimize.sort.dynamic.partition
false
  
  
hive.stats.autogather
true
  
  
hive.stats.fetch.column.stats
true
  
  
hive.vectorized.execution.reduce.enabled
false
  
  
hive.vectorized.groupby.checkinterval
4096
  
  
hive.vectorized.groupby.flush.p

Re: hive on spark job not start enough executors

2016-09-09 Thread 明浩 冯
All the parameters except spark.executor.instances are specified in 
spark-defaults.conf located in Hive's conf folder, so I think it's a yes.

I also checked Spark's web page while a Hive on Spark job was running; the 
parameters shown on the web page are exactly what I specified in the config 
file, including spark.shuffle.service.enabled and 
spark.dynamicAllocation.enabled.


Should I specify a fixed spark.executor.instances in the file? That is not a good 
option for me.


By the way, the data source of my query is Parquet files. On the Hive side I just 
created an external table over the Parquet files.



Thanks,

Minghao Feng


From: Mich Talebzadeh 
Sent: Friday, September 9, 2016 4:49:55 PM
To: user
Subject: Re: hive on spark job not start enough executors

when you start hive on spark do you set any parameters for the submitted job 
(or read them from init file)?

set spark.master=yarn;
set spark.deploy.mode=client;
set spark.executor.memory=3g;
set spark.driver.memory=3g;
set spark.executor.instances=2;
set spark.ui.port=;


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 9 September 2016 at 09:30, 明浩 冯 
mailto:qiuff...@hotmail.com>> wrote:

Hi there,


I encountered a problem that makes hive on spark with a very low performance.

I'm using spark 1.6.2 and hive 2.1.0, I specified


spark.shuffle.service.enabled    true
spark.dynamicAllocation.enabled  true

in my spark-default.conf file (the file is in both spark and hive conf folder) 
to make spark job to get executors dynamically.
The configuration works correctly when I run spark jobs, but when I use hive on 
spark, it only started a few executors although there are more enough cores and 
memories to start more executors.
For example, for the same SQL query, if I run on sparkSQL, it can start more 
than 20 executors, but with hive on spark, only 3.

How can I improve the performance on hive on spark? Any suggestions please.

Thanks,
Minghao Feng




Re: hive on spark job not start enough executors

2016-09-09 Thread Mich Talebzadeh
when you start hive on spark do you set any parameters for the submitted
job (or read them from init file)?

set spark.master=yarn;
set spark.deploy.mode=client;
set spark.executor.memory=3g;
set spark.driver.memory=3g;
set spark.executor.instances=2;
set spark.ui.port=;

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 9 September 2016 at 09:30, 明浩 冯  wrote:

> Hi there,
>
>
> I encountered a problem that makes hive on spark with a very low
> performance.
>
> I'm using spark 1.6.2 and hive 2.1.0, I specified
>
>
> spark.shuffle.service.enabled    true
> spark.dynamicAllocation.enabled  true
>
> in my spark-default.conf file (the file is in both spark and hive conf
> folder) to make spark job to get executors dynamically.
> The configuration works correctly when I run spark jobs, but when I use
> hive on spark, it only started a few executors although there are more
> enough cores and memories to start more executors.
> For example, for the same SQL query, if I run on sparkSQL, it can start
> more than 20 executors, but with hive on spark, only 3.
>
> How can I improve the performance on hive on spark? Any suggestions please.
>
> Thanks,
> Minghao Feng
>
>


hive on spark job not start enough executors

2016-09-09 Thread 明浩 冯
Hi there,


I encountered a problem that makes Hive on Spark perform very poorly.

I'm using Spark 1.6.2 and Hive 2.1.0, and I specified


spark.shuffle.service.enabled    true
spark.dynamicAllocation.enabled  true

in my spark-defaults.conf file (the file is in both the Spark and Hive conf folders) 
to make Spark jobs acquire executors dynamically.
The configuration works correctly when I run Spark jobs, but when I use Hive on 
Spark, it only starts a few executors although there are more than enough cores and 
memory to start more executors.
For example, for the same SQL query, if I run it on Spark SQL, it can start more 
than 20 executors, but with Hive on Spark, only 3.

How can I improve the performance of Hive on Spark? Any suggestions, please.

Thanks,
Minghao Feng
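
A hedged sketch of the knobs usually looked at when the Hive-launched Spark
application stays at a handful of executors even with dynamic allocation on.
The application only ramps up when it has a backlog of pending tasks, so the
initial/minimum executor counts and the stage parallelism both matter; the
values below are illustrative and would go in the spark-defaults.conf that Hive
reads, or be set in the Hive session:

set spark.dynamicAllocation.initialExecutors=10;
set spark.dynamicAllocation.minExecutors=4;
set spark.dynamicAllocation.maxExecutors=40;
-- Hive derives the number of reduce tasks from this; a smaller value means
-- more tasks per stage and therefore more demand for executors
set hive.exec.reducers.bytes.per.reducer=67108864;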



Re: Hive on spark

2016-08-01 Thread Mich Talebzadeh
Hi,

You can download the pdf from here
<https://talebzadehmich.files.wordpress.com/2016/08/hive_on_spark_only.pdf>

HTH

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 1 August 2016 at 03:05, Chandrakanth Akkinepalli <
chandrakanth.akkinepa...@gmail.com> wrote:

> Hi Dr.Mich,
> Can you please share your London meetup presentation. Curious to see the
> comparison according to you of various query engines.
>
> Thanks,
> Chandra
>
> On Jul 28, 2016, at 12:13 AM, Mich Talebzadeh 
> wrote:
>
> Hi,
>
> I made a presentation in London on 20th July on this subject:. In that I
> explained how to make Spark work as an execution engine for Hive.
>
> Query Engines for Hive, MR, Spark, Tez and LLAP – Considerations
> <http://www.meetup.com/futureofdata-london/events/232423292/>!
>
> See if I can send the presentation
>
> Cheers
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 28 July 2016 at 04:24, Mudit Kumar  wrote:
>
>> Yes Mich,exactly.
>>
>> Thanks,
>> Mudit
>>
>> From: Mich Talebzadeh 
>> Reply-To: 
>> Date: Thursday, July 28, 2016 at 1:08 AM
>> To: user 
>> Subject: Re: Hive on spark
>>
>> You mean you want to run Hive using Spark as the execution engine which
>> uses Yarn by default?
>>
>>
>> Something like below
>>
>> hive> select max(id) from oraclehadoop.dummy_parquet;
>> Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
>> Query Hive on Spark job[1] stages:
>> 2
>> 3
>> Status: Running (Hive on Spark job[1])
>> Job Progress Format
>> CurrentTime StageId_StageAttemptId:
>> SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount
>> [StageCost]
>> 2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
>> 2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
>> 2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1
>> Finished
>> Status: Finished successfully in 13.14 seconds
>> OK
>> 1
>> Time taken: 13.426 seconds, Fetched: 1 row(s)
>>
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>> On 27 July 2016 at 20:31, Mudit Kumar  wrote:
>>
>>> Hi All,
>>>
>>> I need to configure hive cluster based on spark engine (yarn).
>>> I already have a running hadoop cluster.
>>>
>>> Can someone point me to relevant documentation?
>>>
>>> TIA.
>>>
>>> Thanks,
>>> Mudit
>>>
>>
>>
>
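
For the original question in this thread (pointing an existing Hadoop/YARN
cluster at Hive with Spark as the execution engine), a minimal sketch of the
settings the Hive on Spark getting-started page walks through; the paths and
sizes below are placeholders, not recommendations:

set hive.execution.engine=spark;
set spark.master=yarn-client;
-- a Spark build without Hive jars, visible to the Hive CLI / HiveServer2
set spark.home=/usr/lib/spark;
set spark.executor.memory=4g;
set spark.executor.instances=4;
set spark.eventLog.enabled=true;
set spark.eventLog.dir=hdfs:///spark-logs;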


Re: Hive on spark

2016-07-31 Thread Chandrakanth Akkinepalli
Hi Dr. Mich,
Can you please share your London meetup presentation? I am curious to see your 
comparison of the various query engines.

Thanks,
Chandra

> On Jul 28, 2016, at 12:13 AM, Mich Talebzadeh  
> wrote:
> 
> Hi,
> 
> I made a presentation in London on 20th July on this subject:. In that I 
> explained how to make Spark work as an execution engine for Hive.
> 
> Query Engines for Hive, MR, Spark, Tez and LLAP – Considerations!
> 
> See if I can send the presentation
> 
> Cheers
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  
> 
>> On 28 July 2016 at 04:24, Mudit Kumar  wrote:
>> Yes Mich,exactly.
>> 
>> Thanks,
>> Mudit
>> 
>> From: Mich Talebzadeh 
>> Reply-To: 
>> Date: Thursday, July 28, 2016 at 1:08 AM
>> To: user 
>> Subject: Re: Hive on spark
>> 
>> You mean you want to run Hive using Spark as the execution engine which uses 
>> Yarn by default?
>> 
>> 
>> Something like below
>> 
>> hive> select max(id) from oraclehadoop.dummy_parquet;
>> Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
>> Query Hive on Spark job[1] stages:
>> 2
>> 3
>> Status: Running (Hive on Spark job[1])
>> Job Progress Format
>> CurrentTime StageId_StageAttemptId: 
>> SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
>> [StageCost]
>> 2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
>> 2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
>> 2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
>> 2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1 
>> Finished
>> Status: Finished successfully in 13.14 seconds
>> OK
>> 1
>> Time taken: 13.426 seconds, Fetched: 1 row(s)
>> 
>> 
>> HTH
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>  
>> http://talebzadehmich.wordpress.com
>> 
>> Disclaimer: Use it at your own risk. Any and all responsibility for any 
>> loss, damage or destruction of data or any other property which may arise 
>> from relying on this email's technical content is explicitly disclaimed. The 
>> author will in no case be liable for any monetary damages arising from such 
>> loss, damage or destruction.
>>  
>> 
>>> On 27 July 2016 at 20:31, Mudit Kumar  wrote:
>>> Hi All,
>>> 
>>> I need to configure hive cluster based on spark engine (yarn).
>>> I already have a running hadoop cluster.
>>> 
>>> Can someone point me to relevant documentation?
>>> 
>>> TIA.
>>> 
>>> Thanks,
>>> Mudit
> 


Re: Hive on spark

2016-07-28 Thread Mudit Kumar
Thanks Guys for the help!

Thanks,
Mudit

From:  Mich Talebzadeh 
Reply-To:  
Date:  Thursday, July 28, 2016 at 9:43 AM
To:  user 
Subject:  Re: Hive on spark

Hi,

I made a presentation in London on 20th July on this subject. In that I 
explained how to make Spark work as an execution engine for Hive.

Query Engines for Hive, MR, Spark, Tez and LLAP – Considerations! 

See if I can send the presentation 

Cheers


Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction. 

 

On 28 July 2016 at 04:24, Mudit Kumar  wrote:
Yes Mich,exactly.

Thanks,
Mudit

From:  Mich Talebzadeh 
Reply-To:  
Date:  Thursday, July 28, 2016 at 1:08 AM
To:  user 
Subject:  Re: Hive on spark

You mean you want to run Hive using Spark as the execution engine which uses 
Yarn by default?


Something like below

hive> select max(id) from oraclehadoop.dummy_parquet;
Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
Query Hive on Spark job[1] stages:
2
3
Status: Running (Hive on Spark job[1])
Job Progress Format
CurrentTime StageId_StageAttemptId: 
SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
[StageCost]
2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1 Finished
Status: Finished successfully in 13.14 seconds
OK
1
Time taken: 13.426 seconds, Fetched: 1 row(s)


HTH

Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction. 

 

On 27 July 2016 at 20:31, Mudit Kumar  wrote:
Hi All,

I need to configure hive cluster based on spark engine (yarn).
I already have a running hadoop cluster.

Can someone point me to relevant documentation?

TIA.

Thanks,
Mudit





Re: Hive on spark

2016-07-27 Thread Mich Talebzadeh
Hi,

I made a presentation in London on 20th July on this subject. In that I
explained how to make Spark work as an execution engine for Hive.

Query Engines for Hive, MR, Spark, Tez and LLAP – Considerations
<http://www.meetup.com/futureofdata-london/events/232423292/>!

See if I can send the presentation

Cheers


Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 28 July 2016 at 04:24, Mudit Kumar  wrote:

> Yes Mich,exactly.
>
> Thanks,
> Mudit
>
> From: Mich Talebzadeh 
> Reply-To: 
> Date: Thursday, July 28, 2016 at 1:08 AM
> To: user 
> Subject: Re: Hive on spark
>
> You mean you want to run Hive using Spark as the execution engine which
> uses Yarn by default?
>
>
> Something like below
>
> hive> select max(id) from oraclehadoop.dummy_parquet;
> Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
> Query Hive on Spark job[1] stages:
> 2
> 3
> Status: Running (Hive on Spark job[1])
> Job Progress Format
> CurrentTime StageId_StageAttemptId:
> SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount
> [StageCost]
> 2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
> 2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
> 2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
> 2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
> 2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
> 2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
> 2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
> 2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1
> Finished
> Status: Finished successfully in 13.14 seconds
> OK
> 1
> Time taken: 13.426 seconds, Fetched: 1 row(s)
>
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 27 July 2016 at 20:31, Mudit Kumar  wrote:
>
>> Hi All,
>>
>> I need to configure hive cluster based on spark engine (yarn).
>> I already have a running hadoop cluster.
>>
>> Can someone point me to relevant documentation?
>>
>> TIA.
>>
>> Thanks,
>> Mudit
>>
>
>


Re: Hive on spark

2016-07-27 Thread karthi keyan
mudit,

this link can guide you -
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

Thanks,
Karthik

On Thu, Jul 28, 2016 at 8:54 AM, Mudit Kumar  wrote:

> Yes Mich,exactly.
>
> Thanks,
> Mudit
>
> From: Mich Talebzadeh 
> Reply-To: 
> Date: Thursday, July 28, 2016 at 1:08 AM
> To: user 
> Subject: Re: Hive on spark
>
> You mean you want to run Hive using Spark as the execution engine which
> uses Yarn by default?
>
>
> Something like below
>
> hive> select max(id) from oraclehadoop.dummy_parquet;
> Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
> Query Hive on Spark job[1] stages:
> 2
> 3
> Status: Running (Hive on Spark job[1])
> Job Progress Format
> CurrentTime StageId_StageAttemptId:
> SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount
> [StageCost]
> 2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
> 2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
> 2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
> 2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
> 2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
> 2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
> 2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
> 2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1
> Finished
> Status: Finished successfully in 13.14 seconds
> OK
> 1
> Time taken: 13.426 seconds, Fetched: 1 row(s)
>
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 27 July 2016 at 20:31, Mudit Kumar  wrote:
>
>> Hi All,
>>
>> I need to configure hive cluster based on spark engine (yarn).
>> I already have a running hadoop cluster.
>>
>> Can someone point me to relevant documentation?
>>
>> TIA.
>>
>> Thanks,
>> Mudit
>>
>
>


Re: Hive on spark

2016-07-27 Thread Mudit Kumar
Yes Mich, exactly.

Thanks,
Mudit

From:  Mich Talebzadeh 
Reply-To:  
Date:  Thursday, July 28, 2016 at 1:08 AM
To:  user 
Subject:  Re: Hive on spark

You mean you want to run Hive using Spark as the execution engine which uses 
Yarn by default?


Something like below

hive> select max(id) from oraclehadoop.dummy_parquet;
Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
Query Hive on Spark job[1] stages:
2
3
Status: Running (Hive on Spark job[1])
Job Progress Format
CurrentTime StageId_StageAttemptId: 
SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
[StageCost]
2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1 Finished
Status: Finished successfully in 13.14 seconds
OK
1
Time taken: 13.426 seconds, Fetched: 1 row(s)


HTH

Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction. 

 

On 27 July 2016 at 20:31, Mudit Kumar  wrote:
Hi All,

I need to configure hive cluster based on spark engine (yarn).
I already have a running hadoop cluster.

Can someone point me to relevant documentation?

TIA.

Thanks,
Mudit




Re: Hive on spark

2016-07-27 Thread Mich Talebzadeh
You mean you want to run Hive using Spark as the execution engine which
uses Yarn by default?


Something like below

hive> select max(id) from oraclehadoop.dummy_parquet;
Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d
Query Hive on Spark job[1] stages:
2
3
Status: Running (Hive on Spark job[1])
Job Progress Format
CurrentTime StageId_StageAttemptId:
SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount
[StageCost]
2016-07-27 20:38:17,269 Stage-2_0: 0(+8)/24 Stage-3_0: 0/1
2016-07-27 20:38:20,298 Stage-2_0: 8(+4)/24 Stage-3_0: 0/1
2016-07-27 20:38:22,309 Stage-2_0: 11(+1)/24Stage-3_0: 0/1
2016-07-27 20:38:23,330 Stage-2_0: 12(+8)/24Stage-3_0: 0/1
2016-07-27 20:38:26,360 Stage-2_0: 17(+7)/24Stage-3_0: 0/1
2016-07-27 20:38:27,386 Stage-2_0: 20(+4)/24Stage-3_0: 0/1
2016-07-27 20:38:28,391 Stage-2_0: 21(+3)/24Stage-3_0: 0/1
2016-07-27 20:38:29,395 Stage-2_0: 24/24 Finished   Stage-3_0: 1/1
Finished
Status: Finished successfully in 13.14 seconds
OK
1
Time taken: 13.426 seconds, Fetched: 1 row(s)
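
For completeness, the session-level settings behind a run like the one above are
roughly as follows. This is a minimal sketch along the lines of the Hive on Spark
"Getting Started" wiki; the exact values (master, memory, executor counts) depend
on your cluster:

set hive.execution.engine=spark;
set spark.master=yarn-client;
set spark.executor.memory=2g;
set spark.executor.instances=4;
set spark.eventLog.enabled=true;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;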


HTH

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 27 July 2016 at 20:31, Mudit Kumar  wrote:

> Hi All,
>
> I need to configure hive cluster based on spark engine (yarn).
> I already have a running hadoop cluster.
>
> Can someone point me to relevant documentation?
>
> TIA.
>
> Thanks,
> Mudit
>


Hive on spark

2016-07-27 Thread Mudit Kumar
Hi All,

I need to configure hive cluster based on spark engine (yarn).
I already have a running hadoop cluster.

Can someone point me to relevant documentation?

TIA.

Thanks,
Mudit



Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-19 Thread Ashok Kumar
Thanks Mich looking forward to it :) 

On Tuesday, 19 July 2016, 19:13, Mich Talebzadeh 
 wrote:
 

 Hi all,
This will be in London tomorrow Wednesday 20th July starting at 18:00 hour for 
refreshments and kick off at 18:30, 5 minutes walk from Canary Wharf Station, 
Jubilee Line 
If you wish you can register and get more info here
It will be in La Tasca West India Docks Road E14 
and especially if you like Spanish food :)
Regards,



Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from relying 
on this email's technical content is explicitly disclaimed. The author will in 
no case be liable for any monetary damages arising from such loss, damage or 
destruction.
On 15 July 2016 at 11:06, Joaquin Alzola  wrote:

It is on the 20th (Wednesday) next week.

From: Marco Mistroni [mailto:mmistr...@gmail.com]
Sent: 15 July 2016 11:04
To: Mich Talebzadeh 
Cc: user @spark ; user 
Subject: Re: Presentation in London: Running Spark on Hive or Hive on Spark

Dr Mich

do you have any slides or videos available for the presentation you did 
@Canary Wharf?
kindest regards
marco

On Wed, Jul 6, 2016 at 10:37 PM, Mich Talebzadeh  wrote:

Dear forum members

I will be presenting on the topic of "Running Spark on Hive or Hive on Spark, 
your mileage varies" in Future of Data: London

Details
Organized by: Hortonworks
Date: Wednesday, July 20, 2016, 6:00 PM to 8:30 PM
Place: London
Location: One Canada Square, Canary Wharf, London E14 5AB.
Nearest Underground: Canary Wharf (map)

If you are interested please register here.

Looking forward to seeing those who can make it to have an interesting 
discussion and leverage your experience.

Regards,

Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from relying 
on this email's technical content is explicitly disclaimed. The author will in 
no case be liable for any monetary damages arising from such loss, damage or 
destruction.
 This email is confidential and may be subject to privilege. If you are not the 
intended recipient, please do not copy or disclose its content but contact the 
sender immediately upon receipt.



  

Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-19 Thread Mich Talebzadeh
Hi all,

This will be in London tomorrow Wednesday 20th July starting at 18:00 hour
for refreshments and kick off at 18:30, 5 minutes walk from Canary Wharf
Station, Jubilee Line

If you wish you can register and get more info here
<http://www.meetup.com/futureofdata-london/>

It will be in La Tasca West India Docks Road E14
<http://www.meetup.com/futureofdata-london/events/232423292/>

and especially if you like Spanish food :)

Regards,




Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 15 July 2016 at 11:06, Joaquin Alzola  wrote:

> It is on the 20th (Wednesday) next week.
>
>
>
> *From:* Marco Mistroni [mailto:mmistr...@gmail.com]
> *Sent:* 15 July 2016 11:04
> *To:* Mich Talebzadeh 
> *Cc:* user @spark ; user 
> *Subject:* Re: Presentation in London: Running Spark on Hive or Hive on
> Spark
>
>
>
> Dr Mich
>
>   do you have any slides or videos available for the presentation you did
> @Canary Wharf?
>
> kindest regards
>
>  marco
>
>
>
> On Wed, Jul 6, 2016 at 10:37 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
> Dear forum members
>
>
>
> I will be presenting on the topic of "Running Spark on Hive or Hive on
> Spark, your mileage varies" in Future of Data: London
> <http://www.meetup.com/futureofdata-london/events/232423292/>
>
> *Details*
>
> *Organized by: Hortonworks <http://hortonworks.com/>*
>
> *Date: Wednesday, July 20, 2016, 6:00 PM to 8:30 PM *
>
> *Place: London*
>
> *Location: One Canada Square, Canary Wharf,  London E14 5AB.*
>
> *Nearest Underground:  Canary Wharf (map
> <https://maps.google.com/maps?f=q&hl=en&q=One+Canada+Square%2C+Canary+Wharf%2C+E14+5AB%2C+London%2C+gb>)
> *
>
> If you are interested please register here
> <http://www.meetup.com/futureofdata-london/events/232423292/>
>
> Looking forward to seeing those who can make it to have an interesting
> discussion and leverage your experience.
>
> Regards,
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn  
> *https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> This email is confidential and may be subject to privilege. If you are not
> the intended recipient, please do not copy or disclose its content but
> contact the sender immediately upon receipt.
>


Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-08 Thread mylisttech
Hi Mich,

Would it be on YouTube, post session?

- Harmeet



On Jul 7, 2016, at 3:07, Mich Talebzadeh  wrote:

> Dear forum members
> 
> I will be presenting on the topic of "Running Spark on Hive or Hive on Spark, 
> your mileage varies" in Future of Data: London 
> 
> Details
> 
> Organized by: Hortonworks
> 
> Date: Wednesday, July 20, 2016, 6:00 PM to 8:30 PM 
> 
> Place: London
> 
> Location: One Canada Square, Canary Wharf,  London E14 5AB.
> 
> Nearest Underground:  Canary Wharf (map)
> 
> If you are interested please register here
> 
> Looking forward to seeing those who can make it to have an interesting 
> discussion and leverage your experience.
> 
> Regards,
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  


Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-07 Thread Ashok Kumar
Thanks.
Will this presentation be recorded as well?
Regards 

On Wednesday, 6 July 2016, 22:38, Mich Talebzadeh 
 wrote:
 

Dear forum members

I will be presenting on the topic of "Running Spark on Hive or Hive on Spark, 
your mileage varies" in Future of Data: London

Details
Organized by: Hortonworks
Date: Wednesday, July 20, 2016, 6:00 PM to 8:30 PM
Place: London
Location: One Canada Square, Canary Wharf, London E14 5AB.
Nearest Underground: Canary Wharf (map)

If you are interested please register here.

Looking forward to seeing those who can make it to have an interesting 
discussion and leverage your experience.

Regards,

Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from relying 
on this email's technical content is explicitly disclaimed. The author will in 
no case be liable for any monetary damages arising from such loss, damage or 
destruction.

  

Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-06 Thread Mich Talebzadeh
Dear forum members

I will be presenting on the topic of "Running Spark on Hive or Hive on
Spark, your mileage varies" in Future of Data: London
<http://www.meetup.com/futureofdata-london/events/232423292/>

*Details*

*Organized by: Hortonworks <http://hortonworks.com/>*

*Date: Wednesday, July 20, 2016, 6:00 PM to 8:30 PM *

*Place: London*

*Location: One Canada Square, Canary Wharf,  London E14 5AB.*

*Nearest Underground:  Canary Wharf (map
<https://maps.google.com/maps?f=q&hl=en&q=One+Canada+Square%2C+Canary+Wharf%2C+E14+5AB%2C+London%2C+gb>)
*

If you are interested please register here
<http://www.meetup.com/futureofdata-london/events/232423292/>

Looking forward to seeing those who can make it to have an interesting
discussion and leverage your experience.
Regards,

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.


Hive on Spark issues with Hive-XML-Serde

2016-06-23 Thread yeshwanth kumar
Hi,

We are using Cloudera 5.7.0.

There is a use case to process XML data, and we are using
https://github.com/dvasilen/Hive-XML-SerDe.

The XML SerDe works when the Hive execution engine is MapReduce.

We enabled Hive on Spark to test the performance, and we are facing the
following issue:

16/06/23 12:47:45 INFO executor.CoarseGrainedExecutorBackend: Got
assigned task 3
16/06/23 12:47:45 INFO executor.Executor: Running task 0.3 in stage 0.0 (TID 3)
16/06/23 12:47:45 INFO rdd.HadoopRDD: Input split:
Paths:/tmp/STYN/data/1040_274316329.xml:0+7406,/tmp/STYN/data/1040__274316331.xml:0+7496InputFormatClass:
com.ibm.spss.hive.serde2.xml.XmlInputFormat

16/06/23 12:47:45 INFO exec.Utilities: PLAN PATH =
hdfs://devcdh/tmp/hive/yesh/c9554491-f58c-4472-b3c5-f47eb5722dd4/hive_2016-06-23_12-47-29_259_4396208623700590328-9/-mr-10003/c79302c5-6f16-4887-85b4-67e781a9ed97/map.xml
16/06/23 12:47:45 ERROR executor.Executor: Exception in task 0.3 in
stage 0.0 (TID 3)
java.io.IOException: java.lang.reflect.InvocationTargetException
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:265)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:212)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:332)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:721)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:251)
... 18 more
Caused by: java.io.IOException: CombineHiveRecordReader: class not
found com.ibm.spss.hive.serde2.xml.XmlInputFormat
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:55)
... 23 more


I did the following steps to ensure that the XML SerDe is on the Hive classpath:

   - configured the Hive aux jars path in Cloudera Manager
   - manually copied the jar to all the nodes

I am unable to figure out the issue here.

Any pointers would be a great help.


Thanks,
-Yeshwanth
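
A session-level work-around sometimes tried for this kind of class-not-found
error on the Spark engine is to add the SerDe jar explicitly in the Hive session
before switching engines; the jar path and table name below are hypothetical:

add jar /opt/serde/hivexmlserde-1.0.5.3.jar;   -- hypothetical local path to the Hive-XML-SerDe jar
set hive.execution.engine=spark;
select count(*) from xml_events;               -- hypothetical table using com.ibm.spss.hive.serde2.xml.XmlInputFormat

Whether this helps depends on the Hive/CDH version; the underlying requirement is
that the SerDe jar ends up on the classpath of the Spark executors, not only on
the node running the Hive CLI or HiveServer2.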


Re: Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Thanks Ted,

I am more interested in the general availability of Hive 2 on the Spark 1.6
engine, as opposed to vendor-specific custom builds.



Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com



On 26 March 2016 at 23:55, Ted Yu  wrote:

> According to:
>
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/bk_HDP_RelNotes-20151221.pdf
>
> Spark 1.5.2 comes out of box.
>
> Suggest moving questions on HDP to Hortonworks forum.
>
> Cheers
>
> On Sat, Mar 26, 2016 at 3:32 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Thanks Jorn.
>>
>> Just to be clear they get Hive working with Spark 1.6 out of the box
>> (binary download)? The usual work-around is to build your own package and
>> get the Hadoop-assembly jar file copied over to $HIVE_HOME/lib.
>>
>>
>> Cheers
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> *
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 26 March 2016 at 22:08, Jörn Franke  wrote:
>>
>>> If you check the newest Hortonworks distribution then you see that it
>>> generally works. Maybe you can borrow some of their packages. Alternatively
>>> it should be also available in other distributions.
>>>
>>> On 26 Mar 2016, at 22:47, Mich Talebzadeh 
>>> wrote:
>>>
>>> Hi,
>>>
>>> I am running Hive 2 and now Spark 1.6.1 but I still do not see any sign
>>> that Hive can utilise a Spark engine higher than 1.3.1
>>>
>>> My understanding was that there were miss-match on Hadoop assembly Jar
>>> files that cause Hive not being able to run on Spark using the binary
>>> downloads. I just tried Hive 2 on Spark 1.6 as the execution engine and it
>>> crashed.
>>>
>>> I do not know the development state of this cross-breed but will be very
>>> desirable if we could manage to sort out
>>> this spark-assembly-1.x.1-hadoop2.4.0.jar for once.
>>>
>>> Thanks
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn * 
>>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>> *
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>>
>>
>


Re: Hive on Spark engine

2016-03-26 Thread Ted Yu
According to:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/bk_HDP_RelNotes-20151221.pdf

Spark 1.5.2 comes out of box.

Suggest moving questions on HDP to Hortonworks forum.

Cheers

On Sat, Mar 26, 2016 at 3:32 PM, Mich Talebzadeh 
wrote:

> Thanks Jorn.
>
> Just to be clear they get Hive working with Spark 1.6 out of the box
> (binary download)? The usual work-around is to build your own package and
> get the Hadoop-assembly jar file copied over to $HIVE_HOME/lib.
>
>
> Cheers
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 26 March 2016 at 22:08, Jörn Franke  wrote:
>
>> If you check the newest Hortonworks distribution then you see that it
>> generally works. Maybe you can borrow some of their packages. Alternatively
>> it should be also available in other distributions.
>>
>> On 26 Mar 2016, at 22:47, Mich Talebzadeh 
>> wrote:
>>
>> Hi,
>>
>> I am running Hive 2 and now Spark 1.6.1 but I still do not see any sign
>> that Hive can utilise a Spark engine higher than 1.3.1
>>
>> My understanding was that there were miss-match on Hadoop assembly Jar
>> files that cause Hive not being able to run on Spark using the binary
>> downloads. I just tried Hive 2 on Spark 1.6 as the execution engine and it
>> crashed.
>>
>> I do not know the development state of this cross-breed but will be very
>> desirable if we could manage to sort out
>> this spark-assembly-1.x.1-hadoop2.4.0.jar for once.
>>
>> Thanks
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> *
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>>
>


Re: Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Thanks Jorn.

Just to be clear, do they get Hive working with Spark 1.6 out of the box
(binary download)? The usual work-around is to build your own Spark package and
copy the spark-assembly jar (built against your Hadoop version) over to $HIVE_HOME/lib.
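
As a rough sketch, that work-around usually looks something like the following
(an illustrative example only; the Spark version, Hadoop profile and assembly
path are assumptions, and the authoritative build flags are in the Hive on Spark
"Getting Started" wiki):

# build a Spark assembly that does not include the Hive jars, matching the cluster's Hadoop version
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
# make the resulting assembly visible to Hive (copy or symlink into $HIVE_HOME/lib)
cp assembly/target/scala-2.10/spark-assembly-1.6.1-hadoop2.4.0.jar $HIVE_HOME/lib/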


Cheers


Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com



On 26 March 2016 at 22:08, Jörn Franke  wrote:

> If you check the newest Hortonworks distribution then you see that it
> generally works. Maybe you can borrow some of their packages. Alternatively
> it should be also available in other distributions.
>
> On 26 Mar 2016, at 22:47, Mich Talebzadeh 
> wrote:
>
> Hi,
>
> I am running Hive 2 and now Spark 1.6.1 but I still do not see any sign
> that Hive can utilise a Spark engine higher than 1.3.1
>
> My understanding was that there were miss-match on Hadoop assembly Jar
> files that cause Hive not being able to run on Spark using the binary
> downloads. I just tried Hive 2 on Spark 1.6 as the execution engine and it
> crashed.
>
> I do not know the development state of this cross-breed but will be very
> desirable if we could manage to sort out
> this spark-assembly-1.x.1-hadoop2.4.0.jar for once.
>
> Thanks
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
>


Re: Hive on Spark engine

2016-03-26 Thread Jörn Franke
If you check the newest Hortonworks distribution you will see that it generally 
works. Maybe you can borrow some of their packages. Alternatively, it should also 
be available in other distributions.

> On 26 Mar 2016, at 22:47, Mich Talebzadeh  wrote:
> 
> Hi,
> 
> I am running Hive 2 and now Spark 1.6.1 but I still do not see any sign that 
> Hive can utilise a Spark engine higher than 1.3.1
> 
> My understanding was that there were miss-match on Hadoop assembly Jar files 
> that cause Hive not being able to run on Spark using the binary downloads. I 
> just tried Hive 2 on Spark 1.6 as the execution engine and it crashed.
> 
> I do not know the development state of this cross-breed but will be very 
> desirable if we could manage to sort out this 
> spark-assembly-1.x.1-hadoop2.4.0.jar for once.
> 
> Thanks
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>  


Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Hi,

I am running Hive 2 and now Spark 1.6.1, but I still do not see any sign
that Hive can utilise a Spark engine higher than 1.3.1.

My understanding was that there is a mismatch in the Hadoop assembly jar
files that prevents Hive from running on Spark using the binary
downloads. I just tried Hive 2 on Spark 1.6 as the execution engine and it
crashed.

I do not know the development state of this cross-breed, but it would be very
desirable if we could manage to sort out
this spark-assembly-1.x.1-hadoop2.4.0.jar for once.

Thanks


Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


Re: Error in Hive on Spark

2016-03-22 Thread Stana
Hi Xuefu,

You are right.
Maybe I should launch spark-submit via HS2 or the Hive CLI?

Thanks a lot,
Stana
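
If the HS2 route is taken, the client side could reduce to something like the
following (host, port and database are placeholders; the table name is reused
from the earlier test query):

beeline -u jdbc:hive2://hs2-host:10000/default \
  -e "set hive.execution.engine=spark; select * from hadoop0263_0 a join hadoop0263_0 b on (a.key = b.key);"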


2016-03-22 1:16 GMT+08:00 Xuefu Zhang :

> Stana,
>
> I'm not sure if I fully understand the problem. spark-submit is launched in
> the same host as your application, which should be able to access
> hive-exec.jar. Yarn cluster needs the jar also, but HS2 or Hive CLI will
> take care of that. Since you are not using either of which, then, it's your
> application's responsibility to make that happen.
>
> Did I missed anything else?
>
> Thanks,
> Xuefu
>
> On Sun, Mar 20, 2016 at 11:18 PM, Stana  wrote:
>
> > Does anyone have suggestions in setting property of hive-exec-2.0.0.jar
> > path in application?
> > Something like
> >
> >
> 'hiveConf.set("hive.remote.driver.jar","hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
> >
> >
> >
> > 2016-03-11 10:53 GMT+08:00 Stana :
> >
> > > Thanks for reply
> > >
> > > I have set the property spark.home in my application. Otherwise the
> > > application threw 'SPARK_HOME not found exception'.
> > >
> > > I found hive source code in SparkClientImpl.java:
> > >
> > > private Thread startDriver(final RpcServer rpcServer, final String
> > > clientId, final String secret)
> > >   throws IOException {
> > > ...
> > >
> > > List<String> argv = Lists.newArrayList();
> > >
> > > ...
> > >
> > > argv.add("--class");
> > > argv.add(RemoteDriver.class.getName());
> > >
> > > String jar = "spark-internal";
> > > if (SparkContext.jarOfClass(this.getClass()).isDefined()) {
> > > jar = SparkContext.jarOfClass(this.getClass()).get();
> > > }
> > > argv.add(jar);
> > >
> > > ...
> > >
> > > }
> > >
> > > When hive executed spark-submit , it generate the shell command with
> > > --class org.apache.hive.spark.client.RemoteDriver ,and set jar path
> with
> > > SparkContext.jarOfClass(this.getClass()).get(). It will get the local
> > path
> > > of hive-exec-2.0.0.jar.
> > >
> > > In my situation, the application and yarn cluster are in different
> > cluster.
> > > When application executed spark-submit with local path of
> > > hive-exec-2.0.0.jar to yarn cluster, there 's no hive-exec-2.0.0.jar in
> > > yarn cluster. Then application threw the exception:
> "hive-exec-2.0.0.jar
> > >   does not exist ...".
> > >
> > > Can it be set property of hive-exec-2.0.0.jar path in application ?
> > > Something like 'hiveConf.set("hive.remote.driver.jar",
> > > "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
> > > If not, is it possible to achieve in the future version?
> > >
> > >
> > >
> > >
> > > 2016-03-10 23:51 GMT+08:00 Xuefu Zhang :
> > >
> > >> You can probably avoid the problem by set environment variable
> > SPARK_HOME
> > >> or JVM property spark.home that points to your spark installation.
> > >>
> > >> --Xuefu
> > >>
> > >> On Thu, Mar 10, 2016 at 3:11 AM, Stana  wrote:
> > >>
> > >> >  I am trying out Hive on Spark with hive 2.0.0 and spark 1.4.1, and
> > >> > executing org.apache.hadoop.hive.ql.Driver with java application.
> > >> >
> > >> > Following are my situations:
> > >> > 1.Building spark 1.4.1 assembly jar without Hive .
> > >> > 2.Uploading the spark assembly jar to the hadoop cluster.
> > >> > 3.Executing the java application with eclipse IDE in my client
> > computer.
> > >> >
> > >> > The application went well and it submitted mr job to the yarn
> cluster
> > >> > successfully when using " hiveConf.set("hive.execution.engine",
> "mr")
> > >> > ",but it threw exceptions in spark-engine.
> > >> >
> > >> > Finally, i traced Hive source code and came to the conclusion:
> > >> >
> > >> > In my situation, SparkClientImpl class will generate the
> spark-submit
> > >> > shell and executed it.
> > >> > The shell command allocated  --class with
> RemoteDriver.class.getName()
> > >> > and jar with SparkContext.jarOfClass(this.getClass()).get(), so that
> > >> &

Re: Error in Hive on Spark

2016-03-20 Thread Stana
Does anyone have suggestions for setting the hive-exec-2.0.0.jar path as a
property in the application?
Something like
'hiveConf.set("hive.remote.driver.jar","hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.



2016-03-11 10:53 GMT+08:00 Stana :

> Thanks for reply
>
> I have set the property spark.home in my application. Otherwise the
> application threw 'SPARK_HOME not found exception'.
>
> I found hive source code in SparkClientImpl.java:
>
> private Thread startDriver(final RpcServer rpcServer, final String
> clientId, final String secret)
>   throws IOException {
> ...
>
> List<String> argv = Lists.newArrayList();
>
> ...
>
> argv.add("--class");
> argv.add(RemoteDriver.class.getName());
>
> String jar = "spark-internal";
> if (SparkContext.jarOfClass(this.getClass()).isDefined()) {
> jar = SparkContext.jarOfClass(this.getClass()).get();
> }
> argv.add(jar);
>
> ...
>
> }
>
> When hive executed spark-submit , it generate the shell command with
> --class org.apache.hive.spark.client.RemoteDriver ,and set jar path with
> SparkContext.jarOfClass(this.getClass()).get(). It will get the local path
> of hive-exec-2.0.0.jar.
>
> In my situation, the application and yarn cluster are in different cluster.
> When application executed spark-submit with local path of
> hive-exec-2.0.0.jar to yarn cluster, there 's no hive-exec-2.0.0.jar in
> yarn cluster. Then application threw the exception: "hive-exec-2.0.0.jar
>   does not exist ...".
>
> Can it be set property of hive-exec-2.0.0.jar path in application ?
> Something like 'hiveConf.set("hive.remote.driver.jar",
> "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
> If not, is it possible to achieve in the future version?
>
>
>
>
> 2016-03-10 23:51 GMT+08:00 Xuefu Zhang :
>
>> You can probably avoid the problem by set environment variable SPARK_HOME
>> or JVM property spark.home that points to your spark installation.
>>
>> --Xuefu
>>
>> On Thu, Mar 10, 2016 at 3:11 AM, Stana  wrote:
>>
>> >  I am trying out Hive on Spark with hive 2.0.0 and spark 1.4.1, and
>> > executing org.apache.hadoop.hive.ql.Driver with java application.
>> >
>> > Following are my situations:
>> > 1.Building spark 1.4.1 assembly jar without Hive .
>> > 2.Uploading the spark assembly jar to the hadoop cluster.
>> > 3.Executing the java application with eclipse IDE in my client computer.
>> >
>> > The application went well and it submitted mr job to the yarn cluster
>> > successfully when using " hiveConf.set("hive.execution.engine", "mr")
>> > ",but it threw exceptions in spark-engine.
>> >
>> > Finally, i traced Hive source code and came to the conclusion:
>> >
>> > In my situation, SparkClientImpl class will generate the spark-submit
>> > shell and executed it.
>> > The shell command allocated  --class with RemoteDriver.class.getName()
>> > and jar with SparkContext.jarOfClass(this.getClass()).get(), so that
>> > my application threw the exception.
>> >
>> > Is it right? And how can I do to execute the application with
>> > spark-engine successfully in my client computer ? Thanks a lot!
>> >
>> >
>> > Java application code:
>> >
>> > public class TestHiveDriver {
>> >
>> > private static HiveConf hiveConf;
>> > private static Driver driver;
>> > private static CliSessionState ss;
>> > public static void main(String[] args){
>> >
>> > String sql = "select * from hadoop0263_0 as a join
>> > hadoop0263_0 as b
>> > on (a.key = b.key)";
>> > ss = new CliSessionState(new
>> HiveConf(SessionState.class));
>> > hiveConf = new HiveConf(Driver.class);
>> > hiveConf.set("fs.default.name", "hdfs://storm0:9000");
>> > hiveConf.set("yarn.resourcemanager.address",
>> > "storm0:8032");
>> > hiveConf.set("yarn.resourcemanager.scheduler.address",
>> > "storm0:8030");
>> >
>> >
>> hiveConf.set("yarn.resourcemanager.resource-tracker.address","storm0:8031");
>> > hiveConf.set("yarn.resourcemanager.admin.address",
>> > "storm0:8033");
>> > hiveConf.set(

Re: Hive on Spark performance

2016-03-14 Thread sjayatheertha
Thanks for your response. We were evaluating Spark and were curious to know how 
it is used today and the lowest latency it can provide. 

> On Mar 14, 2016, at 8:37 AM, Mich Talebzadeh  
> wrote:
> 
> Hi Wlodeck,
> 
> Let us look at this.
> 
> In Oracle I have two tables channels and sales. This code works in Oracle
> 
>   1  select c.channel_id, sum(c.channel_id * (select count(1) from sales s 
> WHERE c.channel_id = s.channel_id)) As R
>   2  from channels c
>   3* group by c.channel_id
> s...@mydb.mich.LOCAL> /
> CHANNEL_ID  R
> -- --
>  2 516050
>  31620984
>  4 473664
>  5  0
>  9  18666
> 
> I have the same tables In Hive but the same query crashes!
> 
> hive> select c.channel_id, sum(c.channel_id * (select count(1) from sales s 
> WHERE c.channel_id = s.channel_id)) As R
> > from channels c
> > group by c.channel_id
> > ;
> NoViableAltException(232@[435:1: precedenceEqualExpression : ( ( LPAREN 
> precedenceBitwiseOrExpression COMMA )=> precedenceEqualExpressionMutiple | 
> precedenceEqualExpressionSingle );])
> 
> The solution is to use a temporary table to keep the sum/group by from sales 
> table as an intermediate stage  (temporary tables are session specific and 
> they are created and dropped after you finish the session)
> 
> hive> create temporary table tmp as select channel_id, count(channel_id) as 
> total from sales group by channel_id;
> 
> 
> Ok the rest is pretty easy
> 
> hive> select c.channel_id, c.channel_id * t.total as results
> > from channels c, tmp t
> > where c.channel_id = t.channel_id;
> 
> 2.0 2800432.0
> 3.0 8802300.0
> 4.0 2583552.0
> 9.0 104013.0
> 
> HTH
> 
> 
> 
> 
> 
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>  
> 
>> On 14 March 2016 at 14:22, ws  wrote:
>> Hive 1.2.1.2.3.4.0-3485
>> Spark 1.5.2
>> Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
>> 
>> ### 
>> SELECT 
>>  f.description,
>>  f.item_number,
>>  sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = 
>> r.h_id)) as df_a
>> FROM e.eng_fac_atl_sc_bf_qty f, wv_ATL_2_qty_df_rates r
>> where f.item_number NOT LIKE 'HR%' AND f.item_number NOT LIKE 'UG%' AND 
>> f.item_number NOT LIKE 'DEV%'
>> group by 
>>  f.description,
>>  f.item_number
>> ###
>> 
>> This query works fine in oracle but not Hive or Spark.
>> So the problem is: "sum(f.df_a * (select count(1) from e.mv_A_h_a where 
>> hb_h_name = r.h_id)) as df_a" field.
>> 
>> 
>> Thanks,
>> Wlodek
>> --
>> 
>> 
>> On Sunday, March 13, 2016 7:36 PM, Mich Talebzadeh 
>>  wrote:
>> 
>> 
>> Depending on the version of Hive on Spark engine.
>> 
>> As far as I am aware the latest version of Hive that I am using (Hive 2) has 
>> improvements compared to the previous versions of Hive (0.14,1.2.1) on Spark 
>> engine.
>> 
>> As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it is 
>> not the latest Spark but it is pretty good.
>> 
>> What specific concerns do you have in mind?
>> 
>> HTH
>> 
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>  
>> http://talebzadehmich.wordpress.com
>>  
>> 
>> On 13 March 2016 at 23:27, sjayatheertha  wrote:
>> Just curious if you could share your experience on the performance of spark 
>> in your company? How much data do you process? And what's the latency you 
>> are getting with spark engine?
>> 
>> Vidya
> 


Re: Hive on Spark performance

2016-03-14 Thread Mich Talebzadeh
Hi Wlodeck,

Let us look at this.

In Oracle I have two tables channels and sales. This code works in Oracle

  1  select c.channel_id, sum(c.channel_id * (select count(1) from sales s
WHERE c.channel_id = s.channel_id)) As R
  2  from channels c
  3* group by c.channel_id
s...@mydb.mich.LOCAL> /
CHANNEL_ID  R
-- --
 2 516050
 31620984
 4 473664
 5  0
 9  18666

I have the same tables In Hive but the same query crashes!

hive> select c.channel_id, sum(c.channel_id * (select count(1) from sales s
WHERE c.channel_id = s.channel_id)) As R
> from channels c
> group by c.channel_id
> ;
NoViableAltException(232@[435:1: precedenceEqualExpression : ( ( LPAREN
precedenceBitwiseOrExpression COMMA )=> precedenceEqualExpressionMutiple |
precedenceEqualExpressionSingle );])

The solution is to use a temporary table to keep the sum/group by from
sales table as an intermediate stage  (temporary tables are session
specific and they are created and dropped after you finish the session)

hive> create temporary table tmp as select channel_id, count(channel_id) as
total from sales group by channel_id;


Ok the rest is pretty easy

hive> select c.channel_id, c.channel_id * t.total as results
> from channels c, tmp t
> where c.channel_id = t.channel_id;

2.0 2800432.0
3.0 8802300.0
4.0 2583552.0
9.0 104013.0
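
Applied to Wlodek's original query (quoted further down), the same rewrite could
look roughly like this: an untested sketch that keeps the implicit cross join
between f and r from the comma syntax and treats missing counts as zero; table
and column names are as posted in the thread:

create temporary table tmp_h_counts as
select hb_h_name, count(1) as total
from e.mv_A_h_a
group by hb_h_name;

select f.description,
       f.item_number,
       sum(f.df_a * coalesce(t.total, 0)) as df_a
from e.eng_fac_atl_sc_bf_qty f
cross join wv_ATL_2_qty_df_rates r
left join tmp_h_counts t on t.hb_h_name = r.h_id
where f.item_number not like 'HR%'
  and f.item_number not like 'UG%'
  and f.item_number not like 'DEV%'
group by f.description, f.item_number;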

HTH







Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com



On 14 March 2016 at 14:22, ws  wrote:

> Hive 1.2.1.2.3.4.0-3485
> Spark 1.5.2
> Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit
> Production
>
> ###
> SELECT
> f.description,
> f.item_number,
> sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id))
> as df_a
> FROM e.eng_fac_atl_sc_bf_qty f, wv_ATL_2_qty_df_rates r
> where f.item_number NOT LIKE 'HR%' AND f.item_number NOT LIKE 'UG%' AND
> f.item_number NOT LIKE 'DEV%'
> group by
> f.description,
> f.item_number
> ###
>
> This query works fine in oracle but not Hive or Spark.
> So the problem is: "sum(f.df_a * (select count(1) from e.mv_A_h_a where
> hb_h_name = r.h_id)) as df_a" field.
>
>
> Thanks,
> Wlodek
> --
>
>
> On Sunday, March 13, 2016 7:36 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>
> Depending on the version of Hive on Spark engine.
>
> As far as I am aware the latest version of Hive that I am using (Hive 2)
> has improvements compared to the previous versions of Hive (0.14,1.2.1) on
> Spark engine.
>
> As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it
> is not the latest Spark but it is pretty good.
>
> What specific concerns do you have in mind?
>
> HTH
>
>
> Dr Mich Talebzadeh
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
> http://talebzadehmich.wordpress.com
>
>
> On 13 March 2016 at 23:27, sjayatheertha  wrote:
>
> Just curious if you could share your experience on the performance of
> spark in your company? How much data do you process? And what's the latency
> you are getting with spark engine?
>
> Vidya
>
>
>
>
>


Re: Hive on Spark performance

2016-03-14 Thread ws
Hive 1.2.1.2.3.4.0-3485
Spark 1.5.2
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

###
SELECT
 f.description,
 f.item_number,
 sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id)) as df_a
FROM e.eng_fac_atl_sc_bf_qty f, wv_ATL_2_qty_df_rates r
where f.item_number NOT LIKE 'HR%' AND f.item_number NOT LIKE 'UG%' AND f.item_number NOT LIKE 'DEV%'
group by
 f.description,
 f.item_number
###

This query works fine in oracle but not Hive or Spark.
So the problem is the "sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id)) as df_a" field.

Thanks,
Wlodek
--

On Sunday, March 13, 2016 7:36 PM, Mich Talebzadeh 
 wrote:
 

 Depending on the version of Hive on Spark engine. 
As far as I am aware the latest version of Hive that I am using (Hive 2) has 
improvements compared to the previous versions of Hive (0.14,1.2.1) on Spark 
engine.
As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it is not 
the latest Spark but it is pretty good.
What specific concerns do you have in mind?
HTH

Dr Mich Talebzadeh LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
 http://talebzadehmich.wordpress.com 
On 13 March 2016 at 23:27, sjayatheertha  wrote:

Just curious if you could share your experience on the performance of spark in 
your company? How much data do you process? And what's the latency you are 
getting with spark engine?

Vidya



  

Re: Hive on Spark performance

2016-03-13 Thread Mich Talebzadeh
It depends on the version of Hive on the Spark engine.

As far as I am aware the latest version of Hive that I am using (Hive 2)
has improvements compared to the previous versions of Hive (0.14,1.2.1) on
Spark engine.

As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it is
not the latest Spark but it is pretty good.

What specific concerns do you have in mind?

HTH


Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com



On 13 March 2016 at 23:27, sjayatheertha  wrote:

> Just curious if you could share your experience on the performance of
> spark in your company? How much data do you process? And what's the latency
> you are getting with spark engine?
>
> Vidya


Hive on Spark performance

2016-03-13 Thread sjayatheertha
Just curious: could you share your experience with the performance of Spark at 
your company? How much data do you process? And what latency are you getting 
with the Spark engine?

Vidya

Re: Error in Hive on Spark

2016-03-10 Thread Stana
Thanks for the reply.

I have set the property spark.home in my application. Otherwise the
application threw a 'SPARK_HOME not found' exception.

I found this Hive source code in SparkClientImpl.java:

private Thread startDriver(final RpcServer rpcServer, final String
clientId, final String secret)
  throws IOException {
...

List<String> argv = Lists.newArrayList();

...

argv.add("--class");
argv.add(RemoteDriver.class.getName());

String jar = "spark-internal";
if (SparkContext.jarOfClass(this.getClass()).isDefined()) {
jar = SparkContext.jarOfClass(this.getClass()).get();
}
argv.add(jar);

...

}

When Hive executes spark-submit, it generates the shell command with
--class org.apache.hive.spark.client.RemoteDriver and sets the jar path with
SparkContext.jarOfClass(this.getClass()).get(), which resolves to the local path
of hive-exec-2.0.0.jar.

In my situation, the application and the yarn cluster are in different clusters.
When the application executed spark-submit with the local path of
hive-exec-2.0.0.jar against the yarn cluster, there was no hive-exec-2.0.0.jar in
the yarn cluster. The application then threw the exception: "hive-exec-2.0.0.jar
  does not exist ...".

Can the hive-exec-2.0.0.jar path be set as a property in the application?
Something like 'hiveConf.set("hive.remote.driver.jar",
"hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
If not, is it possible to achieve this in a future version?



2016-03-10 23:51 GMT+08:00 Xuefu Zhang :

> You can probably avoid the problem by set environment variable SPARK_HOME
> or JVM property spark.home that points to your spark installation.
>
> --Xuefu
>
> On Thu, Mar 10, 2016 at 3:11 AM, Stana  wrote:
>
> >  I am trying out Hive on Spark with hive 2.0.0 and spark 1.4.1, and
> > executing org.apache.hadoop.hive.ql.Driver with java application.
> >
> > Following are my situations:
> > 1.Building spark 1.4.1 assembly jar without Hive .
> > 2.Uploading the spark assembly jar to the hadoop cluster.
> > 3.Executing the java application with eclipse IDE in my client computer.
> >
> > The application went well and it submitted mr job to the yarn cluster
> > successfully when using " hiveConf.set("hive.execution.engine", "mr")
> > ",but it threw exceptions in spark-engine.
> >
> > Finally, i traced Hive source code and came to the conclusion:
> >
> > In my situation, SparkClientImpl class will generate the spark-submit
> > shell and executed it.
> > The shell command allocated  --class with RemoteDriver.class.getName()
> > and jar with SparkContext.jarOfClass(this.getClass()).get(), so that
> > my application threw the exception.
> >
> > Is it right? And how can I do to execute the application with
> > spark-engine successfully in my client computer ? Thanks a lot!
> >
> >
> > Java application code:
> >
> > public class TestHiveDriver {
> >
> > private static HiveConf hiveConf;
> > private static Driver driver;
> > private static CliSessionState ss;
> > public static void main(String[] args){
> >
> > String sql = "select * from hadoop0263_0 as a join
> > hadoop0263_0 as b
> > on (a.key = b.key)";
> > ss = new CliSessionState(new
> HiveConf(SessionState.class));
> > hiveConf = new HiveConf(Driver.class);
> > hiveConf.set("fs.default.name", "hdfs://storm0:9000");
> > hiveConf.set("yarn.resourcemanager.address",
> > "storm0:8032");
> > hiveConf.set("yarn.resourcemanager.scheduler.address",
> > "storm0:8030");
> >
> >
> hiveConf.set("yarn.resourcemanager.resource-tracker.address","storm0:8031");
> > hiveConf.set("yarn.resourcemanager.admin.address",
> > "storm0:8033");
> > hiveConf.set("mapreduce.framework.name", "yarn");
> > hiveConf.set("mapreduce.johistory.address",
> > "storm0:10020");
> >
> >
> hiveConf.set("javax.jdo.option.ConnectionURL","jdbc:mysql://storm0:3306/stana_metastore");
> >
> >
> hiveConf.set("javax.jdo.option.ConnectionDriverName","com.mysql.jdbc.Driver");
> > hiveConf.set("javax.jdo.option.ConnectionUserName",
> > "root");
> > hiveConf.set("javax.jdo.option.ConnectionPassword",
> > "123456");
> > hiveConf.setBoolean("hive.auto.convert.join",f

Error in Hive on Spark

2016-03-10 Thread Stana
I am trying out Hive on Spark with Hive 2.0.0 and Spark 1.4.1, executing
org.apache.hadoop.hive.ql.Driver from a Java application.

My setup is as follows:
1. Build the Spark 1.4.1 assembly jar without Hive.
2. Upload the Spark assembly jar to the Hadoop cluster.
3. Run the Java application from the Eclipse IDE on my client computer.

The application worked and submitted the MR job to the YARN cluster
successfully when using " hiveConf.set("hive.execution.engine", "mr") ",
but it threw exceptions with the Spark engine.

Finally, I traced the Hive source code and came to this conclusion:

In my situation, the SparkClientImpl class generates the spark-submit
shell command and executes it.
The command sets --class to RemoteDriver.class.getName() and the jar to
SparkContext.jarOfClass(this.getClass()).get(), which is why my
application threw the exception.

Is that right? And what can I do to run the application with the
Spark engine successfully from my client computer? Thanks a lot!


Java application code:

public class TestHiveDriver {

private static HiveConf hiveConf;
private static Driver driver;
private static CliSessionState ss;
public static void main(String[] args){

String sql = "select * from hadoop0263_0 as a join hadoop0263_0 
as b
on (a.key = b.key)";
ss = new CliSessionState(new HiveConf(SessionState.class));
hiveConf = new HiveConf(Driver.class);
hiveConf.set("fs.default.name", "hdfs://storm0:9000");
hiveConf.set("yarn.resourcemanager.address", "storm0:8032");
hiveConf.set("yarn.resourcemanager.scheduler.address", 
"storm0:8030");

hiveConf.set("yarn.resourcemanager.resource-tracker.address","storm0:8031");
hiveConf.set("yarn.resourcemanager.admin.address", 
"storm0:8033");
hiveConf.set("mapreduce.framework.name", "yarn");
hiveConf.set("mapreduce.johistory.address", "storm0:10020");

hiveConf.set("javax.jdo.option.ConnectionURL","jdbc:mysql://storm0:3306/stana_metastore");

hiveConf.set("javax.jdo.option.ConnectionDriverName","com.mysql.jdbc.Driver");
hiveConf.set("javax.jdo.option.ConnectionUserName", "root");
hiveConf.set("javax.jdo.option.ConnectionPassword", "123456");
hiveConf.setBoolean("hive.auto.convert.join",false);
hiveConf.set("spark.yarn.jar",
"hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar");
hiveConf.set("spark.home","target/spark");
hiveConf.set("hive.execution.engine", "spark");
hiveConf.set("hive.dbname", "default");


driver = new Driver(hiveConf);
SessionState.start(hiveConf);

CommandProcessorResponse res = null;
try {
res = driver.run(sql);
} catch (CommandNeedRetryException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

System.out.println("Response Code:" + res.getResponseCode());
System.out.println("Error Message:" + res.getErrorMessage());
System.out.println("SQL State:" + res.getSQLState());

}
}




Exception from the Spark engine:

16/03/10 18:32:58 INFO SparkClientImpl: Running client driver with
argv: 
/Volumes/Sdhd/Documents/project/island/java/apache/hive-200-test/hive-release-2.0.0/itests/hive-unit/target/spark/bin/spark-submit
--properties-file
/var/folders/vt/cjcdhms903x7brn1kbh558s4gn/T/spark-submit.7697089826296920539.properties
--class org.apache.hive.spark.client.RemoteDriver
/Users/stana/.m2/repository/org/apache/hive/hive-exec/2.0.0/hive-exec-2.0.0.jar
--remote-host MacBook-Pro.local --remote-port 51331 --conf
hive.spark.client.connect.timeout=1000 --conf
hive.spark.client.server.connect.timeout=9 --conf
hive.spark.client.channel.log.level=null --conf
hive.spark.client.rpc.max.size=52428800 --conf
hive.spark.client.rpc.threads=8 --conf
hive.spark.client.secret.bits=256
16/03/10 18:33:09 INFO SparkClientImpl: 16/03/10 18:33:09 INFO Client:
16/03/10 18:33:09 INFO SparkClientImpl:  client token: N/A
16/03/10 18:33:09 INFO SparkClientImpl:  diagnostics: N/A
16/03/10 18:33:09 INFO SparkClientImpl:  ApplicationMaster host: N/A
16/03/10 18:33:09 INFO SparkClientImpl:  ApplicationMaster RPC port: -1
16/03/10 18:33:09 INFO SparkClientImpl:  queue: default
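
For anyone hitting the same startup failure, here is a minimal sketch of the
workaround Xuefu suggests in the reply above: point the client JVM at a local
Spark installation before the Driver is created. The installation path is only
an example, and the remaining configuration is elided:

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.Driver;
import org.apache.hadoop.hive.ql.session.SessionState;

public class SparkHomeExample {
    public static void main(String[] args) {
        // Example path only; alternatively export SPARK_HOME=/opt/spark-1.4.1-bin-hadoop2.6
        // in the environment before launching the application.
        System.setProperty("spark.home", "/opt/spark-1.4.1-bin-hadoop2.6");

        HiveConf hiveConf = new HiveConf(Driver.class);
        hiveConf.set("hive.execution.engine", "spark");
        // ... cluster, metastore and spark.yarn.jar settings as in the code above ...

        SessionState.start(hiveConf);
        Driver driver = new Driver(hiveConf);
        // driver.run("select ...");
    }
}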

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Elliot West
Related to this and for the benefit of anyone who is using Hive: The issues
around testing and some possible approaches are summarised here:

https://cwiki.apache.org/confluence/display/Hive/Unit+testing+HQL


Ultimately there are no elegant solutions to the limitations correctly
described by Koert. However, if you do choose to use Hive, please be aware
that there are some good options out there for providing reasonable test
coverage of your production code. They aren't perfect by any means and are
certainly not at the level we've come to expect in other development
domains, but they are usable, and therefore there is no excuse for not
writing tests! :-)

Elliot.


On 3 February 2016 at 04:49, Koert Kuipers  wrote:

> yeah but have you ever seen someone write a real analytical program in
> hive? how? where are the basic abstractions to wrap up a large amount of
> operations (joins, groupby's) into a single function call? where are the
> tools to write nice unit tests for that?
>
> for example in spark i can write a DataFrame => DataFrame function that
> internally does many joins, groupBys and complex operations, all unit tested
> and perfectly re-usable. and in hive? copy-pasting sql queries around? that's
> just dangerous.
>
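
As a concrete aside, here is a minimal sketch of the kind of reusable
DataFrame-to-DataFrame function described above, written against the Spark Java
API; the table and column names are invented for illustration, and in the
Spark 1.x of this thread the type would be DataFrame rather than Dataset<Row>:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.sum;

public final class RevenueTransforms {
    private RevenueTransforms() {}

    // One reusable call that hides a join and an aggregation; it can be unit
    // tested against tiny in-memory Datasets without touching a cluster.
    public static Dataset<Row> revenuePerCustomer(Dataset<Row> orders, Dataset<Row> customers) {
        return orders
                .join(customers, orders.col("customer_id").equalTo(customers.col("id")))
                .groupBy(customers.col("id"), customers.col("name"))
                .agg(sum(orders.col("amount")).alias("revenue"));
    }
}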
> On Tue, Feb 2, 2016 at 8:09 PM, Edward Capriolo 
> wrote:
>
>> Hive has numerous extension points, you are not boxed in by a long shot.
>>
>>
>> On Tuesday, February 2, 2016, Koert Kuipers  wrote:
>>
>>> uuuhm with spark using Hive metastore you actually have a real
>>> programming environment and you can write real functions, versus just being
>>> boxed into some version of sql and limited udfs?
>>>
>>> On Tue, Feb 2, 2016 at 6:46 PM, Xuefu Zhang  wrote:
>>>
>>>> When comparing performance, you need to compare apples to apples. In
>>>> another thread, you mentioned that Hive on Spark is much slower than Spark
>>>> SQL. However, you configured Hive such that only two tasks can run in
>>>> parallel, and you didn't provide information on how many resources Spark
>>>> SQL was using. Thus, it's hard to tell whether it's just a configuration
>>>> problem in your Hive setup or Spark SQL is indeed faster. You should be
>>>> able to see the resource usage in the YARN resource manager URL.
>>>>
>>>> --Xuefu
>>>>
>>>> On Tue, Feb 2, 2016 at 3:31 PM, Mich Talebzadeh 
>>>> wrote:
>>>>
>>>>> Thanks Jeff.
>>>>>
>>>>>
>>>>>
>>>>> Obviously Hive is much more feature-rich than Spark. Having said that, in
>>>>> certain areas, for example where the same SQL feature is available in
>>>>> Spark, Spark seems to deliver results faster.
>>>>>
>>>>>
>>>>>
>>>>> This may be:
>>>>>
>>>>>
>>>>>
>>>>> 1. Spark does both the optimisation and the execution seamlessly
>>>>>
>>>>> 2. Hive on Spark has to invoke YARN, which adds another layer to the
>>>>> process
>>>>>
>>>>>
>>>>>
>>>>> Now I did some simple tests on a 100Million rows ORC table available
>>>>> through Hive to both.
>>>>>
>>>>>
>>>>>
>>>>> *Spark 1.5.2 on Hive 1.2.1 Metastore*
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> spark-sql> select * from dummy where id in (1, 5, 10);
>>>>>
>>>>> 1   0   0   63
>>>>> rMLTDXxxqXOZnqYRJwInlGfGBTxNkAszBGEUGELqTSRnFjRGbi   1
>>>>> xx
>>>>>
>>>>> 5   0   4   31
>>>>> vDsFoYAOcitwrWNXCxPHzIIIxwKpTlrsVjFFKUDivytqJqOHGA   5
>>>>> xx
>>>>>
>>>>> 10  99  999 188
>>>>> abQyrlxKzPTJliMqDpsfDTJUQzdNdfofUQhrKqXvRKwulZAoJe  10
>>>>> xx
>>>>>
>>>>> Time taken: 50.805 seconds, Fetched 3 row(s)
>>>>>
>>>>> spark-sql> select * from dummy where id in (1, 5, 10);
>>>>>
>>>>> 1   0   0   63
>>>>> rMLTDXxxqXOZnqYRJwInlGfGBTxNkAszBGEUGELqTSRnFjRGbi   1
>>>>> xx
>>>>>
>>>>> 5   0   4   31
>>>>> vDsFoYAOcitwrWNXCxPHzIIIxwKpTlrsVjFFKUDivytqJqOHGA   5
>>>>> xx

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Mich Talebzadeh
Hi Edward,

 

There is another angle to it as well. Fit for purpose.

 

We are currently migrating from a proprietary DW on SAN to Hive on JBOD. It is 
going smoothly, and it will save us $$ in licensing fees at a time when 
technology and storage dollars are at a premium.

 

Our DBAs who look after Oracle, SAP ASE and others are comfortable with Hive. 
They can look after the metastore (on Oracle) and work with me on HA for the 
metastore and HiveServer2, in line with the standard for other databases.

 

I am sure that if we had started with Spark, that would have worked too, but what 
the heck. We have MongoDB as well, independent of HDFS.

 

These arguments about what is better or worse are the same ones we have had for 
years about Oracle, Sybase, MSSQL, etc. I believe Hive is better for us because I 
think in Hive. If I were more familiar with Spark, I am sure the opposite would 
have been true.

 

We can go in circles. Religious arguments really.

 

 

HTH,

 

 

 

Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

 

http://talebzadehmich.wordpress.com

 


 

From: Edward Capriolo [mailto:edlinuxg...@gmail.com] 
Sent: 04 February 2016 17:41
To: user@hive.apache.org
Subject: Re: Hive on Spark Engine versus Spark using Hive metastore

 

Hive is not the correct tool for every problem. Use the tool that makes the 
most sense for your problem and your experience. 

 

Many people like hive because it is generally applicable. In my case study for 
the hive book I highlighted many smart, capable organizations that use hive. 

Your argument is totally valid. You like X better because X works for you. You 
don't need to 'preach' here; we all know hive has its limits. 

 

On Thu, Feb 4, 2016 at 10:55 AM, Koert Kuipers <ko...@tresata.com> wrote:

Is the sky the limit? I know udfs can be used inside hive, basically like 
lambdas I assume, and I will assume you have something similar for aggregations. 
But those are just abstractions inside a single map or reduce phase, pretty 
low-level stuff. What you really need is abstractions around many map and reduce 
phases, because that is the level an algo is expressed at.

For example, when doing logistic regression you want to be able to do something 
like:
read("somefile").train(settings).write("model")
Here train is an externally defined method that is well tested and could do many 
map and reduce steps internally (or even be defined at a higher level and 
compile into those steps). What is the equivalent in hive? Copy-pasting crucial 
parts of the algo around while using udfs is just not the same thing in terms 
of reusability and abstraction. It's the opposite of keeping it DRY.
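
A rough sketch of the kind of wrapper being described, using the (later) Spark
ML Java API; the paths, the parquet input format and the presence of
"features"/"label" columns are assumptions made for illustration:

import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public final class TrainJob {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("train-example").getOrCreate();

        // Assumes a parquet file with a "features" vector column and a "label" column.
        Dataset<Row> training = spark.read().parquet("hdfs:///tmp/training.parquet");

        // The "train(settings)" step: one well-tested, reusable call that expands
        // into many shuffle stages internally.
        LogisticRegressionModel model = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.01)
                .fit(training);

        model.write().overwrite().save("hdfs:///tmp/model");
        spark.stop();
    }
}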

On Feb 3, 2016 1:06 AM, "Ryan Harris" mailto:ryan.har...@zionsbancorp.com> > wrote:

https://github.com/myui/hivemall

 

as long as you are comfortable with java UDFs, the sky is really the 
limit...it's not for everyone and spark does have many advantages, but they are 
two tools that can complement each other in numerous ways.

 

I don't know that there is necessarily a universal "better" for how to use 
spark as an execution engine (or if spark is necessarily the *best* execution 
engine for any given hive job).

 

The reality is that once you start factoring in the numerous tuning parameters 
of the systems and jobs there probably isn't a clear answer.  For some queries, 
the Catalyst optimizer may do a better job...is it going to do a better job 
with ORC based data? less likely IMO. 
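
To make the "java UDFs" point above concrete, here is a minimal sketch of a
classic Hive UDF; the function name and behaviour are invented for illustration:

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(name = "normalize_ws",
    value = "_FUNC_(str) - trims str and collapses runs of whitespace into single spaces")
public final class NormalizeWhitespaceUDF extends UDF {
  public Text evaluate(Text input) {
    if (input == null) {
      return null;
    }
    return new Text(input.toString().trim().replaceAll("\\s+", " "));
  }
}

Once packaged into a jar it would be registered with something like
ADD JAR /path/to/udfs.jar; CREATE TEMPORARY FUNCTION normalize_ws AS
'NormalizeWhitespaceUDF'; and then used like any built-in function.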

 

From: Koert Kuipers [mailto:ko...@tresata.com] 
Sent: Tuesday, February 02, 2016 9:50 PM
To: user@hive.apache.org
