+1 for the HWI -> HiveServer approach.

Building out rich APIs in the HiveServer (Thrift currently, and possibly
REST at some point) would allow the HiveServer to focus on the functional
API. The HWI (and others) could then focus on rich UI functionality. The two
would be cleanly decoupled, which would reduce the complexity of the
codebases and help abide by the KISS principle.
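
To make the decoupling concrete, here is a minimal sketch in Java. All of
the names in it (QueryService, InMemoryQueryService, WebUi) are hypothetical
and are not part of Hive or its Thrift interface. The in-memory service only
stands in for a remote HiveServer call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical functional API a HiveServer could expose (not Hive's real interface). */
interface QueryService {
    List<String> execute(String sessionId, String query);
}

/** Stand-in for a remote HiveServer; a real client would go over Thrift or REST here. */
class InMemoryQueryService implements QueryService {
    @Override
    public List<String> execute(String sessionId, String query) {
        List<String> rows = new ArrayList<>();
        // Echo the request back as a single fake result row.
        rows.add("session=" + sessionId + " ran: " + query.trim());
        return rows;
    }
}

/** UI-side code: it only formats results and knows nothing about query execution. */
class WebUi {
    private final QueryService service;

    WebUi(QueryService service) {
        this.service = service;
    }

    String renderResults(String sessionId, String query) {
        StringBuilder html = new StringBuilder("<ul>");
        for (String row : service.execute(sessionId, query)) {
            html.append("<li>").append(row).append("</li>");
        }
        return html.append("</ul>").toString();
    }
}

public class DecouplingSketch {
    public static void main(String[] args) {
        WebUi ui = new WebUi(new InMemoryQueryService());
        System.out.println(ui.renderResults("s1", "show tables"));
    }
}
```

The point of the sketch is only that the UI depends on the interface, so the
server side could be swapped or clustered without touching the UI code.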



On Wed, Aug 26, 2009 at 2:42 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Wed, Aug 26, 2009 at 3:25 PM, Raghu Murthy<rmur...@facebook.com> wrote:
> > Even if we decided to have multiple HiveServers, wouldn't it be possible
> > for HWI to randomly pick a HiveServer to connect to per query/client?
> >
> > On 8/26/09 12:16 PM, "Ashish Thusoo" <athu...@facebook.com> wrote:
> >
> >> +1 for ajaxing this baby.
> >>
> >> On the broader question of whether we should combine HWI and HiveServer -
> >> I think there are definite deployment and code reuse advantages in doing
> >> so, however keeping them separate also has the advantage that we can
> >> cluster HiveServers independently from HWI. Since the HiveServer sits in
> >> the data path, the independent scaling may have advantages. I am not sure
> >> how strong of an argument that is to not put them together. Simplicity
> >> obviously indicates that we should have them together.
> >>
> >> Thoughts?
> >>
> >> Ashish
> >>
> >> -----Original Message-----
> >> From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
> >> Sent: Wednesday, August 26, 2009 9:45 AM
> >> To: hive-user@hadoop.apache.org
> >> Subject: Re: Adding jar files when running hive in hwi mode or
> >> hiveserver mode
> >>
> >> On Tue, Aug 25, 2009 at 8:13 PM, Vijay<tec...@gmail.com> wrote:
> >>> Yep, I got it and now it works perfectly! I like hwi btw! It
> >>> definitely makes things easier for a wider audience to try out hive.
> >>> Your new session result bucket idea is very nice as well. I will keep
> >>> trying more things and see if anything else comes up but so far it
> >>> looks great!
> >>> Thanks Edward!
> >>>
> >>> On Tue, Aug 25, 2009 at 7:25 AM, Edward Capriolo
> >>> <edlinuxg...@gmail.com>
> >>> wrote:
> >>>>
> >>>> On Tue, Aug 25, 2009 at 10:18 AM, Edward
> >>>> Capriolo<edlinuxg...@gmail.com>
> >>>> wrote:
> >>>>> On Mon, Aug 24, 2009 at 10:13 PM, Vijay<tec...@gmail.com> wrote:
> >>>>>> Probably spoke too soon :) I added this comment to the JIRA ticket
> >>>>>> above.
> >>>>>>
> >>>>>> Hi, I tried the latest patch on trunk and there seems to be a
> >>>>>> problem.
> >>>>>>
> >>>>>> I was interested in using the "add jar " command to add jar files
> >>>>>> to the path. However, by the time the command flows through the
> >>>>>> SessionState to the AddResourceProcessor (in
> >>>>>> ./ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java),
> >>>>>> the command word "add" is not being stripped so the
> >>>>>> resource processor is trying to find a ResourceType of "ADD."
> >>>>>>
> >>>>>> I'm not sure if this was an existing bug or was a result of the
> >>>>>> current set of changes.
> >>>>>>
> >>>>>> On Mon, Aug 24, 2009 at 5:30 PM, Vijay <tec...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> That's awesome and looks like exactly what I needed. Local file
> >>>>>>> system requirement is perfectly ok for now. I will check it out
> >>>>>>> right away!
> >>>>>>> Hopefully it will be checked in soon.
> >>>>>>>
> >>>>>>> Thanks Edward!
> >>>>>>>
> >>>>>>> On Mon, Aug 24, 2009 at 5:14 PM, Edward Capriolo
> >>>>>>> <edlinuxg...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> On Mon, Aug 24, 2009 at 8:09 PM, Prasad
> >>>>>>>> Chakka<pcha...@facebook.com>
> >>>>>>>> wrote:
> >>>>>>>>> Vijay, there is no solution for it yet. There may be a jira
> >>>>>>>>> open but AFAIK, no one is working on it. You are welcome to
> >>>>>>>>> contribute this feature.
> >>>>>>>>>
> >>>>>>>>> Prasad
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ________________________________
> >>>>>>>>> From: Vijay <tec...@gmail.com>
> >>>>>>>>> Reply-To: <hive-user@hadoop.apache.org>
> >>>>>>>>> Date: Mon, 24 Aug 2009 16:59:28 -0700
> >>>>>>>>> To: <hive-user@hadoop.apache.org>
> >>>>>>>>> Subject: Re: Adding jar files when running hive in hwi mode or
> >>>>>>>>> hiveserver mode
> >>>>>>>>>
> >>>>>>>>> Hi, is there any solution for this? How does everybody include
> >>>>>>>>> custom jar files running hive in a non-cli mode?
> >>>>>>>>>
> >>>>>>>>> Thanks in advance,
> >>>>>>>>> Vijay
> >>>>>>>>>
> >>>>>>>>> On Sat, Aug 22, 2009 at 6:19 PM, Vijay <tec...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> When I run hive in cli mode, I add the hive_contrib.jar file
> >>>>>>>>> using this
> >>>>>>>>> command:
> >>>>>>>>>
> >>>>>>>>> hive> add jar lib/hive_contrib.jar
> >>>>>>>>>
> >>>>>>>>> Is there a way to do this automatically when running hive in
> >>>>>>>>> hwi or hiveserver modes? Or do I have to add the jar file
> >>>>>>>>> explicitly to any of the startup scripts?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Vijay,
> >>>>>>>>
> >>>>>>>> Currently HWI does not support this. The changes in
> >>>>>>>> https://issues.apache.org/jira/browse/HIVE-716 will make this
> >>>>>>>> possible (I have not tested it, but it should work as the CLI
> >>>>>>>> does). The file will have to be on the server's local file
> >>>>>>>> system. We could probably add 'commons upload' to the web
> >>>>>>>> interface if there was a need for it.
> >>>>>>>>
> >>>>>>>> HIVE-716 should be in trunk soon. It does apply cleanly if it's
> >>>>>>>> something you need today.
> >>>>>>>>
> >>>>>>>> Edward
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> I just committed a new version of the patch. You were correct: the
> >>>>> CliDriver trims the first token off set and add queries, and HWI was
> >>>>> not doing that. Also let me know your impressions of HWI.
> >>>>>
> >>>>> The new features are the 'ResultBucket', a buffer of the last x
> >>>>> results viewable from the web interface, and the ability to supply
> >>>>> more than one query at a time.
> >>>>>
> >>>>> These two features should add much usability, as you can now do
> >>>>> things like explain, show tables, etc. and not have to dump the
> >>>>> results to a file.
> >>>>>
> >>>>> Edward
> >>>>>
> >>>>
> >>>> False statement:
> >>>>>> I just committed a new version of the patch
> >>>>
> >>>> In actuality, I updated the Jira with a new patch.
> >>>>
> >>>> It is still early AM. All the gears are not turning yet.
> >>>>
> >>>> Edward
> >>>
> >>>
> >>
> >> Vijay,
> >>
> >>>> It definitely makes things easier for a wider audience to try out
> >>>> hive
> >>
> >> That was always the goal. I often wonder which direction we should take
> >> HWI in.
> >> Should HWI have some REST-ful stubs to turn it into a remote job
> >> submission system?
> >> HiveServer uses thrift and I believe thrift has an HTTP-Transport, so you
> >> might not need HWI to provide this.
> >>
> >> Should we ajax things like the result bucket or the entire interface so
> >> it has that ooo aaahhh effect?
> >>
> >> Really the larger question: HWI has its own multi-session management, and
> >> HiveServer has this as well (now; way back when it did not). Should HWI
> >> just front end HiveServer?
> >>
> >> Does anyone have any thoughts?
> >> Edward
> >
> >
>
> I think Raghu is correct. HiveClient->HiveServer happens on a
> permanent TCP connection (I think?). Suppose you had a back end cluster of
> HiveServers and a load balancer or proxy with a
> sticky-session/session-tracking/source-ip policy. HWI would be
> configured with the virtual IP address of the load balancer and would
> connect and stay connected to a random HiveServer in the farm.
>
> I am naturally partial to the way it is now because I came up with it :)
>
> I like the idea of having a REST-ful/XML-RPC or some web service style
> interface for job submit.
>
> My thinking behind HWI has always been KISS. Keep It Simple Stupid.
> Anyone should be able to hack a few web pages onto it. Adding thrift,
> ajax, XML-RPC layers definitely ups the complexity.
>
> I think it makes sense to do HWI->HiveServer. I will have to take a
> deeper look at what HiveServer and thrift offer to be sure.
>
> Edward
>
