I believe the HWI (Hive Web Interface) can give you a hand.

https://github.com/anjuke/hwi

You can use the HWI to submit and run queries concurrently.
Partition management can be handled by scheduling crontab jobs through the HWI.
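As a rough sketch of the crontab approach (the table name "events", the partition column "dt", and the HDFS path are hypothetical, not from this thread), a nightly job can build and run the HiveQL that registers the previous day's partition:

```shell
# Hypothetical sketch: register yesterday's partition for an external table
# "events" partitioned by a string column "dt". Table, column, and path
# names are assumptions for illustration.
DT=$(date -d "yesterday" +%Y-%m-%d)   # GNU date syntax
HQL="ALTER TABLE events ADD IF NOT EXISTS PARTITION (dt='${DT}') LOCATION '/data/events/dt=${DT}'"
echo "$HQL"
# A crontab entry could then run this nightly, e.g. at 00:15:
#   15 0 * * * hive -e "$(/path/to/build_partition_hql.sh)"
```

The IF NOT EXISTS clause makes the job safe to re-run, so overlapping or retried cron invocations will not fail on an already-registered partition.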

It's simple and easy to use. Hope it helps.

Regards,
Qiang


2013/1/11 Tom Brown <tombrow...@gmail.com>

> All,
>
> I want to automate jobs against Hive (using an external table with
> ever growing partitions), and I'm running into a few challenges:
>
> Concurrency - If I run Hive as a thrift server, I can only safely run
> one job at a time. As such, it seems like my best bet will be to run
> it from the command line and set up a brand new instance for each job.
> That's quite a bit of hassle to solve a seemingly common problem, so
> I want to know if there are any accepted patterns or best practices
> for this?
>
> Partition management - New partitions will be added regularly. If I
> have to set up multiple instances of Hive for each (potentially)
> overlapping job, it will be difficult to keep track of the partitions
> that have been added. In the context of the preceding question, what
> is the best way to add metadata about new partitions?
>
> Thanks in advance!
>
> --Tom
>
