Hi,

You can also use Oozie's fork feature. Oozie is a workflow scheduler, and a fork node lets it run jobs in parallel. You just need to define all your HQL scripts inside the workflow.xml to make them run in parallel.

On Apr 22, 2014 3:14 AM, "Subramanian, Sanjay (HQP)" <[email protected]> wrote:
> Hey
>
> Instead of going into the Hive CLI, I would propose 2 ways.
>
> *NOHUP*
>
> nohup hive -f path/to/query/file/hive1.hql >> ./hive1.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
> nohup hive -f path/to/query/file/hive2.hql >> ./hive2.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
> nohup hive -f path/to/query/file/hive3.hql >> ./hive3.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
> nohup hive -f path/to/query/file/hive4.hql >> ./hive4.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
> nohup hive -f path/to/query/file/hive5.hql >> ./hive5.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
>
> Each statement above will launch MR jobs on your cluster, and depending
> on the cluster configs the jobs will run in parallel.
> Scheduling jobs on the MR cluster is independent of Hive.
>
> *SCREEN sessions*
>
> - Create a screen session
>    - screen -S hive_query1
>    - You are now inside the screen session hive_query1
>    - hive -f path/to/query/file/hive1.hql
>    - Ctrl-A D
>    - This detaches you from the screen session
> - Repeat for each Hive query you want to run
>    - E.g., 5 screen sessions, each running a Hive query
> - To display the active screen sessions
>    - screen -ls
> - To attach to a screen session
>    - screen -x hive_query1
>
> Thanks
>
> Warm Regards
>
> Sanjay
>
> From: saurabh <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Monday, April 21, 2014 at 1:53 PM
> To: "[email protected]" <[email protected]>
> Subject: Executing Hive Queries in Parallel
>
> Hi,
> I need some input on executing Hive queries in parallel. I tried doing
> this using the CLI (by opening multiple SSH connections) and executed 4 HQLs;
> I observed that the queries were executed sequentially. All four
> queries got submitted; however, while the first one was executing,
> the others were in a pending state. I was performing this activity on
> EMR running in batch mode, hence I wasn't able to dig into the logs.
>
> The Hive CLI uses a native Hive connection, which by default uses the FIFO
> scheduler. This might be one of the reasons for the queries getting
> executed in sequence.
>
> I also observed that when multiple queries are executed using multiple
> HUE sessions, they do run in parallel. Can you
> please suggest how the functionality of HUE can be replicated using the CLI?
>
> I am aware of the Beeswax client; however, I am not sure how it can be used
> during EMR batch-mode processing.
>
> Thanks in advance for going through this. Kindly let me know your
> thoughts on the same.
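
For reference, the nohup approach quoted above could be wrapped in a small script along these lines. This is only a sketch under a few assumptions: the function name run_parallel and the HIVE_CMD and LOG_DIR variables are made up for illustration, each command is sent to the background with a trailing & so the shell does not wait for one query before starting the next, and the timestamp uses plain ASCII hyphens. Whether the resulting MR jobs actually run side by side still depends on the cluster's scheduler configuration.

```shell
#!/bin/sh
# Sketch: start one "hive -f" per HQL file in the background, then wait for all.
# HIVE_CMD and LOG_DIR are hypothetical overrides, not Hive settings.
# Plain POSIX sh, so the function's variables are global (no "local").

run_parallel() {
    ts=$(date +%Y-%m-%d-%H-%M-%S)
    pids=""
    for f in "$@"; do
        # One log per query file; the trailing & keeps the loop from blocking.
        nohup "${HIVE_CMD:-hive}" -f "$f" \
            >> "${LOG_DIR:-.}/$(basename "$f")_$ts.log" 2>&1 &
        pids="$pids $!"
    done

    # Collect the exit status of every background job.
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    # The return value is the number of queries that exited non-zero.
    return "$failed"
}

# When the script is called with file arguments, run them right away.
if [ "$#" -gt 0 ]; then
    run_parallel "$@"
fi
```

Saved as, say, run_hql.sh (a hypothetical name), it would be called as: sh run_hql.sh path/to/query/file/hive1.hql path/to/query/file/hive2.hql. The script's exit status is then the number of queries that failed, and each query's output ends up in its own timestamped log file.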
