Hi,

What I'm planning to do is develop a reporting platform on top of existing
data. I have an existing RDBMS with a large number of records, so I'm
using the following stack (see
http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture
):

 - Sqoop - Extract data from the RDBMS into Hadoop (a sample import is
   sketched after this list)
 - Hadoop - Storage platform -> *Deployment Completed*
 - Hive - Data warehouse
 - Spark - Real-time processing -> *Deployment Completed*
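
For the Sqoop step, the kind of import I have in mind looks roughly like
this (a sketch only; the JDBC URL, credentials, and table name are
placeholders for my setup):

    # Pull one table from the RDBMS into Hive (via HDFS); -P prompts for the password
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/reportdb \
      --username report_user -P \
      --table orders \
      --hive-import \
      --hive-table orders \
      --num-mappers 4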

I'm planning to deploy Hive on Spark, but I can't find the installation
steps. I tried to follow the official '[Hive on Spark][1]' guide, but it
has gaps. For example, under 'Configuring Yarn' it says to set
`yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
but does not say where that should be done. Also, the guide sets its
configurations in the Hive runtime shell, which as far as I know is not
permanent.
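
My assumption (the guide does not say this) is that a cluster-wide setting
like that belongs in yarn-site.xml on the ResourceManager host, i.e.:

    <!-- yarn-site.xml on the ResourceManager host: switch YARN to the Fair Scheduler -->
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>

Is that the right place, and does it require a ResourceManager restart?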

I have also read [this][2], but it does not give any concrete steps either.
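
My best guess is that the permanent equivalent of the guide's shell `set`
commands is to put the same properties in hive-site.xml, something like the
following (values taken from the guide's shell examples; spark.master in
particular is a placeholder I'm not sure about):

    <!-- hive-site.xml: what I assume is the permanent form of the guide's
         `set hive.execution.engine=spark;` and related shell commands -->
    <property>
      <name>hive.execution.engine</name>
      <value>spark</value>
    </property>
    <property>
      <name>spark.master</name>
      <value>yarn-cluster</value>
    </property>
    <property>
      <name>spark.eventLog.enabled</name>
      <value>true</value>
    </property>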

Could you please provide the steps to run Hive on Spark on Ubuntu as a
production system?


  [1]: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
  [2]: http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark

-- 
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com
