I recommend using a Hadoop distribution that bundles these technologies. You also get other tools useful for your scenario, such as auditing with Sentry or Ranger.
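For reference, the yarn-site.xml change Mich mentions below is just one property entry. A minimal sketch, assuming a stock Apache Hadoop layout where the file lives under $HADOOP_HOME/etc/hadoop:

    <!-- yarn-site.xml: sketch only; switches the ResourceManager to the
         Fair Scheduler, using the exact value quoted from the Hive on
         Spark guide in the thread below -->
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>

After editing the file, restart the ResourceManager so the new scheduler takes effect.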
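On the permanence point raised below: values set with `set ...` in the Hive shell only last for that session. To make a setting stick you would put it in hive-site.xml instead (typically under $HIVE_HOME/conf or /etc/hive/conf, though the exact path depends on the install). For example, to default Hive to the Spark engine, the setting the Hive on Spark guide otherwise has you type per session, a sketch might look like:

    <!-- hive-site.xml: sketch only; makes Spark the default execution
         engine for every Hive session, instead of running
         'set hive.execution.engine=spark;' in each shell -->
    <property>
      <name>hive.execution.engine</name>
      <value>spark</value>
    </property>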
> On 20 Nov 2015, at 10:48, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>
> Well,
>
> “I'm planning to deploy Hive on Spark but I can't find the installation
> steps. I tried to read the official '[Hive on Spark][1]' guide but it has
> problems. As an example, under 'Configuring Yarn' it says
> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
> but does not say where this should be set. Also, as per the guide,
> configurations are set in the Hive runtime shell, which to my knowledge is
> not permanent.”
>
> You can do that in the yarn-site.xml file, which is normally under
> $HADOOP_HOME/etc/hadoop.
>
> HTH
>
> Mich Talebzadeh
>
> Sybase ASE 15 Gold Medal Award 2008
> A Winning Strategy: Running the most Critical Financial Data on ASE 15
> http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
> Author of the book "A Practitioner’s Guide to Upgrading to Sybase ASE 15",
> ISBN 978-0-9563693-0-7, and co-author of "Sybase Transact SQL Guidelines
> Best Practices", ISBN 978-0-9759693-0-4.
> Publications due shortly:
> Complex Event Processing in Heterogeneous Environments, ISBN 978-0-9563693-3-8
> Oracle and Sybase, Concepts and Contrasts, ISBN 978-0-9563693-1-4, volume
> one out shortly
>
> http://talebzadehmich.wordpress.com
>
> From: Dasun Hegoda [mailto:dasunheg...@gmail.com]
> Sent: 20 November 2015 09:36
> To: user@hive.apache.org
> Subject: Hive on Spark - Hadoop 2 - Installation - Ubuntu
>
> Hi,
>
> What I'm planning to do is develop a reporting platform using existing data.
> I have an existing RDBMS with a large number of records, so I'm using the
> following stack
> (http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture):
>
> - Sqoop - extract data from the RDBMS into Hadoop
> - Hadoop - storage platform -> *deployment completed*
> - Hive - data warehouse
> - Spark - real-time processing -> *deployment completed*
>
> I'm planning to deploy Hive on Spark but I can't find the installation
> steps. I tried to read the official '[Hive on Spark][1]' guide but it has
> problems. As an example, under 'Configuring Yarn' it says
> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
> but does not say where this should be set. Also, as per the guide,
> configurations are set in the Hive runtime shell, which to my knowledge is
> not permanent.
>
> Given that, I also read [this][2], but it does not give any steps either.
>
> Could you please provide the steps to run Hive on Spark on Ubuntu as a
> production system?
>
> [1]:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
> [2]:
> http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark
>
> --
> Regards,
> Dasun Hegoda, Software Engineer
> www.dasunhegoda.com | dasunheg...@gmail.com