I think the most recent versions of Cloudera or Hortonworks should include all 
these components - try their Sandboxes. 

> On 20 Nov 2015, at 12:54, Dasun Hegoda <dasunheg...@gmail.com> wrote:
> 
> Where can I get a Hadoop distribution containing these technologies? Link?
> 
>> On Fri, Nov 20, 2015 at 5:22 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>> I recommend using a Hadoop distribution containing these technologies. I 
>> think you also get other useful tools for your scenario, such as auditing 
>> with Sentry or Ranger.
>> 
>>> On 20 Nov 2015, at 10:48, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>>> 
>>> Well
>>> 
>>>  
>>> 
>>> “I'm planning to deploy Hive on Spark but I can't find the installation 
>>> steps. I tried to read the official '[Hive on Spark][1]' guide but it has 
>>> problems. As an example, under 'Configuring Yarn' it says 
>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>>  but does not say where I should set it. Also, as per the guide, 
>>> configurations are set in the Hive runtime shell, which to my knowledge is 
>>> not permanent.”
>>> 
>>>  
>>> 
>>> You can set that in the yarn-site.xml file, which is normally under 
>>> $HADOOP_HOME/etc/hadoop.
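>>> 
>>> For example, a minimal sketch of the relevant yarn-site.xml entry (the 
>>> property name comes from the guide quoted above; the file location may 
>>> differ in your installation):
>>> 
>>> ```xml
>>> <!-- yarn-site.xml: use the Fair Scheduler, as the Hive on Spark guide suggests -->
>>> <property>
>>>   <name>yarn.resourcemanager.scheduler.class</name>
>>>   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
>>> </property>
>>> ```
>>> 
>>> You need to restart the ResourceManager after editing the file for the 
>>> change to take effect.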
>>> 
>>>  
>>> 
>>>  
>>> 
>>> HTH
>>> 
>>>  
>>> 
>>> Mich Talebzadeh
>>> 
>>>  
>>> 
>>> Sybase ASE 15 Gold Medal Award 2008
>>> 
>>> A Winning Strategy: Running the most Critical Financial Data on ASE 15
>>> 
>>> http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
>>> 
>>> Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", 
>>> ISBN 978-0-9563693-0-7.
>>> 
>>> co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 
>>> 978-0-9759693-0-4
>>> 
>>> Publications due shortly:
>>> 
>>> Complex Event Processing in Heterogeneous Environments, ISBN: 
>>> 978-0-9563693-3-8
>>> 
>>> Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume 
>>> one out shortly
>>> 
>>>  
>>> 
>>> http://talebzadehmich.wordpress.com
>>> 
>>>  
>>> 
>>> NOTE: The information in this email is proprietary and confidential. This 
>>> message is for the designated recipient only; if you are not the intended 
>>> recipient, you should destroy it immediately. Any information in this 
>>> message shall not be understood as given or endorsed by Peridale Technology 
>>> Ltd, its subsidiaries or their employees, unless expressly so stated. It is 
>>> the responsibility of the recipient to ensure that this email is virus 
>>> free; therefore neither Peridale Ltd, its subsidiaries nor their employees 
>>> accept any responsibility.
>>> 
>>>  
>>> 
>>> From: Dasun Hegoda [mailto:dasunheg...@gmail.com] 
>>> Sent: 20 November 2015 09:36
>>> To: user@hive.apache.org
>>> Subject: Hive on Spark - Hadoop 2 - Installation - Ubuntu
>>> 
>>>  
>>> 
>>> Hi,
>>> 
>>>  
>>> 
>>> What I'm planning to do is develop a reporting platform using existing 
>>> data. I have an existing RDBMS which has a large number of records, so I'm 
>>> using the following stack 
>>> (http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture):
>>> 
>>>  
>>> 
>>>  - Sqoop - Extract data from RDBMS to Hadoop
>>> 
>>>  - Hadoop - Storage platform -> *Deployment Completed*
>>> 
>>>  - Hive - Data warehouse
>>> 
>>>  - Spark - Real-time processing -> *Deployment Completed*
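>>> 
>>> As a hypothetical sketch of the Sqoop step (the JDBC URL, credentials, and 
>>> table name below are placeholders, not values from this thread):
>>> 
>>> ```shell
>>> # Import one table from the source RDBMS into Hive via HDFS.
>>> # -P prompts for the password instead of putting it on the command line.
>>> sqoop import \
>>>   --connect jdbc:mysql://db-host:3306/reporting \
>>>   --username report_user -P \
>>>   --table orders \
>>>   --hive-import --hive-table orders \
>>>   --num-mappers 4
>>> ```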
>>> 
>>>  
>>> 
>>> I'm planning to deploy Hive on Spark but I can't find the installation 
>>> steps. I tried to read the official '[Hive on Spark][1]' guide but it has 
>>> problems. As an example, under 'Configuring Yarn' it says 
>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>>  but does not say where I should set it. Also, as per the guide, 
>>> configurations are set in the Hive runtime shell, which to my knowledge is 
>>> not permanent.
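>>> 
>>> (To make such settings survive the Hive shell session, they can go into 
>>> hive-site.xml instead - a sketch, assuming the property from the Hive on 
>>> Spark guide; the full set of properties needed depends on the Hive and 
>>> Spark versions in use:
>>> 
>>> ```xml
>>> <!-- hive-site.xml: select Spark as Hive's execution engine permanently -->
>>> <property>
>>>   <name>hive.execution.engine</name>
>>>   <value>spark</value>
>>> </property>
>>> ```
>>> 
>>> whereas `set hive.execution.engine=spark;` typed in the Hive shell only 
>>> lasts for that session.)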
>>> 
>>>  
>>> 
>>> Given that, I read [this][2], but it does not have any steps either.
>>> 
>>>  
>>> 
>>> Could you please provide the steps to run Hive on Spark on Ubuntu as a 
>>> production system?
>>> 
>>>  
>>> 
>>>  
>>> 
>>>   [1]: 
>>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>> 
>>>   [2]: 
>>> http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark
>>> 
>>>  
>>> 
>>> --
>>> 
>>> Regards,
>>> 
>>> Dasun Hegoda, Software Engineer  
>>> www.dasunhegoda.com | dasunheg...@gmail.com
>>> 
> 
> 
> 
> -- 
> Regards,
> Dasun Hegoda, Software Engineer  
> www.dasunhegoda.com | dasunheg...@gmail.com
