Hi everyone,

Thank you for your responses. I think Mich's suggestion is a great one; I will 
go with it. As Alan suggested, using the compactor in Hive should help with 
managing the delta files.
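
From what I gather, enabling it involves something like the following on the 
metastore side (a minimal sketch, assuming Hive 0.13+ with ACID transactions 
already configured; the table name below is only a placeholder):

    <!-- hive-site.xml on the metastore host -->
    <property>
      <name>hive.compactor.initiator.on</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.compactor.worker.threads</name>
      <value>2</value>
    </property>

A compaction can also be requested manually for a given table, e.g. 
`ALTER TABLE my_acid_table COMPACT 'major';`.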


@Dasun, pardon me for deviating from the topic. Regarding configuration, you 
could try a packaged distribution (Hortonworks, Cloudera or MapR), as Jörn 
Franke suggested. I use Hortonworks; it is open source, runs on Linux and 
Windows, comes with detailed installation documentation, and can be installed 
in less than a day provided you are all set with the hardware.
http://hortonworks.com/hdp/downloads/


Regards,

Sai

________________________________
From: Dasun Hegoda <dasunheg...@gmail.com>
Sent: Saturday, November 21, 2015 8:00 AM
To: user@hive.apache.org
Subject: Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

Hi Mich, Hi Sai, Hi Jörn,

Thank you very much for the information. I think we are deviating from the 
original question: Hive on Spark on Ubuntu. Could you please tell me the 
configuration steps?



On Fri, Nov 20, 2015 at 11:10 PM, Jörn Franke <jornfra...@gmail.com> wrote:
I think the most recent versions of Cloudera or Hortonworks should include all 
of these components - try their sandboxes.

On 20 Nov 2015, at 12:54, Dasun Hegoda <dasunheg...@gmail.com> wrote:

Where can I get a Hadoop distribution containing these technologies? Link?

On Fri, Nov 20, 2015 at 5:22 PM, Jörn Franke <jornfra...@gmail.com> wrote:
I recommend using a Hadoop distribution containing these technologies. You 
also get other useful tools for your scenario, such as auditing with Sentry or 
Ranger.

On 20 Nov 2015, at 10:48, Mich Talebzadeh <m...@peridale.co.uk> wrote:

Well

“I'm planning to deploy Hive on Spark but I can't find the installation steps. 
I tried to read the official '[Hive on Spark][1]' guide but it has problems. 
For example, under 'Configuring Yarn' it says to set 
`yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`, 
but it does not say where I should do that. Also, as per the guide, 
configurations are set in the Hive runtime shell, which to my knowledge is not 
permanent.”

You can set that in the yarn-site.xml file, which is normally under 
$HADOOP_HOME/etc/hadoop.
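
For example, something along these lines (adjust to your environment and 
restart the ResourceManager afterwards):

    <!-- $HADOOP_HOME/etc/hadoop/yarn-site.xml -->
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>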


HTH



Mich Talebzadeh

Sybase ASE 15 Gold Medal Award 2008
A Winning Strategy: Running the most Critical Financial Data on ASE 15
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", 
ISBN 978-0-9563693-0-7.
co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 
978-0-9759693-0-4
Publications due shortly:
Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one 
out shortly

http://talebzadehmich.wordpress.com

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only; if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Technology Ltd, its 
subsidiaries or their employees, unless expressly so stated. It is the 
responsibility of the recipient to ensure that this email is virus free; 
therefore neither Peridale Ltd, its subsidiaries nor their employees accept any 
responsibility.

From: Dasun Hegoda [mailto:dasunheg...@gmail.com]
Sent: 20 November 2015 09:36
To: user@hive.apache.org
Subject: Hive on Spark - Hadoop 2 - Installation - Ubuntu

Hi,

What I'm planning to do is develop a reporting platform using existing data. I 
have an existing RDBMS which has a large number of records, so I'm using the 
following stack 
(http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture):

 - Sqoop - Extract data from the RDBMS into Hadoop (a sample import command is 
sketched below)
 - Hadoop - Storage platform -> *Deployment Completed*
 - Hive - Data warehouse
 - Spark - Real-time processing -> *Deployment Completed*
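
For the Sqoop step, the kind of command I have in mind looks roughly like this 
(a sketch only; the connection string, credentials and table name are 
placeholders for my environment):

    sqoop import \
      --connect jdbc:mysql://dbhost:3306/salesdb \
      --username report_user -P \
      --table orders \
      --hive-import --hive-table orders \
      -m 4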

I'm planning to deploy Hive on Spark but I can't find the installation steps. 
I tried to read the official '[Hive on Spark][1]' guide but it has problems. 
For example, under 'Configuring Yarn' it says to set 
`yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`, 
but it does not say where I should do that. Also, as per the guide, 
configurations are set in the Hive runtime shell, which to my knowledge is not 
permanent.
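
My understanding is that, to make them permanent, settings like these would 
have to go into hive-site.xml instead, roughly along these lines (my guess 
based on the guide, so please correct me if this is wrong):

    <!-- $HIVE_HOME/conf/hive-site.xml -->
    <property>
      <name>hive.execution.engine</name>
      <value>spark</value>
    </property>
    <property>
      <name>spark.master</name>
      <value>yarn-client</value>
    </property>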

I also read [this][2], but it does not include any installation steps.

Could you please provide the steps to run Hive on Spark on Ubuntu as a 
production system?


  [1]: 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
  [2]: 
http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark

--
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com



--
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com



--
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com

This e-mail message and any attachments are for the sole use of the intended 
recipient(s) and may contain proprietary, confidential, trade secret or 
privileged information. Any unauthorized review, use, disclosure or 
distribution is prohibited and may be a violation of law. If you are not the 
intended recipient, please contact the sender by reply e-mail and destroy all 
copies of the original message.
