Anyone?

On Sat, Nov 21, 2015 at 1:32 PM, Dasun Hegoda <dasunheg...@gmail.com> wrote:
> Thank you very much, but I would like to do the integration of these
> components myself rather than use a packaged distribution. I think I have
> come to the right place. Can you please tell me the configuration steps to
> run Hive on Spark?
>
> At least, could someone please elaborate on the steps in this guide:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>
> In the latter part of the guide, configurations are set in the Hive
> runtime shell, which is not permanent as far as I know.
>
> Please help me get this done. I am also planning to write a detailed guide
> with the configuration steps to run Hive on Spark, so that others can
> benefit from it and not be troubled like me.
>
> On Sat, Nov 21, 2015 at 12:28 PM, Sai Gopalakrishnan
> <sai.gopalakrish...@aspiresys.com> wrote:
>
>> Hi everyone,
>>
>> Thank you for your responses. I think Mich's suggestion is a great one
>> and I will go with it. As Alan suggested, using the compactor in Hive
>> should help with managing the delta files.
>>
>> @Dasun, pardon me for deviating from the topic. Regarding configuration,
>> you could try a packaged distribution (Hortonworks, Cloudera or MapR)
>> like Jörn Franke said. I use Hortonworks; it is open source, compatible
>> with Linux and Windows, provides detailed documentation for installation,
>> and can be installed in less than a day provided you're all set with the
>> hardware. http://hortonworks.com/hdp/downloads/
>>
>> Regards,
>>
>> Sai
>>
>> ------------------------------
>> *From:* Dasun Hegoda <dasunheg...@gmail.com>
>> *Sent:* Saturday, November 21, 2015 8:00 AM
>> *To:* user@hive.apache.org
>> *Subject:* Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu
>>
>> Hi Mich, Sai and Jörn,
>>
>> Thank you very much for the information, but I think we are deviating
>> from the original question: Hive on Spark on Ubuntu. Can you please tell
>> me the configuration steps?
>>
>> On Fri, Nov 20, 2015 at 11:10 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>>> I think the most recent versions of Cloudera or Hortonworks should
>>> include all these components - try their sandboxes.
>>>
>>> On 20 Nov 2015, at 12:54, Dasun Hegoda <dasunheg...@gmail.com> wrote:
>>>
>>> Where can I get a Hadoop distribution containing these technologies? Link?
>>>
>>> On Fri, Nov 20, 2015 at 5:22 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>>>
>>>> I recommend using a Hadoop distribution containing these technologies.
>>>> I think you also get other useful tools for your scenario, such as
>>>> auditing using Sentry or Ranger.
>>>>
>>>> On 20 Nov 2015, at 10:48, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>>>>
>>>> Well,
>>>>
>>>> "I'm planning to deploy Hive on Spark but I can't find the installation
>>>> steps. I tried to read the official 'Hive on Spark' guide but it has
>>>> problems. As an example, under 'Configuring Yarn' it says
>>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>>> but does not say where I should do it. Also, as per the guide,
>>>> configurations are set in the Hive runtime shell, which is not permanent
>>>> according to my knowledge."
>>>>
>>>> You can do that in the yarn-site.xml file, which is normally under
>>>> $HADOOP_HOME/etc/hadoop.
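[Editor's note: concretely, the scheduler property Mich refers to would be added to yarn-site.xml as below; this is a sketch, and the ResourceManager needs a restart for it to take effect.]

```xml
<!-- yarn-site.xml, normally under $HADOOP_HOME/etc/hadoop/ -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```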
>>>>
>>>> HTH
>>>>
>>>> Mich Talebzadeh
>>>>
>>>> *Sybase ASE 15 Gold Medal Award 2008*
>>>> A Winning Strategy: Running the most Critical Financial Data on ASE 15
>>>> http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
>>>> Author of the book *"A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7*
>>>> Co-author of *"Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4*
>>>> *Publications due shortly:*
>>>> *Complex Event Processing in Heterogeneous Environments*, ISBN: 978-0-9563693-3-8
>>>> *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, volume one out shortly
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>> *From:* Dasun Hegoda [mailto:dasunheg...@gmail.com]
>>>> *Sent:* 20 November 2015 09:36
>>>> *To:* user@hive.apache.org
>>>> *Subject:* Hive on Spark - Hadoop 2 - Installation - Ubuntu
>>>>
>>>> Hi,
>>>>
>>>> What I'm planning to do is develop a reporting platform using existing
>>>> data. I have an existing RDBMS with a large number of records, so I'm
>>>> using:
>>>> (http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture)
>>>>
>>>> - Sqoop - extract data from the RDBMS to Hadoop
>>>> - Hadoop - storage platform -> *deployment completed*
>>>> - Hive - data warehouse
>>>> - Spark - real-time processing -> *deployment completed*
>>>>
>>>> I'm planning to deploy Hive on Spark, but I can't find the installation
>>>> steps. I tried to read the official 'Hive on Spark' guide [1], but it has
>>>> problems. As an example, under 'Configuring Yarn' it says
>>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>>> but does not say where I should do it. Also, as per the guide,
>>>> configurations are set in the Hive runtime shell, which is not permanent
>>>> according to my knowledge.
>>>>
>>>> I have also read [2], but it does not have any steps.
>>>>
>>>> Could you please provide me the steps to run Hive on Spark on Ubuntu
>>>> as a production system?
>>>>
>>>> [1]: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>>> [2]: http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark
>>>>
>>>> --
>>>> Regards,
>>>> Dasun Hegoda, Software Engineer
>>>> www.dasunhegoda.com | dasunheg...@gmail.com
--
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com
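[Editor's note: the Sqoop step in the architecture above could be sketched as the following command-line fragment. The JDBC URL, database, user and table names are hypothetical placeholders, not values from this thread.]

```shell
# Sketch: import one RDBMS table into a Hive table with Sqoop.
# Connection string, user and table names are hypothetical examples.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username report_user -P \
  --table orders \
  --hive-import \
  --hive-table default.orders \
  --num-mappers 4
```

With `--hive-import`, Sqoop copies the rows into HDFS and then creates and loads the corresponding Hive table, so the data becomes queryable from Hive (and hence from Hive on Spark) without a separate load step.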