Anyone?

On Sat, Nov 21, 2015 at 1:32 PM, Dasun Hegoda <dasunheg...@gmail.com> wrote:
> Thank you very much, but I would like to do the integration of these
> components myself rather than use a packaged distribution. I think I have
> come to the right place. Can you please tell me the configuration steps to
> run Hive on Spark?
>
> At least, could someone please elaborate on the steps in this guide:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>
> In the latter part of the guide, configurations are set in the Hive
> runtime shell, which is not permanent as far as I know.
>
> Please help me get this done. I am also planning to write a detailed guide
> with the configuration steps to run Hive on Spark, so that others can
> benefit from it and not be troubled like me.
>
> On Sat, Nov 21, 2015 at 12:28 PM, Sai Gopalakrishnan
> <sai.gopalakrish...@aspiresys.com> wrote:
>
>> Hi everyone,
>>
>> Thank you for your responses. I think Mich's suggestion is a great one
>> and I will go with it. As Alan suggested, using the compactor in Hive
>> should help with managing the delta files.
>>
>> @Dasun, pardon me for deviating from the topic. Regarding configuration,
>> you could try a packaged distribution (Hortonworks, Cloudera or MapR)
>> like Jörn Franke said. I use Hortonworks; it is open source, compatible
>> with Linux and Windows, provides detailed documentation for installation,
>> and can be installed in less than a day provided you're all set with the
>> hardware. http://hortonworks.com/hdp/downloads/
>>
>> Regards,
>>
>> Sai
>>
>> ------------------------------
>> *From:* Dasun Hegoda <dasunheg...@gmail.com>
>> *Sent:* Saturday, November 21, 2015 8:00 AM
>> *To:* user@hive.apache.org
>> *Subject:* Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu
>>
>> Hi Mich, Sai and Jörn,
>>
>> Thank you very much for the information, but I think we are deviating
>> from the original question: Hive on Spark on Ubuntu. Can you please tell
>> me the configuration steps?
>>
>> On Fri, Nov 20, 2015 at 11:10 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>>> I think the most recent versions of Cloudera or Hortonworks should
>>> include all these components - try their sandboxes.
>>>
>>> On 20 Nov 2015, at 12:54, Dasun Hegoda <dasunheg...@gmail.com> wrote:
>>>
>>> Where can I get a Hadoop distribution containing these technologies? Link?
>>>
>>> On Fri, Nov 20, 2015 at 5:22 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>>>
>>>> I recommend using a Hadoop distribution containing these technologies.
>>>> I think you also get other useful tools for your scenario, such as
>>>> auditing using Sentry or Ranger.
>>>>
>>>> On 20 Nov 2015, at 10:48, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>>>>
>>>> Well,
>>>>
>>>> "I'm planning to deploy Hive on Spark but I can't find the installation
>>>> steps. I tried to read the official 'Hive on Spark' guide but it has
>>>> problems. As an example, under 'Configuring Yarn' it says
>>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>>> but does not say where I should do it. Also, as per the guide,
>>>> configurations are set in the Hive runtime shell, which is not permanent
>>>> according to my knowledge."
>>>>
>>>> You can do that in the yarn-site.xml file, which is normally under
>>>> $HADOOP_HOME/etc/hadoop.
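[Editor's note: concretely, the scheduler property Mich refers to would be added to yarn-site.xml as below; this is a sketch, and the ResourceManager needs a restart for it to take effect.]

```xml
<!-- yarn-site.xml, normally under $HADOOP_HOME/etc/hadoop/ -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```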
>>>>
>>>> HTH
>>>>
>>>> Mich Talebzadeh
>>>>
>>>> *Sybase ASE 15 Gold Medal Award 2008*
>>>> A Winning Strategy: Running the most Critical Financial Data on ASE 15
>>>> http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
>>>> Author of the book *"A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7*
>>>> Co-author of *"Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4*
>>>> *Publications due shortly:*
>>>> *Complex Event Processing in Heterogeneous Environments*, ISBN: 978-0-9563693-3-8
>>>> *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, volume one out shortly
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>> *From:* Dasun Hegoda [mailto:dasunheg...@gmail.com]
>>>> *Sent:* 20 November 2015 09:36
>>>> *To:* user@hive.apache.org
>>>> *Subject:* Hive on Spark - Hadoop 2 - Installation - Ubuntu
>>>>
>>>> Hi,
>>>>
>>>> What I'm planning to do is develop a reporting platform using existing
>>>> data. I have an existing RDBMS with a large number of records, so I'm
>>>> using:
>>>> (http://stackoverflow.com/questions/33635234/hadoop-2-7-spark-hive-jasperreports-scoop-architecuture)
>>>>
>>>> - Sqoop - extract data from the RDBMS to Hadoop
>>>> - Hadoop - storage platform -> *deployment completed*
>>>> - Hive - data warehouse
>>>> - Spark - real-time processing -> *deployment completed*
>>>>
>>>> I'm planning to deploy Hive on Spark, but I can't find the installation
>>>> steps. I tried to read the official 'Hive on Spark' guide [1], but it has
>>>> problems. As an example, under 'Configuring Yarn' it says
>>>> `yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`
>>>> but does not say where I should do it. Also, as per the guide,
>>>> configurations are set in the Hive runtime shell, which is not permanent
>>>> according to my knowledge.
>>>>
>>>> I have also read [2], but it does not have any steps.
>>>>
>>>> Could you please provide me the steps to run Hive on Spark on Ubuntu
>>>> as a production system?
>>>>
>>>> [1]: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>>> [2]: http://stackoverflow.com/questions/26018306/how-to-configure-hive-to-use-spark
>>>>
>>>> --
>>>> Regards,
>>>> Dasun Hegoda, Software Engineer
>>>> www.dasunhegoda.com | dasunheg...@gmail.com
--
Regards,
Dasun Hegoda, Software Engineer
www.dasunhegoda.com | dasunheg...@gmail.com
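[Editor's note: the Sqoop step in the architecture above could be sketched as the following command-line fragment. The JDBC URL, database, user and table names are hypothetical placeholders, not values from this thread.]

```shell
# Sketch: import one RDBMS table into a Hive table with Sqoop.
# Connection string, user and table names are hypothetical examples.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username report_user -P \
  --table orders \
  --hive-import \
  --hive-table default.orders \
  --num-mappers 4
```

With `--hive-import`, Sqoop copies the rows into HDFS and then creates and loads the corresponding Hive table, so the data becomes queryable from Hive (and hence from Hive on Spark) without a separate load step.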