Re: Learning Spark

2019-07-05 Thread Alex A. Reda
Hello, I also second Gourav's point regarding the book "Spark: The Definitive Guide". It is great for learning both Scala- and Python-based Spark. But as others mentioned, you will need to keep reading the documentation, as Spark is still undergoing a lot of improvements. I list additional

Re: Learning Spark

2019-07-05 Thread Gourav Sengupta
Okay, this is all something I would disagree with. Dr. Matei Zaharia created Spark. Then he and Bill Chambers recently wrote a book on Spark. He is still the main thinking power behind Spark (look at his research at Stanford). The name of the book is "Spark: The Definitive Guide"; it's the best

Re: Learning Spark

2019-07-05 Thread Vikas Garg
Thanks!!! On Fri, 5 Jul 2019 at 15:38, Chris Teoh wrote: > Scala is better suited to data engineering work. It also has better > integration with other components like HBase, Kafka, etc. > > Python is great for data scientists as there are more data science > libraries available in Python. > >

Re: Learning Spark

2019-07-05 Thread Chris Teoh
Scala is better suited to data engineering work. It also has better integration with other components like HBase, Kafka, etc. Python is great for data scientists as there are more data science libraries available in Python. On Fri., 5 Jul. 2019, 7:40 pm Vikas Garg, wrote: > Is there any

Re: Learning Spark

2019-07-05 Thread Vikas Garg
Is there any disadvantage of using Python? I have gone through multiple articles which say that Python has advantages over Scala. Scala is super fast in comparison but Python has more pre-built libraries and options for analytics. Still, should I go with Scala? On Fri, 5 Jul 2019 at 13:07, Kurt

Re: Learning Spark

2019-07-05 Thread Kurt Fehlhauer
Since you are a data engineer I would start by learning Scala. The parts of Scala you would need to learn are pretty basic. Start with the examples on the Spark website, which gives examples in multiple languages. Think of Scala as a typed version of Python. You will find that the error messages

Re: Learning Spark

2019-07-04 Thread Vikas Garg
I am currently working as a data engineer, and I work with Power BI and SSIS (an ETL tool). For learning purposes, I have set up PySpark and am also able to run queries through Spark on a multi-node cluster DB (I am using Vertica DB and will later move to HDFS or SQL Server). I have good knowledge

Re: Learning Spark

2019-07-04 Thread ayan guha
My best advice is to go through the docs and listen to lots of demos/videos from Spark committers. On Fri, 5 Jul 2019 at 3:03 pm, Kurt Fehlhauer wrote: > Are you a data scientist or data engineer? > > > On Thu, Jul 4, 2019 at 10:34 PM Vikas Garg wrote: > >> Hi, >> >> I am new Spark learner. Can

Re: Learning Spark

2019-07-04 Thread Kurt Fehlhauer
Are you a data scientist or data engineer? On Thu, Jul 4, 2019 at 10:34 PM Vikas Garg wrote: > Hi, > > I am new Spark learner. Can someone guide me with the strategy towards > getting expertise in PySpark. > > Thanks!!! >

Re: learning Spark

2017-12-05 Thread makoto
This gitbook explains Spark components in detail. 'Mastering Apache Spark 2' https://www.gitbook.com/book/jaceklaskowski/mastering-apache-spark/details 2017-12-04 12:48 GMT+09:00 Manuel Sopena Ballesteros < manuel...@garvan.org.au>: > Dear Spark community, > > > > Is there any resource

Re: learning Spark

2017-12-05 Thread Jean Georges Perrin
When you pick a book, make sure it covers the version of Spark you want to deploy. There are a lot of books out there that focus heavily on Spark 1.x. Spark 2.x generalizes the DataFrame API, introduces Tungsten, etc. Not all of it may be relevant to a pure “sys admin” learning, but it is good to

Re: learning Spark

2017-12-04 Thread Elior Malul
Also, our community is responsive on Stack Overflow, and I will be happy to help whenever I can. > On Dec 5, 2017, at 9:14 AM, yohann jardin wrote: > > Plenty of documentation is available on Spark website itself: >

Re: learning Spark

2017-12-04 Thread yohann jardin
Plenty of documentation is available on the Spark website itself: http://spark.apache.org/docs/latest/#where-to-go-from-here You’ll find deployment guides, tuning, etc. Yohann Jardin On 05-Dec-17 at 1:38 AM, Somasundaram Sekar wrote: Learning Spark - O'Reilly publication as a starter and official

Re: learning Spark

2017-12-04 Thread Somasundaram Sekar
Learning Spark - O'Reilly publication as a starter and official doc On 4 Dec 2017 9:19 am, "Manuel Sopena Ballesteros" wrote: > Dear Spark community, > > > > Is there any resource (books, online course, etc.) available that you know > of to learn about spark? I am

Re: Learning Spark

2015-04-06 Thread Akhil Das
We had a few sessions at Sigmoid; you could go through the meetup page for details: http://www.meetup.com/Real-Time-Data-Processing-and-Cloud-Computing/ On 6 Apr 2015 18:01, Abhideep Chakravarty abhideep.chakrava...@mindtree.com wrote: Hi all, We are here planning to setup a Spark learning

Re: Learning Spark

2015-04-06 Thread Ted Yu
bq. I need to know on what all databases You can access HBase using Spark. Cheers On Mon, Apr 6, 2015 at 5:59 AM, Akhil Das ak...@sigmoidanalytics.com wrote: We had few sessions at Sigmoid, you could go through the meetup page for details: