Hello,
I also second Gourav's point regarding the book "Spark: The Definitive Guide".
It is great for learning both Scala-based and Python-based Spark. But as others
mentioned, you will need to read the documentation continuously, as Spark is
still undergoing a lot of improvements. I list additional resourc
Okay, this is all something I would disagree with.
Dr. Matei Zaharia created Spark.
Then he and Bill Chambers recently wrote a book on Spark.
He is still the main thinking power behind Spark (look at his research at
Stanford).
The name of the book is "Spark: The Definitive Guide"; it's the best ev
Thanks!!!
On Fri, 5 Jul 2019 at 15:38, Chris Teoh wrote:
Scala is better suited to data engineering work. It also has better
integration with other components like HBase, Kafka, etc.
Python is great for data scientists as there are more data science
libraries available in Python.
On Fri., 5 Jul. 2019, 7:40 pm Vikas Garg, wrote:
Is there any disadvantage of using Python? I have gone through multiple
articles which say that Python has advantages over Scala.
Scala is super fast in comparison but Python has more pre-built libraries
and options for analytics.
Still should I go with Scala?
On Fri, 5 Jul 2019 at 13:07, Kurt
Since you are a data engineer I would start by learning Scala. The parts of
Scala you would need to learn are pretty basic. Start with the examples on
the Spark website, which are given in multiple languages. Think of
Scala as a typed version of Python. You will find that the error messages
te
I am currently working as a data engineer, working with Power BI and
SSIS (an ETL tool). For learning purposes, I have set up PySpark and
am also able to run queries through Spark on a multi-node cluster DB (I am using
Vertica DB and will later move to HDFS or SQL Server).
I have good knowledge
My best advice is to go through the docs and listen to lots of demos/videos
from Spark committers.
On Fri, 5 Jul 2019 at 3:03 pm, Kurt Fehlhauer wrote:
Are you a data scientist or data engineer?
On Thu, Jul 4, 2019 at 10:34 PM Vikas Garg wrote:
Hi,
I am a new Spark learner. Can someone guide me on a strategy for
gaining expertise in PySpark?
Thanks!!!
This gitbook explains Spark components in detail.
'Mastering Apache Spark 2'
https://www.gitbook.com/book/jaceklaskowski/mastering-apache-spark/details
2017-12-04 12:48 GMT+09:00 Manuel Sopena Ballesteros <
manuel...@garvan.org.au>:
> Dear Spark community,
>
>
>
> Is there any resource (book
When you pick a book, make sure it covers the version of Spark you want to
deploy. There are a lot of books out there that focus heavily on Spark 1.x. Spark
2.x generalizes the DataFrame API, introduces Tungsten, etc. Not all of it may be
relevant to pure “sys admin” learning, but it is good to know
Plenty of documentation is available on the Spark website itself:
http://spark.apache.org/docs/latest/#where-to-go-from-here
You’ll find deployment guides, tuning, etc.
Yohann Jardin
On 05-Dec-17 at 1:38 AM, Somasundaram Sekar wrote:
Learning Spark - O'Reilly publication as a starter and official doc
On 4 Dec 2017 9:19 am, "Manuel Sopena Ballesteros"
wrote:
Dear Spark community,
Is there any resource (books, online courses, etc.) available that you know of
to learn about Spark? I am interested in the sys admin side of it, like the
different parts inside Spark, how Spark works internally, best ways to
install/deploy/monitor and how to get best perfo
Hi,
There is a lot to cover here, depending on the business and your
needs.
Do you mean:
1. Hardware spec for the Spark master and nodes
2. The number of nodes, and how to scale the nodes
3. Where to set up Spark nodes, on the same hardware nodes as HDFS
(assuming using Hadoop) or o
Please suggest some good resources to learn Spark administration.
Hi,
As I'm a beginner in Spark, I'm looking for someone who's also a beginner to
learn and train on Spark together.
Please contact me if interested
Cordially,
.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Install-via-directions-in-Learning-Spark-Exception-when-running-bin-pyspark-tp25043p25049.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
doing that.
Robin
-
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action
Greetings all,
Excited to be learning Spark. I am working through the "Learning Spark"
book and I am having trouble getting Spark installed and running.
This is what I have done so far.
I installed Spark from here:
http://spark.apache.org/downloads.html
selecting 1.5.1, pr
bq. I need to know on what all databases
You can access HBase using Spark.
Cheers
On Mon, Apr 6, 2015 at 5:59 AM, Akhil Das
wrote:
We had a few sessions at Sigmoid; you could go through the meetup page for
details:
http://www.meetup.com/Real-Time-Data-Processing-and-Cloud-Computing/
On 6 Apr 2015 18:01, "Abhideep Chakravarty" <
abhideep.chakrava...@mindtree.com> wrote:
Hi all,
We are planning to set up a Spark learning session series. I need all of
your input to create a TOC for this program, i.e. what to cover if we
start from the basics, and how far we should go to cover all aspects of
Spark in detail.
Also, I need to know on what all dat
(This mailing list concerns Spark itself rather than the book about
Spark. Your question is about building code that isn't part of Spark,
so the right place to ask is
https://github.com/databricks/learning-spark
You have a typo in "pachage" but I assume that's just your typo
Hi,
I am trying to build this project
https://github.com/databricks/learning-spark with mvn package. This should
work out of the box but unfortunately it doesn't. In fact, I get the
following error:
mvn pachage -X
> Apache Maven 3.0.5
> Maven home: /usr/share/maven
> Java ver