Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-03-01 Thread Robin East
Mohammed and I both obviously have a certain bias here but I have to agree with 
him - the documentation is pretty good but other sources are necessary to 
supplement. (Good) books are a curated source of information that can short-cut 
a lot of the learning. 
---
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action 








RE: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-03-01 Thread Mohammed Guller
I agree that the Spark official documentation is pretty good. However, a book 
also serves a useful purpose. It provides a structured roadmap for learning a 
new technology. Everything is nicely organized for the reader. For somebody who 
has just started learning Spark, the amount of material on the Internet can be 
overwhelming. There are a ton of blogs and presentations on the Internet. A 
beginner could easily spend months reading them and still be lost. If you are 
experienced, it is easy to figure out what to read and what to skip.

I also agree that a book becomes outdated at some point, but not right away. 
For example, a book covering DataFrames and Spark ML is not outdated yet.

Mohammed
Author: Big Data Analytics with Spark



Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-29 Thread charles li
Since Spark is under active development, any book you use to learn it will
be somewhat outdated.

I would suggest learning it from several sources, as below:


   - the official Spark documentation; trust me, you will go through it
   several times if you want to learn Spark well: http://spark.apache.org/
   - Spark Summit, lots of high-quality videos and slides:
   https://spark-summit.org/
   - the Databricks blog: https://databricks.com/blog
   - attend a Spark meetup: http://www.meetup.com/
   - try third-party Spark packages if needed and convenient:
   http://spark-packages.org/
   - I have also just started to blog my Spark learning notes:
   http://litaotao.github.io


In short, I think the best way to learn it is the official *documentation +
Databricks blog + others' blogs ===>>> your own blog [tutorials by you, or
just notes on your learning]*



-- 
*--*
a spark lover, a quant, a developer and a good man.

http://github.com/litaotao


Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-29 Thread Ashok Kumar
Thank you all for the valuable advice. Much appreciated.
Best 


  

RE: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Mohammed Guller
Hi Ashok,

Another book recommendation (I am the author): “Big Data Analytics with Spark”

The first half of the book is specifically written for people just getting 
started with Big Data and Spark.

Mohammed
Author: Big Data Analytics with Spark




Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Suhaas Lang
Thanks, Jules!


Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Jules Damji
Suhaas,

When I referred to interactive shells, I was referring to the Scala & Python 
interactive language shells. Both Python & Scala come with their respective 
interactive shells. Just typing “python” or “scala” (assuming the installation 
bin directory is in your $PATH) will fire up the shell.

As for “pyspark” and “spark-shell”, they both come with the Spark 
installation and are in the $spark_install_dir/bin directory.
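
For readers who want to check where these shells live on their own machine, here is a rough sketch in Python. It only builds the expected launcher paths under an install directory and looks each shell name up on $PATH. The helper names `launcher_paths` and `shells_on_path` and the example directory "/opt/spark" are made up for illustration, and POSIX-style paths are assumed.

```python
# Sketch: build the expected paths of the Spark launchers and check $PATH.
# The function names and "/opt/spark" are hypothetical, for illustration only.
import shutil
from pathlib import PurePosixPath


def launcher_paths(spark_install_dir):
    """Return the expected locations of pyspark and spark-shell."""
    bin_dir = PurePosixPath(spark_install_dir) / "bin"
    return {name: str(bin_dir / name) for name in ("pyspark", "spark-shell")}


def shells_on_path(names=("python", "scala", "pyspark", "spark-shell")):
    """Map each shell name to its resolved location, or None if not on $PATH."""
    return {name: shutil.which(name) for name in names}


paths = launcher_paths("/opt/spark")  # substitute your own install directory
found = shells_on_path()              # entries are None when a shell is missing
```

If a name maps to None in `found`, the corresponding bin directory is probably not on $PATH yet, and the shell can still be started by its full path from `paths`.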

Have a go at them. Best way to learn the language.

Cheers
Jules

--
“Language is the palate from which we draw all colors of our life.”
Jules Damji
dmat...@comcast.net








Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Suhaas Lang
Jules,

Could you please post links to these interactive shells for Python and
Scala?


Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Chris Fregly
For hands-on practice, check out the end-to-end reference data pipeline,
available from either the GitHub or Docker repos described here:

http://advancedspark.com/

I use these assets to train folks at all levels of Spark knowledge.

There are also some relevant videos and SlideShare presentations, though they
might be a bit advanced for Spark n00bs.




-- 

*Chris Fregly*
Principal Data Solutions Engineer
IBM Spark Technology Center, San Francisco, CA
http://spark.tc | http://advancedspark.com


Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Mich Talebzadeh
In my opinion, the best way to learn something is to try it on the spot.

As suggested, if you have Hadoop, Hive and Spark installed and you are OK
with SQL, then you will pretty much just have to focus on Scala and Spark.

Your best bet is interactive work in the Spark shell with Scala, getting to
understand RDDs, DataFrames, transformations and actions. You also have
online docs and a great number of users in this forum who can potentially
help you with your questions. Buying books can help, but nothing takes the
place of getting your hands dirty, so to speak.
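
As one possible first exercise of that kind, the classic word count can be tried on a tiny, made-up data set. The sketch below uses only plain Python, so it runs without Spark. The comments name the RDD operations (flatMap, map, reduceByKey) that each step is meant to mirror, but the code itself is an analogy and not the Spark API.

```python
# Plain-Python analogy of the word-count pipeline (not Spark code).
from collections import Counter
from itertools import chain

# A contrived small data set: two short lines of text.
lines = ["spark makes big data simple", "big data needs spark"]

# flatMap analogue: each line yields many words, flattened into one stream.
words = chain.from_iterable(line.split() for line in lines)

# map analogue: turn every word into a (word, 1) pair.
pairs = ((word, 1) for word in words)

# reduceByKey analogue: add up the 1s for each distinct word.
counts = Counter()
for word, n in pairs:
    counts[word] += n
```

Typing the same steps into the Spark shell against an RDD, and comparing the results with this version, is a quick way to check your understanding of each operation.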

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com





Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Jules Damji
Hello Ashok,

"Learning Spark," from O'Reilly, is certainly a good start, and all basic video 
tutorials from Spark Summit Training, "Spark Essentials", are excellent 
supplementary materials.

And the best (and most effective) way to teach yourself is really firing up the 
spark-shell or pyspark and doing it yourself—immersing yourself by trying all 
basic transformations and actions on RDDs, with contrived small data sets.
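
To make the transformation/action distinction concrete before opening a real shell, here is a toy sketch in plain Python. `MiniRDD` is a hypothetical stand-in written for this example and is not part of PySpark. It only borrows the method names map, filter, collect, count and reduce to show that transformations are lazy and that actions trigger the work.

```python
# Toy stand-in for an RDD, in plain Python so it runs without Spark.
# MiniRDD is hypothetical and for illustration only; it is NOT PySpark.
from functools import reduce


class MiniRDD:
    def __init__(self, source):
        # `source` is a zero-argument callable returning a fresh iterator.
        self._source = source

    @classmethod
    def parallelize(cls, data):
        items = list(data)
        return cls(lambda: iter(items))

    # Transformations: lazy, each one only returns a new MiniRDD.
    def map(self, f):
        return MiniRDD(lambda: (f(x) for x in self._source()))

    def filter(self, pred):
        return MiniRDD(lambda: (x for x in self._source() if pred(x)))

    # Actions: eager, each one runs the whole pipeline and returns a value.
    def collect(self):
        return list(self._source())

    def count(self):
        return sum(1 for _ in self._source())

    def reduce(self, f):
        return reduce(f, self._source())


calls = []  # records every time the map function actually runs


def square(x):
    calls.append(x)
    return x * x


numbers = MiniRDD.parallelize(range(1, 6))
even_squares = numbers.map(square).filter(lambda x: x % 2 == 0)

calls_before_action = len(calls)  # still 0: nothing has been computed yet
result = even_squares.collect()   # the action runs the pipeline
total = even_squares.reduce(lambda a, b: a + b)  # runs the pipeline again
```

In pyspark, the corresponding chain would start from `sc.parallelize(range(1, 6))` and use the same method names. Note that each action here re-runs the pipeline from the start, which is roughly how an uncached RDD behaves as well.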

I've found that learning Scala & Python through their interactive shells, 
where feedback is immediate and response is quick, is the best learning 
experience. 

The same is true for Scala or Python notebooks interacting with Spark, running 
in local or cluster mode. 

Cheers,

Jules 

Sent from my iPhone
Pardon the dumb thumb typos :)



Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Nicos
I agree with the suggestion to start with "Learning Spark" to strengthen your 
knowledge of Spark fundamentals.

"Advanced Analytics with Spark" has good practical reinforcement of what you 
learn from the previous book. Though it is a bit advanced, in my opinion some 
practical/real applications are better covered in this book.

For DataFrames and other topics, the online Apache Spark documentation is still 
the best source.

Keep in mind that Spark and its different subsystems are constantly evolving. 
Publications will always be somewhat outdated, but the key fundamental 
concepts will not.

Cheers,
- Nicos
+++ 


> On Feb 28, 2016, at 1:53 PM, Michał Zieliński  
> wrote:
> 
> Most of the books are outdated (don't include DataFrames or Spark ML and 
> focus on RDDs and MLlib). The one I particularly liked is "Learning Spark". 
> It starts from the basics, but has lots of useful tips on caching, 
> serialization etc.
> 
> The online docs are also of great quality.
> 
>> On 28 February 2016 at 21:48, Ashok Kumar  
>> wrote:
>>   Hi Gurus,
>> 
>> I would appreciate it if you could recommend a good book on Spark, or 
>> documentation, for beginner to moderate knowledge.
>> 
>> I would very much like to build my skills in transformation and action methods.
>> 
>> FYI, I have already looked at examples on the net. However, some of them are 
>> not clear, at least to me.
>> 
>> Warmest regards
> 


Re: Recommendation for a good book on Spark, beginner to moderate knowledge

2016-02-28 Thread Ted Yu
http://www.amazon.com/Scala-Spark-Alexy-Khrabrov/dp/1491929286/

There is another one from Wiley (to be published on March 21):

"Spark: Big Data Cluster Computing in Production," written by Ilya Ganelin,
Brennon York, Kai Sasaki, and Ema Orhian

On Sun, Feb 28, 2016 at 1:48 PM, Ashok Kumar 
wrote:

>   Hi Gurus,
>
> I would appreciate it if you could recommend a good book on Spark, or
> documentation, for beginner to moderate knowledge.
>
> I would very much like to build my skills in transformation and action methods.
>
> FYI, I have already looked at examples on the net. However, some of them are
> not clear, at least to me.
>
> Warmest regards
>