Re: Scala examples for Spark do not work as written in documentation

2014-06-20 Thread Will Benton
Hey, sorry to reanimate this thread, but just a quick question:  why do the 
examples (on http://spark.apache.org/examples.html) use `spark` for the 
SparkContext reference?  This is minor, but it seems like it could be a little 
confusing for people who want to run them in the shell and need to change 
`spark` to `sc`.  (I noticed because this was a speedbump for a colleague who 
is trying out Spark.)


thanks,
wb

- Original Message -
 From: Andy Konwinski andykonwin...@gmail.com
 To: dev@spark.apache.org
 Sent: Tuesday, May 20, 2014 4:06:33 PM
 Subject: Re: Scala examples for Spark do not work as written in documentation
 
 I fixed the bug, but I kept the parameter `i` instead of `_` since that (1)
 keeps it more parallel to the Python and Java versions, which also use
 functions with a named variable, and (2) doesn't require readers to know
 this particular use of the `_` syntax in Scala.
 
 Thanks for catching this Glenn.
 
 Andy
 
 


Re: Scala examples for Spark do not work as written in documentation

2014-06-20 Thread Patrick Wendell
Those are pretty old, but I think the reason Matei did that was to
make it less confusing for brand new users. `spark` is actually a
valid identifier because it's just a variable name (`val spark = new
SparkContext()`), but I agree this could be confusing for users who want
to drop into the shell.
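To illustrate the point (with a made-up `FakeSparkContext` standing in for the real class, so the snippet runs without Spark on the classpath): `spark` on the examples page and `sc` in the shell are just val names bound to the same kind of object.

```scala
// Illustration only: FakeSparkContext is a hypothetical stand-in for
// SparkContext, so this runs as a plain Scala script. The point is that
// `spark` in the website examples is an ordinary val name, while
// spark-shell predefines the same object under the name `sc`.
class FakeSparkContext {
  def parallelize(r: Range): List[Int] = r.toList
}

val spark = new FakeSparkContext          // as written on the examples page
val sc = spark                            // the binding spark-shell provides
val count = sc.parallelize(1 to 100).map(_ => 1).reduce(_ + _)
println(count)  // 100
```

Either name works; only the shell's predefined binding is fixed as `sc`.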

On Fri, Jun 20, 2014 at 12:04 PM, Will Benton wi...@redhat.com wrote:
 Hey, sorry to reanimate this thread, but just a quick question:  why do the 
 examples (on http://spark.apache.org/examples.html) use `spark` for the 
 SparkContext reference?  This is minor, but it seems like it could be a 
 little confusing for people who want to run them in the shell and need to 
 change `spark` to `sc`.  (I noticed because this was a speedbump for a 
 colleague who is trying out Spark.)


 thanks,
 wb



Re: Scala examples for Spark do not work as written in documentation

2014-05-20 Thread Andy Konwinski
I fixed the bug, but I kept the parameter `i` instead of `_` since that (1)
keeps it more parallel to the Python and Java versions, which also use
functions with a named variable, and (2) doesn't require readers to know
this particular use of the `_` syntax in Scala.

Thanks for catching this Glenn.

Andy
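For readers following along, the two parameter styles behave identically when the parameter is unused in the body; a minimal plain-Scala sketch (ordinary collections stand in for an RDD here, no Spark needed):

```scala
// The named-parameter form kept on the site and the underscore shorthand
// produce the same result, since the parameter is never referenced.
val withName  = (1 to 100).map { i => 1 }.sum   // named parameter, as documented
val withBlank = (1 to 100).map { _ => 1 }.sum   // `_` = ignored one-use parameter
println(withName == withBlank)  // true
```

So the choice is purely about readability for newcomers, as described above.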


On Fri, May 16, 2014 at 12:38 PM, Mark Hamstra m...@clearstorydata.com wrote:

 Sorry, looks like an extra line got inserted in there.  One more try:

 val count = spark.parallelize(1 to NUM_SAMPLES).map { _ =>
   val x = Math.random()
   val y = Math.random()
   if (x*x + y*y < 1) 1 else 0
 }.reduce(_ + _)



 On Fri, May 16, 2014 at 12:36 PM, Mark Hamstra m...@clearstorydata.com
 wrote:

  Actually, the better way to write the multi-line closure would be:
 
  val count = spark.parallelize(1 to NUM_SAMPLES).map { _ =>
 
    val x = Math.random()
    val y = Math.random()
    if (x*x + y*y < 1) 1 else 0
  }.reduce(_ + _)
 
 
  On Fri, May 16, 2014 at 9:41 AM, GlennStrycker glenn.stryc...@gmail.com
 wrote:
 
  On the webpage http://spark.apache.org/examples.html, there is an
 example
  written as
 
  val count = spark.parallelize(1 to NUM_SAMPLES).map(i =>
    val x = Math.random()
    val y = Math.random()
    if (x*x + y*y < 1) 1 else 0
  ).reduce(_ + _)
  println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)
 
  This does not execute in Spark, which gives me an error:
  <console>:2: error: illegal start of simple expression
   val x = Math.random()
   ^
 
  If I rewrite the query slightly, adding in {}, it works:
 
  val count = spark.parallelize(1 to 1).map(i =>
 {
 val x = Math.random()
 val y = Math.random()
 if (x*x + y*y < 1) 1 else 0
 }
  ).reduce(_ + _)
  println("Pi is roughly " + 4.0 * count / 1.0)
 
 
 
 
 
  --
  View this message in context:
 
 http://apache-spark-developers-list.1001551.n3.nabble.com/Scala-examples-for-Spark-do-not-work-as-written-in-documentation-tp6593.html
  Sent from the Apache Spark Developers List mailing list archive at
  Nabble.com.
 
 
 



Re: Scala examples for Spark do not work as written in documentation

2014-05-16 Thread Reynold Xin
Thanks for pointing it out. We should update the website to fix the code.

val count = spark.parallelize(1 to NUM_SAMPLES).map { i =>
  val x = Math.random()
  val y = Math.random()
  if (x*x + y*y < 1) 1 else 0
}.reduce(_ + _)
println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)



On Fri, May 16, 2014 at 9:41 AM, GlennStrycker glenn.stryc...@gmail.com wrote:

 On the webpage http://spark.apache.org/examples.html, there is an example
 written as

 val count = spark.parallelize(1 to NUM_SAMPLES).map(i =>
   val x = Math.random()
   val y = Math.random()
   if (x*x + y*y < 1) 1 else 0
 ).reduce(_ + _)
 println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)

 This does not execute in Spark, which gives me an error:
 <console>:2: error: illegal start of simple expression
  val x = Math.random()
  ^

 If I rewrite the query slightly, adding in {}, it works:

 val count = spark.parallelize(1 to 1).map(i =>
{
val x = Math.random()
val y = Math.random()
if (x*x + y*y < 1) 1 else 0
}
 ).reduce(_ + _)
 println("Pi is roughly " + 4.0 * count / 1.0)





 --
 View this message in context:
 http://apache-spark-developers-list.1001551.n3.nabble.com/Scala-examples-for-Spark-do-not-work-as-written-in-documentation-tp6593.html
 Sent from the Apache Spark Developers List mailing list archive at
 Nabble.com.
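For reference, the Scala rule behind the `illegal start of simple expression` error in the quoted snippet: a function literal written inside plain parentheses admits only a single expression, so a multi-statement body needs a brace-delimited block, either around the whole literal or inside the parentheses. A minimal sketch on plain collections:

```scala
// Parentheses: only a single expression is allowed in the body.
val single = (1 to 3).map(i => i * 2)

// Braces around the whole literal: multi-statement body is fine.
val braced = (1 to 3).map { i =>
  val x = i * 2
  x + 1
}

// Equivalent: an explicit block inside the parentheses, as in Glenn's fix.
val nested = (1 to 3).map(i => {
  val x = i * 2
  x + 1
})
println(braced == nested)  // true
```

Both brace forms compile; only the parenthesized multi-statement version from the original examples page does not.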



Re: Scala examples for Spark do not work as written in documentation

2014-05-16 Thread Mark Hamstra
Sorry, looks like an extra line got inserted in there.  One more try:

val count = spark.parallelize(1 to NUM_SAMPLES).map { _ =>
  val x = Math.random()
  val y = Math.random()
  if (x*x + y*y < 1) 1 else 0
}.reduce(_ + _)



On Fri, May 16, 2014 at 12:36 PM, Mark Hamstra m...@clearstorydata.com wrote:

 Actually, the better way to write the multi-line closure would be:

 val count = spark.parallelize(1 to NUM_SAMPLES).map { _ =>

   val x = Math.random()
   val y = Math.random()
   if (x*x + y*y < 1) 1 else 0
 }.reduce(_ + _)


 On Fri, May 16, 2014 at 9:41 AM, GlennStrycker 
 glenn.stryc...@gmail.com wrote:

 On the webpage http://spark.apache.org/examples.html, there is an example
 written as

 val count = spark.parallelize(1 to NUM_SAMPLES).map(i =>
   val x = Math.random()
   val y = Math.random()
   if (x*x + y*y < 1) 1 else 0
 ).reduce(_ + _)
 println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)

 This does not execute in Spark, which gives me an error:
 <console>:2: error: illegal start of simple expression
  val x = Math.random()
  ^

 If I rewrite the query slightly, adding in {}, it works:

 val count = spark.parallelize(1 to 1).map(i =>
{
val x = Math.random()
val y = Math.random()
if (x*x + y*y < 1) 1 else 0
}
 ).reduce(_ + _)
 println("Pi is roughly " + 4.0 * count / 1.0)





 --
 View this message in context:
 http://apache-spark-developers-list.1001551.n3.nabble.com/Scala-examples-for-Spark-do-not-work-as-written-in-documentation-tp6593.html
 Sent from the Apache Spark Developers List mailing list archive at
 Nabble.com.





Re: Scala examples for Spark do not work as written in documentation

2014-05-16 Thread GlennStrycker
Why does the reduce function only work on sums of keys of the same type and
does not support other functional forms?

I am having trouble in another example where, instead of 1s and 0s, the
output of the map function is something like A=(1,2) and B=(3,4).  I need a
reduce function that can return something complicated based on reduce((A,B)
=> (arbitrary fcn1 of A and B, arbitrary fcn2 of A and B)), but I am only
getting reduce((A,B) => (arbitrary fcn1 of A, arbitrary fcn2 of A)).

See
http://apache-spark-developers-list.1001551.n3.nabble.com/reduce-only-removes-duplicates-cannot-be-arbitrary-function-td6606.html




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Scala-examples-for-Spark-do-not-work-as-written-in-documentation-tp6593p6607.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
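For what it's worth, `reduce` on both plain Scala collections and RDDs does accept an arbitrary function of both arguments, as long as the function is associative (and, for Spark's distributed `reduce`, also commutative); each component of the result can combine both inputs. A plain-Scala sketch combining pairs componentwise:

```scala
// reduce with an arbitrary two-argument function on pairs: the first
// components are summed while the second components are multiplied.
// The same closure signature works with RDD.reduce, provided the
// function is associative and commutative.
val pairs = List((1, 2), (3, 4), (5, 6))
val combined = pairs.reduce((a, b) => (a._1 + b._1, a._2 * b._2))
println(combined)  // (9,48)
```

If a reduce appears to use only one argument, the closure is likely not referencing the second parameter in each component.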