On Mon, Aug 25, 2014 at 7:18 AM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
> When I add
>
> parts(0).collect().foreach(println)
>
> parts(1).collect().foreach(println), for printing parts, I get the following
> error
>
> not enough arguments for method collect: (pf:
> PartialFunction[Char,B])(implicit bf:
> scala.collection.generic.CanBuildFrom[String,B,That])That.
> Unspecified value parameter pf.
> parts(0).collect().foreach(println)

>>>     val links = lines.map{ s =>
>>>       val parts = s.split("\\s+")
>>>       (parts(0), parts(1))  /*I want to print this "parts"*/
>>>     }.distinct().groupByKey().cache()


Within this code, you are working inside an ordinary Scala function.
parts is an Array[String], and parts(0) is a String, so you can just
println(parts(0)). You are not calling RDD.collect() there, but
collect() on a String, which is a sequence of Char. That collect()
requires a partial function argument, hence the compiler error.
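
To make the distinction concrete, here is a small sketch in plain Scala
(no Spark involved). The object name, the helper names and the sample
line "page1 page2" are made up for illustration; only the split and the
tuple come from your code.

```scala
object PartsDemo {
  // Same split as in the map function: parts is an Array[String].
  def firstTwo(line: String): (String, String) = {
    val parts: Array[String] = line.split("\\s+")
    (parts(0), parts(1))
  }

  // This is the collect() the compiler was talking about: the Scala
  // collections method on a String, which needs a partial function
  // over Char. Calling it with no arguments does not compile.
  def digitsOf(s: String): String = s.collect { case c if c.isDigit => c }

  def main(args: Array[String]): Unit = {
    val (src, dst) = firstTwo("page1 page2") // hypothetical input line
    println(src)            // a plain String, so println is all you need
    println(dst)
    println(digitsOf(src))  // keeps only the digit characters of src
  }
}
```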

However, note that this will print the String on the worker that
executes the task, not on the driver, so the output ends up in the
executor's stdout rather than in your console.

Maybe you want to print the result right after this map function? Then
break this into two statements and print the result of the first. You
are already doing that elsewhere in your code. A safer idiom is
"take(10)" rather than "collect()", since collect() brings the whole
RDD back to the driver and the RDD may be huge.
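
A rough sketch of that two-statement version follows. The local master,
the app name, the object name and the in-memory sample lines are
assumptions for illustration; in your program "lines" would come from
your own input instead.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Brings the pair-RDD implicits into scope on older Spark 1.x releases;
// harmless on newer ones.
import org.apache.spark.SparkContext._

object PrintPairs {
  def main(args: Array[String]): Unit = {
    // Assumed setup: local mode with a tiny hard-coded data set.
    val conf = new SparkConf().setAppName("PrintPairs").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val lines = sc.parallelize(Seq("a b", "a c", "b c", "a b"))

      // Statement 1: build the pairs RDD on its own.
      val pairs = lines.map { s =>
        val parts = s.split("\\s+")
        (parts(0), parts(1))
      }

      // take(10) returns at most 10 elements to the driver as a local
      // Array, so this println runs on the driver.
      pairs.take(10).foreach(println)

      // Statement 2: continue with the rest of the original pipeline.
      val links = pairs.distinct().groupByKey().cache()
      links.take(10).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Here take() is an RDD method, so it is the Spark call you were after, and
it caps how much data is shipped back to the driver.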

