On Mon, Aug 25, 2014 at 7:18 AM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
> When I add
>
> parts(0).collect().foreach(println)
>
> parts(1).collect().foreach(println), for printing parts, I get the following
> error
>
> not enough arguments for method collect: (pf:
> PartialFunction[Char,B])(implicit
> bf:scala.collection.generic.CanBuildFrom[String,B,That])That.
> Unspecified value parameter pf.
> parts(0).collect().foreach(println)
>>> val links = lines.map{ s =>
>>>   val parts = s.split("\\s+")
>>>   (parts(0), parts(1)) /*I want to print this "parts"*/
>>> }.distinct().groupByKey().cache()

Within this code, you are working in a plain Scala function. parts is an Array[String], and parts(0) is a String, so you can just println(parts(0)). You are not calling RDD.collect() there, but collect() on a String, which is a sequence of Char. That method expects a partial function, which is why the compiler complains about the missing parameter pf.

Note, however, that this will print the String on the worker that executes the function, not on the driver.

Maybe you want to print the result right after this map function? Then break this into two statements and print the result of the first. You are already doing that in your code. A good habit is to use take(10) rather than collect(), in case the RDD is huge.
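To make the difference concrete, here is a minimal sketch in plain Scala that runs without Spark. The names PartsDemo and parseLine and the sample input are made up for illustration, and a local Seq stands in for the RDD. The trailing comment shows how the same split into two statements might look against the lines RDD from your code; treat it as an outline, not as tested code.

```scala
object PartsDemo {
  // Hypothetical helper: same body as the function passed to lines.map
  // in the quoted code.
  def parseLine(s: String): (String, String) = {
    val parts = s.split("\\s+") // parts is an Array[String]
    (parts(0), parts(1))
  }

  def main(args: Array[String]): Unit = {
    val parts = "a b".split("\\s+")

    // parts(0) is a plain String, so it can be printed directly.
    println(parts(0))

    // collect on a String is the Scala collections method, not RDD.collect().
    // It needs a partial function over Char, which is the "pf" parameter
    // named in the compiler error.
    val upper = parts(0).collect { case c if c.isLetter => c.toUpper }
    println(upper)

    // Local stand-in for the RDD (made-up sample data).
    val lines = Seq("a b", "a c", "a b")

    // First statement: build the pairs. Second statement: print a few.
    val pairs = lines.map(parseLine)
    pairs.take(10).foreach(println)

    // With Spark, assuming `lines` is the RDD[String] from the quoted code:
    //   val pairs = lines.map(parseLine)
    //   pairs.take(10).foreach(println) // printed on the driver
    //   val links = pairs.distinct().groupByKey().cache()
  }
}
```

Because take(10) brings at most ten elements back to the driver, the output shows up where you launched the job, whereas a println inside the map function ends up in the stdout of whichever worker ran that task.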