Try it this way:

scala> val input1 = sc.textFile("/test7").map(line =>
  line.split(",").map(_.trim))
scala> val input2 = sc.textFile("/test8").map(line =>
  line.split(",").map(_.trim))
// join is only defined on an RDD[(K, V)] (a 2-tuple), so put the key in the
// first slot and move the remaining fields into the value:
scala> val input11 = input1.map(x => (x(0) + x(1), (x(2), x(3))))
scala> val input22 = input2.map(x => (x(0) + x(1), (x(2), x(3))))

scala> input11.join(input22).take(10)


PairRDDFunctions requires an RDD[(K, V)], i.e. a 2-tuple, but in your case you
have an RDD[((String, String), String, String)], a 3-tuple, so join is not
available on it. You can also look at keyBy if you don't want to concatenate
your keys.
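
If you'd rather go the keyBy route, here is a minimal sketch (keyed1/keyed2
are just illustrative names): keyBy builds the (key, value) pair for you,
keeping the whole split record as the value.

scala> val keyed1 = input1.keyBy(x => (x(0), x(1)))  // RDD[((String, String), Array[String])]
scala> val keyed2 = input2.keyBy(x => (x(0), x(1)))
scala> keyed1.join(keyed2).take(10)

Since keyBy returns an RDD[(K, V)], join should work directly with the
SparkContext._ implicits you are already importing on 1.2.x.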

Thanks
Best Regards

On Tue, Jun 9, 2015 at 10:14 AM, amit tewari <amittewar...@gmail.com> wrote:

> Hi Dear Spark Users
>
> I am very new to Spark/Scala.
>
> Am using Datastax (4.7/Spark 1.2.1) and struggling with following
> error/issue.
>
> Already tried options like import org.apache.spark.SparkContext._ or
> explicit import org.apache.spark.SparkContext.rddToPairRDDFunctions.
> But error not resolved.
>
> Help much appreciated.
>
> Thanks
> AT
>
> scala>val input1 = sc.textFile("/test7").map(line =>
> line.split(",").map(_.trim));
> scala>val input2 = sc.textFile("/test8").map(line =>
> line.split(",").map(_.trim));
> scala>val input11 = input1.map(x=>((x(0),x(1)),x(2),x(3)))
> scala>val input22 = input2.map(x=>((x(0),x(1)),x(2),x(3)))
>
>  scala> input11.join(input22).take(10)
>
> <console>:66: error: value join is not a member of
> org.apache.spark.rdd.RDD[((String, String), String, String)]
>
>               input11.join(input22).take(10)
>
