You only need to define an Ordering for Student; there is no need to modify
the definition of the Student class. It works like a Comparator in Java.
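
Here is a minimal sketch of that idea in plain Scala, with no Spark required. The Student class below is a stand-in copied from the question, and the names StudentOrderingSketch and byAge are made up for illustration. As far as I know, sortByKey looks up an implicit Ordering on the key type in the same way, so an implicit like this in scope should also satisfy rdd.sortByKey().

```scala
// Stand-in for the third-party class from the question; it is left untouched.
class Student(val name: String, val score: Int, val age: Int) {
  override def toString =
    "name:" + name + ", score:" + score + ", age:" + age
}

object StudentOrderingSketch {
  // The ordering lives outside the class, much like a java.util.Comparator.
  // Type parameters are spelled out so type inference has nothing to guess.
  implicit val byAge: Ordering[Student] = Ordering.by[Student, Int](_.age)

  def main(args: Array[String]): Unit = {
    val students = List(
      new Student("XiaoTao", 80, 29),
      new Student("CCC", 100, 24),
      new Student("Lucy", 78, 22))

    // sorted picks up the implicit Ordering[Student] that is in scope
    val youngestFirst = students.sorted
    youngestFirst.foreach(println)
  }
}
```

To order by score instead, swap the key function, for example Ordering.by[Student, Int](_.score), and call .reverse on the resulting Ordering for descending order.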

Currently, to sort by value you map the rdd so that the value becomes the
key, and then sort by key.
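
A rough sketch of the swap approach, assuming spark-core is on the classpath and a local[2] master for trying it on one machine. The object name, the helper names and the sample data are made up for illustration. The second helper uses RDD.sortBy, which I believe is available from Spark 1.0 onward and takes a key function directly, so it may save the manual swap.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on older 1.x releases
import org.apache.spark.rdd.RDD

object SortByValueSketch {
  // Swap so the value becomes the key, sort by key, then swap back.
  def sortByValue(rdd: RDD[(String, Int)]): RDD[(String, Int)] =
    rdd.map(_.swap).sortByKey().map(_.swap)

  // Alternative: sort directly with a key-extraction function.
  def sortByValueDirect(rdd: RDD[(String, Int)]): RDD[(String, Int)] =
    rdd.sortBy(_._2)

  def main(args: Array[String]): Unit = {
    // The local[2] master is an assumption, not part of the original setup.
    val conf = new SparkConf().setAppName("SortByValueSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(("a", 3), ("b", 1), ("c", 2)), 3)
      println(sortByValue(rdd).collect().toList)
      println(sortByValueDirect(rdd).collect().toList)
    } finally {
      sc.stop()
    }
  }
}
```

If the value is a class you cannot modify, either approach still needs an implicit Ordering for that type in scope.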

Liquan

On Wed, Sep 24, 2014 at 9:52 AM, Sean Owen <so...@cloudera.com> wrote:

> See the scaladoc for how to define an implicit ordering to use with
> sortByKey:
>
>
> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.OrderedRDDFunctions
>
> Off the top of my head, I think this is 90% correct to order by age for
> example:
>
> implicit val studentOrdering: Ordering[Student] = Ordering.by(_.age)
>
> On Wed, Sep 24, 2014 at 3:07 PM, Tao Xiao <xiaotao.cs....@gmail.com>
> wrote:
> > Hi,
> >
> >     I have the following rdd:
> >
> >   val conf = new SparkConf()
> >           .setAppName("<< Testing Sorting >>")
> >   val sc = new SparkContext(conf)
> >
> >   val L = List(
> >       (new Student("XiaoTao", 80, 29), "I'm Xiaotao"),
> >       (new Student("CCC", 100, 24), "I'm CCC"),
> >       (new Student("Jack", 90, 25), "I'm Jack"),
> >       (new Student("Tom", 60, 35), "I'm Tom"),
> >       (new Student("Lucy", 78, 22), "I'm Lucy"))
> >
> >   val rdd = sc.parallelize(L, 3)
> >
> >
> > where Student is a class defined as follows:
> >
> > class Student(val name:String, val score:Int, val age:Int)  {
> >
> >  override def toString =
> >                  "name:" + name + ", score:" + score + ", age:" + age
> >
> > }
> >
> >
> >
> > I want to sort the rdd by key, but when I wrote rdd.sortByKey it
> > complained that "No implicit Ordering defined", which means I must extend
> > the class with Ordered and provide a method named compare. The problem is
> > that the class Student is from a third-party library, so I cannot change
> > its definition. I'd like to know if there is a sorting method to which I
> > can pass a customized compare function, so that it sorts the rdd according
> > to the function I provide.
> >
> > One more question: if I want to sort RDD[(k, v)] by value, do I have to
> > map that rdd so that its key and value exchange their positions in the
> > tuple? Are there any functions that allow us to sort an rdd by things
> > other than the key?
> >
> > Thanks
> >
> >
> >
>


-- 
Liquan Pei
Department of Physics
University of Massachusetts Amherst
