I think that in Spark 1.3 and above you need to go through the JavaRDD first:

.sql(...).javaRDD().map(..)
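
To make the difference concrete, here is a minimal sketch that runs without Spark on the classpath. The classes Row, TypeTag, JavaRddLike and DataFrameLike are hypothetical stand-ins written for this illustration, not Spark's classes, and java.util.function.Function stands in for both scala.Function1 and Spark's Java function interface. It only models the shapes of the two map methods: the DataFrame one wants a function plus a type tag, while the one reached through javaRDD() wants a single function, so a lambda fits. With the real API the call would presumably read sqlContext.sql("select name from movies").javaRDD().map(row -> row.getString(0)).collect().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class JavaRddMapSketch {

    // Hypothetical stand-in for org.apache.spark.sql.Row: a fixed array of column values.
    static final class Row {
        private final Object[] values;
        Row(Object... values) { this.values = values; }
        String getString(int i) { return (String) values[i]; }
    }

    // Hypothetical stand-in for scala.reflect.ClassTag: an explicit runtime type token.
    static final class TypeTag<R> {
        final Class<R> runtimeClass;
        TypeTag(Class<R> runtimeClass) { this.runtimeClass = runtimeClass; }
    }

    // Hypothetical stand-in for JavaRDD<T>: map takes one function argument only.
    static final class JavaRddLike<T> {
        private final List<T> items;
        JavaRddLike(List<T> items) { this.items = items; }

        <R> JavaRddLike<R> map(Function<T, R> f) {
            List<R> out = new ArrayList<>();
            for (T item : items) {
                out.add(f.apply(item));
            }
            return new JavaRddLike<>(out);
        }

        List<T> collect() { return new ArrayList<>(items); }
    }

    // Hypothetical stand-in for DataFrame.
    static final class DataFrameLike {
        private final List<Row> rows;
        DataFrameLike(List<Row> rows) { this.rows = rows; }

        // Mirrors the two-parameter shape the compiler reported:
        // a function plus an explicit type tag.
        <R> JavaRddLike<R> map(Function<Row, R> f, TypeTag<R> tag) {
            return javaRDD().map(f);
        }

        JavaRddLike<Row> javaRDD() { return new JavaRddLike<>(rows); }
    }

    public static List<String> movieNames() {
        DataFrameLike frame = new DataFrameLike(Arrays.asList(new Row("Indiana Jones")));
        // frame.map(row -> row.getString(0)) would not compile here:
        // one argument is given where two are required.
        return frame.javaRDD().map(row -> row.getString(0)).collect();
    }

    public static void main(String[] args) {
        System.out.println(movieNames()); // prints [Indiana Jones]
    }
}
```

The one-argument map is the reason the detour through javaRDD() is enough: no type tag has to be built by hand on the Java side.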

On Fri, Apr 17, 2015 at 9:22 AM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> Yes, thanks!
>
> On Fri, Apr 17, 2015 at 16:20, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > The image didn't go through.
> >
> > I think you were referring to:
> >   override def map[R: ClassTag](f: Row => R): RDD[R] = rdd.map(f)
> >
> > Cheers
> >
> > On Fri, Apr 17, 2015 at 6:07 AM, Olivier Girardot <
> > o.girar...@lateral-thoughts.com> wrote:
> >
> > > Hi everyone,
> > > I ran into an issue using Spark SQL from Java (8 or 7) and reproduced it
> > > in a small test case that stays close to the official documentation
> > > <https://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection>.
> > > Sorry for the long mail, but this is "Java":
> > >
> > > import org.apache.spark.SparkConf;
> > > import org.apache.spark.api.java.JavaRDD;
> > > import org.apache.spark.api.java.JavaSparkContext;
> > > import org.apache.spark.sql.DataFrame;
> > > import org.apache.spark.sql.SQLContext;
> > >
> > > import java.io.Serializable;
> > > import java.util.ArrayList;
> > > import java.util.Arrays;
> > > import java.util.List;
> > >
> > > class Movie implements Serializable {
> > >     private int id;
> > >     private String name;
> > >
> > >     public Movie(int id, String name) {
> > >         this.id = id;
> > >         this.name = name;
> > >     }
> > >
> > >     public int getId() {
> > >         return id;
> > >     }
> > >
> > >     public void setId(int id) {
> > >         this.id = id;
> > >     }
> > >
> > >     public String getName() {
> > >         return name;
> > >     }
> > >
> > >     public void setName(String name) {
> > >         this.name = name;
> > >     }
> > > }
> > >
> > > public class SparkSQLTest {
> > >     public static void main(String[] args) {
> > >         SparkConf conf = new SparkConf();
> > >         conf.setAppName("My Application");
> > >         conf.setMaster("local");
> > >         JavaSparkContext sc = new JavaSparkContext(conf);
> > >
> > >         ArrayList<Movie> movieArrayList = new ArrayList<Movie>();
> > >         movieArrayList.add(new Movie(1, "Indiana Jones"));
> > >
> > >         JavaRDD<Movie> movies = sc.parallelize(movieArrayList);
> > >
> > >         SQLContext sqlContext = new SQLContext(sc);
> > >         DataFrame frame = sqlContext.applySchema(movies, Movie.class);
> > >         frame.registerTempTable("movies");
> > >
> > >         sqlContext.sql("select name from movies")
> > >                 .map(row -> row.getString(0)) // this is what I would expect to work
> > >                 .collect();
> > >     }
> > > }
> > >
> > >
> > > But this does not compile. Here is the compilation error:
> > >
> > > [ERROR] /Users/ogirardot/Documents/spark/java-project/src/main/java/org/apache/spark/MainSQL.java:[37,47]
> > > method map in class org.apache.spark.sql.DataFrame cannot be applied to
> > > given types;
> > > [ERROR] required:
> > > scala.Function1<org.apache.spark.sql.Row,R>,scala.reflect.ClassTag<R>
> > > [ERROR] found: (row)->"Na[...]ng(0)
> > > [ERROR] reason: cannot infer type-variable(s) R
> > > [ERROR] (actual and formal argument lists differ in length)
> > > [ERROR] /Users/ogirardot/Documents/spark/java-project/src/main/java/org/apache/spark/SampleSHit.java:[56,17]
> > > method map in class org.apache.spark.sql.DataFrame cannot be applied to
> > > given types;
> > > [ERROR] required:
> > > scala.Function1<org.apache.spark.sql.Row,R>,scala.reflect.ClassTag<R>
> > > [ERROR] found: (row)->row[...]ng(0)
> > > [ERROR] reason: cannot infer type-variable(s) R
> > > [ERROR] (actual and formal argument lists differ in length)
> > > [ERROR] -> [Help 1]
> > >
> > > This is because the map method in DataFrame is defined as:
> > >
> > > [image: Inline images 1]
> > >
> > > Once this is compiled to bytecode, the signature seen from Java takes a
> > > scala.Function1 and adds a ClassTag parameter.
> > > I can try to work around this by using scala.reflect.ClassTag$ like this:
> > >
> > > ClassTag$.MODULE$.apply(String.class)
> > >
> > > That gets the second ClassTag parameter right, but then instantiating a
> > > java.util.function.Function or using a Java 8 lambda still fails, and if I
> > > try to instantiate a proper scala Function1... well, this is a world of pain.
> > >
> > > This is a regression introduced by the 1.3.x DataFrame: JavaSchemaRDD
> > > used to be a JavaRDDLike, but DataFrames are not (and cannot be called
> > > with JFunctions). I can open a Jira if you want.
> > >
> > > Regards,
> > >
> > > --
> > > *Olivier Girardot* | Partner
> > > o.girar...@lateral-thoughts.com
> > > +33 6 24 09 17 94
> > >
> >
>
