Actually, sorry, my mistake, you're calling

    DataFrame df = sqlContext.createDataFrame(data,
        org.apache.spark.sql.types.NumericType.class);

and giving it a list of objects which aren't NumericTypes, but the wildcards in the signature let it happen. I'm curious what'd happen if you gave it Integer.class, but I suspect it still won't work because Integer may not have the bean-style getters.

On Fri, Jul 22, 2016 at 9:37 AM, Everett Anderson <ever...@nuna.com> wrote:

> Hey,
>
> I think what's happening is that you're calling this createDataFrame method
> <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/SQLContext.html#createDataFrame(java.util.List,%20java.lang.Class)>:
>
>     createDataFrame(java.util.List<?> data, java.lang.Class<?> beanClass)
>
> which expects a JavaBean-style class with get and set methods for the
> members, but Integer doesn't have such a getter.
>
> I bet there's an easier way if you just want a single-column DataFrame of
> a primitive type, but one way that would work is to manually construct the
> Rows using RowFactory.create()
> <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/RowFactory.html#create(java.lang.Object...)>
> and assemble the DataFrame from that, like:
>
>     List<Row> rows = ...; // convert your List<Integer> in a loop with RowFactory.create()
>
>     StructType schema = DataTypes.createStructType(Collections.singletonList(
>         DataTypes.createStructField("int_field", DataTypes.IntegerType, true)));
>
>     DataFrame intDataFrame = sqlContext.createDataFrame(rows, schema);
>
> On Fri, Jul 22, 2016 at 7:53 AM, Jean Georges Perrin <j...@jgp.net> wrote:
>
>> I am trying to build a DataFrame from a list, here is the code:
>>
>>     private void start() {
>>         SparkConf conf = new SparkConf().setAppName("Data Set from Array")
>>             .setMaster("local");
>>         SparkContext sc = new SparkContext(conf);
>>         SQLContext sqlContext = new SQLContext(sc);
>>
>>         Integer[] l = new Integer[] { 1, 2, 3, 4, 5, 6, 7 };
>>         List<Integer> data = Arrays.asList(l);
>>
>>         System.out.println(data);
>>
>>         DataFrame df = sqlContext.createDataFrame(data,
>>             org.apache.spark.sql.types.NumericType.class);
>>         df.show();
>>     }
>>
>> My result is (unpleasantly):
>>
>>     [1, 2, 3, 4, 5, 6, 7]
>>     ++
>>     ||
>>     ++
>>     ||
>>     ||
>>     ||
>>     ||
>>     ||
>>     ||
>>     ||
>>     ++
>>
>> I also tried with:
>>
>>     org.apache.spark.sql.types.NumericType.class
>>     org.apache.spark.sql.types.IntegerType.class
>>     org.apache.spark.sql.types.ArrayType.class
>>
>> I am probably missing something super obvious :(
>>
>> Thanks!
>>
>> jg
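To see concretely why Integer fails the bean check while a hand-written wrapper class passes it, here is a small stdlib-only sketch (no Spark on the classpath required). It uses java.beans.Introspector, which performs the same kind of getter/setter discovery that bean-based schema inference relies on; the IntBean class and its "value" property name are illustrative inventions, not part of any Spark API.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

public class BeanCheck {

    // A minimal JavaBean wrapper: one field with a public no-arg
    // constructor, getter, and setter. Introspection finds exactly
    // one property ("value") on this class.
    public static class IntBean {
        private int value;
        public IntBean() {}
        public int getValue() { return value; }
        public void setValue(int value) { this.value = value; }
    }

    // Count the properties that have BOTH a getter and a setter,
    // stopping at Object so getClass() is excluded.
    static long beanProperties(Class<?> cls) {
        try {
            BeanInfo info = Introspector.getBeanInfo(cls, Object.class);
            long n = 0;
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    n++;
                }
            }
            return n;
        } catch (IntrospectionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Integer is immutable and has no getX/setX pairs, so bean-style
        // introspection finds zero properties -> zero columns, which is
        // why df.show() printed the empty "++" table above.
        System.out.println("Integer properties: " + beanProperties(Integer.class)); // prints 0
        System.out.println("IntBean properties: " + beanProperties(IntBean.class)); // prints 1
    }
}
```

This is also why passing a small wrapper bean like this (rather than Integer.class) to the list-plus-beanClass overload of createDataFrame would give Spark something to build a schema from, at the cost of boxing each int into a bean instance; the RowFactory-plus-StructType route in the reply above avoids the wrapper class entirely.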