[ https://issues.apache.org/jira/browse/SPARK-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-9502.
---------------------------------
    Resolution: Not A Problem

Now it seems to throw a different exception, as below:

{code}
java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: java.util.ArrayList is not a valid external type for schema of array<string>
{code}
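
That error appears to come from the encoder rejecting {{java.util.List}} as the external type for an {{array<string>}} column. A minimal workaround sketch, assuming a Spark 2.x-style {{createDataFrame(JavaRDD<Row>, StructType)}} call and reusing the {{Person}} class, {{personList}}, {{schema}}, and contexts from the reproduction below; a Java array (or a Scala {{Seq}}) is accepted there instead of a {{List}}:

{code:java}
// Workaround sketch (assumption: Spark 2.x, where the row encoder accepts a
// Java array or Scala Seq, but not java.util.List, for an array<string> column).
// Person, personList, schema, javaCtx and sqlContext come from the test case below.
JavaRDD<Row> rowRDD = javaCtx.parallelize(personList).map(
  new Function<Person, Row>() {
    public Row call(Person person) throws Exception {
      return RowFactory.create(
        person.getName(),
        person.getAge(),
        person.getSkills().toArray(new String[0]));  // array instead of ArrayList
    }
  });
Dataset<Row> df = sqlContext.createDataFrame(rowRDD, schema);
{code}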

I am resolving this as {{Not A Problem}} because the issue seems obsolete and, judging 
from [~jodersky]'s comment above, it does not appear to be an actual error.

If I am wrong and this is still a problem, please feel free to reopen it with a 
reasonable explanation.
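
Separately, regarding the original expectation mismatch: the generic {{Row.get}} returns the internal Scala {{Seq}} for an array column (hence the {{$colon$colon}} / {{instanceof List == false}} output below), while {{Row.getList}} exposes it as a {{java.util.List}} on the Java side. A small usage sketch, reusing {{sqlContext}} and the {{people}} temp table from the reproduction:

{code:java}
// Read-back sketch: Row.getList(i) returns a java.util.List view of an array
// column, while Row.get(i) returns the internal Scala Seq. Uses the "people"
// temp table and sqlContext from the test case below.
Row first = sqlContext.sql("SELECT name, age, skills FROM people").collectAsList().get(0);
String name = first.getString(0);
int age = first.getInt(1);
List<String> skills = first.getList(2);  // java.util.List, not a Scala Seq
{code}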

> ArrayTypes incorrect for DataFrames Java API
> --------------------------------------------
>
>                 Key: SPARK-9502
>                 URL: https://issues.apache.org/jira/browse/SPARK-9502
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>            Reporter: Kuldeep
>            Priority: Critical
>
> After upgrading to 1.4.1, array types for DataFrames behaved differently in our 
> Java applications. I have modified JavaApplySchemaSuite to show the problem; 
> mainly, I have added a list field to the Person class.
> {code:java}
>   public static class Person implements Serializable {
>     private String name;
>     private int age;
>     private List<String> skills;
>     public String getName() {
>       return name;
>     }
>     public void setName(String name) {
>       this.name = name;
>     }
>     public int getAge() {
>       return age;
>     }
>     public void setAge(int age) {
>       this.age = age;
>     }
>     public void setSkills(List<String> skills) {
>       this.skills = skills;
>     }
>     public List<String> getSkills() { return skills; }
>   }
>   @Test
>   public void applySchema() {
>     List<Person> personList = new ArrayList<Person>(2);
>     List<String> skills = new ArrayList<String>();
>     skills.add("eating");
>     skills.add("sleeping");
>     Person person1 = new Person();
>     person1.setName("Michael");
>     person1.setAge(29);
>     person1.setSkills(skills);
>     personList.add(person1);
>     Person person2 = new Person();
>     person2.setName("Yin");
>     person2.setAge(28);
>     person2.setSkills(skills);
>     personList.add(person2);
>     JavaRDD<Row> rowRDD = javaCtx.parallelize(personList).map(
>       new Function<Person, Row>() {
>         public Row call(Person person) throws Exception {
>           return RowFactory.create(person.getName(), person.getAge(), person.getSkills());
>         }
>       });
>     List<StructField> fields = new ArrayList<StructField>(2);
>     fields.add(DataTypes.createStructField("name", DataTypes.StringType, false));
>     fields.add(DataTypes.createStructField("age", DataTypes.IntegerType, false));
>     fields.add(DataTypes.createStructField("skills", DataTypes.createArrayType(DataTypes.StringType), false));
>     StructType schema = DataTypes.createStructType(fields);
>     DataFrame df = sqlContext.applySchema(rowRDD, schema);
>     df.registerTempTable("people");
>     Row[] actual = sqlContext.sql("SELECT * FROM people").collect();
>     System.out.println(actual[1].get(2).getClass().getName());
>     System.out.println(actual[1].get(2) instanceof List);
>     List<Row> expected = new ArrayList<Row>(2);
>     expected.add(RowFactory.create("Michael", 29, skills));
>     expected.add(RowFactory.create("Yin", 28, skills));
>     Assert.assertEquals(expected, Arrays.asList(actual));
>   }
> {code}
> This prints:
> {code}
> scala.collection.immutable.$colon$colon
> false
> java.lang.AssertionError:
> Expected :[[Michael,29,[eating, sleeping]], [Yin,28,[eating, sleeping]]]
> Actual   :[[Michael,29,List(eating, sleeping)], [Yin,28,List(eating, sleeping)]]
> {code}
> Not sure if this would be usable even in Scala.



