This was fixed in 1.5 — could you download 1.5-RC3 and test it?
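For what it's worth, the `tuple(aaa)` behavior below falls out of how Row is built: pyspark's Row subclasses tuple, so `tuple(row)` is a shallow copy and a nested Row survives intact, while collect() reconstructs rows from the JVM and materializes nested structs as plain Rows/tuples. A minimal toy sketch (not pyspark's actual implementation — the class here is a simplified stand-in for illustration):

```python
# Toy model of pyspark.sql.types.Row (simplified, for illustration only):
# a tuple subclass that remembers its field names, with kwargs sorted by key,
# which is what pyspark's kwargs-based Row constructor does.
class Row(tuple):
    def __new__(cls, **kwargs):
        names = sorted(kwargs)
        row = tuple.__new__(cls, (kwargs[n] for n in names))
        row.__fields__ = names
        return row

    def __repr__(self):
        return "Row(%s)" % ", ".join(
            "%s=%r" % (n, v) for n, v in zip(self.__fields__, self))

aaa = Row(a=1, b=2, c=Row(a=1, b=2))

# tuple() only shallow-copies, so the nested Row is kept as-is:
print(tuple(aaa))            # (1, 2, Row(a=1, b=2))

# Because Row subclasses tuple, it still compares equal to a plain tuple:
print(tuple(aaa) == (1, 2, (1, 2)))  # True
```

So the two results in your example compare equal but differ in type, which is why schema verification can treat them differently.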

On Thu, Sep 3, 2015 at 4:45 PM, Wei Chen <wei.chen.ri...@gmail.com> wrote:
> Hey Friends,
>
> Recently I have been using Spark 1.3.1, mainly pyspark.sql. I noticed that
> a Row object collected directly from a DataFrame is different from a Row
> object we define directly via Row(*args, **kwargs).
>
> >>> from pyspark.sql.types import Row
> >>> aaa = Row(a=1, b=2, c=Row(a=1, b=2))
> >>> tuple(sc.parallelize([aaa]).toDF().collect()[0])
>
> (1, 2, (1, 2))
>
> >>> tuple(aaa)
>
> (1, 2, Row(a=1, b=2))
>
>
> This matters to me because I want to create a DataFrame in which one of the
> columns is a Row object, using sqlContext.createDataFrame(data, schema)
> where I pass in the schema explicitly. However, if the data is an RDD of
> Row objects like "aaa" in my example, it fails in the _verify_type
> function.
>
>
>
> Thank you,
>
> Wei

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
