Re: spark sql results maintain order (in python)

Davies Liu Thu, 04 Sep 2014 09:48:00 -0700

On Thu, Sep 4, 2014 at 3:42 AM, jamborta <jambo...@gmail.com> wrote:
> hi,
>
> I ran into a problem with Spark SQL when I run a query like this: "select
> count(*), city, industry from table group by hour" and I would like to take
> the results from the SchemaRDD
>
> 1, I have to parse each line to get the values out of the dict (e.g. in order
> to convert it to a CSV)


> 2, The order is not kept in a python dict - I couldn't find a way to
> maintain the original order (especially a problem in this case, when the
> column names are derived).

In master and the upcoming 1.1 release, you will get pyspark.sql.Row objects
from a SchemaRDD. Row is a namedtuple, so it keeps the column order given in
the SQL, and you can easily convert each row into a tuple or list.

Also, you can access the fields just like attributes.
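
To illustrate the behaviour described above, here is a minimal sketch that
runs without Spark. It uses collections.namedtuple as a stand-in for
pyspark.sql.Row, on the assumption that Row behaves like a namedtuple. The
field names (cnt, city, industry), the sample values, and the helper
rows_to_csv are hypothetical and only serve the example.

```python
import csv
import io
from collections import namedtuple

# Stand-in for pyspark.sql.Row (assumption: Row behaves like a namedtuple).
# The field names are hypothetical; derived columns may be named differently.
Row = namedtuple("Row", ["cnt", "city", "industry"])

# Pretend these rows came back from collecting a SchemaRDD.
rows = [Row(12, "London", "finance"), Row(7, "Paris", "retail")]


def rows_to_csv(rows):
    """Serialize rows to CSV text, preserving the column order of the rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    if rows:
        # _fields is the namedtuple attribute holding the field names in
        # order; a real Row object may expose its names differently.
        writer.writerow(rows[0]._fields)
    # Each row is a tuple, so csv.writer can consume it directly.
    writer.writerows(rows)
    return buf.getvalue()


first = rows[0]
as_tuple = tuple(first)  # plain tuple, same order as the columns
as_list = list(first)    # plain list, same order as the columns
city = first.city        # field access as an attribute
csv_text = rows_to_csv(rows)
```

Because each row is already an ordered sequence, no per-row dict parsing is
needed before writing it out.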

> thanks,
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/spark-sql-results-maintain-order-in-python-tp13445.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
