Re: [PySpark DataFrame] When a Row is not a Row

2015-07-12 Thread Davies Liu

Re: [PySpark DataFrame] When a Row is not a Row

2015-07-11 Thread Jerry Lam

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-13 Thread Nicholas Chammas
Is there some way around this? For example, can Row just be an implementation of namedtuple throughout?

    from collections import namedtuple

    class Row(namedtuple): ...

From a user perspective, it’s confusing that there are two different implementations of the Row class with the same name. In my
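Making that suggestion concrete: a hypothetical sketch (not Spark's actual implementation) of one way to get a single user-visible Row type, by mixing a shared base class into each per-schema namedtuple so isinstance checks work no matter which query produced the row:

    from collections import namedtuple

    class Row(tuple):
        # Hypothetical shared base: every generated row class inherits
        # from this, so isinstance(r, Row) holds regardless of schema.
        pass

    def row_class_for(fields):
        # Build the per-schema namedtuple, then mix in the shared base.
        data = namedtuple("_RowData", fields)
        return type("Row", (Row, data), {})

    PersonRow = row_class_for(["name", "age"])
    r = PersonRow("Alice", 1)
    print(r.name)              # 'Alice' -- named column access still works
    print(isinstance(r, Row))  # True -- one Row type from the user's side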

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-12 Thread Davies Liu
The class (called Row) for rows from Spark SQL is created on the fly and is different from pyspark.sql.Row (which is a public API for users to create Rows). The reason we did it this way is that we want better performance when accessing the columns. Basically, the rows are just named tuples
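To make the trade-off concrete, here is a minimal standalone sketch of the scheme Davies describes (the helper name is made up; the real pyspark.sql internals differ in detail): one namedtuple class is generated per result schema, so column access by name costs no more than tuple indexing:

    from collections import namedtuple

    def row_class_for_schema(field_names):
        # Assumed helper, for illustration: one class per query schema.
        # namedtuple compiles name access down to plain tuple indexing.
        return namedtuple("Row", field_names)

    ResultRow = row_class_for_schema(["id", "score"])
    r = ResultRow(7, 0.9)
    print(r.score)        # 0.9 -- fast, index-based column access
    print(r == (7, 0.9))  # True -- underneath it is just a tuple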

[PySpark DataFrame] When a Row is not a Row

2015-05-11 Thread Nicholas Chammas
This is really strange.

    # Spark 1.3.1
    print type(results)
    # <class 'pyspark.sql.dataframe.DataFrame'>

    a = results.take(1)[0]
    print type(a)
    # <class 'pyspark.sql.types.Row'>

    print pyspark.sql.types.Row
    # <class 'pyspark.sql.types.Row'>

    print type(a) == pyspark.sql.types.Row
    # False

    print
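The same surprise can be reproduced without Spark: any two dynamically created namedtuple classes can share a name yet fail a class-identity check. A standalone illustration (not Spark code):

    from collections import namedtuple

    Row = namedtuple("Row", ["x"])         # stands in for pyspark.sql.Row
    DynamicRow = namedtuple("Row", ["x"])  # stands in for the on-the-fly class

    a = DynamicRow(1)
    print(type(a).__name__)      # 'Row'  -- the name matches...
    print(type(a) == Row)        # False -- ...but the class does not
    print(isinstance(a, tuple))  # True  -- structural checks still work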

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-11 Thread Ted Yu
In Row#equals():

    while (i < len) {
      if (apply(i) != that.apply(i)) {

Should '!=' be !apply(i).equals(that.apply(i))?

Cheers
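Ted's proposed fix asks for value equality rather than reference equality. For contrast, the Python rows discussed in this thread already get value equality for free, since namedtuples inherit element-wise comparison from tuple; a small illustration:

    from collections import namedtuple

    Row = namedtuple("Row", ["x", "y"])
    a = Row(1, 2)
    b = Row(1, 2)

    print(a is b)  # False -- distinct objects (reference identity)
    print(a == b)  # True  -- element-wise comparison (value equality)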