Hi guys,

First of all, thank you for your amazing work.
As you can see in the subject, I am posting here because I need to run a for loop over a DataFrame object.

Sample of my dataset (the entire dataset is ~400k lines long):

I use Spark 1.4.1 with R 3.2.1.

I launch sparkR using (the package can be found at http://spark-packages.org/package/databricks/spark-csv ):

I load my dataset from HDFS using the following command (the package is needed to load a CSV into a Spark DataFrame):

When I do a summary, the output is:

What I need to calculate is:

But as you probably know, we can't do this, because the read.df function returns an S4 object, which is not iterable.

Does anyone know how I can do that? Maybe I have to convert the type of the DataFrame or use another function to load my dataset...

I have to say that I'm new to Spark and SparkR :)

Thanks for your time,
Florian

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-How-to-perform-a-for-loop-on-a-DataFrame-object-tp24359.html