Try:

    sc.textFile("path/file").zipWithIndex().toDF("text", "id")

zipWithIndex() returns (line, index) pairs, so the text column comes first and the id second.

-Xiangrui

On Sun, Apr 5, 2015 at 7:50 PM, olegshirokikh <o...@solver.com> wrote:
> What would be the most efficient neat method to add a column with row ids to
> dataframe?
>
> I can think of something as below, but it completes with errors (at line 3),
> and anyways doesn't look like the best route possible:
>
> var dataDF = sc.textFile("path/file").toDF()
> val rowDF = sc.parallelize(1 to dataDF.count().toInt).toDF("ID")
> dataDF = dataDF.withColumn("ID", rowDF("ID"))
>
> Thanks
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Add-row-IDs-column-to-data-frame-tp22385.html
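
For anyone who wants to try the zipWithIndex() route end to end, here is a minimal sketch. It assumes a newer Spark release with SparkSession (on Spark 1.3 you would create a SQLContext and import sqlContext.implicits._ instead), and the names RowIdExample and withRowIds are made up for illustration. The third line of the quoted attempt most likely fails because withColumn() only accepts columns derived from the same DataFrame, and rowDF("ID") belongs to a different one.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, not part of Spark itself.
object RowIdExample {

  // Pairs every line with a 0-based index and returns columns (id, text).
  def withRowIds(spark: SparkSession, lines: RDD[String]): DataFrame = {
    import spark.implicits._ // brings toDF into scope for RDDs of tuples
    lines
      .zipWithIndex()        // RDD[(String, Long)]: element first, index second
      .toDF("text", "id")    // name the columns in tuple order
      .select("id", "text")  // reorder so the id comes first
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only to make the sketch runnable.
    val spark = SparkSession.builder()
      .appName("row-ids")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for sc.textFile("path/file").
    val lines = spark.sparkContext.parallelize(Seq("alpha", "beta", "gamma"))

    withRowIds(spark, lines).show()
    spark.stop()
  }
}
```

The ids start at 0 and follow partition order, then the order of elements within each partition. Note that zipWithIndex() has to run an extra Spark job to count the elements per partition when the RDD has more than one partition.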