[ https://issues.apache.org/jira/browse/SPARK-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Asif Khan updated SPARK-25512:
------------------------------
    Issue Type: Question  (was: Bug)

> Using RowNumbers in SparkR Dataframe
> ------------------------------------
>
>                 Key: SPARK-25512
>                 URL: https://issues.apache.org/jira/browse/SPARK-25512
>             Project: Spark
>          Issue Type: Question
>          Components: SparkR
>    Affects Versions: 2.3.1
>            Reporter: Asif Khan
>            Priority: Critical
>
> Hi,
> I have a use case where I have a SparkR DataFrame and I want to iterate
> over it in a for loop using the row numbers of the DataFrame. Is that
> possible?
> The only solution I have now is to collect() the SparkR DataFrame into an
> R data.frame, which brings the entire dataset onto the driver node, and
> then iterate over it using row numbers. But because the for loop executes
> only on the driver node, I lose the parallel processing that was the whole
> point of using Spark. Please help.
> Thank you,
> Asif Khan

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
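The usual SparkR answer to this kind of question is to avoid collect() and express the per-row work through dapply(), which applies an R function to each partition on the executors in parallel; inside the function, each partition is an ordinary R data.frame, so row numbers are available locally via 1:nrow(). The sketch below is not part of the original issue; it assumes a working Spark installation with SPARK_HOME set, and the column names (x, x2) and schema are illustrative only.

```r
library(SparkR)
sparkR.session()

# A small distributed DataFrame to stand in for the reporter's data.
sdf <- createDataFrame(data.frame(x = 1:10))

# Output schema for dapply(): the input column plus one derived column.
schema <- structType(structField("x", "integer"),
                     structField("x2", "integer"))

# dapply() runs this function once per partition, on the workers, in
# parallel. `part` is a plain R data.frame, so per-row iteration
# (for (i in 1:nrow(part)) ...) or vectorized operations both work here
# without pulling the whole dataset onto the driver.
result <- dapply(sdf, function(part) {
  part$x2 <- part$x * part$x  # per-row computation, executed distributed
  part
}, schema)

head(collect(result))  # collect only the (small) result for inspection
```

Note that row numbers obtained this way are per-partition, not global: a Spark DataFrame has no inherent row order. If a global row number is genuinely required, it has to be defined explicitly with a window function such as row_number() over an ordering column, at the cost of a sort.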