[ 
https://issues.apache.org/jira/browse/SPARK-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Asif Khan updated SPARK-25512:
------------------------------
    Issue Type: Question  (was: Bug)

> Using RowNumbers in SparkR Dataframe
> ------------------------------------
>
>                 Key: SPARK-25512
>                 URL: https://issues.apache.org/jira/browse/SPARK-25512
>             Project: Spark
>          Issue Type: Question
>          Components: SparkR
>    Affects Versions: 2.3.1
>            Reporter: Asif Khan
>            Priority: Critical
>
> Hi,
> I have a use case where I have a SparkR DataFrame and I want to iterate
> over the DataFrame in a for loop using the row numbers of the DataFrame. Is
> that possible?
> The only solution I have now is to collect() the SparkR DataFrame into an R
> data.frame, which brings the entire DataFrame to the driver node, and then
> iterate over it using row numbers. But because the for loop executes only on
> the driver node, I lose the advantage of parallel processing in Spark, which
> was the whole purpose of using Spark. Please help.
> Thank you,
> Asif Khan
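> A minimal sketch of one possible alternative to collect()-and-loop, assuming
> SparkR 2.3's dapply() API and an already-active SparkSession (this example,
> including the column name mpg2, is illustrative and not part of the original
> report): dapply() runs an R function on each partition of the distributed
> DataFrame in parallel, rather than pulling all rows to the driver.
>
> library(SparkR)
>
> df <- createDataFrame(mtcars)
>
> # The function receives each partition as a local R data.frame and must
> # return a data.frame matching the declared output schema.
> schema <- structType(structField("mpg2", "double"))
> result <- dapply(df, function(part) {
>   data.frame(mpg2 = part$mpg * 2)
> }, schema)
>
> head(collect(result))
>
> Row numbers themselves are not preserved across partitions; if a stable index
> is needed, a window function such as row_number() over an explicit ordering
> is one common approach.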



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
