Re: How to split a dataframe into two dataframes based on count

2020-05-18 Thread Vipul Rajan
Hi Mohit, "Seems like the limit on parent is executed twice and return different records each time. Not sure why it is executed twice when I mentioned only once" That is to be expected. Spark follows lazy evaluation, which means that execution only happens when you call an action, so every act
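
To illustrate the point about lazy evaluation, here is a toy model in plain Python. It is not Spark's API: `LazyFrame`, `unordered_source`, and the `runs` counter are hypothetical names invented for this sketch. It only assumes what the reply states (work happens when an action is called) plus the fact that a limit over unordered data has no guaranteed row selection. The preview above is cut off, so this does not claim to reproduce the fix the reply went on to suggest.

```python
import random


class LazyFrame:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark)."""

    def __init__(self, compute):
        self._compute = compute      # recipe that produces the rows on demand
        self._cache_enabled = False
        self._cached_rows = None
        self.runs = 0                # how many times the recipe was executed

    def limit(self, n):
        # A transformation: builds a new recipe, executes nothing yet.
        return LazyFrame(lambda: self.collect()[:n])

    def cache(self):
        # Ask for the rows to be kept after the first action computes them.
        self._cache_enabled = True
        return self

    def collect(self):
        # An action: this is the only place where work actually happens.
        if self._cache_enabled and self._cached_rows is not None:
            return self._cached_rows
        self.runs += 1
        rows = self._compute()
        if self._cache_enabled:
            self._cached_rows = rows
        return rows


def unordered_source():
    # Simulates a source whose row order is not guaranteed between runs.
    rows = list(range(100))
    random.shuffle(rows)
    return rows


source = LazyFrame(unordered_source)

# Without caching, each action re-runs the whole recipe, limit included.
parent = source.limit(10)
run1 = parent.collect()
run2 = parent.collect()   # recomputed; will almost always differ from run1

# With caching, the second action reuses the rows from the first one.
cached_parent = source.limit(10).cache()
first = cached_parent.collect()
second = cached_parent.collect()
```

In this sketch `parent.runs` ends up at 2 while `cached_parent.runs` stays at 1, which mirrors the observation in the quoted question: the limit was written once but executed once per action. In real Spark, persisting the parent or imposing a deterministic order before limiting are the usual ways to get stable results, though whether either fits depends on the job.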

How to split a dataframe into two dataframes based on count

2020-05-18 Thread Mohit Durgapal
Dear All, I would like to know how, in Spark 2.0, I can split a dataframe into two dataframes when I know the exact counts the two dataframes should have. I tried using limit but got quite weird results. Also, I am looking for exact counts in the child dfs, not an approximate percentage-based split. *Followi