In spark you can use windowing function
<https://sparkbyexamples.com/spark/spark-sql-window-functions/>s to
achieve this

HTH


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 19 Dec 2022 at 15:28, Oliver Ruebenacker <oliv...@broadinstitute.org>
wrote:

>
>      Hello,
>
>   How can I retain from each group only the row for which one value is the
> maximum of the group? For example, imagine a DataFrame containing all major
> cities in the world, with three columns: (1) City name (2) Country (3)
> population. How would I get a DataFrame that only contains the largest city
> in each country? Thanks!
>
>      Best, Oliver
>
> --
> Oliver Ruebenacker, Ph.D. (he)
> Senior Software Engineer, Knowledge Portal Network <http://kp4cd.org/>, 
> Flannick
> Lab <http://www.flannicklab.org/>, Broad Institute
> <http://www.broadinstitute.org/>
>

Reply via email to