Maybe the col() function is not even needed here. :)

>>> df.select(F.dense_rank().over(wOrder).alias("rank"),
...           "fruit", "amount").show()
+----+------+------+
|rank| fruit|amount|
+----+------+------+
|   1|cherry|     5|
|   2| apple|     3|
|   2|tomato|     3|
|   3|orange|     2|
+----+------+------+




On Tue, Feb 8, 2022 at 3:50 PM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Simple: either rank() or dense_rank()
>
> >>> from pyspark.sql import functions as F
> >>> from pyspark.sql.functions import col
> >>> from pyspark.sql.window import Window
> >>> wOrder = Window().orderBy(df['amount'].desc())
> >>> df.select(F.rank().over(wOrder).alias("rank"), col('fruit'),
> col('amount')).show()
> +----+------+------+
> |rank| fruit|amount|
> +----+------+------+
> |   1|cherry|     5|
> |   2| apple|     3|
> |   2|tomato|     3|
> |   4|orange|     2|
> +----+------+------+
>
> >>> df.select(F.dense_rank().over(wOrder).alias("rank"), col('fruit'),
> col('amount')).show()
> +----+------+------+
> |rank| fruit|amount|
> +----+------+------+
> |   1|cherry|     5|
> |   2| apple|     3|
> |   2|tomato|     3|
> |   3|orange|     2|
> +----+------+------+
>
> HTH
>
>
>
> On Mon, 7 Feb 2022 at 01:27, <capitnfrak...@free.fr> wrote:
>
>> For a dataframe object, how to add a column who is auto_increment like
>> mysql's behavior?
>>
>> Thank you.
>>
>>
