Hello Gourav


As you can see here, orderBy already gives the solution for "equal amount":

df = sc.parallelize([("orange",2),("apple",3),("tomato",3),("cherry",5)]).toDF(['fruit','amount'])

df.select("*").orderBy("amount",ascending=False).show()
+------+------+
| fruit|amount|
+------+------+
|cherry|     5|
| apple|     3|
|tomato|     3|
|orange|     2|
+------+------+


I want to add a column on the right named "top", whose value auto-increments from 1 to N.

Thank you.



On 08/02/2022 13:52, Gourav Sengupta wrote:
Hi,

Sorry once again, I will try to understand the problem first :)

As we can clearly see, the initial responses incorrectly guessed that
the solution was the monotonically_increasing_id function.

What if two fruits have equal amounts? For any real-life application,
can we understand what you are trying to achieve with the rankings?

Regards,
Gourav Sengupta

On Tue, Feb 8, 2022 at 4:22 AM ayan guha <guha.a...@gmail.com> wrote:

For this requirement you can use rank or dense_rank.
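
To make the difference concrete, here is a pure-Python illustration (not Spark code) of how row_number, rank and dense_rank number rows when amounts tie. The helper name number_rows is hypothetical; in PySpark the corresponding functions are row_number(), rank() and dense_rank() in pyspark.sql.functions, applied over a Window ordered by amount.

```python
rows = [("orange", 2), ("apple", 3), ("tomato", 3), ("cherry", 5)]

# Sort by amount, highest first. Python's sort is stable, so tied rows keep
# their input order (apple before tomato).
ordered = sorted(rows, key=lambda r: r[1], reverse=True)


def number_rows(ordered_rows):
    """Return (fruit, amount, row_number, rank, dense_rank) for sorted rows."""
    result = []
    rank = 0
    dense = 0
    previous = object()  # sentinel that never equals a real amount
    for position, (fruit, amount) in enumerate(ordered_rows, start=1):
        if amount != previous:
            rank = position  # rank jumps to the current position, leaving gaps
            dense += 1       # dense_rank advances by exactly one, no gaps
            previous = amount
        # row_number is simply the position, so ties still get distinct values
        result.append((fruit, amount, position, rank, dense))
    return result


numbered = number_rows(ordered)
for row in numbered:
    print(row)
```

On this data, apple and tomato share rank 2 and dense_rank 2; orange then gets rank 4 but dense_rank 3, while row_number runs 1 to 4 without repeats.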

On Tue, 8 Feb 2022 at 1:12 pm, <capitnfrak...@free.fr> wrote:

Hello,

For this query:

df.select("*").orderBy("amount",ascending=False).show()
+------+------+
| fruit|amount|
+------+------+
|tomato|     9|
| apple|     6|
|cherry|     5|
|orange|     3|
+------+------+

I want to add a column "top" whose values are 1, 2, 3, ...,
meaning top1, top2, top3, ...

How can I do it?

Thanks.

On 07/02/2022 21:18, Gourav Sengupta wrote:
Hi,

Can we understand the requirement first?

What is it that you are trying to achieve with an auto-increment ID?
Do you just want different IDs for rows, do you want to keep track of
the record count of a table as well, or do you want to use them as
surrogate keys?

If you are going to insert records multiple times into a table, should
they still have different values?

I think that without knowing the requirements, all the above responses,
like everything else where solutions are reached before understanding
the problem, have a high chance of being wrong.

Regards,
Gourav Sengupta

On Mon, Feb 7, 2022 at 2:21 AM Siva Samraj <samraj.mi...@gmail.com> wrote:

monotonically_increasing_id() will give the same functionality.
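
Note that the IDs from monotonically_increasing_id() are unique and increasing but not consecutive. The Spark documentation describes the current implementation as putting the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits. The sketch below is a pure-Python simulation of that layout, not Spark itself; the function name simulated_ids and the partition sizes are hypothetical.

```python
def simulated_ids(partition_sizes):
    """Simulate the documented ID layout for the given rows per partition."""
    ids = []
    for partition_index, size in enumerate(partition_sizes):
        for record_number in range(size):
            # Partition ID in the upper bits, record number in the lower 33 bits.
            ids.append((partition_index << 33) + record_number)
    return ids


# Assumption: two partitions holding three rows each.
ids = simulated_ids([3, 3])
print(ids)  # [0, 1, 2, 8589934592, 8589934593, 8589934594]
```

The jump between the third and fourth value shows why this function alone does not produce a 1..N sequence like MySQL's auto_increment once the data spans more than one partition.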

On Mon, 7 Feb, 2022, 6:57 am , <capitnfrak...@free.fr> wrote:

For a DataFrame object, how do I add a column that auto-increments,
like MySQL's behavior?

Thank you.






---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



--
Best Regards,
Ayan Guha
