This has the information you need to add an extra column containing a sequence.
On Tue, 8 Feb 2022 at 09:11, <capitnfrak...@free.fr> wrote:
>
> Hello Gourav
>
> As you see here orderBy has already give the solution for "equal
> amount":
>
> >>> df = sc.parallelize([("orange",2),("apple",3),("tomato",3),("cherry",5)]).toDF(['fruit','amount'])
> >>> df.select("*").orderBy("amount",ascending=False).show()
> +------+------+
> | fruit|amount|
> +------+------+
> |cherry|     5|
> | apple|     3|
> |tomato|     3|
> |orange|     2|
> +------+------+
>
> I want to add a column at the right whose name is "top" and the value
> auto_increment from 1 to N.
>
> Thank you.
>
> On 08/02/2022 13:52, Gourav Sengupta wrote:
> > Hi,
> >
> > sorry once again, will try to understand the problem first :)
> >
> > As we can clearly see that the initial responses were incorrectly
> > guessing the solution to be monotonically_increasing function
> >
> > What if there are two fruits with equal amount? For any real life
> > application, can we understand what are trying to achieve by the
> > rankings?
> >
> > Regards,
> > Gourav Sengupta
> >
> > On Tue, Feb 8, 2022 at 4:22 AM ayan guha <guha.a...@gmail.com> wrote:
> >
> >> For this req you can rank or dense rank.
> >>
> >> On Tue, 8 Feb 2022 at 1:12 pm, <capitnfrak...@free.fr> wrote:
> >>
> >>> Hello,
> >>>
> >>> For this query:
> >>>
> >>> >>> df.select("*").orderBy("amount",ascending=False).show()
> >>> +------+------+
> >>> | fruit|amount|
> >>> +------+------+
> >>> |tomato|     9|
> >>> | apple|     6|
> >>> |cherry|     5|
> >>> |orange|     3|
> >>> +------+------+
> >>>
> >>> I want to add a column "top", in which the value is: 1,2,3...
> >>> meaning top1, top2, top3...
> >>>
> >>> How can I do it?
> >>>
> >>> Thanks.
> >>>
> >>> On 07/02/2022 21:18, Gourav Sengupta wrote:
> >>>> Hi,
> >>>>
> >>>> can we understand the requirement first?
> >>>>
> >>>> What is that you are trying to achieve by auto increment id? Do you
> >>>> just want different ID's for rows, or you may want to keep track of
> >>>> the record count of a table as well, or do you want to do use them for
> >>>> surrogate keys?
> >>>>
> >>>> If you are going to insert records multiple times in a table, and
> >>>> still have different values?
> >>>>
> >>>> I think without knowing the requirements all the above responses, like
> >>>> everything else where solutions are reached before understanding the
> >>>> problem, has high chances of being wrong.
> >>>>
> >>>> Regards,
> >>>> Gourav Sengupta
> >>>>
> >>>> On Mon, Feb 7, 2022 at 2:21 AM Siva Samraj <samraj.mi...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Monotonically_increasing_id() will give the same functionality
> >>>>>
> >>>>> On Mon, 7 Feb, 2022, 6:57 am , <capitnfrak...@free.fr> wrote:
> >>>>>
> >>>>>> For a dataframe object, how to add a column who is auto_increment
> >>>>>> like mysql's behavior?
> >>>>>>
> >>>>>> Thank you.
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >>
> >> --
> >> Best Regards,
> >> Ayan Guha