As far as I know, *sort* is just an alias of *orderBy* (or vice versa), and your last operation takes longer because you are sorting the data twice.
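One way to check this yourself is to print the query plans for the three variants and compare them. The snippet below is only a rough sketch: it assumes Spark 2.x or later (for `SparkSession`), and the sample rows and the object name `SortVsOrderBy` are made up for illustration, standing in for your real `df`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

object SortVsOrderBy {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption: Spark 2.x+ on the classpath)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sort-vs-orderBy")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the original df
    val df = Seq(
      ("2016-07-01", "DEB", 10.0),
      ("2016-07-01", "DEB", 5.5),
      ("2016-07-02", "DEB", 7.25),
      ("2016-07-02", "CRE", 100.0)
    ).toDF("transactiondate", "transactiontype", "debitamount")

    // Shared part of the three queries from the question
    val totals = df
      .filter(col("transactiontype") === "DEB")
      .groupBy("transactiondate")
      .agg(sum("debitamount").cast("Float").as("Total Debit Card"))

    // explain(true) prints the logical and physical plans to stdout
    totals.orderBy("transactiondate").explain(true)
    totals.sort("transactiondate").explain(true)

    // Chaining both: look for how many Sort nodes appear in each plan
    totals.orderBy("transactiondate").sort("transactiondate").explain(true)

    spark.stop()
  }
}
```

If the first two plans are identical and the third shows an extra Sort step, that would account for the extra time. Depending on the Spark version, the optimizer may collapse the redundant sort, so it is worth comparing the plans rather than assuming.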
--
Daniel Santana

On Fri, Jul 29, 2016 at 12:20 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

> Hi,
>
> In Spark programming I can use
>
> df.filter(col("transactiontype") === "DEB").groupBy("transactiondate").agg(sum("debitamount").cast("Float").as("Total Debit Card")).orderBy("transactiondate").show(5)
>
> or
>
> df.filter(col("transactiontype") === "DEB").groupBy("transactiondate").agg(sum("debitamount").cast("Float").as("Total Debit Card")).sort("transactiondate").show(5)
>
> I get the same results
>
> and I can use both as well
>
> df.filter(col("transactiontype") === "DEB").groupBy("transactiondate").agg(sum("debitamount").cast("Float").as("Total Debit Card")).orderBy("transactiondate").sort("transactiondate").show(5)
>
> but the last one takes more time.
>
> What is the use case for each of these, please? Does it make sense to use both?
>
> Thanks