Re: OrderBy Year and Month is not displaying correctly

Enrico Minack Mon, 06 Jan 2020 01:23:23 -0800

The distinct transformation does not preserve order, so you need to call distinct first, then orderBy.
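
A minimal sketch of that reordering, assuming a local SparkSession and a small made-up joint_accounts DataFrame (both are illustrative assumptions, not the original data). After distinct the sort refers to the aliased Year and Month columns, since those are what the select outputs:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, month, sum, year}

// Assumption: a local session and toy rows, purely for illustration.
val spark = SparkSession.builder().master("local[1]").appName("orderby-sketch").getOrCreate()
import spark.implicits._

val joint_accounts = Seq(
  ("2019-12-03", 742.00, 814.49),
  ("2019-09-10", 100.00, 50.00),
  ("2019-09-20", 25.00, 10.00),
  ("2020-01-05", 1530.00, 1426.91)
).toDF("transactiondate", "moneyin", "moneyout")
  .withColumn("transactiondate", col("transactiondate").cast("date"))

val wSpec = Window.partitionBy(year(col("transactiondate")), month(col("transactiondate")))

val result = joint_accounts
  .select(
    year(col("transactiondate")).as("Year"),
    month(col("transactiondate")).as("Month"),
    sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per Month"),
    sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per Month"))
  .distinct                            // de-duplicate first
  .orderBy(col("Year"), col("Month"))  // then sort, as the last transformation

result.show(1000, false)
```

With the sort as the final step before show, the rows should come out by year and then month.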


Enrico



On 06.01.20 at 00:39, Mich Talebzadeh wrote:

Hi,
I am working out monthly outgoings etc. from an account, and I am using the following code:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, month, sum, year}

val wSpec = Window.partitionBy(year(col("transactiondate")), month(col("transactiondate")))

joint_accounts.
    select(year(col("transactiondate")).as("Year")
         , month(col("transactiondate")).as("Month")
         , sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per Month")
         , sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per Month")).
    orderBy(year(col("transactiondate")), month(col("transactiondate"))).  // <-- the orderBy in question
    distinct.
    show(1000,false)

This shows as follows:


+----+-----+------------------+------------------+
|Year|Month|incoming Per Month|outgoing Per Month|
+----+-----+------------------+------------------+
|2019|9    |13958.58          |17920.31          |
|2019|11   |4032.30           |4225.30           |
|2020|1    |1530.00           |1426.91           |
|2019|10   |10029.00          |10067.52          |
|2019|12   |742.00            |814.49            |
+----+-----+------------------+------------------+
However, the orderBy is not correct: I expect to see the 2019 and 2020 records in order of year and month.
Any suggestions?

Thanks,

Dr Mich Talebzadeh
LinkedIn/https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw/
http://talebzadehmich.wordpress.com
*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
