Hi,

I am working out the monthly incoming and outgoing totals for an account, using the following code:

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions._   // col, year, month, sum

    // One window partition per (year, month) of the transaction date
    val wSpec = Window.partitionBy(year(col("transactiondate")), month(col("transactiondate")))

    joint_accounts.
      select(year(col("transactiondate")).as("Year")
           , month(col("transactiondate")).as("Month")
           , sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per Month")
           , sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per Month")).
      orderBy(year(col("transactiondate")), month(col("transactiondate"))).   // <-- the orderBy in question
      distinct.
      show(1000, false)

This shows the following:

    +----+-----+------------------+------------------+
    |Year|Month|incoming Per Month|outgoing Per Month|
    +----+-----+------------------+------------------+
    |2019|9    |13958.58          |17920.31          |
    |2019|11   |4032.30           |4225.30           |
    |2020|1    |1530.00           |1426.91           |
    |2019|10   |10029.00          |10067.52          |
    |2019|12   |742.00            |814.49            |
    +----+-----+------------------+------------------+

However, the ordering is not correct: I expect to see the 2019 and 2020 records in order of year and month.

Any suggestions?

Thanks,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
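
A possible direction, offered as a sketch rather than a tested fix: in the code above `distinct` runs after `orderBy`, and `distinct` involves a shuffle, so the sort applied before it is not guaranteed to survive. The smallest change would be to call `distinct` first and then `orderBy("Year", "Month")`. The sketch below goes one step further and swaps the window-plus-`distinct` approach for a plain `groupBy`/`agg`, which yields one row per month directly. The sample rows, the local `SparkSession` and the name `jointAccounts` are assumptions for illustration only; they stand in for the real `joint_accounts` data, whose schema is assumed to have a date column `transactiondate` and numeric columns `moneyin` and `moneyout`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, month, sum, to_date, year}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("monthly-totals").getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for joint_accounts, deliberately out of order.
val jointAccounts = Seq(
  ("2019-10-03", 100.00, 40.00),
  ("2020-01-15", 50.00, 20.00),
  ("2019-09-21", 75.50, 10.25),
  ("2019-10-28", 25.00, 60.00)
).toDF("transactiondate", "moneyin", "moneyout")
  .withColumn("transactiondate", to_date(col("transactiondate")))

// Aggregate to one row per (Year, Month), then sort as the last step
// so that no later shuffle can disturb the ordering.
val monthly = jointAccounts
  .groupBy(year(col("transactiondate")).as("Year"), month(col("transactiondate")).as("Month"))
  .agg(
    sum("moneyin").cast("DECIMAL(10,2)").as("incoming Per Month"),
    sum("moneyout").cast("DECIMAL(10,2)").as("outgoing Per Month"))
  .orderBy("Year", "Month")

monthly.show(1000, false)
```

With the sample rows, the months come out as 2019/9, 2019/10, 2020/1. The key point in either variant is that the sort is the final transformation before `show`.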