*The distinct transformation does not preserve order, you need to distinct
first, then orderby.*

Thanks Enrico. You are correct. Worked fine!

joint_accounts.
      select(year(col("transactiondate")).as("Year")
    , month(col("transactiondate")).as("Month")
    , sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per
Month")
    , sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per
Month")).
      withColumn(("incoming Per Month"),format_number(col("incoming Per
Month"),2)).
      withColumn(("outgoing Per Month"),format_number(col("outgoing Per
Month"),2)).


*     distinct.      orderBy(col("Year"),col("Month")). *
show(1000,false)

+----+-----+------------------+------------------+
|Year|Month|incoming Per Month|outgoing Per Month|
+----+-----+------------------+------------------+
|2019|9    |13,958.58         |17,920.31         |
|2019|10   |10,029.00         |10,067.52         |

|2019|11   |4,032.30          |4,225.30          |
|2019|12   |742.00            |814.49            |
|2020|1    |1,570.00          |1,540.86          |
+----+-----+------------------+------------------+

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 6 Jan 2020 at 09:35, Gourav Sengupta <gourav.sengu...@gmail.com>
wrote:

> or just use SQL, which is less verbose, easily readable, and takes care of
> all such scenarios. But for some weird reason I have found that people
> using data frame API's have a perception that using SQL is less
> intelligent. But I think that using less effort to get better output can me
> a measure of intelligence.
>
> Regards,
> Gourav Sengupta
>
> On Mon, Jan 6, 2020 at 9:23 AM Enrico Minack <m...@enrico.minack.dev>
> wrote:
>
>> The distinct transformation does not preserve order, you need to distinct
>> first, then orderby.
>>
>> Enrico
>>
>>
>> Am 06.01.20 um 00:39 schrieb Mich Talebzadeh:
>>
>> Hi,
>>
>> I am working out monthly outgoing etc from an account and I am using the
>> following code
>>
>> import org.apache.spark.sql.expressions.Window
>> val wSpec =
>> Window.partitionBy(year(col("transactiondate")),month(col("transactiondate")))
>> joint_accounts.
>>       select(year(col("transactiondate")).as("Year")
>>     , month(col("transactiondate")).as("Month")
>>     , sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per
>> Month")
>>     , sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per
>> Month")).
>>
>> *orderBy(year(col("transactiondate")),month(col("transactiondate"))).*
>>     distinct.
>>     show(1000,false)
>>
>> This shows as follows:
>>
>>
>> |Year|Month|incoming Per Month|outgoing Per Month|
>> +----+-----+------------------+------------------+
>> |2019|9    |13958.58          |17920.31          |
>> |2019|11   |4032.30           |4225.30           |
>> |2020|1    |1530.00           |1426.91           |
>> |2019|10   |10029.00          |10067.52          |
>> |2019|12   |742.00            |814.49            |
>> +----+-----+------------------+------------------+
>>
>>  however the orderby is not correct as I expect to see 2010 record and
>> 2019 records in the order of year and month.
>>
>> Any suggestions?
>>
>> Thanks,
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>>
>>

Reply via email to