or just use SQL, which is less verbose, easily readable, and takes care of all such scenarios. But for some weird reason I have found that people using data frame API's have a perception that using SQL is less intelligent. But I think that using less effort to get better output can me a measure of intelligence.
Regards, Gourav Sengupta On Mon, Jan 6, 2020 at 9:23 AM Enrico Minack <m...@enrico.minack.dev> wrote: > The distinct transformation does not preserve order, you need to distinct > first, then orderby. > > Enrico > > > Am 06.01.20 um 00:39 schrieb Mich Talebzadeh: > > Hi, > > I am working out monthly outgoing etc from an account and I am using the > following code > > import org.apache.spark.sql.expressions.Window > val wSpec = > Window.partitionBy(year(col("transactiondate")),month(col("transactiondate"))) > joint_accounts. > select(year(col("transactiondate")).as("Year") > , month(col("transactiondate")).as("Month") > , sum("moneyin").over(wSpec).cast("DECIMAL(10,2)").as("incoming Per > Month") > , sum("moneyout").over(wSpec).cast("DECIMAL(10,2)").as("outgoing Per > Month")). > *orderBy(year(col("transactiondate")),month(col("transactiondate"))).* > distinct. > show(1000,false) > > This shows as follows: > > > |Year|Month|incoming Per Month|outgoing Per Month| > +----+-----+------------------+------------------+ > |2019|9 |13958.58 |17920.31 | > |2019|11 |4032.30 |4225.30 | > |2020|1 |1530.00 |1426.91 | > |2019|10 |10029.00 |10067.52 | > |2019|12 |742.00 |814.49 | > +----+-----+------------------+------------------+ > > however the orderby is not correct as I expect to see 2010 record and > 2019 records in the order of year and month. > > Any suggestions? > > Thanks, > > Dr Mich Talebzadeh > > > > LinkedIn * > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* > > > > http://talebzadehmich.wordpress.com > > > *Disclaimer:* Use it at your own risk. Any and all responsibility for any > loss, damage or destruction of data or any other property which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > >