Re: BUG: take with SparkSession.master[url]

2020-03-26 Thread Zahid Rahman
In IntelliJ I have configured spark-3.0.0-preview2-bin-hadoop2.7/jar as external jars; I am not pulling anything from Maven.
On Fri, 27 Mar 2020 at 05:45, Wenchen Fan wrote: > Which

Re: BUG: take with SparkSession.master[url]

2020-03-26 Thread Wenchen Fan
Which Spark/Scala version do you use?
On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman wrote:
> with the following sparksession configuration
>
> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>
> this line works
>
> flights.filter(flight_row

BUG: take with SparkSession.master[url]

2020-03-26 Thread Zahid Rahman
With the following SparkSession configuration:
val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
this line works:
flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
however if change the

Re: results of taken(3) not appearing in console window

2020-03-26 Thread Reynold Xin
bcc dev, +user You need to print out the result. Take itself doesn't print. You only got the results printed to the console because the Scala REPL automatically prints the returned value from take. On Thu, Mar 26, 2020 at 12:15 PM, Zahid Rahman < zahidr1...@gmail.com > wrote: > > I am
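To illustrate the point above, here is a minimal sketch of printing the result of take in a compiled program. The object name TakeExample, the helper firstRows, the Flight case class and the sample rows are all made up for illustration and are not from the original thread.

```scala
import org.apache.spark.sql.SparkSession

object TakeExample {
  // Hypothetical row type; the field names mirror the thread, the values below are invented.
  case class Flight(DEST_COUNTRY_NAME: String, ORIGIN_COUNTRY_NAME: String, count: Long)

  // take(n) returns an Array to the driver; it does not print anything by itself.
  def firstRows(spark: SparkSession, n: Int): Array[Flight] = {
    import spark.implicits._
    val flights = Seq(
      Flight("A", "B", 1L),
      Flight("C", "D", 2L),
      Flight("E", "F", 3L),
      Flight("G", "H", 4L)
    ).toDS()
    flights.take(n)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("take-example")
      .getOrCreate()

    // Outside the REPL nothing echoes the returned value, so print it explicitly.
    firstRows(spark, 3).foreach(println)

    spark.stop()
  }
}
```

In spark-shell the same take(3) call appears to print because the REPL echoes every returned value; in an IDE run, the explicit foreach(println) is what produces console output.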

results of taken(3) not appearing in console window

2020-03-26 Thread Zahid Rahman
I am running the same code with the same libraries but not getting the same output.
scala> case class flight (DEST_COUNTRY_NAME: String,
     | ORIGIN_COUNTRY_NAME:String,
     | count: BigInt)
defined class flight
scala> val flightDf =

Re: Need to order iterator values in spark dataframe

2020-03-26 Thread Zahid Rahman
I believe I logged an issue first and I should get a response first. I was ignored. Regards Did you know there are 8 million people in kashmir locked up in their homes by the Hindutwa (Indians) for 8 months. Now the whole planet is locked up in their homes. You didn't take notice of them either.

Re: Need to order iterator values in spark dataframe

2020-03-26 Thread Enrico Minack
Abhinav, you can repartition by your key, then sortWithinPartitions, and then groupByKey. Since the data are already hash-partitioned by key, Spark should not shuffle the data again and hence should not change the sort order within each partition: ds.repartition($"key").sortWithinPartitions($"code").groupBy($"key") Enrico
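A minimal sketch of the chain suggested above, completed with an aggregation so it can be run end to end. The collect_list step, the object name OrderedGroups, the helper codesPerKey and the sample rows are assumptions for illustration; the original question is truncated, so the per-group operation actually needed is not known. Spark does not document an ordering guarantee for collect_list, so the order seen in the output should be verified rather than relied upon.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list}

object OrderedGroups {
  // Chain from the reply: repartition by key, sort within partitions, group by key.
  // collect_list is an assumed aggregation added only to make the sketch complete.
  def codesPerKey(df: DataFrame): DataFrame =
    df.repartition(col("key"))
      .sortWithinPartitions(col("code"))
      .groupBy(col("key"))
      .agg(collect_list(col("code")).as("codes"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ordered-groups")
      .getOrCreate()
    import spark.implicits._

    // Invented sample rows using the column names from the question.
    val df = Seq(
      (1, "c3", 5),
      (1, "c1", 11),
      (2, "c2", 7),
      (1, "c2", 12)
    ).toDF("key", "code", "code_value")

    codesPerKey(df).show(false)

    spark.stop()
  }
}
```

Calling explain() on the returned DataFrame is one way to check whether the groupBy reuses the partitioning from repartition or adds another exchange.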

Need to order iterator values in spark dataframe

2020-03-26 Thread Ranjan, Abhinav
Hi, I have a dataframe which has data like:
key | code | code_value
1   | c1   | 11
1   | c2   | 12
1   | c2   | 9
1   | c3   |

Programmatic: parquet file corruption error

2020-03-26 Thread Zahid Rahman
Hi, when I run the code for a user-defined data type Dataset (using a case class in Scala) in the interactive spark-shell against a Parquet file, the results are as expected. However, when I then run the same code programmatically in the IntelliJ IDE, Spark gives a file corruption error.