I have configured the jars from spark-3.0.0-preview2-bin-hadoop2.7/jar as external jars in IntelliJ,
not pulling anything from Maven.

Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>


On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cloud0...@gmail.com> wrote:

> Which Spark/Scala version do you use?
>
> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <zahidr1...@gmail.com> wrote:
>
>> With the following SparkSession configuration:
>>
>>   val spark = SparkSession.builder()
>>     .master("local[*]")
>>     .appName("Spark Session take")
>>     .getOrCreate();
>>
>> this line works:
>>
>>   flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>     .map(flight_row => flight_row)
>>     .take(5)
>>
>> However, if I change the master URL to the IP address of a standalone
>> master, the following error is produced, depending on the position of
>> .take(5):
>>
>>   val spark = SparkSession.builder()
>>     .master("spark://192.168.0.38:7077")
>>     .appName("Spark Session take")
>>     .getOrCreate();
>>
>>   20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
>>   192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>   instance of java.lang.invoke.SerializedLambda to field
>>   org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>   of org.apache.spark.rdd.MapPartitionsRDD
>>
>> BUT if I remove take(5), change the position of take(5), or insert an
>> extra take(5) as illustrated in the code below, then it works. I don't see
>> why the position of take(5) should cause such an error, or why the error
>> should be caused by changing the master URL.
>>
>>   flights.take(5)
>>     .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>     .map(flight_row => flight_row).take(5)
>>
>>   flights.take(5)
>>
>>   flights
>>     .take(5)
>>     .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>     .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))
>>
>>   flights.show(5)
>>
>> Complete code if you wish to replicate it:
>>
>>   import org.apache.spark.sql.SparkSession
>>
>>   object sessiontest {
>>
>>     // Define a specific data type class, then manipulate it using the
>>     // filter and map functions; this is also known as an Encoder.
>>     case class flight(DEST_COUNTRY_NAME: String,
>>                       ORIGIN_COUNTRY_NAME: String,
>>                       count: BigInt)
>>
>>     def main(args: Array[String]): Unit = {
>>
>>       val spark = SparkSession.builder()
>>         .master("spark://192.168.0.38:7077")
>>         .appName("Spark Session take")
>>         .getOrCreate();
>>
>>       import spark.implicits._
>>       val flightDf =
>>         spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>       val flights = flightDf.as[flight]
>>
>>       flights.take(5)
>>         .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>         .map(flight_row => flight_row).take(5)
>>
>>       flights.take(5)
>>
>>       flights
>>         .take(5)
>>         .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>         .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))
>>
>>       flights.show(5)
>>
>>     } // main
>>   }
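For context, the behaviour described above is consistent with the application classes not being on the executors' classpath. `flights.take(5)` returns a plain local `Array[flight]`, so any filter/map chained after it runs entirely in the driver JVM and needs no serialization; by contrast, `filter(...).map(...).take(5)` ships the lambdas to the executors, which fail to deserialize them (the SerializedLambda ClassCastException) because the standalone workers have never seen the application jar. A minimal sketch of one possible fix, assuming the jar path shown (the `spark.jars` value is a placeholder for wherever your build writes the application jar):

```scala
import org.apache.spark.sql.SparkSession

object SessionTakeSketch {
  case class flight(DEST_COUNTRY_NAME: String,
                    ORIGIN_COUNTRY_NAME: String,
                    count: BigInt)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("spark://192.168.0.38:7077")
      .appName("Spark Session take")
      // Ship the application jar (compiled lambdas + the flight case class)
      // to the executors; the path here is a placeholder, not from the thread.
      .config("spark.jars", "target/scala-2.12/sessiontest.jar")
      .getOrCreate()

    import spark.implicits._
    val flights = spark.read
      .parquet("/data/flight-data/parquet/2010-summary.parquet/")
      .as[flight]

    // filter here is a Dataset operation: its lambda is serialized and
    // executed on the executors, so the jar configured above must be present.
    flights
      .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
      .take(5)

    // By contrast, take(5) first collects a local Array[flight] to the
    // driver; the filter that follows runs only in the driver JVM, which is
    // why this variant never triggered the ClassCastException.
    flights.take(5).filter(_.ORIGIN_COUNTRY_NAME != "Canada")

    spark.stop()
  }
}
```

The same effect can be had without code changes by submitting the jar with `spark-submit` instead of running against the cluster directly from the IDE.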