with the following sparksession configuration

val spark = SparkSession.builder().master("local[*]").appName("Spark
Session take").getOrCreate();

this line works

flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME !=
"Canada").map(flight_row => flight_row).take(5)


however if change the master url like so, with the ip address then the
following error is produced by the position of .take(5)

val spark = 
SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark
Session take").getOrCreate();


20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
instance of java.lang.invoke.SerializedLambda to field
org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
of org.apache.spark.rdd.MapPartitionsRDD

BUT if I  remove take(5) or change the position of take(5) or insert an
extra take(5) as illustrated in code then it works. I don't see why the
position of take(5) should cause such an error or be caused by changing the
master url

flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME !=
"Canada").map(flight_row => flight_row).take(5)

  flights.take(5)

  flights
  .take(5)
  .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
  .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
   flights.show(5)


complete code if you wish to replicate it.

import org.apache.spark.sql.SparkSession

object sessiontest {

  // define specific  data type class then manipulate it using the
filter and map functions
  // this is also known as an Encoder
  case class flight (DEST_COUNTRY_NAME: String,
                     ORIGIN_COUNTRY_NAME:String,
                     count: BigInt)


  def main(args:Array[String]): Unit ={

    val spark =
SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark
Session take").getOrCreate();

    import spark.implicits._
    val flightDf =
spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
    val flights = flightDf.as[flight]

    flights.take(5).filter(flight_row =>
flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row =>
flight_row).take(5)

      flights.take(5)

      flights
      .take(5)
      .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
      .map(fr => flight(fr.DEST_COUNTRY_NAME,
fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
       flights.show(5)

  } // main
}





Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>

Reply via email to