Which Spark/Scala version do you use?

On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <zahidr1...@gmail.com> wrote:
> With the following SparkSession configuration:
>
>   val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate()
>
> this line works:
>
>   flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
> However, if I change the master URL to the cluster address, like so:
>
>   val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate()
>
> then the following error is produced, depending on the position of .take(5):
>
>   20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, 192.168.0.38, executor 0):
>   java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda
>   to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3
>   in instance of org.apache.spark.rdd.MapPartitionsRDD
>
> BUT if I remove take(5), change the position of take(5), or insert an extra take(5) as illustrated below, then it works. I don't see why the position of take(5) should cause such an error, or why changing the master URL should trigger it.
>
>   flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
>   flights.take(5)
>
>   flights
>     .take(5)
>     .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>     .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))
>   flights.show(5)
>
> Complete code, if you wish to replicate it:
>   import org.apache.spark.sql.SparkSession
>
>   object sessiontest {
>
>     // define a specific data type class, then manipulate it using the filter and
>     // map functions; this is also known as an Encoder
>     case class flight(DEST_COUNTRY_NAME: String,
>                       ORIGIN_COUNTRY_NAME: String,
>                       count: BigInt)
>
>     def main(args: Array[String]): Unit = {
>
>       val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate()
>
>       import spark.implicits._
>       val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>       val flights = flightDf.as[flight]
>
>       flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
>       flights.take(5)
>
>       flights
>         .take(5)
>         .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>         .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))
>       flights.show(5)
>
>     } // main
>   }
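A likely explanation for the behaviour described above: `Dataset.take(5)` is an action that returns a local `Array[flight]` on the driver, so any `.filter`/`.map` chained after it are plain Scala collection operations and never leave the driver. By contrast, `flights.filter(...).map(...).take(5)` ships the lambdas to the executors, and the `SerializedLambda` ClassCastException is a common symptom of the application jar not being on the executor classpath when connecting to a standalone master. A minimal sketch of one possible fix, assuming the application is packaged as a jar (the jar path and object name here are hypothetical, not from the original post):

```scala
import org.apache.spark.sql.SparkSession

object SessionTakeFix {
  case class Flight(DEST_COUNTRY_NAME: String,
                    ORIGIN_COUNTRY_NAME: String,
                    count: BigInt)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("spark://192.168.0.38:7077")
      .appName("Spark Session take")
      // Ship the application jar so executors can deserialize the lambdas;
      // the path below is an assumed sbt output location.
      .config("spark.jars", "target/scala-2.12/sessiontest_2.12-0.1.jar")
      .getOrCreate()

    import spark.implicits._
    val flights = spark.read
      .parquet("/data/flight-data/parquet/2010-summary.parquet/")
      .as[Flight]

    // Distributed: the filter closure is serialized and runs on executors.
    flights.filter(_.ORIGIN_COUNTRY_NAME != "Canada").take(5)

    // Driver-local: take(5) already returned Array[Flight], so this
    // filter is an ordinary Scala collection call (no serialization).
    flights.take(5).filter(_.ORIGIN_COUNTRY_NAME != "Canada")

    spark.stop()
  }
}
```

Alternatively, launching the same code with `spark-submit` and the packaged jar has the same effect, since spark-submit distributes the jar for you; that is usually the preferred route over hard-coding `spark.jars`.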