Have you reviewed this section of the guide? http://spark.apache.org/docs/latest/programming-guide.html#shared-variables
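That section of the guide covers broadcast variables, which ship a read-only copy of a dataset to each executor once instead of serializing it into every task closure. A minimal sketch of the pattern in Java — the lookup map, app name, and local master here are illustrative placeholders, not taken from this thread:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

public class BroadcastSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setMaster("local[*]")          // placeholder; use your cluster master
            .setAppName("BroadcastSketch");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Hypothetical static side-data that every node needs.
        Map<String, Integer> lookup = new HashMap<>();
        lookup.put("a", 1);

        // Register the map as a broadcast variable: it is shipped to each
        // executor once, rather than once per serialized task.
        Broadcast<Map<String, Integer>> bc = sc.broadcast(lookup);

        // Inside transformations, read through bc.value() instead of
        // capturing `lookup` directly in the closure.
        int v = sc.parallelize(Arrays.asList("a"))
                  .map(k -> bc.value().get(k))
                  .first();

        System.out.println(v);
        sc.stop();
    }
}
```

The key point is that only the small `Broadcast` handle is captured by the closure; the payload travels through Spark's broadcast mechanism and never counts against the task-serialization limit.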
If the dataset is static and you need a copy on all the nodes, you should look at broadcast variables. On the SQL side, have you tried loading the dataset with the DataFrame API directly? It seems to me like you're using the two files as "metadata" instead of data...

-adrian

From: Angel Angel
Date: Thursday, September 17, 2015 at 12:28 PM
To: "user@spark.apache.org"
Subject: Saprk.frame.Akkasize

Hi,

I am running a deep learning algorithm on Spark, following this example: https://github.com/deeplearning4j/dl4j-spark-ml-examples

The example works fine in local mode, but when I try to run it in cluster mode I get the following error:

    Loaded Mnist dataframe:
    15/09/17 18:20:33 WARN TaskSetManager: Stage 0 contains a task of very large size (46279 KB). The maximum recommended task size is 100 KB.
    Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Serialized task 0:0 was 47622358 bytes, which exceeds max allowed: spark.akka.frameSize (10485760 bytes) - reserved (204800 bytes). Consider increasing spark.akka.frameSize or using broadcast variables for large values.
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)

I have also attached a snapshot of the error. My Java driver program is:

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setMaster("spark://hadoopm0:7077")
            .setAppName("Mnist Classification Pipeline (Java)");
        SparkContext jsc = new SparkContext(conf);
        SQLContext jsql = new SQLContext(jsc);

        // String imagesPath = "hdfs://hadoopm0:8020/tmp/input1/images-idx1-ubyte";
        // String labelsPath = "hdfs://hadoopm0:8020/tmp/input1/labels-idx1-ubyte";
        String imagesPath = "file:///root/Downloads/Database/images-idx1-ubyte";
        String labelsPath = "file:///root/Downloads/Database/labels-idx1-ubyte";

        Map<String, String> params = new HashMap<String, String>();
        params.put("imagesPath", imagesPath);
        params.put("labelsPath", labelsPath);

        DataFrame data = jsql.read().format(DefaultSource.class.getName())
            .options(params).load();

        System.out.println("\nLoaded Mnist dataframe:");
        data.show(100);
    }

Please give me some references or suggestions to solve this problem. Thanks in advance.
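For completeness, the error message's other suggestion is to raise `spark.akka.frameSize` (a Spark 1.x setting whose value is an integer number of MB; the default of 10 MB is what produces the 10485760-byte limit in the log above). A hedged configuration sketch — 128 is an illustrative value chosen to exceed the ~47 MB task in the error, not a tested recommendation, and broadcasting or loading the data on the executors remains the better fix:

```java
import org.apache.spark.SparkConf;

// Sketch only: a larger frame size lets bigger serialized tasks through,
// but it does not address why the task payload is so large in the first place.
SparkConf conf = new SparkConf()
    .setMaster("spark://hadoopm0:7077")
    .setAppName("Mnist Classification Pipeline (Java)")
    // Value is in MB; 128 comfortably exceeds the ~47 MB serialized task.
    .set("spark.akka.frameSize", "128");
```

The same setting can be passed at launch time instead, e.g. `spark-submit --conf spark.akka.frameSize=128 ...`, which avoids hard-coding it in the driver.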