Have you reviewed this section of the guide?
http://spark.apache.org/docs/latest/programming-guide.html#shared-variables

If the dataset is static and you need a copy on all the nodes, you should look 
at broadcast variables.
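
For example, here's a minimal sketch of what that could look like in Java (the
standalone class, the driver-side file read, and the local path are illustrative
assumptions on my part, not code from the dl4j example):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

import java.nio.file.Files;
import java.nio.file.Paths;

public class BroadcastSketch {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf()
                .setMaster("spark://hadoopm0:7077")
                .setAppName("Broadcast sketch (Java)");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Read the static dataset once on the driver...
        byte[] images = Files.readAllBytes(
                Paths.get("/root/Downloads/Database/images-idx1-ubyte"));

        // ...and broadcast it: each executor fetches one copy instead of the
        // bytes being serialized into every single task.
        Broadcast<byte[]> imagesBroadcast = jsc.broadcast(images);

        // Tasks should then call imagesBroadcast.value() rather than closing
        // over the raw byte[] from the driver.

        jsc.stop();
    }
}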

SQL-specific: have you tried loading the dataset with the DataFrame API 
directly? It seems to me like you’re passing the 2 files as “metadata” rather 
than as data…
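
If the data really does have to ship with the tasks, the other option the error 
message itself mentions is raising the frame size. A minimal sketch of that 
setting; the 128 value (spark.akka.frameSize is in MB) is only an illustrative 
choice, not a tuned recommendation:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

public class FrameSizeSketch {
    public static void main(String[] args) {
        // spark.akka.frameSize defaults to 10 (MB); the serialized task sent
        // from the driver to the executors must fit inside this frame.
        SparkConf conf = new SparkConf()
                .setMaster("spark://hadoopm0:7077")
                .setAppName("Mnist Classification Pipeline (Java)")
                .set("spark.akka.frameSize", "128");
        SparkContext sc = new SparkContext(conf);
    }
}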

-adrian

From: Angel Angel
Date: Thursday, September 17, 2015 at 12:28 PM
To: "user@spark.apache.org<mailto:user@spark.apache.org>"
Subject: spark.akka.frameSize

Hi,

I am running a deep learning algorithm on Spark.

Example:
https://github.com/deeplearning4j/dl4j-spark-ml-examples


I am trying to run this example in local mode and it works fine, but when I 
run it in cluster mode I get the following error.

Loaded Mnist dataframe:
15/09/17 18:20:33 WARN TaskSetManager: Stage 0 contains a task of very large 
size (46279 KB). The maximum recommended task size is 100 KB.
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to 
stage failure: Serialized task 0:0 was 47622358 bytes, which exceeds max 
allowed: spark.akka.frameSize (10485760 bytes) - reserved (204800 bytes). 
Consider increasing spark.akka.frameSize or using broadcast variables for large 
values.
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)

Also, I have attached a snapshot of the error.

And my Java driver program is:


    public static void main(String[] args) {

        // Build the Spark context and SQL context for the job.
        SparkConf conf = new SparkConf()
                .setMaster("spark://hadoopm0:7077")
                .setAppName("Mnist Classification Pipeline (Java)");
        SparkContext jsc = new SparkContext(conf);
        SQLContext jsql = new SQLContext(jsc);

        // String imagesPath = "hdfs://hadoopm0:8020/tmp/input1/images-idx1-ubyte";
        // String labelsPath = "hdfs://hadoopm0:8020/tmp/input1/labels-idx1-ubyte";
        String imagesPath = "file:///root/Downloads/Database/images-idx1-ubyte";
        String labelsPath = "file:///root/Downloads/Database/labels-idx1-ubyte";

        // Pass the file locations as options to the dl4j MNIST data source.
        Map<String, String> params = new HashMap<String, String>();
        params.put("imagesPath", imagesPath);
        params.put("labelsPath", labelsPath);
        DataFrame data = jsql.read().format(DefaultSource.class.getName())
                .options(params).load();

        System.out.println("\nLoaded Mnist dataframe:");
        data.show(100);
    }






Please give me some references or suggestions to solve this problem.

Thanks in advance
