The log shows a NullPointerException, so some of your data may be corrupted.
Try putting a try/catch inside the operation you are running so that bad
records do not fail the whole task. Also, are you running a worker process on
the master node? If not, only one node will do the processing. If yes, try
setting the level of parallelism and the number of partitions when
creating/transforming the RDD.
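
As a rough illustration of the try/catch suggestion, here is a minimal
sketch in plain Python. The names safe and parse_line and the sample records
are made up for this example and are not from your job; the idea is only that
a failing record yields an empty list instead of raising, so the wrapped
function can be handed to flatMap.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def safe(func: Callable[[A], B]) -> Callable[[A], List[B]]:
    """Wrap func so that a failing record yields [] instead of raising."""
    def wrapper(record: A) -> List[B]:
        try:
            return [func(record)]
        except Exception as exc:
            # In a real job this would go to the executor's log.
            print(f"skipping bad record {record!r}: {exc!r}")
            return []
    return wrapper


def parse_line(line: str):
    """Hypothetical per-record operation: 'key,number' -> (key, number)."""
    fields = line.split(",")
    return fields[0], int(fields[1])


# Hypothetical sample input: one null record and one malformed record.
records = ["a,1", None, "b,2", "c,notanumber"]

# Flatten the per-record lists, as flatMap would do on an RDD.
parsed = [out for rec in records for out in safe(parse_line)(rec)]
print(parsed)
```

With PySpark the same wrapper could be used roughly as
sc.textFile(path, minPartitions=16).flatMap(safe(parse_line)), where path and
16 are placeholders to tune for your cluster. rdd.repartition(n) and the
spark.default.parallelism setting are other ways to control how many tasks
are created.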

Thanks
Best Regards

On Fri, Nov 14, 2014 at 5:17 PM, Priya Ch <learnings.chitt...@gmail.com>
wrote:

> Hi All,
>
>   We have set up a 2-node cluster (NODE-DSRV05 and NODE-DSRV02); each node
> has 32 GB RAM, 1 TB of hard disk capacity and 8 CPU cores. We have set up
> HDFS with 2 TB capacity and a block size of 256 MB. When we try to process
> a 1 GB file on Spark, we see the following exception:
>
> 14/11/14 17:01:42 INFO scheduler.TaskSetManager: Starting task 0.0 in
> stage 0.0 (TID 0, NODE-DSRV05.impetus.co.in, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:42 INFO scheduler.TaskSetManager: Starting task 1.0 in
> stage 0.0 (TID 1, NODE-DSRV05.impetus.co.in, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:42 INFO scheduler.TaskSetManager: Starting task 2.0 in
> stage 0.0 (TID 2, NODE-DSRV05.impetus.co.in, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:43 INFO cluster.SparkDeploySchedulerBackend: Registered
> executor: 
> Actor[akka.tcp://sparkExecutor@IMPETUS-DSRV02:41124/user/Executor#539551156]
> with ID 0
> 14/11/14 17:01:43 INFO storage.BlockManagerMasterActor: Registering block
> manager NODE-DSRV05.impetus.co.in:60432 with 2.1 GB RAM
> 14/11/14 17:01:43 INFO storage.BlockManagerMasterActor: Registering block
> manager NODE-DSRV02:47844 with 2.1 GB RAM
> 14/11/14 17:01:43 INFO network.ConnectionManager: Accepted connection from
> [NODE-DSRV05.impetus.co.in/192.168.145.195:51447]
> 14/11/14 17:01:43 INFO network.SendingConnection: Initiating connection to
> [NODE-DSRV05.impetus.co.in/192.168.145.195:60432]
> 14/11/14 17:01:43 INFO network.SendingConnection: Connected to [
> NODE-DSRV05.impetus.co.in/192.168.145.195:60432], 1 messages pending
> 14/11/14 17:01:43 INFO storage.BlockManagerInfo: Added broadcast_1_piece0
> in memory on NODE-DSRV05.impetus.co.in:60432 (size: 17.1 KB, free: 2.1 GB)
> 14/11/14 17:01:43 INFO storage.BlockManagerInfo: Added broadcast_0_piece0
> in memory on NODE-DSRV05.impetus.co.in:60432 (size: 14.1 KB, free: 2.1 GB)
> 14/11/14 17:01:44 WARN scheduler.TaskSetManager: Lost task 0.0 in stage
> 0.0 (TID 0, NODE-DSRV05.impetus.co.in): java.lang.NullPointerException:
>         org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:609)
>         org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:609)
>         org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>         org.apache.spark.scheduler.Task.run(Task.scala:54)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         java.lang.Thread.run(Thread.java:722)
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Starting task 0.1 in
> stage 0.0 (TID 3, NODE-DSRV05.impetus.co.in, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Lost task 1.0 in stage
> 0.0 (TID 1) on executor NODE-DSRV05.impetus.co.in:
> java.lang.NullPointerException (null) [duplicate 1]
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Lost task 2.0 in stage
> 0.0 (TID 2) on executor NODE-DSRV05.impetus.co.in:
> java.lang.NullPointerException (null) [duplicate 2]
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Starting task 2.1 in
> stage 0.0 (TID 4, NODE-DSRV05.impetus.co.in, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Starting task 1.1 in
> stage 0.0 (TID 5, NODE-DSRV02, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Lost task 0.1 in stage
> 0.0 (TID 3) on executor NODE-DSRV05.impetus.co.in:
> java.lang.NullPointerException (null) [duplicate 3]
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Starting task 0.2 in
> stage 0.0 (TID 6, NODE-DSRV02, NODE_LOCAL, 1667 bytes)
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Lost task 2.1 in stage
> 0.0 (TID 4) on executor NODE-DSRV05.impetus.co.in:
> java.lang.NullPointerException (null) [duplicate 4]
> 14/11/14 17:01:44 INFO scheduler.TaskSetManager: Starting task 2.2 in
> stage 0.0 (TID 7, NODE-DSRV02, NODE_LOCAL, 1667 bytes)
>
>
> What I see is that it couldn't launch tasks on NODE-DSRV05 and is
> processing on a single node, i.e. NODE-DSRV02. When we tried with 360 MB of
> data, I didn't see any exception, but the entire processing was done by
> only one node. I couldn't figure out where the issue lies.
>
> Any suggestions on what kind of situation might cause such an issue?
>
> Thanks,
> Padma Ch
>
