Re: [Spark Streaming] Null Pointer Exception when accessing broadcast variable to store a hashmap in Java

2015-06-23 Thread Benjamin Fradet
Are you using checkpointing?

I had a similar issue when recreating a streaming context from a checkpoint,
because broadcast variables are not checkpointed.
On 23 Jun 2015 5:01 pm, Nipun Arora nipunarora2...@gmail.com wrote:

 Hi,

 I have a Spark Streaming application where I need to access a model saved
 in a HashMap.
 I have *no problems running the same code with broadcast variables in
 the local installation.* However, I get a *null pointer exception* when
 I deploy it on my Spark test cluster.


 I have stored a model in a HashMap<String, FieldModel>, which is
 serializable. I use a broadcast variable declared as a global static
 variable to broadcast this hashmap:

 public static Broadcast<HashMap<String, FieldModel>> br;

 HashMap<String, FieldModel> hm = checkerObj.getModel(esserver, type);

 br = ssc.sparkContext().broadcast(hm);


 I need to access this model in my mapper phase, and do some operation
 based on the checkup. The following is a snippet of how I access the
 broadcast variable.


 JavaDStream<Tuple3<Long, Double, String>> split =
     matched.map(new GenerateType2Scores());


 class GenerateType2Scores
     implements Function<String, Tuple3<Long, Double, String>> {
   @Override
   public Tuple3<Long, Double, String> call(String s) throws Exception {
     Long time = Type2ViolationChecker.getMTS(s);
     // Reads the static broadcast field declared above.
     HashMap<String, FieldModel> temphm = Type2ViolationChecker.br.value();

     Double score = Type2ViolationChecker.getAnomalyScore(temphm, s);
     return new Tuple3<Long, Double, String>(time, score, s);
   }
 }

 The temphm should refer to the hashmap stored in the broadcast variable.
 Can anyone help me understand the correct way to access broadcast
 variables in Java?

 Thanks
 Nipun



Re: [Spark Streaming] Null Pointer Exception when accessing broadcast variable to store a hashmap in Java

2015-06-23 Thread Nipun Arora
btw. just for reference I have added the code in a gist:

https://gist.github.com/nipunarora/ed987e45028250248edc

and a stackoverflow reference here:

http://stackoverflow.com/questions/31006490/broadcast-variable-null-pointer-exception-in-spark-streaming




Re: [Spark Streaming] Null Pointer Exception when accessing broadcast variable to store a hashmap in Java

2015-06-23 Thread Nipun Arora
I don't think I have explicitly checkpointed anywhere. Unless it happens
internally in some interface, I don't believe the application is checkpointed.

Thanks for the suggestion though..

Nipun





Re: [Spark Streaming] Null Pointer Exception when accessing broadcast variable to store a hashmap in Java

2015-06-23 Thread Nipun Arora
I found the error, so I am just posting it to the list.

It seems broadcast variables cannot be declared static.
If you do, you get a null pointer exception.

Thanks
Nipun






Re: [Spark Streaming] Null Pointer Exception when accessing broadcast variable to store a hashmap in Java

2015-06-23 Thread Tathagata Das
Yes, this is known behavior. Some static state is not serialized as part
of a task.
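
To illustrate why a static field can come back null on a worker, here is a
minimal sketch that uses plain Java serialization only, with no Spark
involved. The class names (StaticFieldDemo, StaticLookup, InstanceLookup) and
the HashMap<String, Double> model are hypothetical stand-ins, not code from
the thread. Setting the static field to null after serializing is an
assumption used to simulate a fresh executor JVM in which the driver-side
assignment never ran.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

// Hypothetical demo; none of these names come from the thread's code.
final class StaticFieldDemo {

    // Stand-in for a function object that reads its state from a static field.
    static final class StaticLookup implements Serializable {
        private static final long serialVersionUID = 1L;
        // Static fields belong to the class, so they are not written
        // into the serialized form of any instance.
        static HashMap<String, Double> model;

        Double call(String key) {
            return model.get(key); // throws NullPointerException if model is null
        }
    }

    // Same lookup, but the state is an instance field set via the constructor.
    static final class InstanceLookup implements Serializable {
        private static final long serialVersionUID = 1L;
        private final HashMap<String, Double> model;

        InstanceLookup(HashMap<String, Double> model) {
            this.model = model;
        }

        Double call(String key) {
            return model.get(key);
        }
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static HashMap<String, Double> sampleModel() {
        HashMap<String, Double> m = new HashMap<>();
        m.put("field", 0.5);
        return m;
    }

    // Returns true if the static-field version fails after the simulated ship.
    static boolean staticVersionThrowsNpe() {
        StaticLookup.model = sampleModel();          // "driver" sets the static
        byte[] shipped = serialize(new StaticLookup());
        StaticLookup.model = null;                   // simulate a fresh "executor" JVM
        StaticLookup remote = (StaticLookup) deserialize(shipped);
        try {
            remote.call("field");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The instance field travels inside the serialized object.
    static Double instanceVersionLookup() {
        byte[] shipped = serialize(new InstanceLookup(sampleModel()));
        InstanceLookup remote = (InstanceLookup) deserialize(shipped);
        return remote.call("field");
    }

    public static void main(String[] args) {
        System.out.println("static version throws NPE: " + staticVersionThrowsNpe());
        System.out.println("instance version lookup: " + instanceVersionLookup());
    }
}
```

Applied to the code in this thread, the analogous change would presumably be
to give GenerateType2Scores a non-static Broadcast field that is passed in
through its constructor, so the broadcast handle is serialized together with
the function object instead of being read from a static field.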
