I have figured it out.

As shown in the code below, if the HiveContext hc is created in the MyActor 
companion object and then used to create the database in response to a 
message, it throws a NullPointerException. Creating the HiveContext inside the 
MyActor class instead fixes this. I also tested the code with Thread in place 
of Actor. The problem and the fix are similar.

Du

——
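import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

abstract class MyMessage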
case object CreateDB extends MyMessage

object MyActor {
  def init(_sc: SparkContext) = {
    if( actorSystem == null || actorRef == null ) {
      actorSystem = ActorSystem("root")
      actorRef = actorSystem.actorOf(Props(new MyActor(_sc)), "myactor")
    }
    //hc = new MyHiveContext(_sc)
  }

  def !(m: MyMessage) {
    actorRef ! m
  }

  //var hc: MyHiveContext = _
  private var actorSystem: ActorSystem = null
  private var actorRef: ActorRef = null
}

class MyActor(sc: SparkContext) extends Actor {
  val hc = new MyHiveContext(sc)
  def receive: Receive = {
    case CreateDB => hc.createDB()
  }
}

class MyHiveContext(sc: SparkContext) extends HiveContext(sc) {
  def createDB() {...}
}
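
One possible explanation, which nobody in this thread has confirmed, is that the state the Hive driver reads is stored per thread, so a context initialized on the main thread looks uninitialized from the actor's thread. The sketch below shows only that general effect with a plain java.lang.ThreadLocal. FakeSession, ThreadLocalDemo and onNewThread are hypothetical names made up for illustration and are not part of Spark or Hive.

```scala
// Hypothetical stand-in for a library that keeps its session state per thread.
object FakeSession {
  // A ThreadLocal without an initial value yields null on every new thread.
  private val current = new ThreadLocal[String]

  def start(conf: String): Unit = current.set(conf)
  def get: String = current.get()
}

object ThreadLocalDemo {
  // Runs `body` on a fresh thread and returns its result to the caller.
  def onNewThread[A](body: => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = Some(body) }
    })
    t.start()
    t.join() // join() makes the write to `result` visible to the caller
    result.get
  }

  def main(args: Array[String]): Unit = {
    FakeSession.start("conf-from-main")
    // The thread that called start() sees its own value.
    println(FakeSession.get)
    // A different thread sees null, because nothing was set on that thread.
    println(onNewThread(FakeSession.get))
    // State set on the worker thread itself is visible there.
    println(onNewThread { FakeSession.start("conf-from-worker"); FakeSession.get })
  }
}
```

If that assumption holds, it would match what is described above: constructing the context on the thread that later uses it avoids the null value.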


From: "Chester @work" <ches...@alpinenow.com>
Date: Thursday, September 18, 2014 at 7:17 AM
To: Du Li <l...@yahoo-inc.com.INVALID>
Cc: Michael Armbrust <mich...@databricks.com>, 
"Cheng, Hao" <hao.ch...@intel.com>, 
"user@spark.apache.org" <user@spark.apache.org>
Subject: Re: problem with HiveContext inside Actor


Akka actors are managed by a thread pool, so the same actor can run on 
different threads.

If you create the HiveContext in the actor, is it possible that you are 
essentially creating different instances of HiveContext?
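
To make the thread pool point concrete, here is a minimal sketch that uses a plain java.util.concurrent pool as a stand-in for an Akka dispatcher (that substitution is an assumption made to keep the example dependency-free). It records which thread each task runs on. PoolThreadsDemo and threadNames are hypothetical names.

```scala
import java.util.concurrent.{Callable, Executors}

object PoolThreadsDemo {
  // Submits `n` tasks to a fixed pool and returns the name of the thread
  // each task ran on, in submission order.
  def threadNames(n: Int, poolSize: Int): Seq[String] = {
    val pool = Executors.newFixedThreadPool(poolSize)
    try {
      val futures = (1 to n).map { _ =>
        pool.submit(new Callable[String] {
          def call(): String = Thread.currentThread().getName
        })
      }
      futures.map(_.get()) // block until every task has finished
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val caller = Thread.currentThread().getName
    val names = threadNames(8, 4)
    println("caller thread: " + caller)
    println("task threads:  " + names.distinct.mkString(", "))
  }
}
```

None of the tasks runs on the calling thread, and which pool thread picks up a given task can vary from run to run.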


On Sep 17, 2014, at 10:14 PM, Du Li <l...@yahoo-inc.com.INVALID> wrote:



Thanks for your reply.

Michael: No. I only create one HiveContext in the code.

Hao: Yes. I subclass HiveContext and define my own function to create a 
database, and then subclass the akka Actor to call that function in response to 
an abstract message. Following your suggestion, I called 
println(sessionState.getConf.getAllProperties), which printed tons of 
properties; however, the same NullPointerException was still thrown.

As mentioned, the weird thing is that everything works fine if I simply call 
actor.hiveContext.createDB() directly. But it throws the NullPointerException 
from Driver.java if I do "actor ! CreateSomeDB", which seems to me to be the 
same thing, because the actor does nothing but call createDB().

Du





From: Michael Armbrust <mich...@databricks.com>
Date: Wednesday, September 17, 2014 at 7:40 PM
To: "Cheng, Hao" <hao.ch...@intel.com>
Cc: Du Li <l...@yahoo-inc.com.invalid>, 
"user@spark.apache.org" <user@spark.apache.org>
Subject: Re: problem with HiveContext inside Actor


- dev

Is it possible that you are constructing more than one HiveContext in a single 
JVM?  Due to global state in Hive code this is not allowed.

Michael


On Wed, Sep 17, 2014 at 7:21 PM, Cheng, Hao <hao.ch...@intel.com> wrote:

Hi Du,
I am not sure what you mean by “triggers the HiveContext to create a 
database”. Did you create a subclass of HiveContext? Just be sure you call 
“HiveContext.sessionState” eagerly, since that sets the proper “hiveconf” in 
the SessionState; otherwise the HiveDriver will always get a null value when 
retrieving the HiveConf.
Cheng Hao
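
As a rough illustration of why "eagerly" can matter, and assuming the field in question is initialized lazily (an assumption made for this sketch, not something stated in the thread), a Scala lazy val runs its initializer on whichever thread touches it first. LazyHolder, LazyInitDemo and firstTouchedOn are hypothetical names.

```scala
// Hypothetical stand-in: remembers which thread ran the initializer.
class LazyHolder {
  lazy val initThread: String = Thread.currentThread().getName
}

object LazyInitDemo {
  // Constructs a holder on the calling thread, but touches the lazy val
  // for the first time on a new thread named `threadName`.
  def firstTouchedOn(threadName: String): String = {
    val holder = new LazyHolder
    var seen = ""
    val t = new Thread(new Runnable {
      def run(): Unit = { seen = holder.initThread } // first access happens here
    }, threadName)
    t.start()
    t.join()
    seen
  }

  def main(args: Array[String]): Unit = {
    val eager = new LazyHolder
    val forced = eager.initThread       // forced on the constructing thread
    println(forced)                     // name of the thread running main
    println(firstTouchedOn("worker-1")) // initializer ran on the worker thread
  }
}
```

Forcing the value right after construction pins the initialization to the constructing thread instead of leaving it to whichever thread asks first.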
From: Du Li [mailto:l...@yahoo-inc.com.INVALID]
Sent: Thursday, September 18, 2014 7:51 AM
To: user@spark.apache.org; d...@spark.apache.org
Subject: problem with HiveContext inside Actor


Hi,


I wonder whether anybody has had a similar experience or has any suggestions 
here.

I have an akka Actor that processes database requests sent as high-level 
messages. Inside this Actor, it creates a HiveContext object that does the 
actual db work. The main thread creates the needed SparkContext and passes it 
to the Actor to create the HiveContext.

When a message is sent to the Actor, it is processed properly, except that when 
the message triggers the HiveContext to create a database, it throws a 
NullPointerException in hive.ql.Driver.java, which suggests that its conf 
variable is not initialized.

Ironically, it works fine if my main thread directly calls actor.hiveContext to 
create the database. The Spark version is 1.1.0.


Thanks,

Du









