Sean, I create a streaming context (ssc) near the bottom of the code and apply foreachRDD on the resulting DStream so that I can get access to the underlying RDD; on that RDD I then call foreach, passing in my function that applies the storing logic.
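For reference, here is a sketch of that pattern restructured so that nothing from the driver (such as the StreamingContext) is pulled into the closure: the storing logic lives in a standalone serializable object rather than the shell's top-level scope, and the HTable is created once per partition instead of once per record. This is untested against this exact setup and uses the same HBase 0.98 client API as the code quoted below; the object name `HBaseSink` is just illustrative.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes

// Sketch only. Defining the sink in its own object keeps the closure that
// Spark serializes free of shell-scope references like ssc.
object HBaseSink extends Serializable {
  def save(rows: Iterator[Array[String]]): Unit = {
    // One configuration and one table handle per partition, created on the
    // worker, so neither needs to be serialized from the driver.
    val hConf = HBaseConfiguration.create()
    val hTable = new HTable(hConf, "table")
    rows.foreach { row =>
      val thePut = new Put(Bytes.toBytes(row(0)))
      thePut.add(Bytes.toBytes("cf"), Bytes.toBytes(row(0)), Bytes.toBytes(row(0)))
      hTable.put(thePut)
    }
    hTable.close()
  }
}

// In the streaming job, replace rdd.foreach(blah) with:
words.foreachRDD(rdd => rdd.foreachPartition(HBaseSink.save))
```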
Is there a different approach I should be using? Thanks for the help.

On Wed, Sep 3, 2014 at 2:43 PM, Sean Owen-2 [via Apache Spark User List] <ml-node+s1001560n13385...@n3.nabble.com> wrote:

> This doesn't seem to have to do with HBase per se. Some function is
> getting the StreamingContext into the closure and that won't work. Is
> this exactly the code? Since it doesn't reference a StreamingContext,
> but is there maybe a different version in reality that tries to use
> StreamingContext inside a function?
>
> On Wed, Sep 3, 2014 at 10:36 PM, Ted Yu <[hidden email]> wrote:
> > Adding back user@
> >
> > I am not familiar with the NotSerializableException. Can you show the
> > full stack trace?
> >
> > See SPARK-1297 for the changes you need to make so that Spark works
> > with hbase 0.98
> >
> > Cheers
> >
> > On Wed, Sep 3, 2014 at 2:33 PM, Kevin Peng <[hidden email]> wrote:
> >> Ted,
> >>
> >> The hbase-site.xml is in the classpath (had worse issues before...
> >> until I figured out that it wasn't in the path).
> >>
> >> I get the following error in the spark-shell:
> >> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> >> not serializable: java.io.NotSerializableException:
> >> org.apache.spark.streaming.StreamingContext
> >> at
> >> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.sc
> >> ...
> >>
> >> I also double-checked the hbase table, just in case, and nothing new
> >> is written in there.
> >>
> >> I am using hbase version 0.98.1-cdh5.1.0, the default one with the
> >> CDH5.1.0 distro.
> >>
> >> Thank you for the help.
> >>
> >> On Wed, Sep 3, 2014 at 2:09 PM, Ted Yu <[hidden email]> wrote:
> >>> Is hbase-site.xml in the classpath?
> >>> Do you observe any exception from the code below or in the region
> >>> server log?
> >>>
> >>> Which hbase release are you using?
> >>>
> >>> On Wed, Sep 3, 2014 at 2:05 PM, kpeng1 <[hidden email]> wrote:
> >>>> I have been trying to understand how spark streaming and hbase
> >>>> connect, but have not been successful. What I am trying to do is,
> >>>> given a spark stream, process that stream and store the results in
> >>>> an hbase table. So far this is what I have:
> >>>>
> >>>> import org.apache.spark.SparkConf
> >>>> import org.apache.spark.streaming.{Seconds, StreamingContext}
> >>>> import org.apache.spark.streaming.StreamingContext._
> >>>> import org.apache.spark.storage.StorageLevel
> >>>> import org.apache.hadoop.hbase.HBaseConfiguration
> >>>> import org.apache.hadoop.hbase.client.{HBaseAdmin, HTable, Put, Get}
> >>>> import org.apache.hadoop.hbase.util.Bytes
> >>>>
> >>>> def blah(row: Array[String]) {
> >>>>   val hConf = new HBaseConfiguration()
> >>>>   val hTable = new HTable(hConf, "table")
> >>>>   val thePut = new Put(Bytes.toBytes(row(0)))
> >>>>   thePut.add(Bytes.toBytes("cf"), Bytes.toBytes(row(0)),
> >>>>     Bytes.toBytes(row(0)))
> >>>>   hTable.put(thePut)
> >>>> }
> >>>>
> >>>> val ssc = new StreamingContext(sc, Seconds(1))
> >>>> val lines = ssc.socketTextStream("localhost", 9999,
> >>>>   StorageLevel.MEMORY_AND_DISK_SER)
> >>>> val words = lines.map(_.split(","))
> >>>> val store = words.foreachRDD(rdd => rdd.foreach(blah))
> >>>> ssc.start()
> >>>>
> >>>> I am currently running the above code in spark-shell. I am not sure
> >>>> what I am doing wrong.
> >>>>
> >>>> --
> >>>> View this message in context:
> >>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-into-HBase-tp13378.html
> >>>> Sent from the Apache Spark User List mailing list archive at
> >>>> Nabble.com.
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-into-HBase-tp13378p13386.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.