Sean,

I create a streaming context (ssc) near the bottom of the code and apply
foreachRDD to the resulting DStream so that I can get at the underlying
RDD; on that RDD I then call foreach and pass in my function, which
applies the storing logic.
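
In code form, the shape is roughly this (simplified; storeRow is a
stand-in for my real storing function):

import org.apache.spark.streaming.{Seconds, StreamingContext}

// `sc` is the SparkContext that spark-shell already provides
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999)
val words = lines.map(_.split(","))
words.foreachRDD { rdd =>
  rdd.foreach(row => storeRow(row)) // storeRow applies the storing logic
}
ssc.start()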

Is there a different approach I should be using?

Thanks for the help.


On Wed, Sep 3, 2014 at 2:43 PM, Sean Owen-2 [via Apache Spark User List] <
ml-node+s1001560n13385...@n3.nabble.com> wrote:

> This doesn't seem to have to do with HBase per se. Some function is
> getting the StreamingContext into the closure, and that won't work. Is
> this exactly the code? As posted it doesn't reference a StreamingContext,
> so is there maybe a different version in reality that tries to use the
> StreamingContext inside a function?
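>
> For example, something along these lines (purely illustrative, not
> necessarily your code) would drag the StreamingContext into the closure:
>
> // hypothetical sketch: any reference to ssc inside the task function
> // forces Spark to serialize ssc along with the closure
> words.foreachRDD { rdd =>
>   rdd.foreach { row =>
>     println(ssc.sparkContext.appName) // capturing ssc -> NotSerializableException
>   }
> }
>
> Note that in spark-shell the capture can be indirect: vals and defs from
> the same REPL session live on a shared wrapper object, so a function can
> pull ssc in without naming it. A common way around all of this is to open
> the connection per partition inside the closure, so nothing from the
> driver has to be serialized (a sketch against the HBase 0.98 client API):
>
> import org.apache.hadoop.hbase.HBaseConfiguration
> import org.apache.hadoop.hbase.client.{HTable, Put}
> import org.apache.hadoop.hbase.util.Bytes
>
> words.foreachRDD { rdd =>
>   rdd.foreachPartition { rows =>
>     val hConf = HBaseConfiguration.create() // built on the executor, not shipped
>     val hTable = new HTable(hConf, "table")
>     rows.foreach { row =>
>       val thePut = new Put(Bytes.toBytes(row(0)))
>       thePut.add(Bytes.toBytes("cf"), Bytes.toBytes(row(0)), Bytes.toBytes(row(0)))
>       hTable.put(thePut)
>     }
>     hTable.close()
>   }
> }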
>
> On Wed, Sep 3, 2014 at 10:36 PM, Ted Yu <[hidden email]> wrote:
>
> > Adding back user@
> >
> > I am not familiar with the NotSerializableException. Can you show the
> > full stack trace?
> >
> > See SPARK-1297 for the changes you need to make so that Spark works
> > with HBase 0.98.
> >
> > Cheers
> >
> >
> > On Wed, Sep 3, 2014 at 2:33 PM, Kevin Peng <[hidden email]> wrote:
> >>
> >> Ted,
> >>
> >> The hbase-site.xml is in the classpath (I had worse issues before...
> >> until I figured out that it wasn't on the path).
> >>
> >> I get the following error in the spark-shell:
> >> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> >> not serializable: java.io.NotSerializableException:
> >> org.apache.spark.streaming.StreamingContext
> >>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.sc
> >> ...
> >>
> >> I also double-checked the HBase table, just in case, and nothing new
> >> was written in there.
> >>
> >> I am using HBase version 0.98.1-cdh5.1.0, the default one with the
> >> CDH 5.1.0 distro.
> >>
> >> Thank you for the help.
> >>
> >>
> >> On Wed, Sep 3, 2014 at 2:09 PM, Ted Yu <[hidden email]> wrote:
> >>>
> >>> Is hbase-site.xml in the classpath?
> >>> Do you observe any exception from the code below or in the region
> >>> server log?
> >>>
> >>> Which HBase release are you using?
> >>>
> >>>
> >>> On Wed, Sep 3, 2014 at 2:05 PM, kpeng1 <[hidden email]> wrote:
> >>>>
> >>>> I have been trying to understand how Spark Streaming and HBase
> >>>> connect, but have not been successful. What I am trying to do is,
> >>>> given a Spark stream, process that stream and store the results in an
> >>>> HBase table. So far this is what I have:
> >>>>
> >>>> import org.apache.spark.SparkConf
> >>>> import org.apache.spark.streaming.{Seconds, StreamingContext}
> >>>> import org.apache.spark.streaming.StreamingContext._
> >>>> import org.apache.spark.storage.StorageLevel
> >>>> import org.apache.hadoop.hbase.HBaseConfiguration
> >>>> import org.apache.hadoop.hbase.client.{HBaseAdmin,HTable,Put,Get}
> >>>> import org.apache.hadoop.hbase.util.Bytes
> >>>>
> >>>> def blah(row: Array[String]) {
> >>>>   val hConf = new HBaseConfiguration()
> >>>>   val hTable = new HTable(hConf, "table")
> >>>>   val thePut = new Put(Bytes.toBytes(row(0)))
> >>>>   thePut.add(Bytes.toBytes("cf"), Bytes.toBytes(row(0)),
> >>>> Bytes.toBytes(row(0)))
> >>>>   hTable.put(thePut)
> >>>> }
> >>>>
> >>>> val ssc = new StreamingContext(sc, Seconds(1))
> >>>> val lines = ssc.socketTextStream("localhost", 9999,
> >>>> StorageLevel.MEMORY_AND_DISK_SER)
> >>>> val words = lines.map(_.split(","))
> >>>> val store = words.foreachRDD(rdd => rdd.foreach(blah))
> >>>> ssc.start()
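> >>>> // note: spark-shell keeps the driver alive; a standalone app would
> >>>> // also need ssc.awaitTermination() after start()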
> >>>>
> >>>> I am currently running the above code in spark-shell. I am not sure
> >>>> what I am doing wrong.
> >>>>
> >>>
> >>
> >
>



