Hi! I'm struggling with the following problem: I have a couple of Spark Streaming jobs that keep state (using mapWithState, and in one case updateStateByKey) and write their results to HBase. One of the Streaming jobs needs the results that the other Streaming job writes to HBase. As currently implemented, the state function reads the data it needs for its calculations from HBase. The drawback is that every time the state function is called, a new connection to HBase is opened. The Spark Streaming guide suggests reusing the same connection within one partition, but that advice applies only to /actions/ (i.e. foreachRDD). How would you do it for transformations (like mapWithState)?
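One pattern that is sometimes suggested for this situation is holding the connection in a singleton object with a lazy val, so it is created once per executor JVM and reused by every state-function invocation on that executor. The sketch below illustrates the idea; the names (HBaseConnectionHolder, FakeConnection) are hypothetical, and FakeConnection stands in for a real org.apache.hadoop.hbase.client.Connection, which you would normally create via ConnectionFactory.createConnection and which is not opened here so the example stays self-contained.

```scala
// Stand-in for org.apache.hadoop.hbase.client.Connection; in a real job
// you would wrap an actual HBase connection here instead.
class FakeConnection {
  def get(rowKey: String): String = s"value-for-$rowKey"
}

// A singleton object's lazy val is initialized at most once per JVM,
// i.e. once per executor. Every call of the state function running on
// that executor then reuses the same connection instead of opening a
// new one per invocation. Closures serialized from the driver only
// carry a reference to the object, not the connection itself.
object HBaseConnectionHolder {
  lazy val connection: FakeConnection = new FakeConnection()
}

// Inside the StateSpec function you would then call, for example:
//   val enriched = HBaseConnectionHolder.connection.get(key)
```

A shutdown hook (or an explicit close in a JVM exit hook) is typically used to close the connection when the executor dies, since there is no per-partition teardown point inside a transformation.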
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Reusing-HBase-connection-in-transformations-tp28389.html Sent from the Apache Spark User List mailing list archive at Nabble.com.