Spark Receivers

2015-02-16 Thread Mark Payne
Hello, I am one of the committers for Apache NiFi (incubating). I am looking to integrate NiFi with Spark streaming. I have created a custom Receiver to receive data from NiFi. I’ve tested it locally, and things seem to work well. I feel it would make more sense to have the NiFi Receiver in

Re: HiveContext cannot be serialized

2015-02-16 Thread Reynold Xin
Michael - it is already transient. This should probably be considered a bug in the Scala compiler, but we can easily work around it by removing the use of destructuring binding. On Mon, Feb 16, 2015 at 10:41 AM, Michael Armbrust mich...@databricks.com wrote: I'd suggest marking the HiveContext as

Re: HiveContext cannot be serialized

2015-02-16 Thread Reynold Xin
I submitted a patch https://github.com/apache/spark/pull/4628 On Mon, Feb 16, 2015 at 10:59 AM, Michael Armbrust mich...@databricks.com wrote: I was suggesting you mark the variable that is holding the HiveContext '@transient' since the scala compiler is not correctly propagating this

Re: HiveContext cannot be serialized

2015-02-16 Thread Michael Armbrust
I'd suggest marking the HiveContext as @transient since it's not valid to use it on the slaves anyway. On Mon, Feb 16, 2015 at 4:27 AM, Haopu Wang hw...@qilinsoft.com wrote: When I'm investigating this issue (in the end of this email), I take a look at HiveContext's code and find this change

Re: Building Spark with Pants

2015-02-16 Thread Ryan Williams
I worked on Pants at Foursquare for a while and when coming up to speed on Spark was interested in the possibility of building it with Pants, particularly because allowing developers to share/reuse each other's compilation artifacts seems like it would be a boon to productivity; that was/is Pants'

Re: HiveContext cannot be serialized

2015-02-16 Thread Michael Armbrust
I was suggesting you mark the variable that is holding the HiveContext '@transient' since the scala compiler is not correctly propagating this through the tuple extraction. This is only a workaround. We can also remove the tuple extraction. On Mon, Feb 16, 2015 at 10:47 AM, Reynold Xin

org.apache.spark.sql.sources.DDLException: Unsupported dataType: [1.1] failure: ``varchar'' expected but identifier char found in spark-sql

2015-02-16 Thread Qiuzhuang Lian
Hi, I am not sure whether this has been reported already. I ran into this error in the spark-sql shell, built from the latest Spark git trunk: spark-sql describe qiuzhuang_hcatlog_import; 15/02/17 14:38:36 ERROR SparkSQLDriver: Failed in [describe qiuzhuang_hcatlog_import]

RE: HiveContext cannot be serialized

2015-02-16 Thread Haopu Wang
Reynold and Michael, thank you so much for the quick response. This problem also happens on branch-1.1, would you mind resolving it on branch-1.1 also? Thanks again! From: Reynold Xin [mailto:r...@databricks.com] Sent: Tuesday, February 17, 2015 3:44 AM

HiveContext cannot be serialized

2015-02-16 Thread Haopu Wang
When I'm investigating this issue (in the end of this email), I take a look at HiveContext's code and find this change (https://github.com/apache/spark/commit/64945f868443fbc59cb34b34c16d782dda0fb63d#diff-ff50aea397a607b79df9bec6f2a841db): - @transient protected[hive] lazy val hiveconf = new

Re: Replacing Jetty with TomCat

2015-02-16 Thread Sean Owen
There's no particular reason you have to remove the embedded Jetty server, right? It doesn't prevent you from using it inside another app that happens to run in Tomcat. You won't be able to switch it out without rewriting a fair bit of code, no, but you don't need to. On Mon, Feb 16, 2015 at 5:08