RE: MatchError in JsonRDD.toLong

2015-01-19 Thread Wang, Daoyuan
...@preferred.jp] Sent: Tuesday, January 20, 2015 9:26 AM To: Wang, Daoyuan Cc: user Subject: Re: MatchError in JsonRDD.toLong Hi, On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan daoyuan.w...@intel.com wrote: The second parameter of jsonRDD is the sampling ratio when we infer

Re: MatchError in JsonRDD.toLong

2015-01-19 Thread Tobias Pfeiffer
Hi, On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan daoyuan.w...@intel.com wrote: The second parameter of jsonRDD is the sampling ratio when we infer schema. OK, I was aware of this, but I guess I understand the problem now. My sampling ratio is so low that I only see the Long values of data
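
Editor's sketch of the failure mode described above: a schema inferred from a small sample can fix a column as Long, and a later value of another type then fails in a strict conversion. This is a toy model in plain Scala, not Spark's actual JsonRDD code; the names SamplingSketch, ColType, inferType, toLong and convert are hypothetical and exist only for illustration.

```scala
object SamplingSketch {
  // Toy column types for illustration; NOT Spark's internal representation.
  sealed trait ColType
  case object LongCol extends ColType
  case object StringCol extends ColType

  // Infer the column type from the sampled values only:
  // Long if every sampled value is a Long, otherwise fall back to String.
  def inferType(sample: Seq[Any]): ColType =
    if (sample.forall(_.isInstanceOf[Long])) LongCol else StringCol

  // Hypothetical strict converter: the match covers numeric inputs only,
  // so any other value raises scala.MatchError.
  def toLong(value: Any): Long = value match {
    case i: Int  => i.toLong
    case l: Long => l
  }

  // Convert one value according to the type that was inferred earlier.
  def convert(value: Any, colType: ColType): Any = colType match {
    case LongCol   => toLong(value)
    case StringCol => value.toString
  }
}
```

With the values 1L, 2L, 3L and "n/a", inferring from only the first three yields LongCol, and converting "n/a" under that type throws a MatchError. Inferring from all four yields StringCol, and the conversion succeeds. In the thread's terms, this presumably corresponds to raising the sampling ratio passed as the second argument of jsonRDD so that the sample covers the non-Long values.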

Re: MatchError in JsonRDD.toLong

2015-01-16 Thread Tobias Pfeiffer
Hi again, On Fri, Jan 16, 2015 at 4:25 PM, Tobias Pfeiffer t...@preferred.jp wrote: Now I'm wondering where this comes from (I haven't touched this component in a while, nor upgraded Spark etc.) [...] So the reason that the error is showing up now is that suddenly data from a different

RE: MatchError in JsonRDD.toLong

2015-01-16 Thread Wang, Daoyuan
Hi Tobias, Can you provide how you create the JsonRDD? Thanks, Daoyuan From: Tobias Pfeiffer [mailto:t...@preferred.jp] Sent: Friday, January 16, 2015 4:01 PM To: user Subject: Re: MatchError in JsonRDD.toLong Hi again, On Fri, Jan 16, 2015 at 4:25 PM, Tobias Pfeiffer t

Re: MatchError in JsonRDD.toLong

2015-01-16 Thread Tobias Pfeiffer
Hi, On Fri, Jan 16, 2015 at 5:55 PM, Wang, Daoyuan daoyuan.w...@intel.com wrote: Can you provide how you create the JsonRDD? This should be reproducible in the Spark shell: - import org.apache.spark.sql._ val sqlc = new SQLContext(sc)

RE: MatchError in JsonRDD.toLong

2015-01-16 Thread Wang, Daoyuan
The second parameter of jsonRDD is the sampling ratio when we infer schema. Thanks, Daoyuan From: Tobias Pfeiffer [mailto:t...@preferred.jp] Sent: Friday, January 16, 2015 5:11 PM To: Wang, Daoyuan Cc: user Subject: Re: MatchError in JsonRDD.toLong Hi, On Fri, Jan 16, 2015 at 5:55 PM, Wang

RE: MatchError in JsonRDD.toLong

2015-01-16 Thread Wang, Daoyuan
The second parameter of jsonRDD is the sampling ratio when we infer schema. Thanks, Daoyuan From: Tobias Pfeiffer [mailto:t...@preferred.jp] Sent: Friday, January 16, 2015 5:11 PM To: Wang, Daoyuan Cc: user Subject: Re: MatchError in JsonRDD.toLong Hi, On Fri, Jan

MatchError in JsonRDD.toLong

2015-01-15 Thread Tobias Pfeiffer
Hi, I am experiencing a weird error that suddenly popped up in my unit tests. I have a couple of HDFS files in JSON format and my test is basically creating a JsonRDD and then issuing a very simple SQL query over it. This used to work fine, but now suddenly I get: 15:58:49.039 [Executor task