...@preferred.jp]
Sent: Tuesday, January 20, 2015 9:26 AM
To: Wang, Daoyuan
Cc: user
Subject: Re: MatchError in JsonRDD.toLong
Hi,
On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan
daoyuan.w...@intel.com wrote:
The second parameter of jsonRDD is the sampling ratio when we infer
Hi,
On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan daoyuan.w...@intel.com
wrote:
The second parameter of jsonRDD is the sampling ratio when we infer schema.
OK, I was aware of this, but I guess I understand the problem now. My
sampling ratio is so low that I only see the Long values of data
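
The failure mode described above can be sketched in plain Scala, without Spark. This is a simplified stand-in, not Spark's actual JsonRDD code: the object SamplingSketch and the functions inferType, toLong, toDouble and convert are hypothetical names chosen for illustration, and the "sample" is simply the first n values so that the result is deterministic. It shows how a type inferred from a too-small sample leads to a scala.MatchError once an unsampled Double value is converted.

```scala
import scala.util.Try

// Hypothetical, simplified model of sampling-based schema inference.
// All names here are illustrative assumptions, not Spark APIs.
object SamplingSketch {
  sealed trait FieldType
  case object LongType extends FieldType
  case object DoubleType extends FieldType

  // Infer the column type from the first `n` values only (the "sample").
  def inferType(values: Seq[Any], n: Int): FieldType =
    if (values.take(n).exists(_.isInstanceOf[Double])) DoubleType else LongType

  // Handles integral inputs only; a Double matches no case and
  // raises scala.MatchError.
  def toLong(value: Any): Long = value match {
    case i: Int  => i.toLong
    case l: Long => l
  }

  // Accepts both integral and floating-point inputs.
  def toDouble(value: Any): Double = value match {
    case i: Int    => i.toDouble
    case l: Long   => l.toDouble
    case d: Double => d
  }

  // Convert every value, sampled or not, to the inferred type.
  def convert(values: Seq[Any], t: FieldType): Seq[Any] = t match {
    case LongType   => values.map(toLong)
    case DoubleType => values.map(toDouble)
  }

  def main(args: Array[String]): Unit = {
    val column: Seq[Any] = Seq(1L, 2L, 3L, 4.5)

    // The sample covers only the Long values, so LongType is inferred
    // and converting the full column fails on 4.5.
    val sampled = inferType(column, 3)
    println(Try(convert(column, sampled)).isFailure)

    // Looking at every value infers DoubleType and conversion succeeds.
    val full = inferType(column, column.size)
    println(convert(column, full))
  }
}
```

If the same reasoning applies to the real job, raising the second argument of jsonRDD toward 1.0 should let inference see the non-Long values, at the cost of scanning more of the input.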
Hi again,
On Fri, Jan 16, 2015 at 4:25 PM, Tobias Pfeiffer t...@preferred.jp wrote:
Now I'm wondering where this comes from (I haven't touched this component
in a while, nor upgraded Spark etc.) [...]
So the reason that the error is showing up now is that suddenly data from a
different
Hi Tobias,
Can you provide how you create the JsonRDD?
Thanks,
Daoyuan
From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Friday, January 16, 2015 4:01 PM
To: user
Subject: Re: MatchError in JsonRDD.toLong
Hi again,
On Fri, Jan 16, 2015 at 4:25 PM, Tobias Pfeiffer
t
Hi,
On Fri, Jan 16, 2015 at 5:55 PM, Wang, Daoyuan daoyuan.w...@intel.com
wrote:
Can you provide how you create the JsonRDD?
This should be reproducible in the Spark shell:
-
import org.apache.spark.sql._
val sqlc = new SQLContext(sc)
The second parameter of jsonRDD is the sampling ratio when we infer schema.
Thanks,
Daoyuan
From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Friday, January 16, 2015 5:11 PM
To: Wang, Daoyuan
Cc: user
Subject: Re: MatchError in JsonRDD.toLong
Hi,
On Fri, Jan 16, 2015 at 5:55 PM, Wang
Hi,
I am experiencing a weird error that suddenly popped up in my unit tests. I
have a couple of HDFS files in JSON format and my test is basically
creating a JsonRDD and then issuing a very simple SQL query over it. This
used to work fine, but now suddenly I get:
15:58:49.039 [Executor task