Re: Incomplete data when reading from S3

2016-03-19 Thread DB Tsai
You need to use wholeTextFiles to read the whole file at once. Otherwise, it can be split.

DB Tsai - Sent From My Phone

On Mar 17, 2016 12:45 AM, "Blaž Šnuderl" wrote:
> Hi.
>
> We have json data stored in S3 (json record per line). When reading the
> data from s3 using the
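For reference, a minimal sketch of what the wholeTextFiles suggestion could look like in PySpark. wholeTextFiles returns (path, content) pairs, so each file's content still has to be split into lines before decoding. The helper name records_from_file is made up for illustration, and `sc` and `paths` are assumed to be the SparkContext and input paths from the original question. Note that wholeTextFiles loads each file fully into memory, so this probably only suits reasonably small files.

```python
import json


def records_from_file(path_and_content):
    """Yield one parsed JSON object per non-empty line of a whole file.

    Takes the (path, content) pair that wholeTextFiles produces.
    The helper name is hypothetical, not part of Spark.
    """
    _path, content = path_and_content
    for line in content.split("\n"):
        line = line.strip()
        if line:  # skip blank lines, e.g. after a trailing newline
            yield json.loads(line)


# Possible usage, assuming an existing SparkContext `sc` and the
# `paths` value from the original question:
# records = sc.wholeTextFiles(paths).flatMap(records_from_file)
```

The parsing logic is kept in a plain function so it can be checked locally on a sample string before running it on the cluster.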

Incomplete data when reading from S3

2016-03-18 Thread Blaž Šnuderl
Hi.

We have json data stored in S3 (json record per line). When reading the data from S3 using the following code, we started noticing json decode errors.

    sc.textFile(paths).map(json.loads)

After a bit more investigation we noticed an incomplete line, basically the line was
> {"key": "value",
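One way to narrow down a problem like this is to stop the job from failing on the first bad record and collect the offending lines instead. Below is a rough sketch, not what the poster actually ran: parse_or_flag is a hypothetical helper name, and `sc` and `paths` are assumed to be the same as in the snippet above.

```python
import json


def parse_or_flag(line):
    """Return ("ok", obj) if the line decodes as JSON, else ("bad", line).

    json.loads raises ValueError (JSONDecodeError on Python 3) for
    malformed input, so catching ValueError covers both cases.
    """
    try:
        return ("ok", json.loads(line))
    except ValueError:
        return ("bad", line)


# Possible usage, assuming an existing SparkContext `sc` and `paths`:
# parsed = sc.textFile(paths).map(parse_or_flag)
# bad_lines = parsed.filter(lambda t: t[0] == "bad").map(lambda t: t[1])
# print(bad_lines.take(10))  # inspect a few of the truncated lines
```

Comparing the flagged lines against the raw objects in S3 should show whether the truncation is already in the stored files or only appears on read.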