Update:
I tried surrounding the problematic code with try/catch, but that doesn't
do the trick:
try {
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext._
  val jsonFiles = sqlContext.jsonFile("/requests.loading")
} catch {
  case _: Throwable => // catch all exceptions and do nothing with them
}
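
Since the parse error is thrown inside the tasks while the schema is being
inferred, I suspect a driver-side try/catch can only swallow the failure of
the whole job rather than skip the individual bad files. One workaround I'm
considering is to pre-filter malformed lines myself before handing them to
jsonRDD. A rough, untested sketch (variable names are mine; it assumes one
JSON object per line, which jsonFile expects anyway):

import com.fasterxml.jackson.databind.ObjectMapper

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// Read the files as plain text and keep only the lines that parse as JSON.
// ObjectMapper isn't serializable, so build one per partition.
val rawLines = sc.textFile("/requests.loading")
val validLines = rawLines.mapPartitions { lines =>
  val mapper = new ObjectMapper()
  lines.filter { line =>
    try { mapper.readTree(line); true }
    catch { case _: Exception => false } // drop malformed records
  }
}

// Schema inference now only sees records that parsed cleanly.
val jsonFiles = sqlContext.jsonRDD(validLines)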


Any ideas?

Thanks,
Daniel

On Thu, Nov 20, 2014 at 10:20 AM, Daniel Haviv <danielru...@gmail.com>
wrote:

> Hi,
> I'm loading a bunch of JSON files and there seem to be problems with
> specific files (either schema changes or incomplete files).
> I'd like to catch the inconsistent files, but I'm not sure how to do it.
>
> This is the exception I get:
> 14/11/20 00:13:49 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
> 0.0, whose tasks have all completed, from pool
> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> 3027 in stage 0.0 failed 4 times, most recent failure: Lost task 3027.3 in
> stage 0.0 (TID 3100, HDdata2):
> com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input: was
> expecting closing quote for a string value
>  at [Source: java.io.StringReader@39a8eab6; line: 1, column: 1805]
>
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1524)
>
> and this is the code causing it:
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext._
>
> val jsonFiles = sqlContext.jsonFile("/requests.loading")
>
>
> How can I do it?
>
> Thanks,
> Daniel
>
>
>
