How are exceptions in map functions handled in Spark?
I'm trying to get a clear idea of how exceptions are handled in Spark. Is there somewhere I can read about this? I'm on Spark 0.7.

For some reason I was under the impression that such exceptions are swallowed: the value that produced them is ignored, but the exception is logged. However, right now we're seeing the task retried over and over in an infinite loop, because there's a value that always generates an exception.

John
Re: How are exceptions in map functions handled in Spark?
Exceptions should be sent back to the driver program and logged there (with a SparkException thrown if a task fails more than 4 times), but there were some bugs before where this did not happen for non-serializable exceptions. We changed it to pass back only the stack traces (as text), which should always work.

I'd recommend trying a newer Spark version; 0.8 should be an easy upgrade from 0.7.

Matei

On Apr 4, 2014, at 10:40 AM, John Salvatier jsalvat...@gmail.com wrote:
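Since a record that always throws will fail the task on every retry (4 attempts by default) and then abort the job, one common workaround is to catch exceptions per record inside the map function itself, so bad records are marked rather than fatal. The sketch below uses a plain Python `map` over a list as a stand-in for an RDD so it runs without a cluster; the `parse` function and the record values are made up for illustration. With PySpark the same wrapper would be applied via `rdd.map(safe(parse))`.

```python
def parse(record):
    # Stand-in for a map function that throws on one bad value,
    # like the one described in the thread.
    return int(record)

def safe(f):
    """Wrap a record-level function so a bad record yields an error
    marker instead of failing the whole task (which Spark would
    otherwise retry, then abort the job)."""
    def wrapped(record):
        try:
            return ("ok", f(record))
        except Exception as e:
            return ("err", (record, repr(e)))
    return wrapped

records = ["1", "2", "not-a-number", "4"]
# On an RDD this would be sc.parallelize(records).map(safe(parse)).
results = list(map(safe(parse), records))

good = [v for tag, v in results if tag == "ok"]
bad = [v for tag, v in results if tag == "err"]
print(good)      # [1, 2, 4]
print(len(bad))  # 1
```

The "err" tuples keep the offending record and the exception text, so they can be counted or inspected on the driver instead of being silently dropped.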
Re: How are exceptions in map functions handled in Spark?
Is there a way to log exceptions inside a mapping function? logError and logInfo seem to freeze things.
Re: How are exceptions in map functions handled in Spark?
Btw, thank you for your help.
Re: How are exceptions in map functions handled in Spark?
Logging inside a map function shouldn't freeze things. The messages should show up in the worker logs, since the code is executed on the executors. If the function throws an exception, however, it will be propagated to the driver (as a SparkException) after the task has failed 4 or more times (by default).