Not really a good idea.
It breaks the paradigm.
If I understand the OP’s idea… they want to halt processing the RDD, but not
the entire job.
So when the condition is hit, that task stops, yet the job continues on to the
next partition. (Assuming you have more partitions than you have task
’slots’.) But if you fail enough tasks, your job fails, meaning you don’t get
any results at all.
The best you could do is a NOOP. That is… if your condition is met on that
partition, your M/R job outputs nothing to the collection, so no more data is
added to the result set.
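In plain Python terms (a stand-in for a real RDD and flatMap, not actual Spark API; the stop condition and names below are made up), that NOOP idea looks roughly like:

```python
# flatMap-style function: emit a one-element list normally, or an empty
# list once the (hypothetical) stop condition is met, so that record
# contributes nothing to the result set.

def process_record(record, should_stop):
    if should_stop(record):
        return []            # NOOP: add nothing to the output
    return [record * 2]      # normal processing

data = [1, 2, 3, 50, 4]
stop = lambda x: x > 10      # made-up stop condition

# What flatMap would do: concatenate the per-record outputs.
results = [out for rec in data for out in process_record(rec, stop)]
print(results)               # the 50 is simply skipped
```

The whole job still runs to completion; you just stop paying the cost of emitting results you don’t want.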
The whole paradigm is to process the entire RDD at a time.
You may spin cycles, but that’s not really a bad thing.
HTH
-Mike
> On Jan 4, 2016, at 6:45 AM, Daniel Darabos <daniel.dara...@lynxanalytics.com>
> wrote:
>
> You can cause a failure by throwing an exception in the code running on the
> executors. The task will be retried (if spark.task.maxFailures > 1), and then
> the stage is failed. No further tasks are processed after that, and an
> exception is thrown on the driver. You could catch the exception and see if
> it was caused by your own special exception.
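In plain Python terms, ignoring the actual Spark execution model (the exception class, function names, and stop condition below are all illustrative, not Spark API), the pattern Daniel describes is roughly:

```python
# A special exception type we own, so the driver can distinguish "our
# deliberate stop" from a genuine failure.
class StopProcessing(Exception):
    pass

def executor_task(record):
    """Stand-in for the code running on the executors."""
    if record > 10:                      # made-up stop condition
        raise StopProcessing("condition met")
    return record * 2

def run_job(data):
    """Stand-in for the driver: catch our own exception and bail out."""
    try:
        return [executor_task(r) for r in data]
    except StopProcessing:
        return None                      # job aborted by our condition

print(run_job([1, 2, 3]))    # [2, 4, 6]
print(run_job([1, 50, 3]))   # None
```

Note that in real Spark the task would be retried up to spark.task.maxFailures times before the stage fails, and the driver would see the cause wrapped inside a SparkException, so you’d inspect the exception chain rather than catch your type directly.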
>
> On Mon, Jan 4, 2016 at 1:05 PM, domibd <d...@lipn.univ-paris13.fr> wrote:
> Hello,
>
> Is there a way to stop, under a condition, a process (like map-reduce) that
> uses an RDD?
>
> (This could be used if the process does not always need to explore the
> whole RDD.)
>
> thanks
>
> Dominique
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/stopping-a-process-usgin-an-RDD-tp25870.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>