I am aware of that, but when the chunks of results are returned to Spark from the Hadoop cluster (after processing), could they come back out of order? Could this ever happen?
On Mon, Jan 24, 2022 at 16:14, Sean Owen <sro...@gmail.com> wrote:

> Hadoop does not run Spark programs; Spark does. How or why would
> something modify the bytecode? No.
>
> On Mon, Jan 24, 2022, 9:07 AM sam smith <qustacksm2123...@gmail.com> wrote:
>
>> My point is: could Hadoop go wrong about one Spark execution? Meaning
>> that it gets confused (given the concurrent distributed tasks) and adds
>> a wrong instruction to the program, or executes an instruction out of
>> order (shuffling the order of execution by executing previous ones when
>> it shouldn't)? For example, before finishing and returning the results
>> from one node, it returns the results of another in the wrong way.
>>
>> On Mon, Jan 24, 2022 at 15:31, Sean Owen <sro...@gmail.com> wrote:
>>
>>> Not clear what you mean here. A Spark program is a program, so what
>>> are the alternatives here? Program execution order is still program
>>> execution order. You are not guaranteed anything about the order of
>>> concurrent tasks. Failed tasks can be re-executed, so tasks should be
>>> idempotent. I think the answer is 'no', but I'm not sure what you are
>>> thinking of here.
>>>
>>> On Mon, Jan 24, 2022 at 7:10 AM sam smith <qustacksm2123...@gmail.com> wrote:
>>>
>>>> Hello guys,
>>>>
>>>> I hope my question does not sound weird, but could a Spark execution
>>>> on a Hadoop cluster give a different output than the program actually
>>>> specifies? I mean by that: the execution order is messed up by Hadoop,
>>>> or an instruction is executed twice, etc.?
>>>>
>>>> Thanks for your enlightenment
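To illustrate the point being discussed, here is a minimal Python sketch (plain `concurrent.futures`, not Spark itself, so the names and structure are illustrative assumptions): concurrent tasks may *complete* in an arbitrary order, but the driver reassembles results by partition index, so the program's output order is deterministic.

```python
# Sketch: tasks finish in nondeterministic order, but results are
# reassembled by partition index, so final output order is preserved.
from concurrent.futures import ThreadPoolExecutor, as_completed
import random
import time

def task(partition_index, data):
    # Simulate a distributed task with nondeterministic completion time.
    time.sleep(random.uniform(0, 0.05))
    return partition_index, [x * 2 for x in data]

partitions = [[1, 2], [3, 4], [5, 6]]

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(task, i, p) for i, p in enumerate(partitions)]
    # as_completed yields futures in *completion* order, which varies
    # from run to run.
    results = [f.result() for f in as_completed(futures)]

# Sorting by partition index restores program order, regardless of
# which task finished first.
ordered = [chunk for _, chunk in sorted(results)]
print(ordered)  # [[2, 4], [6, 8], [10, 12]]
```

The same idea applies conceptually to Spark: the scheduler makes no promise about task completion order, but actions such as `collect()` return results in partition order, and re-executed failed tasks produce the same output precisely because tasks are expected to be idempotent.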