Also, things seem to work with all your settings if you disable use of
the shuffle service (which also means no dynamic allocation), if that
helps you make progress in what you wanted to do.
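
For reference, a minimal sketch of that workaround in spark-defaults.conf.
The first two property names are the standard Spark settings for the
external shuffle service and dynamic allocation; they are assumed here and
not taken from your configuration. The last three lines are the settings
you listed earlier in this thread.

```properties
# Workaround: run without the external shuffle service.
# Dynamic allocation is disabled along with it, as noted above.
spark.shuffle.service.enabled      false
spark.dynamicAllocation.enabled    false

# Settings from earlier in this thread, left unchanged.
spark.authenticate                 true
spark.network.crypto.enabled       true
spark.network.crypto.saslFallback  false
```

With the shuffle service off, executors should no longer try to register
with the external shuffle server, which is the step that fails in your
stack trace.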

On Thu, Jul 20, 2017 at 4:25 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
> Hmm... I tried this with the new shuffle service (I generally have an
> old one running) and also see failures. I also noticed some odd things
> in your logs that I'm also seeing in mine, but it's better to track
> these in a bug instead of e-mail.
>
> Please file a bug and attach your logs there, I'll take a look at this.
>
> On Thu, Jul 20, 2017 at 2:06 PM, Udit Mehrotra
> <udit.mehrotr...@gmail.com> wrote:
>> Hi Marcelo,
>>
>> I ran with DEBUG level logging enabled for 'org.apache.spark.network.crypto'
>> in both Spark and YARN.
>>
>> However, the DEBUG logs still do not convey anything meaningful. Please find
>> them attached. Can you please take a quick look and let me know if you see
>> anything suspicious?
>>
>> If not, do you think I should open a JIRA for this?
>>
>> Thanks!
>>
>> On Wed, Jul 19, 2017 at 3:14 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>
>>> Hmm... that's not enough info and logs are intentionally kept silent
>>> to avoid flooding, but if you enable DEBUG level logging for
>>> org.apache.spark.network.crypto in both YARN and the Spark app, that
>>> might provide more info.
>>>
>>> On Wed, Jul 19, 2017 at 2:58 PM, Udit Mehrotra
>>> <udit.mehrotr...@gmail.com> wrote:
>>> > So I added these settings in yarn-site.xml as well. Now I get a completely
>>> > different error, but at least it seems like it is using the crypto library:
>>> >
>>> > ExecutorLostFailure (executor 1 exited caused by one of the running tasks)
>>> > Reason: Unable to create executor due to Unable to register with external shuffle server due to : java.lang.IllegalArgumentException: Authentication failed.
>>> >     at org.apache.spark.network.crypto.AuthRpcHandler.receive(AuthRpcHandler.java:125)
>>> >     at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:157)
>>> >     at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:105)
>>> >     at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
>>> >
>>> > Any clue about this?
>>> >
>>> >
>>> > On Wed, Jul 19, 2017 at 1:13 PM, Marcelo Vanzin <van...@cloudera.com>
>>> > wrote:
>>> >>
>>> >> On Wed, Jul 19, 2017 at 1:10 PM, Udit Mehrotra
>>> >> <udit.mehrotr...@gmail.com> wrote:
>>> >> > Is there any additional configuration I need for external shuffle besides
>>> >> > setting the following:
>>> >> > spark.network.crypto.enabled true
>>> >> > spark.network.crypto.saslFallback false
>>> >> > spark.authenticate               true
>>> >>
>>> >> Have you set these options on the shuffle service configuration too
>>> >> (which is the YARN xml config file, not spark-defaults.conf)?
>>> >>
>>> >> If you have, there might be an issue, and you should probably file a
>>> >> bug and include your NM's log file.
>>> >>
>>> >> --
>>> >> Marcelo
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Marcelo
>>
>>
>
>
>
> --
> Marcelo



-- 
Marcelo
