So, following Eron's suggestion, I tried setting
security.ssl.verify-hostname: false, and that does the trick. I no longer
get the classloader error, even with blob.service.ssl.enabled: true.
Do you think the hostname verification fails because we are running the
Flink jobmanager and taskmanager via Marathon (and hence essentially as
Mesos tasks)?
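
For what it's worth, here is a minimal Python sketch of the kind of check
that hostname verification performs. It is not Flink's or the JVM's actual
code, and the function name, host names, and IP address are made up for
illustration. If the taskmanagers reach the blob server under a name or IP
that the certificate does not list, a check like this fails:

```python
import ipaddress


def hostname_matches(connect_name, san_entries):
    """Simplified check: is connect_name covered by any SAN entry?

    san_entries is a list of (kind, value) pairs, e.g. ("DNS", "node1.example.com")
    or ("IP", "10.0.0.5"). This is an illustrative approximation only.
    """
    name = connect_name.lower().rstrip(".")

    # An IP address can only match an IP-type entry, never a DNS name.
    try:
        ip = ipaddress.ip_address(name)
    except ValueError:
        ip = None
    if ip is not None:
        return any(kind == "IP" and ipaddress.ip_address(value) == ip
                   for kind, value in san_entries)

    for kind, value in san_entries:
        if kind != "DNS":
            continue
        pattern = value.lower().rstrip(".")
        if pattern == name:
            return True
        # A leading "*." wildcard covers exactly one left-most label.
        if pattern.startswith("*."):
            _, _, rest = name.partition(".")
            if rest and rest == pattern[2:]:
                return True
    return False


# Hypothetical certificate that only lists the node's DNS name.
cert_names = [("DNS", "node1.example.com")]
print(hostname_matches("node1.example.com", cert_names))
print(hostname_matches("10.0.0.5", cert_names))
```

With a certificate like the hypothetical one above, connecting by the DNS
name passes, while connecting by a bare IP (or by a different task hostname)
does not.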

On Wed, Oct 4, 2017 at 5:47 PM, Chesnay Schepler <ches...@apache.org> wrote:

> I don't think this is a configuration problem, but a bug in Flink. But
> we'll have to dig a little deeper to be sure.
>
> Besides the actual SSL problem, what concerns me is that we didn't fail
> earlier. If a bug in the SSL setup prevents
> the up- or download of jars then we should fail earlier. Looping in Nico
> who may have some input.
>
>
> On 04.10.2017 22:58, Aniket Deshpande wrote:
>
> Hi Chesnay,
> Thanks for the reply. Following your suggestion, I found that setting
> blob.service.ssl.enabled: false solved the issue, and now all the
> pipelines run as expected.
> So, the issue is pretty much narrowed down to the blob service SSL now.
> I also checked the jobmanager logs when blob ssl is enabled and I see the
> following error:
>
> 2017-10-03 23:28:50.459 [BLOB connection for /<jm_ip>:46932] ERROR org.apache.flink.runtime.blob.BlobServerConnection  - Error while executing BLOB connection.
> javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
>         at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
>         at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
>         at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2023)
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1125)
>         at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>         at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:928)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
>         at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:119)
>
> So, are there any additional steps that I have to follow to enable SSL
> for the blob service?
>
> On Wed, Oct 4, 2017 at 4:09 PM, Eron Wright <eronwri...@gmail.com> wrote:
>
>> By following Chesnay's recommendation we will hopefully uncover an SSL
>> error that is being masked.  Another thing to try is to disable hostname
>> verification (it is enabled by default) to see whether the certificate is
>> being rejected.
>>
>> On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler <ches...@apache.org>
>> wrote:
>>
>>> Something that would also help us narrow down the problematic area is to
>>> enable SSL for one component at a time and see
>>> which one causes the job to fail.
>>>
>>>
>>> On 04.10.2017 14:11, Chesnay Schepler wrote:
>>>
>>> The configuration looks reasonable. Just to be sure, are the paths
>>> accessible by all nodes?
>>>
>>> As a first step, could you set the logging level to DEBUG (by modifying
>>> the 'conf/log4j.properties' file), resubmit the job (after a cluster
>>> restart) and check the Job- and TaskManager logs for any exception?
>>>
>>> On 04.10.2017 03:15, Aniket Deshpande wrote:
>>>
>>> Background: We have a setup of Flink 1.3.1 along with a secure MAPR
>>> cluster (Flink is running on mapr client nodes). We run this Flink cluster
>>> via Marathon, using the flink-jobmanager.sh foreground and
>>> flink-taskmanager.sh foreground commands. To make this work, we had to add
>>> -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
>>> flink-console.sh as an extra JVM arg (otherwise, Flink was taking MAPR's
>>> ssl_truststore as the default truststore, and we then faced issues with
>>> 3rd party jars like aws_sdk etc.). This entire setup was working fine as
>>> is: we could submit our jars and the pipelines ran without any problem.
>>>
>>>
>>> Problem: We started experimenting with enabling SSL for all
>>> communication in Flink. For this, we followed
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
>>> to generate the CA and keystore. I added the following properties to
>>> flink-conf.yaml:
>>>
>>>
>>> security.ssl.enabled: true
>>> security.ssl.keystore: /opt/flink/certs/node1.keystore
>>> security.ssl.keystore-password: <password>
>>> security.ssl.key-password: <password>
>>> security.ssl.truststore: /opt/flink/certs/ca.truststore
>>> security.ssl.truststore-password: <password>
>>> jobmanager.web.ssl.enabled: true
>>> taskmanager.data.ssl.enabled: true
>>> blob.service.ssl.enabled: true
>>> akka.ssl.enabled: true
>>>
>>>
>>> We then spun up a cluster and tried submitting the same job that was
>>> working before. We get the following error:
>>> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
>>> ClassLoader info: URL ClassLoader:
>>> Class not resolvable through given classloader.
>>>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>>>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>>>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>>>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> This error disappears when we remove the SSL config properties, i.e., run
>>> the Flink cluster without SSL enabled.
>>>
>>>
>>> So, did we miss any steps for enabling ssl?
>>>
>>>
>>> P.S.: We tried removing the extra JVM arg mentioned above, but still get
>>> the same error.
>>>
>>> --
>>>
>>> Aniket
>>>
>>>
>>>
>>>
>>
>
>
> --
> Yours Sincerely,
> Aniket S Deshpande.
>
>
>


-- 
Yours Sincerely,
Aniket S Deshpande.
