Hi Mohil,

Thanks for the detailed report. I think most people are reduced capacity
right now. Filing a Jira would be helpful for tracking this.

Since I am writing, I will add a quick guess, but we should move to Jira.
It seems this has more to do with Dataflow than ElasticSearch. The default
for staging files is to scan the classpath. To do more, or to fix any
problem with the autodetection, you will need to specify --filesToStage on
the command line or setFilesToStage in Java code. Am I correct that this
symptom is confirmed?

Kenn

On Mon, Apr 6, 2020 at 5:04 PM Mohil Khare <mo...@prosimo.io> wrote:

> Any update on this? Shall I open a jira for this support ?
>
> Thanks and regards
> Mohil
>
> On Sun, Mar 22, 2020 at 9:36 PM Mohil Khare <mo...@prosimo.io> wrote:
>
>> Hi,
>> This is Mohil from Prosimo, a small bay area based stealth mode startup.
>> We use Beam (on version 2.19) with google dataflow in our analytics
>> pipeline with Kafka and PubSub as source while GCS, BigQuery and
>> ElasticSearch as our sink.
>>
>> We want to use our private self signed root ca for tls connections
>> between our internal services viz kafka, ElasticSearch, beam etc. We are
>> able to setup secure tls connection between beam and kafka using self
>> signed root certificate in keystore.jks and truststore.jks and transferring
>> it to worker VMs running kafkaIO using KafkaIO's read via
>> withConsumerFactorFn().
>>
>> We want to do similar things with elasticseachIO where we want to update
>> its worker VM's truststore with our self signed root certificate so that
>> when elasticsearchIO connects using HTTPS, it can connect successfully
>> without ssl handshake failure. Currently we couldn't find any way to do so
>> with ElasticsearchIO. We tried various possible workarounds like:
>>
>> 1. Trying JvmInitializer to initialise Jvm with truststore using
>> System.setproperty for javax.net.ssl.trustStore,
>> 2. Transferring our jar to GCP's appengine where we start jar using
>> Djavax.net.ssl.trustStore and then triggering beam job from there.
>> 3. Setting elasticsearchIO flag withTrustSelfSigned to true (I don't
>> think it will work because looking at the source code, it looks like it has
>> dependency with keystorePath)
>>
>> But nothing worked. When we logged in to worker VMs, it looked like our
>> trustStore never made it to worker VM. All elasticsearchIO connections
>> failed with the following exception:
>>
>> sun.security.validator.ValidatorException: PKIX path building failed:
>> sun.security.provider.certpath.SunCertPathBuilderException: unable to find
>> valid certification path to requested target
>> sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
>>
>>
>> Right now to unblock ourselves, we have added proxy with letsencrypt root
>> ca between beam and Elasticsearch cluster and beam's elasticsearchIO
>> connect successfully to proxy using letsencrypt root certificate. We won't
>> want to use Letsencrypt root certificater for internal services as it
>> expires every three months.  Is there a way, just like kafkaIO, to use
>> selfsigned root certificate with elasticsearchIO? Or is there a way to
>> update java cacerts on worker VMs where beam job is running?
>>
>> Looking forward for some suggestions soon.
>>
>> Thanks and Regards
>> Mohil Khare
>>
>

Reply via email to