Yes, currently any container other than the one for the pipeline SDK (for
example, the Java SDK harness container when you use a Java transform from
Python) is pulled from Docker Hub (docker.io). You can override this with
the option you mentioned.
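
For reference, a minimal sketch of building that override from Python. It
assumes the flag takes "<regex>,<replacement image>" exactly as in the
quoted message below; harness_override_flag is a hypothetical helper for
illustration, not part of Beam.

```python
import re


def harness_override_flag(pattern, image):
    """Hypothetical helper (not a Beam API): formats the override flag
    as "<regex>,<replacement image>", the form used in the quoted message."""
    re.compile(pattern)  # fail early if the regex is invalid
    if "," in image:
        # The comma separates the regex from the image, so the image
        # itself must not contain one.
        raise ValueError("image must not contain a comma")
    return f"--sdk_harness_container_image_overrides={pattern},{image}"


pipeline_args = [
    "--runner=DataflowRunner",
    harness_override_flag(
        ".*java.*",
        "gcr.io/cloud-dataflow/v1beta3/beam_java8_sdk:2.25.0",
    ),
]

# Assumption: these arguments are then handed to the pipeline as usual,
# e.g. PipelineOptions(pipeline_args) from apache_beam.options.pipeline_options.
print(pipeline_args[1])
```

Keeping the regex and the image in one place makes it easier to bump the
image tag when the SDK version changes.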

We are working on copying all containers to GCR and pulling from there by
default, but we are not there yet.

Thanks,
Cham

On Fri, Dec 18, 2020 at 9:06 AM Steve Niemitz <[email protected]> wrote:

> I'm playing around with xlang portable pipelines in dataflow and noticed
> that it tries to pull the java harness (beam_java8_sdk:2.25.0) from
> docker.io.  This is problematic because our VPC prevents access to
> external hosts.  I was able to fix the problem by passing in
>
> --sdk_harness_container_image_overrides=.*java.*,gcr.io/cloud-dataflow/v1beta3/beam_java8_sdk:2.25.0
>
> to my job, but it's not ideal to have to do this.  Is there a reason the
> default location is docker.io rather than gcr?  Especially given that
> docker is going to be substantially limiting pulls / hour in the near
> future.
>
