Thank you for the reply.

> Why not manually docker pull the image (remember to adjust the container
> location to omit the registry) locally first?

  Actually, when using Docker, I have already pulled the image from the
remote repository (GCR).
  Is there a way to make the Flink Runner skip "docker pull" when the image
is already present locally?
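
(For reference, the behaviour I am after could be scripted outside Beam.
Below is a minimal sketch, not a Beam option; `ensure_image` is a
hypothetical helper that pulls only when the image is absent from the local
daemon:)

```python
import subprocess

def ensure_image(image, run=subprocess.run):
    """Pull `image` only if the local Docker daemon does not already have it.

    `run` is injectable so the logic can be tested without a daemon.
    Returns True if the image is present after the call.
    """
    # `docker image inspect` exits 0 exactly when the image exists locally.
    probe = run(["docker", "image", "inspect", image],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    if probe.returncode == 0:
        return True  # already pulled; skip "docker pull"
    return run(["docker", "pull", image]).returncode == 0
```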

Thanks,
Yu Watanabe

On Fri, Sep 20, 2019 at 7:48 PM Benjamin Tan <[email protected]>
wrote:

> Seems like some file is missing. Why not manually docker pull the image
> (remember to adjust the container location to omit the registry) locally
> first? At least you can eliminate another source of trouble.
>
> Also, depending on your pipeline, if you are running distributed, you must
> make sure your files are accessible. For example, local paths won’t work at
> all.
>
> So maybe you can force a single worker first too.
>
> Sent from my iPhone
>
> On 20 Sep 2019, at 5:11 PM, Yu Watanabe <[email protected]> wrote:
>
> Ankur
>
> Thank you for the advice.
> You're right. Looking at the task manager's log, it looks like "docker
> pull" first fails for the yarn user, and then a couple of errors follow.
> As a result, "docker run" seems to fail.
> I have been working on this the whole week and still have not managed to
> get the yarn session authenticated against Google Container Registry...
>
>
> ==============================================================================================
> 2019-09-19 06:47:38,196 INFO  org.apache.flink.runtime.taskmanager.Task
>                   - MapPartition (MapPartition at [3]{Create,
> ParDo(EsOutputFn)}) (1/1) (d2f0d79e4614c3b0cb5a8cbd38de37da) switched from
> DEPLOYING to RUNNING.
> 2019-09-19 06:47:41,181 WARN
>  org.apache.beam.runners.fnexecution.environment.DockerCommand  - Unable to
> pull docker image asia.gcr.io/PROJECTNAME/beam/python3:latest, cause:
> Received exit code 1 for command 'docker pull
> asia.gcr.io/creationline001/beam/python3:latest'. stderr: Error response
> from daemon: unauthorized: You don't have the needed permissions to perform
> this operation, and you may have invalid credentials. To authenticate your
> request, follow the steps in:
> https://cloud.google.com/container-registry/docs/advanced-authentication
> 2019-09-19 06:47:44,035 INFO
>  
> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>  - GetManifest for
> /tmp/artifactsknnvmjj8/job_fc5fff58-4408-4e0d-833b-675215218234/MANIFEST
> 2019-09-19 06:47:44,037 INFO
>  
> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>  - Loading manifest for retrieval token
> /tmp/artifactsknnvmjj8/job_fc5fff58-4408-4e0d-833b-675215218234/MANIFEST
> 2019-09-19 06:47:44,046 INFO
>  
> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>  - GetManifest for
> /tmp/artifactsknnvmjj8/job_fc5fff58-4408-4e0d-833b-675215218234/MANIFEST
> failed
> java.util.concurrent.ExecutionException: java.io.FileNotFoundException:
> /tmp/artifactsknnvmjj8/job_fc5fff58-4408-4e0d-833b-675215218234/MANIFEST
> (No such file or directory)
>         at
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
>         at
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:492)
> ...
> Caused by: java.io.FileNotFoundException:
> /tmp/artifactsknnvmjj8/job_fc5fff58-4408-4e0d-833b-675215218234/MANIFEST
> (No such file or directory)
>         at java.io.FileInputStream.open0(Native Method)
>         at java.io.FileInputStream.open(FileInputStream.java:195)
>         at java.io.FileInputStream.<init>(FileInputStream.java:138)
>         at
> org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:114)
> ...
> 2019-09-19 06:48:43,952 ERROR org.apache.flink.runtime.operators.BatchTask
>                  - Error in task code:  MapPartition (MapPartition at
> [3]{Create, ParDo(EsOutputFn)}) (1/1)
> java.lang.Exception: The user defined 'open()' method caused an exception:
> java.io.IOException: Received exit code 1 for command 'docker inspect -f
> {{.State.Running}}
> 7afbdcfd241629d24872ba1c74ef10f3d07c854c9cc675a65d4d16b9fdbde752'. stderr:
> Error: No such object:
> 7afbdcfd241629d24872ba1c74ef10f3d07c854c9cc675a65d4d16b9fdbde752
>         at
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
>         at
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>         at java.lang.Thread.run(Thread.java:748)
> ...
>
> ==============================================================================================
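>
> (The "unauthorized" pull error above suggests the Docker daemon on the
> TaskManager nodes is not authenticated against GCR for the user that YARN
> runs as. One documented way is logging in with a service-account key; the
> user name and key path below are assumptions for this sketch:)

```shell
# On each TaskManager node, authenticate Docker to GCR for the user that
# YARN launches containers as (user name and key path are illustrative):
sudo -u yarn docker login -u _json_key \
  --password-stdin https://asia.gcr.io < /path/to/service-account-key.json
```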
>
> Thanks,
> Yu Watanabe
>
> On Thu, Sep 19, 2019 at 3:47 AM Ankur Goenka <[email protected]> wrote:
>
>> Adding to the previous suggestions:
>> You can also add "--retain_docker_container" to your pipeline options and
>> later log in to the machine to check the Docker container log.
>>
>> Also, in my experience running on yarn, the yarn user sometimes does not
>> have access to Docker. I would suggest checking whether the yarn user on
>> the TaskManagers has permission to use Docker.
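>>
>> (With the container retained, the usual inspection on the TaskManager host
>> looks roughly like the following; the image name is taken from the
>> pipeline options earlier in the thread, and the container id comes from
>> the first command:)

```shell
# List stopped containers started from the SDK harness image, then dump
# the log of the failed one:
docker ps -a --filter ancestor=asia.gcr.io/PROJECTNAME/beam/python3:latest
docker logs <container-id>
```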
>>
>>
>> On Wed, Sep 18, 2019 at 11:23 AM Kyle Weaver <[email protected]> wrote:
>>
>>> > Per your suggestion, I read the design sheet, and it states that the
>>> harness container is a mandatory setting for all TaskManagers.
>>>
>>> That doc is out of date. As Benjamin said, using Docker is no longer
>>> strictly required. However, it is still recommended, as Docker makes
>>> managing dependencies a lot easier, whereas PROCESS mode involves managing
>>> dependencies via shell scripts.
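>>>
>>> (For completeness, a PROCESS-mode configuration is roughly the following
>>> sketch; the boot path is illustrative and the binary must exist at that
>>> path on every TaskManager:)

```python
import json

# PROCESS mode runs the SDK harness directly on the host instead of in
# Docker; environment_config is a JSON object naming the boot executable.
# The path below is illustrative and must exist on every TaskManager.
environment_config = json.dumps({"command": "/opt/apache/beam/boot"})

pipeline_args = [
    "--runner=FlinkRunner",
    "--environment_type=PROCESS",
    "--environment_config=" + environment_config,
]
```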
>>>
>>> > Caused by: java.lang.Exception: The user defined 'open()' method
>>> caused an exception: java.io.IOException: Received exit code 1 for command
>>> 'docker inspect -f {{.State.Running}}
>>> 3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b'. stderr:
>>> Error: No such object:
>>> 3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b
>>>
>>> This means your Docker container is failing to start up for some reason.
>>> I recommend either a) running the container manually and inspecting the
>>> logs, or b) using the master or Beam 2.16 branches, which have better
>>> Docker logging (https://github.com/apache/beam/pull/9389).
>>>
>>> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
>>>
>>>
>>> On Wed, Sep 18, 2019 at 8:04 AM Yu Watanabe <[email protected]>
>>> wrote:
>>>
>>>> Thank you for the reply.
>>>>
>>>> I see "boot" files under the directories below.
>>>> But these seem to be used for the containers.
>>>>
>>>>   (python) admin@ip-172-31-9-89:~/beam-release-2.15.0$ find ./ -name
>>>> "boot" -exec ls -l {} \;
>>>> lrwxrwxrwx 1 admin admin 23 Sep 16 23:43
>>>> ./sdks/python/container/.gogradle/project_gopath/src/
>>>> github.com/apache/beam/sdks/python/boot -> ../../../../../../../..
>>>> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48
>>>> ./sdks/python/container/build/target/launcher/linux_amd64/boot
>>>> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48
>>>> ./sdks/python/container/build/target/launcher/darwin_amd64/boot
>>>> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48
>>>> ./sdks/python/container/py3/build/docker/target/linux_amd64/boot
>>>> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48
>>>> ./sdks/python/container/py3/build/docker/target/darwin_amd64/boot
>>>> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48
>>>> ./sdks/python/container/py3/build/target/linux_amd64/boot
>>>> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48
>>>> ./sdks/python/container/py3/build/target/darwin_amd64/boot
>>>>
>>>> On Wed, Sep 18, 2019 at 11:37 PM Benjamin Tan <
>>>> [email protected]> wrote:
>>>>
>>>>> Try this as part of PipelineOptions:
>>>>>
>>>>> --environment_config={\"command\":\"/opt/apache/beam/boot\"}
>>>>>
>>>>> On 2019/09/18 10:40:42, Yu Watanabe <[email protected]> wrote:
>>>>> > Hello.
>>>>> >
>>>>> > I am trying to run FlinkRunner (2.15.0) on AWS EC2 instance and
>>>>> submit job
>>>>> > to AWS EMR (5.26.0).
>>>>> >
>>>>> > However, I get the below error when I run the pipeline, and it fails.
>>>>> >
>>>>> > ========================================================-
>>>>> > Caused by: java.lang.Exception: The user defined 'open()' method
>>>>> caused an
>>>>> > exception: java.io.IOException: Cannot run program "docker":
>>>>> error=2, No
>>>>> > such file or directory
>>>>> >         at
>>>>> > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
>>>>> >         at
>>>>> >
>>>>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>>>>> >         at
>>>>> org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>>>>> >         ... 1 more
>>>>> > Caused by:
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>>>>> > java.io.IOException: Cannot run program "docker": error=2, No such
>>>>> file or
>>>>> > directory
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
>>>>> >         at
>>>>> >
>>>>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>>>>> >         at
>>>>> > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
>>>>> >         ... 3 more
>>>>> > Caused by: java.io.IOException: Cannot run program "docker":
>>>>> error=2, No
>>>>> > such file or directory
>>>>> >         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:141)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runImage(DockerCommand.java:92)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:152)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:178)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:162)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
>>>>> >         at
>>>>> >
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
>>>>> >         ... 11 more
>>>>> > Caused by: java.io.IOException: error=2, No such file or directory
>>>>> >         at java.lang.UNIXProcess.forkAndExec(Native Method)
>>>>> >         at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>>>>> >         at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>>>>> >         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>>>>> >         ... 24 more
>>>>> >  ========================================================-
>>>>> >
>>>>> > Pipeline options are below.
>>>>> >  ========================================================-
>>>>> >         options = PipelineOptions([
>>>>> >                       "--runner=FlinkRunner",
>>>>> >                       "--flink_version=1.8",
>>>>> >
>>>>> >
>>>>> "--flink_master_url=ip-172-31-1-84.ap-northeast-1.compute.internal:43581",
>>>>> >                       "--environment_config=
>>>>> > asia.gcr.io/PROJECTNAME/beam/python3",
>>>>> >                       "--experiments=beam_fn_api"
>>>>> >                   ])
>>>>> >
>>>>> >
>>>>> >         p = beam.Pipeline(options=options)
>>>>> >  ========================================================-
>>>>> >
>>>>> > I am able to run "docker info" as ec2-user on the server where the
>>>>> > script is running.
>>>>> >
>>>>> >  ========================================================-
>>>>> > (python) [ec2-user@ip-172-31-2-121 ~]$ docker info
>>>>> > Containers: 0
>>>>> >  Running: 0
>>>>> >  Paused: 0
>>>>> >  Stopped: 0
>>>>> > ...
>>>>> >  ========================================================-
>>>>> >
>>>>> > I used "debian-stretch".
>>>>> >
>>>>> > ========================================================-
>>>>> >
>>>>> debian-stretch-hvm-x86_64-gp2-2019-09-08-17994-572488bb-fc09-4638-8628-e1e1d26436f4-ami-0ed2d2283aa1466df.4
>>>>> > (ami-06f16171199d98c63)
>>>>> > ========================================================-
>>>>> >
>>>>> > This does not seem to happen when Flink runs locally.
>>>>> >
>>>>> > ========================================================-
>>>>> > admin@ip-172-31-9-89:/opt/flink$ sudo ss -atunp | grep 8081
>>>>> > tcp    LISTEN     0      128      :::8081                 :::*
>>>>> >       users:(("java",pid=18420,fd=82))
>>>>> > admin@ip-172-31-9-89:/opt/flink$ sudo ps -ef | grep java | head -1
>>>>> > admin    17698     1  0 08:59 ?        00:00:12 java -jar
>>>>> >
>>>>> /home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar
>>>>> > --flink-master-url
>>>>> ip-172-31-1-84.ap-northeast-1.compute.internal:43581
>>>>> > --artifacts-dir /tmp/artifactskj47j8yn --job-port 48205
>>>>> --artifact-port 0
>>>>> > --expansion-port 0
>>>>> > admin@ip-172-31-9-89:/opt/flink$
>>>>> > ========================================================-
>>>>> >
>>>>> > Would there be any other setting I need to look for when running on
>>>>> > an EC2 instance?
>>>>> >
>>>>> > Thanks,
>>>>> > Yu Watanabe
>>>>> >
>>>>> > --
>>>>> > Yu Watanabe
>>>>> > Weekend Freelancer who loves to challenge building data platform
>>>>> > [email protected]
>>>>> > LinkedIn: <https://www.linkedin.com/in/yuwatanabe1>
>>>>> > Twitter: <https://twitter.com/yuwtennis>
>>>>> >
>>>>>
>>>>
>>>>
>>>>
>>>
>
>
>

-- 
Yu Watanabe
Weekend Freelancer who loves to challenge building data platform
[email protected]
LinkedIn: <https://www.linkedin.com/in/yuwatanabe1>
Twitter: <https://twitter.com/yuwtennis>
