Adding to the previous suggestions: you can also add "--retain_docker_container" to your pipeline options and later log in to the machine to check the Docker container logs.
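For instance, with the containers retained, a quick check on a TaskManager host could look like the sketch below. The container ID is only an example (it is the one from the error further down in this thread); substitute whatever "docker ps -a" actually shows on your machine.

```shell
# Sketch: inspect a retained (possibly already-exited) SDK harness container
# on a TaskManager host. The container ID is illustrative; substitute the
# IDs that "docker ps -a" reports.
CONTAINER_ID=3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b

if command -v docker >/dev/null 2>&1; then
    # List all containers, including ones that already exited
    docker ps -a
    # Dump the last lines of the harness container's log
    docker logs "$CONTAINER_ID" 2>&1 | tail -n 50
else
    echo "docker CLI not found on PATH"
fi
```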
Also, in my experience running on YARN, the yarn user sometimes does not have permission to use Docker. I would suggest checking whether the yarn user on the TaskManagers can run Docker.

On Wed, Sep 18, 2019 at 11:23 AM Kyle Weaver <[email protected]> wrote:

> > Per your suggestion, I read the design sheet, and it states that the
> > harness container is a mandatory setting for all TaskManagers.
>
> That doc is out of date. As Benjamin said, it is no longer strictly required
> to use Docker. However, it is still recommended, as Docker makes managing
> dependencies a lot easier, whereas PROCESS mode involves managing
> dependencies via shell scripts.
>
> > Caused by: java.lang.Exception: The user defined 'open()' method caused
> > an exception: java.io.IOException: Received exit code 1 for command 'docker
> > inspect -f {{.State.Running}} 3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b'.
> > stderr: Error: No such object: 3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b
>
> This means your Docker container is failing to start up for some reason. I
> recommend either a) running the container manually and inspecting the logs,
> or b) using the master or Beam 2.16 branches, which have better Docker
> logging (https://github.com/apache/beam/pull/9389).
>
> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
>
> On Wed, Sep 18, 2019 at 8:04 AM Yu Watanabe <[email protected]> wrote:
>
>> Thank you for the reply.
>>
>> I see files named "boot" under the directories below,
>> but these seem to be used for containers.
>>
>> (python) admin@ip-172-31-9-89:~/beam-release-2.15.0$ find ./ -name "boot" -exec ls -l {} \;
>> lrwxrwxrwx 1 admin admin 23 Sep 16 23:43 ./sdks/python/container/.gogradle/project_gopath/src/github.com/apache/beam/sdks/python/boot -> ../../../../../../../..
>> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48 ./sdks/python/container/build/target/launcher/linux_amd64/boot
>> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48 ./sdks/python/container/build/target/launcher/darwin_amd64/boot
>> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48 ./sdks/python/container/py3/build/docker/target/linux_amd64/boot
>> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48 ./sdks/python/container/py3/build/docker/target/darwin_amd64/boot
>> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48 ./sdks/python/container/py3/build/target/linux_amd64/boot
>> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48 ./sdks/python/container/py3/build/target/darwin_amd64/boot
>>
>> On Wed, Sep 18, 2019 at 11:37 PM Benjamin Tan <[email protected]> wrote:
>>
>>> Try this as part of PipelineOptions:
>>>
>>> --environment_config={\"command\":\"/opt/apache/beam/boot\"}
>>>
>>> On 2019/09/18 10:40:42, Yu Watanabe <[email protected]> wrote:
>>> > Hello.
>>> >
>>> > I am trying to run FlinkRunner (2.15.0) on an AWS EC2 instance and
>>> > submit the job to AWS EMR (5.26.0).
>>> >
>>> > However, I get the error below when I run the pipeline, and it fails.
>>> >
>>> > ========================================================-
>>> > Caused by: java.lang.Exception: The user defined 'open()' method caused an
>>> > exception: java.io.IOException: Cannot run program "docker": error=2, No
>>> > such file or directory
>>> >     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
>>> >     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>>> >     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>>> >     ... 1 more
>>> > Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>>> > java.io.IOException: Cannot run program "docker": error=2, No such file or directory
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
>>> >     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
>>> >     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
>>> >     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
>>> >     at org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
>>> >     at org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
>>> >     at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
>>> >     at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>>> >     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
>>> >     ... 3 more
>>> > Caused by: java.io.IOException: Cannot run program "docker": error=2, No
>>> > such file or directory
>>> >     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>>> >     at org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:141)
>>> >     at org.apache.beam.runners.fnexecution.environment.DockerCommand.runImage(DockerCommand.java:92)
>>> >     at org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:152)
>>> >     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:178)
>>> >     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:162)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
>>> >     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
>>> >     ... 11 more
>>> > Caused by: java.io.IOException: error=2, No such file or directory
>>> >     at java.lang.UNIXProcess.forkAndExec(Native Method)
>>> >     at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>>> >     at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>>> >     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>>> >     ... 24 more
>>> > ========================================================-
>>> >
>>> > Pipeline options are below.
>>> > ========================================================-
>>> > options = PipelineOptions([
>>> >     "--runner=FlinkRunner",
>>> >     "--flink_version=1.8",
>>> >     "--flink_master_url=ip-172-31-1-84.ap-northeast-1.compute.internal:43581",
>>> >     "--environment_config=asia.gcr.io/PROJECTNAME/beam/python3",
>>> >     "--experiments=beam_fn_api"
>>> > ])
>>> >
>>> > p = beam.Pipeline(options=options)
>>> > ========================================================-
>>> >
>>> > I am able to run "docker info" as ec2-user on the server where the
>>> > script is running.
>>> >
>>> > ========================================================-
>>> > (python) [ec2-user@ip-172-31-2-121 ~]$ docker info
>>> > Containers: 0
>>> >  Running: 0
>>> >  Paused: 0
>>> >  Stopped: 0
>>> > ...
>>> > ========================================================-
>>> >
>>> > I used "debian-stretch".
>>> >
>>> > ========================================================-
>>> > debian-stretch-hvm-x86_64-gp2-2019-09-08-17994-572488bb-fc09-4638-8628-e1e1d26436f4-ami-0ed2d2283aa1466df.4
>>> > (ami-06f16171199d98c63)
>>> > ========================================================-
>>> >
>>> > This does not seem to happen when Flink runs locally.
>>> >
>>> > ========================================================-
>>> > admin@ip-172-31-9-89:/opt/flink$ sudo ss -atunp | grep 8081
>>> > tcp LISTEN 0 128 :::8081 :::* users:(("java",pid=18420,fd=82))
>>> > admin@ip-172-31-9-89:/opt/flink$ sudo ps -ef | grep java | head -1
>>> > admin 17698 1 0 08:59 ? 00:00:12 java -jar
>>> > /home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar
>>> > --flink-master-url ip-172-31-1-84.ap-northeast-1.compute.internal:43581
>>> > --artifacts-dir /tmp/artifactskj47j8yn --job-port 48205 --artifact-port 0
>>> > --expansion-port 0
>>> > admin@ip-172-31-9-89:/opt/flink$
>>> > ========================================================-
>>> >
>>> > Would there be any other settings I need to look at when running on an
>>> > EC2 instance?
>>> >
>>> > Thanks,
>>> > Yu Watanabe
>>> >
>>> > --
>>> > Yu Watanabe
>>> > Weekend Freelancer who loves to challenge building data platform
>>> > [email protected]
>>> > LinkedIn: https://www.linkedin.com/in/yuwatanabe1
>>> > Twitter: https://twitter.com/yuwtennis
>>
>> --
>> Yu Watanabe
>> Weekend Freelancer who loves to challenge building data platform
>> [email protected]
>> LinkedIn: https://www.linkedin.com/in/yuwatanabe1
>> Twitter: https://twitter.com/yuwtennis
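As a footnote to the permission point at the top of this reply, a minimal check of whether the yarn user can use Docker might look like the sketch below. The user name "yarn" and the "docker" group are assumptions about a default EMR/YARN setup; adjust them for your cluster.

```shell
# Sketch: verify that the user running the TaskManagers can use Docker.
# "yarn" and the "docker" group are assumed defaults, not verified facts
# about any particular cluster.
YARN_USER=yarn

if id "$YARN_USER" >/dev/null 2>&1; then
    # Membership in the docker group is the usual requirement for talking
    # to /var/run/docker.sock without root.
    if id -nG "$YARN_USER" | tr ' ' '\n' | grep -qx docker; then
        echo "$YARN_USER is in the docker group"
    else
        echo "$YARN_USER is NOT in the docker group"
    fi
    # A direct functional test (run as root on the TaskManager host):
    # sudo -u "$YARN_USER" docker info
else
    echo "user $YARN_USER does not exist on this host"
fi
```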
