> Per your suggestion, I read the design sheet and it states that the harness container is a mandatory setting for all TaskManagers.
That doc is out of date. As Benjamin said, Docker is no longer strictly required. However, it is still recommended, as Docker makes managing dependencies a lot easier, whereas PROCESS mode involves managing dependencies via shell scripts.
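For reference, a rough, untested sketch of what PROCESS mode looks like from the Python side. It assumes the portable runner's --environment_type / --environment_config options; the boot path is the one Benjamin suggests below, and the Flink master address is copied from your original pipeline options.

========================================================-
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Sketch only: run the SDK harness as a local process instead of a Docker
# container. The boot binary must already exist at this path on every
# TaskManager host (that is the dependency management Docker would
# otherwise handle for you).
options = PipelineOptions([
    "--runner=FlinkRunner",
    "--flink_version=1.8",
    "--flink_master_url=ip-172-31-1-84.ap-northeast-1.compute.internal:43581",
    "--environment_type=PROCESS",
    "--environment_config={\"command\":\"/opt/apache/beam/boot\"}",
    "--experiments=beam_fn_api",
])

p = beam.Pipeline(options=options)
========================================================-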
> Caused by: java.lang.Exception: The user defined 'open()' method caused an exception: java.io.IOException: Received exit code 1 for command 'docker inspect -f {{.State.Running}} 3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b'. stderr: Error: No such object: 3e7e0d2d9d8362995d4b566ce1611834d56c5ca550ae89ae698a279271e4c33b

This means your Docker container is failing to start up for some reason. I recommend either a) running the container manually and inspecting the logs, or b) using the master or Beam 2.16 branches, which have better Docker logging (https://github.com/apache/beam/pull/9389).
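If it helps to script option (a), here is a rough, untested sketch in Python. The image name is taken from your original pipeline options; it should be run on the host where the TaskManager runs.

========================================================-
import subprocess

# Sketch only: start the SDK harness container by hand and dump its logs to
# see why it exits. The image name is the one from the original pipeline
# options; nothing else here is specific to Beam.
image = "asia.gcr.io/PROJECTNAME/beam/python3"

subprocess.run(["docker", "pull", image], check=True)

# Run detached so we get a container id back even if the harness exits
# right away (it will, because no provision/control endpoints are passed;
# the point is only to capture its startup output).
container_id = subprocess.run(
    ["docker", "run", "-d", image],
    check=True, capture_output=True, text=True,
).stdout.strip()

result = subprocess.run(
    ["docker", "logs", container_id],
    capture_output=True, text=True,
)
print(result.stdout)
print(result.stderr)
========================================================-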
Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com

On Wed, Sep 18, 2019 at 8:04 AM Yu Watanabe <yu.w.ten...@gmail.com> wrote:

> Thank you for the reply.
>
> I see files named "boot" under the directories below,
> but these seem to be used for containers.
>
> (python) admin@ip-172-31-9-89:~/beam-release-2.15.0$ find ./ -name "boot" -exec ls -l {} \;
> lrwxrwxrwx 1 admin admin 23 Sep 16 23:43 ./sdks/python/container/.gogradle/project_gopath/src/github.com/apache/beam/sdks/python/boot -> ../../../../../../../..
> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48 ./sdks/python/container/build/target/launcher/linux_amd64/boot
> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48 ./sdks/python/container/build/target/launcher/darwin_amd64/boot
> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48 ./sdks/python/container/py3/build/docker/target/linux_amd64/boot
> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48 ./sdks/python/container/py3/build/docker/target/darwin_amd64/boot
> -rwxr-xr-x 1 admin admin 16543786 Sep 16 23:48 ./sdks/python/container/py3/build/target/linux_amd64/boot
> -rwxr-xr-x 1 admin admin 16358928 Sep 16 23:48 ./sdks/python/container/py3/build/target/darwin_amd64/boot
>
> On Wed, Sep 18, 2019 at 11:37 PM Benjamin Tan <benjamintanwei...@gmail.com> wrote:
>
>> Try this as part of PipelineOptions:
>>
>> --environment_config={\"command\":\"/opt/apache/beam/boot\"}
>>
>> On 2019/09/18 10:40:42, Yu Watanabe <yu.w.ten...@gmail.com> wrote:
>> > Hello.
>> >
>> > I am trying to run FlinkRunner (2.15.0) on an AWS EC2 instance and submit the job to AWS EMR (5.26.0).
>> >
>> > However, I get the error below when I run the pipeline, and it fails.
>> >
>> > ========================================================-
>> > Caused by: java.lang.Exception: The user defined 'open()' method caused an exception: java.io.IOException: Cannot run program "docker": error=2, No such file or directory
>> >   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
>> >   at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>> >   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>> >   ... 1 more
>> > Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "docker": error=2, No such file or directory
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
>> >   at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
>> >   at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
>> >   at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
>> >   at org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
>> >   at org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
>> >   at org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
>> >   at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>> >   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
>> >   ... 3 more
>> > Caused by: java.io.IOException: Cannot run program "docker": error=2, No such file or directory
>> >   at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>> >   at org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:141)
>> >   at org.apache.beam.runners.fnexecution.environment.DockerCommand.runImage(DockerCommand.java:92)
>> >   at org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:152)
>> >   at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:178)
>> >   at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:162)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
>> >   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
>> >   ... 11 more
>> > Caused by: java.io.IOException: error=2, No such file or directory
>> >   at java.lang.UNIXProcess.forkAndExec(Native Method)
>> >   at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>> >   at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>> >   at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>> >   ... 24 more
>> > ========================================================-
>> >
>> > Pipeline options are below.
>> > ========================================================-
>> > options = PipelineOptions([
>> >     "--runner=FlinkRunner",
>> >     "--flink_version=1.8",
>> >     "--flink_master_url=ip-172-31-1-84.ap-northeast-1.compute.internal:43581",
>> >     "--environment_config=asia.gcr.io/PROJECTNAME/beam/python3",
>> >     "--experiments=beam_fn_api"
>> > ])
>> >
>> > p = beam.Pipeline(options=options)
>> > ========================================================-
>> >
>> > I am able to run "docker info" as ec2-user on the server where the script is running.
>> >
>> > ========================================================-
>> > (python) [ec2-user@ip-172-31-2-121 ~]$ docker info
>> > Containers: 0
>> >  Running: 0
>> >  Paused: 0
>> >  Stopped: 0
>> > ...
>> > ========================================================-
>> >
>> > I used "debian-stretch".
>> >
>> > ========================================================-
>> > debian-stretch-hvm-x86_64-gp2-2019-09-08-17994-572488bb-fc09-4638-8628-e1e1d26436f4-ami-0ed2d2283aa1466df.4 (ami-06f16171199d98c63)
>> > ========================================================-
>> >
>> > This does not seem to happen when Flink runs locally.
>> >
>> > ========================================================-
>> > admin@ip-172-31-9-89:/opt/flink$ sudo ss -atunp | grep 8081
>> > tcp LISTEN 0 128 :::8081 :::* users:(("java",pid=18420,fd=82))
>> > admin@ip-172-31-9-89:/opt/flink$ sudo ps -ef | grep java | head -1
>> > admin 17698 1 0 08:59 ? 00:00:12 java -jar /home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar --flink-master-url ip-172-31-1-84.ap-northeast-1.compute.internal:43581 --artifacts-dir /tmp/artifactskj47j8yn --job-port 48205 --artifact-port 0 --expansion-port 0
>> > admin@ip-172-31-9-89:/opt/flink$
>> > ========================================================-
>> >
>> > Would there be any other setting I need to look for when running on an EC2 instance?
>> >
>> > Thanks,
>> > Yu Watanabe
>> >
>> > --
>> > Yu Watanabe
>> > Weekend Freelancer who loves to challenge building data platform
>> > yu.w.ten...@gmail.com
>> > [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1> [image: Twitter icon] <https://twitter.com/yuwtennis>
>
> --
> Yu Watanabe
> Weekend Freelancer who loves to challenge building data platform
> yu.w.ten...@gmail.com
> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1> [image: Twitter icon] <https://twitter.com/yuwtennis>