Re: yarn ship from s3
Hi Vijay,

I'm not sure if I understand your question correctly. You have a jar and configs (1, 2, 3 and 4) on S3 and you want to start a Flink job using those? Can you simply download those things (the whole directory containing them) to the machine that will be starting the Flink job?

Best,
Piotrek

Tue, 25 May 2021 at 07:50, Vijayendra Yadav wrote:

> Hi Team,
>
> I am trying to find a way to ship files from AWS S3 for a Flink streaming
> job running on AWS EMR. What I need to ship is the following:
> 1) application jar
> 2) application property file
> 3) custom flink-conf.yaml
> 4) application-specific log4j config
>
> Please let me know the options.
>
> Thanks,
> Vijay
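[Editor's note: a minimal sketch of what Piotrek suggests, assuming the AWS CLI is installed on the submitting machine; the bucket layout, staging directory, and jar name below are illustrative, not taken from the thread:]

  # Stage all artifacts locally on the node that will run `flink run`
  # (hypothetical S3 path and staging directory).
  aws s3 cp s3://applib/myapp/1.0-SNAPSHOT/ /tmp/flink-app/ --recursive

  # Point Flink at the downloaded flink-conf.yaml / log4j config and submit.
  export FLINK_CONF_DIR=/tmp/flink-app/conf
  flink run -m yarn-cluster /tmp/flink-app/myapp-1.0-SNAPSHOT.jar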
Re: yarn ship from s3
Hi Piotr,

I have been following the same process you describe so far. Now I am migrating the deployment to AWS CDK and AWS Step Functions, as a kind of CI/CD process. I added a step that downloads the jar and configs (1, 2, 3 and 4) from S3 using command-runner.jar (an AWS step); that loaded them onto one of the master nodes (out of 3). In the next step, when I launched the Flink job, it could not find the build, because the job is launched on some other YARN node.

I was hoping that, just as in Apache Spark, where the files we pass via --files are shipped to YARN (from S3 to the YARN work directory), Flink would have a similar solution.

Thanks,
Vijay
Re: yarn ship from s3
Hi Vijay,

Have you tried yarn.ship-files [1] or yarn.ship-archives [2]? Maybe that's what you're looking for...

Best,
Matthias

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#yarn-ship-files
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#yarn-ship-archives
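[Editor's note: these are configuration options rather than flink run flags, so they have to be passed as YARN dynamic properties (or set in flink-conf.yaml). A hedged sketch with a Flink 1.11-style CLI and hypothetical local paths; note that the option name comes from the 1.13 docs linked above and may differ in older releases, while the stack trace later in this thread shows Flink 1.11:]

  # Option A: set the config option as a YARN dynamic property via -yD.
  flink run -m yarn-cluster \
    -yD yarn.ship-files="/home/hadoop/app/application.properties" \
    /home/hadoop/app/myapp.jar

  # Option B: ship a whole local directory with the -yt/--yarnship flag.
  flink run -m yarn-cluster -yt /home/hadoop/app/conf /home/hadoop/app/myapp.jar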
Re: yarn ship from s3
Hi Pohl,

I tried to ship my property file. Example:

  -yarn.ship-files s3://applib/xx/xx/1.0-SNAPSHOT/application.properties \

Error:

6:21:37.163 [main] ERROR org.apache.flink.client.cli.CliFrontend - Invalid command line arguments.
org.apache.flink.client.cli.CliArgsException: Could not build the program from JAR file: JAR file does not exist: -yarn.ship-files
        at org.apache.flink.client.cli.CliFrontend.getPackagedProgram(CliFrontend.java:244) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:223) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_292]
        at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_292]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) [hadoop-common-2.10.0-amzn-0.jar:?]
        at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) [flink-dist_2.11-1.11.0.jar:1.11.0]
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992) [flink-dist_2.11-1.11.0.jar:1.11.0]
Caused by: java.io.FileNotFoundException: JAR file does not exist: -yarn.ship-files
        at org.apache.flink.client.cli.CliFrontend.getJarFile(CliFrontend.java:740) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        at org.apache.flink.client.cli.CliFrontend.buildProgram(CliFrontend.java:717) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        at org.apache.flink.client.cli.CliFrontend.getPackagedProgram(CliFrontend.java:242) ~[flink-dist_2.11-1.11.0.jar:1.11.0]
        ... 8 more
Could not build the program from JAR file: JAR file does not exist: -yarn.ship-files

Thanks,
Vijay
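[Editor's note: the trace shows what went wrong here: flink run does not accept a bare -yarn.ship-files flag, so the CLI stops parsing options and treats that token as the program JAR path ("JAR file does not exist: -yarn.ship-files"). A hedged corrected sketch, with hypothetical paths; note the value must be a path on the submitting machine, since an s3:// URI is not accepted here, as the next reply explains:]

  # Pass the setting as a dynamic property (-yD), not as a bare flag,
  # and reference a file that already exists locally on this node.
  flink run -m yarn-cluster \
    -yD yarn.ship-files="/home/hadoop/app/application.properties" \
    /home/hadoop/app/myapp-1.0-SNAPSHOT.jar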
Re: yarn ship from s3
Hi Vijay,

Currently, Flink only supports shipping files from the local machine where the job is submitted.

There are tickets [1][2][3] tracking the effort to ship files from remote paths, e.g., http, hdfs, etc. Once that is done, adding s3 as an additional supported scheme should be straightforward.

Unfortunately, these efforts are still in progress and have more or less stalled recently.

Thank you~

Xintong Song

[1] https://issues.apache.org/jira/browse/FLINK-20681
[2] https://issues.apache.org/jira/browse/FLINK-20811
[3] https://issues.apache.org/jira/browse/FLINK-20867
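[Editor's note: until those tickets land, one workaround consistent with this thread is to stage the S3 artifacts and submit the job in the same step, so the download and flink run always execute on the same node. A hedged sketch with a hypothetical bucket, staging directory, and jar name, e.g., as the script a command-runner step would invoke:]

  #!/usr/bin/env bash
  set -euo pipefail

  # Stage artifacts from S3 onto this node, the same node that runs
  # flink run, so the "downloaded on a different master node" problem
  # cannot occur.
  STAGE=/tmp/flink-app
  aws s3 cp s3://applib/myapp/1.0-SNAPSHOT/ "$STAGE/" --recursive

  # Ship the staged directory to the YARN containers and submit.
  export FLINK_CONF_DIR="$STAGE/conf"   # custom flink-conf.yaml + log4j
  flink run -m yarn-cluster -yt "$STAGE" "$STAGE/myapp-1.0-SNAPSHOT.jar"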
Re: yarn ship from s3
Thank you, Xintong. I will look for these updates in the near future.

Regards,
Vijay