Re: yarn ship from s3

2021-05-26 Thread Vijayendra Yadav
Thank you, Xintong. I will look out for these updates in the near future.

Regards,
Vijay

On Wed, May 26, 2021 at 6:40 PM Xintong Song  wrote:

> Hi Vijay,
>
> Currently, Flink only supports shipping files from the local machine where
> the job is submitted.
>
> There are tickets [1][2][3] tracking the effort to support shipping files
> from remote paths, e.g., http, hdfs, etc. Once that work is done, adding s3
> as an additional supported scheme should be straightforward.
>
> Unfortunately, these efforts are still in progress and have more or less
> stalled recently.
>
> Thank you~
>
> Xintong Song
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-20681
> [2] https://issues.apache.org/jira/browse/FLINK-20811
> [3] https://issues.apache.org/jira/browse/FLINK-20867
>
> On Thu, May 27, 2021 at 12:23 AM Vijayendra Yadav 
> wrote:
>
>> Hi Pohl,
>>
>> I tried to ship my property file. Example:
>> -yarn.ship-files s3://applib/xx/xx/1.0-SNAPSHOT/application.properties \
>>
>>
>> *Error:*
>>
>> 6:21:37.163 [main] ERROR org.apache.flink.client.cli.CliFrontend -
>> Invalid command line arguments.
>> org.apache.flink.client.cli.CliArgsException: Could not build the program
>> from JAR file: JAR file does not exist: -yarn.ship-files
>> at
>> org.apache.flink.client.cli.CliFrontend.getPackagedProgram(CliFrontend.java:244)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:223)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> at java.security.AccessController.doPrivileged(Native Method)
>> ~[?:1.8.0_292]
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> [?:1.8.0_292]
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
>> [hadoop-common-2.10.0-amzn-0.jar:?]
>> at
>> org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> [flink-dist_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
>> [flink-dist_2.11-1.11.0.jar:1.11.0]
>> Caused by: java.io.FileNotFoundException: JAR file does not exist:
>> -yarn.ship-files
>> at
>> org.apache.flink.client.cli.CliFrontend.getJarFile(CliFrontend.java:740)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.cli.CliFrontend.buildProgram(CliFrontend.java:717)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> at
>> org.apache.flink.client.cli.CliFrontend.getPackagedProgram(CliFrontend.java:242)
>> ~[flink-dist_2.11-1.11.0.jar:1.11.0]
>> ... 8 more
>> Could not build the program from JAR file: JAR file does not exist:
>> -yarn.ship-files
>>
>>
>> *Thanks,*
>>
>> *Vijay*
>>
>> On Tue, May 25, 2021 at 11:58 PM Matthias Pohl 
>> wrote:
>>
>>> Hi Vijay,
>>> have you tried yarn.ship-files [1] or yarn.ship-archives [2]? Maybe
>>> that's what you're looking for...
>>>
>>> Best,
>>> Matthias
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#yarn-ship-files
>>> [2]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#yarn-ship-archives
>>>
>>> On Tue, May 25, 2021 at 5:56 PM Vijayendra Yadav 
>>> wrote:
>>>
 Hi Piotr,

 I have been doing the same process as you mentioned so far. Now I am
 migrating the deployment process to AWS CDK and AWS Step Functions, as a
 kind of CI/CD process.
 I added a step that downloads the jar and configs (1, 2, 3 and 4) from S3
 using command-runner.jar (an AWS Step); it loaded them onto one of the
 master nodes (out of 3). In the next step, when I launched the Flink job,
 it could not find the build because the job is launched on some other YARN
 node.

 I was hoping that, just like Apache Spark, where whatever files we provide
 in --files are shipped to YARN (from S3 to the YARN working directory),
 Flink would also have a solution.

 Thanks,
 Vijay


 On Tue, May 25, 2021 at 12:50 AM Piotr Nowojski 
 wrote:

> Hi Vijay,
>
> I'm not sure if I understand your question correctly. You have jar and
> configs (1, 2, 3 and 4) on S3 and you want to start a Flink job using
> those? Can you simply download those things (whole directory containing
> those) to the machine that will be starting the Flink job?
>
> Best, Piotrek
>
> wt., 25 maj 2021 o 07:50 Vijayendra Yadav 
> napisał(a):
>
>> Hi Team,
>>
>> I am trying to find a way to ship files from AWS S3 for a Flink
>> streaming job; I am running on AWS EMR. What I need to ship is the
>> following:
>> 1) application jar
>> 2) application property file
>> 3) custom flink-conf.yaml
>> 4) application-specific log4j config
>>
>> Please let me know the options.
>>
>> Thanks,
>> Vijay

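[Editor's note] The workaround the thread converges on, staging the S3 artifacts on the submitting machine and then shipping them as local files, can be sketched roughly as below. The bucket, file names, and paths are hypothetical examples, and the commands are built as strings rather than executed so the sketch stays inert; drop that indirection to run them for real.

```shell
# Hypothetical locations -- adjust bucket/key and staging dir to your setup.
APP_S3="s3://applib/myapp/1.0-SNAPSHOT"
WORK_DIR="/tmp/flink-submit"

# Step 1: pull the artifacts down to the machine that will run `flink run`.
STAGE_CMD="aws s3 cp $APP_S3/application.properties $WORK_DIR/"

# Step 2: ship the now-local directory with -yt/--yarnship, which uploads
# it into the YARN application's working directory on every container.
RUN_CMD="flink run -m yarn-cluster -yt $WORK_DIR $WORK_DIR/app.jar"

echo "$STAGE_CMD"
echo "$RUN_CMD"
```

On EMR this staging step can itself be an earlier step of the same Step Functions workflow, as long as it runs on the same node that later submits the job.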
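[Editor's note] On the CliArgsException reported in the thread: `yarn.ship-files` is a configuration key, not a CLI flag, so `-yarn.ship-files` is parsed as the job-JAR path, which produces exactly the "JAR file does not exist: -yarn.ship-files" error. Two invocations that should parse, sketched under the assumption of a Flink 1.10+ CLI and with hypothetical local paths (local, because shipping directly from s3:// is not supported per the tickets above):

```shell
# Form 1: the dedicated YARN ship flag, placed before the job JAR.
CMD_FLAG="flink run -m yarn-cluster -yt /etc/myapp app.jar"

# Form 2: the config key, passed as a dynamic property with the
# yarn-per-job execution target, again before the JAR.
CMD_PROP="flink run -t yarn-per-job -D yarn.ship-files=/etc/myapp/application.properties app.jar"

echo "$CMD_FLAG"
echo "$CMD_PROP"
```

The general rule: all Flink CLI options and `-D` properties must appear before the JAR argument; everything after the JAR is treated as program arguments.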