Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-22 Thread Chesnay Schepler
Is your user-jar packaging and relocating Flink classes? If so, then 
your job actually operate against the classes provided by the cluster, 
which, well, just wouldn't work.


On 18/06/2020 09:34, Sourabh Mehta wrote:

Hi ,
application is using 1.10.0 but cluster is setup on 1.9.0.

Yes I do have access. please find below starting logs from cluster


2020-06-17 11:28:18,989 INFO 
 org.apache.shaded.flink.table.module.ModuleManager  - Got 
FunctionDefinition equals from module core
2020-06-17 11:28:20,538 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,538 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,538 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: taskmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-17 11:28:20,538 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: parallelism.default, 1
2020-06-17 11:28:20,539 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.execution.failover-strategy, region
2020-06-17 11:28:20,539 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.rpc.address, cluster-flink-poc-m
2020-06-17 11:28:20,539 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.heap.mb, 12288
2020-06-17 11:28:20,539 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: taskmanager.heap.mb, 12288
2020-06-17 11:28:20,540 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: taskmanager.numberOfTaskSlots, 4
2020-06-17 11:28:20,540 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: parallelism.default, 28
2020-06-17 11:28:20,540 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: taskmanager.network.numberOfBuffers, 2048
2020-06-17 11:28:20,540 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf
2020-06-17 11:28:20,550 INFO 
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils 
 - The configuration option Key: 'taskmanager.cpu.cores' , default: 
null (fallback keys: []) required for local execution is not set, 
setting it to its default value 1.7976931348623157E308
2020-06-17 11:28:20,552 INFO 
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils 
 - The configuration option Key: 'taskmanager.memory.task.heap.size' , 
default: null (fallback keys: []) required for local execution is not 
set, setting it to its default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO 
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils 
 - The configuration option Key: 
'taskmanager.memory.task.off-heap.size' , default: 0 bytes (fallback 
keys: []) required for local execution is not set, setting it to its 
default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO 
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils 
 - The configuration option Key: 'taskmanager.memory.network.min' , 
default: 64 mb (fallback keys: [{key=taskmanager.network.memory.min, 
isDeprecated=true}]) required for local execution is not set, setting 
it to its default value 64 mb
2020-06-17 11:28:20,553 INFO 
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils 
 - The configuration option Key: 'taskmanager.memory.network.max' , 
default: 1 gb (fallback keys: [{key=taskmanager.network.memory.max, 
isDeprecated=true}]) required for local execution is not set, setting 
it to its default value 64 mb
2020-06-17 11:28:20,553 INFO 
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils 
 - The configuration option Key: 'taskmanager.memory.managed.size' , 
default: null (fallback keys: [{key=taskmanager.memory.size, 
isDeprecated=true}]) required for local execution is not set, setting 
it to its default value 128 mb
2020-06-17 11:28:20,558 INFO 
 org.apache.shaded.flink.runtime.minicluster.MiniCluster - Starting 
Flink Mini Cluster
2020-06-17 11:28:20,561 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 
configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,561 INFO 
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading 

Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-18 Thread Sourabh Mehta
Hi ,
application is using 1.10.0 but cluster is setup on 1.9.0.

Yes I do have access. please find below starting logs from cluster


2020-06-17 11:28:18,989 INFO
 org.apache.shaded.flink.table.module.ModuleManager- Got
FunctionDefinition equals from module core
2020-06-17 11:28:20,538 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,538 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,538 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-17 11:28:20,538 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2020-06-17 11:28:20,539 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-06-17 11:28:20,539 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, cluster-flink-poc-m
2020-06-17 11:28:20,539 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.mb, 12288
2020-06-17 11:28:20,539 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.heap.mb, 12288
2020-06-17 11:28:20,540 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 4
2020-06-17 11:28:20,540 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 28
2020-06-17 11:28:20,540 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.network.numberOfBuffers, 2048
2020-06-17 11:28:20,540 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf
2020-06-17 11:28:20,550 INFO
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  -
The configuration option Key: 'taskmanager.cpu.cores' , default: null
(fallback keys: []) required for local execution is not set, setting it to
its default value 1.7976931348623157E308
2020-06-17 11:28:20,552 INFO
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  -
The configuration option Key: 'taskmanager.memory.task.heap.size' ,
default: null (fallback keys: []) required for local execution is not set,
setting it to its default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  -
The configuration option Key: 'taskmanager.memory.task.off-heap.size' ,
default: 0 bytes (fallback keys: []) required for local execution is not
set, setting it to its default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  -
The configuration option Key: 'taskmanager.memory.network.min' , default:
64 mb (fallback keys: [{key=taskmanager.network.memory.min,
isDeprecated=true}]) required for local execution is not set, setting it to
its default value 64 mb
2020-06-17 11:28:20,553 INFO
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  -
The configuration option Key: 'taskmanager.memory.network.max' , default: 1
gb (fallback keys: [{key=taskmanager.network.memory.max,
isDeprecated=true}]) required for local execution is not set, setting it to
its default value 64 mb
2020-06-17 11:28:20,553 INFO
 org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils  -
The configuration option Key: 'taskmanager.memory.managed.size' , default:
null (fallback keys: [{key=taskmanager.memory.size, isDeprecated=true}])
required for local execution is not set, setting it to its default value
128 mb
2020-06-17 11:28:20,558 INFO
 org.apache.shaded.flink.runtime.minicluster.MiniCluster   - Starting
Flink Mini Cluster
2020-06-17 11:28:20,561 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,561 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,561 INFO
 org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m

Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-17 Thread Till Rohrmann
Hi Sourabh,

do you have access to the cluster logs? They could be helpful for debugging
the problem. Which version of Flink are you using?

Cheers,
Till

On Wed, Jun 17, 2020 at 7:39 PM Sourabh Mehta 
wrote:

> No, I am not.
>
> On Wed, 17 Jun 2020 at 10:48 PM, Chesnay Schepler 
> wrote:
>
>> Are you by any chance creating a local environment via
>> (Stream)ExecutionEnvironment#createLocalEnvironment?
>>
>> On 17/06/2020 17:05, Sourabh Mehta wrote:
>>
>> Hi Team,
>>
>> I'm  exploring flink for one of my use case, I'm facing some issues
>> while running a flink job in cluster mode. Below are the steps I followed
>> to setup and run job in cluster mode :
>> 1. Setup flink on google cloud dataproc using
>> https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink
>>
>> 2. After setting up the cluster I could see the flink session started and
>> could see the UI for the same.
>>
>> 3 Submitted job from dataproc master node using below command
>>
>> sudo HADOOP_CONF_DIR=/etc/hadoop/conf /usr/lib/flink/bin/flink run -m
>> yarn-cluster -yid application_1592311654771_0001 -class
>> com.sm.flink.FlinkDriver /usr/lib/flink/lib/flink-1.0.10-sm-SNAPSHOT.jar
>> hdfs://cluster-flink-poc-m:8020/user/flink/rocksdb/
>>
>> After running the job I see the job started successfully but created a
>> mini local cluster and ran in local mode. I don't see any jobs submitted to
>> JobManger and I also see 0 task managers on UI.
>>
>> Can someone please help me understand here?, do let me know what input
>> is required to investigate the same.
>>
>>
>>
>>
>>


Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-17 Thread Sourabh Mehta
No, I am not.

On Wed, 17 Jun 2020 at 10:48 PM, Chesnay Schepler 
wrote:

> Are you by any chance creating a local environment via
> (Stream)ExecutionEnvironment#createLocalEnvironment?
>
> On 17/06/2020 17:05, Sourabh Mehta wrote:
>
> Hi Team,
>
> I'm  exploring flink for one of my use case, I'm facing some issues while
> running a flink job in cluster mode. Below are the steps I followed to
> setup and run job in cluster mode :
> 1. Setup flink on google cloud dataproc using
> https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink
>
> 2. After setting up the cluster I could see the flink session started and
> could see the UI for the same.
>
> 3 Submitted job from dataproc master node using below command
>
> sudo HADOOP_CONF_DIR=/etc/hadoop/conf /usr/lib/flink/bin/flink run -m
> yarn-cluster -yid application_1592311654771_0001 -class
> com.sm.flink.FlinkDriver /usr/lib/flink/lib/flink-1.0.10-sm-SNAPSHOT.jar
> hdfs://cluster-flink-poc-m:8020/user/flink/rocksdb/
>
> After running the job I see the job started successfully but created a
> mini local cluster and ran in local mode. I don't see any jobs submitted to
> JobManger and I also see 0 task managers on UI.
>
> Can someone please help me understand here?, do let me know what input is
> required to investigate the same.
>
>
>
>
>


Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-17 Thread Chesnay Schepler
Are you by any chance creating a local environment via 
(Stream)ExecutionEnvironment#createLocalEnvironment?


On 17/06/2020 17:05, Sourabh Mehta wrote:

Hi Team,

I'm  exploring flink for one of my use case, I'm facing some issues 
while running a flink job in cluster mode. Below are the steps I 
followed to setup and run job in cluster mode :
1. Setup flink on google cloud dataproc using 
https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink


2. After setting up the cluster I could see the flink session started 
and could see the UI for the same.


3 Submitted job from dataproc master node using below command

sudo HADOOP_CONF_DIR=/etc/hadoop/conf /usr/lib/flink/bin/flink run -m 
yarn-cluster -yid application_1592311654771_0001 -class 
com.sm.flink.FlinkDriver 
/usr/lib/flink/lib/flink-1.0.10-sm-SNAPSHOT.jar 
hdfs://cluster-flink-poc-m:8020/user/flink/rocksdb/


After running the job I see the job started successfully but created a 
mini local cluster and ran in local mode. I don't see any jobs 
submitted to JobManger and I also see 0 task managers on UI.


Can someone please help me understand here?, do let me know what input 
is required to investigate the same.








Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-17 Thread Sourabh Mehta
Hi Team,

I'm  exploring flink for one of my use case, I'm facing some issues while
running a flink job in cluster mode. Below are the steps I followed to
setup and run job in cluster mode :
1. Setup flink on google cloud dataproc using
https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink

2. After setting up the cluster I could see the flink session started and
could see the UI for the same.

3 Submitted job from dataproc master node using below command

sudo HADOOP_CONF_DIR=/etc/hadoop/conf /usr/lib/flink/bin/flink run -m
yarn-cluster -yid application_1592311654771_0001 -class
com.sm.flink.FlinkDriver /usr/lib/flink/lib/flink-1.0.10-sm-SNAPSHOT.jar
hdfs://cluster-flink-poc-m:8020/user/flink/rocksdb/

After running the job I see the job started successfully but created a mini
local cluster and ran in local mode. I don't see any jobs submitted to
JobManger and I also see 0 task managers on UI.

Can someone please help me understand here?, do let me know what input is
required to investigate the same.