Re: Support for detached mode for Flink1.5 SQL Client

2018-07-11 Thread Timo Walther
The INSERT INTO [1] statement will allow submitting queries in detached mode, so 
you can close the client and let the Flink program do its job of sinking 
into external systems.


Regards,
Timo

[1] https://issues.apache.org/jira/browse/FLINK-8858

Am 12.07.18 um 02:47 schrieb Rong Rong:
Is the Gateway Mode [1] in the FLIP-24 SQL Client road map what you 
are looking for?


--
Rong

[1]: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client


On Tue, Jul 10, 2018 at 3:37 AM Shivam Sharma 
<28shivamsha...@gmail.com > wrote:


Hi All,

Is there any way to run Flink 1.5 sql-client queries in detached
mode? Actually, we need to run multiple queries for different use
cases, and the sql-client shell will be opened by the user on demand.

-- 
Shivam Sharma

Data Engineer @ Goibibo
Indian Institute Of Information Technology, Design and
Manufacturing Jabalpur
Mobile No- (+91) 8882114744
Email:- 28shivamsha...@gmail.com 
LinkedIn:-_https://www.linkedin.com/in/28shivamsharma_





Re: Need to better way to create JSON If we have TableSchema and Row

2018-07-11 Thread Timo Walther

Hi Shivam,

Flink 1.5 provides full Row-JSON-Row conversions. You can take a look at 
the `flink-json` module. A table schema can be converted into a 
TypeInformation (Types.ROW(schema.getColumns(), schema.getTypes())) 
which can be used to configure 
JsonRowSerialization/DeserializationSchemas. If you are looking for 
string output, we might need to refactor those classes a little bit to 
also use the conversion functionality for non-binary output.
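
For reference, a minimal sketch of this approach (not from the original mail), assuming the Flink 1.5 flink-json module is on the classpath; method and constructor names such as Types.ROW_NAMED, TableSchema.getColumnNames() and the JsonRowSerializationSchema constructor are from memory and may differ slightly between versions:

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.formats.json.JsonRowSerializationSchema;
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.types.Row;

import java.nio.charset.StandardCharsets;

public class RowToJson {
    public static String toJson(TableSchema schema, Row row) {
        // Build a Row TypeInformation from the schema's field names and types.
        TypeInformation<Row> rowType =
                Types.ROW_NAMED(schema.getColumnNames(), schema.getTypes());
        // The serialization schema produces UTF-8 encoded JSON bytes.
        JsonRowSerializationSchema serializer = new JsonRowSerializationSchema(rowType);
        return new String(serializer.serialize(row), StandardCharsets.UTF_8);
    }
}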


Regards,
Timo

Am 12.07.18 um 03:38 schrieb Hequn Cheng:

Hi shivam,

It seems there is no such function, but you can write one 
yourself, maybe using com.fasterxml.jackson.databind.ObjectMapper.


Best, Hequn

On Thu, Jul 12, 2018 at 1:56 AM, Shivam Sharma 
<28shivamsha...@gmail.com > wrote:


Hi All,

I have a TableSchema object and a Flink Row object (or list). Do we have any
straightforward way to convert a Row object into JSON by using the schema?

For Example:-
TableSchema-
   - columnNames: [name, count]
   - columnTypes: [String, Integer]
Row - ("shivam", 2)
JSON - {"name": "shivam", "count": 2}

Thanks
-- 
Shivam Sharma

Data Engineer @ Goibibo
Indian Institute Of Information Technology, Design and
Manufacturing Jabalpur
Mobile No- (+91) 8882114744
Email:- 28shivamsha...@gmail.com 
LinkedIn:-_https://www.linkedin.com/in/28shivamsharma
_






Re: Want to write Kafka Sink to SQL Client by Flink-1.5

2018-07-11 Thread Timo Walther

Hi Shivam,

a Kafka sink for the SQL Client will be part of Flink 1.6. For this we 
need to provide basic interfaces that sinks can extend, as Rong 
mentioned (FLINK-8866). In order to support all formats that sources 
also support, we are also working on separating the connector from the 
formats [1]. PRs for these features are ready and I'm working on 
integrating them right now. Once this is done and we have support for 
INSERT INTO in the SQL Client, a Kafka sink implementation is straightforward.


Regards,
Timo


[1] https://issues.apache.org/jira/browse/FLINK-8558

Am 12.07.18 um 02:45 schrieb Rong Rong:

Hi Shivam,
Thank you for your interest in contributing a Kafka sink for the SQL client. 
Could you share your plan for the implementation? I have some questions, as 
there might be some overlap with current implementations.


On a higher level,
1. Are you using some type of metadata store to host topic schemas 
(Kafka can essentially be schema-less)? It might be great to take a 
look at the TableSource/SinkFactory [1][2].
2. There are already a KafkaTableSource and KafkaTableSink available; I 
am assuming you are trying to contribute to the configuration in the SQL 
Client to make it easier to interact with a Kafka table?


Thanks,
Rong

[1]: https://issues.apache.org/jira/browse/FLINK-8839
[2]: https://issues.apache.org/jira/browse/FLINK-8866

On Tue, Jul 10, 2018 at 3:28 AM Shivam Sharma 
<28shivamsha...@gmail.com > wrote:


Hi All,

We want to write Kafka sink functionality for the Flink (1.5) SQL
Client. We have read the code and chalked out a rough plan for
the implementation.

Any guidance for this implementation will be very helpful.

Thanks
-- 
Shivam Sharma

Data Engineer @ Goibibo
Indian Institute Of Information Technology, Design and
Manufacturing Jabalpur
Mobile No- (+91) 8882114744
Email:- 28shivamsha...@gmail.com 
LinkedIn:-_https://www.linkedin.com/in/28shivamsharma_





Questions about high-availability directory

2018-07-11 Thread Xinyu Zhang
Hi all

Recently, we have been using Flink with high availability. We found that there are
three kinds of directories in ha.baseDir: applicationID/blob,
submittedJobGraph and completedcheckpoint. They are used to restore users'
jars, submitted job graphs and completed checkpoints. When the old JobManager is
shut down, the new JobManager can recover from this data. My questions are:

   1. The data is used to recover the JobManager, and each AM only has one
   JobManager, so why not put the data into directories such as
   "/applicationid/blob/jobid", "/applicationid/submittedJobGraph/jobid" and
   "/applicationid/completedcheckpoint/jobid"?
   2. Is there any method that can make sure this data is all cleaned up when
   a job or a cluster is shut down?

Thanks!

Xinyu Zhang


Re: Some question about document

2018-07-11 Thread vino yang
Hi Yuta,

It seems Chesnay is right. The "fallback" in Flink's documentation is in
terms of the types Flink supports. But for all the other arbitrary types,
Kryo is the first choice.

2018-07-12 9:55 GMT+08:00 Yuta Morisawa :

> Thank you for your answer.
>
> > For POJOs Flink has a custom serializer. For arbitrary objects we use
> > kryo, and can use Avro as a fallback.
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/
> dev/types_serialization.html#serialization-of-pojo-types
>
> It may be the reverse.
> Kryo is for fallback, right?
>
>
> On 2018/07/11 19:00, Chesnay Schepler wrote:
>
>> 1) TypeInformation are used to create serializers, comparators and to
>> verify correctness of certain operations (like projections on tuple
>> datasets).
>>
>> 2) see https://flink.apache.org/news/2015/05/11/Juggling-with-Bits-
>> and-Bytes.html
>>
>> 3) Flink comes with a number of serializers for varying types as outlined
>> here: https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html#flinks-typeinformation-class
>> For POJOs Flink has a custom serializer. For arbitrary objects we use
>> kryo, and can use Avro as a fallback.
>>
>> On 11.07.2018 09:24, Yuta Morisawa wrote:
>>
>>> Hi all
>>>
>>> Now, I'm reading the Flink documentation and there are some points I find
>>> difficult to understand.
>>> I'd appreciate it if you could explain them to me.
>>>
>>> 1, TypeInformation
>>>  I understand TypeInformation is used for selecting the relevant serializer
>>> and comparator.
>>>  But the documentation doesn't specify whether it is used in any other way.
>>>
>>>  So, what I want to know is what kinds of processing benefit from
>>> TypeInformation other than serializers and comparators.
>>>
>>> 2, Managed Memory
>>>  The term "managed memory" appears several times in the documentation, but
>>> I can't find any detailed description.
>>>  This is the only document I found
>>> (https://www.slideshare.net/sbaltagi/overview-of-apacheflinkbyslimbaltagi)
>>>
>>>  If anyone has a document that explains managed memory, please let me know.
>>>
>>> 3, Serializer
>>>  What do the words "serializers we ship with Flink" in the documentation
>>> mean? I know Flink uses Avro for POJOs; is it the same thing?
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/
>>> dev/types_serialization.html
>>>
>>>
>>> Regards,
>>> Yuta
>>>
>>>
>>
>


Re: How to create User Notifications/Reminder ?

2018-07-11 Thread shyla deshpande
Hi Hequn,

I was more interested in solving this using CEP.
I want to have a window of 2 weeks, and in the timeout handler I want to
create the Notification/Reminder.
Is this doable in Flink 1.4.2?

Thanks


On Wed, Jul 11, 2018 at 6:14 PM, Hequn Cheng  wrote:

> Hi shyla,
>
> The same question [1] was asked two days ago. Maybe it is helpful for
> you. Let me know if you have any other concerns.
> Best, Hequn
>
> [1] http://apache-flink-user-mailing-list-archive.2336050.
> n4.nabble.com/How-to-trigger-a-function-on-the-state-
> periodically-td21311.html
>
> On Thu, Jul 12, 2018 at 4:50 AM, shyla deshpande  > wrote:
>
>> I need to create User Notification/Reminder when I don’t see a specific
>> event (high volume) from that user for more than 2 weeks.
>>
>> Should I be using windowing or CEP or  ProcessFunction?
>>
>> I am pretty new to Flink. Can anyone please advise me what is the best
>> way to solve this?
>>
>> Thank you for your time.
>>
>
>


Re: Some question about document

2018-07-11 Thread Yuta Morisawa

Thank you for your answer.

> For POJOs Flink has a custom serializer. For arbitrary objects we use
> kryo, and can use Avro as a fallback.
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html#serialization-of-pojo-types

It may be the reverse.
Kryo is for fallback, right?


On 2018/07/11 19:00, Chesnay Schepler wrote:
1) TypeInformation are used to create serializers, comparators and to 
verify correctness of certain operations (like projections on tuple 
datasets).


2) see 
https://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html


3) Flink comes with a number of serializers for varying types, as 
outlined here:
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html#flinks-typeinformation-class
For POJOs Flink has a custom serializer. For arbitrary objects we use 
kryo, and can use Avro as a fallback.


On 11.07.2018 09:24, Yuta Morisawa wrote:

Hi all

Now, I'm reading the Flink documentation and there are some points I find 
difficult to understand.

I'd appreciate it if you could explain them to me.

1, TypeInformation
 I understand TypeInformation is used for selecting the relevant 
serializer and comparator.

 But the documentation doesn't specify whether it is used in any other way.

 So, what I want to know is what kinds of processing benefit 
from TypeInformation other than serializers and comparators.


2, Managed Memory
 The term "managed memory" appears several times in the documentation, 
but I can't find any detailed description.
 This is the only document I found 
(https://www.slideshare.net/sbaltagi/overview-of-apacheflinkbyslimbaltagi)


 If anyone has a document that explains managed memory, please let me know.

3, Serializer
 What do the words "serializers we ship with Flink" in the documentation 
mean? I know Flink uses Avro for POJOs; is it the same thing?
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html 




Regards,
Yuta







Re: Need to better way to create JSON If we have TableSchema and Row

2018-07-11 Thread Hequn Cheng
Hi shivam,

It seems there is no such function, but you can write one yourself,
maybe using com.fasterxml.jackson.databind.ObjectMapper.
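
For reference, a minimal sketch (not from the original mail) of such a hand-rolled Jackson converter; it assumes the row fields are simple values Jackson can map, and the class and method names are illustrative:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.types.Row;

public class RowJsonConverter {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Walk the schema's column names in order and pull the matching field
    // out of the Row, building one JSON object per row.
    public static String toJson(TableSchema schema, Row row) throws Exception {
        ObjectNode node = MAPPER.createObjectNode();
        String[] names = schema.getColumnNames();
        for (int i = 0; i < names.length; i++) {
            node.set(names[i], MAPPER.valueToTree(row.getField(i)));
        }
        return MAPPER.writeValueAsString(node);
    }
}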

Best, Hequn

On Thu, Jul 12, 2018 at 1:56 AM, Shivam Sharma <28shivamsha...@gmail.com>
wrote:

> Hi All,
>
> I have a TableSchema object and a Flink Row object (or list). Do we have any
> straightforward way to convert a Row object into JSON by using the schema?
>
> For Example:-
> TableSchema-
>- columnNames: [name, count]
>- columnTypes: [String, Integer]
> Row - ("shivam", 2)
> JSON - {"name": "shivam", "count": 2}
>
> Thanks
> --
> Shivam Sharma
> Data Engineer @ Goibibo
> Indian Institute Of Information Technology, Design and Manufacturing
> Jabalpur
> Mobile No- (+91) 8882114744
> Email:- 28shivamsha...@gmail.com
> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
> *
>


Re: How to create User Notifications/Reminder ?

2018-07-11 Thread Hequn Cheng
Hi shyla,

The same question [1] was asked two days ago. Maybe it is helpful for
you. Let me know if you have any other concerns.
Best, Hequn

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-trigger-a-function-on-the-state-periodically-td21311.html
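
For reference, a minimal sketch (not from the linked thread) of the keyed ProcessFunction-with-timer approach discussed there, applied to the two-week reminder case; it assumes an event-time stream keyed by user id with timestamps/watermarks assigned, and the input/output types and state names are illustrative:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Input is (userId, payload) keyed by userId; output is a reminder message.
public class InactivityReminderFunction extends ProcessFunction<Tuple2<String, Long>, String> {

    private static final long TWO_WEEKS_MS = 14L * 24 * 60 * 60 * 1000;

    private transient ValueState<Long> lastSeen;   // timestamp of the latest event per user
    private transient ValueState<String> userId;   // remembered key for the reminder text

    @Override
    public void open(Configuration parameters) {
        lastSeen = getRuntimeContext().getState(new ValueStateDescriptor<>("lastSeen", Long.class));
        userId = getRuntimeContext().getState(new ValueStateDescriptor<>("userId", String.class));
    }

    @Override
    public void processElement(Tuple2<String, Long> event, Context ctx, Collector<String> out) throws Exception {
        lastSeen.update(ctx.timestamp());
        userId.update(event.f0);
        // Register a timer two weeks after this event; if a newer event arrives
        // later, the check in onTimer prevents a premature reminder.
        ctx.timerService().registerEventTimeTimer(ctx.timestamp() + TWO_WEEKS_MS);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        Long last = lastSeen.value();
        if (last != null && timestamp >= last + TWO_WEEKS_MS) {
            out.collect("Reminder: no events from user " + userId.value() + " for two weeks");
        }
    }
}

A usage sketch would be stream.keyBy(0).process(new InactivityReminderFunction()) on a stream with event-time timestamps assigned.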


On Thu, Jul 12, 2018 at 4:50 AM, shyla deshpande 
wrote:

> I need to create User Notification/Reminder when I don’t see a specific
> event (high volume) from that user for more than 2 weeks.
>
> Should I be using windowing or CEP or  ProcessFunction?
>
> I am pretty new to Flink. Can anyone please advise me what is the best way
> to solve this?
>
> Thank you for your time.
>


Re: Support for detached mode for Flink1.5 SQL Client

2018-07-11 Thread Rong Rong
Is the Gateway Mode [1] in the FLIP-24 SQL Client road map what you are
looking for?

--
Rong

[1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client

On Tue, Jul 10, 2018 at 3:37 AM Shivam Sharma <28shivamsha...@gmail.com>
wrote:

> Hi All,
>
> Is there any way to run Flink 1.5 sql-client queries in detached mode?
> Actually, we need to run multiple queries for different use cases, and the
> sql-client shell will be opened by the user on demand.
>
> --
> Shivam Sharma
> Data Engineer @ Goibibo
> Indian Institute Of Information Technology, Design and Manufacturing
> Jabalpur
> Mobile No- (+91) 8882114744
> Email:- 28shivamsha...@gmail.com
> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
> *
>


Re: Want to write Kafka Sink to SQL Client by Flink-1.5

2018-07-11 Thread Rong Rong
Hi Shivam,
Thank you for your interest in contributing a Kafka sink for the SQL client.
Could you share your plan for the implementation? I have some questions, as
there might be some overlap with current implementations.

On a higher level,
1. Are you using some type of metadata store to host topic schemas (Kafka
can essentially be schema-less)? It might be great to take a look at the
TableSource/SinkFactory [1][2].
2. There are already a KafkaTableSource and KafkaTableSink available; I am
assuming you are trying to contribute to the configuration in the SQL Client to
make it easier to interact with a Kafka table?

Thanks,
Rong

[1]: https://issues.apache.org/jira/browse/FLINK-8839
[2]: https://issues.apache.org/jira/browse/FLINK-8866

On Tue, Jul 10, 2018 at 3:28 AM Shivam Sharma <28shivamsha...@gmail.com>
wrote:

> Hi All,
>
> We want to write Kafka sink functionality for the Flink (1.5) SQL Client. We
> have read the code and chalked out a rough plan for the implementation.
>
> Any guidance for this implementation will be very helpful.
>
> Thanks
> --
> Shivam Sharma
> Data Engineer @ Goibibo
> Indian Institute Of Information Technology, Design and Manufacturing
> Jabalpur
> Mobile No- (+91) 8882114744
> Email:- 28shivamsha...@gmail.com
> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
> *
>


How to create User Notifications/Reminder ?

2018-07-11 Thread shyla deshpande
I need to create User Notification/Reminder when I don’t see a specific
event (high volume) from that user for more than 2 weeks.

Should I be using windowing, CEP, or ProcessFunction?

I am pretty new to Flink. Can anyone please advise me what is the best way
to solve this?

Thank you for your time.


Re: Filter columns of a csv file with Flink

2018-07-11 Thread françois lacombe
Ok Hequn,

I'll open 2 JIRAs for this issue (FLINK-9813 and FLINK-9814), and maybe
propose a draft of a CsvTableSource class handling Avro schemas.

Thank you for your answers and best regards

François

2018-07-11 8:11 GMT+02:00 Hequn Cheng :

> Hi francois,
>
> > Is there any plan to give avro schemas a better role in Flink in
> further versions?
> Haven't heard about avro for csv. You can open a jira for it. Maybe also
> contribute to flink :-)
>
>
> On Tue, Jul 10, 2018 at 11:32 PM, françois lacombe <
> francois.laco...@dcbrain.com> wrote:
>
>> Hi Hequn,
>>
>> 2018-07-10 3:47 GMT+02:00 Hequn Cheng :
>>
>>> Maybe I misunderstand you. So you don't want to skip the whole file?
>>>
>> Yes I do
>> By skipping the whole file I mean "throw an Exception to stop the process
>> and inform user that file is invalid for a given reason" and not "the
>> process goes fully right and import 0 rows"
>>
>>
>>> If does, then "extending CsvTableSource and provide the avro schema to
>>> the constructor without creating a custom AvroInputFormat" is ok.
>>>
>>
>> Then we agree on this
>> Is there any plan to give avro schemas a better role in Flink in further
>> versions?
>> Avro schemas are perfect to build CSVTableSource with code like
>>
>> for (Schema field_nfo : sch.getTypes()){
>>  // Test if csv file header actually contains a field corresponding
>> to schema
>>  if (!csv_headers.contains(field_nfo.getName())) {
>>   throw new NoSuchFieldException(field_nfo.getName());
>>  }
>>
>>  // Declare the field in the source Builder
>>  src_builder.field(field_nfo.getName(),
>> primitiveTypes.get(field_nfo.getType()));
>> }
>>
>> All the best
>>
>> François
>>
>>
>>
>>> On Mon, Jul 9, 2018 at 11:03 PM, françois lacombe <
>>> francois.laco...@dcbrain.com> wrote:
>>>
 Hi Hequn,

 2018-07-09 15:09 GMT+02:00 Hequn Cheng :

> The first step requires an AvroInputFormat because the source needs
> AvroInputFormat to read avro data if data match schema.
>

 I don't want Avro data, I just want to check if my csv file has the
 same fields as defined in a given Avro schema.
 Processing should stop if and only if I find missing columns.

 A record which does not match the schema (types mainly) should be rejected
 and logged in a dedicated file, but the processing can go on.

 How about extending CsvTableSource and provide the avro schema to the
 constructor without creating a custom AvroInputFormat?


 François

>>>
>>>
>>
>


Need to better way to create JSON If we have TableSchema and Row

2018-07-11 Thread Shivam Sharma
Hi All,

I have a TableSchema object and a Flink Row object (or list). Do we have any
straightforward way to convert a Row object into JSON by using the schema?

For Example:-
TableSchema-
   - columnNames: [name, count]
   - columnTypes: [String, Integer]
Row - ("shivam", 2)
JSON - {"name": "shivam", "count": 2}

Thanks
-- 
Shivam Sharma
Data Engineer @ Goibibo
Indian Institute Of Information Technology, Design and Manufacturing
Jabalpur
Mobile No- (+91) 8882114744
Email:- 28shivamsha...@gmail.com
LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
*


Re: Flink job hangs using rocksDb as backend

2018-07-11 Thread Nico Kruber
If this is about too many timers and your application allows it, you may
also try to reduce the timer resolution and thus frequency by coalescing
them [1].


Nico

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/process_function.html#timer-coalescing
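
For reference, a minimal sketch (not from the original mails) of the coalescing idea described in [1]: rounding the target time to a coarser resolution so that many logical timers per key collapse into one registered timer. The types, the timeout, and the one-minute resolution are illustrative:

import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class CoalescedTimerFunction extends ProcessFunction<String, String> {

    private static final long TIMEOUT_MS = 60 * 60 * 1000; // illustrative timeout

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) {
        // Round the target time down to a full minute, so at most one timer
        // per key and minute is registered instead of one per element.
        long target = ctx.timestamp() + TIMEOUT_MS;
        long coalesced = (target / 60_000) * 60_000;
        ctx.timerService().registerEventTimeTimer(coalesced);
    }
}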

On 11/07/18 18:27, Stephan Ewen wrote:
> Hi shishal!
> 
> I think there is an issue with cancellation when many timers fire at the
> same time. These timers have to finish before shutdown happens, this
> seems to take a while in your case.
> 
> Did the TM process actually kill itself in the end (and got restarted)?
> 
> 
> 
> On Wed, Jul 11, 2018 at 9:29 AM, shishal  > wrote:
> 
> Hi,
> 
> I am using flink 1.4.2 with rocksdb as backend. I am using process
> function
> with timer on EventTime.  For checkpointing I am using hdfs.
> 
> I am doing load testing, so I am reading Kafka from the beginning
> (approx. 7 days of data with 50M events).
> 
> My job gets stuck after approx. 20 min with no error. Thereafter the
> watermark does not progress and all checkpoints fail.
> 
> Also, when I try to cancel my job (using the web UI), it takes several
> minutes to finally get cancelled. It also brings the TaskManager down.
> 
> There are no logs while my job hangs, but while cancelling I get the
> following error.
> 
> /
> 
> 2018-07-11 09:10:39,385 ERROR
> org.apache.flink.runtime.taskmanager.TaskManager              -
> ==
> ==      FATAL      ===
> ==
> 
> A fatal error occurred, forcing the TaskManager to shut down: Task
> 'process
> (3/6)' did not react to cancelling signal in the last 30 seconds, but is
> stuck in method:
>  org.rocksdb.RocksDB.get(Native Method)
> org.rocksdb.RocksDB.get(RocksDB.java:810)
> 
> org.apache.flink.contrib.streaming.state.RocksDBMapState.get(RocksDBMapState.java:102)
> 
> org.apache.flink.runtime.state.UserFacingMapState.get(UserFacingMapState.java:47)
> 
> nl.ing.gmtt.observer.analyser.job.rtpe.process.RtpeProcessFunction.onTimer(RtpeProcessFunction.java:94)
> 
> org.apache.flink.streaming.api.operators.KeyedProcessOperator.onEventTime(KeyedProcessOperator.java:75)
> 
> org.apache.flink.streaming.api.operators.HeapInternalTimerService.advanceWatermark(HeapInternalTimerService.java:288)
> 
> org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:108)
> 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:885)
> org.apache.flink.streaming.runtime.io
> 
> .StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:288)
> 
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)
> 
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)
> org.apache.flink.streaming.runtime.io
> 
> .StreamInputProcessor.processInput(StreamInputProcessor.java:189)
> 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
> 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
> java.lang.Thread.run(Thread.java:748)
> 
> 2018-07-11 09:10:39,390 DEBUG
> 
> org.apache.flink.runtime.akka.StoppingSupervisorWithoutLoggingActorKilledExceptionStrategy
> 
> - Actor was killed. Stopping it now.
> akka.actor.ActorKilledException: Kill
> 2018-07-11 09:10:39,407 INFO
> org.apache.flink.runtime.taskmanager.TaskManager              - Stopping
> TaskManager akka://flink/user/taskmanager#-1231617791.
> 2018-07-11 09:10:39,408 INFO
> org.apache.flink.runtime.taskmanager.TaskManager              -
> Cancelling
> all computations and discarding all cached data.
> 2018-07-11 09:10:39,409 INFO 
> org.apache.flink.runtime.taskmanager.Task                   
> - Attempting to fail task externally process (3/6)
> (432fd129f3eea363334521f8c8de5198).
> 2018-07-11 09:10:39,409 INFO 
> org.apache.flink.runtime.taskmanager.Task                   
> - Task process (3/6) is already in state CANCELING
> 2018-07-11 09:10:39,409 INFO 
> org.apache.flink.runtime.taskmanager.Task                   
> - Attempting to fail task externally process (4/6)
> (7c6b96c9f32b067bdf8fa7c283eca2e0).
> 2018-07-11 09:10:39,409 INFO 
> org.apache.flink.runtime.taskmanager.Task                   
> - Task process (4/6) is already in state CANCELING
>

Re: Flink job hangs using rocksDb as backend

2018-07-11 Thread Stephan Ewen
Hi shishal!

I think there is an issue with cancellation when many timers fire at the
same time. These timers have to finish before shutdown happens, this seems
to take a while in your case.

Did the TM process actually kill itself in the end (and got restarted)?



On Wed, Jul 11, 2018 at 9:29 AM, shishal  wrote:

> Hi,
>
> I am using flink 1.4.2 with rocksdb as backend. I am using process function
> with timer on EventTime.  For checkpointing I am using hdfs.
>
> I am doing load testing, so I am reading Kafka from the beginning (approx. 7 days
> of data with 50M events).
>
> My job gets stuck after approx. 20 min with no error. Thereafter the watermark
> does not progress and all checkpoints fail.
>
> Also, when I try to cancel my job (using the web UI), it takes several minutes
> to finally get cancelled. It also brings the TaskManager down.
>
> There are no logs while my job hangs, but while cancelling I get the following
> error.
>
> /
>
> 2018-07-11 09:10:39,385 ERROR
> org.apache.flink.runtime.taskmanager.TaskManager  -
> ==
> ==  FATAL  ===
> ==
>
> A fatal error occurred, forcing the TaskManager to shut down: Task 'process
> (3/6)' did not react to cancelling signal in the last 30 seconds, but is
> stuck in method:
>  org.rocksdb.RocksDB.get(Native Method)
> org.rocksdb.RocksDB.get(RocksDB.java:810)
> org.apache.flink.contrib.streaming.state.RocksDBMapState.get(
> RocksDBMapState.java:102)
> org.apache.flink.runtime.state.UserFacingMapState.get(
> UserFacingMapState.java:47)
> nl.ing.gmtt.observer.analyser.job.rtpe.process.
> RtpeProcessFunction.onTimer(RtpeProcessFunction.java:94)
> org.apache.flink.streaming.api.operators.KeyedProcessOperator.onEventTime(
> KeyedProcessOperator.java:75)
> org.apache.flink.streaming.api.operators.HeapInternalTimerService.
> advanceWatermark(HeapInternalTimerService.java:288)
> org.apache.flink.streaming.api.operators.InternalTimeServiceManager.
> advanceWatermark(InternalTimeServiceManager.java:108)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.
> processWatermark(AbstractStreamOperator.java:885)
> org.apache.flink.streaming.runtime.io.StreamInputProcessor$
> ForwardingValveOutputHandler.handleWatermark(
> StreamInputProcessor.java:288)
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.
> findAndOutputNewMinWatermarkAcrossAlignedChannels(
> StatusWatermarkValve.java:189)
> org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.
> inputWatermark(StatusWatermarkValve.java:111)
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(
> StreamInputProcessor.java:189)
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(
> OneInputStreamTask.java:69)
> org.apache.flink.streaming.runtime.tasks.StreamTask.
> invoke(StreamTask.java:264)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
> java.lang.Thread.run(Thread.java:748)
>
> 2018-07-11 09:10:39,390 DEBUG
> org.apache.flink.runtime.akka.StoppingSupervisorWithoutLoggingActorKilledExceptionStrategy
>
> - Actor was killed. Stopping it now.
> akka.actor.ActorKilledException: Kill
> 2018-07-11 09:10:39,407 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  - Stopping
> TaskManager akka://flink/user/taskmanager#-1231617791.
> 2018-07-11 09:10:39,408 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  - Cancelling
> all computations and discarding all cached data.
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Attempting to fail task externally process (3/6)
> (432fd129f3eea363334521f8c8de5198).
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Task process (3/6) is already in state CANCELING
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Attempting to fail task externally process (4/6)
> (7c6b96c9f32b067bdf8fa7c283eca2e0).
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Task process (4/6) is already in state CANCELING
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Attempting to fail task externally process (2/6)
> (a4f731797a7ea210fd0b512b0263bcd9).
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Task process (2/6) is already in state CANCELING
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Attempting to fail task externally process (1/6)
> (cd8a113779a4c00a051d78ad63bc7963).
> 2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task
>
> - Task process (1/6) is already in state CANCELING
> 2018-07-11 09:10:39,409 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  -
> Disassociating from JobManager
> 2018-07-11 09:10:39,412 INFO
> org.apache.flink.runtime.blob.PermanentBlobCache   

Re: flink JPS result changes

2018-07-11 Thread Chesnay Schepler
Generally speaking no, the Dispatcher (here called 
StandaloneSessionClusterEntrypoint) will spawn a JobManager internally 
when a job is submitted.


On 11.07.2018 16:42, Will Du wrote:

In this case, do I need to add a JobManager?
On Jul 11, 2018, at 10:14 AM, miki haiat > wrote:


FLIP-6 changed the execution model completely:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 


https://docs.google.com/document/d/1zwBP3LKnJ0LI1R4-9hLXBVjOXAkQS5jKrGYYoPz-xRk/edit#heading=h.giuwq6q8d23j



On Wed, Jul 11, 2018 at 5:09 PM Will Du > wrote:


Hi folks
Do we have any information about the process changes after
v1.5.0? I used to see jobManager and TaskManager process once the
start-cluster.sh is being called. But, it shows below in v1.5.0
once started. Everything works, but no idea where is the jobManager.

$jps
2523 TaskManagerRunner
2190 StandaloneSessionClusterEntrypoint

thanks,
Will







Re: flink JPS result changes

2018-07-11 Thread Will Du
In this case, do I need to add a JobManager?
> On Jul 11, 2018, at 10:14 AM, miki haiat  wrote:
> 
> FLIP-6 changed the execution model completely: 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 
>  
> https://docs.google.com/document/d/1zwBP3LKnJ0LI1R4-9hLXBVjOXAkQS5jKrGYYoPz-xRk/edit#heading=h.giuwq6q8d23j
>  
> 
>   
> 
> 
> On Wed, Jul 11, 2018 at 5:09 PM Will Du  > wrote:
> Hi folks
> Do we have any information about the process changes after v1.5.0? I used to 
> see jobManager and TaskManager process once the start-cluster.sh is being 
> called. But, it shows below in v1.5.0 once started. Everything works, but no 
> idea where is the jobManager.
> 
> $jps
> 2523 TaskManagerRunner
> 2190 StandaloneSessionClusterEntrypoint
> 
> thanks,
> Will



Re: flink JPS result changes

2018-07-11 Thread miki haiat
FLIP-6 changed the execution model completely:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
https://docs.google.com/document/d/1zwBP3LKnJ0LI1R4-9hLXBVjOXAkQS5jKrGYYoPz-xRk/edit#heading=h.giuwq6q8d23j



On Wed, Jul 11, 2018 at 5:09 PM Will Du  wrote:

> Hi folks
> Do we have any information about the process changes after v1.5.0? I used
> to see jobManager and TaskManager process once the start-cluster.sh is
> being called. But, it shows below in v1.5.0 once started. Everything works,
> but no idea where is the jobManager.
>
> $jps
> 2523 TaskManagerRunner
> 2190 StandaloneSessionClusterEntrypoint
>
> thanks,
> Will


flink JPS result changes

2018-07-11 Thread Will Du
Hi folks
Do we have any information about the process changes after v1.5.0? I used to 
see a JobManager and a TaskManager process once start-cluster.sh was 
called. But it shows the below in v1.5.0 once started. Everything works, but no 
idea where the JobManager is.

$jps
2523 TaskManagerRunner
2190 StandaloneSessionClusterEntrypoint

thanks,
Will

Re: Custom metrics reporter classloading problem

2018-07-11 Thread Gyula Fóra
Thanks for the explanation, that makes sense.
For some reason I thought that in Yarn all stuff goes into the classpath.

Gy

Chesnay Schepler  ezt írta (időpont: 2018. júl. 11.,
Sze, 15:16):

> Reporters do not have access to libraries provided with user-jars.
> They are instantiated when JM/TM starts, i.e. before any user-code is
> even accessible.
>
> My recommendation would be to either put the kafka dependencies in the
> /lib folder or try to relocate the kafka code in the reporter.
>
> On 11.07.2018 14:59, Gyula Fóra wrote:
> > Hi all,
> >
> > I have ran into the following problem and I want to double check
> > wether this is intended behaviour.
> >
> > I have a custom metrics reporter that pushes things to Kafka (so it
> > creates a KafkaProducer in the open method etc.etc.) for my streaming
> job.
> >
> > Naturally as my Flink job consumes from Kafka so it has the kafka
> > connector dependencies I set the Kafka dependencies to provided in my
> > metric reporter project and I put the built kafkaReporter.jar into the
> > Flink lib. However it seems that the metrics reporter is instantiated
> > without the user code classes since I get a NoClassdefFound error for
> > KafkaProducer even though my streaming job starts successfully
> > reading/writing kafka.
> >
> > Any ideas why this happens and how to solve it? I am slightly against
> > putting the kafka dependencies twice on the classpath as it has only
> > caused problems in the past...
> >
> > Gyula
>
>
>


Re: Custom metrics reporter classloading problem

2018-07-11 Thread Chesnay Schepler

Reporters do not have access to libraries provided with user-jars.
They are instantiated when JM/TM starts, i.e. before any user-code is 
even accessible.


My recommendation would be to either put the kafka dependencies in the 
/lib folder or try to relocate the kafka code in the reporter.


On 11.07.2018 14:59, Gyula Fóra wrote:

Hi all,

I have run into the following problem and I want to double check 
whether this is intended behaviour.


I have a custom metrics reporter that pushes things to Kafka (so it 
creates a KafkaProducer in the open method etc.) for my streaming job.


Naturally, since my Flink job consumes from Kafka and thus has the Kafka 
connector dependencies, I set the Kafka dependencies to provided in my 
metric reporter project and put the built kafkaReporter.jar into the 
Flink lib folder. However, it seems that the metrics reporter is instantiated 
without the user-code classes, since I get a NoClassDefFoundError for 
KafkaProducer even though my streaming job starts successfully 
reading/writing Kafka.


Any ideas why this happens and how to solve it? I am slightly against 
putting the kafka dependencies twice on the classpath as it has only 
caused problems in the past...


Gyula





Re: Confusions About JDBCOutputFormat

2018-07-11 Thread Hequn Cheng
Cool. I will take a look. Thanks.

On Wed, Jul 11, 2018 at 7:08 PM, wangsan  wrote:

> Well, I see. If the connection is established when writing data into the DB,
> we need to cache the received rows since the last write.
>
> IMO, maybe we do not need to open connections repeatedly or introduce
> connection pools. Testing and refreshing the connection periodically can simply
> solve this problem. I’ve implemented this at https://github.com/apache/
> flink/pull/6301; it would be kind of you to review this.
>
> Best,
> wangsan
>
>
>
> On Jul 11, 2018, at 2:25 PM, Hequn Cheng  wrote:
>
> Hi wangsan,
>
> What I mean is establishing a connection each time data is written into JDBC,
> i.e., establishing a connection in the flush() function. I think this will make
> sure the connection is ok. What do you think?
>
> On Wed, Jul 11, 2018 at 12:12 AM, wangsan  wrote:
>
> Hi Hequn,
>
> Establishing a connection for each batch write may also have an idle
> connection problem, since we are not sure when the connection will be
> closed. We call the flush() method when a batch is finished or when snapshotting
> state, but what if snapshotting is not enabled and the batch size is not reached
> before the connection is closed?
>
> May be we could use a Timer to test the connection periodically and keep
> it alive. What do you think?
>
> I will open a jira and try to work on that issue.
>
> Best,
> wangsan
>
>
>
> On Jul 10, 2018, at 8:38 PM, Hequn Cheng  wrote:
>
> Hi wangsan,
>
> I agree with you. It would be kind of you to open a jira to check the
> problem.
>
> For the first problem, I think we need to establish connection each time
> execute batch write. And, it is better to get the connection from a
> connection pool.
> For the second problem, to avoid multithread problem, I think we should
> synchronized the batch object in flush() method.
>
> What do you think?
>
> Best, Hequn
>
>
>
> On Tue, Jul 10, 2018 at 2:36 PM, wangsan  wrote:
>
> Hi all,
>
> I'm going to use JDBCAppendTableSink and JDBCOutputFormat in my Flink
> application. But I am confused with the implementation of JDBCOutputFormat.
>
> 1. The Connection was established when JDBCOutputFormat is opened, and
> will be used all the time. But if this connection lies idle for a long time,
> the database will force close the connection, thus errors may occur.
> 2. The flush() method is called when batchCount exceeds the threshold,
> but it is also called while snapshotting state. So two threads may modify
> upload and batchCount, but without synchronization.
>
> Please correct me if I am wrong.
>
> ——
> wangsan
>
>
>


Custom metrics reporter classloading problem

2018-07-11 Thread Gyula Fóra
Hi all,

I have run into the following problem and I want to double check whether
this is intended behaviour.

I have a custom metrics reporter that pushes things to Kafka (so it creates
a KafkaProducer in the open method etc.etc.) for my streaming job.

Naturally, since my Flink job consumes from Kafka and thus has the Kafka connector
dependencies, I set the Kafka dependencies to provided in my metric reporter
project and put the built kafkaReporter.jar into the Flink lib folder. However,
it seems that the metrics reporter is instantiated without the user-code
classes, since I get a NoClassDefFoundError for KafkaProducer even though
my streaming job starts successfully reading/writing Kafka.

Any ideas why this happens and how to solve it? I am slightly against
putting the kafka dependencies twice on the classpath as it has only caused
problems in the past...

Gyula


Re: Need assistance : creating remote environment

2018-07-11 Thread Chesnay Schepler
Based on the logs your client is using the RestClusterClient, which 
means that the client is either

a) running 1.4 with the flip6 profile enabled
b) running 1.5.

Please ensure that the Flink versions match for client and server, 
and that both either run or do not run in flip6 mode.
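
For reference, a minimal sketch (an assumption for illustration, not a confirmed diagnosis of this case) of what a matching Flink 1.5 client call could look like: in flip6 mode the remote environment talks to the REST endpoint (default port 8081) rather than the legacy JobManager RPC port 6123; the jar path is a placeholder.

import org.apache.flink.api.java.ExecutionEnvironment;

public class RemoteEnvExample {
    public static void main(String[] args) throws Exception {
        // Port 8081 is the REST/web port used by the Flink 1.5 RestClusterClient;
        // 6123 is the legacy Akka JobManager port used by pre-flip6 clients.
        ExecutionEnvironment env =
                ExecutionEnvironment.createRemoteEnvironment("localhost", 8081, "path/to/job.jar");

        env.fromElements(1, 2, 3).print();
    }
}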


On 11.07.2018 14:26, Mohan mohan wrote:

Attached log file. (Log level : Trace)

Is this the issue ? Trying with very minimal graph (execution plan is 
printed in log file)


WARN  akka.remote.transport.netty.NettyTransport- Remote 
connection to [/127.0.0.1:44322 ] failed with 
org.apache.flink.shaded.akka.org.jboss.netty.handler.codec.frame.TooLongFrameException:
 Adjusted frame length exceeds 10485760: 1195725860 - discarded


On Wed, Jul 11, 2018 at 5:44 PM Chesnay Schepler > wrote:


Did/could you enable logging in the submitting code?

On 11.07.2018 13:57, Mohan mohan wrote:
> Hi,
>
> I have started flink in cluster mode.  ..flink1.4.2/bin/$
> ./start-cluster.sh (no config changes ie., default settings)
> And trying to connect to it,
> ExecutionEnvironment.createRemoteEnvironment("localhost", 6123,
> "xxx.jar");
>
> I am not seeing any response, did not find anything in
jobmanager log.
> Please guide how to trace the issue. Or Do we need any
preconfiguration?
> In local environment everything works fine.
>
> Using : flink-1.4.2-bin-scala_2.11.tgz
>
> Thanks in advance.






Re: Need assistance : creating remote environment

2018-07-11 Thread Mohan mohan
Attached log file. (Log level : Trace)

Is this the issue ? Trying with very minimal graph (execution plan is
printed in log file)

WARN  akka.remote.transport.netty.NettyTransport-
Remote connection to [/127.0.0.1:44322] failed with
org.apache.flink.shaded.akka.org.jboss.netty.handler.codec.frame.TooLongFrameException:
Adjusted frame length exceeds 10485760: 1195725860 - discarded



On Wed, Jul 11, 2018 at 5:44 PM Chesnay Schepler  wrote:

> Did/could you enable logging in the submitting code?
>
> On 11.07.2018 13:57, Mohan mohan wrote:
> > Hi,
> >
> > I have started flink in cluster mode.   ..flink1.4.2/bin/$
> > ./start-cluster.sh (no config changes ie., default settings)
> > And trying to connect to it,
> > ExecutionEnvironment.createRemoteEnvironment("localhost", 6123,
> > "xxx.jar");
> >
> > I am not seeing any response, did not find anything in jobmanager log.
> > Please guide how to trace the issue. Or Do we need any preconfiguration?
> > In local environment everything works fine.
> >
> > Using : flink-1.4.2-bin-scala_2.11.tgz
> >
> > Thanks in advance.
>
>
>
11 Jul 2018 17:49:08.579 [main] INFO  
org.apache.flink.api.java.typeutils.TypeExtractor.isValidPojoField
1876 - class 
com.adaequare.etl2.batch.operator.EntityType does not contain a setter for 
field targetEntity
11 Jul 2018 17:49:08.583 [main] INFO  
org.apache.flink.api.java.typeutils.TypeExtractor.analyzePojo
1915 - Class class 
com.adaequare.etl2.batch.operator.EntityType cannot be used as a POJO type 
because not all fields are valid POJO fields, and must be processed as 
GenericType. Please read the Flink documentation on "Data Types & 
Serialization" for details of the effect on performance.
11 Jul 2018 17:49:09.027 [main] DEBUG 
org.apache.flink.api.java.ClosureCleaner.cleanThis0
120 - this$0 is accessed: false
11 Jul 2018 17:49:09.065 [main] INFO  
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
999 - The job has 2 registered types and 0 
default Kryo serializers
11 Jul 2018 17:49:09.068 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1012 - Registered Kryo types: [class 
com.adaequare.etl2.batch.operator.EntityType, interface java.util.List]
11 Jul 2018 17:49:09.068 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1013 - Registered Kryo with Serializers types: 
[]
11 Jul 2018 17:49:09.069 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1014 - Registered Kryo with Serializer Classes 
types: []
11 Jul 2018 17:49:09.069 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1015 - Registered Kryo default Serializers: []
11 Jul 2018 17:49:09.070 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1016 - Registered Kryo default Serializers 
Classes []
11 Jul 2018 17:49:09.070 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1017 - Registered POJO types: []
11 Jul 2018 17:49:09.072 [main] DEBUG 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan
1020 - Static code analysis mode: DISABLE
11 Jul 2018 17:49:09.085 [main] DEBUG 
org.apache.flink.optimizer.Optimizer.compile
433 - Beginning compilation of program 'plan'
11 Jul 2018 17:49:09.095 [main] DEBUG 
org.apache.flink.optimizer.Optimizer.compile
442 - Using a default parallelism of 4
11 Jul 2018 17:49:09.095 [main] DEBUG 
org.apache.flink.optimizer.Optimizer.compile
443 - Using default data exchange mode PIPELINED
11 Jul 2018 17:49:09.120 [main] DEBUG 
org.apache.flink.core.fs.FileSystem.loadFileSystems
937 - Loading extension file systems via 
services
11 Jul 2018 17:49:09.124 [main] INFO  
org.apache.flink.core.fs.FileSystem.loadHadoopFsFactory
1005 - Hadoop is not in the 
classpath/dependencies. The extended set of supported File Systems via Hadoop 
is not available.
11 Jul 2018 17:49:09.136 [main] DEBUG 
org.apache.flink.api.common.io.FileInputFormat.open
810 - Opening input split 
file:/tmp/stream2file8333235519400351128.tmp [0,42]
11 Jul 2018 17:49:09.150 [main] DEBUG 
org.apache.flink.api.common.io.FileInputFormat.open
810 - Opening input split 
file:/tmp/stream2file8333235519400351128.tmp [21,21]
{
"nodes": [

{
"id": 2,
"type": "source",
"pact": "Data Source",
"contents": "at 

Re: Need assistance : creating remote environment

2018-07-11 Thread Chesnay Schepler

Did/could you enable logging in the submitting code?

On 11.07.2018 13:57, Mohan mohan wrote:

Hi,

I have started flink in cluster mode.   ..flink1.4.2/bin/$ 
./start-cluster.sh (no config changes ie., default settings)

And trying to connect to it,
ExecutionEnvironment.createRemoteEnvironment("localhost", 6123, 
"xxx.jar");


I am not seeing any response, did not find anything in jobmanager log. 
Please guide how to trace the issue. Or Do we need any preconfiguration?

In local environment everything works fine.

Using : flink-1.4.2-bin-scala_2.11.tgz

Thanks in advance.





Need assistance : creating remote environment

2018-07-11 Thread Mohan mohan
Hi,

I have started flink in cluster mode.   ..flink1.4.2/bin/$
./start-cluster.sh (no config changes, i.e., default settings)
And trying to connect to it,
ExecutionEnvironment.createRemoteEnvironment("localhost", 6123, "xxx.jar");

I am not seeing any response and did not find anything in the jobmanager log.
Please guide me on how to trace the issue. Or do we need any preconfiguration?
In the local environment everything works fine.

Using : flink-1.4.2-bin-scala_2.11.tgz

Thanks in advance.


Re: Confusions About JDBCOutputFormat

2018-07-11 Thread wangsan
Well, I see. If the connection is established when writing data into the DB, we 
need to cache the received rows since the last write. 

IMO, maybe we do not need to open connections repeatedly or introduce 
connection pools. Testing and refreshing the connection periodically can simply solve 
this problem. I’ve implemented this at 
https://github.com/apache/flink/pull/6301; 
it would be kind of you to review 
this.

Best,
wangsan
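
For reference, a minimal sketch (not the code in the PR above) of the two points discussed: re-validating the connection before each batch write, and synchronizing flush() so the batch-size-triggered path and the snapshot-triggered path do not race. All names here are illustrative and not taken from Flink's JDBCOutputFormat:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ValidatingJdbcWriter {

    private final String dbUrl;
    private final String user;
    private final String password;
    private final String insertQuery;

    private Connection connection;
    private PreparedStatement upload;

    public ValidatingJdbcWriter(String dbUrl, String user, String password, String insertQuery) {
        this.dbUrl = dbUrl;
        this.user = user;
        this.password = password;
        this.insertQuery = insertQuery;
    }

    // Synchronized so that the batch-size-triggered flush and the
    // snapshot-triggered flush cannot interleave on the shared statement.
    // Note: rows buffered on a statement that had to be re-created would
    // still need to be re-added, which is the caching point from the mail.
    public synchronized void flush() throws SQLException {
        // Re-validate (and if necessary re-open) the connection before writing,
        // so an idle connection closed by the database does not cause failures.
        if (connection == null || !connection.isValid(5)) {
            connection = DriverManager.getConnection(dbUrl, user, password);
            upload = connection.prepareStatement(insertQuery);
        }
        upload.executeBatch();
    }
}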


> On Jul 11, 2018, at 2:25 PM, Hequn Cheng  wrote:
> 
> Hi wangsan,
> 
> What I mean is establishing a connection each time data is written into JDBC,
> i.e., establishing a connection in the flush() function. I think this will make
> sure the connection is ok. What do you think?
> 
> On Wed, Jul 11, 2018 at 12:12 AM, wangsan  > wrote:
> 
>> Hi Hequn,
>> 
>> Establishing a connection for each batch write may also have an idle
>> connection problem, since we are not sure when the connection will be
>> closed. We call the flush() method when a batch is finished or when snapshotting
>> state, but what if snapshotting is not enabled and the batch size is not reached
>> before the connection is closed?
>> 
>> May be we could use a Timer to test the connection periodically and keep
>> it alive. What do you think?
>> 
>> I will open a jira and try to work on that issue.
>> 
>> Best,
>> wangsan
>> 
>> 
>> 
>> On Jul 10, 2018, at 8:38 PM, Hequn Cheng  wrote:
>> 
>> Hi wangsan,
>> 
>> I agree with you. It would be kind of you to open a jira to check the
>> problem.
>> 
>> For the first problem, I think we need to establish connection each time
>> execute batch write. And, it is better to get the connection from a
>> connection pool.
>> For the second problem, to avoid multithread problem, I think we should
>> synchronized the batch object in flush() method.
>> 
>> What do you think?
>> 
>> Best, Hequn
>> 
>> 
>> 
>> On Tue, Jul 10, 2018 at 2:36 PM, wangsan > > wrote:
>> 
>>> Hi all,
>>> 
>>> I'm going to use JDBCAppendTableSink and JDBCOutputFormat in my Flink
>>> application. But I am confused with the implementation of JDBCOutputFormat.
>>> 
>>> 1. The Connection was established when JDBCOutputFormat is opened, and
>>> will be used all the time. But if this connection lies idle for a long time,
>>> the database will force close the connection, thus errors may occur.
>>> 2. The flush() method is called when batchCount exceeds the threshold,
>>> but it is also called while snapshotting state. So two threads may modify
>>> upload and batchCount, but without synchronization.
>>> 
>>> Please correct me if I am wrong.
>>> 
>>> ——
>>> wangsan



Re: Some question about document

2018-07-11 Thread Chesnay Schepler
1) TypeInformation are used to create serializers, comparators and to 
verify correctness of certain operations (like projections on tuple 
datasets).


2) see 
https://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html


3) Flink comes with a number of serializers for varying types, as 
outlined here:
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html#flinks-typeinformation-class
For POJOs Flink has a custom serializer. For arbitrary objects we use 
kryo, and can use Avro as a fallback.
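
For reference, a minimal sketch (not from the original mails) of how the serializer choice can be steered from user code via the ExecutionConfig; these setters exist in the DataStream API around Flink 1.5:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SerializerConfigExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Fail fast whenever a type would fall back to Kryo's generic serialization.
        env.getConfig().disableGenericTypes();

        // Alternatively, force Avro instead of Flink's built-in POJO serializer:
        // env.getConfig().enableForceAvro();
    }
}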


On 11.07.2018 09:24, Yuta Morisawa wrote:

Hi all

Now, I'm reading the Flink documentation and there are some points I find 
difficult to understand.

I'd appreciate it if you could explain them to me.

1, TypeInformation
 I understand TypeInformation is used for selecting the relevant 
serializer and comparator.

 But the documentation doesn't specify whether it is used in any other way.

 So, what I want to know is what kinds of processing benefit 
from TypeInformation other than serializers and comparators.


2, Managed Memory
 The term "managed memory" appears several times in the documentation, 
but I can't find any detailed description.
 This is the only document I found 
(https://www.slideshare.net/sbaltagi/overview-of-apacheflinkbyslimbaltagi)


 If anyone has a document that explains managed memory, please let me know.

3, Serializer
 What do the words "serializers we ship with Flink" in the documentation 
mean? I know Flink uses Avro for POJOs; is it the same thing?
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html 




Regards,
Yuta





Re: Cancelling job with savepoint fails sometimes

2018-07-11 Thread Chesnay Schepler
My guess is that this is related to 
https://issues.apache.org/jira/browse/FLINK-2491.


The relevant bit is "Failed to trigger savepoint. Decline reason: Not 
all required tasks are currently running."


So, if one task has already finished (for example a source with a small 
finite input) then the savepoint cannot be taken. The same may apply if 
a task is currently restarting, failing, etc.


On 11.07.2018 09:53, Data Engineer wrote:
I notice that sometimes when I try to cancel a Flink job with 
savepoint, the cancel fails with the following error:


org.apache.flink.util.FlinkException: Could not cancel job 
3be3d380dca9bb6a5cf0d559d54d7ff8.
at 
org.apache.flink.client.cli.CliFrontend.lambda$cancel$4(CliFrontend.java:581)
at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:955)
at 
org.apache.flink.client.cli.CliFrontend.cancel(CliFrontend.java:573)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1029)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
at 
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at 
org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
Caused by: java.util.concurrent.ExecutionException: 
java.util.concurrent.CompletionException: 
java.util.concurrent.CompletionException: 
org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed 
to trigger savepoint. Decline reason: Not all required tasks are 
currently running.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at 
org.apache.flink.client.program.rest.RestClusterClient.cancelWithSavepoint(RestClusterClient.java:385)
at 
org.apache.flink.client.cli.CliFrontend.lambda$cancel$4(CliFrontend.java:579)

... 6 more
Caused by: java.util.concurrent.CompletionException: 
java.util.concurrent.CompletionException: 
org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed 
to trigger savepoint. Decline reason: Not all required tasks are 
currently running.
at 
org.apache.flink.runtime.jobmaster.JobMaster.lambda$triggerSavepoint$13(JobMaster.java:959)
at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at 
java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
at 
java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
at 
org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:955)

at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
at 
akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)

at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at 
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed 
to trigger savepoint. Decline reason: Not all required tasks are 
currently running.
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
at 
java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:614)
at 

Re: Reply: How to find the relation between a running job and the original jar?

2018-07-11 Thread Lasse Nedergaard
Hi Chesnay

I have created an issue: https://issues.apache.org/jira/browse/FLINK-9800
Please let me know if I can do anything to help with implementing it.

Lasse

2018-07-11 9:08 GMT+02:00 Chesnay Schepler :

> For a running job there is no way to determine which jar was originally
> used for it.
>
> I remember a previous request for this feature, but I don't think a JIRA
> was created for it.
> Might be a good time to create one now.
>
> Could you open one and specify your requirements?
>
>
> On 11.07.2018 06:33, Lasse Nedergaard wrote:
>
> Hi Tang
>
> Thanks for the link. Yes, you are right and it works fine. But when I use
> the REST API for getting running jobs I can’t find any reference back to
> the jar used to start the job.
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>
>
> Den 11. jul. 2018 kl. 05.22 skrev Tang Cloud :
>
> Hi Lasse
>
>As far as I know, if you use post */jars/upload* REST API to submit
> your job, you can then get */jars* to list your user jar just uploaded.
> More information can refer to https://ci.apache.org/
> projects/flink/flink-docs-release-1.5/monitoring/rest_api.html#dispatcher
> Thanks, Yun
> --
> *From:* Lasse Nedergaard 
> *Sent:* July 10, 2018 17:40
> *To:* user
> *Subject:* How to find the relation between a running job and the original jar?
>
> Hi.
>
> I working on a solution where I want to check if a running job use the
> right jar in the right version.
>
> Anyone knows if it is possible through the REST API to find information
> about a running job that contains the jarid or something simillary so it is
> possible to lookup the original jar?
>
> Need to work for Flink 1.5.0 or 1.4.2
>
> Any help appreciated
>
> Thanks in advance
>
> Lasse Nedergaard
>
>
>


Examples about complex analysis

2018-07-11 Thread Esa Heikkinen
Hi

I would like to find some Flink examples about complex analysis (like CEP) and 
their log files.

I have already found logs for TaxiRides and how to find the long rides using 
Flink and CEP [1], but it is a rather simple analysis case (only two sequential 
events).
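
For reference, a minimal sketch (my illustration, not the exercise solution) of that kind of two-event pattern with Flink CEP in Java; the TaxiRide stand-in type and the two-hour bound are assumptions based on the exercise description:

import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.windowing.time.Time;

public class LongRidesPattern {

    // Minimal stand-in for the exercise's TaxiRide event.
    public static class TaxiRide {
        public long rideId;
        public boolean isStart;
    }

    // A START event followed by an END event within two hours; rides for
    // which the END never arrives in time show up on the timeout side.
    public static Pattern<TaxiRide, TaxiRide> ridePattern() {
        return Pattern.<TaxiRide>begin("start")
                .where(new SimpleCondition<TaxiRide>() {
                    @Override
                    public boolean filter(TaxiRide ride) {
                        return ride.isStart;
                    }
                })
                .next("end")
                .where(new SimpleCondition<TaxiRide>() {
                    @Override
                    public boolean filter(TaxiRide ride) {
                        return !ride.isStart;
                    }
                })
                .within(Time.hours(2));
    }
}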

Do you know any more complex analysis cases or examples using Flink (and 
CEP)?

I am especially interested in logs and analysis cases from IoT or ITS 
(Intelligent Transportation Systems) type systems.

[1] http://training.data-artisans.com/exercises/CEP.html

BR Esa


Cancelling job with savepoint fails sometimes

2018-07-11 Thread Data Engineer
I notice that sometimes when I try to cancel a Flink job with savepoint,
the cancel fails with the following error:

org.apache.flink.util.FlinkException: Could not cancel job
3be3d380dca9bb6a5cf0d559d54d7ff8.
at
org.apache.flink.client.cli.CliFrontend.lambda$cancel$4(CliFrontend.java:581)
at
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:955)
at
org.apache.flink.client.cli.CliFrontend.cancel(CliFrontend.java:573)
at
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1029)
at
org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
at
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at
org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
Caused by: java.util.concurrent.ExecutionException:
java.util.concurrent.CompletionException:
java.util.concurrent.CompletionException:
org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to
trigger savepoint. Decline reason: Not all required tasks are currently
running.
at
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at
org.apache.flink.client.program.rest.RestClusterClient.cancelWithSavepoint(RestClusterClient.java:385)
at
org.apache.flink.client.cli.CliFrontend.lambda$cancel$4(CliFrontend.java:579)
... 6 more
Caused by: java.util.concurrent.CompletionException:
java.util.concurrent.CompletionException:
org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to
trigger savepoint. Decline reason: Not all required tasks are currently
running.
at
org.apache.flink.runtime.jobmaster.JobMaster.lambda$triggerSavepoint$13(JobMaster.java:959)
at
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at
java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
at
java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
at
org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:955)
at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
at
akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.concurrent.CompletionException:
org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to
trigger savepoint. Decline reason: Not all required tasks are currently
running.
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
at
java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:614)
at
java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:1983)
at
org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:947)
... 20 more
Caused by: org.apache.flink.runtime.checkpoint.CheckpointTriggerException:
Failed to trigger savepoint. Decline reason: Not all required tasks are
currently running.
at
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerSavepoint(CheckpointCoordinator.java:377)
at

Re: Checkpointing in Flink 1.5.0

2018-07-11 Thread Data Engineer
As a workaround, we commented out  state.backend.rocksdb.localdir since it
defaults to the taskmanager.tmp.dirs location.

Now, we are having only these state backend configs in our flink-conf.yaml:
state.backend: rocksdb
state.checkpoints.dir: file:///home/demo/checkpoints/ext_checkpoints
state.savepoints.dir: file:///home/demo/checkpoints/checkpoints/savepoints

Checkpointing and savepointing work with the above configs.
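
For comparison, the same setup can also be expressed programmatically; a minimal
sketch (the class name and the paths below are placeholders, not our actual job):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbBackendSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint data goes to the (shared) checkpoint directory, given as a FileSystem URI.
        RocksDBStateBackend backend =
                new RocksDBStateBackend("file:///home/demo/checkpoints/ext_checkpoints");

        // The local RocksDB working directory must be an absolute local path;
        // relative paths are rejected with an IllegalArgumentException.
        backend.setDbStoragePaths("/tmp/rocksdb_local");

        env.setStateBackend(backend);
    }
}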

However, I wasn't able to find the rocksdb directory, which was supposed to
be inside the /tmp directory. These are the directories I found inside /tmp
in the taskmanager:

drwxr-xr-x.  2 flink flink 4096 Jul 11 05:23
blobStore-122d93f5-35c9-4d8a-9632-e0e65f766825
drwxr-xr-x. 14 flink flink 4096 Jul 11 07:23
blobStore-c7d3433b-8e6d-4195-a431-d9392b638b5f
-rw-r--r--.  1 flink flink4 Jul 11 05:23 flink--taskexecutor.pid
drwxr-xr-x.  2 flink flink 4096 Jul 11 05:23
flink-dist-cache-08e706f9-a388-4a6d-8774-849207746783
drwxr-xr-x.  2 flink flink 4096 Jul 11 05:23
flink-io-287519fb-4cc8-4396-9072-584e3fae0dcc
drwxr-xr-x.  2 flink flink 4096 Jul 11 05:23 hsperfdata_flink
drwxr-xr-x.  2 root  root  4096 May 16 12:54 hsperfdata_root
-rw-r--r--.  1 flink flink 1179 Jul 11 05:23 jaas-4321856842187934442.conf
drwxr-xr-x.  2 flink flink 4096 Jul 11 07:17 localState

No sign of any rocksdb directory. Or is it not being used at all?



On Tue, Jul 10, 2018 at 12:45 PM, Sampath Bhat 
wrote:

> Chesnay - Why is the absolute file check required in
> RocksDBStateBackend.setDbStoragePaths(String... paths)? I think this is
> causing the issue. It's not related to GlusterFS or the file system. The same
> problem can be reproduced with the following configuration on a local
> machine. The Flink application should support checkpointing. We get the
> IllegalArgumentException (relative file paths are not allowed).
>
> state.backend: rocksdb
> state.checkpoints.dir: file:///home/demo/checkpoints/ext_checkpoints
> state.savepoints.dir: file:///home/demo/checkpoints/checkpoints/savepoints
> state.backend.fs.checkpointdir: file:///home/demo/checkpoints/
> checkpoints/fs_state
> #state.backend.rocksdb.checkpointdir: file:///home/demo/checkpoints/
> checkpoints/rocksdb_state
> state.backend.rocksdb.localdir: /home/demo/checkpoints/
> checkpoints/rocksdb_state
>
> Any insights would be helpful.
>
> On Wed, Jul 4, 2018 at 2:27 PM, Chesnay Schepler 
> wrote:
>
>> Reference: https://issues.apache.org/jira/browse/FLINK-9739
>>
>>
>> On 04.07.2018 10:46, Chesnay Schepler wrote:
>>
>> It's not really path-parsing logic, but path handling i suppose; see
>> RocksDBStateBackend#setDbStoragePaths().
>>
>> I went ahead and converted said method into a simple test method, maybe
>> this is enough to debug the issue.
>>
>> I assume this regression was caused by FLINK-6557, which refactored the
>> state backend to rely on java Files instead of Flink paths.
>> I'll open a JIRA to document it.
>>
>> The deprecation notice is not a problem.
>>
>> public static void testPaths(String... paths) {
>>     if (paths.length == 0) {
>>         throw new IllegalArgumentException("empty paths");
>>     } else {
>>         File[] pp = new File[paths.length];
>>         for (int i = 0; i < paths.length; i++) {
>>             final String rawPath = paths[i];
>>             final String path;
>>             if (rawPath == null) {
>>                 throw new IllegalArgumentException("null path");
>>             } else {
>>                 // we need this for backwards compatibility, to allow URIs like 'file:///'...
>>                 URI uri = null;
>>                 try {
>>                     uri = new Path(rawPath).toUri();
>>                 } catch (Exception e) {
>>                     // cannot parse as a path
>>                 }
>>
>>                 if (uri != null && uri.getScheme() != null) {
>>                     if ("file".equalsIgnoreCase(uri.getScheme())) {
>>                         path = uri.getPath();
>>                     } else {
>>                         throw new IllegalArgumentException("Path " + rawPath + " has a non-local scheme");
>>                     }
>>                 } else {
>>                     path = rawPath;
>>                 }
>>             }
>>
>>             pp[i] = new File(path);
>>             if (!pp[i].isAbsolute()) { // my suspicion is that this categorically fails for GlusterFS paths
>>                 throw new IllegalArgumentException("Relative paths are not supported");
>>             }
>>         }
>>     }
>> }
>>
>>
>>
>> On 03.07.2018 16:35, Jash, Shaswata (Nokia - IN/Bangalore) wrote:
>>
>> Hello Chesnay,
>>
>>
>>
>> Our cluster-wide (in Kubernetes) checkpointing directory using a glusterfs
>> volume mount (thus the file access protocol file:///) was working fine up to
>> 1.4.2 for us. So we would like to understand where the breakage happened in
>> 1.5.0.
>>
>> Can you please point me to the relevant source code files related to the
>> rocksdb “custom file path” parsing logic? We would be interested in
>> investigating this.
>>
>>
>>
>> I also observed below in the log –
>>
>>
>>
>> Config uses 

Flink job hangs using rocksDb as backend

2018-07-11 Thread shishal
Hi,

I am using Flink 1.4.2 with RocksDB as the state backend. I am using a process
function with a timer on event time. For checkpointing I am using HDFS.

I am doing load testing, so I am reading Kafka from the beginning (approx. 7 days
of data with 50M events).

My job gets stuck after approx. 20 minutes with no error. After that the watermark
does not progress and all checkpoints fail.

Also, when I try to cancel my job (using the web UI), it takes several minutes
until it finally gets cancelled, and it brings the task manager down as well.

There are no logs while my job hangs, but while cancelling I get the following
error.


2018-07-11 09:10:39,385 ERROR
org.apache.flink.runtime.taskmanager.TaskManager  - 
==
==  FATAL  ===
==

A fatal error occurred, forcing the TaskManager to shut down: Task 'process
(3/6)' did not react to cancelling signal in the last 30 seconds, but is
stuck in method:
 org.rocksdb.RocksDB.get(Native Method)
org.rocksdb.RocksDB.get(RocksDB.java:810)
org.apache.flink.contrib.streaming.state.RocksDBMapState.get(RocksDBMapState.java:102)
org.apache.flink.runtime.state.UserFacingMapState.get(UserFacingMapState.java:47)
nl.ing.gmtt.observer.analyser.job.rtpe.process.RtpeProcessFunction.onTimer(RtpeProcessFunction.java:94)
org.apache.flink.streaming.api.operators.KeyedProcessOperator.onEventTime(KeyedProcessOperator.java:75)
org.apache.flink.streaming.api.operators.HeapInternalTimerService.advanceWatermark(HeapInternalTimerService.java:288)
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:108)
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:885)
org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:288)
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:189)
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
java.lang.Thread.run(Thread.java:748)

2018-07-11 09:10:39,390 DEBUG
org.apache.flink.runtime.akka.StoppingSupervisorWithoutLoggingActorKilledExceptionStrategy
 
- Actor was killed. Stopping it now.
akka.actor.ActorKilledException: Kill
2018-07-11 09:10:39,407 INFO 
org.apache.flink.runtime.taskmanager.TaskManager  - Stopping
TaskManager akka://flink/user/taskmanager#-1231617791.
2018-07-11 09:10:39,408 INFO 
org.apache.flink.runtime.taskmanager.TaskManager  - Cancelling
all computations and discarding all cached data.
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Attempting to fail task externally process (3/6)
(432fd129f3eea363334521f8c8de5198).
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Task process (3/6) is already in state CANCELING
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Attempting to fail task externally process (4/6)
(7c6b96c9f32b067bdf8fa7c283eca2e0).
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Task process (4/6) is already in state CANCELING
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Attempting to fail task externally process (2/6)
(a4f731797a7ea210fd0b512b0263bcd9).
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Task process (2/6) is already in state CANCELING
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Attempting to fail task externally process (1/6)
(cd8a113779a4c00a051d78ad63bc7963).
2018-07-11 09:10:39,409 INFO  org.apache.flink.runtime.taskmanager.Task 
   
- Task process (1/6) is already in state CANCELING
2018-07-11 09:10:39,409 INFO 
org.apache.flink.runtime.taskmanager.TaskManager  -
Disassociating from JobManager
2018-07-11 09:10:39,412 INFO 
org.apache.flink.runtime.blob.PermanentBlobCache  - Shutting
down BLOB cache
2018-07-11 09:10:39,431 INFO 
org.apache.flink.runtime.blob.TransientBlobCache  - Shutting
down BLOB cache
2018-07-11 09:10:39,444 INFO 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
Stopping ZooKeeperLeaderRetrievalService.
2018-07-11 09:10:39,444 DEBUG

Some question about document

2018-07-11 Thread Yuta Morisawa

Hi all

I'm currently reading the Flink documentation, and there are a few points I 
find difficult to understand.

I'd appreciate it if you could explain them to me.

1. TypeInformation
 I understand TypeInformation is used for selecting the relevant serializer 
and comparator.

 But the documentation doesn't say whether it is used in other ways.

 So, what I want to know is which kinds of processing benefit from 
TypeInformation other than serializers and comparators (see the sketch after 
question 3 below).


2. Managed Memory
 The term "managed memory" appears several times in the documentation, 
but I can't find any detailed description.
 This is the only document I found 
(https://www.slideshare.net/sbaltagi/overview-of-apacheflinkbyslimbaltagi).


 If anyone has a document that explains managed memory, please let me know.

3. Serializer
 What does the phrase "serializers we ship with Flink" in the document 
mean? I know Flink uses Avro for POJOs; is it the same thing?

https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/types_serialization.html
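
Regarding question 1, here is a minimal sketch of how TypeInformation feeds the
serializer; MyPojo stands for any user-defined POJO class and is not from the
documentation:

import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeutils.TypeSerializer;

// TypeInformation describes the data type ...
TypeInformation<MyPojo> info = TypeInformation.of(MyPojo.class);

// ... and acts as the factory for the serializer (and, for key types, the
// comparator) that Flink's runtime uses when shipping and sorting records.
TypeSerializer<MyPojo> serializer = info.createSerializer(new ExecutionConfig());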


Regards,
Yuta

--


 Business Vision :"Challenge for the future"
-

 2-1-15 Ohara, Fujimino-shi, Saitama 356-8502, Japan
 KDDI Research, Inc.
 Connected Car 1G
 Yuta Morisawa
 mail yu-moris...@kddi-research.jp
 tel  070-3871-8883

 This e-mail and any attachments contain confidential information
 intended solely for the named recipient. Use by anyone other than
 the named recipient is not permitted. Disclosure, copying, or
 redistribution of the contents by anyone other than the named
 recipient is strictly prohibited and may be unlawful. If you have
 received this e-mail in error, please notify the sender immediately
 and delete this message.



Re: 答复: How to find the relation between a running job and the original jar?

2018-07-11 Thread Chesnay Schepler
For a running job there is no way to determine which jar was originally 
used for it.


I remember a previous request for this feature, but I don't think a JIRA 
was created for it.

Might be a good time to create one now.

Could you open one and specify your requirements?

On 11.07.2018 06:33, Lasse Nedergaard wrote:

Hi Tang

Thanks for the link. Yes, you are right and it works fine. But when I 
use the REST API for getting running jobs I can’t find any reference 
back to the jar used to start the job.


Med venlig hilsen / Best regards
Lasse Nedergaard


Den 11. jul. 2018 kl. 05.22 skrev Tang Cloud >:



Hi Lasse

   As far as I know, if you use the POST */jars/upload* REST API to 
submit your job, you can then GET */jars* to list the user jar you just 
uploaded. For more information, refer to 
https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/rest_api.html#dispatcher

Thanks, Yun

*From:* Lasse Nedergaard

*Sent:* July 10, 2018 17:40
*To:* user
*Subject:* How to find the relation between a running job and the original 
jar?

Hi.

I am working on a solution where I want to check if a running job uses 
the right jar in the right version.


Does anyone know if it is possible through the REST API to find 
information about a running job that contains the jar id or something 
similar, so that it is possible to look up the original jar?


Need to work for Flink 1.5.0 or 1.4.2

Any help appreciated

Thanks in advance

Lasse Nedergaard





Re: Confusions About JDBCOutputFormat

2018-07-11 Thread Hequn Cheng
Hi wangsan,

What I mean is establishing a connection each time we write data into JDBC,
i.e., establishing a connection in the flush() function. I think this will make
sure the connection is OK. What do you think?
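
For illustration only, a flush() along those lines might look roughly like the
sketch below. This is not the actual JDBCOutputFormat code; the fields
(connection, upload, batchCount) and the reconnect() helper are assumptions,
while Connection.isValid() is the standard JDBC validity check:

// Sketch only: synchronize to avoid the upload/batchCount race, and revalidate
// the connection so an idle, server-closed connection does not fail the batch.
public synchronized void flush() throws IOException {
    try {
        if (connection == null || !connection.isValid(5)) {
            reconnect(); // hypothetical helper: re-open the connection and re-prepare 'upload'
        }
        upload.executeBatch();   // 'upload' is the PreparedStatement holding the batch
        batchCount = 0;
    } catch (SQLException e) {
        throw new IOException("Writing records to JDBC failed.", e);
    }
}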

On Wed, Jul 11, 2018 at 12:12 AM, wangsan  wrote:

> Hi Hequn,
>
> Establishing a connection for each batch write may also have the idle
> connection problem, since we are not sure when the connection will be
> closed. We call the flush() method when a batch is finished or when we
> snapshot state, but what if snapshotting is not enabled and the batch size
> is not reached before the connection is closed?
>
> Maybe we could use a timer to test the connection periodically and keep
> it alive. What do you think?
>
> I will open a jira and try to work on that issue.
>
> Best,
> wangsan
>
>
>
> On Jul 10, 2018, at 8:38 PM, Hequn Cheng  wrote:
>
> Hi wangsan,
>
> I agree with you. It would be kind of you to open a jira to check the
> problem.
>
> For the first problem, I think we need to establish a connection each time
> we execute a batch write. Also, it is better to get the connection from a
> connection pool.
> For the second problem, to avoid multithreading problems, I think we should
> synchronize on the batch object in the flush() method.
>
> What do you think?
>
> Best, Hequn
>
>
>
> On Tue, Jul 10, 2018 at 2:36 PM, wangsan  wrote:
>
>> Hi all,
>>
>> I'm going to use JDBCAppendTableSink and JDBCOutputFormat in my Flink
>> application. But I am confused by the implementation of JDBCOutputFormat.
>>
>> 1. The Connection is established when the JDBCOutputFormat is opened, and
>> is then used the whole time. But if this connection lies idle for a long time,
>> the database will force-close the connection, and errors may occur.
>> 2. The flush() method is called when batchCount exceeds the threshold,
>> but it is also called while snapshotting state. So two threads may modify
>> upload and batchCount without synchronization.
>>
>> Please correct me if I am wrong.
>>
>> ——
>> wangsan
>>
>
>
>


Re: Filter columns of a csv file with Flink

2018-07-11 Thread Hequn Cheng
Hi francois,

> Is there any plan to give avro schemas a better role in Flink in future
> versions?
I haven't heard about Avro support for CSV. You can open a JIRA for it. Maybe
also contribute it to Flink :-)


On Tue, Jul 10, 2018 at 11:32 PM, françois lacombe <
francois.laco...@dcbrain.com> wrote:

> Hi Hequn,
>
> 2018-07-10 3:47 GMT+02:00 Hequn Cheng :
>
>> Maybe I misunderstand you. So you don't want to skip the whole file?
>>
> Yes I do
> By skipping the whole file I mean "throw an Exception to stop the process
> and inform the user that the file is invalid for a given reason" and not "the
> process completes successfully and imports 0 rows"
>
>
>> If does, then "extending CsvTableSource and provide the avro schema to
>> the constructor without creating a custom AvroInputFormat" is ok.
>>
>
> Then we agree on this
> Is there any plan to give avro schemas a better role in Flink in future
> versions?
> Avro schemas are perfect to build CSVTableSource with code like
>
> for (Schema field_nfo : sch.getTypes()) {
>     // Test if the csv file header actually contains a field corresponding to the schema
>     if (!csv_headers.contains(field_nfo.getName())) {
>         throw new NoSuchFieldException(field_nfo.getName());
>     }
>
>     // Declare the field in the source Builder
>     src_builder.field(field_nfo.getName(), primitiveTypes.get(field_nfo.getType()));
> }
>
> All the best
>
> François
>
>
>
>> On Mon, Jul 9, 2018 at 11:03 PM, françois lacombe <
>> francois.laco...@dcbrain.com> wrote:
>>
>>> Hi Hequn,
>>>
>>> 2018-07-09 15:09 GMT+02:00 Hequn Cheng :
>>>
 The first step requires an AvroInputFormat because the source needs
 AvroInputFormat to read Avro data if the data matches the schema.

>>>
>>> I don't want avro data, I just want to check if my csv file has the
>>> same fields as defined in a given avro schema.
>>> Processing should stop if and only if I find missing columns.
>>>
>>> A record which does not match the schema (types mainly) should be rejected
>>> and logged in a dedicated file, but the processing can go on.
>>>
>>> How about extending CsvTableSource and providing the avro schema to the
>>> constructor without creating a custom AvroInputFormat?
>>>
>>>
>>> François
>>>
>>
>>
>