Re: Flink SQL: Execute DELETE queries

2019-05-28 Thread JingsongLee
Hi @Papadopoulos, Konstantinos
I think you can try something like this:
JDBCAppendTableSink sink = JDBCAppendTableSink.builder()
   .setDrivername("foo")
   .setDBUrl("bar")
   .setQuery("delete from %s where id = ?)")
   .setParameterTypes(FIELD_TYPES)
   .build();
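For reference, a minimal sketch of wiring such a sink into the Table API (a hedged illustration, not from the original mail; the table, field names, and types are placeholders, and the '%s' in the query above would need to be substituted with the target table name first):

// assumed imports:
// import org.apache.flink.api.common.typeinfo.TypeInformation;
// import org.apache.flink.api.common.typeinfo.Types;
tableEnv.registerTableSink(
    "deleteSink",
    new String[] {"id"},
    new TypeInformation<?>[] {Types.INT},
    sink);
// Every row emitted to the sink executes the DELETE statement once,
// binding the row's fields to the '?' parameters.
tableEnv.sqlQuery("SELECT id FROM sourceTable WHERE obsolete = true")
    .insertInto("deleteSink");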
Or you can implement your own sink, in which you can delete rows from the DB table.
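A hedged sketch of that alternative (the JDBC URL, table name, and element type are placeholders chosen for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class JdbcDeleteSink extends RichSinkFunction<Integer> {
    private transient Connection connection;
    private transient PreparedStatement statement;

    @Override
    public void open(Configuration parameters) throws Exception {
        // placeholder URL/credentials
        connection = DriverManager.getConnection("jdbc:derby:memory:test");
        statement = connection.prepareStatement("DELETE FROM my_table WHERE id = ?");
    }

    @Override
    public void invoke(Integer id, Context context) throws Exception {
        // one DELETE per incoming key
        statement.setInt(1, id);
        statement.executeUpdate();
    }

    @Override
    public void close() throws Exception {
        if (statement != null) statement.close();
        if (connection != null) connection.close();
    }
}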

Best, JingsongLee


--
From: Papadopoulos, Konstantinos
Send Time: Tuesday, May 28, 2019 22:54
To: Vasyl Bervetskyi
Cc: user@flink.apache.org
Subject: RE: Flink SQL: Execute DELETE queries

  
The case I have in mind is to have an external JDBC table sink and try to 
delete some or all rows of the target DB table. Is that possible using 
Flink SQL?
 
From: Vasyl Bervetskyi  
Sent: Tuesday, May 28, 2019 5:36 PM
To: Papadopoulos, Konstantinos 
Cc: user@flink.apache.org
Subject: RE: Flink SQL: Execute DELETE queries  
Hi Papadopoulos,
 
Unfortunately not; there are no DELETE or MODIFY queries. You should create a new 
table as the result of a query that filters records from the existing one.
 
From: Papadopoulos, Konstantinos  
Sent: Tuesday, May 28, 2019 5:25 PM
To: user@flink.apache.org
Subject: Flink SQL: Execute DELETE queries  
 
Hi all,
 
I am experimenting with the Flink Table API & SQL and have the following question: is 
there any way to execute DELETE queries using Flink SQL?
 
Thanks in advance,
Konstantinos  

Building Flink distribution with Scala 2.12

2019-05-28 Thread Boris Lublinsky
Hi, I am trying to build the Flink distribution for Scala 2.12 using the following 
command:
mvn clean package -pl flink-dist -am -Pscala-2.12 -DskipTests
but I am getting the following error:

[INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce (enforce-versions) @ 
flink-runtime_2.12 ---
[WARNING] Rule 0: org.apache.maven.plugins.enforcer.BannedDependencies failed 
with message:
Found Banned Dependency: com.typesafe.akka:akka-testkit_2.12:jar:2.4.20
Found Banned Dependency: 
org.apache.flink:flink-queryable-state-client-java_2.12:jar:1.8.0
Found Banned Dependency: com.typesafe.akka:akka-remote_2.12:jar:2.4.20
Found Banned Dependency: 
org.scala-lang.modules:scala-java8-compat_2.12:jar:0.8.0
Found Banned Dependency: com.typesafe:ssl-config-core_2.12:jar:0.2.1
Found Banned Dependency: org.clapper:grizzled-slf4j_2.12:jar:1.3.2
Found Banned Dependency: com.github.scopt:scopt_2.12:jar:3.5.0
Found Banned Dependency: com.typesafe.akka:akka-protobuf_2.12:jar:2.4.20
Found Banned Dependency: com.twitter:chill_2.12:jar:0.7.6
Found Banned Dependency: org.scalatest:scalatest_2.12:jar:3.0.0
Found Banned Dependency: com.typesafe.akka:akka-actor_2.12:jar:2.4.20
Found Banned Dependency: com.typesafe.akka:akka-slf4j_2.12:jar:2.4.20
Found Banned Dependency: org.scalactic:scalactic_2.12:jar:3.0.0
Found Banned Dependency: com.typesafe.akka:akka-stream_2.12:jar:2.4.20
Found Banned Dependency: org.scala-lang.modules:scala-xml_2.12:jar:1.0.5
Found Banned Dependency: 
org.scala-lang.modules:scala-parser-combinators_2.12:jar:1.0.4
Use 'mvn dependency:tree' to locate the source of the banned dependencies.
[INFO] -

Which fails the build.

At the same time, the following build works fine:

mvn clean package -pl flink-dist -am -Pscala-2.11 -DskipTests

Am I doing something wrong?


Boris Lublinsky
FDP Architect
boris.lublin...@lightbend.com
https://www.lightbend.com/



Re: Upgrading from 1.4 to 1.8, losing Kafka consumer state

2019-05-28 Thread Nikolas Davis
I checked the logs thanks to Paul's suggestion. I see a couple of interesting
things. Restoring into 1.8 from a 1.4 savepoint, some TMs receive partial
state (e.g. only a partition/offset pair or two per TM -- we have 8
partitions on this topic). I'm not sure if this is normal (e.g. maybe TMs
only receive the state that they care about). I focused on one
topic, and I noticed that for at least one partition there is no restored
state. Regardless of there being some state, it appears that all consumers
are starting from scratch. What's also weird is that, again, we start from
the earliest offsets. The partition/offset state that is "restored" looks
healthy -- e.g. valid partitions and offsets. We use the default of
setStartFromGroupOffsets as well as the default Kafka option for
auto.offset.reset. I believe this should cause it to read from the latest
offset in the absence of state, not the earliest. We are using the same
consumer group as the legacy 1.4 app that we are restoring from, and we shut
off the 1.4 job before starting our new cluster up.
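For reference, a minimal sketch of the consumer setup being described (a hedged illustration; the broker, topic, and group names are placeholders):

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "broker:9092");   // placeholder
props.setProperty("group.id", "same-group-as-1.4-job");  // placeholder
// auto.offset.reset deliberately left at the Kafka default

FlinkKafkaConsumer010<String> consumer =
    new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props);
// The default start position: restored state wins; otherwise the committed
// group offsets are used; auto.offset.reset only kicks in when the group
// has no committed offset for a partition.
consumer.setStartFromGroupOffsets();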

I also received some of these errors:

2019-05-24 22:10:57,479 INFO
 org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase  - No
restore state for FlinkKafkaConsumer.
2019-05-24 22:10:59,511 INFO
 org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase  - No
restore state for FlinkKafkaConsumer.
2019-05-24 22:10:59,524 INFO
 org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase  - No
restore state for FlinkKafkaConsumer.

Using a 1.8 savepoint and restoring to 1.8, I see that each TM receives all
state for the cluster, e.g. each receives every partition/offset pair for
the topic I'm digging into. I also see none of the errors above. There is
no inrush of data -- it appears to be restoring from known offsets well.

Is there some change in how state is managed or what state is stored
between these versions that can cause this? I can post more of the logs if
it is of help. Is there some intermediate version of Flink (1.5-1.7) that
we'd be able to restore / create a savepoint from to ensure the continuity
of our state in 1.8? Any other thoughts?

Thanks again,

Nik Davis
Senior Software Engineer
New Relic


On Fri, May 24, 2019 at 12:26 AM Paul Lam  wrote:

> Hi Nik,
>
> Could you check the taskmanagers’ logs? When restoring from a
> savepoint/checkpoint, FlinkKafkaConsumer logs the starting offsets of the
> Kafka partitions.
>
> WRT `auto.offset.reset` in the Kafka configuration, it’s of relatively low
> priority, and would only be used when there’s no restored state and
> FlinkKafkaConsumer is set to `startFromGroupOffsets`.
>
> Best,
> Paul Lam
>
> On May 24, 2019, at 07:50, Nikolas Davis wrote:
>
> Howdy,
>
> We're in the process of upgrading to 1.8. When restoring state to the new
> cluster (using a savepoint) we are seeing our Kafka consumers restart from
> the earliest offset. We're not receiving any other indication that our
> state was not accepted as part of the deploy; e.g. we are not allowing
> unrestored state and are not receiving any errors.
>
> We have our consumers set up with the same consumer group and using the
> same consumer (FlinkKafkaConsumer010) as our 1.4 deploy.
>
> Has anyone encountered this? Any idea what we might be doing wrong?
>
> What's also strange is that we are not setting auto.offset.reset, which
> defaults to largest (analogous to latest, correct?) -- which is not what
> we're seeing happen.
>
> Regards,
>
> Nik
>
>
>


RE: Flink SQL: Execute DELETE queries

2019-05-28 Thread Papadopoulos, Konstantinos
The case I have in mind is to have an external JDBC table sink and try to 
delete some or all rows of the target DB table. Is that possible using 
Flink SQL?

From: Vasyl Bervetskyi 
Sent: Tuesday, May 28, 2019 5:36 PM
To: Papadopoulos, Konstantinos 
Cc: user@flink.apache.org
Subject: RE: Flink SQL: Execute DELETE queries

Hi Papadopoulos,

Unfortunately not; there are no DELETE or MODIFY queries. You should create a new 
table as the result of a query that filters records from the existing one.

From: Papadopoulos, Konstantinos 
Sent: Tuesday, May 28, 2019 5:25 PM
To: user@flink.apache.org
Subject: Flink SQL: Execute DELETE queries

Hi all,

I am experimenting with the Flink Table API & SQL and have the following question: is 
there any way to execute DELETE queries using Flink SQL?

Thanks in advance,
Konstantinos


RE: Flink SQL: Execute DELETE queries

2019-05-28 Thread Vasyl Bervetskyi
Hi Papadopoulos,

Unfortunately not; there are no DELETE or MODIFY queries. You should create a new 
table as the result of a query that filters records from the existing one.
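For illustration, a hedged sketch of that workaround (the table and column names are made up):

// Instead of DELETE FROM orders WHERE status = 'obsolete', keep the rows
// you want and write them to a new table/sink:
Table remaining = tableEnv.sqlQuery(
    "SELECT * FROM orders WHERE status <> 'obsolete'");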

From: Papadopoulos, Konstantinos 
Sent: Tuesday, May 28, 2019 5:25 PM
To: user@flink.apache.org
Subject: Flink SQL: Execute DELETE queries

Hi all,

I am experimenting with the Flink Table API & SQL and have the following question: is 
there any way to execute DELETE queries using Flink SQL?

Thanks in advance,
Konstantinos


Re: [External] Re: How many task managers can a cluster reasonably handle?

2019-05-28 Thread Antonio Verardi
Thanks for the info, Xintong Song!

Cheers,
Antonio


On Fri, May 24, 2019 at 3:38 AM Xintong Song  wrote:

> Hi Antonio,
>
> According to our production experience, Flink can certainly handle 150
> TaskManagers per cluster. Actually, we have encountered much larger jobs,
> where a single job demands thousands of TaskManagers.
> However, as the job scale increases, it gets harder to achieve good
> stability: with more tasks, there is a higher chance of a job failover (or
> region failover, if possible) caused by a single task failure. So if you
> don't have jobs at that scale, I think 150 TaskManagers per
> cluster would be a good choice.
>
> In case you do encounter a JobManager performance bottleneck, usually it
> can be solved by increasing the JobManager's resources with a '-jm'
> argument.
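> For example (a hedged illustration; the memory values are placeholders),
> when the cluster is deployed as a YARN session:
>
>   ./bin/yarn-session.sh -jm 4096m -tm 8192m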
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, May 24, 2019 at 2:33 AM Antonio Verardi  wrote:
>
>> Hello Flink users,
>>
>> How many task managers can one expect a Flink cluster to reasonably
>> handle?
>>
>> I want to move a pretty big cluster from a setup on AWS EMR to one based
>> on Kubernetes. I was wondering whether it makes sense to break up the beefy
>> task managers the cluster had into something like 150 task manager containers
>> with one slot each. This is a pattern that a couple of different people I met at
>> meetups told me they are using in production, but I don't know if they
>> tried something similar at this scale. Would the jobmanager be able to
>> manage so many task managers, in your opinion?
>>
>> Cheers,
>> Antonio
>>
>


Flink SQL: Execute DELETE queries

2019-05-28 Thread Papadopoulos, Konstantinos
Hi all,

I am experimenting with the Flink Table API & SQL and have the following question: is 
there any way to execute DELETE queries using Flink SQL?

Thanks in advance,
Konstantinos


Re: Restore state class not found exception in 1.8

2019-05-28 Thread Lasse Nedergaard
Hi Gordon

We have found a solution, but not why it happens on 1.8. 
For it to work we need to call
env.registerType(ReportMessage.class)

ReportMessage extends ReportMessageBase, and the state operator uses 
ReportMessageBase. 
So we need to register all the classes that extend a class used in state. 
I don't know why this is needed in 1.8.
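For reference, a minimal sketch of the workaround (the surrounding job setup is assumed):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Register every subtype that can occur in state declared with the base
// type, so the serializer can resolve it when the savepoint is restored.
env.registerType(ReportMessage.class);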

Med venlig hilsen / Best regards
Lasse Nedergaard


> On 28 May 2019, at 10:06, Tzu-Li (Gordon) Tai wrote:
> 
> Hi Lasse,
> 
> Did you move the class to a different namespace / package, or change it to
> be a nested class, across the Flink versions?
> That would be the only cause I could reason about at the moment.
> 
> If possible, could you also have a very minimal snippet / instructions on how 
> I can maybe reproduce this?
> That might give me more insight.
> 
> Cheers,
> Gordon
> 
>> On Mon, May 27, 2019 at 7:52 PM Lasse Nedergaard  
>> wrote:
>> Hi.
>> 
>> When we restart some of our jobs from a savepoint we see the exception 
>> below. It only happens for some of our jobs, and we didn't see it in 1.7.2. 
>> The class Flink can't find differs from job to job, and we are sure it's 
>> included in our fat jar.
>> As a side note, we are on our way to using Avro instead of POJOs, but are not 
>> there yet.
>> If anyone has a clue what the root cause could be, and how to resolve it, it 
>> would be appreciated.
>> Thanks in advance
>> 
>> Lasse Nedergaard
>> 
>> java.lang.Exception: Exception while creating StreamOperatorStateContext.
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:195)
>>  at 
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:250)
>>  at 
>> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
>>  at 
>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
>>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>>  at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.flink.util.FlinkException: Could not restore operator 
>> state backend for StreamSink_609b5f7fc746f29234b038c121356a9b_(2/2) from any 
>> of the 1 provided restore options.
>>  at 
>> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.operatorStateBackend(StreamTaskStateInitializerImpl.java:255)
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:143)
>>  ... 5 more
>> Caused by: org.apache.flink.runtime.state.BackendBuildingException: Failed 
>> when trying to restore operator state backend
>>  at 
>> org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder.build(DefaultOperatorStateBackendBuilder.java:86)
>>  at 
>> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createOperatorStateBackend(RocksDBStateBackend.java:537)
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$operatorStateBackend$0(StreamTaskStateInitializerImpl.java:246)
>>  at 
>> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
>>  at 
>> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
>>  ... 7 more
>> Caused by: java.lang.RuntimeException: Cannot instantiate class.
>>  at 
>> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:384)
>>  at 
>> org.apache.flink.runtime.state.OperatorStateRestoreOperation.deserializeOperatorStateValues(OperatorStateRestoreOperation.java:191)
>>  at 
>> org.apache.flink.runtime.state.OperatorStateRestoreOperation.restore(OperatorStateRestoreOperation.java:165)
>>  at 
>> org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder.build(DefaultOperatorStateBackendBuilder.java:83)
>>  ... 11 more
>> Caused by: java.lang.ClassNotFoundException: 
>> org/trackunit/tm2/formats/ReportMessage
>>  at java.lang.Class.forName0(Native Method)
>>  at java.lang.Class.forName(Class.java:348)
>>  at 
>> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:382)
>>  ... 14 more
>> 
>>  


Flink CLI distributed cache fault

2019-05-28 Thread Vasyl Bervetskyi
Hi there,

I ran into an issue when adding a file to the distributed cache in Flink.
My setup:

-  Java 1.8

-  Flink 1.8

-  OS: Windows, Linux
Test scenario:

1.   Create simple stream environment

2.   Add to distributed cache local file

3.   Add simple source function and sink

4.   Execute job from Flink CLI (Windows/Linux)

In order to restore a job from a savepoint or a checkpoint, we need to run our 
job from the Flink CLI. And pipelines that use the distributed cache fail 
during execution.
Moreover, the behavior differs between Linux and Windows: on Windows we get 
"java.nio.file.InvalidPathException: Illegal char <:> at index 4", and on Linux 
Flink freezes (it just gets stuck and does not do anything; no error message 
or stack trace).

My code for the Windows environment:


public class CachePipeline {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment see =
            StreamExecutionEnvironment.getExecutionEnvironment();
        see.registerCachedFile("file:///D:/test.csv", "MyFile");

        see.addSource(new SourceFunction<Integer>() {

            @Override
            public void run(SourceContext<Integer> ctx) throws Exception {
                while (true) {
                    synchronized (ctx.getCheckpointLock()) {
                        ctx.collect(5);
                    }
                    Thread.sleep(1000);
                }
            }

            @Override
            public void cancel() {}

        }).print();

        see.execute();
    }
}

The command I used to run the job:

flink run -c test.CachePipeline D:\path\to\jar\cache-test.jar


In the case of Linux, I changed the file location to:

see.registerCachedFile("file:///home/test.csv", "MyFile");

Windows stacktrace:

flink run -c com.CachePipeline D:\repository\cache-test.jar

log4j:WARN No appenders could be found for logger 
(org.apache.flink.client.cli.CliFrontend).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Starting execution of program


The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Could not retrieve 
the execution result. (JobID: 38631d859b64cd86201bbe09a32c62f3)
at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:261)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483)
at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1509)
at com.granduke.teleprocessing.CachePipeline.main(CachePipeline.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
at 
org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:388)
at java.util.concurrent.CompletableFuture.uniExceptionally(Unknown 
Source)
at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(Unknown Source)
at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExec

Re: Restore state class not found exception in 1.8

2019-05-28 Thread Lasse Nedergaard
Hi Gordon

Thanks for the reply. No, we haven’t moved it between namespaces. The only thing 
we have done is add a new attribute to the object in another branch of our 
code, and we may have used it by mistake, but that should still not give a 
class-not-found exception. 
We have the savepoint data in S3, so is there a way to use this savepoint 
together with a test case so we can debug it locally? Or to start a Flink mini 
cluster from this savepoint?

Med venlig hilsen / Best regards
Lasse Nedergaard


> On 28 May 2019, at 10:06, Tzu-Li (Gordon) Tai wrote:
> 
> Hi Lasse,
> 
> Did you move the class to a different namespace / package, or change it to
> be a nested class, across the Flink versions?
> That would be the only cause I could reason about at the moment.
> 
> If possible, could you also have a very minimal snippet / instructions on how 
> I can maybe reproduce this?
> That might give me more insight.
> 
> Cheers,
> Gordon
> 
>> On Mon, May 27, 2019 at 7:52 PM Lasse Nedergaard  
>> wrote:
>> Hi.
>> 
>> When we restart some of our jobs from a savepoint we see the exception 
>> below. It only happens for some of our jobs, and we didn't see it in 1.7.2. 
>> The class Flink can't find differs from job to job, and we are sure it's 
>> included in our fat jar.
>> As a side note, we are on our way to using Avro instead of POJOs, but are not 
>> there yet.
>> If anyone has a clue what the root cause could be, and how to resolve it, it 
>> would be appreciated.
>> Thanks in advance
>> 
>> Lasse Nedergaard
>> 
>> java.lang.Exception: Exception while creating StreamOperatorStateContext.
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:195)
>>  at 
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:250)
>>  at 
>> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
>>  at 
>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
>>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>>  at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.flink.util.FlinkException: Could not restore operator 
>> state backend for StreamSink_609b5f7fc746f29234b038c121356a9b_(2/2) from any 
>> of the 1 provided restore options.
>>  at 
>> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.operatorStateBackend(StreamTaskStateInitializerImpl.java:255)
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:143)
>>  ... 5 more
>> Caused by: org.apache.flink.runtime.state.BackendBuildingException: Failed 
>> when trying to restore operator state backend
>>  at 
>> org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder.build(DefaultOperatorStateBackendBuilder.java:86)
>>  at 
>> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createOperatorStateBackend(RocksDBStateBackend.java:537)
>>  at 
>> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$operatorStateBackend$0(StreamTaskStateInitializerImpl.java:246)
>>  at 
>> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
>>  at 
>> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
>>  ... 7 more
>> Caused by: java.lang.RuntimeException: Cannot instantiate class.
>>  at 
>> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:384)
>>  at 
>> org.apache.flink.runtime.state.OperatorStateRestoreOperation.deserializeOperatorStateValues(OperatorStateRestoreOperation.java:191)
>>  at 
>> org.apache.flink.runtime.state.OperatorStateRestoreOperation.restore(OperatorStateRestoreOperation.java:165)
>>  at 
>> org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder.build(DefaultOperatorStateBackendBuilder.java:83)
>>  ... 11 more
>> Caused by: java.lang.ClassNotFoundException: 
>> org/trackunit/tm2/formats/ReportMessage
>>  at java.lang.Class.forName0(Native Method)
>>  at java.lang.Class.forName(Class.java:348)
>>  at 
>> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:382)
>>  ... 14 more
>> 
>>  


[ANNOUNCE] Munich meetup: "Let's talk about "Stream Processing with Apache Flink"

2019-05-28 Thread Fabian Hueske
Hi folks,

Next Tuesday (June 4th), Vasia and I will be speaking at a meetup in
Munich about Flink and how we wrote our book "Stream Processing with Apache
Flink".

We will also raffle a few copies of the book.

Please RSVP if you'd like to attend:
-> https://www.meetup.com/inovex-munich/events/261643276/

Cheers, Fabian


Re: Distributed cache fault

2019-05-28 Thread Till Rohrmann
Hi Vasyl,

please post this kind of question to Flink's user ML, since the dev ML is
used for development discussions.

For the failure on Windows, could you share the complete stack trace to see
where exactly it fails? It looks as if on Windows the scheme part of the
URI causes problems.

Looking at the example program, you have effectively implemented a busy
loop which will eat up all your CPU cycles. Moreover you create a new
object in every iteration which puts a high load on the JVM's GC. This
might explain why Flink seems to freeze. Could you check the CPU load on
the machines? Alternatively, you could insert a `Thread.sleep(1)` after
every `collect()` call. As a side note, you should always emit objects
under the checkpointing lock. Otherwise Flink cannot give you proper
processing guarantees:

synchronized (ctx.getCheckpointLock()) {
ctx.collect(...);
}
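Putting both suggestions together, a hedged sketch of a corrected source (it assumes the `see` environment from the program below; the element type and sleep interval are chosen for illustration):

see.addSource(new SourceFunction<Long>() {
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Long> ctx) throws Exception {
        long i = 0;
        while (running) {
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(i++);  // emit under the checkpointing lock
            }
            Thread.sleep(1);       // avoid the busy loop
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}).print();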

Cheers,
Till

On Mon, May 27, 2019 at 5:54 PM Vasyl Bervetskyi 
wrote:

> Hi there,
>
> I ran into an issue when adding a file to the distributed cache in Flink.
> My setup:
>
> -  Java 1.8
>
> -  Flink 1.8
>
> -  OS: Windows, Linux
> Test scenario:
>
> 1.   Create simple stream environment
>
> 2.   Add to distributed cache local file
>
> 3.   Add simple source function and sink
>
> 4.   Execute job from CLI (Windows/Linux)
>
> In order to restore a job from a savepoint or a checkpoint, we need to run
> our job from the Flink CLI. And pipelines that use the distributed cache
> fail during execution.
> Moreover, the behavior differs between Linux and Windows: on Windows we get
> "java.nio.file.InvalidPathException: Illegal char <:> at index 4" and on
> Linux Flink freezes (it just stops and does not do anything; no error
> message or stacktrace).
>
> My code for the Windows environment:
>
>
> public class CachePipeline {
>
> public static void main(String[] args) throws Exception {
> StreamExecutionEnvironment see =
> StreamExecutionEnvironment.getExecutionEnvironment();
> see.registerCachedFile("file:///D:/test.csv", "CacheFile");
>
> see.addSource(new SourceFunction<Object>() {
> @Override
> public void run(SourceContext<Object> ctx) throws Exception {
> ctx.collect(new Object());
> }
>
> @Override
> public void cancel() {
>
> }
> }).print();
>
> see.execute();
> }
> }
>
> The command I used to run the job:
>
> flink run -c test.CachePipeline D:\path\to\jar\cache-test.jar
>
> Did anybody face this?
>


Re: [SURVEY] Usage of flink-ml and [DISCUSS] Delete flink-ml

2019-05-28 Thread Till Rohrmann
+1 for removing it. I think it is effectively dead by now.

Cheers,
Till

On Mon, May 27, 2019 at 4:00 PM Hequn Cheng  wrote:

> Hi Shaoxuan,
>
> Thanks a lot for driving this. +1 to remove the module.
>
> The git log of this module shows that it has been inactive for a long
> time. I think it's ok to remove it for now. It would also be good to switch
> to the new interface earlier.
>
> Best, Hequn
>
> On Mon, May 27, 2019 at 8:58 PM Becket Qin  wrote:
>
>> +1 for removal. Personally I'd prefer marking it as deprecated and removing
>> the module in the next release, just to follow the established procedure.
>>
>> And +1 on removing the `flink-libraries/flink-ml-uber` as well.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Mon, May 27, 2019 at 5:07 PM jincheng sun 
>> wrote:
>>
>>> +1 for removing it!
>>>
>>> And we also plan to delete the `flink-libraries/flink-ml-uber`, right?
>>>
>>> Best,
>>> Jincheng
>>>
>>> Rong Rong wrote on Friday, May 24, 2019 at 1:18 AM:
>>>
 +1 for the deletion.

 Also, I think it might be a good idea to update the roadmap with the
 plan for removal/development, since we've reached consensus on FLIP-39.

 Thanks,
 Rong


 On Wed, May 22, 2019 at 8:26 AM Shaoxuan Wang 
 wrote:

> Hi Chesnay,
> Yes, you are right. There are no active commits planned for the
> legacy flink-ml package. It does not matter whether we delete it now or
> later. I will open a PR and remove it.
>
> Shaoxuan
>
> On Wed, May 22, 2019 at 7:05 PM Chesnay Schepler 
> wrote:
>
>> I believe we can remove it regardless since users could just use the
>> 1.8
>> version against future releases.
>>
>> Generally speaking, any library/connector that is no longer actively
>> developed can be removed from the project as existing users can
>> always
>> rely on previous versions, which should continue to work by virtue of
>> working against @Stable APIs.
>>
>> On 22/05/2019 12:08, Shaoxuan Wang wrote:
>> > Hi Flink community,
>> >
>> > We plan to delete/deprecate the legacy flink-libraries/flink-ml package
>> > in Flink 1.9, and replace it with the new flink-ml interface proposed in
>> > FLIP-39 (FLINK-12470).
>> > Before we remove this package, I want to reach out to you and ask
>> > whether any active project still uses this package. Please respond to
>> > this thread and outline how you use flink-libraries/flink-ml.
>> > Depending on the replies about the activity and adoption of
>> > flink-libraries/flink-ml, we will decide to either delete this package
>> > in Flink 1.9 or deprecate it for now & remove it in the next release
>> > after 1.9.
>> >
>> > Thanks for your attention and help!
>> >
>> > Regards,
>> > Shaoxuan
>> >
>>
>>


Flink 1.8: Job manager redirection not happening in High Availability mode

2019-05-28 Thread Kumar Bolar, Harshith
Hi all,

Prior to upgrading to 1.8, there was one active job manager, and when I tried to 
access the inactive job manager's web UI, the page was redirected to the 
active job manager. But now no redirection is happening from the inactive JM 
to the active JM. Did something change in the redirection handling in 1.8?

My job managers are indexed flink0-0 and flink4-0.

Log from flink0-0:
2019-05-28 04:03:10,082 INFO  
org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
  - State change: CONNECTED
2019-05-28 04:03:10,189 INFO  
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint- Web frontend 
listening at http://flink0-0.high.ue1.pre.aws.cloud.abc.com:8080.
Masters file in flink0-0:
flink0-0:8081
flink4-0:8081

Log from flink4-0:
2019-05-28 04:03:10,093 INFO  
org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
  - State change: CONNECTED
2019-05-28 04:03:10,293 INFO  
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint- Web frontend 
listening at http://flink4-0.high.ue1.pre.aws.cloud.abc.com:8080.
Masters file in flink4-0:
flink0-0:8081
flink4-0:8081

Thanks,
Harshith


Re: Re: How can I add config file as classpath in taskmgr node when submitting a flink job?

2019-05-28 Thread wangl...@geekplus.com.cn

Thanks. Let me have a try


wangl...@geekplus.com.cn
 
From: Yang Wang
Date: 2019-05-28 09:47
To: wangl...@geekplus.com.cn
CC: user
Subject: Re: How can I add config file as classpath in taskmgr node when 
submitting a flink job?
Hi, wanglei

You could use the Flink distributed cache to register some config files and 
then access them in your task.

1. Register a cached file:

   env.registerCachedFile(inputFile.toString(), "test_data", false);

2. Access the file in your task:

   final Path testFile = 
   getRuntimeContext().getDistributedCache().getFile("test_data").toPath();
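Put together, a hedged end-to-end sketch (the file path, properties file, and the `input` stream/map function are placeholders):

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Properties;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;

// 1. Register the file on the environment before building the pipeline:
env.registerCachedFile("hdfs:///path/to/app.properties", "app_config", false);

// 2. Read it once per task in open(), then use it in map():
DataStream<String> result = input.map(new RichMapFunction<String, String>() {
    private Properties config;

    @Override
    public void open(Configuration parameters) throws Exception {
        File file = getRuntimeContext().getDistributedCache().getFile("app_config");
        config = new Properties();
        try (InputStream in = new FileInputStream(file)) {
            config.load(in);
        }
    }

    @Override
    public String map(String value) {
        return config.getProperty("prefix", "") + value;
    }
});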

wangl...@geekplus.com.cn wrote on Sunday, May 26, 2019 at 12:06 AM:

When starting a single-node Java application, I can add config files to its classpath.

How can I implement this when submitting a Flink job? The config file needs to be 
read on the TaskManager node and is used to initialize some classes.





wangl...@geekplus.com.cn


Re: Restore state class not found exception in 1.8

2019-05-28 Thread Tzu-Li (Gordon) Tai
Hi Lasse,

Did you move the class to a different namespace / package, or change it to
be a nested class, across the Flink versions?
That would be the only cause I could reason about at the moment.

If possible, could you also have a very minimal snippet / instructions on
how I can maybe reproduce this?
That might give me more insight.

Cheers,
Gordon

On Mon, May 27, 2019 at 7:52 PM Lasse Nedergaard 
wrote:

> Hi.
>
> When we restart some of our jobs from a savepoint we see the exception
> below. It only happens for some of our jobs, and we didn't see it in 1.7.2.
> The class Flink can't find differs from job to job, and we are sure it's
> included in our fat jar.
> As a side note, we are on our way to using Avro instead of POJOs, but are not
> there yet.
> If anyone has a clue what the root cause could be, and how to resolve it, it
> would be appreciated.
> Thanks in advance
>
> Lasse Nedergaard
>
> java.lang.Exception: Exception while creating StreamOperatorStateContext.
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:195)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:250)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkException: Could not restore operator 
> state backend for StreamSink_609b5f7fc746f29234b038c121356a9b_(2/2) from any 
> of the 1 provided restore options.
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.operatorStateBackend(StreamTaskStateInitializerImpl.java:255)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:143)
>   ... 5 more
> Caused by: org.apache.flink.runtime.state.BackendBuildingException: Failed 
> when trying to restore operator state backend
>   at 
> org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder.build(DefaultOperatorStateBackendBuilder.java:86)
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createOperatorStateBackend(RocksDBStateBackend.java:537)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$operatorStateBackend$0(StreamTaskStateInitializerImpl.java:246)
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
>   ... 7 more
> Caused by: java.lang.RuntimeException: Cannot instantiate class.
>   at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:384)
>   at 
> org.apache.flink.runtime.state.OperatorStateRestoreOperation.deserializeOperatorStateValues(OperatorStateRestoreOperation.java:191)
>   at 
> org.apache.flink.runtime.state.OperatorStateRestoreOperation.restore(OperatorStateRestoreOperation.java:165)
>   at 
> org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder.build(DefaultOperatorStateBackendBuilder.java:83)
>   ... 11 more
> Caused by: java.lang.ClassNotFoundException: 
> org/trackunit/tm2/formats/ReportMessage
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:382)
>   ... 14 more
>
>
>
>