Re: Starting a job that does not use checkpointing from a savepoint is broken?

2018-01-18 Thread jelmer
Hey Eron, Thanks, you stated the issue better and more compactly than I could. I will not debate the wisdom of not using checkpoints, but when migrating jobs you may not be aware whether a job has checkpointing enabled if you are not the author,

Starting a job that does not use checkpointing from a savepoint is broken?

2018-01-18 Thread jelmer
I ran into a rather annoying issue today while upgrading a Flink job from Flink 1.3.2 to 1.4.0. This particular job uses neither checkpointing nor state. I followed the instructions at https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/upgrading.html First created a savepoint,

Re: Failing to recover once checkpoint fails

2018-01-18 Thread Vishal Santoshi
Or this one https://issues.apache.org/jira/browse/FLINK-4815 On Thu, Jan 18, 2018 at 4:13 PM, Vishal Santoshi wrote: > ping. > > This happened again on production and it seems reasonable to abort > when a checkpoint is not found rather than behave as if it is a

Re: Failing to recover once checkpoint fails

2018-01-18 Thread Vishal Santoshi
ping. This happened again on production and it seems reasonable to abort when a checkpoint is not found rather than behave as if it is a brand new pipeline. On Tue, Jan 16, 2018 at 9:33 AM, Vishal Santoshi wrote: > Folks sorry for being late on this. Can some

NoClassDefFoundError of a Avro class after cancel then resubmit the same job

2018-01-18 Thread xiatao123
Not sure why, but when I submit the job for the first time after a cluster launch, it works fine. After I cancel the first job and then resubmit the same job, it hits the NoClassDefFoundError. Very weird; it feels like some cleanup of a cancelled job messed up future jobs of the same

Re: flowable <-> flink integration

2018-01-18 Thread Martin Grofčík
Hi Maciek, Thanks a lot for your answer. The first step was to get to the point where I am able to execute a Flink job through the Flink REST API. While the Flink job runs, the Flowable process instance checks the status of the job through the Flink REST API. The process instance continues further when the process ins
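
For reference, a minimal Java sketch of the status poll described above, assuming the JobManager's monitoring REST API and its /jobs/<jobid> endpoint; the host, port, and job id are placeholders, and a real integration would parse the returned JSON rather than returning it raw:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class JobStatusPoller {

        // Fetch the job detail JSON from the JobManager's REST API.
        public static String fetchJobDetails(String jobId) throws IOException {
            URL url = new URL("http://localhost:8081/jobs/" + jobId);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    body.append(line);
                }
            }
            // The response contains a "state" field, e.g. RUNNING or FINISHED.
            return body.toString();
        }
    }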

Re: flowable <-> flink integration

2018-01-18 Thread Maciek Próchniak
Hi Martin, I did some Activiti development, so your mail caught my attention :) I don't think I understand what you are trying to achieve - where the process you're simulating is, where the simulation is running, and where Flink fits in. Do you want to invoke Flink (batch job I suppose?) from

Re: Submitting jobs via Java code

2018-01-18 Thread Luigi Sgaglione
Solved. This is the correct code to deploy a job programmatically via the REST API. Thanks. URL serverUrl = new URL("http://192.168.149.130:8081/jars/upload"); HttpURLConnection urlConnection = (HttpURLConnection) serverUrl.openConnection(); String boundaryString = "--Boundary"; String crlf
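
For readers hitting the same problem, a self-contained sketch of that multipart upload; the host/port, jar path, and the "jarfile" form-field name are assumptions to verify against your Flink version:

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class JarUploader {
        public static void main(String[] args) throws IOException {
            URL serverUrl = new URL("http://192.168.149.130:8081/jars/upload");
            String boundary = "Boundary-" + System.currentTimeMillis();
            String crlf = "\r\n";

            HttpURLConnection conn = (HttpURLConnection) serverUrl.openConnection();
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type",
                    "multipart/form-data; boundary=" + boundary);

            byte[] jarBytes = Files.readAllBytes(Paths.get("/path/to/job.jar"));
            try (DataOutputStream out = new DataOutputStream(conn.getOutputStream())) {
                // One multipart part carrying the jar file.
                out.writeBytes("--" + boundary + crlf);
                out.writeBytes("Content-Disposition: form-data; name=\"jarfile\"; "
                        + "filename=\"job.jar\"" + crlf);
                out.writeBytes("Content-Type: application/x-java-archive" + crlf + crlf);
                out.write(jarBytes);
                out.writeBytes(crlf + "--" + boundary + "--" + crlf);
            }
            System.out.println("Response code: " + conn.getResponseCode());
        }
    }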

Re: Submitting jobs via Java code

2018-01-18 Thread Luigi Sgaglione
Hi Timo, I think that the REST API is the most suitable solution. Thanks. So, I'm trying to use the Flink REST API and I'm able to perform GET requests but not POST ones. In particular, when I issue a POST to upload the jar I receive this error from the server: {"error": "Failed to upload the

Re: Scheduling of GroupByKey and CombinePerKey operations

2018-01-18 Thread Fabian Hueske
Hi Pawel, This question might be better suited for the Beam user list. Beam includes the Beam Flink runner which translates Beam programs into Flink programs. Best, Fabian 2018-01-18 16:02 GMT+01:00 Pawel Bartoszek : > Can I ask why some operations run only one

Re: Far too few watermarks getting generated with Kafka source

2018-01-18 Thread William Saar
Hi, The watermark does not seem to get updated at all after the first one is emitted. We used to get out-of-order warnings, but we changed the job to use a bounded timestamp extractor so we no longer get those warnings. Our timestamp extractor looks like this: class TsExtractor[T
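
A Java sketch of a bounded extractor of that shape; the nested event type and the 10-second bound are placeholders:

    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class TsExtractor extends BoundedOutOfOrdernessTimestampExtractor<TsExtractor.Event> {

        // Minimal stand-in for the job's real event type.
        public static class Event {
            public long timestampMillis;
        }

        public TsExtractor() {
            super(Time.seconds(10)); // tolerate up to 10 seconds of out-of-orderness
        }

        @Override
        public long extractTimestamp(Event element) {
            return element.timestampMillis; // epoch millis carried by the event
        }
    }

Note that a periodic extractor like this only advances the watermark when the maximum observed timestamp advances, and with a Kafka source an idle or slow partition can hold the watermark back.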

Scheduling of GroupByKey and CombinePerKey operations

2018-01-18 Thread Pawel Bartoszek
Can I ask why some operations run in only one slot? I understand that file writes should happen on only one slot, but the GroupByKey operation could be distributed across all slots. I have around 20k distinct keys every minute. Is there any way to break this operator chain? I noticed that

RE: Multiple Elasticsearch sinks not working in Flink

2018-01-18 Thread Teena Kappen // BPRISE
Hi Timo, It works fine when the second sink is a Cassandra Sink. The data gets read from KafkaTopic2 and it gets written to Cassandra as expected. Regards, Teena From: Timo Walther [mailto:twal...@apache.org] Sent: 18 January 2018 18:41 To: user@flink.apache.org Subject: Re: Multiple

Re: Flink CEP exception during RocksDB update

2018-01-18 Thread Kostas Kloudas
Thanks a lot Varun! Kostas > On Jan 17, 2018, at 9:59 PM, Varun Dhore wrote: > > Thank you Kostas. Since this error is not easily reproducible on my end I’ll > continue testing this and confirm the resolution once I am able to do so. > > Thanks, > Varun > > Sent

Re: Multiple Elasticsearch sinks not working in Flink

2018-01-18 Thread Timo Walther
Hi Teena, what happens if you replace the second sink with a non-ElasticsearchSink? Is there the same result? Is the data read from KafkaTopic2? We should determine which system is the bottleneck. Regards, Timo On 1/18/18 at 9:53 AM, Teena Kappen // BPRISE wrote: Hi, I am running

Re: Which collection to use in Scala case class

2018-01-18 Thread Timo Walther
I filed a more specific issue for this: https://issues.apache.org/jira/browse/FLINK-8451 On 1/18/18 at 10:47 AM, shashank agarwal wrote: @Chesnay, @Timo, yes it's a simple case class which I am using with java.util.List, and one case class with Option and Seq, with CEP. I have filed Jira

Re: CaseClassTypeInfo fails to deserialize in flink 1.4 with Parent First Class Loading

2018-01-18 Thread Timo Walther
I filed an issue for this: https://issues.apache.org/jira/browse/FLINK-8451 On 1/12/18 at 4:40 PM, Seth Wiesman wrote: Here is the stack trace: org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function. at

Re: Submitting jobs via Java code

2018-01-18 Thread Timo Walther
Hi Luigi, I'm also working on a solution for submitting jobs programmatically. You can look into my working branch [1]. As far as I know, the best and most stable solution is using the ClusterClient. But this is internal API and might change. You could also use Flink's REST API for

Re: BucketingSink broken in flink 1.4.0 ?

2018-01-18 Thread Stephan Ewen
Re-posting the solution here from other threads: You can fix this by either - Removing all Hadoop dependencies from your user jar - Setting the framework back to parent-first classloading: https://ci.apache.org/projects/flink/flink-docs-master/monitoring/
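
If memory serves, the switch is a single flink-conf.yaml entry; verify the key against the linked classloading docs for your version:

    # flink-conf.yaml
    classloader.resolve-order: parent-first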

Re: Submitting jobs via Java code

2018-01-18 Thread Luigi Sgaglione
Hi Timo, my objective is to create a web interface that allows me to edit and deploy jobs on Flink. To do so I'm evaluating all the possibilities provided by Flink APIs. What do you think is the best solution? Thanks 2018-01-18 9:39 GMT+01:00 Timo Walther : > Hi Luigi, >

Re: Far too few watermarks getting generated with Kafka source

2018-01-18 Thread Gary Yao
Hi William, How often does the Watermark get updated? Can you share your code that generates the watermarks? Watermarks should be strictly ascending. If your code produces watermarks that are not ascending, smaller ones will be discarded. Could it be that the events in Kafka are more "out of
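
To illustrate the monotonicity point, a sketch of a periodic assigner that tracks the maximum timestamp it has seen; treating the elements as epoch-millis longs and the 5-second bound are simplifying assumptions:

    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    public class AscendingWatermarkAssigner implements AssignerWithPeriodicWatermarks<Long> {

        private long currentMaxTimestamp = Long.MIN_VALUE;

        @Override
        public long extractTimestamp(Long element, long previousElementTimestamp) {
            // Track the maximum timestamp so the watermark never moves backwards.
            currentMaxTimestamp = Math.max(currentMaxTimestamp, element);
            return element;
        }

        @Override
        public Watermark getCurrentWatermark() {
            if (currentMaxTimestamp == Long.MIN_VALUE) {
                return new Watermark(Long.MIN_VALUE); // nothing seen yet
            }
            return new Watermark(currentMaxTimestamp - 5000); // 5s out-of-orderness bound
        }
    }

How often getCurrentWatermark() fires is controlled by env.getConfig().setAutoWatermarkInterval(...), which is worth checking when watermarks seem to stop updating.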

Re: Which collection to use in Scala case class

2018-01-18 Thread shashank agarwal
@Chesnay, @Timo, yes it's a simple case class which I am using with java.util.List, and one case class with Option and Seq, with CEP. I have also filed Jira bugs for that and put the logs there. https://issues.apache.org/jira/browse/FLINK-7760 I have the issue with RocksDB checkpointing also

Multiple Elasticsearch sinks not working in Flink

2018-01-18 Thread Teena Kappen // BPRISE
Hi, I am running Flink 1.4 on a single node. My job has two Kafka consumers reading from separate topics. After fetching the data, the job writes it to two separate Elasticsearch sinks. So the process is like this: KafkaTopic1 -> KafkaConsumer1 -> create output record -> ElasticsearchSink1
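
A skeletal Java version of that topology; the Kafka 0.11 connector, the broker address, and print() standing in for the two Elasticsearch sinks are all assumptions:

    import java.util.Properties;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class TwoPipelinesJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("group.id", "two-sinks-job");

            // Two independent source -> sink chains inside one job.
            DataStream<String> stream1 = env.addSource(
                    new FlinkKafkaConsumer011<>("KafkaTopic1", new SimpleStringSchema(), props));
            DataStream<String> stream2 = env.addSource(
                    new FlinkKafkaConsumer011<>("KafkaTopic2", new SimpleStringSchema(), props));

            // The real job builds an output record per stream and writes each to
            // its own Elasticsearch sink; print() stands in here.
            stream1.print();
            stream2.print();

            env.execute("two kafka topics, two sinks");
        }
    }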