Re: Flink will delete all jars uploaded when restart jobmanager

2017-06-13 Thread Chesnay Schepler
There's currently no way to prevent this. On 14.06.2017 07:03, XiangWei Huang wrote: Hi, when the Flink JobManager is restarted, jars uploaded by users through the web UI are deleted. Is there any way to avoid this?

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Chesnay Schepler
Did you activate the "include-kinesis" Maven profile when building? On 13.06.2017 22:49, Foster, Craig wrote: Oh, sorry. I'm not using distributed libraries but trying to build from source. So, using Maven 3.2.2 and building the connector doesn't give me a jar for some reason. *From: *Chesn
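For reference, a build invocation along these lines should produce the connector jar from a 1.3.0 source checkout (the module path and flags are assumptions based on the 1.3 source layout; only the "include-kinesis" profile name comes from the thread):

```shell
# From the root of the Flink 1.3.0 source checkout:
mvn clean install -DskipTests -Pinclude-kinesis \
    -pl flink-connectors/flink-connector-kinesis -am
```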

Re: Flink - Iteration and Backpressure

2017-06-13 Thread MAHESH KUMAR
Hi Robert/Team, Is there any recommended solution or other insight on how I should be doing this? Thanks and Regards, Mahesh On Thu, Jun 1, 2017 at 10:32 AM, MAHESH KUMAR wrote: > Hi Robert, > > The Message Auditor System must monitor all 4 Kafka queues and gather > information about messag

Flink will delete all jars uploaded when restart jobmanager

2017-06-13 Thread XiangWei Huang
Hi, when the Flink JobManager is restarted, jars uploaded by users through the web UI are deleted. Is there any way to avoid this?

Can't kill a job which contains a while loop in the main method before it is submitted

2017-06-13 Thread XiangWei Huang
Hi, I ran into a problem when using Jedis in Flink. When Jedis tries to get a connection to Redis and the Redis server is not available, Jedis keeps retrying and never ends. The problem is that the job's status is never set to RUNNING by Flink, which means it can't be killed by Flink. The only way to br

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Foster, Craig
Oh, sorry. I’m not using distributed libraries but trying to build from source. So, using Maven 3.2.2 and building the connector doesn’t give me a jar for some reason. From: Chesnay Schepler Date: Tuesday, June 13, 2017 at 1:44 PM To: "Foster, Craig" , "user@flink.apache.org" , Robert Metzger

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Chesnay Schepler
Here's the relevant JIRA: https://issues.apache.org/jira/browse/FLINK-6812 Apologies if I was unclear, I meant that you could use the 1.3-SNAPSHOT version of the Kinesis connector, as it is compatible with 1.3.0. Alternatively you can take the 1.3.0 sources and build the connector manually. A

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Foster, Craig
So, in addition to the question below, can we clarify whether there is a patch/fix/JIRA available, since I have to use 1.3.0? From: "Foster, Craig" Date: Tuesday, June 13, 2017 at 9:27 AM To: Chesnay Schepler , "user@flink.apache.org" Subject: Re: Flink Kinesis connector in 1.3.0 Thanks! D

Re: RichMapFunction setup method

2017-06-13 Thread Chesnay Schepler
It *is* a remnant of the past, since that method signature originates from the Record API, the predecessor of the current DataSet API. Even in the DataSet API you can just pass arguments through the constructor. Feel free to open a JIRA, just make sure it is a subtask of FLINK-3957. On 13.06.201

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Dail, Christopher
For reference, the two issues I filed on the metrics: https://issues.apache.org/jira/browse/FLINK-6910 - Metrics value for lastCheckpointExternalPath is not valid https://issues.apache.org/jira/browse/FLINK-6911 - StatsD Metrics name should escape spaces Thanks Chris From: "Dail, Christopher"

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Dail, Christopher
I think I found the root cause of this problem. It has to do with how DC/OS metrics handling works. DC/OS passes special environment variables to any task started by mesos. These include STATSD_UDP_HOST and STATSD_UDP_PORT. It sets up a StatsD relay that adds extra data into statsd events that

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Foster, Craig
Thanks! Does this also explain why commons HttpClient is not included in flink-dist-*? From: Chesnay Schepler Date: Tuesday, June 13, 2017 at 8:53 AM To: "user@flink.apache.org" Subject: Re: Flink Kinesis connector in 1.3.0 Something went wrong during the release process which prevented the 1.

SANSA 0.2 (Semantic Technologies on top of Flink) Released

2017-06-13 Thread Jens Lehmann
Dear all, The Smart Data Analytics group [1] is happy to announce SANSA 0.2 - the second release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing for semantic technologies in order to allow scalable machine learning, inference and querying capabilities for large knowl

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Chesnay Schepler
Something went wrong during the release process which prevented the 1.3.0 Kinesis artifact from being released. This will be fixed for 1.3.1; in the meantime you can use 1.3.0-SNAPSHOT instead. On 13.06.2017 17:48, Foster, Craig wrote: Hi: I'm trying to build an application that uses the
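For anyone following along, depending on the snapshot artifact would look roughly like this in the pom (the Scala-version suffix and the snapshot repository URL are assumptions; adjust to your setup):

```xml
<!-- snapshot artifacts are not published to Maven Central -->
<repositories>
  <repository>
    <id>apache.snapshots</id>
    <url>https://repository.apache.org/content/repositories/snapshots/</url>
  </repository>
</repositories>

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kinesis_2.10</artifactId>
  <version>1.3-SNAPSHOT</version>
</dependency>
```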

Flink Kinesis connector in 1.3.0

2017-06-13 Thread Foster, Craig
Hi: I’m trying to build an application that uses the Flink Kinesis Connector in 1.3.0. However, I don’t see that resolving anymore. It resolved with 1.2.x but doesn’t with 1.3.0. Is there something I need to now do differently than described here? https://ci.apache.org/projects/flink/flink-docs-

Re: Can't get keyed messages from Kafka

2017-06-13 Thread Chesnay Schepler
in getProducedType(), replace the implementation with: return new TupleTypeInfo(BasicTypeInfo.STRING_TYPE_INFO, TypeExtractor.getForClass(CustomObject.class)); On 13.06.2017 17:18, AndreaKinn wrote: Can I ask you to help me? I'm trying to implement a CustomDeserializer. My Kafka messages are com
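Putting the suggestions from this thread together, a minimal sketch of such a deserializer might look as follows (CustomObject and its parse() method are placeholders standing in for the poster's own class; the key is assumed to be a UTF-8 string):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
import org.apache.flink.api.java.typeutils.TypeExtractor;
import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema;

// Sketch: deserialize the Kafka key as a String and the message into a CustomObject.
public class CustomDeserializer
        implements KeyedDeserializationSchema<Tuple2<String, CustomObject>> {

    @Override
    public Tuple2<String, CustomObject> deserialize(
            byte[] messageKey, byte[] message,
            String topic, int partition, long offset) throws IOException {
        // messageKey may be null if the producer sent an un-keyed record
        String key = messageKey == null
                ? null : new String(messageKey, StandardCharsets.UTF_8);
        String value = new String(message, StandardCharsets.UTF_8);
        return new Tuple2<>(key, CustomObject.parse(value)); // parse() is hypothetical
    }

    @Override
    public boolean isEndOfStream(Tuple2<String, CustomObject> nextElement) {
        return false; // consume indefinitely
    }

    @Override
    public TypeInformation<Tuple2<String, CustomObject>> getProducedType() {
        return new TupleTypeInfo<>(BasicTypeInfo.STRING_TYPE_INFO,
                TypeExtractor.getForClass(CustomObject.class));
    }
}
```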

Re: Can't get keyed messages from Kafka

2017-06-13 Thread AndreaKinn
Can I ask you to help me? I'm trying to implement a CustomDeserializer. My Kafka messages are KeyedMessages where the key and message are strings. I created a new class named CustomObject to manage the message string because it's more complex than a simple string. public class CustomDeseria

Re: Flink 1.3 REST API wrapper for Scala

2017-06-13 Thread Michael Reid
Thanks! Yes, it's available now on Maven Central - https://search.maven.org/#search%7Cga%7C1%7Ca%3A%22flink-wrapper_2.11%22 The README on Github has directions on how to add the project via SBT. - Mike On Mon, Jun 12, 2017 at 9:45 PM Flavio Pompermaier wrote: > Nice lib! Is it available also on

Re: RichMapFunction setup method

2017-06-13 Thread Mikhail Pryakhin
Thanks a lot Chesnay. In case it works properly in the Batch API, don't you think that it should not be called a "remnant of the past"? Should I create an issue so we don't forget about it and maybe fix it in the future? I think I'm not the only one who deals with this method. Kind Regards, Mik

Re: RichMapFunction setup method

2017-06-13 Thread Chesnay Schepler
I'm not aware of any plans to replace it. For the Batch API it also works properly, so deprecating it would be misleading. On 13.06.2017 16:04, Mikhail Pryakhin wrote: Hi Chesnay, Thanks for the reply, The existing signature for open() is a remnant of the past. Should the method be deprec

Re: RichMapFunction setup method

2017-06-13 Thread Mikhail Pryakhin
Hi Chesnay, Thanks for the reply, > The existing signature for open() is a remnant of the past. Should the method be deprecated then so that it doesn’t confuse users? Kind Regards, Mike Pryakhin > On 13 Jun 2017, at 16:54, Chesnay Schepler wrote: > > The existing signature for open() is a re

Re: RichMapFunction setup method

2017-06-13 Thread Chesnay Schepler
The existing signature for open() is a remnant of the past. We currently recommend to pass all arguments through the constructor and store them in fields. You can of course also pass a Configuration containing all parameters. On 13.06.2017 15:46, Mikhail Pryakhin wrote: Hi all! A RichMapFunc
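The recommended pattern reads roughly as follows; a minimal sketch, with class and field names invented for illustration:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// Sketch: configuration is passed through the constructor and stored in fields,
// rather than read from the Configuration handed to open().
public class PrefixingMapper extends RichMapFunction<String, String> {

    private final String prefix;            // set at job-graph construction time
    private transient StringBuilder buffer; // non-serializable runtime state

    public PrefixingMapper(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public void open(Configuration parameters) {
        // 'parameters' is always empty in the streaming API;
        // initialize runtime-only state here instead
        buffer = new StringBuilder();
    }

    @Override
    public String map(String value) {
        buffer.setLength(0);
        return buffer.append(prefix).append(value).toString();
    }
}
```

Usage would then be e.g. `stream.map(new PrefixingMapper("user-"))`, with the argument serialized as part of the function object.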

RichMapFunction setup method

2017-06-13 Thread Mikhail Pryakhin
Hi all! A RichMapFunction [1] provides a very handy setup method RichFunction#open(org.apache.flink.configuration.Configuration) which consumes a Configuration instance as an argument, but this argument doesn't bear any configuration parameters because it is always passed to the method as a new

Process event with last 1 hour, 1week and 1 Month data

2017-06-13 Thread shashank agarwal
Hi, I have to process each event with the last 1 hour, 1 week and 1 month of data, e.g. how many times the same IP occurred in the last 1 month relative to that event. I think windows are for fixed time spans, so I can't calculate over the last 1 hour relative to the current event. If you have any clue please guide w
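One hedged sketch of the usual approach: a keyed sliding window approximates "occurrences in the last hour", re-evaluated at each slide interval rather than exactly per event (the `ipEvents` stream of `(ip, 1L)` tuples is an assumption):

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.time.Time;

// ipEvents: DataStream<Tuple2<String, Long>> of (ip, 1L), one element per event
DataStream<Tuple2<String, Long>> hourlyCounts = ipEvents
        .keyBy(0)                                    // key by IP
        .timeWindow(Time.hours(1), Time.minutes(1))  // 1h window, sliding every minute
        .sum(1);                                     // count occurrences per IP
```

For an exact "last hour relative to each event", the alternative is a keyed function holding timestamped counts in state and pruning entries older than the horizon on each element.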

Re: User self resource file.

2017-06-13 Thread yunfan123
Or how can I get the blob store of my jar file.

User self resource file.

2017-06-13 Thread yunfan123
For example, I use ./flink run flink_helloworld.jar, where flink_helloworld.jar contains a resource folder in the root dir. How can I get the resource file in Flink?
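Resources packaged inside the job jar can usually be read through the classloader rather than the filesystem. A minimal plain-Java helper (the resource path in main() is a hypothetical example):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ResourceUtil {

    /** Reads a classpath resource into a String, or returns null if it does not exist. */
    public static String readResource(Class<?> anchor, String path) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(path)) {
            if (in == null) {
                return null; // resource not on the classpath
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // hypothetical file packed at the root of the jar:
        String content = readResource(ResourceUtil.class, "/resource/config.txt");
        System.out.println(content != null ? "found" : "not packaged");
    }
}
```

Note that inside the jar the resource is an entry, not a file, so `getResourceAsStream` is the reliable way to read it regardless of how the jar is deployed.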

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Chesnay Schepler
Both your suggestions sound good, would be great to create JIRAs for them. Could you replace the task scope format with the one below and try again? metrics.scope.task: flink.tm This scope doesn't contain any special characters, except the periods. If you receive task metrics with this scop
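The debugging suggestion above corresponds to a flink-conf.yaml entry like the following (the reporter lines are only shown for context and are assumptions, not taken from this thread):

```yaml
# flink-conf.yaml -- simplified task scope suggested above
metrics.scope.task: flink.tm

# context: a typical StatsD reporter setup (assumed)
metrics.reporters: stsd
metrics.reporter.stsd.class: org.apache.flink.metrics.statsd.StatsDReporter
metrics.reporter.stsd.host: localhost
metrics.reporter.stsd.port: 8125
```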

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Dail, Christopher
Responses to your questions: 1. Did this work with the same setup before 1.3? I have not tested it with another version. I started working on the metrics stuff with a snapshot of 1.3 and moved to the release. 2. Are all task/operator metrics available in the metrics tab of the dashboard

Re: Can't get keyed messages from Kafka

2017-06-13 Thread Chesnay Schepler
You have to create your own implementation that deserializes the byte arrays into whatever type you want to use. On 13.06.2017 13:19, AndreaKinn wrote: But KeyedDeserializationSchema has just 2 implementations: TypeInformationKeyValueSerializationSchema JSONKeyValueDeserializationSchema The

Re: Can't get keyed messages from Kafka

2017-06-13 Thread AndreaKinn
But KeyedDeserializationSchema has just 2 implementations: TypeInformationKeyValueSerializationSchema and JSONKeyValueDeserializationSchema. The first gives me this error: 06/12/2017 02:09:12 Source: Custom Source(4/4) switched to FAILED java.io.EOFException at org.apache.flink.runtime.util.DataInput

Re: Can't get keyed messages from Kafka

2017-06-13 Thread Chesnay Schepler
Have you tried implementing a KeyedDeserializationSchema? This receives both the message and key as byte arrays, which you could then deserialize as strings and return them in a Tuple2. On 13.06.2017 12:36, AndreaKinn wrote: Hi, I already spent two days trying to get simple messages from Kafka

Can't get keyed messages from Kafka

2017-06-13 Thread AndreaKinn
Hi, I already spent two days trying to get simple messages from Kafka without success. I have a Kafka producer written in javascript: KeyedMessage = kafka.KeyedMessage; keyed_message = new KeyedMessage(key, string_to_sent); payload = [{topics: topic, messages: keyed_message }]; And I want to re

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-13 Thread Sebastian Neef
Hi, the code is part of a bigger project, so I'll try to outline the used methods and their order: # Step 1 - Reading a Wikipedia XML dump into a DataSet of <page>-tag delimited strings using XmlInputFormat. - A .distinct() operation removes all duplicates based on the content. - .map() is used to par

Re: Java parallel streams vs IterativeStream

2017-06-13 Thread nragon
That would work, but after the FlatMap<List<T>, T> I would have to downstream all elements into one.

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-13 Thread Kurt Young
Hi, Can you paste some code snippet to show how you use the DataSet API? Best, Kurt On Tue, Jun 13, 2017 at 4:29 PM, Sebastian Neef < gehax...@mailbox.tu-berlin.de> wrote: > Hi Kurt, > > thanks for the input. > > What do you mean with "try to disable your combiner"? Any tips on how I > can do t

Re: Flink Mesos task management

2017-06-13 Thread Till Rohrmann
Hi Jared, the way to solve this problem at the moment is to take a savepoint of your job, restart the cluster with the updated Docker image and then resubmit the job starting from the most recent savepoint. The same applies when you want to change the parallelism of your job. We are currently wor
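The workflow described above maps to the CLI roughly as follows (the job ID, savepoint path and jar name are placeholders):

```shell
# 1. Take a savepoint of the running job
bin/flink savepoint <jobID>

# 2. Restart the cluster with the updated Docker image (deployment-specific)

# 3. Resubmit the job, restoring from the savepoint
bin/flink run -s <savepointPath> your-job.jar
```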

Re: ReduceFunction mechanism

2017-06-13 Thread nragon
So, if my reduce function applies some transformation, I must migrate that transformation to a map before the reduce to ensure it transforms even if there is only one element? Can I chain them together so it will be "almost" as if they were in the same function (ensuring same-thread processing)?

Re: Flink with Mesos: Fetcher error

2017-06-13 Thread Till Rohrmann
Hi Ani, the problem is that you have to set a reachable jobmanager hostname in the flink-conf.yaml via jobmanager.rpc.address: [reachable hostname]. I assume that you use the default value which is localhost. You can see it in the fetcher info where the URL for the different files points to localh

Re: Error running Flink job in Yarn-cluster mode

2017-06-13 Thread Aljoscha Krettek
You’re welcome. 😃 > On 13. Jun 2017, at 10:24, Biplob Biswas wrote: > > Thanks a lot Aljoscha (again) > > I created my project from scratch and used the flink-maven-archetype and now > it works on the yarn-cluster mode. I was creating a fat jar initially as > well with my old project setup so n

Re: Java parallel streams vs IterativeStream

2017-06-13 Thread Aljoscha Krettek
I think for that you would unpack the List of values, for example with a FlatMap<List<T>, T>. This would emit each element of the list as a separate element. Then, downstream operations can operate on each element individually and you will exploit parallelism in the cluster. Best, Aljoscha > On 13. Jun

Re: Listening to timed-out patterns in Flink CEP

2017-06-13 Thread Till Rohrmann
Great to hear that things are now working :-) On Sun, Jun 11, 2017 at 11:19 PM, David Koch wrote: > Hello, > > It's been a while and I have never replied on the list. In fact, the fix > committed by Till does work. Thanks! > > On Tue, Apr 25, 2017 at 9:37 AM, Moiz Jinia wrote: > >> Hey David, >

Re: Java parallel streams vs IterativeStream

2017-06-13 Thread nragon
Iterate until all elements were changed, perhaps. But I just wanted to know if there are implementations out there using Java 8 streams, in cases where you want to parallelize a map function even if it is function-scoped. So, in my case, if the computation for each list element is too heavy, how can one

Re: Error running Flink job in Yarn-cluster mode

2017-06-13 Thread Biplob Biswas
Thanks a lot Aljoscha (again). I created my project from scratch using the flink-maven-archetype and now it works in yarn-cluster mode. I was creating a fat jar with my old project setup as well, so I'm not really sure what went wrong there, as it was working in my local test environment

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-13 Thread Sebastian Neef
Hi Flavio, thanks for pointing me to your old thread. I don't have administrative rights on the cluster, but from what dmesg reports, I could not find anything that looks like an OOM message. So no luck for me, I guess... Best, Sebastian

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-13 Thread Sebastian Neef
Hi Ted, thanks for bringing this to my attention. I just rechecked my Java version and it is indeed version 8. Both the code and the Flink environment run that version. Cheers, Sebastian

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-13 Thread Sebastian Neef
Hi Kurt, thanks for the input. What do you mean with "try to disable your combiner"? Any tips on how I can do that? I don't actively use any combine* DataSet API functions, so the calls to the SynchronousChainedCombineDriver come from Flink. Kind regards, Sebastian

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Chesnay Schepler
The scopes look OK to me. Let's try to narrow down the problem areas a bit: 1. Did this work with the same setup before 1.3? 2. Are all task/operator metrics available in the metrics tab of the dashboard? 3. Are there any warnings in the TaskManager logs from the MetricRegistry or StatsDRe

Re: Java parallel streams vs IterativeStream

2017-06-13 Thread Aljoscha Krettek
Hi, How would you use IterativeStream? In Flink, IterativeStream is a pipeline-level concept, whereas your problem seems to be scoped to one user function. Best, Aljoscha > On 12. Jun 2017, at 19:17, nragon wrote: > > In my map functions I have an object containing a list which must be changed,

Re: Window Function on AllWindowed Stream - Combining Kafka Topics

2017-06-13 Thread Aljoscha Krettek
Hi, As a simple test, you can put your key extraction logic into a MapFunction, i.e. MapFunction<T, Tuple2<Key, T>>, and then simply use that field as the key: input.map(new MyKeyExtractorMapper()).keyBy(0) If that solves your problem it means that the key extraction is not deterministic. This is
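A sketch of the suggested test, materializing the key as a tuple field once so keyBy uses a stable, precomputed value (MyEvent, getId() and the extraction logic are placeholders for the poster's own types):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;

// Sketch: compute the key once in a map, then key by the tuple field.
public class MyKeyExtractorMapper
        implements MapFunction<MyEvent, Tuple2<String, MyEvent>> {

    @Override
    public Tuple2<String, MyEvent> map(MyEvent value) {
        return new Tuple2<>(extractKey(value), value);
    }

    private String extractKey(MyEvent value) {
        return value.getId(); // stand-in for the existing key extraction logic
    }
}
```

Used as `input.map(new MyKeyExtractorMapper()).keyBy(0)`, any nondeterminism in the original key selector is removed, because the key is computed exactly once per element.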

Re: Use Single Sink For All windows

2017-06-13 Thread Aljoscha Krettek
Yes, CheckpointListener will enable you to listen for completed checkpoints. I think you should put the values in state before returning from the snapshot method, though, to prevent data loss. And regarding your other question: yes, while a snapshot is ongoing the invoke() method will n