Re:Re: ElasticsearchSink on DataSet

2017-05-08 Thread Tzu-Li (Gordon) Tai
Hi! Thanks for sharing that repo! I think that would be quite an useful contribution to Flink for the users, if you’re up to preparing a PR for it :) It also looks like you’ve adopted most of the current ElasticsearchSink APIs (RequestIndexer, ElasticsearchSinkFunction, etc.) for the

Re: Writing a reliable Flink source for a NON-replay-able queue/protocol that supports message ACKs

2017-05-08 Thread Tzu-Li (Gordon) Tai
Hi Martin! Let me try to follow-up some of your questions :) a. When the acknowledgeIDs method is called, is it certain that all the rest of the operators, including the Sinks finished successfully? E.g: If I have a sink that writes to MySQL/Cassandra and one that writes to SQS/Kafka, will the

Re: Job ID

2017-05-08 Thread Tzu-Li (Gordon) Tai
Hi Joe, AFAIK, this currently isn’t possible through the DataStream API. You’ll be able to get a JobExecutionResult which contains the job id from the execute() call, but that’s blocked until the streaming job finishes. There are plans for a new DataStream client that allows querying job info

Re:Re: ElasticsearchSink on DataSet

2017-05-08 Thread wyphao.2007
Hi Flavio Maybe this is what you want: https://github.com/397090770/flink-elasticsearch2-connector, It can save Flink DataSet to elasticsearch. importscala.collection.JavaConversions._ valconfig=Map("bulk.flush.max.actions"->"1000",

Re: Flink + Kafka + avro example

2017-05-08 Thread Tzu-Li (Gordon) Tai
Thanks a lot for sharing this Flavio! On 5 May 2017 at 10:45:38 PM, Flavio Pompermaier (pomperma...@okkam.it) wrote: Hi to all Flink users, we've just published on our Okkam public repository an example of using Flink 1.2.1 + Kafka 0.10 to exchange Avro objects[1]. We hope this could be

Re: ElasticsearchSink on DataSet

2017-05-08 Thread Tzu-Li (Gordon) Tai
Hi Flavio, I don’t think there is a bridge class for this. At the moment you’ll have to implement your own OutputFormat. The ElasticsearchSink is a SinkFunction which is part of the DataStream API, which generally speaking at the moment has no bridge or unification yet with the DataSet API.

Job ID

2017-05-08 Thread Joe Olson
I've got a job name, and need the job id. Is there a way to get this via the java API? I know I can get it via the rest interface. Is there an equivalent API call in the streaming API? If not, I'll continue to use the rest interface.

Re: [DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-08 Thread Renjie Liu
Hi, does this include the FLIP6? On Tue, May 9, 2017 at 2:29 AM Stephan Ewen wrote: > Did a quick test: Simply adding the > "org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer" > helps with NOTICE files, > but does not add the required BSD licence copies.

Joining on multiple row values produced by TableFunction

2017-05-08 Thread Samuel Doyle
I want to do something like the following .join("fields(fields) as (name, content)") .where("text = 'password for user' && name='text' && !content.like('%accepted%') && name='appname' && content.like('%hostd%')") Fields collects 4 rows in this case which contain those

AsyncCollector Does not release the thread (1.2.1)

2017-05-08 Thread Steve Robert
Hi guys, AsyncCollector.collect(Throwable) method seem to not release the Thread. This scenario may be problematic when calling an external API In the case of a timeout error there is no data to collect. for example : CompletableFuture.supplyAsync(() -> asyncCallTask(input))

Re: Question on checkpoint management

2017-05-08 Thread Stefan Richter
I think this jira is helpful for your question: https://issues.apache.org/jira/browse/FLINK-6328 > Am 08.05.2017 um 19:33 schrieb Cliff Resnick : > > When a job cancel-with-savepoint finishes a successful Savepoint, the >

Re: [DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-08 Thread Stephan Ewen
Did a quick test: Simply adding the "org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer" helps with NOTICE files, but does not add the required BSD licence copies. On Mon, May 8, 2017 at 8:25 PM, Stephan Ewen wrote: > I did the first pass for the legal

Re: [DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-08 Thread Stephan Ewen
I did the first pass for the legal check. - Source LICENSE and NOTICE are okay - In the shaded JAR files, we are not bundling the license and notice files of the dependencies we include in the shaded jars. => Not a problem for Guava (Apache Licensed) => I think is a problem for ASM

Re: RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: null

2017-05-08 Thread Kaepke, Marc
Hi, did some had an answer or solution? Best Marc Am 05.05.2017 um 20:05 schrieb Kaepke, Marc >: Hi everyone, what does mean that following exception, if I run my gelly program? Exception in thread "main"

Question on checkpoint management

2017-05-08 Thread Cliff Resnick
When a job cancel-with-savepoint finishes a successful Savepoint, the preceding last successful Checkpoint is removed. Is this the intended behavior? I thought that checkpoints and savepoints were separate entities and, as such, savepoints should not infringe on checkpoints. This is actually an

Re: Window Function on AllWindowed Stream - Combining Kafka Topics

2017-05-08 Thread Aljoscha Krettek
It seems that eventTime is a static field in TopicPojo and the key selector also just gets the static field via TopicPojo.getEventTime(). Why is that? Because with this the event time basically has nothing to do with the data. > On 5. May 2017, at 10:32, G.S.Vijay Raajaa

Re: High Availability on Yarn

2017-05-08 Thread Jain, Ankit
Thanks Stephan – we will go with a central ZooKeeper Instance and hopefully have it started through a cloudformation script as part of EMR startup. Is Zk also used to keep track of checkpoint metadata and the execution graph of the running job to recover from ApplicationMaster failure as

Re: Stopping the job with ID XXX failed.

2017-05-08 Thread Stefan Richter
Hi, what you intend to do is cancel in Flink terminology, not stop. So you should use the cancel command instead of the stop. Please take a look here: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/cli.html

Re: High Availability on Yarn

2017-05-08 Thread Stephan Ewen
@Ankit: ZooKeeper is required in YARN setups still. Even if there is only one JobManager in the normal case, Yarn can accidentally create a second one when there is a network partition. To prevent that this leads to inconsistencies, we use ZooKeeper. Flink uses ZooKeeper very little, so you can

Re: AllWindowed vs Windowed with 1 key

2017-05-08 Thread Adrienne Kole
Hi, Thanks for the reply. So I have 2 cases: 1. timeWindowAll (length, slide).reduce (...) (with parallelism = 1) 2. groupby(someField).timeWindow(length, slide). reduce(...) Lets say case-1 global window, case-2 partitioned window. If I have only one key (for case-2) and I set parallelism=1

Re: AllWindowed vs Windowed with 1 key

2017-05-08 Thread Stefan Richter
Hi, to answer this question, we would first need to know what you mean by „global windows“: using „windowAll()“ or „GlobalWindows“? Also, the answer might depend on the Flink version that you are using. Best, Stefan > Am 07.05.2017 um 23:23 schrieb Adrienne Kole : >

Stopping the job with ID XXX failed.

2017-05-08 Thread yunfan123
I can't stop the job, every time the exception like follows. Retrieving JobManager. Using address /10.6.192.141:6123 to connect to JobManager. The program finished with the following exception: java.lang.Exception: Stopping the job

Re: High Availability on Yarn

2017-05-08 Thread Aljoscha Krettek
Hi, Yes, it’s recommended to use one ZooKeeper cluster for all Flink clusters. Best, Aljoscha > On 5. May 2017, at 16:56, Jain, Ankit wrote: > > Thanks for the update Aljoscha. > > @Till Rohrmann , > Can you please chim in? > > Also, we

Re: Can ValueState use generics?

2017-05-08 Thread Stephan Ewen
Please use "new ValueStateDescriptor<>("mystate", TypeInformation.of(new TypeHint >(){})); That should work... On Mon, May 8, 2017 at 1:11 PM, Chesnay Schepler wrote: > If you want to use generics you have to either provide a TypeInformation >

Re: Can ValueState use generics?

2017-05-08 Thread Chesnay Schepler
If you want to use generics you have to either provide a TypeInformation instead of a class or create a class that extends Tuple2(Integer, ObjectNode) and use it as the class argument. On 07.05.2017 15:14, yunfan123 wrote: My process function is like : private static class MergeFunction

[DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-08 Thread Robert Metzger
Hi Devs, I've created a first non-voting release candidate for Flink 1.3.0. Please use this RC to test as much as you can and provide feedback to the Flink community. The more we find and fix now, the better Flink 1.3.0 wil be :) I've CC'ed the user@ mailing list to get more people to test it.

Re: Kafka 0.10 jaas multiple clients

2017-05-08 Thread Tzu-Li (Gordon) Tai
Hi Gwenhael, Sorry for the very long delayed response on this. As you noticed, the “KafkaClient” entry name seems to be a hardcoded thing on the Kafka side, so currently I don’t think what you’re asking for is possible. It seems like this could be made possible with some of the new