subscribe

2017-12-04 Thread Xin Wang
hello flink -- Thanks, Xin

CPU Cores of JobManager

2017-12-04 Thread Yuta Morisawa
Hi, I am looking for a way to increase the number of CPU cores allocated to the JobManager, because my JobManager web UI is very slow and sometimes freezes. I think this is caused by a resource shortage on the JobManager. How can I increase the number of CPU cores for the JobManager in YARN mode? Thanks, Yuta

Re: Window function support on SQL

2017-12-04 Thread Tao Xia
Thanks for the quick response, Fabian. I have a DataStream of Avro objects. I am not sure how to add a TIMESTAMP attribute or convert the event_timestamp field to a timestamp attribute for my SQL use cases. Most docs only cover the Table API with a static schema. P.S. My Avro schema has 100+ fields. Can

Window function support on SQL

2017-12-04 Thread Tao Xia
Hi all, do you know whether window functions are supported in SQL yet? I got an error message when trying to use a group function in SQL. My query is below: val query = "SELECT nd_key, concept_rank, event_timestamp FROM "+streamName + " GROUP BY TUMBLE(event_timestamp, INTERVAL '1' HOUR), nd_key" Error

Re: non-shared TaskManager-specific config file

2017-12-04 Thread Hao Sun
https://issues.apache.org/jira/browse/FLINK-8197, here is the JIRA link for xref. On Mon, Dec 4, 2017 at 7:35 AM Hao Sun wrote: > Sure, I will do that. > > On Mon, Dec 4, 2017, 07:26 Fabian Hueske wrote: > >> Can you create a JIRA issue to propose the

Re: Trace jar file name from jobId, is that possible?

2017-12-04 Thread Hao Sun
Thanks Fabian, there is one case that cannot be covered by the REST API. When a job is rescheduled, its jobId changes, and I won't be able to trace back the jar name. Why not keep the jar name stored somewhere and expose it through the API as well? On Mon, Dec 4, 2017 at 4:52 AM Fabian Hueske

Re: non-shared TaskManager-specific config file

2017-12-04 Thread Hao Sun
Sure, I will do that. On Mon, Dec 4, 2017, 07:26 Fabian Hueske wrote: > Can you create a JIRA issue to propose the feature? > > Thank you, > Fabian > > 2017-12-04 16:15 GMT+01:00 Hao Sun : > >> Thanks. If we can support include configuration dir that will

Re: non-shared TaskManager-specific config file

2017-12-04 Thread Fabian Hueske
Can you create a JIRA issue to propose the feature? Thank you, Fabian 2017-12-04 16:15 GMT+01:00 Hao Sun : > Thanks. If we could support an include configuration directory, that would be very > helpful. > > On Mon, Dec 4, 2017, 00:50 Chesnay Schepler wrote: > >> You

Re: non-shared TaskManager-specific config file

2017-12-04 Thread Hao Sun
Thanks. If we could support an include configuration directory, that would be very helpful. On Mon, Dec 4, 2017, 00:50 Chesnay Schepler wrote: > You will have to create a separate config for each TaskManager. > > > On 01.12.2017 23:14, Hao Sun wrote: > > Hi team, I am wondering how can I

Re: Kafka consumer to sync topics by event time?

2017-12-04 Thread Fabian Hueske
You are right, offsets cannot be used for tracking processing progress. I think setting Kafka offsets with respect to some progress notion other than "has been consumed" would be highly application specific and hard to generalize. As you said, there might be a window (such as a session window)

Re: Kafka consumer to sync topics by event time?

2017-12-04 Thread Juho Autio
Thank you, Fabian. Really clear explanation. That indeed matches my observation (data is not dropped from either the small or the big topic, but the offsets are already advancing on the Kafka side before those offsets have been triggered from a window operator). This means that it's a bit harder to

Re: Kafka consumer to sync topics by event time?

2017-12-04 Thread Fabian Hueske
Hi Juho, the partitions of both topics are independently consumed, i.e., at their own speed without coordination. With the configuration that Gordon linked, watermarks are generated per partition. Each source task maintains the latest (and highest) watermark per partition and propagates the
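The per-partition bookkeeping described in this snippet can be sketched roughly as follows. The snippet is cut off before it says what gets propagated; the sketch assumes the usual rule that a source emits the minimum of its per-partition watermarks, and it never lets the emitted value go backwards. The class name, method names, and partition labels are hypothetical and are not Flink API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not Flink's actual implementation.
public class PerPartitionWatermarks {

    // Highest watermark seen so far for each partition.
    private final Map<String, Long> highestPerPartition = new HashMap<>();

    // Last watermark handed downstream; it never decreases.
    private long lastEmitted = Long.MIN_VALUE;

    public PerPartitionWatermarks(String... partitions) {
        // A partition that has not reported yet holds the overall watermark back.
        for (String partition : partitions) {
            highestPerPartition.put(partition, Long.MIN_VALUE);
        }
    }

    // Records a watermark for one partition and returns the watermark to emit.
    // Assumption: the emitted value is the minimum across all partitions.
    public long onWatermark(String partition, long watermark) {
        highestPerPartition.merge(partition, watermark, Long::max);
        long min = Long.MAX_VALUE;
        for (long w : highestPerPartition.values()) {
            min = Math.min(min, w);
        }
        lastEmitted = Math.max(lastEmitted, min);
        return lastEmitted;
    }

    public static void main(String[] args) {
        PerPartitionWatermarks source = new PerPartitionWatermarks("big-0", "small-0");
        source.onWatermark("big-0", 1000L);                        // still Long.MIN_VALUE
        System.out.println(source.onWatermark("small-0", 400L));   // 400
        System.out.println(source.onWatermark("big-0", 2000L));    // 400
        System.out.println(source.onWatermark("small-0", 900L));   // 900
    }
}
```

Under this assumption a fast partition cannot push event time ahead of a slow one, while both partitions are still read at their own speed.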

Re: Checkpoint expired before completing

2017-12-04 Thread Nico Kruber
Although there may be no checkpoints in flight with this configuration, there are most certainly records floating around in various buffers that filled up while your sink was pausing everything. Those records need to be processed first before the new checkpoint's checkpoint barrier may make it

Re: Blob server not working with 1.4.0.RC2

2017-12-04 Thread Nico Kruber
Hi Bernd, thanks for the report. I tried to reproduce it locally but both a telnet connection to the BlobServer as well as the BLOB download by the TaskManagers work for me. Can you share your configuration that is causing the problem? You could also try increasing the log level to DEBUG and see

Re: Trace jar file name from jobId, is that possible?

2017-12-04 Thread Fabian Hueske
Hi, you can submit jar files and start jobs via the REST interface [1]. When starting a job, you get the jobId. You can link jar files and savepoints via the jobId. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/rest_api.html#submitting-programs
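One way to keep the link between jobId and jar file that this reply mentions is a small lookup table on the submitting side. The sketch below is only an illustration of that bookkeeping: the class, its methods, and the sample IDs are hypothetical, and it makes no REST calls itself. The caller is assumed to record the jobId it gets back when starting a job.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side index; not part of Flink.
public class JobJarIndex {

    // jobId -> name of the jar the job was started from.
    private final Map<String, String> jarByJobId = new ConcurrentHashMap<>();

    // Call this with the jobId returned when the job is started.
    public void recordSubmission(String jobId, String jarName) {
        jarByJobId.put(jobId, jarName);
    }

    // Looks up the jar for a jobId; empty if the jobId was never recorded.
    public Optional<String> jarFor(String jobId) {
        return Optional.ofNullable(jarByJobId.get(jobId));
    }

    public static void main(String[] args) {
        JobJarIndex index = new JobJarIndex();
        index.recordSubmission("job-a1", "wordcount.jar"); // made-up values
        System.out.println(index.jarFor("job-a1").orElse("unknown")); // wordcount.jar
        System.out.println(index.jarFor("job-zz").orElse("unknown")); // unknown
    }
}
```

An index like this only knows the jobIds it was told about, so a job that comes back under a new jobId would need to be recorded again.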

Re: using regular expression to specify Kafka topics

2017-12-04 Thread Tzu-Li (Gordon) Tai
Hi, I’ve created a PR to publicly expose the feature: https://github.com/apache/flink/pull/5117. Whether or not we should include this in the next release candidate for 1.4 is still up for discussion. Best, Gordon On 4 December 2017 at 3:02:29 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org)

Re: TaskManager HA on YARN

2017-12-04 Thread Till Rohrmann
Hi Hayden, in YARN mode, Flink will tolerate as many TM failures as you have configured in `yarn.maximum-failed-containers`. By default this is set to the initial number of requested TMs. So in your case, the Flink cluster would restart a TM twice and then fail the cluster once a TM fails for the

Re: Flik typesafe configuration

2017-12-04 Thread Fabian Hueske
Hi Georg, the recommended approach to configure user functions is to pass parameters as (typesafe) arguments to the constructor. Flink serializes user function objects using Java serialization and distributes them to the workers. Hence, the configuration during plan construction is preserved.
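A minimal sketch of the constructor approach, written without any Flink dependency so it runs on its own. `ThresholdFilter` is a hypothetical stand-in for a Flink user function, `ThresholdConfig` is a hypothetical settings object, and `roundTrip` only imitates the shipping step by pushing the object through plain Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ConstructorConfigSketch {

    // Hypothetical settings object; every field must itself be serializable.
    static final class ThresholdConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        final String fieldName;
        final double threshold;

        ThresholdConfig(String fieldName, double threshold) {
            this.fieldName = fieldName;
            this.threshold = threshold;
        }
    }

    // Stand-in for a user function: configuration arrives via the constructor
    // and is stored in a field, so it travels with the serialized object.
    static final class ThresholdFilter implements Serializable {
        private static final long serialVersionUID = 1L;
        private final ThresholdConfig config;

        ThresholdFilter(ThresholdConfig config) {
            this.config = config;
        }

        boolean filter(double value) {
            return value >= config.threshold;
        }

        ThresholdConfig config() {
            return config;
        }
    }

    // Serializes and deserializes an object, imitating its transfer to a worker.
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        ThresholdFilter onClient = new ThresholdFilter(new ThresholdConfig("temperature", 30.0));
        ThresholdFilter onWorker = roundTrip(onClient);
        System.out.println(onWorker.filter(31.5)); // true
        System.out.println(onWorker.filter(12.0)); // false
    }
}
```

The copy on the "worker" side behaves like the original because the configuration is ordinary serializable state of the function object; a field marked `transient` or holding a non-serializable type would not survive the same trip.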

Re: Maintain heavy hitters in Flink application

2017-12-04 Thread Fabian Hueske
Hi Max, state (keyed or operator state) is always local to the task. By default it is not accessible (read or write) from the outside or from other tasks of the application. You can expose keyed state as queryable state [1] to perform key lookups. This feature was designed for external application

TaskManager HA on YARN

2017-12-04 Thread Marchant, Hayden
Hi, we are currently starting to test Flink running on YARN. Until now, we've been testing on a standalone cluster. One thing lacking in standalone mode is that we have to manually restart a TaskManager if it dies. I looked at

Blob server not working with 1.4.0.RC2

2017-12-04 Thread Bernd.Winterstein
Hi, since we switched to release 1.4, the TaskManagers are unable to download blobs from the JobManager. The TaskManager registration still works. Netstat on the JobManager shows open ports at 6123 and 5. But a telnet connection from the TaskManager to the JobManager on port 5 times out. Any ideas

Re: non-shared TaskManager-specific config file

2017-12-04 Thread Chesnay Schepler
You will have to create a separate config for each TaskManager. On 01.12.2017 23:14, Hao Sun wrote: Hi team, I am wondering how I can create a non-shared config file and let Flink read it. Can I use include in the config? Or do I have to prepare a different config for each TM?

Re: FlinkKafkaProducerXX

2017-12-04 Thread Mikhail Pryakhin
Exactly, at least it's worth mentioning in the Javadoc which partitioner is used by default when none is specified, because the default behavior might not seem obvious. Kind Regards, Mike Pryakhin > On 3 Dec 2017, at 22:08, Stephan Ewen wrote: > > Sounds like adding a