Re: Batch job getting stuck

2018-02-14 Thread Amit Jain
Hi Timo, Yes, we are using off-heap memory. Our YARN containers are set to use ~23G memory with two slots per container, and the YARN heap cutoff ratio is set to 0.6. The jobs have normal memory usage; the problem here is not a temporary halt but a permanent halt of the running jobs. Task manager's log 2018-0

Re: IO metrics

2018-02-14 Thread Till Rohrmann
Hi, the metrics listed on the web page are registered for all tasks automatically. Thus, you should be able to simply consume them by configuring the respective Prometheus metric reporter. One thing to note is that the metrics are reported per Task and not per logical operator. Thus, if your sourc

Re: IO metrics

2018-02-14 Thread Chesnay Schepler
All metrics listed are measured automatically; you only need to configure the Prometheus reporter. Note that we do not measure how much data a source is reading / a sink is writing (see FLINK-7286). If you want to measure these you will have t

[jira] [Created] (FLINK-8651) Add support for different event-time OVER windows in a query

2018-02-14 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8651: Summary: Add support for different event-time OVER windows in a query Key: FLINK-8651 URL: https://issues.apache.org/jira/browse/FLINK-8651 Project: Flink I

Re: Why are checkpoint failures so serious?

2018-02-14 Thread Till Rohrmann
Hi Ron, you should be able to turn off the Task failure in case of a checkpoint failure by setting `ExecutionConfig.setFailTaskOnCheckpointError(false)`. This setting should change the behavior such that checkpoint failures will simply fail the distributed checkpoint. Cheers, Till On Tue, Feb 13

Re: Support distinct aggregation over data stream on Table/SQL API

2018-02-14 Thread Fabian Hueske
Hi Rong, Thanks for taking the initiative to improve the support for DISTINCT aggregations! I've made a pass over your design document and left a couple of comments. I think it is a really good write up and serves as a good start. IMO, the next steps could be to 1) continue and finalize the discu

Re: Why are checkpoint failures so serious?

2018-02-14 Thread Aljoscha Krettek
Hi Ron, Keep in mind, though, that this feature will only be available with the upcoming Flink 1.5. Just making sure you don't go looking for this and are surprised if you don't find it. Best, Aljoscha > On 14. Feb 2018, at 10:20, Till Rohrmann wrote: > > Hi Ron, > > you should be able to

[jira] [Created] (FLINK-8652) Reduce log level of QueryableStateClient.getKvState() to DEBUG

2018-02-14 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8652: Summary: Reduce log level of QueryableStateClient.getKvState() to DEBUG Key: FLINK-8652 URL: https://issues.apache.org/jira/browse/FLINK-8652 Project: Flink

[jira] [Created] (FLINK-8653) Remove slot request timeout from SlotPool

2018-02-14 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8653: Summary: Remove slot request timeout from SlotPool Key: FLINK-8653 URL: https://issues.apache.org/jira/browse/FLINK-8653 Project: Flink Issue Type: Improveme

[jira] [Created] (FLINK-8654) Extend quickstart docs on how to submit jobs

2018-02-14 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8654: Summary: Extend quickstart docs on how to submit jobs Key: FLINK-8654 URL: https://issues.apache.org/jira/browse/FLINK-8654 Project: Flink Issue Type:

[jira] [Created] (FLINK-8655) Add a default keyspace to CassandraSink

2018-02-14 Thread Christopher Hughes (JIRA)
Christopher Hughes created FLINK-8655: Summary: Add a default keyspace to CassandraSink Key: FLINK-8655 URL: https://issues.apache.org/jira/browse/FLINK-8655 Project: Flink Issue Type: I

[jira] [Created] (FLINK-8656) Add CLI command for rescaling

2018-02-14 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8656: Summary: Add CLI command for rescaling Key: FLINK-8656 URL: https://issues.apache.org/jira/browse/FLINK-8656 Project: Flink Issue Type: New Feature

Apache EU Roadshow CFP Closing Soon (23 February)

2018-02-14 Thread Sharan F
Hello Everyone, This is an initial reminder to let you all know that we are holding an Apache EU Roadshow co-located with FOSS Backstage in Berlin on 13th and 14th June 2018. https://s.apache.org/tCHx The Call for Proposals (CFP) for the Apache EU Roadshow is currently open and will close a

[jira] [Created] (FLINK-8657) Fix incorrect description for external checkpoint vs savepoint

2018-02-14 Thread Sihua Zhou (JIRA)
Sihua Zhou created FLINK-8657: Summary: Fix incorrect description for external checkpoint vs savepoint Key: FLINK-8657 URL: https://issues.apache.org/jira/browse/FLINK-8657 Project: Flink Issue

Re: [VOTE] Release 1.4.1, release candidate #1

2018-02-14 Thread Aljoscha Krettek
+1 (binding) - I checked the signatures and hashes - I ran a cluster and tried some example programs - I checked the list of all changes: we're good, legally, and the other changes also all look good. The only thing I'm not sure about is the Network Buffer changes, but I don't think they make

Re: [VOTE] Release 1.4.1, release candidate #1

2018-02-14 Thread Timo Walther
+1 (binding) - I scanned the changes - Ran some example table programs Looks good from my side. Regards, Timo On 2/14/18 at 5:46 PM, Aljoscha Krettek wrote: +1 (binding) - I checked the signatures and hashes - I ran a cluster and tried some example programs - I checked the list of al

[jira] [Created] (FLINK-8658) NoClassDefFoundError: Could not initialize class org.elasticsearch.transport.client.PreBuiltTransportClient

2018-02-14 Thread sathiyarajan (JIRA)
sathiyarajan created FLINK-8658: Summary: NoClassDefFoundError: Could not initialize class org.elasticsearch.transport.client.PreBuiltTransportClient Key: FLINK-8658 URL: https://issues.apache.org/jira/browse/FLINK

[jira] [Created] (FLINK-8659) Add migration tests for Broadcast state.

2018-02-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-8659: Summary: Add migration tests for Broadcast state. Key: FLINK-8659 URL: https://issues.apache.org/jira/browse/FLINK-8659 Project: Flink Issue Type: Improvem

Re: IO metrics

2018-02-14 Thread cw7k
To use the "numBytesOutPerSecond" example, what exactly is being measured? Is there an example app with usage of this metric? Just to clarify, if I register this meter in BucketingSink, it will be ignored? Does this mean I need to implement my own measurement mechanism in BucketingSink and s

[jira] [Created] (FLINK-8660) Enable the user to provide custom HAServices implementation

2018-02-14 Thread Krzysztof Białek (JIRA)
Krzysztof Białek created FLINK-8660: Summary: Enable the user to provide custom HAServices implementation Key: FLINK-8660 URL: https://issues.apache.org/jira/browse/FLINK-8660 Project: Flink

[jira] [Created] (FLINK-8661) Replace Collections.EMPTY_MAP with Collections.emptyMap()

2018-02-14 Thread Ted Yu (JIRA)
Ted Yu created FLINK-8661: Summary: Replace Collections.EMPTY_MAP with Collections.emptyMap() Key: FLINK-8661 URL: https://issues.apache.org/jira/browse/FLINK-8661 Project: Flink Issue Type: Bug
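
For readers unfamiliar with the distinction in the FLINK-8661 summary, the following is a minimal sketch of the general Java difference, not code taken from the ticket or from Flink. `Collections.EMPTY_MAP` is a raw-typed constant, so assigning it to a parameterized `Map` triggers an unchecked-conversion warning, while `Collections.emptyMap()` is a generic method whose type arguments are inferred. The class and method names (`EmptyMapExample`, `rawEmptyMap`, `typedEmptyMap`) are hypothetical and chosen only for illustration.

```java
import java.util.Collections;
import java.util.Map;

public class EmptyMapExample {

    // Raw-typed constant: the assignment to Map<String, Integer> is an
    // unchecked conversion, so the warning has to be suppressed explicitly.
    @SuppressWarnings("unchecked")
    static Map<String, Integer> rawEmptyMap() {
        return Collections.EMPTY_MAP;
    }

    // Generic factory method: the compiler infers <String, Integer>
    // from the return type, and no warning is produced.
    static Map<String, Integer> typedEmptyMap() {
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        Map<String, Integer> raw = rawEmptyMap();
        Map<String, Integer> typed = typedEmptyMap();

        // Both maps are empty and compare equal; only the static typing differs.
        System.out.println("raw is empty: " + raw.isEmpty());
        System.out.println("typed is empty: " + typed.isEmpty());
        System.out.println("equal: " + raw.equals(typed));
    }
}
```

Both variants return an immutable empty map, so the replacement named in the summary is presumably a type-safety cleanup rather than a behavioral change.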

Re: [VOTE] Release 1.4.1, release candidate #1

2018-02-14 Thread Tzu-Li (Gordon) Tai
+1 (binding) - Built from source, tests pass. - Tested the now shaded Elasticsearch Connector on a cluster execution. Logs are correctly forwarded to TM logs. Checked that all dependencies are correctly shaded. - Successfully restored a window operator job’s savepoint from 1.3.2 in 1.4.1. - Kerb