[jira] [Created] (FLINK-8419) Kafka consumer's offset metrics are not registered for dynamically discovered partitions
Tzu-Li (Gordon) Tai created FLINK-8419:
--

Summary: Kafka consumer's offset metrics are not registered for dynamically discovered partitions
Key: FLINK-8419
URL: https://issues.apache.org/jira/browse/FLINK-8419
Project: Flink
Issue Type: Bug
Components: Kafka Connector, Metrics
Affects Versions: 1.4.0, 1.5.0
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
Priority: Blocker
Fix For: 1.5.0, 1.4.1

Currently, the per-partition offset metrics are registered via the {{AbstractFetcher#addOffsetStateGauge}} method. That method is only ever called for the initial startup partitions, and not for dynamically discovered partitions.

We should consider adding some unit tests to make sure that metrics are properly registered for all partitions. That would also safeguard us from accidentally removing metrics.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8418) Kafka08ITCase.testStartFromLatest() times out on Travis
Tzu-Li (Gordon) Tai created FLINK-8418:
--

Summary: Kafka08ITCase.testStartFromLatest() times out on Travis
Key: FLINK-8418
URL: https://issues.apache.org/jira/browse/FLINK-8418
Project: Flink
Issue Type: Bug
Components: Kafka Connector, Tests
Affects Versions: 1.4.0, 1.5.0
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
Priority: Critical
Fix For: 1.5.0, 1.4.1

Instance: https://travis-ci.org/kl0u/flink/builds/327733085
[jira] [Created] (FLINK-8417) Support STSAssumeRoleSessionCredentialsProvider in FlinkKinesisConsumer
Tzu-Li (Gordon) Tai created FLINK-8417:
--

Summary: Support STSAssumeRoleSessionCredentialsProvider in FlinkKinesisConsumer
Key: FLINK-8417
URL: https://issues.apache.org/jira/browse/FLINK-8417
Project: Flink
Issue Type: New Feature
Components: Kinesis Connector
Reporter: Tzu-Li (Gordon) Tai
Fix For: 1.5.0

As discussed on the ML: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connectors-With-Temporary-Credentials-td17734.html

Users need the functionality to access cross-account AWS Kinesis streams using AWS Temporary Credentials [1]. We should add support for {{AWSConfigConstants.CredentialsProvider.STSAssumeRole}}, which internally would use the {{STSAssumeRoleSessionCredentialsProvider}} in {{AWSUtil#getCredentialsProvider(Properties)}}.

[1] https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html
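As a rough illustration of what the requested configuration might look like once this is supported, here is a hedged sketch using plain {{java.util.Properties}}. Note that the {{STS_ASSUME_ROLE}} provider value and the two {{role.*}} keys below are invented for illustration (this issue is precisely about adding such support); only the general {{aws.credentials.provider}}/{{aws.region}} key style follows the existing Kinesis connector configuration, and the role ARN is a dummy value.

```java
import java.util.Properties;

public class AssumeRoleConfigSketch {
    // Hypothetical sketch of an assume-role consumer configuration.
    // "STS_ASSUME_ROLE" and the role.* keys are illustrative assumptions,
    // not existing AWSConfigConstants; the ARN is a placeholder.
    public static Properties buildConfig() {
        Properties config = new Properties();
        config.put("aws.region", "us-east-1");
        config.put("aws.credentials.provider", "STS_ASSUME_ROLE");
        config.put("aws.credentials.provider.role.arn",
                   "arn:aws:iam::123456789012:role/example-role");
        config.put("aws.credentials.provider.role.sessionName",
                   "flink-kinesis-session");
        return config;
    }
}
```

Internally, {{AWSUtil#getCredentialsProvider(Properties)}} would then map such a setting to an {{STSAssumeRoleSessionCredentialsProvider}} from the AWS SDK, which refreshes the temporary credentials automatically.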
[jira] [Created] (FLINK-8416) Kinesis consumer doc examples should demonstrate preferred default credentials provider
Tzu-Li (Gordon) Tai created FLINK-8416:
--

Summary: Kinesis consumer doc examples should demonstrate preferred default credentials provider
Key: FLINK-8416
URL: https://issues.apache.org/jira/browse/FLINK-8416
Project: Flink
Issue Type: Improvement
Components: Documentation, Kinesis Connector
Reporter: Tzu-Li (Gordon) Tai
Fix For: 1.3.3, 1.5.0, 1.4.1

The Kinesis consumer docs (https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/connectors/kinesis.html#kinesis-consumer) demonstrate providing credentials by explicitly supplying the AWS Access ID and Key. Unless running locally, the preferred approach on AWS is always to fetch the shipped credentials automatically from the AWS environment. That is actually the default behaviour of the Kinesis consumer, so the docs should demonstrate it more clearly.
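To make the point concrete, a minimal sketch of the preferred setup, assuming the connector's {{aws.region}}/{{aws.credentials.provider}} property keys: the access key and secret are simply omitted, and the provider either defaults to or is explicitly set to {{AUTO}}, which falls back to the default AWS credentials chain.

```java
import java.util.Properties;

public class KinesisAutoCredentialsSketch {
    // Sketch of the preferred configuration: no explicit AWS Access ID/Key.
    public static Properties buildConfig() {
        Properties consumerConfig = new Properties();
        consumerConfig.put("aws.region", "us-east-1");
        // "AUTO" is the consumer's default: credentials are picked up from the
        // environment (env vars, system properties, profile, instance role),
        // so this line is optional and shown only for clarity.
        consumerConfig.put("aws.credentials.provider", "AUTO");
        return consumerConfig;
    }
}
```

Such a config would then be passed to {{FlinkKinesisConsumer}} as usual; no credentials appear in the job code at all.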
[jira] [Created] (FLINK-8415) Unprotected access to recordsToSend in LongRecordWriterThread#shutdown()
Ted Yu created FLINK-8415:
--

Summary: Unprotected access to recordsToSend in LongRecordWriterThread#shutdown()
Key: FLINK-8415
URL: https://issues.apache.org/jira/browse/FLINK-8415
Project: Flink
Issue Type: Bug
Reporter: Ted Yu
Priority: Minor

{code}
public void shutdown() {
	running = false;
	recordsToSend.complete(0L);
}
{code}

In the other methods, access to recordsToSend is protected by the synchronized keyword. shutdown() should do the same.
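A self-contained sketch of the suggested fix (a plain-Java stand-in, not the actual benchmark class): shutdown() takes the same lock as the other accessors before completing the future.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative stand-in for LongRecordWriterThread with the proposed fix:
// every method touching recordsToSend, including shutdown(), is synchronized.
public class WriterThreadSketch {
    private final CompletableFuture<Long> recordsToSend = new CompletableFuture<>();
    private volatile boolean running = true;

    public synchronized void setRecordsToSend(long records) {
        recordsToSend.complete(records);
    }

    // Non-blocking read while holding the same lock as the writers.
    public synchronized long getCompletedOrDefault(long fallback) {
        return recordsToSend.getNow(fallback);
    }

    public synchronized void shutdown() {
        running = false;
        recordsToSend.complete(0L);
    }

    public boolean isRunning() {
        return running;
    }
}
```

Since all mutators now hold the monitor, a concurrent reassignment or completion of recordsToSend cannot race with shutdown().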
Re: Issues to work on for newbie to the Flink project
There's a `starter` label; here's the task list I could find:
https://issues.apache.org/jira/browse/FLINK-6924?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20starter%20ORDER%20BY%20priority%20DESC

On Thu, Jan 11, 2018 at 5:50 AM, Shailesh Jain wrote:

> Bump. I have the same question.
>
> Shailesh
>
> On 09-Jan-2018 9:17 PM, "Ashutosh Dubey" wrote:
>
> Hello Flink Committers,
>
> I have been reading the Flink documentation for some time now and want to
> contribute to the project as a developer. However, unlike other projects, I
> could not find tags on issues that suggest they are suitable for newbies to
> pick up.
>
> I would really appreciate it if you could suggest a few JIRA issues that I
> could pick up to get started.
>
> I appreciate your time and efforts for the community.
>
> Thanks!
>
> Ashutosh

--
Mingmin
[jira] [Created] (FLINK-8414) Gelly performance seriously decreases when using the suggested parallelism configuration
flora karniav created FLINK-8414:
--

Summary: Gelly performance seriously decreases when using the suggested parallelism configuration
Key: FLINK-8414
URL: https://issues.apache.org/jira/browse/FLINK-8414
Project: Flink
Issue Type: Bug
Components: Configuration, Documentation, Gelly
Reporter: flora karniav
Priority: Minor

I am running the Gelly examples with different datasets on a cluster of 5 machines (1 JobManager and 4 TaskManagers) with 32 cores each. The number-of-slots parameter is set to 32 (as suggested) and the parallelism to 128 (32 cores * 4 TaskManagers).

I observe a vast performance degradation with these suggested settings compared to, for example, setting parallelism.default to 16, where the same job completes in 37 seconds vs. 140 in the 128-parallelism case. Is there something wrong with my configuration? Should I decrease the parallelism, and, if so, will this inevitably decrease CPU utilization?

Another matter that may be related to this is the number of partitions of the data. Is this somehow related to parallelism? How many partitions are created in the case of parallelism.default=128?
Re: Issues to work on for newbie to the Flink project
Bump. I have the same question.

Shailesh

On 09-Jan-2018 9:17 PM, "Ashutosh Dubey" wrote:

Hello Flink Committers,

I have been reading the Flink documentation for some time now and want to contribute to the project as a developer. However, unlike other projects, I could not find tags on issues that suggest they are suitable for newbies to pick up.

I would really appreciate it if you could suggest a few JIRA issues that I could pick up to get started.

I appreciate your time and efforts for the community.

Thanks!

Ashutosh
[jira] [Created] (FLINK-8413) Checkpointing in Flink doesn't maintain the snapshot state
suganya created FLINK-8413:
--

Summary: Checkpointing in Flink doesn't maintain the snapshot state
Key: FLINK-8413
URL: https://issues.apache.org/jira/browse/FLINK-8413
Project: Flink
Issue Type: Bug
Components: State Backends, Checkpointing
Affects Versions: 1.3.2
Reporter: suganya

We have a project which consumes events from Kafka, does a groupBy in a time window (5 mins), and after the window elapses pushes the events downstream for merging. This project is deployed using Flink, and we have enabled checkpointing to recover from a failed state (windowSize: 5 mins, checkpointingInterval: 5 mins, state.backend: filesystem).

Offsets from Kafka get checkpointed every 5 minutes (checkpointingInterval). The event offsets are checkpointed before the entire DAG (groupBy and merge) has finished. So in case of any restart of a TaskManager, the new task starts from the last successful checkpoint, but we were not able to get the aggregated snapshot data (the data from the groupBy task) from the persisted checkpoint.

We are able to retrieve the last successfully checkpointed offsets from Kafka, but could not get the aggregated data up to the checkpoint.
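For reference, the reported setup can be reconstructed as a configuration sketch against the standard DataStream API (the checkpoint URI is a placeholder; only the interval and backend values come from the report):

```java
// Configuration sketch of the described setup; requires the Flink streaming
// API on the classpath. The checkpoint path is a hypothetical placeholder.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(5 * 60 * 1000);                               // checkpointingInterval: 5 mins
env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints")); // state.backend: filesystem
```

Note that Flink checkpoints the Kafka offsets and the window operator state together in one consistent snapshot, so on a successful restore the aggregated window contents should be recovered along with the offsets; the behaviour described above is why this is filed as a bug.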
[jira] [Created] (FLINK-8412) Show outdated warning for 1.3 docs
Timo Walther created FLINK-8412:
--

Summary: Show outdated warning for 1.3 docs
Key: FLINK-8412
URL: https://issues.apache.org/jira/browse/FLINK-8412
Project: Flink
Issue Type: Bug
Components: Documentation
Reporter: Timo Walther
Assignee: Timo Walther

We should show the outdated warning for the old docs and link to the new release docs.
Call for Presentations FOSS Backstage open
Hi,

As announced at Berlin Buzzwords, we (that is Isabel Drost-Fromm, Stefan Rudnitzki, as well as the eventing team over at newthinking communications GmbH) are working on a new conference this summer in Berlin. The name of this new conference will be "FOSS Backstage". Backstage comprises all things FOSS governance, open collaboration, and how to build and manage communities within the open source space.

Submission URL: https://foss-backstage.de/call-papers

The event will comprise presentations on all things FOSS governance, decentralised decision making, and open collaboration. We invite you to submit talks on the topics: FOSS project governance, collaboration, community management; asynchronous/decentralised decision making; vendor neutrality in FOSS; sustainable FOSS; cross-team collaboration; dealing with poisonous people; project growth and hand-over; trademarks; strategic licensing. While it is primarily targeted at contributions from FOSS people, we would also love to learn more about how typical FOSS collaboration models work within enterprises. Closely related topics not explicitly listed above are welcome.

Important dates (all dates in GMT+2):
Submission deadline: February 18th, 2018
Conference: June 13th/14th, 2018

High-quality talks are called for, ranging from principles to practice. We are looking for real-world case studies, background on the social architecture of specific projects, and deep dives into cross-community collaboration. Acceptance notifications will be sent out soon after the submission deadline. Please include your name, bio and email, the title of the talk, and a brief abstract in English.

We have drafted the submission form to allow for regular talks, each 45 minutes in length. However, you are free to submit your own ideas on how to support the event: if you would like to take our attendees out to show them your favourite bar in Berlin, please submit this offer through the CfP form. If you are interested in sponsoring the event (e.g. we would be happy to provide videos after the event, free drinks for attendees, as well as an after-show party), please contact us.

Schedule and further updates on the event will be published soon on the event web page. Please re-distribute this CfP to people who might be interested.

Contact us at: newthinking communications GmbH, Schoenhauser Allee 6/7, 10119 Berlin, Germany, i...@foss-backstage.de

Looking forward to meeting you all in person in summer :)

I would love to see all those tracks filled with lots of valuable talks on the Apache Way, on how we work, on how the Incubator works, on how being a 501(c)(3) influences how people get involved and projects are run, on how being a member-run organisation is different, on merit for life, on growing communities, on things gone great - and things gone entirely wrong in the ASF's history, on how to interact with Apache projects as a corporation, and everything else you can think of.

Isabel

--
Sorry for any typos: Mail was typed in vim, written in mutt, via ssh (most likely involving some kind of mobile connection only.)
[jira] [Created] (FLINK-8411) inconsistent behavior between HeapListState#add() and RocksDBListState#add()
Bowen Li created FLINK-8411:
--

Summary: inconsistent behavior between HeapListState#add() and RocksDBListState#add()
Key: FLINK-8411
URL: https://issues.apache.org/jira/browse/FLINK-8411
Project: Flink
Issue Type: Bug
Affects Versions: 1.4.0
Reporter: Bowen Li
Fix For: 1.5.0, 1.4.1

You can see that HeapListState#add(null) will result in the whole state being cleared or wiped out, whereas RocksDBListState#add() has no null check at all.

{code:java}
// HeapListState
@Override
public void add(V value) {
	final N namespace = currentNamespace;

	if (value == null) {
		clear();
		return;
	}

	final StateTable<K, N, ArrayList<V>> map = stateTable;
	ArrayList<V> list = map.get(namespace);

	if (list == null) {
		list = new ArrayList<>();
		map.put(namespace, list);
	}
	list.add(value);
}
{code}

{code:java}
// RocksDBListState
@Override
public void add(V value) throws IOException {
	try {
		writeCurrentKeyWithGroupAndNamespace();
		byte[] key = keySerializationStream.toByteArray();
		keySerializationStream.reset();
		DataOutputViewStreamWrapper out = new DataOutputViewStreamWrapper(keySerializationStream);
		valueSerializer.serialize(value, out);
		backend.db.merge(columnFamily, writeOptions, key, keySerializationStream.toByteArray());
	} catch (Exception e) {
		throw new RuntimeException("Error while adding data to RocksDB", e);
	}
}
{code}
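The divergence can be demonstrated with a self-contained, plain-Java stand-in. These two classes only mimic the behaviours shown above; they are not the real HeapListState/RocksDBListState implementations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Heap-style: add(null) wipes the list for the current namespace.
class HeapStyleListState<V> {
    private final Map<String, List<V>> table = new HashMap<>();
    private final String namespace = "ns";

    public void add(V value) {
        if (value == null) {       // null clears the whole state
            table.remove(namespace);
            return;
        }
        table.computeIfAbsent(namespace, k -> new ArrayList<>()).add(value);
    }

    public List<V> get() { return table.get(namespace); }
}

// RocksDB-style: no null check; serializing null fails at runtime.
class RocksStyleListState<V> {
    private final List<V> list = new ArrayList<>();

    public void add(V value) {
        if (value == null) {       // stand-in for the serializer failing on null
            throw new RuntimeException("Error while adding data to RocksDB");
        }
        list.add(value);
    }

    public List<V> get() { return list; }
}
```

Calling add(null) on the first class silently discards previously added elements, while on the second it throws and leaves the existing elements intact, which is exactly the kind of inconsistency the issue describes.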