[jira] [Assigned] (FLINK-1651) Running mvn test got stuck
[ https://issues.apache.org/jira/browse/FLINK-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Saputra reassigned FLINK-1651: Assignee: Henry Saputra Running mvn test got stuck -- Key: FLINK-1651 URL: https://issues.apache.org/jira/browse/FLINK-1651 Project: Flink Issue Type: Bug Components: test Reporter: Henry Saputra Assignee: Henry Saputra Priority: Minor

My mvn test run keeps getting stuck in this state:

...
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec - in org.apache.flink.runtime.types.TypeTest
Running org.apache.flink.runtime.util.AtomicDisposableReferenceCounterTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.561 sec - in org.apache.flink.runtime.util.AtomicDisposableReferenceCounterTest
Running org.apache.flink.runtime.util.DataInputOutputSerializerTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.848 sec - in org.apache.flink.runtime.operators.DataSourceTaskTest
Running org.apache.flink.runtime.util.DelegatingConfigurationTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec - in org.apache.flink.runtime.util.DelegatingConfigurationTest
Running org.apache.flink.runtime.util.EnvironmentInformationTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.563 sec - in org.apache.flink.runtime.io.network.serialization.LargeRecordsTest
Running org.apache.flink.runtime.util.event.TaskEventHandlerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 sec - in org.apache.flink.runtime.util.event.TaskEventHandlerTest
Running org.apache.flink.runtime.util.LRUCacheMapTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec - in org.apache.flink.runtime.util.LRUCacheMapTest
Running org.apache.flink.runtime.util.MathUtilTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec - in org.apache.flink.runtime.util.MathUtilTest
Running org.apache.flink.runtime.util.NonReusingKeyGroupedIteratorTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.064 sec - in org.apache.flink.runtime.util.NonReusingKeyGroupedIteratorTest
Running org.apache.flink.runtime.util.ReusingKeyGroupedIteratorTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec - in org.apache.flink.runtime.util.ReusingKeyGroupedIteratorTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.238 sec - in org.apache.flink.runtime.taskmanager.TaskManagerTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.616 sec - in org.apache.flink.runtime.profiling.impl.InstanceProfilerTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.303 sec - in org.apache.flink.runtime.util.DataInputOutputSerializerTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.488 sec - in org.apache.flink.runtime.util.EnvironmentInformationTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.81 sec - in org.apache.flink.runtime.taskmanager.TaskManagerProcessReapingTest
Tests run: 46, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.653 sec - in org.apache.flink.runtime.operators.MatchTaskTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.071 sec - in org.apache.flink.runtime.operators.sort.LargeRecordHandlerTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.534 sec - in org.apache.flink.runtime.operators.DataSinkTaskTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.98 sec - in org.apache.flink.runtime.operators.sort.NormalizedKeySorterTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.017 sec - in org.apache.flink.runtime.io.disk.ChannelViewsTest

After this, nothing seems to happen and the program just hangs. I am using Mac OS X with Java version 1.7.0_71.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Mesos integration of Apache Flink
Github user omnisis commented on the pull request: https://github.com/apache/flink/pull/251#issuecomment-77671415 What's the current status of this ticket? I'm starting to check out Flink, and native Mesos support would be a huge win for my use case. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (FLINK-1659) Rename classes and packages that contains Pact
Henry Saputra created FLINK-1659: Summary: Rename classes and packages that contains Pact Key: FLINK-1659 URL: https://issues.apache.org/jira/browse/FLINK-1659 Project: Flink Issue Type: Task Reporter: Henry Saputra Priority: Minor We have several class names that contain or start with Pact. Pact is the former term for Flink's data model and user-defined functions/operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Fix checking null for ternary operator check o...
GitHub user hsaputra opened a pull request: https://github.com/apache/flink/pull/461 Fix null check in ternary operator on Exception#getMessage calls

Add parentheses around the ternary operator on Exception#getMessage calls, changing the pattern:

"Initializing the input processing failed " + e.getMessage() == null ? "." : ": " + e.getMessage()

to:

"Initializing the input processing failed " + (e.getMessage() == null ? "." : ": " + e.getMessage())

The extra parentheses are needed to make sure the ternary operator's null check applies only to the e.getMessage() call, not to the whole concatenated string.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/hsaputra/flink fix_parentheses_exception_getmessage Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/461.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #461 commit 1345cbf38cdbf849bdc7c0ff7e29d02fa00bc8fa Author: Henry Saputra henry.sapu...@gmail.com Date: 2015-03-07T01:50:57Z Fix null check for Exception#getMessage call: extra parentheses are needed so that the ternary operator's null check applies to e.getMessage() rather than the concatenated string.
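The precedence pitfall behind this PR can be reproduced in plain Java (the class and method names below are illustrative; the message text mirrors the pattern in the PR): string concatenation with + binds tighter than the ternary operator, so without parentheses the null check is applied to the already-concatenated string, which is never null.

```java
public class TernaryPrecedence {

    // Without parentheses, "+" binds tighter than "?:", so the whole
    // string "Initializing ... failed " + msg is compared to null.
    // That string is never null, so the ": " branch always wins.
    static String broken(String msg) {
        return "Initializing the input processing failed " + msg == null
                ? "."
                : ": " + msg;
    }

    // With parentheses, the null check applies to msg alone, as intended.
    static String fixed(String msg) {
        return "Initializing the input processing failed"
                + (msg == null ? "." : ": " + msg);
    }

    public static void main(String[] args) {
        System.out.println(broken(null)); // prints ": null" -- wrong branch taken
        System.out.println(fixed(null));  // prints "Initializing the input processing failed."
    }
}
```

Some compilers and static analyzers flag the broken form as a constant condition, which is how bugs of this shape are often found.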
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-77569539 I think it would definitely be good to have something like a job submission queue that accepts jobs and executes them as soon as enough resources become available. That should not be too hard to do. Simple dependencies could also be checked, like "execute job Y only if job X successfully completed". However, I am not aware of any effort in that direction. 2015-03-06 11:26 GMT+01:00 Flavio Pompermaier notificati...@github.com: I know that in Stratosphere there was an effort to write a job scheduler, do you think that such a thing could be valuable for the future or are you going to rely only on Hadoop-ecosystem stuff (like Oozie or Falcon upon YARN)? Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/410#issuecomment-77537609.
[jira] [Created] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent
Gyula Fora created FLINK-1658: - Summary: Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent Key: FLINK-1658 URL: https://issues.apache.org/jira/browse/FLINK-1658 Project: Flink Issue Type: Improvement Components: Distributed Runtime, Local Runtime Reporter: Gyula Fora Priority: Trivial The same name is used for different event classes in the runtime which can cause confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1648) Add a mode where the system automatically sets the parallelism to the available task slots
[ https://issues.apache.org/jira/browse/FLINK-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1648. - Resolution: Implemented Implemented in d8d642fd6d7d9b8526325d4efff1015f636c5ddb Add a mode where the system automatically sets the parallelism to the available task slots -- Key: FLINK-1648 URL: https://issues.apache.org/jira/browse/FLINK-1648 Project: Flink Issue Type: New Feature Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 This is basically a port of this code from the 0.8 release: https://github.com/apache/flink/pull/410 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent
[ https://issues.apache.org/jira/browse/FLINK-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350469#comment-14350469 ] Ufuk Celebi commented on FLINK-1658: I agree. Please hold off on the renaming until the blocking-result PR is merged. Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent -- Key: FLINK-1658 URL: https://issues.apache.org/jira/browse/FLINK-1658 Project: Flink Issue Type: Improvement Components: Distributed Runtime, Local Runtime Reporter: Gyula Fora Priority: Trivial The same name is used for different event classes in the runtime, which can cause confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1656) Fix ForwardedField documentation for operators with iterator input
Fabian Hueske created FLINK-1656: Summary: Fix ForwardedField documentation for operators with iterator input Key: FLINK-1656 URL: https://issues.apache.org/jira/browse/FLINK-1656 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Critical

The documentation of ForwardedFields is incomplete for operators with iterator inputs (GroupReduce, CoGroup). This should be fixed ASAP, because it can lead to incorrect program execution. The conditions for forwarded fields on operators with iterator input are:
1) Forwarded fields must be emitted in the order in which they are received through the iterator.
2) All forwarded fields of a record must stick together, i.e., if your function builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of the 2nd, 4th, ... record coming through the iterator, these are not valid forwarded fields.
3) It is OK to completely filter out records coming through the iterator.
The reason for these conditions is that the optimizer uses forwarded fields to reason about physical data properties such as order and grouping. Mixing up the order of records, or emitting records which are composed from different input records, might destroy a (secondary) order or grouping.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
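Conditions 1 and 3 above can be illustrated with a small plain-Java sketch (deliberately independent of Flink's annotation API; all names here are illustrative): an iterator-input function may drop values of a field it declares forwarded, but the values it does emit must appear in arrival order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ForwardedFieldOrder {

    // Valid per conditions 1 and 3: values of the forwarded field are
    // emitted in the order they arrive; dropping some of them is allowed.
    static List<Integer> orderPreserving(Iterator<Integer> values) {
        List<Integer> out = new ArrayList<>();
        while (values.hasNext()) {
            int v = values.next();
            if (v % 2 == 0) { // filtering is fine
                out.add(v);
            }
        }
        return out;
    }

    // Invalid per condition 1: values are re-ordered, so a (secondary)
    // sort order the optimizer assumed on this field would be destroyed.
    static List<Integer> orderBreaking(Iterator<Integer> values) {
        List<Integer> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(0, values.next()); // prepending reverses the input order
        }
        return out;
    }
}
```

A function shaped like orderBreaking must not declare the field forwarded, even though every input value reappears in the output.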
[jira] [Created] (FLINK-1657) Make count windows local automatically
Gyula Fora created FLINK-1657: - Summary: Make count windows local automatically Key: FLINK-1657 URL: https://issues.apache.org/jira/browse/FLINK-1657 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Count windows should be automatically discretized in parallel, as this doesn't break the assumptions on the window: ds.window(Count.of(10)) is equivalent to ds.window(Count.of(10)).local() if we don't make ordering guarantees, and the second version is distributed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
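The equivalence claimed here can be sketched in plain Java (illustrative names, not the Flink streaming API): count-based discretization depends only on how many elements have arrived, not on which ones, so applying it independently per parallel partition still yields windows of the requested size; only the (unguaranteed) assignment of elements to windows changes.

```java
import java.util.ArrayList;
import java.util.List;

public class LocalCountWindows {

    // Cuts a sequence into consecutive count windows of size n;
    // the trailing window may hold fewer elements.
    static List<List<Integer>> countWindows(List<Integer> in, int n) {
        List<List<Integer>> windows = new ArrayList<>();
        for (int i = 0; i < in.size(); i += n) {
            windows.add(new ArrayList<>(in.subList(i, Math.min(i + n, in.size()))));
        }
        return windows;
    }

    // Round-robin partitioning: each partition can then discretize by
    // count on its own, with no coordination between partitions.
    static List<List<Integer>> partition(List<Integer> in, int parallelism) {
        List<List<Integer>> parts = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < in.size(); i++) {
            parts.get(i % parallelism).add(in.get(i));
        }
        return parts;
    }
}
```

Running countWindows on each result of partition produces windows of the same size as the global version; which elements share a window differs, which is exactly why the equivalence only holds without ordering guarantees.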
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user fpompermaier commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-77586066 That would be awesome :) I think you could talk with Markus about the Dopa scheduler. It's probably a closed project, but it could be a source of input for creating a ticket for contributors who want to implement that!
[jira] [Commented] (FLINK-1536) GSoC project: Graph partitioning operators for Gelly
[ https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350558#comment-14350558 ] Vasia Kalavri commented on FLINK-1536: -- Hi [~ayd]! As a first step, it would be nice to get familiar with Flink and Gelly. I would suggest you take a look at the [Flink documentation | http://ci.apache.org/projects/flink/flink-docs-release-0.8/] and the [Gelly guide | http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html]. Also, go through the [Flink examples | http://github.com/apache/flink/tree/master/flink-examples] and the [Gelly examples | http://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example]. Once you're confident enough, you can pick one of the [starter issues | http://issues.apache.org/jira/browse/FLINK-992?jql=project%20%3D%20FLINK%20AND%20labels%20%3D%20starter] and try to make your first contribution to Flink :-) For GSoC, you will have to write up a proposal for a project. If you're interested in working on Gelly, I can help you with that! Make sure you subscribe to the mailing lists and let us know if you have any questions! -Vasia. GSoC project: Graph partitioning operators for Gelly Key: FLINK-1536 URL: https://issues.apache.org/jira/browse/FLINK-1536 Project: Flink Issue Type: New Feature Components: Gelly, Java API Reporter: Vasia Kalavri Priority: Minor Labels: graph, gsoc2015, java Smart graph partitioning can significantly improve the performance and scalability of graph analysis applications. Depending on the computation pattern, a graph partitioning algorithm divides the graph into (maybe overlapping) subgraphs, optimizing some objective. For example, if communication is performed across graph edges, one might want to minimize the edges that cross from one partition to another. The problem of graph partitioning is a well studied problem and several algorithms have been proposed in the literature. 
The goal of this project would be to choose a few existing partitioning techniques and implement the corresponding graph partitioning operators for Gelly. Some related literature can be found [here| http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user fpompermaier commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-77533354 That's true, but what if there aren't enough resources? Is there any policy to retry the job submission automatically or give priority to waiting/queued ones?
[GitHub] flink pull request: [streaming] [wip] Fault tolerance prototype (f...
Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/459#issuecomment-77538649 hold on, will commit an update very soon. :P
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-77530332 Hey, Flink already supports running multiple jobs in parallel. If you have 50 slots available, you can run two jobs requiring 25 slots each. The webfrontend is not really able to properly report the status of concurrent jobs, but that's only a visualization issue.
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user fpompermaier commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-77537609 I know that in Stratosphere there was an effort to write a job scheduler, do you think that such a thing could be valuable for the future or are you going to rely only on Hadoop-ecosystem stuff (like Oozie or Falcon upon YARN)?
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-77536080 This is not supported yet. The easiest way to execute multiple jobs concurrently is to start each job in a separate Flink cluster running on YARN. On Fri, Mar 6, 2015 at 10:52 AM, Flavio Pompermaier notificati...@github.com wrote: That's true, but what if there aren't enough resources? Is there any policy to retry the job submission automatically or give priority to waiting/queued ones? Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/410#issuecomment-77533354.
[jira] [Commented] (FLINK-1656) Fix ForwardedField documentation for operators with iterator input
[ https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350263#comment-14350263 ] Fabian Hueske commented on FLINK-1656: -- Right, that's a good point. +1 for limiting to key fields. That's much easier for users to reason about. However, I am not sure how it is implemented right now. I guess secondary sort info is already removed by the property filtering, but I need to verify that. Fix ForwardedField documentation for operators with iterator input -- Key: FLINK-1656 URL: https://issues.apache.org/jira/browse/FLINK-1656 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Critical The documentation of ForwardedFields is incomplete for operators with iterator inputs (GroupReduce, CoGroup). This should be fixed ASAP, because it can lead to incorrect program execution. The conditions for forwarded fields on operators with iterator input are: 1) forwarded fields must be emitted in the order in which they are received through the iterator 2) all forwarded fields of a record must stick together, i.e., if your function builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of the 2nd, 4th, ... record coming through the iterator, these are not valid forwarded fields. 3) it is OK to completely filter out records coming through the iterator. The reason for these conditions is that the optimizer uses forwarded fields to reason about physical data properties such as order and grouping. Mixing up the order of records, or emitting records which are composed from different input records, might destroy a (secondary) order or grouping. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1629) Add option to start Flink on YARN in a detached mode
[ https://issues.apache.org/jira/browse/FLINK-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350926#comment-14350926 ] Robert Metzger commented on FLINK-1629: --- For everyone facing issues with stopping a detached Flink-YARN cluster: there is a bug in Hadoop affecting Ubuntu/Debian (and probably many newer distributions): https://issues.apache.org/jira/browse/HADOOP-9752 That is the reason why YARN cannot kill running containers. Add option to start Flink on YARN in a detached mode Key: FLINK-1629 URL: https://issues.apache.org/jira/browse/FLINK-1629 Project: Flink Issue Type: Improvement Components: YARN Client Reporter: Robert Metzger Assignee: Robert Metzger Right now, we expect the YARN command line interface to be connected with the Application Master all the time to control the YARN session or the job. For very long-running sessions or jobs, users want to just fire and forget a job/session to YARN. Stopping the session will still be possible using YARN's tools. Also, prior to detaching itself, the CLI frontend could print the required command to kill the session as a convenience. -- This message was sent by Atlassian JIRA (v6.3.4#6332)