[jira] [Assigned] (FLINK-1651) Running mvn test got stuck

2015-03-06 Thread Henry Saputra (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Saputra reassigned FLINK-1651:


Assignee: Henry Saputra

 Running mvn test got stuck
 --

 Key: FLINK-1651
 URL: https://issues.apache.org/jira/browse/FLINK-1651
 Project: Flink
  Issue Type: Bug
  Components: test
Reporter: Henry Saputra
Assignee: Henry Saputra
Priority: Minor

 My test run keeps getting stuck at this state:
 ...
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec - 
 in org.apache.flink.runtime.types.TypeTest
 Running org.apache.flink.runtime.util.AtomicDisposableReferenceCounterTest
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.561 sec - 
 in org.apache.flink.runtime.util.AtomicDisposableReferenceCounterTest
 Running org.apache.flink.runtime.util.DataInputOutputSerializerTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.848 sec - 
 in org.apache.flink.runtime.operators.DataSourceTaskTest
 Running org.apache.flink.runtime.util.DelegatingConfigurationTest
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec - 
 in org.apache.flink.runtime.util.DelegatingConfigurationTest
 Running org.apache.flink.runtime.util.EnvironmentInformationTest
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.563 sec - 
 in org.apache.flink.runtime.io.network.serialization.LargeRecordsTest
 Running org.apache.flink.runtime.util.event.TaskEventHandlerTest
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 sec - 
 in org.apache.flink.runtime.util.event.TaskEventHandlerTest
 Running org.apache.flink.runtime.util.LRUCacheMapTest
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec - 
 in org.apache.flink.runtime.util.LRUCacheMapTest
 Running org.apache.flink.runtime.util.MathUtilTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec - 
 in org.apache.flink.runtime.util.MathUtilTest
 Running org.apache.flink.runtime.util.NonReusingKeyGroupedIteratorTest
 Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.064 sec - 
 in org.apache.flink.runtime.util.NonReusingKeyGroupedIteratorTest
 Running org.apache.flink.runtime.util.ReusingKeyGroupedIteratorTest
 Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec - 
 in org.apache.flink.runtime.util.ReusingKeyGroupedIteratorTest
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.238 sec - 
 in org.apache.flink.runtime.taskmanager.TaskManagerTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.616 sec - 
 in org.apache.flink.runtime.profiling.impl.InstanceProfilerTest
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.303 sec - 
 in org.apache.flink.runtime.util.DataInputOutputSerializerTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.488 sec - 
 in org.apache.flink.runtime.util.EnvironmentInformationTest
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.81 sec - in 
 org.apache.flink.runtime.taskmanager.TaskManagerProcessReapingTest
 Tests run: 46, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.653 sec - 
 in org.apache.flink.runtime.operators.MatchTaskTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.071 sec - 
 in org.apache.flink.runtime.operators.sort.LargeRecordHandlerTest
 Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.534 sec - 
 in org.apache.flink.runtime.operators.DataSinkTaskTest
 Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.98 sec - 
 in org.apache.flink.runtime.operators.sort.NormalizedKeySorterTest
 Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.017 sec - 
 in org.apache.flink.runtime.io.disk.ChannelViewsTest
 After this, nothing seems to happen and the program just hangs.
 I am using Mac OS X with Java version 1.7.0_71.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Mesos integration of Apache Flink

2015-03-06 Thread omnisis
Github user omnisis commented on the pull request:

https://github.com/apache/flink/pull/251#issuecomment-77671415
  
What's the current status of this ticket? I'm starting to check out Flink, and 
native Mesos support would be a huge win for my use case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-1659) Rename classes and packages that contains Pact

2015-03-06 Thread Henry Saputra (JIRA)
Henry Saputra created FLINK-1659:


 Summary: Rename classes and packages that contains Pact
 Key: FLINK-1659
 URL: https://issues.apache.org/jira/browse/FLINK-1659
 Project: Flink
  Issue Type: Task
Reporter: Henry Saputra
Priority: Minor


We have several class names that contain or start with Pact.

Pact is the previous term for Flink's data model and user-defined 
functions/operators.





[GitHub] flink pull request: Fix checking null for ternary operator check o...

2015-03-06 Thread hsaputra
GitHub user hsaputra opened a pull request:

https://github.com/apache/flink/pull/461

Fix checking null for ternary operator check on Exception#getMessage calls

Add parentheses around the ternary check on Exception#getMessage calls, 
changing the pattern:
 
"Initializing the input processing failed" + e.getMessage() == null ? "." : ": " + e.getMessage()

to:

"Initializing the input processing failed" + (e.getMessage() == null ? "." : ": " + e.getMessage())

The extra parentheses are needed so that the ternary operator's null check 
applies only to the e.getMessage() call. Without them, + binds tighter than 
==, so the whole concatenation is compared with null.
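
A minimal, self-contained sketch of the precedence issue described above. The class and method names are hypothetical and not taken from the pull request, and the string literals are an assumption about how the quoted pattern looks in source form.

```java
final class TernaryPrecedence {

    // Assumed message prefix, standing in for the text in the pattern above.
    private static final String PREFIX = "Initializing the input processing failed";

    // Without parentheses this parses as ((PREFIX + e.getMessage()) == null) ? "." : (": " + e.getMessage()).
    // A string concatenation is never null, so the prefix is always dropped.
    static String withoutParentheses(Exception e) {
        return PREFIX + e.getMessage() == null ? "." : ": " + e.getMessage();
    }

    // With parentheses the null check applies only to e.getMessage().
    static String withParentheses(Exception e) {
        return PREFIX + (e.getMessage() == null ? "." : ": " + e.getMessage());
    }

    public static void main(String[] args) {
        Exception withMessage = new RuntimeException("boom");
        Exception withoutMessage = new RuntimeException(); // getMessage() returns null

        System.out.println(withoutParentheses(withMessage));    // prints ": boom"
        System.out.println(withoutParentheses(withoutMessage)); // prints ": null"
        System.out.println(withParentheses(withMessage));       // prints "Initializing the input processing failed: boom"
        System.out.println(withParentheses(withoutMessage));    // prints "Initializing the input processing failed."
    }
}
```

The unparenthesized form compiles without error, which is why this kind of bug is easy to miss in review.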

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsaputra/flink fix_parentheses_exception_getmessage

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #461


commit 1345cbf38cdbf849bdc7c0ff7e29d02fa00bc8fa
Author: Henry Saputra henry.sapu...@gmail.com
Date:   2015-03-07T01:50:57Z

Fix checking null for Exception#getMessage call, changing the pattern:

"Initializing the input processing failed" + e.getMessage() == null ? "." : ": " + e.getMessage()

to:

"Initializing the input processing failed" + (e.getMessage() == null ? "." : ": " + e.getMessage())

The extra parentheses are needed so that the ternary operator's null check 
applies only to the e.getMessage() call.






[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)

2015-03-06 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/410#issuecomment-77569539
  
I think it would definitely be good to have something like a job submission
queue that accepts jobs and executes them as soon as enough resources
become available.
That should not be too hard to do.
Simple dependencies could also be checked, such as executing job Y only if
job X completed successfully.

However, I am not aware of any effort in that direction.
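
The queue idea in this comment can be illustrated with a small sketch. This is purely hypothetical plain Java: it is not Flink code, and the class name, method names, and the strict first-in-first-out policy are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: jobs wait in FIFO order until enough slots are free.
final class SlotQueueSketch {

    private static final class Job {
        final String name;
        final int slots;

        Job(String name, int slots) {
            this.name = name;
            this.slots = slots;
        }
    }

    private final Deque<Job> waiting = new ArrayDeque<>();
    private final List<String> started = new ArrayList<>();
    private int freeSlots;

    SlotQueueSketch(int totalSlots) {
        this.freeSlots = totalSlots;
    }

    // Accept a job; it starts immediately if it fits, otherwise it waits.
    void submit(String name, int slots) {
        waiting.addLast(new Job(name, slots));
        startWaitingJobs();
    }

    // Called when a running job finishes and gives its slots back.
    void release(int slots) {
        freeSlots += slots;
        startWaitingJobs();
    }

    // Start jobs from the head of the queue while the head fits.
    private void startWaitingJobs() {
        while (!waiting.isEmpty() && waiting.peekFirst().slots <= freeSlots) {
            Job job = waiting.removeFirst();
            freeSlots -= job.slots;
            started.add(job.name);
        }
    }

    List<String> started() {
        return started;
    }
}
```

With 50 slots and three submitted jobs of 25 slots each, the first two start right away and the third starts once 25 slots are released. A strict FIFO queue like this lets a large job at the head block smaller jobs behind it; whether that is acceptable would be a design decision.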

2015-03-06 11:26 GMT+01:00 Flavio Pompermaier notificati...@github.com:

 I know that in Stratosphere there was an effort to write a job scheduler.
 Do you think that such a thing could be valuable for the future, or are you
 going to rely only on Hadoop ecosystem stuff (like Oozie or Falcon on
 YARN)?







[jira] [Created] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent

2015-03-06 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1658:
-

 Summary: Rename AbstractEvent to AbstractTaskEvent and 
AbstractJobEvent
 Key: FLINK-1658
 URL: https://issues.apache.org/jira/browse/FLINK-1658
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Local Runtime
Reporter: Gyula Fora
Priority: Trivial


The same name is used for different event classes in the runtime which can 
cause confusion.





[jira] [Resolved] (FLINK-1648) Add a mode where the system automatically sets the parallelism to the available task slots

2015-03-06 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1648.
-
Resolution: Implemented

Implemented in d8d642fd6d7d9b8526325d4efff1015f636c5ddb

 Add a mode where the system automatically sets the parallelism to the 
 available task slots
 --

 Key: FLINK-1648
 URL: https://issues.apache.org/jira/browse/FLINK-1648
 Project: Flink
  Issue Type: New Feature
  Components: JobManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9


 This is basically a port of this code from the 0.8 release:
 https://github.com/apache/flink/pull/410





[jira] [Commented] (FLINK-1658) Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent

2015-03-06 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350469#comment-14350469
 ] 

Ufuk Celebi commented on FLINK-1658:


I agree. Please hold off on the renaming until the blocking result PR is 
merged.

 Rename AbstractEvent to AbstractTaskEvent and AbstractJobEvent
 --

 Key: FLINK-1658
 URL: https://issues.apache.org/jira/browse/FLINK-1658
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Local Runtime
Reporter: Gyula Fora
Priority: Trivial

 The same name is used for different event classes in the runtime which can 
 cause confusion.





[jira] [Created] (FLINK-1656) Fix ForwardedField documentation for operators with iterator input

2015-03-06 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1656:


 Summary: Fix ForwardedField documentation for operators with 
iterator input
 Key: FLINK-1656
 URL: https://issues.apache.org/jira/browse/FLINK-1656
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Critical


The documentation of ForwardedFields is incomplete for operators with iterator 
inputs (GroupReduce, CoGroup). 
This should be fixed ASAP, because it can lead to incorrect program execution.

The conditions for forwarded fields on operators with iterator input are:

1) forwarded fields must be emitted in the order in which they are received 
through the iterator
2) all forwarded fields of a record must stick together, i.e., if your function 
builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of the 2nd, 
4th, ... record coming through the iterator, these are not valid forwarded 
fields.
3) it is OK to completely filter out records coming through the iterator.

The reason for these conditions is that the optimizer uses forwarded fields to 
reason about physical data properties such as order and grouping. Mixing up the 
order of records, or emitting records that are composed from different input 
records, might destroy a (secondary) order or grouping.
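
The three conditions can be illustrated with a plain-Java sketch that does not use the Flink API. Records are modelled as two-element int arrays, and the class and method names are hypothetical; the sketch only shows which access patterns over an iterator would and would not satisfy the conditions above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: records are int[2] arrays holding {field 0, field 1}.
final class ForwardedFieldsSketch {

    // Satisfies the conditions for field 0: it is copied unchanged, output
    // order follows iterator order (condition 1), and some records are
    // dropped entirely (condition 3). Field 1 is modified, so only field 0
    // could be declared as forwarded.
    static List<int[]> filterInOrder(Iterator<int[]> records) {
        List<int[]> out = new ArrayList<>();
        while (records.hasNext()) {
            int[] r = records.next();
            if (r[1] >= 0) {
                out.add(new int[] {r[0], r[1] * 2});
            }
        }
        return out;
    }

    // Violates condition 2: each output record takes field 0 from the 1st,
    // 3rd, ... record and field 1 from the 2nd, 4th, ... record, so the
    // fields of one output record come from different input records.
    static List<int[]> mixNeighbours(Iterator<int[]> records) {
        List<int[]> out = new ArrayList<>();
        while (records.hasNext()) {
            int[] first = records.next();
            if (!records.hasNext()) {
                break;
            }
            int[] second = records.next();
            out.add(new int[] {first[0], second[1]});
        }
        return out;
    }
}
```

In the second method neither field could safely be declared as forwarded under the conditions above, even though each value is copied unchanged.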





[jira] [Created] (FLINK-1657) Make count windows local automatically

2015-03-06 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1657:
-

 Summary: Make count windows local automatically
 Key: FLINK-1657
 URL: https://issues.apache.org/jira/browse/FLINK-1657
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora


Count windows should be automatically discretized in parallel, as this doesn't 
break the assumptions on the window:

ds.window(Count.of(10)) is equivalent to ds.window(Count.of(10)).local() if we 
don't make ordering guarantees, and the second version is distributed.





[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)

2015-03-06 Thread fpompermaier
Github user fpompermaier commented on the pull request:

https://github.com/apache/flink/pull/410#issuecomment-77586066
  
That would be awesome :)
I think you could talk with Markus about the Dopa scheduler. It's probably 
a closed project, but it could be a source of input for creating a ticket for 
contributors who want to implement that!




[jira] [Commented] (FLINK-1536) GSoC project: Graph partitioning operators for Gelly

2015-03-06 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350558#comment-14350558
 ] 

Vasia Kalavri commented on FLINK-1536:
--

Hi [~ayd]!

As a first step, it would be nice to get familiar with Flink and Gelly. I would 
suggest you take a look at the [Flink documentation | 
http://ci.apache.org/projects/flink/flink-docs-release-0.8/] and the [Gelly 
guide | 
http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html]. Also, 
go through the [Flink examples | 
http://github.com/apache/flink/tree/master/flink-examples] and the [Gelly 
examples | 
http://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example].
 Once you're confident enough, you can pick one of the [starter issues | 
http://issues.apache.org/jira/browse/FLINK-992?jql=project%20%3D%20FLINK%20AND%20labels%20%3D%20starter]
 and try to make your first contribution to Flink :-)

For GSoC, you will have to write up a proposal for a project. If you're 
interested in working on Gelly, I can help you with that!
Make sure you subscribe to the mailing lists and let us know if you have any 
questions!

-Vasia.

 GSoC project: Graph partitioning operators for Gelly
 

 Key: FLINK-1536
 URL: https://issues.apache.org/jira/browse/FLINK-1536
 Project: Flink
  Issue Type: New Feature
  Components: Gelly, Java API
Reporter: Vasia Kalavri
Priority: Minor
  Labels: graph, gsoc2015, java

 Smart graph partitioning can significantly improve the performance and 
 scalability of graph analysis applications. Depending on the computation 
 pattern, a graph partitioning algorithm divides the graph into (maybe 
 overlapping) subgraphs, optimizing some objective. For example, if 
 communication is performed across graph edges, one might want to minimize the 
 edges that cross from one partition to another.
 Graph partitioning is a well-studied problem and several 
 algorithms have been proposed in the literature. The goal of this project 
 would be to choose a few existing partitioning techniques and implement the 
 corresponding graph partitioning operators for Gelly.
 Some related literature can be found [here| 
 http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning].





[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)

2015-03-06 Thread fpompermaier
Github user fpompermaier commented on the pull request:

https://github.com/apache/flink/pull/410#issuecomment-77533354
  
That's true, but what if there aren't enough resources? Is there any policy 
to retry the job submission automatically or give priority to waiting/queued 
ones?




[GitHub] flink pull request: [streaming] [wip] Fault tolerance prototype (f...

2015-03-06 Thread senorcarbone
Github user senorcarbone commented on the pull request:

https://github.com/apache/flink/pull/459#issuecomment-77538649
  
hold on, will commit an update very soon. :P




[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)

2015-03-06 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/410#issuecomment-77530332
  
Hey,
Flink already supports running multiple jobs in parallel.
If you have 50 slots available, you can run two jobs requiring 25 slots.
The web frontend is not really able to properly report the status of 
concurrent jobs, but that's only a visualization issue.




[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)

2015-03-06 Thread fpompermaier
Github user fpompermaier commented on the pull request:

https://github.com/apache/flink/pull/410#issuecomment-77537609
  
I know that in Stratosphere there was an effort to write a job scheduler. 
Do you think that such a thing could be valuable for the future, or are you 
going to rely only on Hadoop ecosystem stuff (like Oozie or Falcon on YARN)?




[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)

2015-03-06 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/410#issuecomment-77536080
  
At the moment, this is not supported yet. The easiest way to execute
multiple jobs concurrently is to start each job in a separate Flink cluster
running on YARN.

On Fri, Mar 6, 2015 at 10:52 AM, Flavio Pompermaier 
notificati...@github.com wrote:

 That's true, but what if there aren't enough resources? Is there any policy
 to retry the job submission automatically or give priority to
 waiting/queued ones?







[jira] [Commented] (FLINK-1656) Fix ForwardedField documentation for operators with iterator input

2015-03-06 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350263#comment-14350263
 ] 

Fabian Hueske commented on FLINK-1656:
--

Right, that's a good point.
+1 for limiting to key fields. That's much easier for users to reason about.

However, I am not sure how it is implemented right now. 
I guess secondary sort info is already removed by the property filtering, but I 
need to verify that.

 Fix ForwardedField documentation for operators with iterator input
 --

 Key: FLINK-1656
 URL: https://issues.apache.org/jira/browse/FLINK-1656
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Critical

 The documentation of ForwardedFields is incomplete for operators with 
 iterator inputs (GroupReduce, CoGroup). 
 This should be fixed ASAP, because it can lead to incorrect program execution.
 The conditions for forwarded fields on operators with iterator input are:
 1) forwarded fields must be emitted in the order in which they are received 
 through the iterator
 2) all forwarded fields of a record must stick together, i.e., if your 
 function builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of 
 the 2nd, 4th, ... record coming through the iterator, these are not valid 
 forwarded fields.
 3) it is OK to completely filter out records coming through the iterator.
 The reason for these conditions is that the optimizer uses forwarded fields 
 to reason about physical data properties such as order and grouping. Mixing 
 up the order of records, or emitting records that are composed from different 
 input records, might destroy a (secondary) order or grouping.





[jira] [Commented] (FLINK-1629) Add option to start Flink on YARN in a detached mode

2015-03-06 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350926#comment-14350926
 ] 

Robert Metzger commented on FLINK-1629:
---

For anyone having trouble stopping a detached Flink-YARN cluster: 
there is a bug in Hadoop affecting Ubuntu/Debian (and probably many newer 
distributions): https://issues.apache.org/jira/browse/HADOOP-9752
That is the reason why YARN cannot kill running containers.

 Add option to start Flink on YARN in a detached mode
 

 Key: FLINK-1629
 URL: https://issues.apache.org/jira/browse/FLINK-1629
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger

 Right now, we expect the YARN command line interface to be connected with the 
 Application Master all the time to control the yarn session or the job.
 For very long-running sessions or jobs, users want to just fire and forget a 
 job/session on YARN.
 Stopping the session will still be possible using YARN's tools.
 Also, prior to detaching itself, the CLI frontend could print the required 
 command to kill the session as a convenience.


