[GitHub] flink pull request: [FLINK-2691] fix broken links to Python script...

2015-09-17 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1140#issuecomment-140979385
  
Nice catch! Looks good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2691) Broken links to Python script on QuickStart doc

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791638#comment-14791638
 ] 

ASF GitHub Bot commented on FLINK-2691:
---

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1140#issuecomment-140979385
  
Nice catch! Looks good to merge.


> Broken links to Python script on QuickStart doc
> ---
>
> Key: FLINK-2691
> URL: https://issues.apache.org/jira/browse/FLINK-2691
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.9
>Reporter: Felix Cheung
>Priority: Minor
>
> Links to plotPoints.py are broken on 
> https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/run_example_quickstart.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2591] Add configuration parameter for d...

2015-09-17 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1121#issuecomment-141016403
  
Sorry for the late response, I didn't see this new pull request.

The failed test is okay; that test is known to be unstable.




[jira] [Commented] (FLINK-2591) Add configuration parameter for default number of yarn containers

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791811#comment-14791811
 ] 

ASF GitHub Bot commented on FLINK-2591:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1121#issuecomment-141016403
  
Sorry for the late response, I didn't see this new pull request.

The failed test is okay; that test is known to be unstable.


> Add configuration parameter for default number of yarn containers
> -
>
> Key: FLINK-2591
> URL: https://issues.apache.org/jira/browse/FLINK-2591
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN Client
>Reporter: Robert Metzger
>Assignee: Will Miao
>Priority: Minor
>  Labels: starter
>
> A user complained about the requirement to always specify the number of yarn 
> containers (-n) when starting a job.
> Adding a configuration value with a default value will allow users to set a 
> default ;)





[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791821#comment-14791821
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141018736
  
Very nice.
I tried the new interface locally, and it seems to work.

I suspect these values are still test values? http://i.imgur.com/d9ZRR7g.png
There is certainly more work to do until the new web interface has the same 
features as the old interface (TaskManager overview / monitoring).


> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to rework the JobManager Web Frontend.
> The current web frontend is limited and has a number of design issues:
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs
>   - It has no graph representation of the data flows
>   - It does not allow looking into execution attempts
>   - It has no hook to deal with the upcoming live accumulators
>   - The architecture is not very modular/extensible
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using REST-style URLs for jobs and vertices
>   - Integrating the D3 graph renderer of the previews with the runtime monitor
>   - With details on execution attempts
>   - First-class visualization of records processed and bytes processed





[GitHub] flink pull request: [FLINK-2637] [api-breaking] [scala, types] Add...

2015-09-17 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1134#discussion_r39725082
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/typeinfo/IntegerTypeInfo.java
 ---
@@ -18,15 +18,33 @@
 
 package org.apache.flink.api.common.typeinfo;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Sets;
 import org.apache.flink.api.common.typeutils.TypeComparator;
 import org.apache.flink.api.common.typeutils.TypeSerializer;
 
+import java.util.Set;
+
 /**
- * Type information for numeric primitive types (int, long, double, byte, ...).
+ * Type information for numeric integer primitive types: int, long, byte, short, boolean, character.
  */
 public class IntegerTypeInfo<T> extends NumericTypeInfo<T> {
 
+   private static final long serialVersionUID = -8068827354966766955L;
+
+   private static final Set<Class<?>> integerTypes = Sets.newHashSet(
+   Integer.class,
+   Long.class,
+   Byte.class,
+   Short.class,
+   Boolean.class,
--- End diff --

Nope, I missed that. Will remove it.




[jira] [Commented] (FLINK-2591) Add configuration parameter for default number of yarn containers

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791826#comment-14791826
 ] 

ASF GitHub Bot commented on FLINK-2591:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/1121#discussion_r39725238
  
--- Diff: 
flink-yarn-tests/src/main/java/org/apache/flink/yarn/YARNSessionFIFOITCase.java 
---
@@ -111,6 +111,20 @@ public void testClientStartup() {
}
 
/**
+* Test configuration parameter for default number of yarn containers
+*/
+   @Test
+   public void testDefaultNumberOfTaskManagers() {
+   LOG.info("Starting testDefaultNumberOfTaskManagers()");
--- End diff --

I would generate a flink-conf.yaml just for this test to test the behavior
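The suggested approach could be sketched as below. This is only an illustration: the key name `yarn.containers.default` is a placeholder assumption, not necessarily the key the PR introduces, and the class is not Flink test code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/**
 * Sketch of generating a one-off flink-conf.yaml for a single test.
 * The key "yarn.containers.default" is a placeholder, not Flink's actual key.
 */
public class TestConfGenerator {

    /** Writes a minimal flink-conf.yaml into the given directory. */
    public static Path writeConf(Path dir, int defaultContainers) throws IOException {
        Path conf = dir.resolve("flink-conf.yaml");
        // Flink's YAML config is flat: one "key: value" pair per line
        return Files.write(conf, Arrays.asList(
                "jobmanager.rpc.address: localhost",
                "yarn.containers.default: " + defaultContainers));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("flink-conf-test");
        Path conf = writeConf(dir, 2);
        System.out.println(Files.readAllLines(conf).size()); // 2 lines written
    }
}
```

The test would then point the YARN session at this directory instead of the shared test configuration.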


> Add configuration parameter for default number of yarn containers
> -
>
> Key: FLINK-2591
> URL: https://issues.apache.org/jira/browse/FLINK-2591
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN Client
>Reporter: Robert Metzger
>Assignee: Will Miao
>Priority: Minor
>  Labels: starter
>
> A user complained about the requirement to always specify the number of yarn 
> containers (-n) when starting a job.
> Adding a configuration value with a default value will allow users to set a 
> default ;)





[jira] [Commented] (FLINK-2591) Add configuration parameter for default number of yarn containers

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791828#comment-14791828
 ] 

ASF GitHub Bot commented on FLINK-2591:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1121#issuecomment-141019208
  
For others who follow this PR, there was already some discussion on this 
change here: https://github.com/apache/flink/pull/1107


> Add configuration parameter for default number of yarn containers
> -
>
> Key: FLINK-2591
> URL: https://issues.apache.org/jira/browse/FLINK-2591
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN Client
>Reporter: Robert Metzger
>Assignee: Will Miao
>Priority: Minor
>  Labels: starter
>
> A user complained about the requirement to always specify the number of yarn 
> containers (-n) when starting a job.
> Adding a configuration value with a default value will allow users to set a 
> default ;)





[GitHub] flink pull request: [FLINK-2591] Add configuration parameter for d...

2015-09-17 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1121#issuecomment-141019208
  
For others who follow this PR, there was already some discussion on this 
change here: https://github.com/apache/flink/pull/1107




[GitHub] flink pull request: [FLINK-2691] fix broken links to Python script...

2015-09-17 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1140#issuecomment-141021568
  
Thanks for the patch! 
Will merge this.




[jira] [Commented] (FLINK-2691) Broken links to Python script on QuickStart doc

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791832#comment-14791832
 ] 

ASF GitHub Bot commented on FLINK-2691:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1140#issuecomment-141021568
  
Thanks for the patch! 
Will merge this.


> Broken links to Python script on QuickStart doc
> ---
>
> Key: FLINK-2691
> URL: https://issues.apache.org/jira/browse/FLINK-2691
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.9
>Reporter: Felix Cheung
>Priority: Minor
>
> Links to plotPoints.py are broken on 
> https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/run_example_quickstart.html





[jira] [Commented] (FLINK-2684) Make TypeInformation non-serializable again by removing Table API's need for it

2015-09-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802692#comment-14802692
 ] 

Till Rohrmann commented on FLINK-2684:
--

Yes, I think this is the better solution here. In fact, I already implemented 
this solution and it seems to work.

> Make TypeInformation non-serializable again by removing Table API's need for 
> it
> ---
>
> Key: FLINK-2684
> URL: https://issues.apache.org/jira/browse/FLINK-2684
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Till Rohrmann
>Priority: Minor
>
> Currently, the {{TypeInformations}} must be serializable because they are 
> shipped with UDFs of the Table API to the TMs. There the Table API code is 
> generated and compiled. By generating the code on the client and shipping the 
> code as strings, we could get rid of this dependency. As a consequence, the 
> {{TypeInformations}} can be non-serializable again, as they were intended to 
> be.
> Maybe [~aljoscha] can provide some more implementation details here.





[jira] [Commented] (FLINK-2690) CsvInputFormat cannot find the field of derived POJO class

2015-09-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802695#comment-14802695
 ] 

Till Rohrmann commented on FLINK-2690:
--

Properly extracting all declared fields in the {{CsvInputFormat}} also seems to 
solve the problem. I have code which does this and will open a PR for it.
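The hierarchy walk described here can be sketched in plain reflection. The class and method names below are illustrative, not Flink's actual code:

```java
import java.lang.reflect.Field;

/**
 * Sketch of looking up a field across a class hierarchy. A plain
 * getDeclaredField call only sees fields declared on the class itself,
 * which is the root cause of the bug described above.
 */
public class FieldLookup {

    /** Walks the class and its superclasses until the field is found. */
    public static Field findField(Class<?> clazz, String name) throws NoSuchFieldException {
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name); // only fields declared on c itself
            } catch (NoSuchFieldException e) {
                // not declared here; continue with the superclass
            }
        }
        throw new NoSuchFieldException(name);
    }

    static class BasePojo { public int id; }
    static class DerivedPojo extends BasePojo { public String name; }

    public static void main(String[] args) throws Exception {
        // getDeclaredField on DerivedPojo alone would throw for "id"
        System.out.println(findField(DerivedPojo.class, "id").getDeclaringClass().getSimpleName());
        System.out.println(findField(DerivedPojo.class, "name").getDeclaringClass().getSimpleName());
    }
}
```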

> CsvInputFormat cannot find the field of derived POJO class
> --
>
> Key: FLINK-2690
> URL: https://issues.apache.org/jira/browse/FLINK-2690
> Project: Flink
>  Issue Type: Bug
>  Components: Java API, Scala API
>Affects Versions: 0.10
>Reporter: Chiwan Park
>Assignee: Chiwan Park
>
> A user reports that {{CsvInputFormat}} cannot find the fields of a derived POJO 
> class. 
> (http://mail-archives.apache.org/mod_mbox/flink-user/201509.mbox/%3ccaj54yvi6cbldn7cypey+xe8a5a_j1-6tnx1wm1eb63gvnqd...@mail.gmail.com%3e)
> The reason for the bug is that {{CsvInputFormat}} uses the {{getDeclaredField}} 
> method without scanning base classes to find the field. When 
> {{CsvInputFormat}} was written, {{TypeInformation}} could not be serialized, so we 
> needed to initialize the {{TypeInformation}} manually in the {{open}} method. A 
> mistake in this initialization causes the bug.
> After FLINK-2637 is merged, we can serialize {{TypeInformation}} and don't 
> need to create field objects in {{CsvInputFormat}}.





[jira] [Commented] (FLINK-2690) CsvInputFormat cannot find the field of derived POJO class

2015-09-17 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802697#comment-14802697
 ] 

Chiwan Park commented on FLINK-2690:


Okay. :-)

> CsvInputFormat cannot find the field of derived POJO class
> --
>
> Key: FLINK-2690
> URL: https://issues.apache.org/jira/browse/FLINK-2690
> Project: Flink
>  Issue Type: Bug
>  Components: Java API, Scala API
>Affects Versions: 0.10
>Reporter: Chiwan Park
>Assignee: Till Rohrmann
>
> A user reports that {{CsvInputFormat}} cannot find the fields of a derived POJO 
> class. 
> (http://mail-archives.apache.org/mod_mbox/flink-user/201509.mbox/%3ccaj54yvi6cbldn7cypey+xe8a5a_j1-6tnx1wm1eb63gvnqd...@mail.gmail.com%3e)
> The reason for the bug is that {{CsvInputFormat}} uses the {{getDeclaredField}} 
> method without scanning base classes to find the field. When 
> {{CsvInputFormat}} was written, {{TypeInformation}} could not be serialized, so we 
> needed to initialize the {{TypeInformation}} manually in the {{open}} method. A 
> mistake in this initialization causes the bug.
> After FLINK-2637 is merged, we can serialize {{TypeInformation}} and don't 
> need to create field objects in {{CsvInputFormat}}.





[jira] [Updated] (FLINK-2690) CsvInputFormat cannot find the field of derived POJO class

2015-09-17 Thread Chiwan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chiwan Park updated FLINK-2690:
---
Assignee: Till Rohrmann  (was: Chiwan Park)

> CsvInputFormat cannot find the field of derived POJO class
> --
>
> Key: FLINK-2690
> URL: https://issues.apache.org/jira/browse/FLINK-2690
> Project: Flink
>  Issue Type: Bug
>  Components: Java API, Scala API
>Affects Versions: 0.10
>Reporter: Chiwan Park
>Assignee: Till Rohrmann
>
> A user reports that {{CsvInputFormat}} cannot find the fields of a derived POJO 
> class. 
> (http://mail-archives.apache.org/mod_mbox/flink-user/201509.mbox/%3ccaj54yvi6cbldn7cypey+xe8a5a_j1-6tnx1wm1eb63gvnqd...@mail.gmail.com%3e)
> The reason for the bug is that {{CsvInputFormat}} uses the {{getDeclaredField}} 
> method without scanning base classes to find the field. When 
> {{CsvInputFormat}} was written, {{TypeInformation}} could not be serialized, so we 
> needed to initialize the {{TypeInformation}} manually in the {{open}} method. A 
> mistake in this initialization causes the bug.
> After FLINK-2637 is merged, we can serialize {{TypeInformation}} and don't 
> need to create field objects in {{CsvInputFormat}}.





[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802706#comment-14802706
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141035389
  
Yes, the values you refer to are still placeholders.

I would like to merge this in its current state and improve on it in the 
master branch. The old frontend is still the default, and this one has to be 
explicitly activated, so there should be no conflict.


> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to rework the JobManager Web Frontend.
> The current web frontend is limited and has a number of design issues:
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs
>   - It has no graph representation of the data flows
>   - It does not allow looking into execution attempts
>   - It has no hook to deal with the upcoming live accumulators
>   - The architecture is not very modular/extensible
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using REST-style URLs for jobs and vertices
>   - Integrating the D3 graph renderer of the previews with the runtime monitor
>   - With details on execution attempts
>   - First-class visualization of records processed and bytes processed





[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141035389
  
Yes, the values you refer to are still placeholders.

I would like to merge this in its current state and improve on it in the 
master branch. The old frontend is still the default, and this one has to be 
explicitly activated, so there should be no conflict.




[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141035558
  
I agree with this approach, so +1 to merge ;)




[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802708#comment-14802708
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141035558
  
I agree with this approach, so +1 to merge ;)


> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to rework the JobManager Web Frontend.
> The current web frontend is limited and has a number of design issues:
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs
>   - It has no graph representation of the data flows
>   - It does not allow looking into execution attempts
>   - It has no hook to deal with the upcoming live accumulators
>   - The architecture is not very modular/extensible
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using REST-style URLs for jobs and vertices
>   - Integrating the D3 graph renderer of the previews with the runtime monitor
>   - With details on execution attempts
>   - First-class visualization of records processed and bytes processed





[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141050563
  
Another follow-up todo is adding support for yarn:

```
11:31:01,955 ERROR org.apache.flink.runtime.jobmanager.JobManager - WebServer could not be created
org.apache.flink.configuration.IllegalConfigurationException: The path to the static contents (/yarn/nm/usercache/robert/appcache/application_1441703985068_0007/container_e07_1441703985068_0007_01_01/resources/web-runtime-monitor) is not a readable directory.
	at org.apache.flink.runtime.webmonitor.WebRuntimeMonitor.<init>(WebRuntimeMonitor.java:137)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.flink.runtime.jobmanager.JobManager$.startWebRuntimeMonitor(JobManager.scala:1741)
	at org.apache.flink.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:134)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
	at org.apache.flink.yarn.ApplicationMaster$.main(ApplicationMaster.scala:69)
	at org.apache.flink.yarn.ApplicationMaster.main(ApplicationMaster.scala)
```




[GitHub] flink pull request: [FLINK-2410] [java api] PojoTypeInfo is not co...

2015-09-17 Thread twalthr
Github user twalthr closed the pull request at:

https://github.com/apache/flink/pull/943




[jira] [Commented] (FLINK-2410) PojoTypeInfo is not completely serializable

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802808#comment-14802808
 ] 

ASF GitHub Bot commented on FLINK-2410:
---

Github user twalthr closed the pull request at:

https://github.com/apache/flink/pull/943


> PojoTypeInfo is not completely serializable
> ---
>
> Key: FLINK-2410
> URL: https://issues.apache.org/jira/browse/FLINK-2410
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Table API requires PojoTypeInfo to be serializable. The following code fails:
> {code}
> Table finishedEtlTable = maxMeasurements
> .join(stationTable).where("s_station_id = m_station_id")
> .select("year, month, day, value, country, name");
> DataSet<MaxTemperature> maxTemp = tableEnv.toDataSet(finishedEtlTable, 
> MaxTemperature.class);
> maxTemp
> .groupBy("year")
> .sortGroup("value", Order.DESCENDING)
> .first(1)
> .print();
> {code}





[jira] [Commented] (FLINK-2653) Enable object reuse in MergeIterator

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802742#comment-14802742
 ] 

ASF GitHub Bot commented on FLINK-2653:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1115#issuecomment-141041498
  
You are right, in those drivers it is handled in the wrong way. Probably 
an artifact from the time before the `MutableObjectIterator` had both variants 
of the `next()` method. It used to have only the reusing variant.

Clearly, this should be fixed.

Here is the guide that I try to follow when working with mutable 
objects:

  - All `MutableObjectIterator`s have two variants of the `next()` method 
- one with reuse, one without.
  - The variant without reuse is crucial, as not every situation can work 
with object reuse.
  - The variant with reuse is optional, but should be implemented where 
possible for performance.
  - The task drivers (AllReduceDriver, ...) and algorithms (sorter / hasher) 
know whether they want to work with reuse or not, and call the respective 
method in that case.
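The two variants can be sketched as below. This is a simplified stand-in for Flink's `MutableObjectIterator`, not the actual interface definition; `IntValue` and `RangeIterator` are illustrative names:

```java
/**
 * Simplified sketch of the two next() variants described above; the real
 * interface is org.apache.flink.util.MutableObjectIterator.
 */
public class ReuseVariants {

    interface MutableObjectIterator<E> {
        E next(E reuse); // reusing variant: may fill and return the given object
        E next();        // non-reusing variant: always returns a fresh object
    }

    static class IntValue { int v; }

    /** Iterates the values 0..end-1, supporting both calling conventions. */
    static class RangeIterator implements MutableObjectIterator<IntValue> {
        private int next;
        private final int end;
        RangeIterator(int end) { this.end = end; }

        @Override
        public IntValue next(IntValue reuse) {
            if (next >= end) return null;
            reuse.v = next++;
            return reuse;              // no allocation on the hot path
        }

        @Override
        public IntValue next() {
            if (next >= end) return null;
            IntValue fresh = new IntValue();
            fresh.v = next++;
            return fresh;              // callers that must hold on to records use this
        }
    }

    public static void main(String[] args) {
        RangeIterator it = new RangeIterator(3);
        IntValue reuse = new IntValue();
        // the reusing variant hands back the same instance every time
        System.out.println(it.next(reuse) == reuse); // true
        System.out.println(it.next() == reuse);      // false: fresh object
    }
}
```

A driver that buffers records across calls must use the non-reusing variant, since every call to the reusing one overwrites the previously returned object.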


> Enable object reuse in MergeIterator
> 
>
> Key: FLINK-2653
> URL: https://issues.apache.org/jira/browse/FLINK-2653
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Greg Hogan
>
> MergeIterator currently discards the given reusable objects and simply returns a 
> new object from the JVM heap. This inefficiency has a noticeable impact on 
> garbage collection and runtime overhead (~5% of overall performance by my 
> measurement).





[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141046424
  
Your change set is very large, but that inevitably comes with major 
reworks like this. What I checked out already works nicely. Improvements here 
and there will be made over the next weeks. +1 for merging.




[GitHub] flink pull request: [FLINK-2689] [runtime] Fix reuse of null objec...

2015-09-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1136




[GitHub] flink pull request: [FLINK-2640][yarn] integrate off-heap configur...

2015-09-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1132




[GitHub] flink pull request: [FLINK-2659] Fix object reuse in UnionWithTemp...

2015-09-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1130




[GitHub] flink pull request: [FLINK-2691] fix broken links to Python script...

2015-09-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1140




[GitHub] flink pull request: FLINK-2595 Unclosed JarFile may leak resource ...

2015-09-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1137




[GitHub] flink pull request: [FLINK-2689] [runtime] Fix reuse of null objec...

2015-09-17 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1136#issuecomment-141003929
  
LGTM




[jira] [Commented] (FLINK-2689) Reusing null object for joins with SolutionSet

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791782#comment-14791782
 ] 

ASF GitHub Bot commented on FLINK-2689:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1136#issuecomment-141003929
  
LGTM


> Reusing null object for joins with SolutionSet
> --
>
> Key: FLINK-2689
> URL: https://issues.apache.org/jira/browse/FLINK-2689
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.9, 0.10
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 0.10, 0.9.2
>
>
> Joins and CoGroups with a solution set have outer join semantics because a 
> certain key might not have been inserted into the solution set yet. When 
> probing a non-existing key, the CompactingHashTable will return null.
> In object reuse mode, this null value is used as reuse object when the next 
> key is probed.
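The hazard can be illustrated with a minimal, self-contained sketch (this is not Flink's CompactingHashTable code; the `probe` helper and `int[]` records are hypothetical stand-ins): a probe that misses returns null, and that null must not be recycled as the reuse object for the next probe.

```java
import java.util.HashMap;
import java.util.Map;

public class ReuseProbe {
    static final Map<Integer, int[]> table = new HashMap<>();

    // Hypothetical probe: fills 'reuse' on a hit, returns null on a miss.
    static int[] probe(int key, int[] reuse) {
        int[] found = table.get(key);
        if (found == null) {
            return null; // miss: caller must not recycle this as 'reuse'
        }
        reuse[0] = found[0];
        return reuse;
    }

    public static void main(String[] args) {
        table.put(1, new int[]{42});
        int[] reuse = new int[1];
        for (int key : new int[]{7, 1}) { // first key misses
            int[] result = probe(key, reuse);
            if (result == null) {
                // the fix: allocate a fresh reuse object instead of
                // feeding the null result back into the next probe
                reuse = new int[1];
                System.out.println(key + " -> miss");
            } else {
                reuse = result;
                System.out.println(key + " -> " + result[0]);
            }
        }
    }
}
```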



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2689) Reusing null object for joins with SolutionSet

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791791#comment-14791791
 ] 

ASF GitHub Bot commented on FLINK-2689:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1136#issuecomment-141006389
  
Thanks for the review.
Will merge this.


> Reusing null object for joins with SolutionSet
> --
>
> Key: FLINK-2689
> URL: https://issues.apache.org/jira/browse/FLINK-2689
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.9, 0.10
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 0.10, 0.9.2
>
>
> Joins and CoGroups with a solution set have outer join semantics because a 
> certain key might not have been inserted into the solution set yet. When 
> probing a non-existing key, the CompactingHashTable will return null.
> In object reuse mode, this null value is used as reuse object when the next 
> key is probed.





[jira] [Commented] (FLINK-2684) Make TypeInformation non-serializable again by removing Table API's need for it

2015-09-17 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791818#comment-14791818
 ] 

Chiwan Park commented on FLINK-2684:


Although {{TypeInformation}} is not serializable, we can fix FLINK-2690 with 
another solution using {{getAllDeclaredFields}} in {{TypeExtractor}}. So if 
there are some problems with a serializable {{TypeInformation}}, let's make it 
non-serializable.

> Make TypeInformation non-serializable again by removing Table API's need for 
> it
> ---
>
> Key: FLINK-2684
> URL: https://issues.apache.org/jira/browse/FLINK-2684
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Till Rohrmann
>Priority: Minor
>
> Currently, the {{TypeInformations}} must be serializable because they are 
> shipped with UDFs of the Table API to the TMs. There the Table API code is 
> generated and compiled. By generating the code on the client and shipping the 
> code as strings, we could get rid of this dependency. As a consequence, the 
> {{TypeInformations}} can be non-serializable again, as they were intended to 
> be.
> Maybe [~aljoscha] can provide some more implementation details here.





[GitHub] flink pull request: [FLINK-2591] Add configuration parameter for d...

2015-09-17 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/1121#discussion_r39725184
  
--- Diff: flink-dist/src/main/resources/flink-conf.yaml ---
@@ -130,6 +130,15 @@ state.backend: jobmanager
 
 
 
#==
+# YARN

+#==
+
+# Default number of YARN container to allocate (=Number of Task Managers)
+
+yarn.defaultNumberOfTaskManagers: 1
--- End diff --

Can you remove the configuration value from the default configuration again?

I would like to force new users to specify the number of yarn containers 
when they start Flink on YARN.
The configuration value is meant for production users who want to control 
everything using configuration values.




[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141018736
  
Very nice.
I tried the new interface locally, and it seems to work.

I suspect these values are still test values? http://i.imgur.com/d9ZRR7g.png
There is certainly more work to do until the web interface has the same 
features as the old interface (task manager overview / monitoring).




[jira] [Commented] (FLINK-2637) Add abstract equals, hashCode and toString methods to TypeInformation

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791822#comment-14791822
 ] 

ASF GitHub Bot commented on FLINK-2637:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1134#discussion_r39725082
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/typeinfo/IntegerTypeInfo.java
 ---
@@ -18,15 +18,33 @@
 
 package org.apache.flink.api.common.typeinfo;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Sets;
 import org.apache.flink.api.common.typeutils.TypeComparator;
 import org.apache.flink.api.common.typeutils.TypeSerializer;
 
+import java.util.Set;
+
 /**
- * Type information for numeric primitive types (int, long, double, byte, 
...).
+ * Type information for numeric integer primitive types: int, long, byte, 
short, boolean, character.
  */
 public class IntegerTypeInfo<T> extends NumericTypeInfo<T> {
 
+   private static final long serialVersionUID = -8068827354966766955L;
+
+   private static final Set<Class<?>> integerTypes = Sets.newHashSet(
+   Integer.class,
+   Long.class,
+   Byte.class,
+   Short.class,
+   Boolean.class,
--- End diff --

Nope, I missed that. Will remove it.


> Add abstract equals, hashCode and toString methods to TypeInformation
> -
>
> Key: FLINK-2637
> URL: https://issues.apache.org/jira/browse/FLINK-2637
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.9, 0.10
>Reporter: Fabian Hueske
>Assignee: Till Rohrmann
>  Labels: starter
> Fix For: 0.10
>
>
> Flink expects that implementations of {{TypeInformation}} have valid 
> implementations of {{hashCode}} and {{equals}}. However, the API does not 
> enforce to implement these methods. Hence, this is a common origin for bugs 
> such as for example FLINK-2633.
> This can be avoided by adding abstract {{hashCode}} and {{equals}} methods to 
> TypeInformation. An abstract {{toString}} method could also be added.
> This change will brake the API and require to fix a couple of broken 
> {{TypeInformation}} implementations.
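The proposal can be sketched as follows (class names are illustrative, not Flink's actual hierarchy): declaring the three methods abstract makes the compiler reject any subclass that forgets them.

```java
// Abstract base forcing subclasses to implement equals/hashCode/toString.
abstract class AbstractTypeInfo {
    @Override
    public abstract boolean equals(Object obj);

    @Override
    public abstract int hashCode();

    @Override
    public abstract String toString();
}

// A subclass that omits any of the three methods would not compile.
class IntTypeInfo extends AbstractTypeInfo {
    @Override
    public boolean equals(Object obj) {
        return obj instanceof IntTypeInfo;
    }

    @Override
    public int hashCode() {
        return IntTypeInfo.class.hashCode();
    }

    @Override
    public String toString() {
        return "IntTypeInfo";
    }
}

public class TypeInfoSketch {
    public static void main(String[] args) {
        System.out.println(new IntTypeInfo().equals(new IntTypeInfo())); // true
        System.out.println(new IntTypeInfo()); // IntTypeInfo
    }
}
```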





[GitHub] flink pull request: FLINK-2595 Unclosed JarFile may leak resource ...

2015-09-17 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1137#issuecomment-141006345
  
Thanks for the update.
Will merge this now.




[GitHub] flink pull request: [FLINK-2689] [runtime] Fix reuse of null objec...

2015-09-17 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1136#issuecomment-141006389
  
Thanks for the review.
Will merge this.




[jira] [Commented] (FLINK-2595) Unclosed JarFile may leak resource in ClassLoaderUtilsTest

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791790#comment-14791790
 ] 

ASF GitHub Bot commented on FLINK-2595:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1137#issuecomment-141006345
  
Thanks for the update.
Will merge this now.


> Unclosed JarFile may leak resource in ClassLoaderUtilsTest
> --
>
> Key: FLINK-2595
> URL: https://issues.apache.org/jira/browse/FLINK-2595
> Project: Flink
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
>
> Here is related code:
> {code}
> try {
> new JarFile(validJar.getAbsolutePath());
> }
> catch (Exception e) {
> e.printStackTrace();
> fail("test setup broken: cannot create a 
> valid jar file");
> }
> {code}
> When no exception happens, the JarFile instance is not closed.
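A try-with-resources form closes the JarFile on both the success and the failure path; a minimal sketch (the `isValidJar` helper is illustrative, not the actual ClassLoaderUtilsTest code):

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

public class JarCheck {
    // try-with-resources guarantees the JarFile is closed even when
    // the check succeeds, which the original snippet did not do
    static boolean isValidJar(File f) {
        try (JarFile jar = new JarFile(f.getAbsolutePath())) {
            return true; // jar is closed automatically on exit
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File notAJar = new File("definitely-not-a.jar");
        System.out.println(isValidJar(notAJar)); // prints "false"
    }
}
```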





[GitHub] flink pull request: [FLINK-2591] Add configuration parameter for d...

2015-09-17 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/1121#discussion_r39725238
  
--- Diff: 
flink-yarn-tests/src/main/java/org/apache/flink/yarn/YARNSessionFIFOITCase.java 
---
@@ -111,6 +111,20 @@ public void testClientStartup() {
}
 
/**
+* Test configuration parameter for default number of yarn containers
+*/
+   @Test
+   public void testDefaultNumberOfTaskManagers() {
+   LOG.info("Starting testDefaultNumberOfTaskManagers()");
--- End diff --

I would generate a flink-conf.yaml just for this test to test the behavior




[jira] [Commented] (FLINK-2591) Add configuration parameter for default number of yarn containers

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791824#comment-14791824
 ] 

ASF GitHub Bot commented on FLINK-2591:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/1121#discussion_r39725184
  
--- Diff: flink-dist/src/main/resources/flink-conf.yaml ---
@@ -130,6 +130,15 @@ state.backend: jobmanager
 
 
 
#==
+# YARN

+#==
+
+# Default number of YARN container to allocate (=Number of Task Managers)
+
+yarn.defaultNumberOfTaskManagers: 1
--- End diff --

Can you remove the configuration value from the default configuration again?

I would like to force new users to specify the number of yarn containers 
when they start Flink on YARN.
The configuration value is meant for production users who want to control 
everything using configuration values.


> Add configuration parameter for default number of yarn containers
> -
>
> Key: FLINK-2591
> URL: https://issues.apache.org/jira/browse/FLINK-2591
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN Client
>Reporter: Robert Metzger
>Assignee: Will Miao
>Priority: Minor
>  Labels: starter
>
> A user complained about the requirement to always specify the number of yarn 
> containers (-n) when starting a job.
> Adding a configuration value with a default value will allow users to set a 
> default ;)





[jira] [Commented] (FLINK-2557) Manual type information via "returns" fails in DataSet API

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14799236#comment-14799236
 ] 

ASF GitHub Bot commented on FLINK-2557:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141024224
  
Can you add a test case that checks that "returns" works now? See the JIRA 
example.


> Manual type information via "returns" fails in DataSet API
> --
>
> Key: FLINK-2557
> URL: https://issues.apache.org/jira/browse/FLINK-2557
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Reporter: Matthias J. Sax
>Assignee: Chesnay Schepler
>
> I changed the WordCount example as below and get an exception:
> Tokenizer is changed to this (removed generics and added cast to String):
> {code:java}
> public static final class Tokenizer implements FlatMapFunction {
>   public void flatMap(Object value, Collector out) {
>   String[] tokens = ((String) value).toLowerCase().split("\\W+");
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> {code}
> I added call to "returns()" here:
> {code:java}
> DataSet<Tuple2<String, Integer>> counts =
>   text.flatMap(new Tokenizer()).returns("Tuple2<String, Integer>")
>   .groupBy(0).sum(1);
> {code}
> The exception is:
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: The types of 
> the interface org.apache.flink.api.common.functions.FlatMapFunction could not 
> be inferred. Support for synthetic interfaces, lambdas, and generic types is 
> limited at this point.
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:686)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:710)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:673)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:365)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:279)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:120)
>   at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:262)
>   at 
> org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:69)
> {noformat}
> Fix:
> This should not immediately fail, but also only give a "MissingTypeInfo" so 
> that type hints would work.
> The error message is also wrong, btw: It should state that raw types are not 
> supported.
> The issue has been reported here: 
> http://stackoverflow.com/questions/32122495/stuck-with-type-hints-in-clojure-for-generic-class
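The suggested fix — returning a placeholder instead of throwing, so a later `returns()` hint can supply the type — can be sketched generically (names are illustrative; this is not the TypeExtractor API):

```java
import java.util.Optional;

public class TypeHintSketch {
    // Pretend extraction: fails (empty) for raw types instead of throwing,
    // mirroring the proposed "MissingTypeInfo" placeholder behavior.
    static Optional<String> extractType(Object fn) {
        return Optional.empty();
    }

    // The explicit hint fills the gap; an error would only be raised
    // if both extraction and the hint were missing.
    static String resolve(Object fn, String hint) {
        return extractType(fn).orElse(hint);
    }

    public static void main(String[] args) {
        Object rawTokenizer = new Object(); // stands in for a raw-typed UDF
        System.out.println(resolve(rawTokenizer, "Tuple2<String, Integer>"));
    }
}
```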





[GitHub] flink pull request: [FLINK-2557] TypeExtractor properly returns Mi...

2015-09-17 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141024224
  
Can you add a test case that checks that "returns" works now? See the JIRA 
example.




[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802829#comment-14802829
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141062133
  
The YARN support needs to ship the static contents to the JobManager 
(AppMaster). The current implementation does not support serving the static 
contents out of a JAR file...


> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to rework the Job Manager Web Frontend.
> The current web frontend is limited and has a lot of design issues:
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs
>   - It has no graph representation of the data flows
>   - it does not allow looking into execution attempts
>   - it has no hook to deal with the upcoming live accumulators
>   - The architecture is not very modular/extensible
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using rest-style URLs for jobs and vertices
>   - integrating the D3 graph renderer of the previews with the runtime monitor
>   - with details on execution attempts
>   - first class visualization of records processed and bytes processed





[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141062133
  
The YARN support needs to ship the static contents to the JobManager 
(AppMaster). The current implementation does not support serving the static 
contents out of a JAR file...




[GitHub] flink pull request: [FLINK-2613] Print usage information for Scala...

2015-09-17 Thread nikste
Github user nikste commented on the pull request:

https://github.com/apache/flink/pull/1106#issuecomment-141066672
  
Should be fixed according to your comments @fhueske 




[jira] [Commented] (FLINK-2613) Print usage information for Scala Shell

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802846#comment-14802846
 ] 

ASF GitHub Bot commented on FLINK-2613:
---

Github user nikste commented on the pull request:

https://github.com/apache/flink/pull/1106#issuecomment-141066672
  
Should be fixed according to your comments @fhueske 


> Print usage information for Scala Shell
> ---
>
> Key: FLINK-2613
> URL: https://issues.apache.org/jira/browse/FLINK-2613
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala Shell
>Affects Versions: 0.10
>Reporter: Maximilian Michels
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>  Labels: starter
> Fix For: 0.10
>
>
> The Scala Shell startup script starts a {{FlinkMiniCluster}} by default if 
> invoked with no arguments.
> We should add a {{--help}} or {{-h}} option to make it easier for people to 
> find out how to configure remote execution. Alternatively, we could print a 
> notice on the local startup explaining how to start the shell in remote mode.
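A minimal sketch of the proposed `--help` handling (flag and usage text are illustrative, not the Scala Shell's actual options):

```java
public class ShellArgs {
    static final String USAGE =
        "Usage: start-scala-shell.sh [local | remote <host> <port>]\n"
      + "  -h, --help   print this message";

    public static void main(String[] args) {
        // print usage and exit if -h/--help appears anywhere
        for (String arg : args) {
            if (arg.equals("-h") || arg.equals("--help")) {
                System.out.println(USAGE);
                return;
            }
        }
        // otherwise fall back to the documented default behavior
        System.out.println("starting local FlinkMiniCluster (default)");
    }
}
```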





[jira] [Resolved] (FLINK-2410) PojoTypeInfo is not completely serializable

2015-09-17 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-2410.
-
   Resolution: Fixed
Fix Version/s: 0.10

> PojoTypeInfo is not completely serializable
> ---
>
> Key: FLINK-2410
> URL: https://issues.apache.org/jira/browse/FLINK-2410
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 0.10
>
>
> Table API requires PojoTypeInfo to be serializable. The following code fails:
> {code}
> Table finishedEtlTable = maxMeasurements
> .join(stationTable).where("s_station_id = m_station_id")
> .select("year, month, day, value, country, name");
> DataSet<MaxTemperature> maxTemp = tableEnv.toDataSet(finishedEtlTable, 
> MaxTemperature.class);
> maxTemp
> .groupBy("year")
> .sortGroup("value", Order.DESCENDING)
> .first(1)
> .print();
> {code}





[GitHub] flink pull request: [FLINK-2613] Print usage information for Scala...

2015-09-17 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1106#discussion_r39763329
  
--- Diff: 
flink-staging/flink-scala-shell/src/test/scala/org/apache/flink/api/scala/ScalaShellITSuite.scala
 ---
@@ -212,6 +212,49 @@ class ScalaShellITSuite extends FunSuite with Matchers 
with BeforeAndAfterAll {
 out.toString + stdout
   }
 
+  /**
+   * tests flink shell startup with remote cluster (starts cluster 
internally)
+   */
+  test("start flink scala shell with remote cluster") {
+
+val input: String = "val els = env.fromElements(\"a\",\"b\");\n" +
--- End diff --

Can you also add a check, that the program was actually executed for 
example by checking that the correct output was produced?




[jira] [Commented] (FLINK-2125) String delimiter for SocketTextStream

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802964#comment-14802964
 ] 

ASF GitHub Bot commented on FLINK-2125:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1077#discussion_r39749641
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/functions/source/SocketTextStreamFunction.java
 ---
@@ -117,12 +122,13 @@ private void streamFromSocket(SourceContext<String> 
ctx, Socket socket) throws E
continue;
}
 
-   if (data == delimiter) {
-   ctx.collect(buffer.toString());
-   buffer = new StringBuilder();
-   } else if (data != '\r') { // ignore carriage 
return
-   buffer.append((char) data);
+   buffer.append(charBuffer, 0, readCount);
+   String[] splits = 
buffer.toString().split(delimiter);
--- End diff --

`String.split()` and `String.replace()` create new String objects which 
must be garbage collected. This adds quite some overhead, because these 
functions are called very often. The previous implementation was operating on a 
byte-level and avoiding the creation of new objects. It would be good if we 
could preserve this behavior.
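The allocation concern can be addressed by scanning the buffer for the delimiter directly instead of calling `String.split()` per read; a minimal sketch (not the actual SocketTextStreamFunction code — it still allocates one String per emitted record, but avoids the throwaway split arrays per chunk):

```java
import java.util.ArrayList;
import java.util.List;

public class DelimScan {
    static List<String> records = new ArrayList<>();
    static StringBuilder buffer = new StringBuilder();

    // Append a chunk and emit every complete record; the remainder stays
    // buffered so a delimiter may span two reads.
    static void append(String chunk, String delimiter) {
        buffer.append(chunk);
        int pos;
        while ((pos = buffer.indexOf(delimiter)) >= 0) {
            records.add(buffer.substring(0, pos));
            buffer.delete(0, pos + delimiter.length());
        }
    }

    public static void main(String[] args) {
        append("foo||ba", "||"); // "||" record boundary split across reads
        append("r||", "||");
        System.out.println(records); // [foo, bar]
    }
}
```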


> String delimiter for SocketTextStream
> -
>
> Key: FLINK-2125
> URL: https://issues.apache.org/jira/browse/FLINK-2125
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Priority: Minor
>  Labels: starter
>
> The SocketTextStreamFunction uses a character delimiter, despite other parts 
> of the API using String delimiter.





[GitHub] flink pull request: [FLINK-2557] TypeExtractor properly returns Mi...

2015-09-17 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141110503
  
Looks good to merge, if build succeeds.




[GitHub] flink pull request: [FLINK-2557] TypeExtractor properly returns Mi...

2015-09-17 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141121393
  
One failing test is new to me. Do we need a JIRA for it?
```
Tests in error: 
  CompactingHashTableTest.testHashTableGrowthWithInsert:98->getMemory:243 
» OutOfMemory
```
The other one is our good old friend `YARNSessionFIFOITCase`.

I guess it can get merged.




[jira] [Commented] (FLINK-2557) Manual type information via "returns" fails in DataSet API

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803089#comment-14803089
 ] 

ASF GitHub Bot commented on FLINK-2557:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141121393
  
One failing test is new to me. Do we need a JIRA for it?
```
Tests in error: 
  CompactingHashTableTest.testHashTableGrowthWithInsert:98->getMemory:243 » 
OutOfMemory
```
The other one is our good old friend `YARNSessionFIFOITCase`.

I guess it can get merged.


> Manual type information via "returns" fails in DataSet API
> --
>
> Key: FLINK-2557
> URL: https://issues.apache.org/jira/browse/FLINK-2557
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Reporter: Matthias J. Sax
>Assignee: Chesnay Schepler
>
> I changed the WordCount example as below and get an exception:
> Tokenizer is changed to this (removed generics and added cast to String):
> {code:java}
> public static final class Tokenizer implements FlatMapFunction {
>   public void flatMap(Object value, Collector out) {
>   String[] tokens = ((String) value).toLowerCase().split("\\W+");
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> {code}
> I added call to "returns()" here:
> {code:java}
> DataSet<Tuple2<String, Integer>> counts =
>   text.flatMap(new Tokenizer()).returns("Tuple2<String, Integer>")
>   .groupBy(0).sum(1);
> {code}
> The exception is:
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: The types of 
> the interface org.apache.flink.api.common.functions.FlatMapFunction could not 
> be inferred. Support for synthetic interfaces, lambdas, and generic types is 
> limited at this point.
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:686)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:710)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:673)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:365)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:279)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:120)
>   at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:262)
>   at 
> org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:69)
> {noformat}
> Fix:
> This should not immediately fail, but also only give a "MissingTypeInfo" so 
> that type hints would work.
> The error message is also wrong, btw: It should state that raw types are not 
> supported.
> The issue has been reported here: 
> http://stackoverflow.com/questions/32122495/stuck-with-type-hints-in-clojure-for-generic-class





[jira] [Commented] (FLINK-2613) Print usage information for Scala Shell

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803127#comment-14803127
 ] 

ASF GitHub Bot commented on FLINK-2613:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1106#discussion_r39763373
  
--- Diff: 
flink-staging/flink-scala-shell/src/test/scala/org/apache/flink/api/scala/ScalaShellLocalStartupITCase.scala
 ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import java.io._
+
+import org.junit.runner.RunWith
+import org.scalatest.{Matchers, FunSuite}
+import org.scalatest.junit.JUnitRunner
+
+
+@RunWith(classOf[JUnitRunner])
+class ScalaShellLocalStartupITCase extends FunSuite with Matchers {
+
+/**
+ * tests flink shell with local setup through startup script in bin 
folder
+ */
+test("start flink scala shell with local cluster") {
+
+  val input: String = "val els = env.fromElements(\"a\",\"b\");\n" + 
"els.print\nError\n:q\n"
--- End diff --

Please check that the program was actually executed.


> Print usage information for Scala Shell
> ---
>
> Key: FLINK-2613
> URL: https://issues.apache.org/jira/browse/FLINK-2613
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala Shell
>Affects Versions: 0.10
>Reporter: Maximilian Michels
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>  Labels: starter
> Fix For: 0.10
>
>
> The Scala Shell startup script starts a {{FlinkMiniCluster}} by default if 
> invoked with no arguments.
> We should add a {{--help}} or {{-h}} option to make it easier for people to 
> find out how to configure remote execution. Alternatively, we could print a 
> notice on the local startup explaining how to start the shell in remote mode.





[jira] [Commented] (FLINK-2613) Print usage information for Scala Shell

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803128#comment-14803128
 ] 

ASF GitHub Bot commented on FLINK-2613:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1106#issuecomment-141131210
  
Thanks for the update @nikste! I have only one minor thing to add.
After that it's good to merge, IMO.
Thanks!


> Print usage information for Scala Shell
> ---
>
> Key: FLINK-2613
> URL: https://issues.apache.org/jira/browse/FLINK-2613
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala Shell
>Affects Versions: 0.10
>Reporter: Maximilian Michels
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>  Labels: starter
> Fix For: 0.10
>
>
> The Scala Shell startup script starts a {{FlinkMiniCluster}} by default if 
> invoked with no arguments.
> We should add a {{--help}} or {{-h}} option to make it easier for people to 
> find out how to configure remote execution. Alternatively, we could print a 
> notice on the local startup explaining how to start the shell in remote mode.





[GitHub] flink pull request: [FLINK-2689] [runtime] Fix reuse of null objec...

2015-09-17 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/flink/pull/1136#issuecomment-141122197
  
@fhueske not sure where else to mention this, but for some reason, PRs like 
this are causing Apache JIRA to comment on Spark JIRAs (see 
https://issues.apache.org/jira/browse/SPARK-2689 for instance). Is it maybe 
because the merge script or other setup from Spark is also used in Flink? Maybe 
something wasn't changed. Not a huge deal, but worth tracking down.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2689) Reusing null object for joins with SolutionSet

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803091#comment-14803091
 ] 

ASF GitHub Bot commented on FLINK-2689:
---

Github user srowen commented on the pull request:

https://github.com/apache/flink/pull/1136#issuecomment-141122197
  
@fhueske not sure where else to mention this, but for some reason, PRs like 
this are causing Apache JIRA to comment on Spark JIRAs (see 
https://issues.apache.org/jira/browse/SPARK-2689 for instance). Is it maybe 
because the merge script or other setup from Spark is also used in Flink? Maybe 
something wasn't changed. Not a huge deal, but worth tracking down.


> Reusing null object for joins with SolutionSet
> --
>
> Key: FLINK-2689
> URL: https://issues.apache.org/jira/browse/FLINK-2689
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.9, 0.10
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 0.9, 0.10
>
>
> Joins and CoGroups with a solution set have outer join semantics because a 
> certain key might not have been inserted into the solution set yet. When 
> probing a non-existing key, the CompactingHashTable will return null.
> In object reuse mode, this null value is used as reuse object when the next 
> key is probed.
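
The reuse hazard described above can be sketched outside of Flink. This is illustrative only (plain `HashMap` standing in for the `CompactingHashTable`, hypothetical method names), not the actual runtime code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only (not Flink's CompactingHashTable): probing a key
// that was never inserted into the solution set returns null, and in object
// reuse mode that null must not become the next probe's reuse record.
public class SolutionSetProbe {

    // Buggy pattern: the (possibly null) probe result is fed straight back
    // in as the reuse object on the next call.
    static String probeBuggy(Map<String, String> set, String key, String reuse) {
        return set.get(key); // null on a miss -> next call's reuse is null
    }

    // Fixed pattern: on a miss, keep the previous (non-null) reuse record.
    static String probeFixed(Map<String, String> set, String key, String reuse) {
        String match = set.get(key);
        return match != null ? match : reuse;
    }

    public static void main(String[] args) {
        Map<String, String> solutionSet = new HashMap<>();
        solutionSet.put("a", "A");

        String reuse = "";
        for (String key : new String[] {"a", "missing", "a"}) {
            String match = solutionSet.get(key);
            // outer join semantics: a miss emits "no match", reuse stays intact
            System.out.println(key + " -> " + (match == null ? "<no match>" : match));
            reuse = probeFixed(solutionSet, key, reuse);
        }
    }
}
```

The key point is that only non-null probe results may ever be handed out as the reuse object.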





[GitHub] flink pull request: [FLINK-2613] Print usage information for Scala...

2015-09-17 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1106#discussion_r39763373
  
--- Diff: 
flink-staging/flink-scala-shell/src/test/scala/org/apache/flink/api/scala/ScalaShellLocalStartupITCase.scala
 ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import java.io._
+
+import org.junit.runner.RunWith
+import org.scalatest.{Matchers, FunSuite}
+import org.scalatest.junit.JUnitRunner
+
+
+@RunWith(classOf[JUnitRunner])
+class ScalaShellLocalStartupITCase extends FunSuite with Matchers {
+
+/**
+ * tests flink shell with local setup through startup script in bin folder
+ */
+test("start flink scala shell with local cluster") {
+
+  val input: String = "val els = env.fromElements(\"a\",\"b\");\n" + "els.print\nError\n:q\n"
--- End diff --

Please check that the program was actually executed.




[jira] [Commented] (FLINK-2613) Print usage information for Scala Shell

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803126#comment-14803126
 ] 

ASF GitHub Bot commented on FLINK-2613:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1106#discussion_r39763329
  
--- Diff: 
flink-staging/flink-scala-shell/src/test/scala/org/apache/flink/api/scala/ScalaShellITSuite.scala
 ---
@@ -212,6 +212,49 @@ class ScalaShellITSuite extends FunSuite with Matchers with BeforeAndAfterAll {
 out.toString + stdout
   }
 
+  /**
+   * tests flink shell startup with remote cluster (starts cluster internally)
+   */
+  test("start flink scala shell with remote cluster") {
+
+val input: String = "val els = env.fromElements(\"a\",\"b\");\n" +
--- End diff --

Can you also add a check that the program was actually executed, for example by checking that the correct output was produced?


> Print usage information for Scala Shell
> ---
>
> Key: FLINK-2613
> URL: https://issues.apache.org/jira/browse/FLINK-2613
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala Shell
>Affects Versions: 0.10
>Reporter: Maximilian Michels
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>  Labels: starter
> Fix For: 0.10
>
>
> The Scala Shell startup script starts a {{FlinkMiniCluster}} by default if 
> invoked with no arguments.
> We should add a {{--help}} or {{-h}} option to make it easier for people to 
> find out how to configure remote execution. Alternatively, we could print a 
> notice on the local startup explaining how to start the shell in remote mode.





[GitHub] flink pull request: [FLINK-2613] Print usage information for Scala...

2015-09-17 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1106#issuecomment-141131210
  
Thanks for the update @nikste! I have only one minor thing to add.
After that it's good to merge, IMO.
Thanks!




[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802919#comment-14802919
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1139#issuecomment-141089265
  
For a streaming job, when I open the running job overview the plan is 
displayed correctly. When I then switch over to another tab and return to the 
plan view all the operators are disconnected and shown one above the other.


> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to improve and rework the JobManager Web Frontend.
> The current web frontend is limited and has a lot of design issues
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs
>   - It has no graph representation of the data flows
>   - it does not allow to look into execution attempts
>   - it has no hook to deal with the upcoming live accumulators
>   - The architecture is not very modular/extensible
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using rest-style URLs for jobs and vertices
>   - integrating the D3 graph renderer of the previews with the runtime monitor
>   - with details on execution attempts
>   - first class visualization of records processed and bytes processed





[jira] [Commented] (FLINK-2557) Manual type information via "returns" fails in DataSet API

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802932#comment-14802932
 ] 

ASF GitHub Bot commented on FLINK-2557:
---

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141091784
  
@mjsax test case added


> Manual type information via "returns" fails in DataSet API
> --
>
> Key: FLINK-2557
> URL: https://issues.apache.org/jira/browse/FLINK-2557
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Reporter: Matthias J. Sax
>Assignee: Chesnay Schepler
>
> I changed the WordCount example as below and get an exception:
> Tokenizer is change to this (removed generics and added cast to String):
> {code:java}
> public static final class Tokenizer implements FlatMapFunction {
>   public void flatMap(Object value, Collector out) {
>   String[] tokens = ((String) value).toLowerCase().split("\\W+");
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 1));
>   }
>   }
>   }
> }
> {code}
> I added call to "returns()" here:
> {code:java}
> DataSet<Tuple2<String, Integer>> counts =
>   text.flatMap(new Tokenizer()).returns("Tuple2<String, Integer>")
>   .groupBy(0).sum(1);
> {code}
> The exception is:
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: The types of 
> the interface org.apache.flink.api.common.functions.FlatMapFunction could not 
> be inferred. Support for synthetic interfaces, lambdas, and generic types is 
> limited at this point.
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:686)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:710)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:673)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:365)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:279)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:120)
>   at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:262)
>   at 
> org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:69)
> {noformat}
> Fix:
> This should not immediately fail, but also only give a "MissingTypeInfo" so 
> that type hints would work.
> The error message is also wrong, btw: It should state that raw types are not 
> supported.
> The issue has been reported here: 
> http://stackoverflow.com/questions/32122495/stuck-with-type-hints-in-clojure-for-generic-class
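
The deferred-failure behavior the fix asks for can be sketched in plain Java. This is a hedged illustration of the "MissingTypeInfo" idea, not Flink's actual `TypeExtractor` API (class and method names here are hypothetical):

```java
// Sketch of the proposed fix: instead of throwing immediately when extraction
// hits a raw type, return a "MissingTypeInfo"-style placeholder so that a
// later type hint via returns(...) can still supply the type. Hypothetical
// names; not Flink's actual API.
public class DeferredTypeInfo {

    interface TypeInfo { String name(); }

    static class ConcreteTypeInfo implements TypeInfo {
        private final String name;
        ConcreteTypeInfo(String name) { this.name = name; }
        public String name() { return name; }
    }

    // Placeholder that defers the failure until the type is actually needed.
    static class MissingTypeInfo implements TypeInfo {
        private final Exception cause;
        MissingTypeInfo(Exception cause) { this.cause = cause; }
        public String name() {
            throw new IllegalStateException("type could not be extracted", cause);
        }
    }

    static TypeInfo extract(boolean rawType) {
        if (rawType) {
            // do not fail here; a returns("...") hint may still arrive later
            return new MissingTypeInfo(new Exception("raw types are not supported"));
        }
        return new ConcreteTypeInfo("Tuple2<String, Integer>");
    }

    public static void main(String[] args) {
        System.out.println(extract(false).name());
        System.out.println(extract(true) instanceof MissingTypeInfo);
    }
}
```

Extraction of a raw type then yields a placeholder rather than an exception, and the failure only surfaces if no type hint ever replaces it.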





[jira] [Resolved] (FLINK-2640) Integrate the off-heap configurations with YARN runner

2015-09-17 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-2640.
--
Resolution: Implemented

Implemented with 93c95b6a6f150a2c55dc387e4ef1d603b3ef3f22

> Integrate the off-heap configurations with YARN runner
> --
>
> Key: FLINK-2640
> URL: https://issues.apache.org/jira/browse/FLINK-2640
> Project: Flink
>  Issue Type: New Feature
>  Components: YARN Client
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Maximilian Michels
> Fix For: 0.10
>
>
> The YARN runner needs to adjust the {{-Xmx}}, {{-Xms}}, and 
> {{-XX:MaxDirectMemorySize}} parameters according to the off-heap memory 
> settings.
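
A minimal sketch of that flag computation might look as follows. This is a hypothetical illustration of the arithmetic only (Flink's actual split between heap and managed memory involves more configuration keys and ratios):

```java
// Hypothetical sketch: with off-heap memory enabled, the managed memory
// moves out of the JVM heap, so -Xmx/-Xms shrink and
// -XX:MaxDirectMemorySize must cover the off-heap allocation.
public class YarnJvmArgs {

    static String buildJvmArgs(int containerMb, int managedMb, boolean offHeap) {
        if (offHeap) {
            int heapMb = containerMb - managedMb; // heap no longer holds managed memory
            return "-Xms" + heapMb + "m -Xmx" + heapMb + "m"
                    + " -XX:MaxDirectMemorySize=" + managedMb + "m";
        }
        // on-heap mode: managed memory lives inside the JVM heap
        return "-Xms" + containerMb + "m -Xmx" + containerMb + "m";
    }

    public static void main(String[] args) {
        System.out.println(buildJvmArgs(1024, 512, true));
        System.out.println(buildJvmArgs(1024, 512, false));
    }
}
```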





[jira] [Commented] (FLINK-2690) CsvInputFormat cannot find the field of derived POJO class

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802966#comment-14802966
 ] 

ASF GitHub Bot commented on FLINK-2690:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/1141

[FLINK-2690] [api-breaking] [scala api] [java api] Allows CsvInputFormat to 
use derived Pojos

This PR adds support for the `CsvInputFormat` to use derived Pojos. In 
order to also find pojo fields defined in a parent class, one has to traverse 
the type hierarchy. This is done by the function `findAllFields`.

While working on the `CsvInputFormat`, I noticed that the 
`ScalaCsvInputFormat` shared almost all code with the `CsvInputFormat`. In 
order to reduce duplicated code, both input formats now extend the 
`CommonCsvInputFormat` which contains the shared code.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixCsvInputFormatPojo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1141.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1141


commit bf2e2c40f340ccfe31256744380345287c9e6f9e
Author: Till Rohrmann 
Date:   2015-09-17T13:28:25Z

[FLINK-2690] [api-breaking] [scala api] [java api] Adds functionality to 
the CsvInputFormat to find fields defined in a super class of a Pojo. Refactors 
CsvInputFormat to share code between this format and ScalaCsvInputFormat.




> CsvInputFormat cannot find the field of derived POJO class
> --
>
> Key: FLINK-2690
> URL: https://issues.apache.org/jira/browse/FLINK-2690
> Project: Flink
>  Issue Type: Bug
>  Components: Java API, Scala API
>Affects Versions: 0.10
>Reporter: Chiwan Park
>Assignee: Till Rohrmann
>
> A user reports {{CsvInputFormat}} cannot find the field of derived POJO 
> class. 
> (http://mail-archives.apache.org/mod_mbox/flink-user/201509.mbox/%3ccaj54yvi6cbldn7cypey+xe8a5a_j1-6tnx1wm1eb63gvnqd...@mail.gmail.com%3e)
> The reason for the bug is that {{CsvInputFormat}} uses the {{getDeclaredField}} 
> method without scanning base classes to find the field. When 
> {{CsvInputFormat}} was written, {{TypeInformation}} could not be serialized, so we 
> needed to initialize {{TypeInformation}} manually in the {{open}} method. Some 
> mistakes in that initialization caused this bug.
> After FLINK-2637 is merged, we can serialize {{TypeInformation}} and don't 
> need to create field objects in {{CsvInputFormat}}.
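
The hierarchy traversal the PR describes (it calls the helper `findAllFields`) boils down to the following sketch. `Class#getDeclaredField` only looks at the class itself, so inherited POJO fields are found by walking the superclass chain; the `Base`/`Derived` classes here are illustrative only:

```java
import java.lang.reflect.Field;

// Sketch of field lookup across a class hierarchy: getDeclaredField only
// inspects the class itself, so fields inherited from a POJO's base class
// require walking up the superclass chain.
public class FieldLookup {

    static class Base { public int id; }
    static class Derived extends Base { public String name; }

    // Walk up the hierarchy until the field is found or Object is passed.
    static Field findField(Class<?> clazz, String fieldName) {
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(fieldName);
            } catch (NoSuchFieldException ignored) {
                // not declared here; keep searching in the superclass
            }
        }
        return null; // nowhere in the hierarchy
    }

    public static void main(String[] args) {
        // inherited field is found in Base, own field in Derived
        System.out.println(findField(Derived.class, "id").getDeclaringClass().getSimpleName());
        System.out.println(findField(Derived.class, "name").getDeclaringClass().getSimpleName());
    }
}
```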





[GitHub] flink pull request: [FLINK-2537] Add scala examples.jar to build-t...

2015-09-17 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1123#issuecomment-141102303
  
I agree with @uce and I don't think we should add the Scala example JARs 
again. 
@chenliang613, could you please close this PR?

I am really sorry that we didn't respond to your JIRA before you started 
with the implementation.





[jira] [Commented] (FLINK-2537) Add scala examples.jar to build-target/examples

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802975#comment-14802975
 ] 

ASF GitHub Bot commented on FLINK-2537:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1123#issuecomment-141102303
  
I agree with @uce and I don't think we should add the Scala example JARs 
again. 
@chenliang613, could you please close this PR?

I am really sorry that we didn't respond to your JIRA before you started 
with the implementation.



> Add scala examples.jar to build-target/examples
> ---
>
> Key: FLINK-2537
> URL: https://issues.apache.org/jira/browse/FLINK-2537
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 0.10
>Reporter: chenliang613
>Assignee: chenliang613
>Priority: Minor
>  Labels: maven
> Fix For: 0.10
>
>
> Currently, Scala as a functional programming language has been acknowledged by 
> more and more developers, and some starters may want to modify the Scala 
> examples' code to better understand Flink's mechanisms. After changing the 
> Scala code, they may check the result like this: 
> 1. go to "build-target/bin" and start the server
> 2. use the web UI to upload the Scala examples' jar
> 3. at this point they would be confused about why their changes are not reflected.
> Because build-target/examples only copies the Java examples, I suggest adding the 
> Scala examples as well.
> The new directory layout would look like this:
> build-target/examples/java
> build-target/examples/scala





[jira] [Commented] (FLINK-2392) Instable test in flink-yarn-tests

2015-09-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803024#comment-14803024
 ] 

Till Rohrmann commented on FLINK-2392:
--

I observed the failure multiple times again (50 builds). The builds were run 
with debug log level on travis, thus the logs might be helpful.

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80641990/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80641991/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642039/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642040/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642050/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642051/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642062/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642063/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642070/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80670456/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80670458/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80670477/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80670478/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770686/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770685/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770718/log.txt

> Instable test in flink-yarn-tests
> -
>
> Key: FLINK-2392
> URL: https://issues.apache.org/jira/browse/FLINK-2392
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: test-stability
>
> The test YARNSessionFIFOITCase fails from time to time on an irregular basis. 
> For example see: https://travis-ci.org/apache/flink/jobs/72019690
> {noformat}
> Tests run: 12, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 205.163 sec 
> <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOITCase
> perJobYarnClusterWithParallelism(org.apache.flink.yarn.YARNSessionFIFOITCase) 
>  Time elapsed: 60.651 sec  <<< FAILURE!
> java.lang.AssertionError: During the timeout period of 60 seconds the 
> expected string did not show up
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.apache.flink.yarn.YarnTestBase.runWithArgs(YarnTestBase.java:478)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOITCase.perJobYarnClusterWithParallelism(YARNSessionFIFOITCase.java:435)
> Results :
> Failed tests: 
>   
> YARNSessionFIFOITCase.perJobYarnClusterWithParallelism:435->YarnTestBase.runWithArgs:478
>  During the timeout period of 60 seconds the expected string did not show up
> {noformat}
> Another error case is this (see 
> https://travis-ci.org/mjsax/flink/jobs/77313444)
> {noformat}
> Tests run: 12, Failures: 3, Errors: 0, Skipped: 2, Time elapsed: 182.008 sec 
> <<< FAILURE! - in org.apache.flink.yarn.YARNSessionFIFOITCase
> testTaskManagerFailure(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time 
> elapsed: 27.356 sec  <<< FAILURE!
> java.lang.AssertionError: Found a file 
> /home/travis/build/mjsax/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1440595422559_0007/container_1440595422559_0007_01_03/taskmanager.log
>  with a prohibited string: [Exception, Started 
> SelectChannelConnector@0.0.0.0:8081]
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.yarn.YarnTestBase.ensureNoProhibitedStringInLogFiles(YarnTestBase.java:294)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOITCase.checkForProhibitedLogContents(YARNSessionFIFOITCase.java:94)
> testNonexistingQueue(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time 
> elapsed: 17.421 sec  <<< FAILURE!
> java.lang.AssertionError: Found a file 
> /home/travis/build/mjsax/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1440595422559_0007/container_1440595422559_0007_01_03/taskmanager.log
>  with a prohibited string: [Exception, Started 
> SelectChannelConnector@0.0.0.0:8081]
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.yarn.YarnTestBase.ensureNoProhibitedStringInLogFiles(YarnTestBase.java:294)
>   at 
> org.apache.flink.yarn.YARNSessionFIFOITCase.checkForProhibitedLogContents(YARNSessionFIFOITCase.java:94)
> testJavaAPI(org.apache.flink.yarn.YARNSessionFIFOITCase)  Time elapsed: 
> 11.984 sec  <<< FAILURE!
> java.lang.AssertionError: Found a file 
> /home/travis/build/mjsax/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1440595422559_0007/container_1440595422559_0007_01_03/taskmanager.log
>  with a 

[GitHub] flink pull request: [FLINK-2537] Add scala examples.jar to build-t...

2015-09-17 Thread chenliang613
Github user chenliang613 closed the pull request at:

https://github.com/apache/flink/pull/1123




[GitHub] flink pull request: [FLINK-2537] Add scala examples.jar to build-t...

2015-09-17 Thread chenliang613
Github user chenliang613 commented on the pull request:

https://github.com/apache/flink/pull/1123#issuecomment-141116565
  
Ok, fine.




[GitHub] flink pull request: [FLINK-2125][streaming] Delimiter change from ...

2015-09-17 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1077#discussion_r39749641
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/functions/source/SocketTextStreamFunction.java
 ---
@@ -117,12 +122,13 @@ private void streamFromSocket(SourceContext ctx, Socket socket) throws E
						continue;
					}
 
-					if (data == delimiter) {
-						ctx.collect(buffer.toString());
-						buffer = new StringBuilder();
-					} else if (data != '\r') { // ignore carriage return
-						buffer.append((char) data);
+					buffer.append(charBuffer, 0, readCount);
+					String[] splits = buffer.toString().split(delimiter);
--- End diff --

`String.split()` and `String.replace()` create new String objects which 
must be garbage collected. This adds quite some overhead, because these 
functions are called very often. The previous implementation was operating on a 
byte-level and avoiding the creation of new objects. It would be good if we 
could preserve this behavior.
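
A byte-level variant along the lines of the suggestion could look like the following sketch. It is hypothetical (not the actual `SocketTextStreamFunction` code) and, for simplicity, assumes a delimiter without a self-overlapping prefix:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of byte-level delimiter matching: track how many delimiter bytes
// have matched so far and only materialize a String per complete record,
// avoiding the per-read garbage of String.split()/replace().
// Simplification: assumes the delimiter has no self-overlapping prefix.
public class DelimiterScanner {

    static List<String> scan(byte[] data, byte[] delim) {
        List<String> records = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        int matched = 0; // delimiter bytes matched so far
        for (byte b : data) {
            if (b == delim[matched]) {
                if (++matched == delim.length) { // full delimiter seen
                    records.add(buffer.toString());
                    buffer.setLength(0);
                    matched = 0;
                }
            } else {
                // a partial delimiter turned out to be payload: flush it
                for (int i = 0; i < matched; i++) {
                    buffer.append((char) delim[i]);
                }
                matched = (b == delim[0]) ? 1 : 0; // maybe restart the match
                if (matched == 0) {
                    buffer.append((char) b);
                }
            }
        }
        // note: a trailing record without a final delimiter stays buffered
        return records;
    }

    public static void main(String[] args) {
        System.out.println(scan("ab\ncd\n".getBytes(), "\n".getBytes()));
        System.out.println(scan("one||two||".getBytes(), "||".getBytes()));
    }
}
```

The only String objects created are one per completed record, instead of one array plus one String per `split()` call on the growing buffer.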




[jira] [Created] (FLINK-2693) Refactor InvalidTypesException to be checked

2015-09-17 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-2693:
---

 Summary: Refactor InvalidTypesException to be checked
 Key: FLINK-2693
 URL: https://issues.apache.org/jira/browse/FLINK-2693
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Minor


When the TypeExtractor fails, it generally throws an InvalidTypesException. 
This is currently an unchecked exception, although we sometimes recover from 
it, usually by creating a MissingTypeInfo manually.

Furthermore, the extractor can also throw IllegalArgumentExceptions in some 
cases. Figuring out which exception is thrown under which conditions is pretty 
tricky, causing issues such as FLINK-2557.

This should be rectified by
# making InvalidTypesException a checked exception
# only throwing an InvalidTypesException upon failure
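
The two-point proposal can be sketched in isolation. This is a hedged illustration of the checked-exception pattern, not Flink's actual `TypeExtractor` (names are hypothetical):

```java
// Sketch of the proposal: a checked InvalidTypesException is the single
// failure mode of extraction, so callers know exactly what to catch and the
// compiler forces recovering callers to handle it explicitly.
public class CheckedExtraction {

    static class InvalidTypesException extends Exception {
        InvalidTypesException(String message) { super(message); }
    }

    // Only InvalidTypesException is ever thrown on failure, never
    // IllegalArgumentException.
    static String extractType(Object function) throws InvalidTypesException {
        if (function == null) {
            throw new InvalidTypesException("type could not be extracted");
        }
        return function.getClass().getSimpleName();
    }

    // A caller that can recover (e.g. by producing a MissingTypeInfo-style
    // fallback) is now forced by the compiler to do so explicitly.
    static String extractOrFallback(Object function) {
        try {
            return extractType(function);
        } catch (InvalidTypesException e) {
            return "MissingTypeInfo";
        }
    }

    public static void main(String[] args) {
        System.out.println(extractOrFallback("a stand-in function object"));
        System.out.println(extractOrFallback(null));
    }
}
```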





[jira] [Created] (FLINK-2695) KafkaITCase.testConcurrentProducerConsumerTopology failed on Travis

2015-09-17 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2695:


 Summary: KafkaITCase.testConcurrentProducerConsumerTopology failed 
on Travis
 Key: FLINK-2695
 URL: https://issues.apache.org/jira/browse/FLINK-2695
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Priority: Critical


The {{KafkaITCase.testConcurrentProducerConsumerTopology}} failed on Travis with

{code}
---
 T E S T S
---
Running org.apache.flink.streaming.connectors.kafka.KafkaProducerITCase
Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.296 sec - in 
org.apache.flink.streaming.connectors.kafka.KafkaProducerITCase
09/16/2015 17:19:36 Job execution switched to status RUNNING.
09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to 
SCHEDULED 
09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to 
DEPLOYING 
09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to 
RUNNING 
09/16/2015 17:19:36 Source: Custom Source -> Sink: Unnamed(1/1) switched to 
FINISHED 
09/16/2015 17:19:36 Job execution switched to status FINISHED.
09/16/2015 17:19:36 Job execution switched to status RUNNING.
09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched 
to SCHEDULED 
09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched 
to DEPLOYING 
09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched 
to RUNNING 
09/16/2015 17:19:36 Source: Custom Source -> Map -> Flat Map(1/1) switched 
to FAILED 
java.lang.Exception: Could not forward element to next operator
at 
org.apache.flink.streaming.connectors.kafka.internals.LegacyFetcher.run(LegacyFetcher.java:242)
at 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.run(FlinkKafkaConsumer.java:382)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:58)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:171)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Could not forward element to next 
operator
at 
org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:332)
at 
org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:316)
at 
org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:50)
at 
org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:30)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$SourceOutput.collect(SourceStreamTask.java:92)
at 
org.apache.flink.streaming.api.operators.StreamSource$NonTimestampContext.collect(StreamSource.java:88)
at 
org.apache.flink.streaming.connectors.kafka.internals.LegacyFetcher$SimpleConsumerThread.run(LegacyFetcher.java:449)
Caused by: java.lang.RuntimeException: Could not forward element to next 
operator
at 
org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:332)
at 
org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:316)
at 
org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:50)
at 
org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:30)
at 
org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:37)
at 
org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:329)
... 6 more
Caused by: 
org.apache.flink.streaming.connectors.kafka.testutils.SuccessException
at 
org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase$7.flatMap(KafkaConsumerTestBase.java:896)
at 
org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase$7.flatMap(KafkaConsumerTestBase.java:876)
at 
org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:47)
at 
org.apache.flink.streaming.runtime.tasks.OutputHandler$CopyingChainingOutput.collect(OutputHandler.java:329)
... 11 more

09/16/2015 17:19:36 Job execution switched to status FAILING.
09/16/2015 17:19:36 Job execution switched to status FAILED.
org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: Job execution failed.
at 

[jira] [Commented] (FLINK-2616) Failing Test: ZooKeeperLeaderElectionTest

2015-09-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803038#comment-14803038
 ] 

Till Rohrmann commented on FLINK-2616:
--

Encountered it again:

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642071/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770636/log.txt

> Failing Test: ZooKeeperLeaderElectionTest
> -
>
> Key: FLINK-2616
> URL: https://issues.apache.org/jira/browse/FLINK-2616
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> {noformat}
> Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 146.262 sec 
> <<< FAILURE! - in 
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest
> testMultipleLeaders(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)
>  Time elapsed: 22.329 sec <<< ERROR!
> java.util.concurrent.TimeoutException: Listener was not notified about a 
> leader within 2ms
> at 
> org.apache.flink.runtime.leaderelection.TestingListener.waitForLeader(TestingListener.java:69)
> at 
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testMultipleLeaders(ZooKeeperLeaderElectionTest.java:334)
> {noformat}
> https://travis-ci.org/mjsax/flink/jobs/78553799



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-2600) Failing ElasticsearchSinkITCase.testNodeClient test case

2015-09-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803050#comment-14803050
 ] 

Till Rohrmann edited comment on FLINK-2600 at 9/17/15 3:10 PM:
---

Occurred again

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770633/log.txt


was (Author: till.rohrmann):
Occured again

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770633/log.txt

> Failing ElasticsearchSinkITCase.testNodeClient test case
> 
>
> Key: FLINK-2600
> URL: https://issues.apache.org/jira/browse/FLINK-2600
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Aljoscha Krettek
>  Labels: test-stability
>
> I observed that the {{ElasticsearchSinkITCase.testNodeClient}} test case 
> fails on Travis. The stack trace is
> {code}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:414)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.testingUtils.TestingJobManager$$anonfun$handleTestingMessage$1.applyOrElse(TestingJobManager.scala:285)
>   at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:104)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.RuntimeException: An error occured in ElasticsearchSink.
>   at 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSink.close(ElasticsearchSink.java:307)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:40)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:75)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:243)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:185)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: IndexMissingException[[my-index] 
> missing]
>   at 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSink$1.afterBulk(ElasticsearchSink.java:240)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.execute(BulkProcessor.java:316)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.executeIfNeeded(BulkProcessor.java:299)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:281)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:264)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:260)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:246)
>   at 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSink.invoke(ElasticsearchSink.java:286)
>   at 
> 

[jira] [Created] (FLINK-2692) Untangle CsvInputFormat into PojoTypeCsvInputFormat and TupleTypeCsvInputFormat

2015-09-17 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2692:


 Summary: Untangle CsvInputFormat into PojoTypeCsvInputFormat and 
TupleTypeCsvInputFormat 
 Key: FLINK-2692
 URL: https://issues.apache.org/jira/browse/FLINK-2692
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Priority: Minor


The {{CsvInputFormat}} currently allows returning values as a {{Tuple}} or a 
{{Pojo}} type. As a consequence, the processing logic, which has to work for 
both types, is overly complex. For example, the {{CsvInputFormat}} contains 
fields which are only used when a Pojo is returned. Moreover, the pojo field 
information is constructed by calling setter methods which have to be called 
in a very specific order, otherwise they fail. E.g. one first has to call 
{{setFieldTypes}} before calling {{setOrderOfPOJOFields}}, otherwise the number 
of fields might differ. Furthermore, some of the methods can only be 
called if the return type is a {{Pojo}} type, because they expect that a 
{{PojoTypeInfo}} is present.

I think the {{CsvInputFormat}} should be refactored to make the code more 
easily maintainable. I propose to split it up into a {{PojoTypeCsvInputFormat}} 
and a {{TupleTypeCsvInputFormat}} which take all the required information via 
their constructors instead of using the {{setFields}} and 
{{setOrderOfPOJOFields}} approach.
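
The constructor-based design proposed above can be sketched as follows. This is an illustrative sketch only; the class names mirror the proposal, but the fields and methods below are made up, not actual Flink code:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed split: each format receives its full configuration
// through the constructor, so no fragile setter-ordering contract exists.
class TupleCsvSketch {
    private final Class<?>[] fieldTypes;

    TupleCsvSketch(Class<?>... fieldTypes) {
        // All required information is passed at construction time.
        this.fieldTypes = fieldTypes.clone();
    }

    int arity() { return fieldTypes.length; }
}

class PojoCsvSketch {
    private final Class<?> pojoClass;
    private final List<String> fieldOrder;

    PojoCsvSketch(Class<?> pojoClass, String... fieldOrder) {
        // Field order and field types are fixed together, so they cannot diverge.
        this.pojoClass = pojoClass;
        this.fieldOrder = Arrays.asList(fieldOrder);
    }

    int arity() { return fieldOrder.size(); }
}
```

Because each format only ever sees one return-type kind, Pojo-only state and Pojo-only methods disappear from the tuple path entirely.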





[jira] [Commented] (FLINK-2557) Manual type information via "returns" fails in DataSet API

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803013#comment-14803013
 ] 

ASF GitHub Bot commented on FLINK-2557:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/1045#issuecomment-141110503
  
Looks good to merge, if build succeeds.


> Manual type information via "returns" fails in DataSet API
> --
>
> Key: FLINK-2557
> URL: https://issues.apache.org/jira/browse/FLINK-2557
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Reporter: Matthias J. Sax
>Assignee: Chesnay Schepler
>
> I changed the WordCount example as below and get an exception:
> Tokenizer is changed to this (removed generics and added cast to String):
> {code:java}
> public static final class Tokenizer implements FlatMapFunction {
>   public void flatMap(Object value, Collector out) {
>   String[] tokens = ((String) value).toLowerCase().split("\\W+");
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 1));
>   }
>   }
>   }
> }
> {code}
> I added a call to "returns()" here:
> {code:java}
> DataSet<Tuple2<String, Integer>> counts =
>   text.flatMap(new Tokenizer()).returns("Tuple2<String, Integer>")
>   .groupBy(0).sum(1);
> {code}
> The exception is:
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: The types of 
> the interface org.apache.flink.api.common.functions.FlatMapFunction could not 
> be inferred. Support for synthetic interfaces, lambdas, and generic types is 
> limited at this point.
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:686)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:710)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:673)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:365)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:279)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:120)
>   at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:262)
>   at 
> org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:69)
> {noformat}
> Fix:
> This should not fail immediately, but instead only give a "MissingTypeInfo" so 
> that type hints would work.
> The error message is also wrong, btw: It should state that raw types are not 
> supported.
> The issue has been reported here: 
> http://stackoverflow.com/questions/32122495/stuck-with-type-hints-in-clojure-for-generic-class
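
The "fail late" idea in the suggested fix can be sketched like this. All names below are hypothetical stand-ins, not Flink's actual `MissingTypeInfo` API: instead of throwing during extraction, a placeholder is returned that throws only when the type is actually needed, which gives `returns(...)` hints a chance to replace it first:

```java
// Hypothetical sketch: a placeholder type info that defers the failure.
class MissingTypeInfoSketch {
    private final String functionName;
    private final RuntimeException cause;

    MissingTypeInfoSketch(String functionName, RuntimeException cause) {
        this.functionName = functionName;
        this.cause = cause;
    }

    boolean isMissing() { return true; }

    // Only invoked if the pipeline actually needs the type and no hint arrived.
    void throwIfUsed() {
        throw new IllegalStateException("Type of " + functionName
                + " could not be determined; supply a type hint.", cause);
    }
}
```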





[jira] [Closed] (FLINK-2637) Add abstract equals, hashCode and toString methods to TypeInformation

2015-09-17 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-2637.

Resolution: Fixed

Fixed with 8ca853e0f6c18be8e6b066c6ec0f23badb797323

> Add abstract equals, hashCode and toString methods to TypeInformation
> -
>
> Key: FLINK-2637
> URL: https://issues.apache.org/jira/browse/FLINK-2637
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.9, 0.10
>Reporter: Fabian Hueske
>Assignee: Till Rohrmann
>  Labels: starter
> Fix For: 0.10
>
>
> Flink expects that implementations of {{TypeInformation}} have valid 
> implementations of {{hashCode}} and {{equals}}. However, the API does not 
> enforce to implement these methods. Hence, this is a common origin for bugs 
> such as for example FLINK-2633.
> This can be avoided by adding abstract {{hashCode}} and {{equals}} methods to 
> TypeInformation. An abstract {{toString}} method could also be added.
> This change will brake the API and require to fix a couple of broken 
> {{TypeInformation}} implementations.
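
The mechanism described above can be sketched in plain Java (an illustrative sketch, not the actual Flink class): redeclaring the `Object` methods as abstract on the base class makes the compiler reject any subclass that forgets them:

```java
// Sketch: declaring equals/hashCode/toString abstract forces every
// concrete subclass to provide its own implementation.
abstract class TypeInfoSketch {
    @Override public abstract boolean equals(Object other);
    @Override public abstract int hashCode();
    @Override public abstract String toString();
}

// A subclass that omits any of the three no longer compiles.
class IntTypeInfoSketch extends TypeInfoSketch {
    @Override public boolean equals(Object other) { return other instanceof IntTypeInfoSketch; }
    @Override public int hashCode() { return IntTypeInfoSketch.class.hashCode(); }
    @Override public String toString() { return "IntTypeInfoSketch"; }
}
```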





[jira] [Created] (FLINK-2694) JobManagerProcessReapingTest.testReapProcessOnFailure failed on Travis

2015-09-17 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2694:


 Summary: JobManagerProcessReapingTest.testReapProcessOnFailure 
failed on Travis
 Key: FLINK-2694
 URL: https://issues.apache.org/jira/browse/FLINK-2694
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Priority: Critical


I observed a failing {{JobManagerProcessReapingTest.testReapProcessOnFailure}} 
test case on Travis. The test seems to have failed because the {{JobManager}} 
could not be started: Netty could not bind to the specified port.

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80642036/log.txt





[jira] [Assigned] (FLINK-2653) Enable object reuse in MergeIterator

2015-09-17 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan reassigned FLINK-2653:
-

Assignee: Greg Hogan

> Enable object reuse in MergeIterator
> 
>
> Key: FLINK-2653
> URL: https://issues.apache.org/jira/browse/FLINK-2653
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>
> MergeIterator currently discards given reusable objects and simply returns a 
> new object from the JVM heap. This inefficiency has a noticeable impact on 
> garbage collection and runtime overhead (~5% overall performance by my 
> measure).
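
For context, the object-reuse contract the issue refers to works roughly as sketched below (illustrative only; the names are made up, not `MergeIterator`'s actual code): `next(reuse)` fills the caller's instance instead of allocating a new one per record:

```java
// Minimal record type for the sketch.
class ReusableRecord { int value; }

// Sketch of a reusing iterator: the caller's instance is mutated and
// returned, so no per-record allocation hits the JVM heap.
class ReusingIteratorSketch {
    private final int[] data;
    private int pos;

    ReusingIteratorSketch(int... data) { this.data = data; }

    /** Returns the reuse instance filled with the next value, or null when exhausted. */
    ReusableRecord next(ReusableRecord reuse) {
        if (pos >= data.length) {
            return null;
        }
        reuse.value = data[pos++];   // mutate instead of new ReusableRecord()
        return reuse;
    }
}
```

Discarding `reuse` and returning a fresh object, as described above, keeps this API shape but silently loses the allocation savings.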





[jira] [Commented] (FLINK-2125) String delimiter for SocketTextStream

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802967#comment-14802967
 ] 

ASF GitHub Bot commented on FLINK-2125:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1077#discussion_r39749738
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/functions/source/SocketTextStreamFunction.java
 ---
@@ -70,14 +74,15 @@ public void run(SourceContext<String> ctx) throws Exception {
 
 	private void streamFromSocket(SourceContext<String> ctx, Socket socket) throws Exception {
 		try {
-			StringBuilder buffer = new StringBuilder();
+			StringBuffer buffer = new StringBuffer();
+			char[] charBuffer = new char[Math.max(8192, 2 * delimiter.length())];
 			BufferedReader reader = new BufferedReader(new InputStreamReader(
 					socket.getInputStream()));
 
 			while (isRunning) {
-				int data;
+				int readCount;
 				try {
-					data = reader.read();
+					readCount = reader.read(charBuffer);
--- End diff --

Good change. It's much better to read a buffer instead of individual 
characters.
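
The buffered, string-delimiter approach from the diff can be sketched in a self-contained form (a hedged sketch under the same idea, not the PR's actual `SocketTextStreamFunction` code): read chunks into a `char[]`, append to a growing buffer, and cut out complete records wherever the delimiter occurs:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DelimitedReaderSketch {
    // Split a character stream into records separated by a String delimiter,
    // reading a buffer at a time instead of one character per call.
    static List<String> readRecords(Reader in, String delimiter) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        char[] charBuffer = new char[Math.max(8192, 2 * delimiter.length())];
        StringBuilder buffer = new StringBuilder();
        List<String> records = new ArrayList<>();
        int readCount;
        while ((readCount = reader.read(charBuffer)) != -1) {
            buffer.append(charBuffer, 0, readCount);
            int idx;
            // Emit every complete record currently held in the buffer.
            while ((idx = buffer.indexOf(delimiter)) != -1) {
                records.add(buffer.substring(0, idx));
                buffer.delete(0, idx + delimiter.length());
            }
        }
        if (buffer.length() > 0) {
            records.add(buffer.toString());   // trailing record without delimiter
        }
        return records;
    }
}
```

The buffer must be at least twice the delimiter length so a delimiter spanning two reads is always found after the next append.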


> String delimiter for SocketTextStream
> -
>
> Key: FLINK-2125
> URL: https://issues.apache.org/jira/browse/FLINK-2125
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Priority: Minor
>  Labels: starter
>
> The SocketTextStreamFunction uses a character delimiter, despite other parts 
> of the API using String delimiter.





[GitHub] flink pull request: [FLINK-2690] [api-breaking] [scala api] [java ...

2015-09-17 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/1141

[FLINK-2690] [api-breaking] [scala api] [java api] Allows CsvInputFormat to 
use derived Pojos

This PR adds support for the `CsvInputFormat` to use derived Pojos. In 
order to also find pojo fields defined in a parent class, one has to traverse 
the type hierarchy. This is done by the function `findAllFields`.

While working on the `CsvInputFormat`, I noticed that the 
`ScalaCsvInputFormat` shared almost all code with the `CsvInputFormat`. In 
order to reduce duplicated code, both input formats now extend the 
`CommonCsvInputFormat` which contains the shared code.
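
A hierarchy-traversing field lookup of the kind `findAllFields` performs can be sketched as follows. This is illustrative only; the class and field names below are hypothetical, not the PR's actual implementation:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class FieldLookupSketch {
    // Collect declared fields of a class and all of its superclasses, so that
    // Pojo fields inherited from a parent class are found as well.
    static List<Field> findAllFields(Class<?> clazz) {
        List<Field> result = new ArrayList<>();
        for (Class<?> c = clazz; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                result.add(f);
            }
        }
        return result;
    }
}

// Hypothetical Pojo hierarchy used to exercise the lookup.
class BasePojo { public int id; }
class DerivedPojo extends BasePojo { public String name; }
```

`Class.getDeclaredFields()` alone would miss `id` on `DerivedPojo`; walking the superclass chain is what makes derived Pojos work.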



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixCsvInputFormatPojo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1141.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1141


commit bf2e2c40f340ccfe31256744380345287c9e6f9e
Author: Till Rohrmann 
Date:   2015-09-17T13:28:25Z

[FLINK-2690] [api-breaking] [scala api] [java api] Adds functionality to 
the CsvInputFormat to find fields defined in a super class of a Pojo. Refactors 
CsvInputFormat to share code between this format and ScalaCsvInputFormat.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2697) Deadlock in StreamDiscretizer

2015-09-17 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2697:


 Summary: Deadlock in StreamDiscretizer
 Key: FLINK-2697
 URL: https://issues.apache.org/jira/browse/FLINK-2697
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Till Rohrmann


Encountered a deadlock in the {{StreamDiscretizer}}

{code}
Found one Java-level deadlock:
=
"Thread-11":
  waiting to lock monitor 0x7f9d081e1ab8 (object 0xff6b4590, a 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer),
  which is held by "StreamDiscretizer -> TumblingGroupedPreReducer -> (Filter, 
ExtractParts) (3/4)"
"StreamDiscretizer -> TumblingGroupedPreReducer -> (Filter, ExtractParts) 
(3/4)":
  waiting to lock monitor 0x7f9d081e20e8 (object 0xff75fd88, a 
org.apache.flink.streaming.api.windowing.policy.TimeTriggerPolicy),
  which is held by "Thread-11"

Java stack information for the threads listed above:
===
"Thread-11":
at 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer.triggerOnFakeElement(StreamDiscretizer.java:121)
- waiting to lock <0xff6b4590> (a 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer)
at 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer$WindowingCallback.sendFakeElement(StreamDiscretizer.java:203)
at 
org.apache.flink.streaming.api.windowing.policy.TimeTriggerPolicy.activeFakeElementEmission(TimeTriggerPolicy.java:117)
- locked <0xff75fd88> (a 
org.apache.flink.streaming.api.windowing.policy.TimeTriggerPolicy)
at 
org.apache.flink.streaming.api.windowing.policy.TimeTriggerPolicy$TimeCheck.run(TimeTriggerPolicy.java:144)
at java.lang.Thread.run(Thread.java:745)
"StreamDiscretizer -> TumblingGroupedPreReducer -> (Filter, ExtractParts) 
(3/4)":
at 
org.apache.flink.streaming.api.windowing.policy.TimeTriggerPolicy.preNotifyTrigger(TimeTriggerPolicy.java:74)
- waiting to lock <0xff75fd88> (a 
org.apache.flink.streaming.api.windowing.policy.TimeTriggerPolicy)
at 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer.processRealElement(StreamDiscretizer.java:91)
- locked <0xff6b4590> (a 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer)
at 
org.apache.flink.streaming.api.operators.windowing.StreamDiscretizer.processElement(StreamDiscretizer.java:73)
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:162)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:56)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:171)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
at java.lang.Thread.run(Thread.java:745)

Found 1 deadlock.
{code}

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770719/log.txt
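
The dump above shows a classic lock-order inversion: one thread holds the {{TimeTriggerPolicy}} monitor and waits for the {{StreamDiscretizer}} monitor, while the other holds them in the opposite order. A hedged illustration of the standard remedy (not Flink code; all names below are made up) is to take both monitors in one fixed global order on every path:

```java
// Sketch: both entry points acquire discretizerLock before policyLock,
// so the wait cycle from the stack dump cannot form.
class LockOrderSketch {
    private final Object discretizerLock = new Object();
    private final Object policyLock = new Object();
    int counter;

    // Stand-in for the fake-element (timer) path.
    void fakeElementPath() {
        synchronized (discretizerLock) {
            synchronized (policyLock) {
                counter++;
            }
        }
    }

    // Stand-in for the real-element (task) path; same acquisition order.
    void realElementPath() {
        synchronized (discretizerLock) {
            synchronized (policyLock) {
                counter++;
            }
        }
    }
}
```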





[jira] [Closed] (FLINK-2689) Reusing null object for joins with SolutionSet

2015-09-17 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-2689.

   Resolution: Fixed
Fix Version/s: (was: 0.9.2)
   0.9

Fixed for 0.10 with 988a04eb486d286e071f4a68aa22c64a2cd4ed8e
Fixed for 0.9 with 43e23ba5efb509f08882df6c2a5d774840bf87a5

> Reusing null object for joins with SolutionSet
> --
>
> Key: FLINK-2689
> URL: https://issues.apache.org/jira/browse/FLINK-2689
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.9, 0.10
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 0.9, 0.10
>
>
> Joins and CoGroups with a solution set have outer join semantics because a 
> certain key might not have been inserted into the solution set yet. When 
> probing a non-existing key, the CompactingHashTable will return null.
> In object reuse mode, this null value is used as reuse object when the next 
> key is probed.
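
The bug pattern and its fix can be sketched as follows (an illustrative sketch, not the actual {{CompactingHashTable}} code; all names are made up): a probe that may return null must never hand that null back as the caller's next reuse object:

```java
import java.util.HashMap;
import java.util.Map;

class ReuseProbeSketch {
    private final Map<String, int[]> table = new HashMap<>();

    void insert(String key, int value) { table.put(key, new int[]{value}); }

    // Probe with a reuse record. Returns the reuse instance filled with the
    // match, or null for an unseen key (outer-join semantics). The caller's
    // reuse instance stays valid either way -- that is the point of the fix.
    int[] probe(String key, int[] reuse) {
        int[] match = table.get(key);
        if (match == null) {
            return null;   // do NOT replace the caller's reuse object with null
        }
        reuse[0] = match[0];
        return reuse;
    }
}
```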





[jira] [Commented] (FLINK-2576) Add outer joins to API and Optimizer

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803028#comment-14803028
 ] 

ASF GitHub Bot commented on FLINK-2576:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1138#issuecomment-14654
  
@jkovacs and @r-pogalz, thank you very much for this PR and the detailed 
description!
It's quite a bit of code so it will take some time to be reviewed. I hope 
to give feedback soon.

Nonetheless, we can start a discussion about the handling of projection for 
outer joins. By changing the type information to `GenericTypeInfo` to 
support tuples with null values, a `DataSet` cannot be used (in a join, 
groupBy, reduce, ...) as before because the runtime will use completely 
different serializers and comparators. Therefore, I am more in favor of not 
supporting projection for outer joins.



> Add outer joins to API and Optimizer
> 
>
> Key: FLINK-2576
> URL: https://issues.apache.org/jira/browse/FLINK-2576
> Project: Flink
>  Issue Type: Sub-task
>  Components: Java API, Optimizer, Scala API
>Reporter: Ricky Pogalz
>Priority: Minor
> Fix For: pre-apache
>
>
> Add left/right/full outer join methods to the DataSet APIs (Java, Scala) and 
> to the optimizer of Flink.
> Initially, the execution strategy should be a sort-merge outer join 
> (FLINK-2105) but can later be extended to hash joins for left/right outer 
> joins.





[jira] [Closed] (FLINK-2595) Unclosed JarFile may leak resource in ClassLoaderUtilsTest

2015-09-17 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-2595.

   Resolution: Fixed
Fix Version/s: 0.10

Fixed with 2c9e2c8bbe1547709d820949d1739f7ea2ce89cf

> Unclosed JarFile may leak resource in ClassLoaderUtilsTest
> --
>
> Key: FLINK-2595
> URL: https://issues.apache.org/jira/browse/FLINK-2595
> Project: Flink
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
> Fix For: 0.10
>
>
> Here is related code:
> {code}
> try {
> new JarFile(validJar.getAbsolutePath());
> }
> catch (Exception e) {
> e.printStackTrace();
> fail("test setup broken: cannot create a 
> valid jar file");
> }
> {code}
> When no exception happens, the JarFile instance is not closed.
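
The usual remedy for this leak pattern is try-with-resources, which closes the resource on every exit path. A self-contained sketch of the pattern (using a hypothetical stand-in `AutoCloseable` instead of a real {{JarFile}}, so no jar file on disk is needed):

```java
// Stand-in resource that tracks open/close balance for the sketch.
class TrackedResource implements AutoCloseable {
    static int openCount;
    TrackedResource() { openCount++; }
    @Override public void close() { openCount--; }
}

class ResourceSketch {
    // Equivalent shape to: try (JarFile jar = new JarFile(path)) { ... }
    static void useResource() {
        try (TrackedResource r = new TrackedResource()) {
            // validate the resource here; close() runs even if this throws
        }
    }
}
```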





[GitHub] flink pull request: [FLINK-2576] Add Outer Join operator to Optimi...

2015-09-17 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1138#issuecomment-14654
  
@jkovacs and @r-pogalz, thank you very much for this PR and the detailed 
description!
It's quite a bit of code so it will take some time to be reviewed. I hope 
to give feedback soon.

Nonetheless, we can start a discussion about the handling of projection for 
outer joins. By changing the type information to `GenericTypeInfo` to 
support tuples with null values, a `DataSet` cannot be used (in a join, 
groupBy, reduce, ...) as before because the runtime will use completely 
different serializers and comparators. Therefore, I am more in favor of not 
supporting projection for outer joins.





[jira] [Closed] (FLINK-2691) Broken links to Python script on QuickStart doc

2015-09-17 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-2691.

   Resolution: Fixed
Fix Version/s: 0.10
   0.9

Fixed for 0.10 with 77989d3cb2dd8a5513f5bacafc0e5e7d6f8278e8
Fixed for 0.9 with 45e2a2a82c4aa95d6946a956f9ef5d9c4bb9da77

> Broken links to Python script on QuickStart doc
> ---
>
> Key: FLINK-2691
> URL: https://issues.apache.org/jira/browse/FLINK-2691
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.9
>Reporter: Felix Cheung
>Priority: Minor
> Fix For: 0.9, 0.10
>
>
> Links to plotPoints.py are broken on 
> https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/run_example_quickstart.html





[jira] [Commented] (FLINK-2600) Failing ElasticsearchSinkITCase.testNodeClient test case

2015-09-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803050#comment-14803050
 ] 

Till Rohrmann commented on FLINK-2600:
--

Occurred again

https://s3.amazonaws.com/archive.travis-ci.org/jobs/80770633/log.txt

> Failing ElasticsearchSinkITCase.testNodeClient test case
> 
>
> Key: FLINK-2600
> URL: https://issues.apache.org/jira/browse/FLINK-2600
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Aljoscha Krettek
>  Labels: test-stability
>
> I observed that the {{ElasticsearchSinkITCase.testNodeClient}} test case 
> fails on Travis. The stack trace is
> {code}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:414)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.testingUtils.TestingJobManager$$anonfun$handleTestingMessage$1.applyOrElse(TestingJobManager.scala:285)
>   at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:104)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.RuntimeException: An error occured in ElasticsearchSink.
>   at 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSink.close(ElasticsearchSink.java:307)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:40)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:75)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:243)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:185)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: IndexMissingException[[my-index] 
> missing]
>   at 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSink$1.afterBulk(ElasticsearchSink.java:240)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.execute(BulkProcessor.java:316)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.executeIfNeeded(BulkProcessor.java:299)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:281)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:264)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:260)
>   at 
> org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:246)
>   at 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSink.invoke(ElasticsearchSink.java:286)
>   at 
> org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:37)
>   at 
> 

[jira] [Closed] (FLINK-2687) Moniroting api / web dashboard: Create request handlers list subtask details and accumulators

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2687.
---

> Moniroting api / web dashboard: Create request handlers list subtask details 
> and accumulators
> -
>
> Key: FLINK-2687
> URL: https://issues.apache.org/jira/browse/FLINK-2687
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 0.10
>
>
> As part of the new web dashboard and monitoring api, we need handlers that 
> report details about subtasks and accumulators





[jira] [Resolved] (FLINK-2687) Moniroting api / web dashboard: Create request handlers list subtask details and accumulators

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2687.
-
Resolution: Fixed

Implemented in b3f791727c0713251a4b02a766eb177eb9327113

> Moniroting api / web dashboard: Create request handlers list subtask details 
> and accumulators
> -
>
> Key: FLINK-2687
> URL: https://issues.apache.org/jira/browse/FLINK-2687
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 0.10
>
>
> As part of the new web dashboard and monitoring api, we need handlers that 
> report details about subtasks and accumulators





[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803370#comment-14803370
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user StephanEwen closed the pull request at:

https://github.com/apache/flink/pull/1139


> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 0.10
>
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to rework the Job Manager Web Frontend.
> The current web frontend is limited and has a lot of design issues:
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs.
>   - It has no graph representation of the data flows.
>   - It does not allow looking into execution attempts.
>   - It has no hook to deal with the upcoming live accumulators.
>   - The architecture is not very modular/extensible.
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using REST-style URLs for jobs and vertices
>   - Integrating the D3 graph renderer of the previews with the runtime monitor
>   - With details on execution attempts
>   - First-class visualization of records processed and bytes processed





[GitHub] flink pull request: [FLINK-2357] Add the new web dashboard and mon...

2015-09-17 Thread StephanEwen
Github user StephanEwen closed the pull request at:

https://github.com/apache/flink/pull/1139


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2698) Add trailing newline to flink-conf.yaml

2015-09-17 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-2698:
-

 Summary: Add trailing newline to flink-conf.yaml
 Key: FLINK-2698
 URL: https://issues.apache.org/jira/browse/FLINK-2698
 Project: Flink
  Issue Type: Improvement
Affects Versions: master
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


The distributed flink-conf.yaml does not contain a trailing newline. This 
interferes with 
[bdutil|https://github.com/GoogleCloudPlatform/bdutil/blob/master/extensions/flink/install_flink.sh#L64], 
which appends extra/override configuration parameters with a heredoc.

There are many other files without trailing newlines, but this looks to be the 
only one with a detrimental effect.

{code}
for i in $(find * -type f); do
  if diff /dev/null "$i" | tail -1 | grep '^\\ No newline' > /dev/null; then
    echo "$i"
  fi
done
{code}
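
The failure mode is easy to reproduce in miniature. A hedged sketch (the key 
names echo Flink's config style, but the file content and values are invented 
for illustration):

```scala
// Why a missing trailing newline breaks a naive append: the first appended
// key merges into the file's last line instead of starting a new one.
val base     = "jobmanager.heap.mb: 256"    // existing file content, no trailing '\n'
val appendix = "taskmanager.heap.mb: 512\n" // what the heredoc would add
val merged   = base + appendix

// The two settings collapse into one unparseable line:
println(merged.linesIterator.next()) // prints "jobmanager.heap.mb: 256taskmanager.heap.mb: 512"
```

Ending the distributed file with a newline (or prefixing the appended block 
with one) keeps the keys on separate lines.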





[GitHub] flink pull request: [FLINK-2637] [api-breaking] [scala, types] Add...

2015-09-17 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/flink/pull/1134#issuecomment-141163582
  
I understand you took the bot from Spark, but can you change the JIRA link? 
It is spamming the Spark JIRA right now.

https://issues.apache.org/jira/secure/ViewProfile.jspa?name=mxm




[jira] [Resolved] (FLINK-2547) Provide Cluster Status Info

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2547.
-
Resolution: Fixed

Implemented in 01b0fc8fc44d2454ef351928e73f5d49f9e41247

> Provide Cluster Status Info
> ---
>
> Key: FLINK-2547
> URL: https://issues.apache.org/jira/browse/FLINK-2547
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Minor
> Fix For: 0.10
>
>
> Add a request to the web server that returns
> {code}
> {
>   "taskmanagers" : numTaskmanagers,
>   "slots-total" : numSlotsTotal, 
>   "slots-available" : numSlotsAvailable,
>   "jobs-running" : numJobsRunning,
>   "jobs-finished" : numJobsFinished,
>   "jobs-cancelled" : numJobsCancelled,
>   "jobs-failed" : numJobsFailed
> }
> {code}
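
To make the shape concrete, here is a hedged, dependency-free sketch of 
assembling that response; the ClusterStatus type and its toJson method are 
invented for illustration and are not Flink code:

```scala
// Hypothetical sketch: building the cluster-status JSON proposed above.
// Field names follow the proposal; everything else is made up here.
case class ClusterStatus(
    taskManagers: Int,
    slotsTotal: Int,
    slotsAvailable: Int,
    jobsRunning: Int,
    jobsFinished: Int,
    jobsCancelled: Int,
    jobsFailed: Int) {

  // Hand-rolled JSON keeps the sketch free of library dependencies.
  def toJson: String =
    s"""{"taskmanagers":$taskManagers,"slots-total":$slotsTotal,""" +
      s""""slots-available":$slotsAvailable,"jobs-running":$jobsRunning,""" +
      s""""jobs-finished":$jobsFinished,"jobs-cancelled":$jobsCancelled,""" +
      s""""jobs-failed":$jobsFailed}"""
}

val status = ClusterStatus(2, 8, 3, 1, 5, 0, 1)
println(status.toJson)
```

A real handler would fill the counts from the JobManager's state and write the 
string as the HTTP response body.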





[jira] [Resolved] (FLINK-2415) Link nodes in plan to vertices

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2415.
-
   Resolution: Fixed
Fix Version/s: 0.10

Implemented in b26ce5a35c5b9e1d0451fa129b80f3fb97960f2e

> Link nodes in plan to vertices
> --
>
> Key: FLINK-2415
> URL: https://issues.apache.org/jira/browse/FLINK-2415
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Reporter: Piotr Godek
>Assignee: Stephan Ewen
> Fix For: 0.10
>
>
> The plan API function (/jobs//plan) lacks vertices' identifiers, so 
> that the plan can be linked to the execution.





[jira] [Closed] (FLINK-2547) Provide Cluster Status Info

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2547.
---

> Provide Cluster Status Info
> ---
>
> Key: FLINK-2547
> URL: https://issues.apache.org/jira/browse/FLINK-2547
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Minor
> Fix For: 0.10
>
>
> Add a request to the web server that returns
> {code}
> {
>   "taskmanagers" : numTaskmanagers,
>   "slots-total" : numSlotsTotal, 
>   "slots-available" : numSlotsAvailable,
>   "jobs-running" : numJobsRunning,
>   "jobs-finished" : numJobsFinished,
>   "jobs-cancelled" : numJobsCancelled,
>   "jobs-failed" : numJobsFailed
> }
> {code}





[jira] [Resolved] (FLINK-2357) New JobManager Runtime Web Frontend

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2357.
-
   Resolution: Fixed
 Assignee: Stephan Ewen
Fix Version/s: 0.10

Implemented in various commits by [~iampeter] and me. Merged as of 
93a4d31d9e5d0ffa1ef2b578d1eee8088fa7c0e1

> New JobManager Runtime Web Frontend
> ---
>
> Key: FLINK-2357
> URL: https://issues.apache.org/jira/browse/FLINK-2357
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Affects Versions: 0.10
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 0.10
>
> Attachments: Webfrontend Mockup.pdf
>
>
> We need to rework the Job Manager Web Frontend.
> The current web frontend is limited and has a lot of design issues:
>   - It does not display any progress while operators are running. This is 
> especially problematic for streaming jobs.
>   - It has no graph representation of the data flows.
>   - It does not allow looking into execution attempts.
>   - It has no hook to deal with the upcoming live accumulators.
>   - The architecture is not very modular/extensible.
> I propose to add a new JobManager web frontend:
>   - Based on Netty HTTP (very lightweight)
>   - Using REST-style URLs for jobs and vertices
>   - Integrating the D3 graph renderer of the previews with the runtime monitor
>   - With details on execution attempts
>   - First-class visualization of records processed and bytes processed





[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803222#comment-14803222
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141148749
  
Hey @StephanEwen, apologies for being too eager but is it possible to get 
this in soon?


> Make Scala Data Set utils easier to access
> --
>
> Key: FLINK-2627
> URL: https://issues.apache.org/jira/browse/FLINK-2627
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala API
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>Priority: Trivial
>
> Currently, to use the Scala Data Set utility functions, one needs to write 
> {{import org.apache.flink.api.scala.DataSetUtils.utilsToDataSet}}
> This is counter-intuitive, needlessly complicated, and should be more in sync 
> with how the Java utils are imported. I propose a package object which would 
> allow importing the utils like
> {{import org.apache.flink.api.scala.utils._}}
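
The proposed pattern can be sketched roughly as follows. The DataSet stand-in 
and the countElements method are invented for illustration; the real utilities 
wrap org.apache.flink.api.scala.DataSet, and in Flink the container would be a 
package object under org.apache.flink.api.scala.utils rather than the plain 
object used here to keep the sketch self-contained:

```scala
// Illustration of the proposal: put the enrichment in a single object so a
// lone wildcard import brings every utility into scope.
object utils {
  // Hypothetical stand-in for a Flink DataSet of elements of type T.
  final case class DataSet[T](elements: Seq[T])

  // Enrichment made available by `import utils._`.
  implicit class RichDataSet[T](val ds: DataSet[T]) extends AnyVal {
    def countElements: Long = ds.elements.size.toLong
  }
}

import utils._
val n = DataSet(Seq(1, 2, 3)).countElements
```

With the real package object in place, user code would need only the single 
line {{import org.apache.flink.api.scala.utils._}} to pick up all utilities.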





[jira] [Closed] (FLINK-2415) Link nodes in plan to vertices

2015-09-17 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2415.
---

> Link nodes in plan to vertices
> --
>
> Key: FLINK-2415
> URL: https://issues.apache.org/jira/browse/FLINK-2415
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Reporter: Piotr Godek
>Assignee: Stephan Ewen
> Fix For: 0.10
>
>
> The plan API function (/jobs//plan) lacks vertices' identifiers, so 
> that the plan can be linked to the execution.





  1   2   >