[jira] [Commented] (FLINK-1372) TaskManager and JobManager do not log startup settings any more

2015-01-16 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279982#comment-14279982
 ] 

Robert Metzger commented on FLINK-1372:
---

Yes, switching the logger sounds good. The information is very helpful when 
debugging issues reported by users.

 TaskManager and JobManager do not log startup settings any more
 ---

 Key: FLINK-1372
 URL: https://issues.apache.org/jira/browse/FLINK-1372
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Till Rohrmann
 Fix For: 0.9


 In prior versions, the jobmanager and taskmanager logged a lot of startup 
 options:
  - Environment
  - ports
  - memory configuration
  - network configuration
 Currently, they log very little. We should add the logging back in.
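The kind of startup dump being asked for can be sketched in plain Java. This is a hedged illustration only (class name and output format are made up, and `System.out` stands in for Flink's configured logger), showing the environment, memory, and host information such logging would typically cover:

```java
// Hypothetical sketch, not Flink code: the sort of environment/memory
// information a JobManager or TaskManager could log at startup.
public class StartupInfoDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // JVM identity and version
        System.out.println("JVM: " + System.getProperty("java.vm.name")
                + " " + System.getProperty("java.version"));
        // Memory configuration (max heap in MB)
        System.out.println("Max heap (MB): " + (rt.maxMemory() >>> 20));
        // Available parallelism
        System.out.println("CPU cores: " + rt.availableProcessors());
        // Environment
        System.out.println("Working directory: " + System.getProperty("user.dir"));
    }
}
```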



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-655) Rename DataSet.withBroadcastSet(DataSet, String) method

2015-01-16 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280004#comment-14280004
 ] 

Till Rohrmann commented on FLINK-655:
-

I think getBroadcastVariable returns a List of the DataSet's elements that you 
have broadcast. Thus, the return type should at least remain a Collection 
type. 

 Rename DataSet.withBroadcastSet(DataSet, String) method
 ---

 Key: FLINK-655
 URL: https://issues.apache.org/jira/browse/FLINK-655
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Ufuk Celebi
Assignee: Henry Saputra
  Labels: breaking-api, github-import, starter
 Fix For: pre-apache


 To broadcast a data set you have to do the following:
 ```java
 lorem.map(new MyMapper()).withBroadcastSet(toBroadcast, "toBroadcastName")
 ```
 In the operator you call:
 ```java
 getRuntimeContext().getBroadcastVariable("toBroadcastName")
 ```
 I propose to have both method names consistent, e.g.
   - `withBroadcastVariable(DataSet, String)`, or
   - `getBroadcastSet(String)`.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/655
 Created by: [uce|https://github.com/uce]
 Labels: enhancement, java api, user satisfaction, 
 Created at: Wed Apr 02 16:29:08 CEST 2014
 State: open





[GitHub] flink pull request: [FLINK-1382][java] Adds the new basic types Vo...

2015-01-16 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/299#issuecomment-70226685
  
The change looks good. I would like to see some test cases there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/311#issuecomment-70226947
  
How about names along the lines of "Unmodified Fields"?




[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/311#discussion_r23069617
  
--- Diff: flink-compiler/src/main/java/org/apache/flink/compiler/dag/BinaryUnionNode.java ---
@@ -266,4 +268,44 @@ public void computeOutputEstimates(DataStatistics statistics) {
 		this.estimatedOutputSize = in1.estimatedOutputSize > 0 && in2.estimatedOutputSize > 0 ?
 				in1.estimatedOutputSize + in2.estimatedOutputSize : -1;
 	}
+
+	public static class UnionSemanticProperties implements SemanticProperties {
+
+		@Override
+		public FieldSet getTargetFields(int input, int sourceField) {
+			if (input != 0 && input != 1) {
+				throw new IndexOutOfBoundsException();
--- End diff --

How about throwing an exception that explains that unions only support 
inputs 0 and 1?
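A more descriptive version of that check could look like the following standalone sketch. `UnionInputCheck` is a hypothetical helper for illustration, not part of the pull request:

```java
// Sketch of a more descriptive exception for an invalid union input index.
public class UnionInputCheck {
    public static void checkInput(int input) {
        if (input != 0 && input != 1) {
            throw new IndexOutOfBoundsException(
                    "Invalid input index " + input
                    + ": a binary union only has inputs 0 and 1.");
        }
    }

    public static void main(String[] args) {
        checkInput(0); // valid, passes silently
        try {
            checkInput(2);
        } catch (IndexOutOfBoundsException e) {
            // The message now explains what went wrong.
            System.out.println(e.getMessage());
        }
    }
}
```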




[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/311#discussion_r23069937
  
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/functions/SemanticPropUtil.java ---
@@ -40,38 +45,108 @@
 import org.apache.flink.api.java.functions.FunctionAnnotation.ReadFields;
 import org.apache.flink.api.java.functions.FunctionAnnotation.ReadFieldsFirst;
 import org.apache.flink.api.java.functions.FunctionAnnotation.ReadFieldsSecond;
+import org.apache.flink.api.java.operators.Keys;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
 
 public class SemanticPropUtil {
 
-	private final static String REGEX_LIST = "(\\s*(\\d+\\s*,\\s*)*(\\d+\\s*))";
-	private final static String REGEX_FORWARD = "(\\s*(\\d+)\\s*->(" + REGEX_LIST + "|(\\*)))";
-	private final static String REGEX_LIST_OR_FORWARD = "(" + REGEX_LIST + "|" + REGEX_FORWARD + ")";
-	private final static String REGEX_ANNOTATION = "(\\s*(" + REGEX_LIST_OR_FORWARD + "\\s*;\\s*)*(" + REGEX_LIST_OR_FORWARD + "\\s*))";
+	private final static String REGEX_WILDCARD = "[\\" + Keys.ExpressionKeys.SELECT_ALL_CHAR + "\\" + Keys.ExpressionKeys.SELECT_ALL_CHAR_SCALA + "]";
+	private final static String REGEX_SINGLE_FIELD = "[a-zA-Z0-9_\\$]+";
+	private final static String REGEX_NESTED_FIELDS = "((" + REGEX_SINGLE_FIELD + "\\.)*" + REGEX_SINGLE_FIELD + ")(\\." + REGEX_WILDCARD + ")?";
 
+	private final static String REGEX_LIST = "((" + REGEX_NESTED_FIELDS + ";)*(" + REGEX_NESTED_FIELDS + ");?)";
+	private final static String REGEX_FORWARD = "((" + REGEX_NESTED_FIELDS + "|" + REGEX_WILDCARD + ")->(" + REGEX_NESTED_FIELDS + "|" + REGEX_WILDCARD + "))";
+	private final static String REGEX_FIELD_OR_FORWARD = "(" + REGEX_NESTED_FIELDS + "|" + REGEX_FORWARD + ")";
+	private final static String REGEX_ANNOTATION = "((" + REGEX_FIELD_OR_FORWARD + ";)*(" + REGEX_FIELD_OR_FORWARD + ");?)";
+
+	private static final Pattern PATTERN_WILDCARD = Pattern.compile(REGEX_WILDCARD);
 	private static final Pattern PATTERN_FORWARD = Pattern.compile(REGEX_FORWARD);
 	private static final Pattern PATTERN_ANNOTATION = Pattern.compile(REGEX_ANNOTATION);
 	private static final Pattern PATTERN_LIST = Pattern.compile(REGEX_LIST);
+	private static final Pattern PATTERN_FIELD = Pattern.compile(REGEX_NESTED_FIELDS);
 
-	private static final Pattern PATTERN_DIGIT = Pattern.compile("\\d+");
-
-	public static SingleInputSemanticProperties createProjectionPropertiesSingle(int[] fields) {
+	public static SingleInputSemanticProperties createProjectionPropertiesSingle(int[] fields, CompositeType<?> inType)
+	{
 		SingleInputSemanticProperties ssp = new SingleInputSemanticProperties();
-		for (int i = 0; i < fields.length; i++) {
-			ssp.addForwardedField(fields[i], i);
+
+		int[] sourceOffsets = new int[inType.getArity()];
+		sourceOffsets[0] = 0;
+		for(int i=1; i < inType.getArity(); i++) {
+			sourceOffsets[i] = inType.getTypeAt(i-1).getTotalFields() + sourceOffsets[i-1];
 		}
+
+		int targetOffset = 0;
+		for(int i=0; i < fields.length; i++) {
+			int sourceOffset = sourceOffsets[fields[i]];
+			int numFieldsToCopy = inType.getTypeAt(fields[i]).getTotalFields();
+
+			for(int j=0; j < numFieldsToCopy; j++) {
+				ssp.addForwardedField(sourceOffset+j, targetOffset+j);
+			}
+			targetOffset += numFieldsToCopy;
+		}
+
 		return ssp;
 	}
 
-	public static DualInputSemanticProperties createProjectionPropertiesDual(int[] fields, boolean[] isFromFirst) {
+	public static DualInputSemanticProperties createProjectionPropertiesDual(int[] fields, boolean[] isFromFirst,
+			TypeInformation<?> inType1, TypeInformation<?> inType2)
+	{
 		DualInputSemanticProperties dsp = new DualInputSemanticProperties();
 
-		for (int i = 0; i < fields.length; i++) {
-			if (isFromFirst[i]) {
-				dsp.addForwardedField1(fields[i], i);
+		int[] sourceOffsets1;
+		if(inType1 instanceof TupleTypeInfo<?>) {
+			sourceOffsets1 = new int[inType1.getArity()];
+			sourceOffsets1[0] = 0;
+			for(int i=1; i < inType1.getArity(); i++) {
+				sourceOffsets1[i] = ((TupleTypeInfo<?>)inType1).getTypeAt(i-1).getTotalFields() + sourceOffsets1[i-1];
+			}
+		} else {
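The prefix-sum offset computation in `createProjectionPropertiesSingle` above can be illustrated standalone. This is plain Java, not Flink code, and the example arities are made up:

```java
import java.util.Arrays;

// Standalone illustration of the offset computation: given the number of
// flat fields in each top-level field of a composite type, compute where
// each top-level field starts in the flattened schema (a prefix sum).
public class OffsetDemo {
    static int[] sourceOffsets(int[] totalFieldsPerType) {
        int[] offsets = new int[totalFieldsPerType.length];
        for (int i = 1; i < offsets.length; i++) {
            offsets[i] = offsets[i - 1] + totalFieldsPerType[i - 1];
        }
        return offsets;
    }

    public static void main(String[] args) {
        // e.g. a hypothetical Tuple3<Tuple2<Int,Int>, Int, Tuple2<Int,Int>>
        // has per-field flat arities {2, 1, 2}
        int[] offs = sourceOffsets(new int[]{2, 1, 2});
        System.out.println(Arrays.toString(offs)); // [0, 2, 3]
    }
}
```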
+   

[jira] [Commented] (FLINK-655) Rename DataSet.withBroadcastSet(DataSet, String) method

2015-01-16 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280030#comment-14280030
 ] 

Fabian Hueske commented on FLINK-655:
-

Aren't we going for two methods?
1. {{op.withBroadcastSet(DataSet<X>, String)}} + {{List<X> bcs = 
getRuntimeContext().getBroadcastSet(String)}}
2. {{op.withBroadcastValue(DataSet<X>, String)}} + {{X bcv = 
getRuntimeContext().getBroadcastValue(String)}}

For the broadcast value, the runtime should check that the dataset has only a 
single value (implying DOP=1 for the producing operator).
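The two proposed accessors can be modeled in plain Java. `BcContext` below is a hypothetical stand-in for Flink's RuntimeContext (not real Flink API), included only to show the shapes of the two methods and the single-element runtime check:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Mock of the two proposed broadcast accessors; not real Flink API.
public class BroadcastAccessorsDemo {

    static class BcContext {
        private final Map<String, List<?>> broadcastSets = new HashMap<>();

        void register(String name, List<?> elements) {
            broadcastSets.put(name, elements);
        }

        // 1. getBroadcastSet: returns all elements of the broadcast data set
        @SuppressWarnings("unchecked")
        <X> List<X> getBroadcastSet(String name) {
            return (List<X>) broadcastSets.get(name);
        }

        // 2. getBroadcastValue: requires exactly one element, as the
        //    runtime check described above would enforce
        <X> X getBroadcastValue(String name) {
            List<X> set = getBroadcastSet(name);
            if (set.size() != 1) {
                throw new IllegalStateException("Broadcast value '" + name
                        + "' must contain exactly one element, found " + set.size());
            }
            return set.get(0);
        }
    }

    public static void main(String[] args) {
        BcContext ctx = new BcContext();
        ctx.register("ids", Arrays.asList(1, 2, 3));
        ctx.register("threshold", Arrays.asList(0.5));

        List<Integer> bcs = ctx.getBroadcastSet("ids");   // whole set
        Double bcv = ctx.getBroadcastValue("threshold");  // single value
        System.out.println(bcs.size() + " " + bcv);
    }
}
```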


 Rename DataSet.withBroadcastSet(DataSet, String) method
 ---

 Key: FLINK-655
 URL: https://issues.apache.org/jira/browse/FLINK-655
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Ufuk Celebi
Assignee: Henry Saputra
  Labels: breaking-api, github-import, starter
 Fix For: pre-apache


 To broadcast a data set you have to do the following:
 ```java
 lorem.map(new MyMapper()).withBroadcastSet(toBroadcast, "toBroadcastName")
 ```
 In the operator you call:
 ```java
 getRuntimeContext().getBroadcastVariable("toBroadcastName")
 ```
 I propose to have both method names consistent, e.g.
   - `withBroadcastVariable(DataSet, String)`, or
   - `getBroadcastSet(String)`.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/655
 Created by: [uce|https://github.com/uce]
 Labels: enhancement, java api, user satisfaction, 
 Created at: Wed Apr 02 16:29:08 CEST 2014
 State: open





[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/311#discussion_r23070484
  
--- Diff: flink-compiler/src/main/java/org/apache/flink/compiler/dag/BinaryUnionNode.java ---
@@ -266,4 +268,44 @@ public void computeOutputEstimates(DataStatistics statistics) {
 		this.estimatedOutputSize = in1.estimatedOutputSize > 0 && in2.estimatedOutputSize > 0 ?
 				in1.estimatedOutputSize + in2.estimatedOutputSize : -1;
 	}
+
+	public static class UnionSemanticProperties implements SemanticProperties {
+
+		@Override
+		public FieldSet getTargetFields(int input, int sourceField) {
+			if (input != 0 && input != 1) {
+				throw new IndexOutOfBoundsException();
--- End diff --

That's an internal exception. If something fails here, there is something 
wrong with the optimizer. Nothing a user can solve. Could make that clear 
though...




[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/311#discussion_r23070438
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/PojoTypeInfo.java 
---
@@ -45,6 +46,16 @@
  */
 public class PojoTypeInfo<T> extends CompositeType<T> {
 
+	private final static String REGEX_FIELD = "[a-zA-Z_\\$][a-zA-Z0-9_\\$]*";
--- End diff --

Java allows arbitrary Unicode characters to be used in field names. 
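This point can be checked with a small standalone example. The pattern below is the ASCII-only one from the diff; the class name is made up for illustration:

```java
import java.util.regex.Pattern;

// Shows that the ASCII-only pattern rejects field names that are
// perfectly legal Java identifiers.
public class FieldNameRegexDemo {
    static final Pattern ASCII_FIELD = Pattern.compile("[a-zA-Z_\\$][a-zA-Z0-9_\\$]*");

    public static void main(String[] args) {
        System.out.println(ASCII_FIELD.matcher("myField").matches()); // true
        System.out.println(ASCII_FIELD.matcher("größe").matches());   // false
        // Yet 'größe' is a valid Java field name:
        System.out.println(Character.isJavaIdentifierStart('g')
                && Character.isJavaIdentifierPart('ö'));              // true
    }
}
```

`Character.isJavaIdentifierStart`/`isJavaIdentifierPart` are the JDK's own definition of a legal identifier, so matching field names against them (rather than an ASCII regex) sidesteps the problem.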




[GitHub] flink pull request: [FLINK-1183] Generate gentle notification mess...

2015-01-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/296




[jira] [Commented] (FLINK-1409) Connected datastream functionality broken since the introduction of intermediate results

2015-01-16 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280035#comment-14280035
 ] 

Ufuk Celebi commented on FLINK-1409:


I'll add a test case and look into it right now.

 Connected datastream functionality broken since the introduction of 
 intermediate results
 

 Key: FLINK-1409
 URL: https://issues.apache.org/jira/browse/FLINK-1409
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Gyula Fora

 The connected data stream functionality which allows joint transformations on 
 two data streams of arbitrary type is broken since Ufuk's commit which 
 introduces the intermediate results.
 The problem is most likely in the CoRecordReader which should allow 
 nonblocking read from inputs with different types. 





[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/311#discussion_r23070841
  
--- Diff: flink-compiler/src/main/java/org/apache/flink/compiler/dag/BinaryUnionNode.java ---
@@ -266,4 +268,44 @@ public void computeOutputEstimates(DataStatistics statistics) {
 		this.estimatedOutputSize = in1.estimatedOutputSize > 0 && in2.estimatedOutputSize > 0 ?
 				in1.estimatedOutputSize + in2.estimatedOutputSize : -1;
 	}
+
+	public static class UnionSemanticProperties implements SemanticProperties {
+
+		@Override
+		public FieldSet getTargetFields(int input, int sourceField) {
+			if (input != 0 && input != 1) {
+				throw new IndexOutOfBoundsException();
--- End diff --

Ah, okay. Then it's fine.
I saw many helpful exceptions in this change, so I guess the user-facing 
exceptions are more descriptive. 




[jira] [Commented] (FLINK-1379) add RSS feed for the blog

2015-01-16 Thread Max Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280036#comment-14280036
 ] 

Max Michels commented on FLINK-1379:


Thank you for the hint, Robert. I guess I'm too much rooted in the GitHub 
world...

 add RSS feed for the blog
 -

 Key: FLINK-1379
 URL: https://issues.apache.org/jira/browse/FLINK-1379
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
 Attachments: feed.patch


 I couldn't find an RSS feed for the Flink blog. I think that a feed helps a 
 lot of people to stay up to date with the changes in Flink. 
 [FLINK-391] mentions an RSS feed but it does not seem to exist.





[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/311#issuecomment-70230559
  
Except for the comments and the missing documentation, the change looks 
good.
I can however not really validate the changes in the optimizer.




[GitHub] flink pull request: [FLINK-1395] Add support for JodaTime in KryoS...

2015-01-16 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/304#discussion_r23071236
  
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/KryoSerializer.java ---
@@ -185,6 +187,8 @@ private void checkKryoInitialized() {
 			this.kryo.setRegistrationRequired(false);
 			this.kryo.register(type);
 			this.kryo.setClassLoader(Thread.currentThread().getContextClassLoader());
+
+			kryo.register(DateTime.class, new JodaDateTimeSerializer());
--- End diff --

I would suggest adding some more serializers from `de.javakaffee`. Since 
we already have it as a dependency, it doesn't hurt to add them. 
I'm suggesting 
- `jodatime/JodaIntervalSerializer`,
- `guava/ImmutableListSerializer`,
- `UnmodifiableCollectionsSerializer`, 
- `GregorianCalendarSerializer`,
- `EnumSetSerializer`,
- `EnumMapSerializer`,
- `BitSetSerializer` (serializer for `java.util.BitSet`),
- `RegexSerializer` (serializer for `java.util.regex.Pattern`),
- `URISerializer` (serializer for `java.net.URI`),
- `UUIDSerializer` (serializer for `java.util.UUID`).





[GitHub] flink pull request: [FLINK-1395] Add support for JodaTime in KryoS...

2015-01-16 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/304#issuecomment-70231876
  
Change looks good except for comments.




[GitHub] flink pull request: [FLINK-1406] update Flink compatibility notice

2015-01-16 Thread mxm
GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/314

[FLINK-1406] update Flink compatibility notice



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink flink_1406

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/314.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #314


commit b09ab9f644bb21737babddf387045ff953f72135
Author: Max m...@posteo.de
Date:   2015-01-16T10:02:47Z

[FLINK-1406] update Flink compatibility notice






[GitHub] flink pull request: [FLINK-1406] update Flink compatibility notice

2015-01-16 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/314#issuecomment-70234048
  
Good to merge.




[jira] [Updated] (FLINK-1406) Windows compatibility

2015-01-16 Thread Max Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Michels updated FLINK-1406:
---
Attachment: flink_1406.patch

Here is the patch for the how to contribute guide in addition to the pull 
request.

 Windows compatibility
 -

 Key: FLINK-1406
 URL: https://issues.apache.org/jira/browse/FLINK-1406
 Project: Flink
  Issue Type: Improvement
Reporter: Max Michels
Priority: Minor
 Attachments: flink_1406.patch


 The documentation [1] states: Flink runs on all UNIX-like environments: 
 Linux, Mac OS X, Cygwin. The only requirement is to have a working Java 6.x 
 (or higher) installation.
 I just found out Flink runs also natively on Windows. Do we want to support 
 Windows? If so, we should update the documentation. Clearly, we don't support 
 it at the moment for development. At multiple places, the tests contain 
 references to /tmp or /dev/random/ which are not Windows compatible.
 Probably it's enough to update the documentation stating that Flink runs on 
 Windows but cannot be built or developed on Windows without Cygwin.
 [1] http://flink.apache.org/docs/0.7-incubating/setup_quickstart.html





[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-16 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-70234744
  
BTW: How is that done in Hadoop? Can we follow a similar way, to make it 
easier for users to understand this?




[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/311#discussion_r23073054
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/PojoTypeInfo.java 
---
@@ -45,6 +46,16 @@
  */
 public class PojoTypeInfo<T> extends CompositeType<T> {
 
+	private final static String REGEX_FIELD = "[a-zA-Z_\\$][a-zA-Z0-9_\\$]*";
--- End diff --

Right! Will change that.




[GitHub] flink pull request: [FLINK-1328] Reworked semantic annotations

2015-01-16 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/311#issuecomment-70236149
  
Thanks for the review!

Proposed names for constant field semantic properties:
* constant fields (current)
* unmodified fields
* forwarded fields
* preserved fields

I'm leaning towards forwarded fields. Other opinions?




[GitHub] flink pull request: [FLINK-1147][Java API] TypeInference on POJOs

2015-01-16 Thread twalthr
GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/315

[FLINK-1147][Java API] TypeInference on POJOs

The TypeExtractor now also fully supports generic POJOs and tries to get 
missing types by using type inference.

Functions can look like:

```
MapFunction<PojoWithGenerics<Long, T>, PojoWithGenerics<T, T>>
MapFunction<Tuple2<E, D>, PojoTuple<E, D, D>>
MapFunction<PojoTuple<E, D, D>, Tuple2<E, D>>
```

POJOs can contain fields such as:
```
public PojoWithGenerics<Z, Z> field;
public Z[] field;
public Tuple1<Z>[] field;
```

If you have ideas for other test cases, I'm happy to implement them.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink GenericPojos

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/315.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #315


commit 391cc8fde82a38797c0355031f156f8914845f1b
Author: twalthr i...@twalthr.com
Date:   2015-01-13T22:59:35Z

[FLINK-1147][Java API] TypeInference on POJOs






[jira] [Commented] (FLINK-1409) Connected datastream functionality broken since the introduction of intermediate results

2015-01-16 Thread Gyula Fora (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280224#comment-14280224
 ] 

Gyula Fora commented on FLINK-1409:
---

Thanks, let me know if I can help in any way.

 Connected datastream functionality broken since the introduction of 
 intermediate results
 

 Key: FLINK-1409
 URL: https://issues.apache.org/jira/browse/FLINK-1409
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Gyula Fora

 The connected data stream functionality which allows joint transformations on 
 two data streams of arbitrary type is broken since Ufuk's commit which 
 introduces the intermediate results.
 The problem is most likely in the CoRecordReader which should allow 
 nonblocking read from inputs with different types. 





[GitHub] flink pull request: [FLINK-1382][java] Adds the new basic types Vo...

2015-01-16 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/299#issuecomment-70255146
  
There are test cases. I have adapted the existing ones.




[jira] [Commented] (FLINK-1409) Connected datastream functionality broken since the introduction of intermediate results

2015-01-16 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280231#comment-14280231
 ] 

Ufuk Celebi commented on FLINK-1409:


Thanks :-) I've just looked into it and I know what it is. 

 Connected datastream functionality broken since the introduction of 
 intermediate results
 

 Key: FLINK-1409
 URL: https://issues.apache.org/jira/browse/FLINK-1409
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Gyula Fora

 The connected data stream functionality which allows joint transformations on 
 two data streams of arbitrary type is broken since Ufuk's commit which 
 introduces the intermediate results.
 The problem is most likely in the CoRecordReader which should allow 
 nonblocking read from inputs with different types. 





[GitHub] flink pull request: Implement the convenience methods count and co...

2015-01-16 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/210#discussion_r23080920
  
--- Diff: flink-core/src/main/java/org/apache/flink/api/common/accumulators/ListAccumulator.java ---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.api.common.accumulators;
+
+import java.io.IOException;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutputStream;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.core.memory.DataInputView;
+import org.apache.flink.core.memory.DataInputViewStream;
+import org.apache.flink.core.memory.DataOutputView;
+import org.apache.flink.core.memory.DataOutputViewStream;
+
+public class ListAccumulator<T> implements Accumulator<T, List<T>> {
+
+	private static final long serialVersionUID = 1L;
+	
+	private ArrayList<T> localValue = new ArrayList<T>();
+	
+	private TypeSerializer<T> typeSerializer;
+	
+	public ListAccumulator() {}
+	
+	public ListAccumulator(TypeSerializer<T> ser) {
+		typeSerializer = ser;
+	}
+
+	@Override
+	public void add(T value) {
+		localValue.add(value);
+	}
+
+	@Override
+	public ArrayList<T> getLocalValue() {
+		return localValue;
+	}
+
+	@Override
+	public void resetLocal() {
+		localValue.clear();
+	}
+
+	@Override
+	public void merge(Accumulator<T, List<T>> other) {
+		localValue.addAll(((ListAccumulator<T>) other).getLocalValue());
+	}
+
+	@Override
+	public void write(DataOutputView out) throws IOException {
+		ObjectOutputStream outStream = new ObjectOutputStream(new DataOutputViewStream(out));
+		outStream.writeObject(typeSerializer);
+		outStream.flush();
--- End diff --

Let us change this to `close()`
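The reasoning behind the suggestion can be shown standalone: closing an `ObjectOutputStream` flushes it, so `close()` subsumes the explicit `flush()`. (Whether closing is acceptable in the actual `write()` method depends on whether the wrapped `DataOutputViewStream` may be closed there; the snippet below uses a plain byte buffer instead.)

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Demonstrates that ObjectOutputStream.close() flushes buffered bytes
// into the underlying stream before releasing it.
public class CloseFlushDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject("type-serializer-placeholder");
        out.close(); // flushes, then releases the stream
        System.out.println(bos.size() > 0); // true
    }
}
```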




[GitHub] flink pull request: Implement the convenience methods count and co...

2015-01-16 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/210#discussion_r23081185
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/AbstractIDTest.java ---
@@ -23,8 +23,8 @@
 import static org.junit.Assert.fail;
 
 import org.junit.Test;
-
 import org.apache.flink.runtime.testutils.CommonTestUtils;
+import org.apache.flink.util.AbstractID;
--- End diff --

Let us move this test to `flink-core`, with `AbstractID` being now part of 
the core.


---


[GitHub] flink pull request: [FLINK-1296] Add sorter support for very large...

2015-01-16 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/249#issuecomment-70257415
  
I think that this one can go into the 0.9 master now.


---


[GitHub] flink pull request: Implement the convenience methods count and co...

2015-01-16 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/210#issuecomment-70258735
  
With the scheduler and intermediate data set enhancements coming up for 0.9 
soon, this is now quite feasible to use. I suggest merging it once the inline 
comments are addressed.


---


[jira] [Commented] (FLINK-1098) flatArray() operator that converts arrays to elements

2015-01-16 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280297#comment-14280297
 ] 

Timo Walther commented on FLINK-1098:
-

What do you think about an additional method collectEach() in Collector? I 
know it is just syntactic sugar, but then we could write:

{code}
text
.flatMap((line, out) -> out.collectEach(line.toLowerCase().split("\\W+")))
.map((word) -> new Tuple2<>(word, 1))
.groupBy(0)
.sum(1);
{code}
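The proposed sugar can be sketched against a minimal stand-in for the collector interface (the real Flink `Collector` interface differs; this is purely illustrative): `collectEach` simply loops over an array and emits each element.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectEachSketch {

    // Minimal stand-in for a collector; collectEach is the proposed
    // convenience method that emits every element of an array.
    interface Collector<T> {
        void collect(T record);

        default void collectEach(T[] records) {
            for (T record : records) {
                collect(record);
            }
        }
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        Collector<String> collector = out::add;

        // Tokenize a line and emit each word, as in the WordCount example.
        collector.collectEach("Hello World Hello".toLowerCase().split("\\W+"));
        System.out.println(out); // [hello, world, hello]
    }
}
```

As a default method it would stay backward compatible with existing Collector implementations.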


 flatArray() operator that converts arrays to elements
 -

 Key: FLINK-1098
 URL: https://issues.apache.org/jira/browse/FLINK-1098
 Project: Flink
  Issue Type: New Feature
Reporter: Timo Walther
Priority: Minor

 It would be great to have an operator that converts e.g. from String[] to 
 String. Actually, it is just a flatMap over the elements of an array.
 A typical use case is a WordCount where we then could write:
 {code}
 text
 .map((line) -> line.toLowerCase().split("\\W+"))
 .flatArray()
 .map((word) -> new Tuple2<>(word, 1))
 .groupBy(0)
 .sum(1);
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1410) Integrate Flink version variables into website layout

2015-01-16 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1410:
-

 Summary: Integrate Flink version variables into website layout
 Key: FLINK-1410
 URL: https://issues.apache.org/jira/browse/FLINK-1410
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Robert Metzger


The new website layout doesn't use the variables in the website configuration.
This makes releasing versions extremely hard, because one needs to manually fix 
all the links for every version change.

The old layout respected all variables, which made releasing a new version of 
the website a matter of minutes (changing one file).
I would highly recommend fixing FLINK-1387 first.



--


[GitHub] flink pull request: [FLINK-1382][java] Adds the new basic types Vo...

2015-01-16 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/299#issuecomment-70269761
  
Oh, yes .. sorry. I need to be more careful when reviewing pull requests.

+1 to merge this.


---


[jira] [Created] (FLINK-1411) PlanVisualizer is not working

2015-01-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1411:


 Summary: PlanVisualizer is not working
 Key: FLINK-1411
 URL: https://issues.apache.org/jira/browse/FLINK-1411
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Priority: Minor


In the current master, the PlanVisualizer is no longer working. The reason is 
that the resources folder containing the web resources has been moved to the 
flink-runtime and flink-clients jar. 

Maybe we should pick up FLINK-1317 and make the PlanVisualizer accessible 
through the flink website.



--


[jira] [Comment Edited] (FLINK-655) Rename DataSet.withBroadcastSet(DataSet, String) method

2015-01-16 Thread Henry Saputra (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279683#comment-14279683
 ] 

Henry Saputra edited comment on FLINK-655 at 1/16/15 4:36 PM:
--

Currently, getBroadcastVariable returns a List instance instead of just a 
single instance.
With this refactor, do we want to change the return type too?


was (Author: hsaputra):
Currently, the getBroadcastVariable returns List instance of just an instance.
With this refactor, do we want to change the return type too?
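A toy stand-in for the runtime context (illustrative names only, not Flink's actual implementation) shows why the accessor returns a List: a broadcast set carries all elements of the broadcast DataSet, so reading it back naturally yields a collection rather than a single value.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BroadcastSketch {

    // Toy runtime context: broadcast sets are stored per name, and
    // reading one back yields all broadcast elements, which is why the
    // accessor returns a List rather than a single value.
    static class Context {
        private final Map<String, List<?>> broadcastSets = new HashMap<>();

        void setBroadcastSet(String name, List<?> elements) {
            broadcastSets.put(name, elements);
        }

        @SuppressWarnings("unchecked")
        <T> List<T> getBroadcastVariable(String name) {
            return (List<T>) broadcastSets.get(name);
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        ctx.setBroadcastSet("factors", List.of(2, 3, 5));

        List<Integer> factors = ctx.getBroadcastVariable("factors");
        System.out.println(factors.size()); // 3
    }
}
```

Any single-value convenience accessor would have to be sugar on top of this List, for example returning the first element of a one-element broadcast set.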

 Rename DataSet.withBroadcastSet(DataSet, String) method
 ---

 Key: FLINK-655
 URL: https://issues.apache.org/jira/browse/FLINK-655
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Ufuk Celebi
Assignee: Henry Saputra
  Labels: breaking-api, github-import, starter
 Fix For: pre-apache


 To broadcast a data set you have to do the following:
 ```java
 lorem.map(new MyMapper()).withBroadcastSet(toBroadcast, "toBroadcastName")
 ```
 In the operator you call:
 ```java
 getRuntimeContext().getBroadcastVariable("toBroadcastName")
 ```
 I propose to have both method names consistent, e.g.
   - `withBroadcastVariable(DataSet, String)`, or
   - `getBroadcastSet(String)`.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/655
 Created by: [uce|https://github.com/uce]
 Labels: enhancement, java api, user satisfaction, 
 Created at: Wed Apr 02 16:29:08 CEST 2014
 State: open



--


[GitHub] flink pull request: Update incubator-flink name in the merge pull ...

2015-01-16 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/313#issuecomment-70284303
  
Thanks @rmetzger, will merge this today. Not a blocker for 0.8, so I will 
not merge it into the 0.8 branch.


---


[jira] [Commented] (FLINK-655) Add support for both single and set of broadcast values

2015-01-16 Thread Henry Saputra (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280590#comment-14280590
 ] 

Henry Saputra commented on FLINK-655:
-

I changed the summary to reflect the new task for this JIRA.

 Add support for both single and set of broadcast values
 ---

 Key: FLINK-655
 URL: https://issues.apache.org/jira/browse/FLINK-655
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Ufuk Celebi
Assignee: Henry Saputra
  Labels: breaking-api, github-import, starter
 Fix For: pre-apache


 To broadcast a data set you have to do the following:
 ```java
 lorem.map(new MyMapper()).withBroadcastSet(toBroadcast, "toBroadcastName")
 ```
 In the operator you call:
 ```java
 getRuntimeContext().getBroadcastVariable("toBroadcastName")
 ```
 I propose to have both method names consistent, e.g.
   - `withBroadcastVariable(DataSet, String)`, or
   - `getBroadcastSet(String)`.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/655
 Created by: [uce|https://github.com/uce]
 Labels: enhancement, java api, user satisfaction, 
 Created at: Wed Apr 02 16:29:08 CEST 2014
 State: open



--


[jira] [Updated] (FLINK-655) Add support for both single and set of broadcast values

2015-01-16 Thread Henry Saputra (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Saputra updated FLINK-655:

Summary: Add support for both single and set of broadcast values  (was: 
Rename DataSet.withBroadcastSet(DataSet, String) method)

 Add support for both single and set of broadcast values
 ---

 Key: FLINK-655
 URL: https://issues.apache.org/jira/browse/FLINK-655
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Ufuk Celebi
Assignee: Henry Saputra
  Labels: breaking-api, github-import, starter
 Fix For: pre-apache


 To broadcast a data set you have to do the following:
 ```java
 lorem.map(new MyMapper()).withBroadcastSet(toBroadcast, "toBroadcastName")
 ```
 In the operator you call:
 ```java
 getRuntimeContext().getBroadcastVariable("toBroadcastName")
 ```
 I propose to have both method names consistent, e.g.
   - `withBroadcastVariable(DataSet, String)`, or
   - `getBroadcastSet(String)`.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/655
 Created by: [uce|https://github.com/uce]
 Labels: enhancement, java api, user satisfaction, 
 Created at: Wed Apr 02 16:29:08 CEST 2014
 State: open



--


[GitHub] flink pull request: [FLINK-1369] [types] Add support for Subclasse...

2015-01-16 Thread fhueske
GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/316

[FLINK-1369] [types] Add support for Subclasses, Interfaces, Abstract 
Classes

This PR rebased PR #236 to the current master.
Some tests were failing and I had a closer look. The original PR handled 
interfaces and abstract classes without member variables as POJO types. 
However, POJO types without members cannot be handled correctly and also do 
not have members that can be referenced as keys or fields.
I changed the logic such that interfaces and abstract classes without 
members are handled as GenericTypes.

I'm not so familiar with the TypeExtractor, so it would be good if someone 
else could have a quick look.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink aljoscha-subclass-types

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/316.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #316


commit 8f20c0df045db820f69aa18ced8658055e131081
Author: Aljoscha Krettek aljoscha.kret...@gmail.com
Date:   2014-11-26T12:27:06Z

[FLINK-1369] [types] Add support for Subclasses, Interfaces, Abstract 
Classes.
- Abstract classes with fields are handled as POJO types.
- Interfaces and abstract classes without fields are handled as generic 
types.

This closes #236
This closes #316




---


[GitHub] flink pull request: [FLINK-1369] [types] Add support for Subclasse...

2015-01-16 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/316#issuecomment-70348260
  
I will take a look at it.


---