Re: Travis-CI builds queuing up

2015-03-26 Thread Fabian Hueske
Great!
Thanks Robert for sharing the good news :-)

2015-03-26 9:08 GMT+01:00 Robert Metzger rmetz...@apache.org:

 Travis replied to me with very good news: somebody from INFRA was asking the
 same question around the same time as I did, and Travis is working on adding
 more build capacity for the apache github organization.
 I hope we'll soon have quicker builds again.

 On Tue, Mar 24, 2015 at 4:42 PM, Henry Saputra henry.sapu...@gmail.com
 wrote:

  That's a good idea.
 
  It would be good to have a mix: stable Apache Jenkins for master
  and PRs, and Travis for individual forks.
 
  - Henry
 
  On Tue, Mar 24, 2015 at 8:03 AM, Maximilian Michels m...@apache.org
  wrote:
   Hey!
  
   I would also like to continue using Travis but the current situation is not
   acceptable because we practically can't use Travis anymore for pull
   requests or the current master. If it cannot be resolved then I think we
   should move on.
  
   The builds service team [1] at Apache offers Jenkins [2] for continuous
   integration. I think it should be fairly simple to set up. We could still
   use Travis in our forked repositories but have a reliable CI solution for
   the master and pull requests.
  
   Max
  
   [1] https://builds.apache.org/
   [2] http://jenkins-ci.org
  
   On Tue, Mar 24, 2015 at 3:46 PM, Márton Balassi 
  balassi.mar...@gmail.com
   wrote:
  
   I also like the travis infrastructure. Thanks for bringing this up and
   reaching out to the travis guys.
  
   On Tue, Mar 24, 2015 at 3:38 PM, Robert Metzger rmetz...@apache.org
   wrote:
  
Hi guys,
   
the build queue on travis is getting very, very long. It seems that it takes
4 days now until commits to master are built. The nightly builds from the
website and the maven snapshots are also delayed by that.
Right now, there are 33 pull request builds scheduled
(https://travis-ci.org/apache/flink/pull_requests), and 8 builds on master:
https://travis-ci.org/apache/flink/builds.

The problem is that travis accounts are per github user. In our case, the
user is apache, so all ASF projects that have travis enabled share 5
concurrent builders.

I would actually like to continue using Travis.

The easiest option is probably asking travis if they can give the apache
user more build capacity.

If that's not possible, we have to look into other options.
   
   
I'm going to ask Travis if they can do anything about it.
   
Robert
   
  
 



Re: Master mvn test is broken with java.lang.NoSuchFieldError: IBM_JAVA error

2015-03-26 Thread Robert Metzger
I suspect this error only happens once in a while. We didn't change
anything in these tests recently.
Your PR for fixing this issue looks good; maybe it fixes the problem.

On Thu, Mar 26, 2015 at 3:15 AM, Henry Saputra henry.sapu...@gmail.com
wrote:

 Hi All,

 I just pulled from master and it seems that mvn test fails:


 ---
  T E S T S
 ---

 ---
  T E S T S
 ---
 Running org.apache.flink.tachyon.HDFSTest
 Running org.apache.flink.tachyon.TachyonFileSystemWrapperTest
 java.lang.NoSuchFieldError: IBM_JAVA
     at org.apache.hadoop.security.UserGroupInformation.getOSLoginModuleName(UserGroupInformation.java:303)
     at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:348)
     at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:807)
     at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:266)
     at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:122)
     at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:775)
     at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:642)
     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:334)
     at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:316)
     at org.apache.flink.tachyon.HDFSTest.createHDFS(HDFSTest.java:62)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
     at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
     at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
     at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
     at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
 java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.UserGroupInformation
     at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:807)
     at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:266)
     at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:122)
     at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:775)
     at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:642)
     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:334)
     at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:316)
     at org.apache.flink.tachyon.HDFSTest.createHDFS(HDFSTest.java:62)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
 at
 

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Robert Metzger
Two weeks have passed since we last discussed the 0.9 release.

ApacheCon is 18 days from now.
If we want, we can also publish a 0.9.0-beta release that contains known
bugs, but allows our users to try out the new features easily (because they
are part of a release). The vote for such a release would be mainly about
the legal aspects of the release rather than its stability, so I suspect
that the vote will go through much more quickly.



On Fri, Mar 13, 2015 at 12:01 PM, Robert Metzger rmetz...@apache.org
wrote:

 I've reopened https://issues.apache.org/jira/browse/FLINK-1650 because
 the issue is still occurring.

 On Thu, Mar 12, 2015 at 7:05 PM, Ufuk Celebi u...@apache.org wrote:

 On Thursday, March 12, 2015, Till Rohrmann till.rohrm...@gmail.com
 wrote:

  Have you run the 20 builds with the new shading code? With new shading the
  TaskManagerFailsITCase should no longer fail. If it still does, then we
  have to look into it again.


 No, rebased on Monday before shading. Let me rebase and rerun tonight.





Re: Validate (commons) versus checkArgument (guava)

2015-03-26 Thread Robert Metzger
I created a starter task JIRA for this.
https://issues.apache.org/jira/browse/FLINK-1787


On Sun, Mar 8, 2015 at 3:23 PM, Aljoscha Krettek aljos...@apache.org
wrote:

 +1 I also tend to use guava.

 On Sun, Mar 8, 2015 at 3:21 PM, Ufuk Celebi u...@apache.org wrote:
 
  On 08 Mar 2015, at 15:05, Stephan Ewen se...@apache.org wrote:
 
  Different parts of the code currently use different utilities to validate
  the arguments.
 
   - Some parts use Guava (checkNotNull, checkArgument)
   - Other parts use Validate from Apache commons-lang(3).
 
  How about we use one consistently, at least for all new code additions?
 
  In choosing one, I have a slight bias towards Guava, which has more/nicer
  methods and seems more popular in other projects (I have no source to back
  this up, it is a gut feeling from what I have seen in other projects that I
  looked into)
 
  +1 I'm always using Guava for the same reasons.



Re: Validate (commons) versus checkArgument (guava)

2015-03-26 Thread Robert Metzger
I didn't know that there was already an issue for this. I closed FLINK-1787.
The correct issue is this one:
https://issues.apache.org/jira/browse/FLINK-1711


Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Markl, Volker, Prof. Dr.
+Table

I also agree with that line of argument (think SQL ;-) )

-Original Message-
From: Timo Walther [mailto:twal...@apache.org]
Sent: Thursday, 26 March 2015 09:28
To: dev@flink.apache.org
Subject: Re: [VOTE] Name of Expression API Representation

+Table API

Same thoughts as Stephan. Table is more common in the economy than Relation.

On 25.03.2015 21:30, Stephan Ewen wrote:
 +Table API / Table

 I have a feeling that Relation is a name mostly used by people with a
 deeper background in (relational) databases, while table is more the
 pragmatic developer term. (As a reason for my choice)

 On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote:

 I think the voting scheme is clear.
 The mail that started the thread says:

 The name with the most votes is chosen.
 If the vote ends with no name having the most votes, a new vote with 
 an alternative voting scheme will be done.

 So let's go with a single vote and handle corner cases as they appear.

 2015-03-25 20:24 GMT+01:00 Ufuk Celebi u...@apache.org:

 +Table, DataTable

 ---

 How are votes counted? When voting for the name of the project, we 
 didn't vote for one name, but gave a preference ordering.

 In this case, I am for Table or DataTable, but what happens if I 
 vote for Table and then there is a tie between DataTable and 
 Relation? Will Table count for DataTable then?

 – Ufuk

 On 25 Mar 2015, at 18:33, Vasiliki Kalavri 
 vasilikikala...@gmail.com
 wrote:

 +Relation
 On Mar 25, 2015 6:29 PM, Henry Saputra henry.sapu...@gmail.com
 wrote:
 +Relation

 PS
 Aljoscha, don't forget to cast your own vote :)


 On Wednesday, March 25, 2015, Aljoscha Krettek 
 aljos...@apache.org
 wrote:

 Please vote on the new name of the equivalent to DataSet and 
 DataStream in the new expression-based API.

  From the previous discussion thread three names emerged: 
 Relation, Table and DataTable.

 The vote is open for the next 72 hours.
 The name with the most votes is chosen.
 If the vote ends with no name having the most votes, a new vote 
 with an alternative voting scheme will be done.

 Please vote for one of these:

 +Relation
 +Table
 +DataTable
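
 The counting rule stated in the vote mail (the name with the most votes
 wins; if no single name has the most votes, a new vote is held) can be
 sketched as a small tally. This is only an illustration, not something from
 the thread: the class name PluralityTally and the ballots in main are
 hypothetical, and it assumes exactly one name per ballot, so it does not
 settle how a two-name ballot such as "+Table, DataTable" should count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class PluralityTally {

    /**
     * Returns the single name with the most votes, or Optional.empty()
     * when the highest count is shared (or there are no ballots), which
     * under the stated rule would trigger a new vote.
     */
    public static Optional<String> winner(List<String> ballots) {
        // Count one vote per ballot; TreeMap only keeps iteration order stable.
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : ballots) {
            counts.merge(name, 1, Integer::sum);
        }

        int best = 0;
        String leader = null;
        boolean tied = false;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int votes = entry.getValue();
            if (votes > best) {
                best = votes;
                leader = entry.getKey();
                tied = false;       // a new strict maximum clears earlier ties
            } else if (votes == best) {
                tied = true;        // another name shares the current maximum
            }
        }
        return (leader == null || tied) ? Optional.empty() : Optional.of(leader);
    }

    public static void main(String[] args) {
        // Hypothetical ballots, not the actual result of this vote.
        List<String> ballots = Arrays.asList("Table", "Relation", "Table", "DataTable");
        System.out.println(winner(ballots).orElse("no single winner, new vote needed"));
    }
}
```

 An empty result is what "no name having the most votes" means here, i.e. a
 tie for first place.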





Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Robert Metzger
+Table


On Thu, Mar 26, 2015 at 10:13 AM, Aljoscha Krettek aljos...@apache.org
wrote:

 Thanks Henry. :D

 +Relation

 On Thu, Mar 26, 2015 at 9:36 AM, Till Rohrmann trohrm...@apache.org
 wrote:
  +Table
 
  On Thu, Mar 26, 2015 at 9:32 AM, Márton Balassi 
 balassi.mar...@gmail.com
  wrote:
 
  +DataTable
 
  On Thu, Mar 26, 2015 at 9:29 AM, Markl, Volker, Prof. Dr. 
  volker.ma...@tu-berlin.de wrote:
 
   +Table
  
   I also agree with that line of argument (think SQL ;-) )
  
   -Original Message-
   From: Timo Walther [mailto:twal...@apache.org]
   Sent: Thursday, 26 March 2015 09:28
   To: dev@flink.apache.org
   Subject: Re: [VOTE] Name of Expression API Representation
  
   +Table API
  
   Same thoughts as Stephan. Table is more common in the economy than
   Relation.
  
   On 25.03.2015 21:30, Stephan Ewen wrote:
+Table API / Table
   
I have a feeling that Relation is a name mostly used by people with a
deeper background in (relational) databases, while table is more the
pragmatic developer term. (As a reason for my choice)

On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote:
   
I think the voting scheme is clear.
The mail that started the thread says:
   
The name with the most votes is chosen.
If the vote ends with no name having the most votes, a new vote
 with
an alternative voting scheme will be done.
   
So let's go with a single vote and handle corner cases as they
 appear.
   
2015-03-25 20:24 GMT+01:00 Ufuk Celebi u...@apache.org:
   
+Table, DataTable
   
---
   
How are votes counted? When voting for the name of the project, we
didn't vote for one name, but gave a preference ordering.
   
In this case, I am for Table or DataTable, but what happens if I
vote for Table and then there is a tie between DataTable and
Relation? Will Table count for DataTable then?
   
– Ufuk
   
On 25 Mar 2015, at 18:33, Vasiliki Kalavri
vasilikikala...@gmail.com
wrote:
   
+Relation
On Mar 25, 2015 6:29 PM, Henry Saputra 
 henry.sapu...@gmail.com
wrote:
+Relation
   
PS
Aljoscha, don't forget to cast your own vote :)
   
   
On Wednesday, March 25, 2015, Aljoscha Krettek
aljos...@apache.org
wrote:
   
Please vote on the new name of the equivalent to DataSet and
DataStream in the new expression-based API.
   
 From the previous discussion thread three names emerged:
Relation, Table and DataTable.
   
The vote is open for the next 72 hours.
The name with the most votes is chosen.
If the vote ends with no name having the most votes, a new vote
with an alternative voting scheme will be done.
   
Please vote either of these:
   
+Relation
+Table
+DataTable
   
   
  
  
 



Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Márton Balassi
+DataTable

On Thu, Mar 26, 2015 at 9:29 AM, Markl, Volker, Prof. Dr. 
volker.ma...@tu-berlin.de wrote:

 +Table

 I also agree with that line of argument (think SQL ;-) )

 -Original Message-
 From: Timo Walther [mailto:twal...@apache.org]
 Sent: Thursday, 26 March 2015 09:28
 To: dev@flink.apache.org
 Subject: Re: [VOTE] Name of Expression API Representation

 +Table API

 Same thoughts as Stephan. Table is more common in the economy than
 Relation.

 On 25.03.2015 21:30, Stephan Ewen wrote:
  +Table API / Table
 
  I have a feeling that Relation is a name mostly used by people with a
  deeper background in (relational) databases, while table is more the
  pragmatic developer term. (As a reason for my choice)

  On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote:
 
  I think the voting scheme is clear.
  The mail that started the thread says:
 
  The name with the most votes is chosen.
  If the vote ends with no name having the most votes, a new vote with
  an alternative voting scheme will be done.
 
  So let's go with a single vote and handle corner cases as they appear.
 
  2015-03-25 20:24 GMT+01:00 Ufuk Celebi u...@apache.org:
 
  +Table, DataTable
 
  ---
 
  How are votes counted? When voting for the name of the project, we
  didn't vote for one name, but gave a preference ordering.
 
  In this case, I am for Table or DataTable, but what happens if I
  vote for Table and then there is a tie between DataTable and
  Relation? Will Table count for DataTable then?
 
  – Ufuk
 
  On 25 Mar 2015, at 18:33, Vasiliki Kalavri
  vasilikikala...@gmail.com
  wrote:
 
  +Relation
  On Mar 25, 2015 6:29 PM, Henry Saputra henry.sapu...@gmail.com
  wrote:
  +Relation
 
  PS
  Aljoscha, don't forget to cast your own vote :)
 
 
  On Wednesday, March 25, 2015, Aljoscha Krettek
  aljos...@apache.org
  wrote:
 
  Please vote on the new name of the equivalent to DataSet and
  DataStream in the new expression-based API.
 
   From the previous discussion thread three names emerged:
  Relation, Table and DataTable.
 
  The vote is open for the next 72 hours.
  The name with the most votes is chosen.
  If the vote ends with no name having the most votes, a new vote
  with an alternative voting scheme will be done.
 
  Please vote either of these:
 
  +Relation
  +Table
  +DataTable
 
 




Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Till Rohrmann
+Table

On Thu, Mar 26, 2015 at 9:32 AM, Márton Balassi balassi.mar...@gmail.com
wrote:

 +DataTable

 On Thu, Mar 26, 2015 at 9:29 AM, Markl, Volker, Prof. Dr. 
 volker.ma...@tu-berlin.de wrote:

  +Table
 
  I also agree with that line of argument (think SQL ;-) )
 
  -Original Message-
  From: Timo Walther [mailto:twal...@apache.org]
  Sent: Thursday, 26 March 2015 09:28
  To: dev@flink.apache.org
  Subject: Re: [VOTE] Name of Expression API Representation
 
  +Table API
 
  Same thoughts as Stephan. Table is more common in the economy than
  Relation.
 
  On 25.03.2015 21:30, Stephan Ewen wrote:
   +Table API / Table
  
   I have a feeling that Relation is a name mostly used by people with a
   deeper background in (relational) databases, while table is more the
   pragmatic developer term. (As a reason for my choice)

   On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote:
  
   I think the voting scheme is clear.
   The mail that started the thread says:
  
   The name with the most votes is chosen.
   If the vote ends with no name having the most votes, a new vote with
   an alternative voting scheme will be done.
  
   So let's go with a single vote and handle corner cases as they appear.
  
   2015-03-25 20:24 GMT+01:00 Ufuk Celebi u...@apache.org:
  
   +Table, DataTable
  
   ---
  
   How are votes counted? When voting for the name of the project, we
   didn't vote for one name, but gave a preference ordering.
  
   In this case, I am for Table or DataTable, but what happens if I
   vote for Table and then there is a tie between DataTable and
   Relation? Will Table count for DataTable then?
  
   – Ufuk
  
   On 25 Mar 2015, at 18:33, Vasiliki Kalavri
   vasilikikala...@gmail.com
   wrote:
  
   +Relation
   On Mar 25, 2015 6:29 PM, Henry Saputra henry.sapu...@gmail.com
   wrote:
   +Relation
  
   PS
   Aljoscha, don't forget to cast your own vote :)
  
  
   On Wednesday, March 25, 2015, Aljoscha Krettek
   aljos...@apache.org
   wrote:
  
   Please vote on the new name of the equivalent to DataSet and
   DataStream in the new expression-based API.
  
From the previous discussion thread three names emerged:
   Relation, Table and DataTable.
  
   The vote is open for the next 72 hours.
   The name with the most votes is chosen.
   If the vote ends with no name having the most votes, a new vote
   with an alternative voting scheme will be done.
  
   Please vote either of these:
  
   +Relation
   +Table
   +DataTable
  
  
 
 



[jira] [Created] (FLINK-1786) Add support for pipelined programs with slot count exceeding parallelism

2015-03-26 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1786:
--

 Summary: Add support for pipelined programs with slot count 
exceeding parallelism
 Key: FLINK-1786
 URL: https://issues.apache.org/jira/browse/FLINK-1786
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, JobManager
Affects Versions: master
Reporter: Ufuk Celebi


We support slot count exceeding parallelism with blocking results and w/o slot 
sharing (FLINK-1709).

We need to add support for this with pipelined results as well (w/o slot 
sharing). The runtime needs support for mixed pipelined and blocking 
subpartitions and the job manager needs to be aware of how to deploy such mixed 
results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Travis-CI builds queuing up

2015-03-26 Thread Maximilian Michels
That's nice to hear. Did they specify a time frame?

On Thu, Mar 26, 2015 at 9:25 AM, Fabian Hueske fhue...@gmail.com wrote:

 Great!
 Thanks Robert for sharing the good news :-)

 2015-03-26 9:08 GMT+01:00 Robert Metzger rmetz...@apache.org:

  Travis replied me with very good news: Somebody from INFRA was asking the
  same question around the same time as I did and Travis is working on
 adding
  more build capacity for the apache github organization.
  I hope we'll soon have quicker builds again.
 
  On Tue, Mar 24, 2015 at 4:42 PM, Henry Saputra henry.sapu...@gmail.com
  wrote:
 
   That's good idea.
  
   Should be good to have mix of stable with Apache Jenkins for master
   and PRs, and Travis for individual forks.
  
   - Henry
  
   On Tue, Mar 24, 2015 at 8:03 AM, Maximilian Michels m...@apache.org
   wrote:
Hey!
   
I would also like to continue using Travis but the current situation
 is
   not
acceptable because we practically can't use Travis anymore for pull
requests or the current master. If it cannot be resolved then I think
  we
should move on.
   
The builds service team [1] at Apache offers Jenkins [2] for
 continuous
integration. I think it should be fairly simple to set up. We could
  still
use Travis in our forked repositories but have a reliable CI solution
  for
the master and pull requests.
   
Max
   
[1] https://builds.apache.org/
[2] http://jenkins-ci.org
   
On Tue, Mar 24, 2015 at 3:46 PM, Márton Balassi 
   balassi.mar...@gmail.com
wrote:
   
I also like the travis infrastucture. Thanks for bringing this up
 and
reaching out to the travis guys.
   
On Tue, Mar 24, 2015 at 3:38 PM, Robert Metzger 
 rmetz...@apache.org
wrote:
   
 Hi guys,

 the build queue on travis is getting very very long. It seems that
  it
takes
 4 days now until commits to master are build. The nightly builds
  from
   the
 website and the maven snapshots are also delayed by that.
 Right now,  there are 33 pull request builds scheduled (
 https://travis-ci.org/apache/flink/pull_requests), and 8 builds
 on
master:
 https://travis-ci.org/apache/flink/builds.

 The problem is that travis accounts are per github user. In our
  case,
   the
 user is apache, so all ASF projects that have travis enabled
  share 5
 concurrent builders.

 I would actually like to continue using Travis.

 The easiest option is probably asking travis if they can give the
apache
 user more build capacity.

 If thats not possible, we have to look into other options.


 I'm going to ask Travis if they can do anything about it.

 Robert

   
  
 



Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Timo Walther

+Table API

Same thoughts as Stephan. Table is more common in industry than Relation.

On 25.03.2015 21:30, Stephan Ewen wrote:

+Table API / Table

I have a feeling that Relation is a name mostly used by people with a
deeper background in (relational) databases, while table is more the
pragmatic developer term. (As a reason for my choice)
On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote:


I think the voting scheme is clear.
The mail that started the thread says:

The name with the most votes is chosen.
If the vote ends with no name having the most votes, a new vote
with an alternative voting scheme will be done.

So let's go with a single vote and handle corner cases as they appear.

2015-03-25 20:24 GMT+01:00 Ufuk Celebi u...@apache.org:


+Table, DataTable

---

How are votes counted? When voting for the name of the project, we didn't
vote for one name, but gave a preference ordering.

In this case, I am for Table or DataTable, but what happens if I vote for
Table and then there is a tie between DataTable and Relation? Will Table
count for DataTable then?

– Ufuk

On 25 Mar 2015, at 18:33, Vasiliki Kalavri vasilikikala...@gmail.com
wrote:


+Relation
On Mar 25, 2015 6:29 PM, Henry Saputra henry.sapu...@gmail.com

wrote:

+Relation

PS
Aljoscha, don't forget to cast your own vote :)


On Wednesday, March 25, 2015, Aljoscha Krettek aljos...@apache.org
wrote:


Please vote on the new name of the equivalent to DataSet and
DataStream in the new expression-based API.

 From the previous discussion thread three names emerged: Relation,
Table and DataTable.

The vote is open for the next 72 hours.
The name with the most votes is chosen.
If the vote ends with no name having the most votes, a new vote
with an alternative voting scheme will be done.

Please vote either of these:

+Relation
+Table
+DataTable







Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Aljoscha Krettek
Thanks Henry. :D

+Relation

On Thu, Mar 26, 2015 at 9:36 AM, Till Rohrmann trohrm...@apache.org wrote:
 +Table

 On Thu, Mar 26, 2015 at 9:32 AM, Márton Balassi balassi.mar...@gmail.com
 wrote:

 +DataTable

 On Thu, Mar 26, 2015 at 9:29 AM, Markl, Volker, Prof. Dr. 
 volker.ma...@tu-berlin.de wrote:

  +Table
 
  I also agree with that line of argument (think SQL ;-) )
 
  -Original Message-
  From: Timo Walther [mailto:twal...@apache.org]
  Sent: Thursday, 26 March 2015 09:28
  To: dev@flink.apache.org
  Subject: Re: [VOTE] Name of Expression API Representation
 
  +Table API
 
  Same thoughts as Stephan. Table is more common in the economy than
  Relation.
 
  On 25.03.2015 21:30, Stephan Ewen wrote:
   +Table API / Table
  
   I have a feeling that Relation is a name mostly used by people with a
   deeper background in (relational) databases, while table is more the
   pragmatic developer term. (As a reason for my choice)

   On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote:
  
   I think the voting scheme is clear.
   The mail that started the thread says:
  
   The name with the most votes is chosen.
   If the vote ends with no name having the most votes, a new vote with
   an alternative voting scheme will be done.
  
   So let's go with a single vote and handle corner cases as they appear.
  
   2015-03-25 20:24 GMT+01:00 Ufuk Celebi u...@apache.org:
  
   +Table, DataTable
  
   ---
  
   How are votes counted? When voting for the name of the project, we
   didn't vote for one name, but gave a preference ordering.
  
   In this case, I am for Table or DataTable, but what happens if I
   vote for Table and then there is a tie between DataTable and
   Relation? Will Table count for DataTable then?
  
   – Ufuk
  
   On 25 Mar 2015, at 18:33, Vasiliki Kalavri
   vasilikikala...@gmail.com
   wrote:
  
   +Relation
   On Mar 25, 2015 6:29 PM, Henry Saputra henry.sapu...@gmail.com
   wrote:
   +Relation
  
   PS
   Aljoscha, don't forget to cast your own vote :)
  
  
   On Wednesday, March 25, 2015, Aljoscha Krettek
   aljos...@apache.org
   wrote:
  
   Please vote on the new name of the equivalent to DataSet and
   DataStream in the new expression-based API.
  
From the previous discussion thread three names emerged:
   Relation, Table and DataTable.
  
   The vote is open for the next 72 hours.
   The name with the most votes is chosen.
   If the vote ends with no name having the most votes, a new vote
   with an alternative voting scheme will be done.
  
   Please vote either of these:
  
   +Relation
   +Table
   +DataTable
  
  
 
 



Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Kostas Tzoumas
+1 for an early milestone release. Perhaps we can call it 0.9-milestone or
so?

On Thu, Mar 26, 2015 at 11:01 AM, Robert Metzger rmetz...@apache.org
wrote:

 Two weeks have passed since we've discussed the 0.9 release the last time.

 The ApacheCon is in 18 days from now.
 If we want, we can also release a 0.9.0-beta release that contains known
 bugs, but allows our users to try out the new features easily (because they
 are part of a release). The vote for such a release would be mainly about
 the legal aspects of the release rather than the stability. So I suspect
 that the vote will go through much quicker.



 On Fri, Mar 13, 2015 at 12:01 PM, Robert Metzger rmetz...@apache.org
 wrote:

  I've reopened https://issues.apache.org/jira/browse/FLINK-1650 because
  the issue is still occurring.
 
  On Thu, Mar 12, 2015 at 7:05 PM, Ufuk Celebi u...@apache.org wrote:
 
  On Thursday, March 12, 2015, Till Rohrmann till.rohrm...@gmail.com
  wrote:
 
   Have you run the 20 builds with the new shading code? With new shading
  the
   TaskManagerFailsITCase should no longer fail. If it still does, then
 we
   have to look into it again.
 
 
  No, rebased on Monday before shading. Let me rebase and rerun tonight.
 
 
 



Re: UDP support in Streaming API

2015-03-26 Thread Janani Chakkaradhari
Hi Stephan,

Yes, you are right. I will try writing a custom data source as you
mentioned. Also, I still need to check whether our system can use
Apache Kafka as a broker. Could you point out the downsides of using
UDP as a source for streaming data?

Thanks,
Janani

On Wed, Mar 25, 2015 at 9:44 PM, Stephan Ewen se...@apache.org wrote:

 Hi Janani!

 Do I understand you correctly that you want a Flink stream source that
 receives UDP datagrams and turns them into a Flink DataStream?

 Such a thing is not in there yet. The interface to define custom data
 sources is rather simple, though, so it should be possible to add something
 like this.

 Two things are not clear to me however:

 1) Where are the datagrams sent to? That would need to be the point where
 the data source runs.

 2) You need a way of turning the datagrams (which are just bytes) into
 records. Would that be hard-wired in your case?

 So, while probably possible, I would guess a UDP source has quite a few
 downsides over a message queue as a source. Using something like Kafka to
 communicate the source data is easier and better recoverable.

 Does your setup allow putting the data into Kafka and having Flink read the
 streams from Kafka?

 Greetings,
 Stephan
 On 25.03.2015 19:20, Janani Chakkaradhari janani.cs...@gmail.com wrote:

  ​Hi,
 
  Does Flink's streaming API have support for reading streams of data via
  Java UDP (DatagramSocket)? If so, kindly advise me in which release of
  Flink I can find it.
 
  Thanks,
  Janani
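
The two open points raised in this thread (where the datagrams are sent to,
and how raw bytes are turned into records) can be made concrete with a
minimal plain-Java sketch. It is not Flink API and does not come from the
thread: the class name UdpRecordReceiver, the choice to decode each datagram
as one UTF-8 text record, and the loopback demo in main are all assumptions
for illustration. A custom Flink data source would presumably wrap a receive
loop of this kind.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class UdpRecordReceiver {

    /** Turns one datagram payload into a record; here: one UTF-8 string per datagram. */
    public static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.UTF_8);
    }

    /** Blocks until maxRecords datagrams have arrived on the socket (or it times out). */
    public static List<String> receive(DatagramSocket socket, int maxRecords) throws IOException {
        List<String> records = new ArrayList<>();
        byte[] buffer = new byte[65507]; // largest possible UDP payload over IPv4
        for (int i = 0; i < maxRecords; i++) {
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            socket.receive(packet);      // the bound address/port is "where the source runs"
            records.add(decode(packet)); // copy out before the buffer is reused
        }
        return records;
    }

    public static void main(String[] args) throws Exception {
        // Demo on the loopback interface: port 0 lets the OS pick a free port.
        try (DatagramSocket receiver = new DatagramSocket(0, InetAddress.getLoopbackAddress());
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // UDP gives no delivery guarantee, so do not wait forever
            byte[] payload = "sensor-1,42".getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length,
                    receiver.getLocalAddress(), receiver.getLocalPort()));
            System.out.println(receive(receiver, 1));
        }
    }
}
```

The timeout in main hints at the recoverability concern mentioned in the
thread: a lost datagram is simply gone, whereas a message queue can replay it.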
 



Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Paris Carbone
+1 for an early release. It will help unblock the samoa PR that has 0.9 
dependencies.

 On 26 Mar 2015, at 11:44, Kostas Tzoumas ktzou...@apache.org wrote:
 
 +1 for an early milestone release. Perhaps we can call it 0.9-milestone or
 so?
 
 On Thu, Mar 26, 2015 at 11:01 AM, Robert Metzger rmetz...@apache.org
 wrote:
 
 Two weeks have passed since we've discussed the 0.9 release the last time.
 
 The ApacheCon is in 18 days from now.
 If we want, we can also release a 0.9.0-beta release that contains known
 bugs, but allows our users to try out the new features easily (because they
 are part of a release). The vote for such a release would be mainly about
 the legal aspects of the release rather than the stability. So I suspect
 that the vote will go through much quicker.
 
 
 
 On Fri, Mar 13, 2015 at 12:01 PM, Robert Metzger rmetz...@apache.org
 wrote:
 
 I've reopened https://issues.apache.org/jira/browse/FLINK-1650 because
 the issue is still occurring.
 
 On Thu, Mar 12, 2015 at 7:05 PM, Ufuk Celebi u...@apache.org wrote:
 
 On Thursday, March 12, 2015, Till Rohrmann till.rohrm...@gmail.com
 wrote:
 
 Have you run the 20 builds with the new shading code? With new shading
 the
 TaskManagerFailsITCase should no longer fail. If it still does, then
 we
 have to look into it again.
 
 
 No, rebased on Monday before shading. Let me rebase and rerun tonight.
 
 
 
 



Re: GSoC proposal

2015-03-26 Thread Gábor Gévay
Hello,

Thank you very much for your comments! I will remove the part about
the windowing optimizations (though, that was my favourite part :) ),
and think about what other statistics could be added. And thank you
for the link with the collection of many relevant algorithms, they are
very interesting!

Best regards,
Gabor



2015-03-26 17:35 GMT-05:00 Paris Carbone par...@kth.se:
 Hi Gabor,

 Approximate statistics is a really good topic, I think there is a lot to do 
 if you focus there. One idea would also be to include some of your 
 contributions to the incremental machine learning library that will be 
 available by June. From there you will be able to also use sampling and 
 stream mining primitives out-of-the-box among others. Regarding window 
 optimisations, as Gyula said, there is not much to do simply because we are 
 working heavily on it already. Good luck and thanks for the proposal!

 Paris

 On 26 Mar 2015, at 22:59, Gyula Fóra gyula.f...@gmail.com wrote:

 Hey Gabor,

 Thank you for the proposal. It has many interesting ideas and a good
 potential.

 My comments:

 We already have a large amount of ongoing work on the windowing
 optimizations, covering your suggestions in section 1. It would be better
 to drop that part from the project because that's very heavily on the
 research side and as I said we are working on this at SICS.

 I like the list that you made for section 2., and this should be the main
 emphasis of the project. It would indeed be very nice to have a wide range
 of statistics that we can compute (or approximate - this should be optional
 though) on streams and windows (maybe we should also add some practical
 stuff like top-k, distinct etc).

 Here is a list of interesting papers that seems to be related to this
 project

 https://gist.github.com/debasishg/8172796

 Cheers,
 Gyula

 On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I will be applying to the Google Summer of Code, and I wrote most of
 the proposal:
 http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
 I would appreciate it if you could comment on it.

 Gyula Fora, git blame is telling me that you wrote most of the
 relevant parts of the windowing code, so I would be especially
 interested in what you think of my improvement ideas.

 Best regards,
 Gabor




Re: ApacheCon 2015 is coming to Austin, Texas, USA

2015-03-26 Thread Stephan Ewen
I think you meant Fabian ;-)

On Wed, Mar 25, 2015 at 4:05 PM, Henry Saputra henry.sapu...@gmail.com
wrote:

 Hi Stephan,

 Glad to meet and chat for sure =)

 Love to see Flink represented in the ApacheCon.

 - Henry

 On Wed, Mar 25, 2015 at 3:12 AM, Fabian Hueske fhue...@gmail.com wrote:
  Thanks Henry for sharing!
 
  I will be in Austin and give a talk on Flink [1].
  Just ping me if you'd like to meet and chat :-)
 
  Cheers, Fabian
 
  [1] http://sched.co/2P9s
 
  2015-03-25 1:11 GMT+01:00 Henry Saputra henry.sapu...@gmail.com:
 
  Dear Apache Flink enthusiast,
 
  In just a few weeks, we'll be holding ApacheCon in Austin, Texas, and
  we'd love to have you in attendance. You can save $300 on admission by
  registering NOW, since the early bird price ends on the 21st.
 
  Register at http://s.apache.org/acna2015-reg
 
  ApacheCon this year celebrates the 20th birthday of the Apache HTTP
  Server, and we'll have Brian Behlendorf, who started this whole thing,
  keynoting for us, and you'll have a chance to meet some of the
  original Apache Group, who will be there to celebrate with us.
 
  We've got 7 tracks of great talks, as well as BOFs, the Apache
  BarCamp, project-specific hack events, and evening events where you
  can deepen your connection with the larger Apache community. See the
  full schedule at http://apacheconna2015.sched.org/
 
  And if you have any questions, comments, or just want to hang out with
  us before and during the event, follow us on Twitter - @apachecon - or
  drop by #apachecon on the Freenode IRC network.
 
  Hope to see you in Austin!
 
  - Henry
 
 



Re: ApacheCon 2015 is coming to Austin, Texas, USA

2015-03-26 Thread Henry Saputra
Oh my goodness, I am so sorry Fabian =(

I sent the email out in the morning before I hit my coffee.

Looking forward to meeting you at ApacheCon, Fabian =)

- Henry

On Thu, Mar 26, 2015 at 10:26 AM, Stephan Ewen se...@apache.org wrote:
 I think you meant Fabian ;-)

 On Wed, Mar 25, 2015 at 4:05 PM, Henry Saputra henry.sapu...@gmail.com
 wrote:

 Hi Stephan,

 Glad to meet and chat for sure =)

 Love to see Flink represented in the ApacheCon.

 - Henry

 On Wed, Mar 25, 2015 at 3:12 AM, Fabian Hueske fhue...@gmail.com wrote:
  Thanks Henry for sharing!
 
  I will be in Austin and give a talk on Flink [1].
  Just ping me if you'd like to meet and chat :-)
 
  Cheers, Fabian
 
  [1] http://sched.co/2P9s
 
  2015-03-25 1:11 GMT+01:00 Henry Saputra henry.sapu...@gmail.com:
 
  Dear Apache Flink enthusiast,
 
  In just a few weeks, we'll be holding ApacheCon in Austin, Texas, and
  we'd love to have you in attendance. You can save $300 on admission by
  registering NOW, since the early bird price ends on the 21st.
 
  Register at http://s.apache.org/acna2015-reg
 
  ApacheCon this year celebrates the 20th birthday of the Apache HTTP
  Server, and we'll have Brian Behlendorf, who started this whole thing,
  keynoting for us, and you'll have a chance to meet some of the
  original Apache Group, who will be there to celebrate with us.
 
  We've got 7 tracks of great talks, as well as BOFs, the Apache
  BarCamp, project-specific hack events, and evening events where you
  can deepen your connection with the larger Apache community. See the
  full schedule at http://apacheconna2015.sched.org/
 
  And if you have any questions, comments, or just want to hang out with
  us before and during the event, follow us on Twitter - @apachecon - or
  drop by #apachecon on the Freenode IRC network.
 
  Hope to see you in Austin!
 
  - Henry
 
 




Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Kostas Tzoumas
The ASF press team wants to announce next week, so a 3-day vote right now
might cancel the subject line of this thread :-)

Perhaps we can reach consensus in the DISCUSS thread or have a 24-hour vote?

I agree with Stephan on 0.9.0.M1 (or 0.9.0-m1 or whatever), as it seems
that other open source projects are using this naming scheme.

Kostas



On Thu, Mar 26, 2015 at 6:10 PM, Stephan Ewen se...@apache.org wrote:

 I think Milestone pretty much says that we have some crucial things in
 there, but not all. Beta in comparison, has an immature early version
 connotation.

 We are, for example, using a milestone 1 version of Jetty for the Web
 Frontend, so that is a pretty standard thing, in my opinion:

 <dependency>
   <groupId>org.eclipse.jetty</groupId>
   <artifactId>jetty-server</artifactId>
   <version>8.0.0.M1</version>
 </dependency>


 On Thu, Mar 26, 2015 at 3:44 PM, Robert Metzger rmetz...@apache.org
 wrote:

  Looks like we need to vote on 0.9-beta or 0.9-milestone.
 
  Can we find consensus whether to add a 1 after the name? -beta1 or
  -milestone1.
  Adding a 1 allows us to create a second beta/milestone release.
 
  I'm against adding a 1.
 
  On Thu, Mar 26, 2015 at 3:40 PM, Ufuk Celebi u...@apache.org wrote:
 
  
   On 26 Mar 2015, at 11:01, Robert Metzger rmetz...@apache.org wrote:
  
Two weeks have passed since we've discussed the 0.9 release the last
   time.
   
The ApacheCon is in 18 days from now.
If we want, we can also release a 0.9.0-beta release that contains
   known
bugs, but allows our users to try out the new features easily
 (because
   they
are part of a release). The vote for such a release would be mainly
  about
the legal aspects of the release rather than the stability. So I
  suspect
that the vote will go through much quicker.
  
   +1 for 0.9-beta
  
 



Re: GSoC proposal

2015-03-26 Thread Gyula Fóra
Hey Gabor,

Thank you for the proposal. It has many interesting ideas and a good
potential.

My comments:

We already have a large amount of ongoing work on the windowing
optimizations, covering your suggestions in section 1. It would be better
to drop that part from the project because that's very heavily on the
research side and as I said we are working on this at SICS.

I like the list that you made for section 2., and this should be the main
emphasis of the project. It would indeed be very nice to have a wide range
of statistics that we can compute (or approximate - this should be optional
though) on streams and windows (maybe we should also add some practical
stuff like top-k, distinct etc).
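
As an illustration of what an approximate top-k style statistic on a stream
can look like, here is a minimal sketch of the Misra-Gries frequent-items
summary. It is not taken from the proposal or from Flink; the class name
FrequentItemsSketch and its methods are made up for this example. With
"capacity" counters, any item that occurs more than n / (capacity + 1) times
in a stream of n items is guaranteed to still hold a counter at the end; the
stored counts are lower bounds, not exact frequencies.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FrequentItemsSketch {
    private final int capacity;
    private final Map<String, Integer> counters = new HashMap<>();

    FrequentItemsSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Processes one stream element using bounded memory. */
    void add(String item) {
        Integer current = counters.get(item);
        if (current != null) {
            counters.put(item, current + 1);
        } else if (counters.size() < capacity) {
            counters.put(item, 1);
        } else {
            // No free counter: decrement every counter and drop those that reach zero.
            Iterator<Map.Entry<String, Integer>> it = counters.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Integer> entry = it.next();
                if (entry.getValue() == 1) {
                    it.remove();
                } else {
                    entry.setValue(entry.getValue() - 1);
                }
            }
        }
    }

    /** Candidate heavy hitters with their (under-estimated) counts. */
    Map<String, Integer> candidates() {
        return new HashMap<>(counters);
    }
}
```

A windowed variant would keep one such summary per window; an exact answer
would need a second pass over the data to verify the candidates.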

Here is a list of interesting papers that seems to be related to this
project

https://gist.github.com/debasishg/8172796

Cheers,
Gyula

On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I will be applying to the Google Summer of Code, and I wrote most of
 the proposal:
 http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
 I would appreciate it if you could comment on it.

 Gyula Fora, git blame is telling me that you wrote most of the
 relevant parts of the windowing code, so I would be especially
 interested in what you think of my improvement ideas.

 Best regards,
 Gabor



Re: [GSoc][flink-streaming] Interested in pursuing FLINK-1617 and FLINK-1534

2015-03-26 Thread Akshay Dixit
Thanks for going through it Gyula.
I've made the necessary amends to the timeline and submitted the proposal.

Regards,
Akshay Dixit


On Thu, Mar 26, 2015 at 8:53 PM, Gyula Fóra gyf...@apache.org wrote:

 I think it looks good for a start, we will have to work on the API a little
 bit together to make it fit smoothly with what we currently have.

 There are a few gaps in the timeline, but you have probably noticed that :)

 Otherwise +1 from me.

 On Wed, Mar 25, 2015 at 11:35 PM, Akshay Dixit akshayd...@gmail.com
 wrote:

  Hi,
  The link to the draft proposal that I've prepared is
  https://gist.github.com/akshaydixi/88f3fbcebab0119a6a31
  It would be great if I could get some feedback on it.
  Regards,
  Akshay Dixit
 
  On Wed, Mar 25, 2015 at 2:03 AM, Akshay Dixit akshayd...@gmail.com
  wrote:
 
   Thanks Gyula.
  
   I agree too that simple and working implementations are preferable
 over
   hacky complex solutions. I'll start sketching out an initial
  straightforward
   API with only basic pattern matching features
   and base it on the existing windowing API. I'll post a draft of the
   proposal,  keeping the points you've said in mind, tomorrow, so you can
   look it over to see if it's all right.
   Regards,
   Akshay Dixit
  
   On Tue, Mar 24, 2015 at 6:30 PM, Gyula Fóra gyf...@apache.org wrote:
  
   Hey Dixit,
  
   Sorry for the delay, I had to discuss this in more detail with some of
  our
   other core developers.
  
   The consensus seems to be that we would like to push this project in a
   direction where the changes can be quickly included in the next
  releases.
   For this it is essential that we implement features that are complete
  (and
   clean) from the users perspective. This does not necessarily mean that
  we
   would like to have everything at once but rather that it is preferable
  to
   start with something clean and simple (for instance the naive chained
   filter approach) and progressively build more complex logic.
  
   This also means that we would like to avoid researchy code in the
   codebase
   as much as possible. Of course once we have a stable api for this
   functionality we can work towards making the optimizations that you
 have
   mentioned like operator sharing and so on.
  
   The ideal proposal would give a clear sketch of the pattern matching
 API
   that you would like to implement, which might be some added operators
 at
   first to the current API and possibly a DSL later with more advanced
   functionality (this would probably go in a separate library until it
 is
   very stable).
  
   So please in the proposal include a preview of what the pattern
 matching
   syntax would look like integrated with the current operators, how it
  would
   interact with other parts of the system etc.
  
   These are the things we need to figure out before we consider the
   optimizations I think, because it usually turns out, that the API
   semantics
   you would like to provide can hugely affect (probably limit) the
   possibilities that you have afterwards in terms of optimizations.
  
   Let me know if you have further questions regarding this :)
  
   Gyula
  
   On Tue, Mar 24, 2015 at 12:01 PM, Gyula Fóra gyf...@apache.org
 wrote:
  
Hey,
   
Give me an hour or so as I am in a meeting currently, but I will get
   back
to you afterwards.
   
Regards,
Gyula
   
On Tue, Mar 24, 2015 at 11:03 AM, Akshay Dixit 
 akshayd...@gmail.com
wrote:
   
Hi,
It'd really help if I got a reply soon. It'll be helpful in writing
  the
proposal since the deadline is on 27th. Thanks
Regards,
Akshay Dixit
   
On Sun, Mar 22, 2015 at 1:17 AM, Akshay Dixit 
 akshayd...@gmail.com
wrote:
   
 Thanks for the explanation Marton. I've decided to try out for
FLINK-1534.

 After reading through the thesis[4] and a few other
  papers[1][2][3],
   I
 believe I've gathered a little context to ask more questions. But
  I'm
still
 not sure how Flink's internals work
 so please bear with me. Although the ongoing effort to document
 the
 architecture and internal is really helpful for newbies like me
 and
would
 greatly decrease the ramping up time.

 Detecting a pattern of events would comprise of a pipeline that
   accepts
 the pattern query and
 sources of DataStreams, and outputs detected matches of that
  pattern
   to
a
 sink or forwards it
 along to another stream for further computation.

 As you said, a simple filter-join-aggregate query system could be
 developed implementing using the existing Streaming windowing
 API.
 But matching over complex events and decoding their pattern
 queries
would
 require implementing a DSL that transforms queries into an
  evaluation
 model. For e.g,
 in [1], the authors have implemented an NFA automaton with a
 shared
 versioned buffer that models the queries. In [4], the authors
 propose a 

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Henry Saputra
Yeah, I always prefer to get it with consensus rather than a VOTE.

I am fine with either.

- Henry

On Thu, Mar 26, 2015 at 11:03 AM, Kostas Tzoumas ktzou...@apache.org wrote:
 The ASF press team wants to announce next week, so a 3-day vote right now
 might cancel the subject line of this thread :-)

 Perhaps we can reach consensus in the DISCUSS thread or have a 24-hour vote?

 I agree with Stephan on 0.9.0.M1 (or 0.9.0-m1 or whatever), as it seems
 that other open source projects are using this naming scheme.

 Kostas



 On Thu, Mar 26, 2015 at 6:10 PM, Stephan Ewen se...@apache.org wrote:

 I think Milestone pretty much says that we have some crucial things in
 there, but not all. Beta in comparison, has an immature early version
 connotation.

 We are, for example, using a milestone 1 version of Jetty for the Web
 Frontend, so that is a pretty standard thing, in my opinion:

 <dependency>
   <groupId>org.eclipse.jetty</groupId>
   <artifactId>jetty-server</artifactId>
   <version>8.0.0.M1</version>
 </dependency>


 On Thu, Mar 26, 2015 at 3:44 PM, Robert Metzger rmetz...@apache.org
 wrote:

  Looks like we need to vote on 0.9-beta or 0.9-milestone.
 
  Can we find consensus whether to add a 1 after the name? -beta1 or
  -milestone1.
  Adding a 1 allows us to create a second beta/milestone release.
 
  I'm against adding a 1.
 
  On Thu, Mar 26, 2015 at 3:40 PM, Ufuk Celebi u...@apache.org wrote:
 
  
   On 26 Mar 2015, at 11:01, Robert Metzger rmetz...@apache.org wrote:
  
Two weeks have passed since we've discussed the 0.9 release the last
   time.
   
The ApacheCon is in 18 days from now.
If we want, we can also release a 0.9.0-beta release that contains
   known
bugs, but allows our users to try out the new features easily
 (because
   they
are part of a release). The vote for such a release would be mainly
  about
the legal aspects of the release rather than the stability. So I
  suspect
that the vote will go through much quicker.
  
   +1 for 0.9-beta
  
 



Re: Memory segment error

2015-03-26 Thread Andra Lungu
Sure,

3470 [main] INFO  org.apache.flink.runtime.taskmanager.TaskManager  - Using
820 MB for Flink managed memory.
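
To put the two numbers from this thread side by side, here is a small
back-of-the-envelope sketch. It assumes a memory segment size of 32 KiB,
which is not stated anywhere in the thread, so treat the constant (and the
class name MemorySegmentMath) as an assumption. Under that assumption the
820 MB pool holds far more than 33 segments, which hints that the shortage
is in the share handed to the individual join rather than in total memory;
the thread itself does not confirm the cause.

```java
final class MemorySegmentMath {
    /** Assumed segment size in bytes (32 KiB); not stated in the thread. */
    static final long ASSUMED_SEGMENT_BYTES = 32 * 1024;

    /** Number of whole segments that fit into the given amount of managed memory. */
    static long segmentsIn(long managedMegabytes) {
        return managedMegabytes * 1024 * 1024 / ASSUMED_SEGMENT_BYTES;
    }

    /** Memory in KiB occupied by the given number of segments. */
    static long kibFor(long segments) {
        return segments * ASSUMED_SEGMENT_BYTES / 1024;
    }

    public static void main(String[] args) {
        System.out.println(segmentsIn(820)); // segments in the 820 MB from the log line: 26240
        System.out.println(kibFor(33));      // KiB needed for the 33-segment minimum: 1056
    }
}
```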

On Thu, Mar 26, 2015 at 4:48 PM, Robert Metzger rmetz...@apache.org wrote:

 Hi,

 during startup, Flink will log something like:
 16:48:09,669 INFO  org.apache.flink.runtime.taskmanager.TaskManager
  - Using 1193 MB for Flink managed memory.

 Can you tell us how much memory Flink is managing in your case?



 On Thu, Mar 26, 2015 at 4:46 PM, Andra Lungu lungu.an...@gmail.com
 wrote:

  Hello everyone,
 
  I guess I need to revive this old discussion:
 
 
 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Memory-segment-error-when-migrating-functional-code-from-Flink-0-9-to-0-8-td3687.html
 
  At that point, the fix was to kindly ask Alex to make his project work
 with
  0.9.
 
  Now, I am not that lucky!
 
  This is the code:
  https://github.com/andralungu/gelly-partitioning/tree/alphaSplit
 
  The main program(NodeSplitting) is working nicely, I get the correct
  result. But if you run the test,  you will see that collection works and
  cluster fails miserably with this exception:
 
  Caused by: java.lang.Exception: The data preparation for task 'Join(Join
 at
  weighEdges(NodeSplitting.java:112)) (04e172e761148a65783a4363406e08c0)' ,
  caused an error: Too few memory segments provided. Hash Join needs at
 least
  33 memory segments.
  at
 
 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:471)
  at
 
 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
  at
 
 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: java.lang.IllegalArgumentException: Too few memory segments
  provided. Hash Join needs at least 33 memory segments.
 
  I am running locally, from IntelliJ, on a tiny graph.
  $ cat /proc/meminfo
  MemTotal:   11405696 kB
  MemFree: 5586012 kB
  Buffers:  178100 kB
 
  I am sure I did not run out of memory...
 
  Any thoughts on this?
 
  Thanks!
  Andra
 



Re: GSoC proposal

2015-03-26 Thread Paris Carbone
Hi Gabor,

Approximate statistics is a really good topic, I think there is a lot to do if 
you focus there. One idea would also be to include some of your contributions 
to the incremental machine learning library that will be available by June. 
From there you will be able to also use sampling and stream mining primitives 
out-of-the-box among others. Regarding window optimisations, as Gyula said, 
there is not much to do simply because we are working heavily on it already. 
Good luck and thanks for the proposal! 

Paris

 On 26 Mar 2015, at 22:59, Gyula Fóra gyula.f...@gmail.com wrote:
 
 Hey Gabor,
 
 Thank you for the proposal. It has many interesting ideas and a good
 potential.
 
 My comments:
 
 We already have a large amount of ongoing work on the windowing
 optimizations, covering your suggestions in section 1. It would be better
 to drop that part from the project because that's very heavily on the
 research side and as I said we are working on this at SICS.
 
 I like the list that you made for section 2., and this should be the main
 emphasis of the project. It would indeed be very nice to have a wide range
 of statistics that we can compute (or approximate - this should be optional
 though) on streams and windows (maybe we should also add some practical
 stuff like top-k, distinct etc).
 
 Here is a list of interesting papers that seems to be related to this
 project
 
 https://gist.github.com/debasishg/8172796
 
 Cheers,
 Gyula
 
 On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:
 
 Hello,
 
 I will be applying to the Google Summer of Code, and I wrote most of
 the proposal:
 http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
 I would appreciate it if you could comment on it.
 
 Gyula Fora, git blame is telling me that you wrote most of the
 relevant parts of the windowing code, so I would be especially
 interested in what you think of my improvement ideas.
 
 Best regards,
 Gabor
 



[jira] [Created] (FLINK-1785) Master tests in flink-tachyon fail with java.lang.NoSuchFieldError: IBM_JAVA

2015-03-26 Thread Henry Saputra (JIRA)
Henry Saputra created FLINK-1785:


 Summary: Master tests in flink-tachyon fail with 
java.lang.NoSuchFieldError: IBM_JAVA
 Key: FLINK-1785
 URL: https://issues.apache.org/jira/browse/FLINK-1785
 Project: Flink
  Issue Type: Bug
  Components: test
Reporter: Henry Saputra


The master fails in the flink-tachyon tests when running mvn test:

{code}
---
 T E S T S
---

---
 T E S T S
---
Running org.apache.flink.tachyon.HDFSTest
Running org.apache.flink.tachyon.TachyonFileSystemWrapperTest
java.lang.NoSuchFieldError: IBM_JAVA
at 
org.apache.hadoop.security.UserGroupInformation.getOSLoginModuleName(UserGroupInformation.java:303)
at 
org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:348)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:807)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:266)
at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:122)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:775)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:642)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:334)
at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:316)
at org.apache.flink.tachyon.HDFSTest.createHDFS(HDFSTest.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)

...

Results :

Failed tests:
  HDFSTest.createHDFS:76 Test failed IBM_JAVA
  HDFSTest.createHDFS:76 Test failed Could not initialize class
org.apache.hadoop.security.UserGroupInformation
  TachyonFileSystemWrapperTest.testTachyon:149 Test failed with
exception: Cannot initialize task 'DataSink (CsvOutputFormat (path:
tachyon://x1carbon:18998/result, delimiter:  ))': Could not initialize
class org.apache.hadoop.security.UserGroupInformation

Tests in error:
  HDFSTest.destroyHDFS:83 NullPointer
  HDFSTest.destroyHDFS:83 NullPointer
  TachyonFileSystemWrapperTest.testHadoopLoadability:116 »
NoClassDefFound Could...

Tests run: 6, Failures: 3, Errors: 3, Skipped: 0

{code}
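
A NoSuchFieldError raised during class initialization generally means that a
class was compiled against one version of another class and is running
against a different one. The sketch below is a generic reflection check that
could help narrow this down; it is not part of the report. The helper names
are made up, and the guess that the missing field lives in
org.apache.hadoop.util.PlatformName is an assumption that would need to be
verified against the Hadoop version on the test classpath.

```java
import java.security.CodeSource;

final class FieldPresenceCheck {

    /** Returns true if the named class can be loaded and declares the named field. */
    static boolean declaresField(String className, String fieldName) {
        try {
            // initialize = false: load the class without running its static initializer.
            Class<?> clazz = Class.forName(className, false,
                    FieldPresenceCheck.class.getClassLoader());
            clazz.getDeclaredField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            return false;
        }
    }

    /** Location the class was loaded from, "unknown" for bootstrap classes, or "not found". */
    static String loadedFrom(String className) {
        try {
            Class<?> clazz = Class.forName(className, false,
                    FieldPresenceCheck.class.getClassLoader());
            CodeSource source = clazz.getProtectionDomain().getCodeSource();
            return source == null ? "unknown" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }
}
```

Run inside the failing module, declaresField("org.apache.hadoop.util.PlatformName",
"IBM_JAVA") together with loadedFrom(...) for the same class would show
whether the field exists and which jar supplied the class.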



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Travis-CI builds queuing up

2015-03-26 Thread Robert Metzger
Travis replied to me with very good news: Somebody from INFRA was asking the
same question around the same time as I did and Travis is working on adding
more build capacity for the apache github organization.
I hope we'll soon have quicker builds again.

On Tue, Mar 24, 2015 at 4:42 PM, Henry Saputra henry.sapu...@gmail.com
wrote:

 That's good idea.

 Should be good to have mix of stable with Apache Jenkins for master
 and PRs, and Travis for individual forks.

 - Henry

 On Tue, Mar 24, 2015 at 8:03 AM, Maximilian Michels m...@apache.org
 wrote:
  Hey!
 
  I would also like to continue using Travis but the current situation is
 not
  acceptable because we practically can't use Travis anymore for pull
  requests or the current master. If it cannot be resolved then I think we
  should move on.
 
  The builds service team [1] at Apache offers Jenkins [2] for continuous
  integration. I think it should be fairly simple to set up. We could still
  use Travis in our forked repositories but have a reliable CI solution for
  the master and pull requests.
 
  Max
 
  [1] https://builds.apache.org/
  [2] http://jenkins-ci.org
 
  On Tue, Mar 24, 2015 at 3:46 PM, Márton Balassi 
 balassi.mar...@gmail.com
  wrote:
 
  I also like the travis infrastructure. Thanks for bringing this up and
  reaching out to the travis guys.
 
  On Tue, Mar 24, 2015 at 3:38 PM, Robert Metzger rmetz...@apache.org
  wrote:
 
   Hi guys,
  
   the build queue on travis is getting very very long. It seems that it
  takes
   4 days now until commits to master are built. The nightly builds from
 the
   website and the maven snapshots are also delayed by that.
   Right now,  there are 33 pull request builds scheduled (
   https://travis-ci.org/apache/flink/pull_requests), and 8 builds on
  master:
   https://travis-ci.org/apache/flink/builds.
  
   The problem is that travis accounts are per github user. In our case,
 the
   user is apache, so all ASF projects that have travis enabled share 5
   concurrent builders.
  
   I would actually like to continue using Travis.
  
   The easiest option is probably asking travis if they can give the
  apache
   user more build capacity.
  
   If that's not possible, we have to look into other options.
  
  
   I'm going to ask Travis if they can do anything about it.
  
   Robert
  
 



Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Timo Walther

+1 for a beta release. So there is no feature-freeze until the RC right?


On 26.03.2015 15:32, Márton Balassi wrote:

+1 for the early release.

I'd call it 0.9-milestone1.

On Thu, Mar 26, 2015 at 1:37 PM, Maximilian Michels m...@apache.org wrote:

+1 for a beta release: 0.9-beta.

On Thu, Mar 26, 2015 at 12:09 PM, Paris Carbone par...@kth.se wrote:

+1 for an early release. It will help unblock the samoa PR that has 0.9
dependencies.

On 26 Mar 2015, at 11:44, Kostas Tzoumas ktzou...@apache.org wrote:

+1 for an early milestone release. Perhaps we can call it 0.9-milestone or
so?

On Thu, Mar 26, 2015 at 11:01 AM, Robert Metzger rmetz...@apache.org
wrote:

Two weeks have passed since we've discussed the 0.9 release the last time.

The ApacheCon is in 18 days from now.
If we want, we can also release a 0.9.0-beta release that contains known
bugs, but allows our users to try out the new features easily (because they
are part of a release). The vote for such a release would be mainly about
the legal aspects of the release rather than the stability. So I suspect
that the vote will go through much quicker.

On Fri, Mar 13, 2015 at 12:01 PM, Robert Metzger rmetz...@apache.org
wrote:

I've reopened https://issues.apache.org/jira/browse/FLINK-1650 because
the issue is still occurring.

On Thu, Mar 12, 2015 at 7:05 PM, Ufuk Celebi u...@apache.org wrote:

On Thursday, March 12, 2015, Till Rohrmann till.rohrm...@gmail.com
wrote:

Have you run the 20 builds with the new shading code? With new shading the
TaskManagerFailsITCase should no longer fail. If it still does, then we
have to look into it again.

No, rebased on Monday before shading. Let me rebase and rerun tonight.








Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Márton Balassi
@Timo: No feature freeze for this, yes.

On Thu, Mar 26, 2015 at 3:36 PM, Timo Walther twal...@apache.org wrote:

 +1 for a beta release. So there is no feature-freeze until the RC right?



 On 26.03.2015 15:32, Márton Balassi wrote:

 +1 for the early release.

 I'd call it 0.9-milestone1.

 On Thu, Mar 26, 2015 at 1:37 PM, Maximilian Michels m...@apache.org
 wrote:

 +1 for a beta release: 0.9-beta.

 On Thu, Mar 26, 2015 at 12:09 PM, Paris Carbone par...@kth.se wrote:

 +1 for an early release. It will help unblock the samoa PR that has 0.9
 dependencies.

 On 26 Mar 2015, at 11:44, Kostas Tzoumas ktzou...@apache.org wrote:

 +1 for an early milestone release. Perhaps we can call it 0.9-milestone or
 so?

 On Thu, Mar 26, 2015 at 11:01 AM, Robert Metzger rmetz...@apache.org
 wrote:

 Two weeks have passed since we've discussed the 0.9 release the last time.

 The ApacheCon is in 18 days from now.
 If we want, we can also release a 0.9.0-beta release that contains known
 bugs, but allows our users to try out the new features easily (because they
 are part of a release). The vote for such a release would be mainly about
 the legal aspects of the release rather than the stability. So I suspect
 that the vote will go through much quicker.

 On Fri, Mar 13, 2015 at 12:01 PM, Robert Metzger rmetz...@apache.org
 wrote:

 I've reopened https://issues.apache.org/jira/browse/FLINK-1650 because
 the issue is still occurring.

 On Thu, Mar 12, 2015 at 7:05 PM, Ufuk Celebi u...@apache.org wrote:

 On Thursday, March 12, 2015, Till Rohrmann till.rohrm...@gmail.com
 wrote:

 Have you run the 20 builds with the new shading code? With new shading the
 TaskManagerFailsITCase should no longer fail. If it still does, then we
 have to look into it again.

 No, rebased on Monday before shading. Let me rebase and rerun tonight.







Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Ufuk Celebi

On 26 Mar 2015, at 11:01, Robert Metzger rmetz...@apache.org wrote:

 Two weeks have passed since we've discussed the 0.9 release the last time.
 
 The ApacheCon is in 18 days from now.
 If we want, we can also release a 0.9.0-beta release that contains known
 bugs, but allows our users to try out the new features easily (because they
 are part of a release). The vote for such a release would be mainly about
 the legal aspects of the release rather than the stability. So I suspect
 that the vote will go through much quicker.

+1 for 0.9-beta