Re: Flink on Tez

2014-11-10 Thread sirinath
All this is good where you can swap engines, but at this point I think you should make the internal engine as good as possible so that it dominates all other engines. That way there will be no need to change. Also, relying on or running on top of software designed for other ecosystems is not a good trend. B

Re: Flink on Tez

2014-11-10 Thread Henry Saputra
Hi Kostas, Since Tez uses YARN underneath, what does local execution mean in this case? - Henry On Fri, Nov 7, 2014 at 10:03 AM, Kostas Tzoumas wrote: > Hello Flink and Tez, > > I would like to point you to a first version of Flink running on > Tez. This is a Flink subproject (to be init

[jira] [Created] (FLINK-1233) Flaky Test AggregateITCase

2014-11-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1233: --- Summary: Flaky Test AggregateITCase Key: FLINK-1233 URL: https://issues.apache.org/jira/browse/FLINK-1233 Project: Flink Issue Type: Bug Components:

[jira] [Created] (FLINK-1232) Allow I/O spill file writers to work with futures/callbacks

2014-11-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1232: --- Summary: Allow I/O spill file writers to work with futures/callbacks Key: FLINK-1232 URL: https://issues.apache.org/jira/browse/FLINK-1232 Project: Flink Issu

[jira] [Created] (FLINK-1231) Add test for streaming remote execution

2014-11-10 Thread JIRA
Márton Balassi created FLINK-1231: - Summary: Add test for streaming remote execution Key: FLINK-1231 URL: https://issues.apache.org/jira/browse/FLINK-1231 Project: Flink Issue Type: Test

Re: Flink on Tez

2014-11-10 Thread Kostas Tzoumas
Sean, these are all sensible questions. As this codebase matures and is eventually committed to Flink, it makes sense to create a guide of cases where one engine would be a better fit than another. Following the discussion in https://issues.apache.org/jira/browse/SPARK-3561, I read that it is not

Re: Coarse-grained FT implementation

2014-11-10 Thread Stephan Ewen
Hey everyone! Sorry to be late answering this question. The short answer is: our fault tolerance is very comparable to Spark's RDD lineage. We internally build the computation graph of the operators (we call it the JobGraph / ExecutionGraph), which we use both for execution and re-execution in case

Re: Embeded Use

2014-11-10 Thread Stephan Ewen
Yes, we should definitely add that, thanks. On Sat, Nov 8, 2014 at 11:41 AM, sirinath wrote: > Perhaps you can add more on this in the documentation. Embedded use is not > very clear. > > > > -- > View this message in context: > http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabbl

[jira] [Created] (FLINK-1230) Add embedded collection execution to documentation

2014-11-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1230: --- Summary: Add embedded collection execution to documentation Key: FLINK-1230 URL: https://issues.apache.org/jira/browse/FLINK-1230 Project: Flink Issue Type: Im

[jira] [Created] (FLINK-1229) Synchronize WebClient arguments with command line arguments

2014-11-10 Thread Timo Walther (JIRA)
Timo Walther created FLINK-1229: --- Summary: Synchronize WebClient arguments with command line arguments Key: FLINK-1229 URL: https://issues.apache.org/jira/browse/FLINK-1229 Project: Flink Issu

Re: KNIME Integration of Flink

2014-11-10 Thread Stephan Ewen
Hi Arvid! What you describe sounds like a great use case. I am not aware of an existing integration of Flink with KNIME. Triggering programs from other programs should work through the Client & PackagedProgram classes: - https://github.com/apache/incubator-flink/blob/master/flink-clients/src/main/java/or
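
For illustration, a rough sketch of what submitting a packaged Flink program from another JVM process (such as a KNIME node) could look like. The jar path, host, port, and parallelism are placeholders, and the exact Client/PackagedProgram constructor and run(...) signatures are assumptions that should be checked against the classes linked above:

    import java.io.File;
    import java.net.InetSocketAddress;

    import org.apache.flink.client.program.Client;
    import org.apache.flink.client.program.PackagedProgram;
    import org.apache.flink.configuration.Configuration;

    public class SubmitFromKnime {
        public static void main(String[] args) throws Exception {
            // Placeholder values: point these at the packaged Flink job and the JobManager.
            File jobJar = new File("/path/to/flink-job.jar");
            String jobManagerHost = "localhost";
            int jobManagerPort = 6123;

            // Wrap the user jar; PackagedProgram locates the entry point and holds program arguments.
            PackagedProgram program = new PackagedProgram(jobJar, "some", "program", "args");

            // The Client talks to the JobManager and submits the program
            // (constructor arguments here are an assumption, not verified).
            Client client = new Client(new InetSocketAddress(jobManagerHost, jobManagerPort),
                    new Configuration(), SubmitFromKnime.class.getClassLoader());

            // Submit and wait for the job to finish; the parallelism of 4 is arbitrary.
            client.run(program, 4, true);
        }
    }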

Re: [jira] [Created] (FLINK-1228) Add REST Interface to JobManager

2014-11-10 Thread Flavio Pompermaier
That would be very useful! On Mon, Nov 10, 2014 at 3:25 PM, Arvid Heise (JIRA) wrote: > Arvid Heise created FLINK-1228: > -- > > Summary: Add REST Interface to JobManager > Key: FLINK-1228 > URL: https://issues.apache

[jira] [Created] (FLINK-1228) Add REST Interface to JobManager

2014-11-10 Thread Arvid Heise (JIRA)
Arvid Heise created FLINK-1228: -- Summary: Add REST Interface to JobManager Key: FLINK-1228 URL: https://issues.apache.org/jira/browse/FLINK-1228 Project: Flink Issue Type: Improvement

KNIME Integration of Flink

2014-11-10 Thread Arvid Heise
Dear Flinkler, For my current project, we want to outsource some performance-critical parts of a complex KNIME workflow to Flink. Is there already a way to trigger a Flink workflow from KNIME? If not, we will probably provide a straightforward way to execute Flink (Scala) programs from KNIME with

[jira] [Created] (FLINK-1227) KeySelector can't implement ResultTypeQueryable

2014-11-10 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-1227: --- Summary: KeySelector can't implement ResultTypeQueryable Key: FLINK-1227 URL: https://issues.apache.org/jira/browse/FLINK-1227 Project: Flink Issue Typ

[jira] [Created] (FLINK-1226) No way to give Configuration to DataSource

2014-11-10 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1226: Summary: No way to give Configuration to DataSource Key: FLINK-1226 URL: https://issues.apache.org/jira/browse/FLINK-1226 Project: Flink Issue Type: Improvem

Re: Type extraction with generic variables

2014-11-10 Thread Stephan Ewen
Hi Vasia! I think in your case it would be even better to use "getMapReturnTypes(MapFunction<IN, OUT> mapInterface, TypeInformation<IN> inType)". That way you can pass the function that you wrap (the one that maps the vertex values) and get its produced type. But I think the example shows that we need much
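
A minimal sketch of the pattern Stephan suggests, assuming a made-up user function that maps Long vertex values to String, and the MapFunction/TypeExtractor/BasicTypeInfo classes under their current package names:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.typeutils.TypeExtractor;

    public class MapReturnTypeExample {
        public static void main(String[] args) {
            // Hypothetical user function that maps vertex values from Long to String.
            MapFunction<Long, String> mapper = new MapFunction<Long, String>() {
                @Override
                public String map(Long value) {
                    return "vertex-" + value;
                }
            };

            // Ask the TypeExtractor for the produced type, given the input type information.
            TypeInformation<String> producedType =
                    TypeExtractor.getMapReturnTypes(mapper, BasicTypeInfo.LONG_TYPE_INFO);

            System.out.println(producedType); // expected to describe String
        }
    }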

Re: Help with .getExecutionEnvironment() method

2014-11-10 Thread Stephan Ewen
Nice! On Fri, Nov 7, 2014 at 11:09 PM, Gyula Fóra wrote: > Okay, I got it working :) > Turns out I only needed the client from the contextenvironment to get it > working. > > Thanks for the help :) > > On Fri, Nov 7, 2014 at 4:26 PM, Aljoscha Krettek > wrote: > > > Ah ok, maybe you can expose m

Re: Hi / Aggregation support

2014-11-10 Thread Viktor Rosenfeld
Yes, you would need static method imports. Best, Viktor Stephan Ewen wrote > I guess you would need static method imports to make the code look like > this, which I think is fine. > > On Mon, Nov 10, 2014 at 12:00 PM, Fabian Hueske < > fhueske@ > > wrote: > >> How/where do you plan to define

Re: Hi / Aggregation support

2014-11-10 Thread Stephan Ewen
I guess you would need static method imports to make the code look like this, which I think is fine. On Mon, Nov 10, 2014 at 12:00 PM, Fabian Hueske wrote: > How/where do you plan to define the methods min(1), max(1), and cnt()? > If these are static methods in some kind of Aggregation class, it

Re: Hi / Aggregation support

2014-11-10 Thread Fabian Hueske
How/where do you plan to define the methods min(1), max(1), and cnt()? If these are static methods in some kind of Aggregation class, it won't look so concise anymore, or am I missing something here? I would be fine with both ways, the second one being nice if it can be done like that. 2014-11-1

Re: Collection serializers

2014-11-10 Thread Márton Balassi
+1, it would be useful for a number of use cases. On Mon, Nov 10, 2014 at 10:31 AM, Stephan Ewen wrote: > I was wondering whether we should add dedicated collection serializers to > the Java API, similar to the ones the Scala API has. > > The advantage would be not to depend on Kryo for that. Kryo b

Re: Flink on Tez

2014-11-10 Thread Fabian Hueske
Flavio, you can switch between both engines with virtually no effort if you use YARN. In that case, you'll either start Flink's own runtime or Tez on YARN. 2014-11-09 14:34 GMT+01:00 sirinath : > Why don't you have an internal engine which addresses both the concerns > raised than one over the ot

Re: Hi / Aggregation support

2014-11-10 Thread Gyula Fora
I also support this approach: ds.groupBy(0).aggregate(min(1), max(1), cnt()) I think it makes the code more readable, because it is easy to see what's in the result tuple. Gyula > On 10 Nov 2014, at 10:49, Aljoscha Krettek wrote: > > I like this version: ds.groupBy(0).aggregate(min(1), max(
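
To make Fabian's question above concrete, a minimal, self-contained sketch of how the proposed call style could be backed by static factory methods. All names here (Agg, aggregate, the class itself) are hypothetical; this API is only being discussed and does not exist in Flink:

    import java.util.Arrays;
    import java.util.List;

    public class AggregationSyntaxSketch {

        // Descriptor for a single aggregation, e.g. min(1) or cnt().
        static final class Agg {
            final String name;
            final int field; // -1 for aggregations without a field, such as cnt()
            Agg(String name, int field) { this.name = name; this.field = field; }
            @Override public String toString() {
                return field < 0 ? name + "()" : name + "(" + field + ")";
            }
        }

        // The static methods Fabian asks about; with a static import
        // (import static ...AggregationSyntaxSketch.*) the call site reads min(1), max(1), cnt().
        static Agg min(int field) { return new Agg("min", field); }
        static Agg max(int field) { return new Agg("max", field); }
        static Agg cnt()          { return new Agg("cnt", -1); }

        // Stand-in for a grouped data set's aggregate(...): it only records the requested aggregations here.
        static List<Agg> aggregate(Agg... aggs) {
            return Arrays.asList(aggs);
        }

        public static void main(String[] args) {
            // Reads like the proposed ds.groupBy(0).aggregate(min(1), max(1), cnt())
            System.out.println(aggregate(min(1), max(1), cnt()));
        }
    }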

Re: Type extraction with generic variables

2014-11-10 Thread Timo Walther
Hey, TypeExtractor.getForClass() is only intended for primitive types. If you want to extract all kinds of types, you should use TypeExtractor.createTypeInfo(Type t) (a class is also a type). However, I think it would be better if your method takes TypeInformation as an argument instead of
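
A small sketch of the difference Timo describes, with made-up method names (fromClass, buildSomething): deriving type information from a bare Class loses generic parameters, whereas letting the caller hand in a TypeInformation keeps them resolved:

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.typeutils.TypeExtractor;

    public class TypeInfoParameterSketch {

        // Works only for simple, non-generic types: the Class alone cannot
        // carry generic parameters such as the T in Tuple2<String, T>.
        static <T> TypeInformation<T> fromClass(Class<T> clazz) {
            return TypeExtractor.getForClass(clazz);
        }

        // Preferred shape per the discussion: the caller supplies the full
        // TypeInformation, so generic variables are already resolved.
        static <T> void buildSomething(TypeInformation<T> typeInfo) {
            // ... use typeInfo to create serializers, comparators, etc.
            System.out.println("building with " + typeInfo);
        }

        public static void main(String[] args) {
            buildSomething(fromClass(String.class));
        }
    }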

Re: HBase 0.98 addon for Flink 0.8

2014-11-10 Thread Fabian Hueske
I don't think we need to bundle the HBase input and output format in a single PR. So, I think we can proceed with the IF only and target the OF later. However, the fix for Kryo should be in the master before merging the PR. Till is currently working on that and said he expects this to be done by en

Re: Hi / Aggregation support

2014-11-10 Thread Aljoscha Krettek
I like this version: ds.groupBy(0).aggregate(min(1), max(1), cnt()), very concise. On Mon, Nov 10, 2014 at 10:42 AM, Viktor Rosenfeld wrote: > Hi Fabian, > > I ran into a problem with your syntax example: > > DataSet<Tuple2<Integer, Integer>> ds = ... > DataSet<Tuple4<Integer, Integer, Integer, Long>> result = > ds.groupBy(0).min(1).andM

Re: Hi / Aggregation support

2014-11-10 Thread Viktor Rosenfeld
Hi Fabian, I ran into a problem with your syntax example: DataSet<Tuple2<Integer, Integer>> ds = ... DataSet<Tuple4<Integer, Integer, Integer, Long>> result = ds.groupBy(0).min(1).andMax(1).andCnt(); Basically, in the example above we don't know how long the chain of aggregation method calls is. Also, each aggregation method call add

Collection serializers

2014-11-10 Thread Stephan Ewen
I was wondering whether we should add dedicated collection serializers to the Java API, similar to the ones the Scala API has. The advantage would be not to depend on Kryo for that. Kryo becomes quite inefficient when it does not know the element data type (the user did not register it), and we can de
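
To illustrate the idea (this is not Flink's actual TypeSerializer interface, which has more methods): a dedicated list serializer can write the size followed by the elements through a known element serializer, so nothing about the element class has to be written into the stream the way Kryo does for unregistered types. The SimpleSerializer interface below is a made-up simplification:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Simplified stand-in for a serializer interface, for illustration only.
    interface SimpleSerializer<T> {
        void serialize(T value, DataOutput out) throws IOException;
        T deserialize(DataInput in) throws IOException;
    }

    // Dedicated list serializer: a length prefix plus delegation to the element
    // serializer. Because the element type is known statically, no per-element
    // type information is needed in the serialized form.
    final class ListSerializer<T> implements SimpleSerializer<List<T>> {
        private final SimpleSerializer<T> elementSerializer;

        ListSerializer(SimpleSerializer<T> elementSerializer) {
            this.elementSerializer = elementSerializer;
        }

        @Override
        public void serialize(List<T> list, DataOutput out) throws IOException {
            out.writeInt(list.size());
            for (T element : list) {
                elementSerializer.serialize(element, out);
            }
        }

        @Override
        public List<T> deserialize(DataInput in) throws IOException {
            int size = in.readInt();
            List<T> list = new ArrayList<T>(size);
            for (int i = 0; i < size; i++) {
                list.add(elementSerializer.deserialize(in));
            }
            return list;
        }
    }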