Re: suggestion for Quickstart

2016-02-26 Thread Stefano Baghino
Hi Tara, thank you so much for reporting the issues you had, I'll open a ticket and start working on it. Best, Stefano On Fri, Feb 26, 2016 at 2:08 AM, Tara Athan wrote: > On > https://ci.apache.org/projects/flink/flink-docs-master/quickstart/setup_quickstart.html > > some of the instructions

Re: Watermarks with repartition

2016-02-26 Thread Aljoscha Krettek
Hi, yes, your description is spot on! Cheers, Aljoscha > On 26 Feb 2016, at 00:19, Zach Cox wrote: > > I think I found the information I was looking for: > > RecordWriter broadcasts each emitted watermark to all outgoing channels [1]. > > StreamInputProcessor tracks the max watermark received
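Zach's summary can be sketched in plain Java. The class below is an illustrative stand-in for the behavior he describes (each channel forwards watermarks; the receiving operator tracks the max watermark per channel and advances its own watermark to the min across channels), not Flink's actual RecordWriter/StreamInputProcessor code:

```java
import java.util.Arrays;

// Illustrative sketch (not Flink's actual classes): an operator receiving
// watermarks from several input channels keeps the max watermark seen per
// channel and emits the min across all channels as its own watermark.
public class ChannelWatermarkTracker {
    private final long[] maxPerChannel;

    public ChannelWatermarkTracker(int numChannels) {
        maxPerChannel = new long[numChannels];
        Arrays.fill(maxPerChannel, Long.MIN_VALUE);
    }

    /** Record a watermark from one channel; return the operator's current watermark. */
    public long onWatermark(int channel, long watermark) {
        maxPerChannel[channel] = Math.max(maxPerChannel[channel], watermark);
        long min = Long.MAX_VALUE;
        for (long w : maxPerChannel) {
            min = Math.min(min, w);
        }
        return min;
    }
}
```

Taking the min across channels is what makes watermarks safe under repartitioning: the operator only advances once every upstream channel has advanced.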

Re: Need some help to understand the cause of the error

2016-02-26 Thread Aljoscha Krettek
Hi, as far as I can see, the problem is in this line: k.sum(3). Using field indices is only valid for Tuple types. In your case you should be able to use k.sum("field3"), because "field3" is a field of your Reading type. Cheers, Aljoscha > On 26 Feb 2016, at 02:44, Nirmalya Sengupta > wrote

Kafka issue

2016-02-26 Thread Gyula Fóra
Hey, for one of our jobs we ran into this issue. It's probably some dependency issue, but we can't figure it out, as a very similar setup works without issues for a different program. java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object; at kafka.consumer.FetchR

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Till Rohrmann
Hi Saiph, you can do it the following way: input.keyBy(0).timeWindow(Time.seconds(10)).fold(0, new FoldFunction, Integer>() { @Override public Integer fold(Integer integer, Tuple2 o) throws Exception { return integer + 1; } }); Cheers, Till ​ On Thu, Feb 25, 2016 at 7:58 PM,
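The archive stripped the generic type parameters from Till's snippet. A plain-Java sketch of the same accumulator logic (a hypothetical helper, not Flink API) shows what the fold computes: start at 0 and add 1 per element, ignoring each element's contents.

```java
import java.util.List;

// Plain-Java sketch of the counting fold above: the accumulator starts at
// the fold's initial value (0) and the fold function returns integer + 1
// for every element, so the final value is the number of elements in the
// window. Flink applies the same function incrementally per element.
public class WindowCount {
    public static int countByFold(List<String> windowElements) {
        int acc = 0;                  // fold's initial value
        for (String ignored : windowElements) {
            acc = acc + 1;            // FoldFunction body: return integer + 1
        }
        return acc;
    }
}
```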

flink-storm FlinkLocalCluster issue

2016-02-26 Thread #ZHANG SHUHAO#
Hi everyone, I'm a student researcher working on Flink recently. I'm trying out the flink-storm example project, version 0.10.2, flink-storm-examples, word-count-local. But, I got the following error: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free s

Need some help to understand the cause of the error

2016-02-26 Thread Nirmalya Sengupta
Hello Aljoscha, I have also tried by using the field's name in the sum("field3") function (like you have suggested), but this time the exception is different: Exception in thread "main" java.lang.ExceptionInInitializerError at

Re: Kafka issue

2016-02-26 Thread Till Rohrmann
Hi Gyula, could it be that you compiled against a different Scala version than the one you're using for running the job? This usually happens when you compile against 2.10 and let it run with version 2.11. Cheers, Till On Fri, Feb 26, 2016 at 10:09 AM, Gyula Fóra wrote: > Hey, > > For one of o

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
I am not sure what is happening. I tried running against a Flink cluster that is definitely running the correct Scala version (2.10) and I still got the error. So it might be something with the pom.xml, but we just don't see how it is different from the correct one. Gyula Till Rohrmann wrote (

Re: Kafka issue

2016-02-26 Thread Robert Metzger
Are you building 1.0-SNAPSHOT yourself, or are you relying on the snapshot repository? We had issues in the past where jars in the snapshot repo were incorrect. On Fri, Feb 26, 2016 at 10:45 AM, Gyula Fóra wrote: > I am not sure what is happening. I tried running against a Flink cluster > that is

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
I was using the snapshot repo in this case, let me try building my own version... Maybe this is interesting: mvn dependency:tree | grep 2.11 [INFO] | \- org.apache.kafka:kafka_2.11:jar:0.8.2.2:compile [INFO] | +- org.scala-lang.modules:scala-xml_2.11:jar:1.0.2:compile [INFO] | +- org.scal
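The dependency tree above shows Scala 2.11 artifacts (kafka_2.11, scala-xml_2.11) leaking into what should be a 2.10 build, which matches the NoSuchMethodError on scala.Predef. As a hedged illustration of the general fix (artifact names and versions here are examples, not an exact reconstruction of this user's pom), every Scala-suffixed artifact should agree on one Scala version:

```xml
<!-- Illustrative pom.xml fragment: all artifacts carrying a Scala suffix
     must agree on one Scala version (2.10 here). A kafka_2.11 jar on a
     2.10 classpath produces exactly this kind of NoSuchMethodError. -->
<properties>
  <scala.binary.version>2.10</scala.binary.version>
</properties>

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka_${scala.binary.version}</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>
```

Running `mvn dependency:tree | grep _2.11` after the change, as Gyula did, is a quick way to confirm no mixed-version jars remain.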

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
That actually seemed to be the issue; now that I compiled my own version it doesn't have these wrong jars in the dependency tree... Gyula Fóra wrote (on 26 Feb 2016, Fri, at 11:01): > I was using the snapshot repo in this case, let me try building my own > version... > > Maybe this is int

Re: flink-storm FlinkLocalCluster issue

2016-02-26 Thread Till Rohrmann
Hi Shuhao, the configuration you’re providing is only used for the Storm compatibility layer and not for Flink itself. When you run your job locally, the LocalFlinkMiniCluster should be started with as many slots as the maximum degree of parallelism in your topology. You can check this in FlinkLoc

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
Thanks Robert, so apparently the snapshot version was screwed up somehow and included the 2.11 dependencies. Now it works. Cheers, Gyula Gyula Fóra wrote (on 26 Feb 2016, Fri, at 11:09): > That actually seemed to be the issue, not that I compiled my own version > it doesnt have these w

Flink streaming throughput

2016-02-26 Thread おぎばやしひろのり
Hello, I started evaluating Flink and tried simple performance test. The result was just about 4000 messages/sec with 300% CPU usage. I think this is quite low and wondering if it is a reasonable result. If someone could check it, it would be great. Here is the detail: [servers] - 3 Kafka broker

Re: The way to iterate instances in AllWindowFunction in current Master branch

2016-02-26 Thread HungChang
Many thanks Aljoscha! It can replay and compute the old instances now. The result looks absolutely correct. When printing currentTimestamp there are values such as 1456480762777, 1456480762778... which are not -1s. So I'm a bit confused about extractTimestamp(). Can I ask why curTimeStamp = currentTimes

Re: The way to iterate instances in AllWindowFunction in current Master branch

2016-02-26 Thread HungChang
Ah! My incorrect code segment made the Watermark not go forward; it always stayed at the same moment in the past. Is that true and the issue? Cheers, Hung -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/The-way-to-itearte-instances-in-AllW

Re: Flink streaming throughput

2016-02-26 Thread Stephan Ewen
Hi! I would try and dig bit by bit into what the bottleneck is: 1) Disable the checkpointing, see what difference that makes 2) Use a dummy sink (discarding) rather than elastic search, to see if that is limiting 3) Check the JSON parsing. Many JSON libraries are very CPU intensive and easily

Re: The way to iterate instances in AllWindowFunction in current Master branch

2016-02-26 Thread Aljoscha Krettek
Hi, yes, that seems to have been the issue. The Math.max() is used to ensure that the timestamp never decreases, because that is not allowed for a watermark. Cheers, Aljoscha > On 26 Feb 2016, at 11:11, HungChang wrote: > > Ah! My incorrect code segment made the Watermark not going forward a
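The clamping Aljoscha describes can be sketched in a few lines (illustrative names, not Flink's actual timestamp-extractor interface): the tracked watermark is the max timestamp seen so far, so an out-of-order element can never move it backwards.

```java
// Sketch of a monotone watermark, as described above. The watermark is the
// max timestamp seen so far (via Math.max), so late or out-of-order element
// timestamps cannot make it decrease. Names are illustrative only.
public class MonotoneWatermark {
    private long currentWatermark = Long.MIN_VALUE;

    /** Called per element; returns the (never-decreasing) watermark. */
    public long onElement(long elementTimestamp) {
        currentWatermark = Math.max(currentWatermark, elementTimestamp);
        return currentWatermark;
    }
}
```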

Re: flink-storm FlinkLocalCluster issue

2016-02-26 Thread #ZHANG SHUHAO#
Hi Till, thanks for your reply. But it appears that it only started with 1 slot. I have traced down into the Flink source code step by step, where I have confirmed it. I'm using Flink 0.10.2, source code downloaded from the Flink website. Nothing has been changed. I simply try to run the flin

Re: Frequent exceptions killing streaming job

2016-02-26 Thread Stephan Ewen
Was the contended lock part of Flink's runtime, or the application code? If it was part of the Flink Runtime, can you share what you found? On Thu, Feb 25, 2016 at 6:03 PM, Nick Dimiduk wrote: > For what it's worth, I dug into the TM logs and found that this exception > was not the root cause, m

Re: Flink streaming throughput

2016-02-26 Thread おぎばやしひろのり
Stephan, Thank you for your quick response. I will try and post the result later. Regards, Hironori 2016-02-26 19:45 GMT+09:00 Stephan Ewen : > Hi! > > I would try and dig bit by bit into what the bottleneck is: > > 1) Disable the checkpointing, see what difference that makes > 2) Use a dummy

Re: flink-storm FlinkLocalCluster issue

2016-02-26 Thread Stephan Ewen
Hi! On 0.10.x, the Storm compatibility layer does not properly configure the Local Flink Executor to have the right parallelism. In 1.0 that is fixed. If you try the latest snapshot, or the 1.0-Release-Candidate-1, it should work. Greetings, Stephan On Fri, Feb 26, 2016 at 12:16 PM, #ZHANG SHU

Re: Error in import of flink-streaming-examples project [StockPrices.java]

2016-02-26 Thread Stephan Ewen
I think this example refers to a much older version (0.8) and is no longer compatible On Wed, Feb 24, 2016 at 4:02 PM, subash basnet wrote: > Hello there, > > I imported the flink-streaming-examples project [ > https://github.com/mbalassi/flink/tree/stockprices/flink-staging/flink-streaming/flin

Re: streaming hdfs sub folders

2016-02-26 Thread Stephan Ewen
Hi! Have a look at the class-level comments in "InputFormat". They should describe how input formats first generate splits (for parallelization) on the master, and the workers open each split. So you need something like this: AvroInputFormat avroInputFormat = new AvroInputFormat(new Path("hdfs:/
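The split lifecycle Stephan describes can be sketched with simplified stand-in types (these are not Flink's actual InputFormat API): the master enumerates one split per path, and each worker then opens and reads its assigned split.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the InputFormat split lifecycle described above:
// createInputSplits runs on the master to generate parallelizable units of
// work; open runs on a worker for each split it is assigned.
public class SplitLifecycle {
    public static class FileSplit {
        public final String path;
        public FileSplit(String path) { this.path = path; }
    }

    // Master side: one split per path (e.g. per HDFS sub-folder/file).
    public static List<FileSplit> createInputSplits(List<String> paths) {
        List<FileSplit> splits = new ArrayList<>();
        for (String p : paths) {
            splits.add(new FileSplit(p));
        }
        return splits;
    }

    // Worker side: open a single split (placeholder for actually reading it).
    public static String open(FileSplit split) {
        return "opened " + split.path;
    }
}
```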

Re: Graph with stream of updates

2016-02-26 Thread Robert Metzger
Hi Ankur, can you provide a bit more information on what you are trying to achieve? Do you want to keep a graph built from a stream of events within Flink and query it? Or do you want to change the dataflow graph of Flink while a job is running? Regards, Robert On Thu, Feb 25, 2016 at 11:1

Re: Graph with stream of updates

2016-02-26 Thread Ankur Sharma
Hello, thanks for the reply. I want to create a graph from a stream and query it. You got it right. The stream may be edges that are getting added to or removed from the graph. Is there a way to create an empty global graph that can be transformed using a stream of updates? Best, Ankur Sharma 3.15 E1.1 Uni

Re: Graph with stream of updates

2016-02-26 Thread Vasiliki Kalavri
Hi Ankur, you can have custom state in your Flink operators, including a graph. There is no graph state abstraction provided at the moment, but it shouldn't be too hard for you to implement your own. If your use-case requires processing edge additions only, then you might want to take a look
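A minimal sketch of such custom graph state, assuming the update stream carries edge additions and removals (plain Java, illustrative only; in Flink this adjacency map would live inside a stateful operator):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of operator-local graph state built from a stream of edge updates.
// Each update either adds or removes a directed edge; queries (here:
// out-degree) run against the current adjacency structure.
public class StreamedGraph {
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    public void addEdge(String from, String to) {
        adjacency.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    public void removeEdge(String from, String to) {
        Set<String> targets = adjacency.get(from);
        if (targets != null) {
            targets.remove(to);
            if (targets.isEmpty()) {
                adjacency.remove(from);   // drop vertices with no outgoing edges
            }
        }
    }

    /** Example query: out-degree of a vertex (0 if unknown). */
    public int outDegree(String vertex) {
        return adjacency.getOrDefault(vertex, Collections.emptySet()).size();
    }
}
```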

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-26 Thread Stephan Ewen
Hi Cory, there is also a new release candidate which should be clean dependency wise. I hope it is feasible for you to stay on stable versions. The CI infrastructure still seems to have issues that mix Scala versions between snapshot builds. We are looking into this... Stephan On Wed, Feb 24,

Re: The way to iterate instances in AllWindowFunction in current Master branch

2016-02-26 Thread Aljoscha Krettek
Hi Hung, after some discussion the way that window functions are used will change back to the way it was in 0.10.x, i.e. the Iterable is always part of the apply function. Sorry for the inconvenience this has caused. Cheers, Aljoscha > On 26 Feb 2016, at 11:48, Aljoscha Krettek wrote: > > Hi,

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Saiph Kappa
Why the ".keyBy"? I don't want to count tuples by Key. I simply want to count all tuples that are contained in a window. On Fri, Feb 26, 2016 at 9:14 AM, Till Rohrmann wrote: > Hi Saiph, > > you can do it the following way: > > input.keyBy(0).timeWindow(Time.seconds(10)).fold(0, new > FoldFunct

Iterations problem in command line

2016-02-26 Thread Marcela Charfuelan
Hello, I implemented an algorithm that includes iterations (the EM algorithm) and I am getting different results when running in Eclipse (Luna Release (4.4.0)) and when running on the command line using Flink run; the program does not crash, it's just that after the first iteration the results are d

Re: Need some help to understand the cause of the error

2016-02-26 Thread Aljoscha Krettek
Hi, which version of Flink are you using, by the way? This would help me narrow down on possible causes of the problem. Cheers, Aljoscha > On 26 Feb 2016, at 10:34, Nirmalya Sengupta > wrote: > > Hello Aljoscha, > > I have also tried by using the field's name in the sum("field3") function >

Re: Frequent exceptions killing streaming job

2016-02-26 Thread Nick Dimiduk
Sorry I wasn't clear. No, the lock contention is not in Flink. On Friday, February 26, 2016, Stephan Ewen wrote: > Was the contended lock part of Flink's runtime, or the application code? > If it was part of the Flink Runtime, can you share what you found? > > On Thu, Feb 25, 2016 at 6:03 PM, Ni

Re: Need some help to understand the cause of the error

2016-02-26 Thread Nirmalya Sengupta
Hello Aljoscha, I am using Flink 0.10.1 (see below) and flinkspector (0.1-SNAPSHOT).
- org.apache.flink : flink-scala : 0.10.1
- org.apache.flink : flink-streaming-scala : 0.10.1
- org.apache.flink : flink-clients : 0.10.1

Re: Watermarks with repartition

2016-02-26 Thread Zach Cox
Thanks for the confirmation Aljoscha! I wrote up results from my little experiment: https://github.com/zcox/flink-repartition-watermark-example -Zach On Fri, Feb 26, 2016 at 2:58 AM Aljoscha Krettek wrote: > Hi, > yes, your description is spot on! > > Cheers, > Aljoscha > > On 26 Feb 2016, at

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Saiph Kappa
That code will not run in parallel, right? So a map-reduce style task would yield better performance, no? On Fri, Feb 26, 2016 at 6:06 PM, Stephan Ewen wrote: > Then go for: > > input.timeWindowAll(Time.seconds(10)).fold(0, new > FoldFunction, Integer>() { @Override public > Integer fold(Integer inte

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Stephan Ewen
True, at this point it does not pre-aggregate in parallel, that is actually a feature on the list but not yet added... On Fri, Feb 26, 2016 at 7:08 PM, Saiph Kappa wrote: > That code will not run in parallel right? So, a map-reduce task would > yield better performance no? > > > > On Fri, Feb 26

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Stephan Ewen
Then go for: input.timeWindowAll(Time.seconds(10)).fold(0, new FoldFunction, Integer>() { @Override public Integer fold(Integer integer, Tuple2 o) throws Exception { return integer + 1; } }); Try to explore the API a bit, most things should be quite intuitive. There are also some docs: https://ci

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Gyula Fóra
Hey, I am wondering if the following code would give an identical result but be more efficient (parallel): input.keyBy(assignRandomKey).window(Time.seconds(10)).reduce(count).timeWindowAll(Time.seconds(10)).reduce(count) Effectively just assigning random keys to do the preaggregation and then do a windo

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Stephan Ewen
Yes, Gyula, that should work. I would make the random key across a range of 10 * parallelism. On Fri, Feb 26, 2016 at 7:16 PM, Gyula Fóra wrote: > Hey, > > I am wondering if the following code will result in identical but more > efficient (parallel): > > input.keyBy(assignRandomKey).window(Ti
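The random-key pre-aggregation Gyula and Stephan discuss can be sketched in plain Java (a single-threaded stand-in for the two Flink windows; the bucket count follows Stephan's 10 * parallelism suggestion and is illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Sketch of the two-phase count discussed above: tag each element with a
// random key from a small range, pre-aggregate counts per key (this is the
// part that can run in parallel), then sum the partial counts in a final
// single-task aggregation. The total is the same regardless of keying.
public class PreAggregatedCount {
    public static long count(List<String> elements, int parallelism) {
        int buckets = 10 * parallelism;   // key range per Stephan's suggestion
        Map<Integer, Long> partials = new HashMap<>();
        for (String ignored : elements) {
            int key = ThreadLocalRandom.current().nextInt(buckets); // random key
            partials.merge(key, 1L, Long::sum);                     // per-key count
        }
        // final aggregation: sum the partial counts
        return partials.values().stream().mapToLong(Long::longValue).sum();
    }
}
```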

Re: Watermarks with repartition

2016-02-26 Thread Aljoscha Krettek
Cool, that’s a nice write-up. Would you maybe be interested in integrating this as some sort of internal documentation in Flink, so that prospective contributors can get to know this stuff? Cheers, Aljoscha > On 26 Feb 2016, at 18:32, Zach Cox wrote: > > Thanks for the confirmation Aljoscha! I

Re: Watermarks with repartition

2016-02-26 Thread Zach Cox
Sure, want me to open a jira issue and then PR a new page into https://github.com/apache/flink/tree/master/docs/internals, following these instructions? http://flink.apache.org/contribute-documentation.html -Zach On Fri, Feb 26, 2016 at 1:13 PM Aljoscha Krettek wrote: > Cool, that’s a nice wri

Re: Watermarks with repartition

2016-02-26 Thread Aljoscha Krettek
Yes, that would be perfect. Thanks! -- Aljoscha > On 26 Feb 2016, at 20:53, Zach Cox wrote: > > Sure, want me to open a jira issue and then PR a new page into > https://github.com/apache/flink/tree/master/docs/internals, following these > instructions? http://flink.apache.org/contribute-docume

Job Manager HA manual setup

2016-02-26 Thread Welly Tambunan
Hi All, we have already tried to set up Job Manager HA based on the documentation, using the scripts and the provided ZooKeeper. It works. However, currently everything is done using the start-cluster script, which I believe requires passwordless ssh between nodes. We are restricted with our environment

Re: Job Manager HA manual setup

2016-02-26 Thread Welly Tambunan
Typos: we have tried this; the job manager can fail over, but the task manager CAN'T be relocated to the new job manager. Are there some settings for this? Or can the task manager also be relocated to the new job manager? Cheers On Sat, Feb 27, 2016 at 7:27 AM, Welly Tambunan wrote:

RE: flink-storm FlinkLocalCluster issue

2016-02-26 Thread #ZHANG SHUHAO#
Thanks for the confirmation. When will 1.0 be ready in maven repo? From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of Stephan Ewen Sent: Friday, February 26, 2016 9:07 PM To: user@flink.apache.org Subject: Re: flink-storm FlinkLocalCluster issue Hi! On 0.10.x, the Storm com