Re: Flink Scala performance

2015-07-18 Thread Michele Bertoni
Hi, actually the same happens to me on my MacBook Pro when it is running on 
battery instead of being plugged in,
and it is twice as bad if I am using HDFS.

In my case it seems that in power-saving mode JVM commands have a very high 
latency.

For example, a simple "hdfs dfs -ls /" takes about 20 seconds when only on battery, so 
it is not related to Flink.

cheers


> On 18 Jul 2015, at 23:22, Vinh June wrote:
> 
> That sounds unreasonable to me, because I'm also working on other Java projects,
> and none of them takes that long to fire up the JVM. Strange!
> Do you have any suggestions for fixing this?



Re: Flink Scala performance

2015-07-18 Thread Vinh June
That sounds unreasonable to me, because I'm also working on other Java projects,
and none of them takes that long to fire up the JVM. Strange!
Do you have any suggestions for fixing this?



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2151.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-17 Thread Stephan Ewen
The 349ms is how long it takes to run the job. The 18s is what it takes the
command line client to submit the job.

Like I said before, maybe there are very long delays on your system when
you spawn JVMs, or in your DNS resolution. That way, connecting to the
cluster to submit the job will take a long time...
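
One way to test the JVM-spawn theory independently of Flink is to time how long a trivial child JVM takes to start and exit. The following is only a sketch: the class name JvmSpawnTimer is made up for illustration, and it measures "java -version" rather than the Flink command line client. If this alone takes several seconds on the affected machine, the delay is probably in JVM startup and not in the job.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Flink: times the start and exit of a child JVM.
public class JvmSpawnTimer {

    /** Runs "java -version" in a child process and returns the wall-clock time in ms. */
    static long timeChildJvmMillis() {
        // Use the same java binary that is running this program.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        // inheritIO() sends the child's output to our console, so it cannot block on a full pipe.
        ProcessBuilder pb = new ProcessBuilder(javaBin, "-version").inheritIO();

        long start = System.nanoTime();
        try {
            Process child = pb.start();
            child.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException("could not start " + javaBin, e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the child JVM", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Repeat a few times; the first run may be slower than the following ones.
        for (int i = 0; i < 3; i++) {
            System.out.println("child JVM start + exit: " + timeChildJvmMillis() + " ms");
        }
    }
}
```

Running it once on mains power and once on battery would also show whether the power-saving effect reported elsewhere in this thread applies to the machine in question.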

On Thu, Jul 16, 2015 at 5:53 PM, Vinh June 
wrote:

> I just checked on the web job manager: it says that the runtime of the Flink job is
> 349ms, but it actually takes 18s when measured with the "time" command in a terminal.
> Should I care more about the latter timing?


Re: Flink Scala performance

2015-07-16 Thread Vinh June
I just checked on the web job manager: it says that the runtime of the Flink job is
349ms, but it actually takes 18s when measured with the "time" command in a terminal.
Should I care more about the latter timing?



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2106.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-16 Thread Stephan Ewen
Is it possible that it takes a long time to spawn JVMs on your system, and
that this takes up all the time?

On Thu, Jul 16, 2015 at 3:34 PM, Vinh June 
wrote:

> Here are my logs:
> http://pastebin.com/AJwiy2D8
> http://pastebin.com/K05H3Qur
> From the client log, it seems to take ~2s, but with "time flink run ...",
> the actual time is ~18s.


Re: Flink Scala performance

2015-07-16 Thread Vinh June
Here are my logs:
http://pastebin.com/AJwiy2D8
http://pastebin.com/K05H3Qur
From the client log, it seems to take ~2s, but with "time flink run ...", the actual
time is ~18s.



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2095.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-16 Thread Stephan Ewen
If you use the sample data from the example, there must be an issue with
the setup.

In Flink's standalone mode, it runs in 100ms on my machine.

It may be possible that the command line client takes a long time to start
up, so it appears that the program run time is long. If it takes so long,
one reason may be slow DNS resolution.

You can check that by looking at the logs of the client process (in the
"log" folder).

Stephan
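
To probe the DNS theory from inside a JVM, one can time a lookup of the local host name. This is a minimal sketch under assumptions: the class name LocalHostLookupTimer is invented, and whether the Flink client performs exactly this lookup is not confirmed anywhere in this thread. It only shows whether name resolution as seen by Java is slow on the machine.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Flink: times a local host name lookup in the JVM.
public class LocalHostLookupTimer {

    /** Returns how long one lookup of the local host takes, in milliseconds. */
    static long timeLocalHostLookupMillis() {
        long start = System.nanoTime();
        try {
            InetAddress local = InetAddress.getLocalHost();
            System.out.println("resolved local host: " + local);
        } catch (UnknownHostException e) {
            // A failed lookup is informative too: the time spent failing is still measured.
            System.out.println("lookup failed: " + e.getMessage());
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // The first call is the interesting one; later calls may be answered from a cache.
        System.out.println("first lookup:  " + timeLocalHostLookupMillis() + " ms");
        System.out.println("second lookup: " + timeLocalHostLookupMillis() + " ms");
    }
}
```

A first lookup in the range of seconds would point at name resolution rather than at Flink.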


On Thu, Jul 16, 2015 at 2:06 PM, Vinh June 
wrote:

> @Stephan: I use the sample data that comes with the example


Re: Flink Scala performance

2015-07-16 Thread Vinh June
@Stephan: I use the sample data that comes with the example



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2091.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-16 Thread Chiwan Park
You can increase Flink managed memory by increasing the TaskManager JVM heap 
(taskmanager.heap.mb) in flink-conf.yaml.
The options are explained in the Flink documentation [1].

Regards,
Chiwan Park

[1] 
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#common-options

> On Jul 16, 2015, at 7:23 PM, Vinh June  wrote:
> 
> I found it in the JobManager log:
> 
> "21:16:54,986 INFO  org.apache.flink.runtime.taskmanager.TaskManager  - Using 25 MB for Flink managed memory."
> 
> Is there a way to set this explicitly for local execution?





Re: Flink Scala performance

2015-07-16 Thread Vinh June
I found it in the JobManager log:

"21:16:54,986 INFO  org.apache.flink.runtime.taskmanager.TaskManager  - Using 25 MB for Flink managed memory."

Is there a way to set this explicitly for local execution?



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2087.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-16 Thread Stephan Ewen
Vinh,

Are you using the sample data built into the example, or are you using your
own data?

On Thu, Jul 16, 2015 at 8:54 AM, Vinh June 
wrote:

> I ran it locally, from the terminal.
> And it's the Word Count example, so it's small.


Re: Flink Scala performance

2015-07-16 Thread Ufuk Celebi
Hey Vinh,

you have to look into the logs folder and find the log of the TaskManager 
(something like *taskmanager*.log)

– Ufuk

On 16 Jul 2015, at 11:35, Vinh June  wrote:

> Hi Max, 
> When I call 'flink run', it doesn't show any information like that



Re: Flink Scala performance

2015-07-16 Thread Vinh June
Hi Max, 
When I call 'flink run', it doesn't show any information like that



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2083.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-16 Thread Maximilian Michels
Hi Vinh,

If you run your program locally, then Flink uses the local execution mode,
which allocates only a little managed memory. Managed memory is used by Flink
to perform operations on serialized data. These operations can get slow if
too little memory is allocated, because data then needs to be spilled to disk.
That would of course be different in a cluster environment, where you
configure the memory explicitly.

When the task manager starts, it tells you how much memory it allocates.
For example, in my case:

10:12:37,655 INFO  org.apache.flink.runtime.taskmanager.TaskManager  - Using 1227 MB for Flink managed memory.

How does that look in your case?

Cheers,
Max
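
For comparing this value across machines or runs, the number can be pulled out of the log line with a small regular expression. The sketch below assumes the message has exactly the wording quoted in this thread ("Using <N> MB for Flink managed memory."); the class and method names are made up for illustration and are not Flink APIs.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink: extracts the managed memory size from a log line.
public class ManagedMemoryLogLine {

    // Assumed wording, taken from the log lines quoted in this thread.
    private static final Pattern MANAGED_MEMORY =
            Pattern.compile("Using\\s+(\\d+)\\s+MB\\s+for\\s+Flink\\s+managed\\s+memory");

    /** Returns the reported managed memory in MB, or empty if the line is another message. */
    static OptionalInt managedMemoryMb(String logLine) {
        Matcher m = MANAGED_MEMORY.matcher(logLine);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String line = "10:12:37,655 INFO  org.apache.flink.runtime.taskmanager.TaskManager"
                + "  - Using 1227 MB for Flink managed memory.";
        System.out.println(managedMemoryMb(line)); // prints OptionalInt[1227]
    }
}
```

The same method could be applied to each line of the TaskManager log file to find the value for a local run.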



On Thu, Jul 16, 2015 at 8:54 AM, Vinh June 
wrote:

> I ran it locally, from the terminal.
> And it's the Word Count example, so it's small.


Re: Flink Scala performance

2015-07-15 Thread Vinh June
I ran it locally, from the terminal.
And it's the Word Count example, so it's small.



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065p2074.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: Flink Scala performance

2015-07-15 Thread Aljoscha Krettek
Hi,
that depends. How are you executing the program? Inside an IDE? By starting
a local cluster? And then, how big is your input data?

Cheers,
Aljoscha

On Wed, 15 Jul 2015 at 23:45 Vinh June  wrote:

> I just realized that a Flink program takes a lot of time to run. For example,
> just the simple word count example in 0.9 takes 18s to run on my laptop
> (MBP, Mac OS 10.9, i5, 8GB RAM, SSD).
> Can anyone explain this or suggest a workaround?


Flink Scala performance

2015-07-15 Thread Vinh June
I just realized that a Flink program takes a lot of time to run. For example,
just the simple word count example in 0.9 takes 18s to run on my laptop (MBP,
Mac OS 10.9, i5, 8GB RAM, SSD).
Can anyone explain this or suggest a workaround?



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Scala-performance-tp2065.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.