Hi Nicolau,

Your description is not weird at all: ForkJoinPool's worker threads scan
for work when they have none of their own. They do that for about 2
seconds, and if they still can't find any work they decommission themselves.

Note that the default configuration for Akka needs to be tuned for the
application at hand as there is no "one-size-fits-all" when it comes to
different kinds of applications/systems.
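
As a rough illustration of what such tuning can look like, here is a sketch
of an application.conf fragment for the default dispatcher. The keys are the
standard fork-join-executor settings; the values are only assumptions picked
for a machine with 32 hardware threads, not recommendations, and should be
validated by measuring the actual job.

```hocon
# application.conf (illustrative values, assuming 32 hardware threads)
akka.actor.default-dispatcher {
  executor = "fork-join-executor"

  fork-join-executor {
    # Pool size = available processors * parallelism-factor,
    # clamped to the range [parallelism-min, parallelism-max].
    parallelism-min = 32
    parallelism-factor = 1.0
    parallelism-max = 32
  }

  # Maximum number of messages an actor processes before its thread
  # moves on to the next actor. Higher values favor throughput over fairness.
  throughput = 20
}
```

Pinning min and max to the same value keeps the pool at a fixed size, which
makes experiments easier to compare. Whether that helps with the bursts seen
in htop is something only a measurement on the real workload can tell.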

On Sun, Jul 12, 2015 at 7:30 PM, Nicolau Werneck <nwern...@gmail.com> wrote:

> That message was pretty crazy, so I wanted to give some follow-up. I made
> some changes to my system, and now I was able to reach 100% CPU utilization
> most of the time. I am creating this map-reduce framework while at the same
> time using Spark to solve the same problem, and I believe I got a 40% time
> improvement over it. So that was cool.
>
> I would still like to understand better how Akka is behaving in my system,
> though. What I am doing is similar to a big "word count" problem, with 1M
> lines and 10k words per line out of a possible 100k words vocabulary. The
> change I made was that instead of having each word to be counted travel
> as a separate message to the final reducers, I am now moving iterators
> around. So I start reading a file chunk, then map functions lazily, and
> then I only actually parse and count the "words" in the reducers, and
> later I add them up. Reducing the number of messages flowing
> apparently was a great help.
>
> What is crazy is that I can still notice by following the process in htop
> that there is some cyclic behavior. The CPUs get to 100% with user load,
> but then move to ~100% kernel load, and then there are brief periods of
> less than 100% occupation. Still looks like some kind of IO related
> locking, but it should be possible to have this program running like a
> steady flow of data processing... So I'm still looking for advice on how to
> debug this system. Is Kamon the best tool to make this kind of inspection?
>
> Thanks, and sorry if this question is getting too weird or off-topic!
>     ++nic
>
>
>
>
> On Saturday, June 27, 2015 at 4:20:46 PM UTC-3, Nicolau Werneck wrote:
>>
>> I have created this small framework for running map-reduce data
>> processing jobs based on Akka. It's on github,
>> https://github.com/projetoeureka/akka-mapreduce. (I know, I could be
>> using Spark, but I still think it may pay off some day...)
>>
>> I have used it successfully with small tests and some medium sized jobs,
>> but now I'm finally trying to run a very large job. I have not yet tried to
>> use a cluster with separate machines, but I am using a 32-core (actually 8
>> cores x4 threads AFAIK) AWS machine for this test.
>>
>> I am writing because I am a bit confused about the behavior I am seeing
>> in the machine when I execute the job. The program appears to be running
>> fine, I have some output indicating that. But if I open "htop" to look at
>> the CPU load, there is a weird behavior. The total CPU load oscillates,
>> going to almost 100% for a little time in bursts, and then moving down to a
>> very low value. There will be like 6 java threads running with 30% CPU load
>> for a while, and then 20 threads with 100% and then back to low. That
>> happens like every 20 seconds.
>>
>> It looks like some kind of file blocking, but I find it very difficult...
>> Also in my problem each line from the input generates lots of data to the
>> reducers, so the file IO really should not matter.
>>
>> Any ideas of what may be causing this, and how I could investigate? In my
>> 2x2 cores notebook I get 400% CPU utilization, so I don't even know exactly
>> how to reproduce the problem in other situations.
>>
>> The attachments show the weird behavior on htop.
>>
>> Thanks,
>>     ++nic
>>
>  --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ:
> http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups
> "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to akka-user+unsubscr...@googlegroups.com.
> To post to this group, send email to akka-user@googlegroups.com.
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Cheers,
√
