> From your trace it appears that you have NPTL turned off. This means each
> thread has a pid and shows up in top. There are various disadvantages to
> this, but the main advantage is that if you do a stack dump and translate
> the pids into hexadecimal, you can cross-match the pids against the Java
> tids.
>
OK, just as additional information: htop shows the same on my 3 nodes (2
Debian stable and 1 unstable), so I guess a lot of Linux users have the same
settings. Besides, 'top' and 'ps aux' show only one java process.


> Please could you find out which threads are using all the CPU? For 
> example in the attached picture, 28142 (6dee) is the top item, using 38%.
>
Yep, so I put back the default limit of 500 threads and here are the stack 
traces:
http://chronos.kwain.net/~jflesch/new.wrapper.log
http://chronos.kwain.net/~jflesch/new.wrapper2.log
In both, there were two processes using more than 10% of the CPU (all the
others were < 5%): 1718 (6B6) and 1730 (6C2).
I was able to find the second one in the dump, but not the first one :/
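
For reference, the pid-to-hex matching can be sketched in a few lines of
Java. This assumes a HotSpot-style thread dump, where each thread line carries
a "nid=0x..." field; the class and method names are made up for the example.

```java
public class PidToNid {
    /** Formats a thread pid the way a HotSpot thread dump prints it (nid=0x...). */
    static String toNid(int pid) {
        return "0x" + Integer.toHexString(pid);
    }

    public static void main(String[] args) {
        // Thread pids quoted in this thread (from top/htop).
        int[] pids = {28142, 1718, 1730};
        for (int pid : pids) {
            System.out.println(pid + " -> nid=" + toNid(pid));
        }
    }
}
```

Searching the dump for the resulting string (for example "nid=0x6c2") should
then lead to the matching thread, if it was alive when the dump was taken.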


> Right now my node is using 30-50% of one core (2.2GHz atm). And that's with
> heavy logging, and queued requests. There have in the past been various
> causes for 100% CPU usage, in approximate order of popularity:
> - Running out of memory causing constant garbage collection. (easily
> checked with wrapper.java.additional=-Xloggc:freenet.loggc).
>
I added this option to my wrapper.conf, but nothing special appears in my
wrapper.log or in my freenet-latest.log. Nextgens suggested it may be due to
swapping (I gave 196MB of memory to my node, but the computer only has
256MB). Do you think that's possible? And if so, would the CPU usage be
attributed specifically to the node ('top' shows that at least 70% of the CPU
is used by the node)?
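
As a complement to -Xloggc, the share of time spent in garbage collection can
also be read from inside the JVM through the standard java.lang.management
beans. The following is only a sketch: it would have to run inside the node's
own JVM to say anything about the node, and nothing in this thread says
Freenet does this today. The class name is made up.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    /** Total time in ms that this JVM has spent in garbage collection so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC time: %d ms of %d ms uptime%n", totalGcMillis(), uptime);
    }
}
```

A GC time that stays close to the uptime would point to the "constant garbage
collection" case; a small value would point elsewhere (swapping, a loop, ...).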


> - Bugs causing infinite loops.
> - Heavy logging (check your config settings).
>
logger.priority=NORMAL (on my 3 nodes)

> - FEC encodes/decodes (not likely if no queued stuff/clients).
>
My queues are empty on my 3 nodes.


> - Resuming requests on startup (again not likely, but check that
> downloads.dat.gz is empty, it should be ~21 bytes if so).
>
22 bytes :)

> Most likely your problem is caused by one of the above issues IMHO. Could
> you please get a stack dump and another one of those thread/load dumps
> simultaneously? Then add "wrapper.java.additional=-Xloggc:freenet.loggc" to
> your wrapper.conf. Don't restart the node yet as the thread/load dump may
> suggest something I'd want more info on.
>
Hm, too late since I reduced the thread limit to 100. The two stack traces
above were taken ~1 min after the node started. Do you want me to wait
longer before taking other stack traces?


> More comments below...
>
> On Thursday 21 February 2008 00:47, you wrote:
> > Hi,
> >
> > I'm sorry to annoy you with that (I know that a lot of users probably
> > already did), but I have various questions about the node, the main
> > one being "why does it use so many resources?!".
> >
> > There is a reason behind this question: I'll have to shut down my
> > node on my dedicated server because, honestly, having the CPU used at
> > 100%, 3/4 of the memory used all the time, and a load always >= 2
> > even when no downloads/insertions are running is not acceptable for me
> > anymore :(  (it's a Celeron 2.6GHz with 256MB of memory)
>
> If you could try to help me with some debugging first that'd be awesome. If
> not you're just moaning. :)
>
> I'm assuming you also run More Important Things on the machine mentioned?
> 100% CPU should not be a reason to decommission it by itself, unless power
> consumption is a big deal.
>
> > See for yourself:
> > http://zeus.kwain.net/~jflesch/chronos.jpg
> > http://chronos.kwain.net/~jflesch/wrapper.log (I have these messages
> > almost every hour in my wrapper.log, and after some restarts, the
> > wrapper stops the node).
>
> Does not explain why it restarts the node in the first place. But it does
> suggest a memory problem.
>
> > Note to toad : can you remove my ref from the seednodes file please ?
> >
> > I have the feeling that the CPU use is somehow related to the number
> > of threads used by the node. And when I see the number of threads used
> > by the node, I can only wonder "damn! why does it need so many
> > threads?!". I mean, a typical network application only needs a few
> > threads for each socket:
> > - One thread reading what's coming in on the socket, and starting a new
> >   thread for each message read (using a thread pool like you do if
> >   needed)
> > - If you have a lot of things to send, one thread used to send all the
> >   data that is put in a queue (or any more sophisticated structure).
> > When I look at the node stats, I see a lot of threads which, according
> > to their names, are probably waiting for a specific message on the
> > socket or waiting to send their own message on it.
> > Moreover, I think that by drastically reducing the number of threads,
> > the node could save a lot of memory currently used by the thread
> > stacks. So, would it be possible with the current Freenet code? (I tried
> > to determine that by myself, but I must admit that the code of the
> > node is quite huge).
>
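
To make the two-thread model above concrete, here is a minimal sketch written
for a current JDK. It is not Freenet code: the socket is replaced by plain
in-memory lists so that the example runs on its own, and all names
(ReaderSenderSketch, run, the "ack:" replies) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ReaderSenderSketch {
    // Poison pill, compared by identity, that tells the sender thread to stop.
    private static final String STOP = new String("stop");

    /** Handles every incoming message on a pool and sends replies from one thread. */
    static List<String> run(List<String> incoming) {
        final BlockingQueue<String> outgoing = new LinkedBlockingQueue<>();
        final List<String> sent = Collections.synchronizedList(new ArrayList<>());
        ExecutorService handlers = Executors.newFixedThreadPool(4);

        // Single sender thread: drains the queue; "sent" stands in for the socket.
        Thread sender = new Thread(() -> {
            try {
                String m;
                while ((m = outgoing.take()) != STOP) {
                    sent.add(m);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sender");
        sender.start();

        // Reader role: hand each message to the pool instead of handling it inline.
        for (String msg : incoming) {
            handlers.execute(() -> outgoing.add("ack:" + msg));
        }
        handlers.shutdown();
        try {
            handlers.awaitTermination(5, TimeUnit.SECONDS);
            outgoing.add(STOP); // all handlers are done, let the sender exit
            sender.join();
        } catch (InterruptedException e) {
            sender.interrupt();
            Thread.currentThread().interrupt();
        }
        return sent;
    }

    public static void main(String[] args) {
        List<String> in = new ArrayList<>();
        Collections.addAll(in, "a", "b", "c");
        System.out.println("replies sent: " + run(in).size());
    }
}
```

The order of the replies is not guaranteed, since the handlers run in
parallel; only the sending itself is serialized.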
> I very much doubt that thread stacks are *that* big a problem. Java is
> designed to be multi-threaded. And there are good reasons to be
> multi-threaded. Most of our threads are described below:
> - We do indeed have one thread per UDP socket to handle incoming data.
> - We also have one thread to handle sending data (and a few other
> administrative tasks).
> - We have threads for FEC encoding.
> - We have threads for sending requests and inserts (and announcements).
> Receiving requests and inserts uses a thread for a short period and then
> goes asynchronous. The point is, this is core logic, it is *really*
> important that it work well. The code is vastly simpler by using a thread
> per request rather than a state machine, and in terms of getting good
> debugged code, simplicity matters. If we have to reduce the number of
> threads used by requests, the best solution is probably to pull in a
> continuations library and use that, but so far I haven't seen any reason to
> do so.
> - We have threads for block transfer. This can be reduced; right now it is
> a good deal more complicated than it needs to be. It will happen when we do
> some major link level rewriting.
> - We have threads for FCP and HTTP - one per connection and one per server
> socket. This is easiest, and on an idle node it's just 2 threads (3 if
> console is enabled).
> - There are a few housekeeping jobs that still have dedicated threads even
> though they sleep most of the time e.g. DNSRequester. Feel free to convert
> them into ticker jobs.
>
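
About the last point (housekeeping jobs as ticker jobs), the general idea can
be sketched with the JDK's ScheduledExecutorService. This is a stand-in and an
assumption on my side: I don't know the node's own ticker API, and the names
below (TickerSketch, runFor) are made up.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TickerSketch {
    /** Runs a periodic job on a shared scheduler thread and returns how often it ran. */
    static int runFor(long periodMillis, long totalMillis) {
        final AtomicInteger runs = new AtomicInteger();
        // One scheduler thread can serve many periodic jobs, instead of one
        // mostly-sleeping dedicated thread per job.
        ScheduledExecutorService ticker = Executors.newSingleThreadScheduledExecutor();
        ticker.scheduleAtFixedRate(() -> runs.incrementAndGet(),
                0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(totalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        ticker.shutdownNow();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("job ran " + runFor(10, 200) + " times");
    }
}
```

Each job converted this way saves one thread (and its stack), at the cost of
jobs delaying each other if one of them runs for a long time.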
> > About the memory use, I have a similar question: where does all the
> > memory given to the node go?! Before, I believed that most of the
> > memory was used to index the datastore. But a node with a default
> > configuration only uses 20MB of memory for this task, so what is the
> > remaining 100MB used for?
>
> BDB allows you to control some but not all of its memory usage. The part it
> lets you control is 20MB, it uses a significant amount beyond that (maybe
> 50%). On all nodes I've profiled, most of the remaining memory is used for
> queued requests. I have no idea why a node with no queued downloads would
> use a lot of memory or CPU, my experience suggests this is a bug.
>
OK, tell me if you need more information (logs, etc). I already gave my logs
(wrapper.log / freenet-latest.log) to nextgens about one week ago, but he
didn't find anything conclusive :/


> > I'm asking these questions because I really think that if these issues
> > could be solved, then the bandwidth available over Freenet could become
> > much more interesting (more bandwidth => more spreading of Freenet
> > around the world => more chances of reaching the countries for
> > which Freenet is designed).
>
> Freenet is quite capable of reaching the sort of bandwidth that is readily
> available to most users right now. For example on my node, with a 40kB/sec
> output limit:
>     * Total Input: 4.15 GiB (33.4 KiB/sec)
>     * Total Output: 4.37 GiB (35.2 KiB/sec)
>     * Payload Output: 2.80 GiB (22.5 KiB/sec)(64%)
>
Hm, yes, in fact, I have similar values on my stats page. But when I start a
download in Thaw, the maximum speed I can get is between 500 bytes/s and
7Kb/s. But now that you mention it, I think it's probably more a matter of
churn/etc than a matter of CPU :/


> Dedicated hosts on 100Mbps lines are really not that interesting for a
> fully distributed system.
>
> > Moreover, I tend to see Freenet as just
> > another layer above UDP/IP, but I can't imagine having a network
> > layer using so much memory/CPU (and that's regrettable because I
> > really believe that distributed storage can be the future of
> > networking).
>
> It is possible to go over the top with optimising. Programmer time is very
> expensive compared to RAM. The average new PC has 1GB of RAM, maybe 512M
> for a really low-end system.
>
Hem ... both of my nodes have only 256MB of RAM (and I can't upgrade the
dedicated one) ... and I think a lot of users (not just "a few geeks") are
running their nodes on old computers with similar amounts of memory instead
of on their usual personal computer.


> Having said that, by all means we should deal 
> with all the low-hanging fruit; Freenet should not use more memory and CPU
> than it needs to, for several reasons:
> - People in the places that need Freenet most may have low-end hardware.
> - Freenet's impact on a desktop system needs to be minimal if people are to
> continue to use it and allow it to run 24x7 in the background.
> - Upstream bandwidth is rising, albeit slowly in most places.
> - A few geeks will run it on low-end dedicated servers, whether on their
> home LAN or at Bytemark etc.



-- 
Jerome Flesch