Re: help in tracking down 1481 memory leak (with reproduction steps)
Hello,

There's a tentative fix at [1]. If anyone could take a look at it and try it out, that would speed things up. I've been running on that branch for over 24 hours with no issues so far.

Best regards,

[1]: https://github.com/freenet/fred/pull/640
Re: help in tracking down 1481 memory leak (with reproduction steps)
This is somewhat preliminary, but I got to this point: [1] (`always use BouncyCastle in KeyGenUtils`).

I found some (old) posts about memory leaks with dynamic providers [2][3][4]. In [2], a static method to install providers is mentioned. I'm still running tests against those commits, [1] and the previous one [5].

Best regards,

[1]: https://github.com/freenet/fred/commit/abad64d133ed9a674d5f666f48db178a85652b9e
[2]: http://www.bouncycastle.org/wiki/display/JA1/Provider+Installation
[3]: http://disq.us/p/198y23e
[4]: http://bouncy-castle.1462172.n4.nabble.com/Re-memory-leak-td4655694.html
[5]: https://github.com/freenet/fred/commit/8988466283ee43f8ac35c308d4af3fc59172472f
Re: help in tracking down 1481 memory leak (with reproduction steps)
On 2018-10-08 18:00, Arne Babenhauserheide wrote:
> DC* writes:
>
> Do you have experience with profiling Java for memory leaks?

No, I have no experience, but I'm trying to triangulate the issue commit by commit to see where it was introduced and to reduce the scope.

> The only lead I have right now is that something with threading might go
> wrong, since we now have native thread priorities and these might be
> stalling something which would release references to objects.

I saw some commits relating to threading, so that may be the case. I tried to see the changes between 1480 and 1481, but there are no tags/branches to easily compare. I'm moving between commits by cherry-picking; is there a better way?

Best regards,
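A hedged sketch of an alternative to cherry-picking: if the two build commits are known, `git log` can list everything between them and `git bisect` can binary-search for the first bad commit. The tiny throwaway repository below only stands in for fred, and the commit messages are made up; in practice GOOD/BAD would be the actual 1480/1481 commit hashes.

```shell
#!/bin/sh
# Sketch: comparing two builds without cherry-picking. A throwaway repo
# stands in for fred; GOOD/BAD would really be the 1480/1481 commits.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "build 1480 (good)"
GOOD=$(git rev-parse HEAD)
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "threading change (suspect)"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "build 1481 (bad)"
BAD=$(git rev-parse HEAD)

# every commit between the two builds, newest first:
git log --oneline "$GOOD".."$BAD"

# binary-search instead of walking commit by commit; at each step you
# would build/run the node, then mark it `git bisect good` or `git bisect bad`:
git bisect start "$BAD" "$GOOD" > /dev/null
git bisect reset > /dev/null
```

With a real leak that takes ~15 minutes to show, bisect keeps the number of test runs logarithmic in the number of commits between the builds.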
Re: help in tracking down 1481 memory leak (with reproduction steps)
DC* writes:
> Here are my logs (log.level DEBUG). My node restarted several times at
> 15m, 20m, 30m. The log named `check-alive.log` is the output from the
> gist (it's cut off but shows enough information).

Thank you! Yours is the first reproduction outside my own machines. I was close to concluding that it’s just something borked here, but it seems there’s an actual (and serious) problem with 1481.

> If there is anything else I could help with, let me know.

Do you have experience with profiling Java for memory leaks? The only lead I have right now is that something with threading might go wrong, since we now have native thread priorities and these might be stalling something which would otherwise release references to objects.

Best wishes,
Arne

--
Unpolitisch sein heißt politisch sein ohne es zu merken
(To be unpolitical means to be political without noticing it)
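Not a fred-specific recipe, but a hedged sketch of what "profiling for memory leaks" can look like with the stock JDK tools, no full profiler needed. The `pgrep` pattern is an assumption about how the node's JVM shows up in the process list, and the dump path is arbitrary.

```shell
#!/bin/sh
# Sketch: first-pass Java leak hunting with plain JDK tools (jmap/jcmd).
# The pgrep pattern is a guess at the node's command line, not a fred
# detail; the [n] bracket keeps the pattern from matching itself.
PID=$(pgrep -f 'java.*free[n]et' | head -n 1)
if [ -n "$PID" ]; then
    # live-object histogram: the classes at the top are the leak suspects
    jmap -histo:live "$PID" | head -n 25
    # full heap dump for offline analysis in e.g. Eclipse MAT or VisualVM
    jcmd "$PID" GC.heap_dump /tmp/fred-1481.hprof
else
    echo "no freenet JVM found"
fi
```

Taking two histograms a few minutes apart and diffing the instance counts usually points at which class is accumulating.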
Re: help in tracking down 1481 memory leak (with reproduction steps)
DC* writes:
> Are there any debug/logging/stack trace settings we could enable to see
> where it died?

You can set logging in wrapper.conf; see the wrapper.logfile.loglevel and wrapper.console.loglevel lines.

> I'm going to set up a container to try this out.

Thank you!

Best wishes,
Arne

--
Unpolitisch sein heißt politisch sein ohne es zu merken
(To be unpolitical means to be political without noticing it)
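For reference, the relevant wrapper.conf lines (Java Service Wrapper syntax) look roughly like this; the levels shown are example values for leak hunting, not fred's defaults:

```
# wrapper.conf (Java Service Wrapper) — example logging levels
wrapper.logfile.loglevel=DEBUG    # what is written to the wrapper log file
wrapper.console.loglevel=INFO     # what is echoed to the console/stdout
```

DEBUG on the logfile makes the wrapper record JVM restarts and exit reasons, which is what you want when the node dies with an OOM.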
help in tracking down 1481 memory leak (with reproduction steps)
Hi,

The past weeks I’ve been stumped trying to track down a severe memory leak which prevents releasing 1481. I have a hard time tracking down where and why exactly it happens, therefore I’d be very grateful for your help.

The following shows how to reproduce the problem on GNU/Linux: getting Freenet 1481 to crash with an Out-of-Memory error within less than 30 minutes. The gist is: upload a file.

tee freenet-1481-OOM-reproduction.sh << 'EOF'
wget https://github.com/freenet/fred/releases/download/build01481/new_installer_offline_1481.jar
java -jar new_installer_offline_1481.jar
# click through the setup wizard and the in-browser first-run wizard,
# give Freenet high upload bandwidth (e.g. 164kiB/s)

# give freenet time to start the FCP server
sleep 180

# prepare a file to upload
INSERTFILE="$(mktemp /tmp/insert.temp.XX)"
head -c 100M < /dev/urandom > "$INSERTFILE"
IDENT=testupload"${INSERTFILE##*.}"

# prepare the command to connect to freenet and upload the file
# connect with ClientHello
TEMPFILE="$(mktemp /tmp/insert.temp.XX)"
echo ClientHello > $TEMPFILE
echo "Name=Upload-Test${INSERTFILE##*.}" >> $TEMPFILE
echo ExpectedVersion=2 >> $TEMPFILE
echo End >> $TEMPFILE
echo >> $TEMPFILE

# upload with ClientPut
echo ClientPut >> $TEMPFILE
echo "DontCompress=true" >> $TEMPFILE
echo "URI=CHK@/testupload" >> $TEMPFILE
echo "Identifier=$IDENT" >> $TEMPFILE
echo MaxRetries=-1 >> $TEMPFILE
echo UploadFrom=direct >> $TEMPFILE
echo DataLength=$(ls -l $INSERTFILE | cut -d " " -f 5) >> $TEMPFILE
echo Persistence=forever >> $TEMPFILE
echo Global=true >> $TEMPFILE
echo End >> $TEMPFILE
cat $INSERTFILE >> $TEMPFILE

# run the insert
(cat $TEMPFILE | nc 127.0.0.1 9481) &

# watch how long the node lives
for i in {1..100}; do
    curl 'http://127.0.0.1:/stats/?fproxyAdvancedMode=2' 2>/dev/null | grep -io nodeUptimeSession.*'<' | grep -io '[^;]*s<' | grep -io '.*s'
    curl 'http://127.0.0.1:/stats/?fproxyAdvancedMode=2' 2>/dev/null | grep -io '[^>]* java memory.*&' | grep -io '[^&]*'
    sleep 5
done
EOF

I hope this allows you to reproduce the problem, and I would be very happy if you could find and fix its source! This has been blocking the release of 1481 for far too long.

With 33 peers as target (but only up to 25 actually connected; my connection isn’t that fast), this gets Freenet to die with an OOM within less than 15 minutes (last successful stats page at 14m17s).

Best wishes,
Arne

--
Unpolitisch sein heißt politisch sein ohne es zu merken
(To be unpolitical means to be political without noticing it)
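As a hedged alternative to scraping the stats page in a curl loop, the JVM's garbage collector can be watched directly with `jstat` (a stock JDK tool). The `pgrep` pattern below is a guess at the node's process name, not a fred specific; an old generation pinned near 100% while the full-GC count keeps climbing is the classic leak signature.

```shell
#!/bin/sh
# Sketch: watching the node's heap with jstat instead of the stats page.
# The pgrep pattern is an assumption; [n] keeps it from matching itself.
PID=$(pgrep -f 'java.*free[n]et' | head -n 1)
if [ -n "$PID" ]; then
    # one sample every 5s, 12 samples: O = old-gen occupancy %, FGC = full-GC
    # count. O stuck near 100 while FGC rises means the heap never frees.
    jstat -gcutil "$PID" 5000 12
else
    echo "no freenet JVM found"
fi
```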