[19:26] <FuriousRage> hmm, the problem with my node dying constantly seems to sort of be solved now when i run sun java 1.6.0-beta
[19:43] <FuriousRage> toad_: 1.6 seems to work much better for me in windows with freenet than 1.4 and 1.5 did ;>
[19:43] <FuriousRage> but my node hasn't died for a few hours now
[19:44] <FuriousRage> ;>
[19:44] <toad_> FuriousRage: your node was dying on windows?
[19:44] <toad_> what do you mean, with backoff?
[19:44] <Ricky_081> toad_: I have 1.6, too - seems stable on a suse linux 10 machine
[19:44] <toad_> Ricky_081: define stable ?
[19:44] <FuriousRage> toad_: not backed off
[19:44] <toad_> hmmm, interesting
[19:47] <sandos__> I've never seen ping times freeze on windows+1.5.06
[19:47] <FuriousRage> it did it all the time for me
[19:47] <Ricky_081> toad_: higher cpu power used. - But also new kernel with 1kHz timer
[19:48] <Ricky_081> toad_: and voluntary kernel preemption enabled
[19:49] <Ricky_081> toad_: kernel 2.6.13-15.10 for P IV compiled
[19:50] <Ricky_081> toad_: CONNECTED: 18 | BACKED OFF: 8 ==> before it was about 4 connected, the rest was backed off
[19:52] <Ricky_081> sandos: my pingtimes "exploded" on linux+1.5.06 - first they were about 300ms, then they rose to 1k but no upstream used and processor idle
[19:53] <sandos__> ok
[19:53] <sandos__> how are they now?
[19:54] <Ricky_081> sandos: between 200 and 400ms
[19:54] <Ricky_081> sandos: with 16 nodes connected and 35kByte/s upstream
On Sat, May 27, 2006 at 08:33:27PM +0100, Matthew Toseland wrote:
> Some folk say 1.6 may fix the problem. Feedback would be appreciated.
>
> On Fri, May 26, 2006 at 09:05:32PM +0100, Matthew Toseland wrote:
> > According to edt, non-NPTL operation of the JVM is not supported on
> > 64-bit platforms.
> >
> > So that is yet another reason why the hack that we used in 0.5 isn't the
> > right solution. It doesn't work on 64-bit. It doesn't work on Fedora
> > Core 5, and it probably won't work on recent SuSE either.
> >
> > On Fri, May 26, 2006 at 07:40:39PM +0100, Matthew Toseland wrote:
> > > It would be useful to know for sure that this happens on 1.6 beta and
> > > 1.5.0_07... Anyone?
> > >
> > > On Fri, May 26, 2006 at 05:53:21PM +0100, Matthew Toseland wrote:
> > > > Another 3 stack traces here, of a different lost lock (still around
> > > > PacketSender).
> > > >
> > > > http://amphibian.dyndns.org/argh.2.txt
> > > >
> > > > The obvious solution would seem to be - and has been in the past -
> > > > LD_LIBRARY_PATH. Unfortunately there are systems on which this causes a
> > > > crash by itself e.g. some gentoo's, and nextgens tells me that some
> > > > users seem to get the same bug on Windows, although this is difficult to
> > > > confirm as they can't easily get a stack trace.
> > > >
> > > > For me this is triggered by inserts.
> > > >
> > > > It is known to happen on 1.4.2 and 1.5.0_06 (Sun *and* Blackdown).
> > > >
> > > > What we DO need to know is if it happens on Windows. Anyone who can get
> > > > a stack dump on a Windows node, watch out for all nodes getting backed
> > > > off due to Timeout3 or AcceptedTimeout (the same reason on all or most
> > > > nodes), and get some stack dumps. Our past experience is that this is an
> > > > NPTL issue and therefore linux-specific.
> > > >
> > > > IBM isn't tested yet. GCJ/GIJ should be immune, and nextgens is working
> > > > on that.
> > > >
> > > > On Wed, May 24, 2006 at 11:27:30PM +0100, Matthew Toseland wrote:
> > > > > Observe the two stack traces here:
> > > > > http://amphibian.dyndns.org/argh.txt
> > > > >
> > > > > Look at PacketSender in both cases. There were some seconds between
> > > > > them, but they're both the same. It has locked one lock, and it is
> > > > > waiting for the other. The other lock is not held by any thread.
> > > > >
> > > > > This is accompanied by weird symptoms: Every node is backed off
> > > > > because of an AcceptedTimeout.
> > > > >
> > > > > In conclusion? The current 0.7 code triggers a JVM bug - at least on
> > > > > my machine - which kills us. I've seen the same thing with logging.
> > > > >
> > > > > Any ideas for a way forward? Or any ideas for why I am wrong (I hope I
> > > > > am)? This is consistent, I just did another one, many minutes later.
> > > > > It always has:
> > > > >
> > > > > "PacketSender thread for 0" daemon prio=1 tid=0x0825bbd8 nid=0x8c0 waiting for monitor entry [0xb11ff000..0xb11ff5c0]
> > > > >     at freenet.node.KeyTracker.getNextUrgentTime(KeyTracker.java:790)
> > > > >     - waiting to lock <0x7ef4d718> (a freenet.support.UpdatableSortedLinkedListWithForeignIndex)
> > > > >     at freenet.node.PeerNode.getNextUrgentTime(PeerNode.java:641)
> > > > >     - locked <0x7e129c78> (a freenet.node.PeerNode)
> > > > >     at freenet.node.PacketSender.realRun(PacketSender.java:85)
> > > > >     at freenet.node.PacketSender.run(PacketSender.java:47)
> > > > >     at java.lang.Thread.run(Thread.java:595)
> > > > >
> > > > > And in all 3 cases (and with the same problem with logging earlier),
> > > > > 0x7ef4d718 is not locked by any thread.
> > > > >
> > > > > And it's not looping; it's the same lock it's trying to get, and the
> > > > > same lock it's got already, in all 3 cases.
> > > > >
> > > > > This is with sun java 1.5.0_06.
> > > > > --
> > > > > Matthew J Toseland - toad at amphibian.dyndns.org
> > > > > Freenet Project Official Codemonkey - http://freenetproject.org/
> > > > > ICTHUS - Nothing is impossible. Our Boss says so.
> > > > >
> > > > > _______________________________________________
> > > > > Devl mailing list
> > > > > Devl at freenetproject.org
> > > > > http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
--
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
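
The diagnosis in this thread rests on reading stack dumps by hand: find the thread that is "waiting to lock" a monitor, then check whether any other thread shows that monitor as "locked". Below is a rough sketch of the same check done from inside the JVM, using the java.lang.management API available since Java 5. The class name BlockedThreadReport, its methods and the demo thread names are made up for illustration and are not part of Freenet. The sketch only repeats what the JVM itself reports about lock ownership, so whether it would actually expose the lost lock described above is an open question.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper for illustration only; not part of Freenet.
public class BlockedThreadReport {

    /** Returns one line per thread that is blocked waiting to enter a monitor. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // maxDepth 0: only lock information is needed, not stack frames.
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), 0);
        List<String> report = new ArrayList<String>();
        for (ThreadInfo info : infos) {
            // Entries are null for threads that exited between the two calls.
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            long ownerId = info.getLockOwnerId(); // -1 when no thread owns the lock
            String owner = (ownerId == -1)
                    ? "NO OWNER REPORTED"
                    : "\"" + info.getLockOwnerName() + "\" (id " + ownerId + ")";
            report.add("\"" + info.getThreadName() + "\" waiting to lock "
                    + info.getLockName() + ", held by " + owner);
        }
        return report;
    }

    /** Sets up one holder and one waiter thread and returns the waiter's report line. */
    public static String demo() {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(new Runnable() {
            public void run() {
                synchronized (lock) {
                    held.countDown();      // signal that the lock is now held
                    awaitQuietly(release); // keep holding it until the demo ends
                }
            }
        }, "demo-holder");
        Thread waiter = new Thread(new Runnable() {
            public void run() {
                awaitQuietly(held);
                synchronized (lock) { /* entered only after the holder lets go */ }
            }
        }, "demo-waiter");
        holder.setDaemon(true);
        waiter.setDaemon(true);
        holder.start();
        waiter.start();
        try {
            // Poll for up to five seconds until the waiter shows up as blocked.
            long deadline = System.currentTimeMillis() + 5000;
            while (System.currentTimeMillis() < deadline) {
                for (String line : blockedThreads()) {
                    if (line.startsWith("\"demo-waiter\"")) {
                        return line;
                    }
                }
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return null;
        } finally {
            release.countDown(); // let both demo threads finish
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On a healthy JVM the demo line names demo-holder as the owner. A node could call blockedThreads() periodically and log any line that says "NO OWNER REPORTED", which is the situation the stack dumps above appear to show; this may also help on Windows, where getting a stack dump is reported to be harder.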
