On Monday 29 December 2008 15:56, j16sdiz at freenetproject.org wrote:
> Author: j16sdiz
> Date: 2008-12-29 15:56:55 +0000 (Mon, 29 Dec 2008)
> New Revision: 24821
>
> Added:
>    trunk/plugins/XMLSpider/db/
>    trunk/plugins/XMLSpider/db/Config.java
>    trunk/plugins/XMLSpider/db/Page.java
>    trunk/plugins/XMLSpider/db/PageTimeStampComparator.java
>    trunk/plugins/XMLSpider/db/PerstRoot.java
>    trunk/plugins/XMLSpider/db/Status.java
>    trunk/plugins/XMLSpider/db/Term.java
>    trunk/plugins/XMLSpider/db/TermPosition.java
> Removed:
>    trunk/plugins/XMLSpider/Config.java
>    trunk/plugins/XMLSpider/MaxPageId.java
>    trunk/plugins/XMLSpider/Page.java
>    trunk/plugins/XMLSpider/Status.java
>    trunk/plugins/XMLSpider/Term.java
>    trunk/plugins/XMLSpider/TermPosition.java
> Modified:
>    trunk/plugins/XMLSpider/IndexWriter.java
>    trunk/plugins/XMLSpider/XMLSpider.java
>    trunk/plugins/XMLSpider/web/ConfigPage.java
>    trunk/plugins/XMLSpider/web/MainPage.java
> Log:
> Port the whole thing to PERST
>
> Less disk i/o, faster processing, lessor CPU, messier code

Yay, now we depend on 3 databases!

It looks like cleaner code to me; well, at least XMLSpider.java itself is a lot cleaner... Perst always seemed to fit a lot more comfortably into the Java way of doing things than db4o, with good collections (unlike db4o), garbage collection (unlike db4o), etc. It's a set of persistent types rather than a queryable database, but that's often better, although there are definitely cases where db4o queries, or automatic index updating, have simplified things significantly...

Using accessors everywhere avoids the need to explicitly store affected objects? Actually, w.r.t. accessors automatically storing, that IS possible in db4o: you just have to store the container in the load-into-memory callback. The only reason I don't do that in the client layer is that passing the container around in arguments makes it very easy to discover when I accidentally run database code on the wrong thread... messy though.

I believe Perst does still have the corrupts-the-database-on-out-of-disk-space bug though... all 3 databases had this, but they fixed it in db4o when I reported it.

Are you using multiple transactions simultaneously? Always calling to the root, and thereby automatically not getting cached objects when there is a rollback, is interesting; rollback on db4o is too much of a PITA to be used routinely. On the other hand, on db4o I can keep stuff in RAM and just use it, provided I only access the database on one thread...

What's up with the assert !isPersistent()'s, e.g. in Config? AFAICS Config is always persistent...

PS Term.MD5: you don't need to create the md5hash before you assign to it. And convertToHex: we have a perfectly good bytesToHex in HexUtil; please avoid duplication unless you have a good reason.
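
To make the PS about Term.MD5 concrete, here is a rough sketch of the shape I mean. It is not the actual XMLSpider code: the class name Md5Sketch, the method md5Hex and the UTF-8 encoding are assumptions for illustration, and the local bytesToHex is only a stand-in so the snippet compiles on its own (in the plugin you would call HexUtil.bytesToHex instead; lowercase output here is likewise an assumption).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not the real Term class from XMLSpider.
final class Md5Sketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Stand-in for HexUtil.bytesToHex so this sketch is self-contained.
    // In the plugin, reuse the existing helper rather than a local copy.
    static String bytesToHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;          // treat the byte as unsigned
            out[2 * i] = HEX[v >>> 4];        // high nibble
            out[2 * i + 1] = HEX[v & 0x0F];   // low nibble
        }
        return new String(out);
    }

    // Returns the MD5 of the text as a hex string (UTF-8 is an assumption).
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // digest() returns a freshly allocated array, so there is no
            // need to create md5hash with "new byte[...]" beforehand.
            byte[] md5hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
            return bytesToHex(md5hash);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc"));
    }
}
```

The point is just that the array comes straight from digest(), and that the hex conversion lives in one shared helper rather than being re-implemented per class.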