I was able to reproduce some database corruption, using the attached foo.pl script.
I ran it once to create the xapian db, and ctrl-c'd after a while.
xapian-check was happy. Then I ran it in an evil loop:

    while xapian-check foo; do perl foo.pl & sleep 1m; kill -9 %1; done

After about an hour, xapian-check found that the database had indeed
become corrupt:

    hjoey@pell:~/foo>xapian-check .
    record:
    baseA blocksize=8K items=2580000 lastblock=27242 revision=258 levels=2 root=27030
    B-tree checked okay
    record table structure checked OK

    termlist:
    baseA blocksize=8K items=2580000 lastblock=20465 revision=258 levels=2 root=20303
    B-tree checked okay
    termlist table structure checked OK

    postlist:
    baseB blocksize=8K items=5610265 lastblock=54544 revision=259 levels=2 root=8817
    B-tree checked okay
    document id 2580001 in doclen stream is larger than get_last_docid() 2580000
    document id 2580001: length 14 doesn't match 0 in the termlist table
    document id 2580002 in doclen stream is larger than get_last_docid() 2580000
    document id 2580002: length 14 doesn't match 0 in the termlist table
    document id 2580003 in doclen stream is larger than get_last_docid() 2580000
    document id 2580003: length 14 doesn't match 0 in the termlist table
    document id 2580004 in doclen stream is larger than get_last_docid() 2580000
    document id 2580004: length 14 doesn't match 0 in the termlist table
    document id 2580005 in doclen stream is larger than get_last_docid() 2580000
    document id 2580005: length 14 doesn't match 0 in the termlist table
    ...
    postlist table errors found: 20000

    position:
    baseB blocksize=8K items=12949980 lastblock=35161 revision=259 levels=2 root=34883
    B-tree checked okay
    position table structure checked OK

Note that the record and termlist tables are at revision 258 (baseA),
while the postlist and position tables are at revision 259 (baseB).

This is different corruption than I have seen from ikiwiki, but it
certainly suggests that relying on the automatic flush at exit is not
safe if there is any chance that the script can be killed. (Or if the
machine unexpectedly loses power?)

-- 
see shy jo
use Search::Xapian::WritableDatabase;
use Search::Xapian;

my $stemmer=Search::Xapian::Stem->new("english");
my $db=Search::Xapian::WritableDatabase->new("foo",
	Search::Xapian::DB_CREATE_OR_OPEN);

for (1..10000) {
	my $pageterm="U:$_\n";
	my $doc=Search::Xapian::Document->new();
	$doc->set_data(
		"url=foo\n".
		"sample=foo($_)oobar($_)baz\n".
		"modtime=".localtime(time)."\n"
	);
	my $tg = Search::Xapian::TermGenerator->new();
	$tg->set_stemmer($stemmer);
	$tg->set_document($doc);
	$tg->index_text("$_ $_$_", 2);
	$tg->index_text("foo$_", 1, "XLINK");
	$doc->add_term($pageterm);
	$db->replace_document_by_term($pageterm, $doc);
}
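
For comparison, here is a minimal sketch of the same kind of loop that does not rely on the automatic flush at exit. It calls flush() explicitly after each batch and once more at the end. The batch size of 1000 and the INT/TERM handlers are illustrative assumptions, not something exercised in the run above. SIGKILL (kill -9) cannot be trapped, so this only narrows the window of unflushed changes; it does not show that a kill arriving in the middle of a flush is safe.

```perl
use strict;
use warnings;
use Search::Xapian;

my $db = Search::Xapian::WritableDatabase->new("foo",
    Search::Xapian::DB_CREATE_OR_OPEN);

# Hypothetical batch size, chosen only for illustration.
my $batch_size = 1000;

# INT and TERM can be trapped (KILL cannot): note the request and let
# the loop stop at a document boundary so the final flush still runs.
my $stop = 0;
$SIG{INT} = $SIG{TERM} = sub { $stop = 1 };

for my $n (1..10000) {
    last if $stop;

    my $pageterm = "U:$n";
    my $doc = Search::Xapian::Document->new();
    $doc->set_data("url=foo\n");
    $doc->add_term($pageterm);
    $db->replace_document_by_term($pageterm, $doc);

    # Flush explicitly after every batch instead of waiting for exit.
    $db->flush() if $n % $batch_size == 0;
}

# Final explicit flush for whatever is left in the last partial batch.
$db->flush();
```

Running this version under the same kill loop would be one way to check whether explicit flushes change how often xapian-check reports errors.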