Could someone help me with this, please? Thanks.
----- Forwarded message from [EMAIL PROTECTED] -----
Date: Sat, 24 Nov 2001 03:56:35 +0100 (CET)
From: [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
Subject: [htdig] >>> Databases, rundig & updatedig
To: HtDig General Mailing list <[EMAIL PROTECTED]>

Hello,

I installed 3.2.0b4 and there are a few things I can't understand about the htdig databases. BTW, I use the standard htdig.conf without any special parameters for the databases.

1. What is db.words.db_weakcmpr used for? I used the sample rundig script from 3.2.0b4, and it has no "move" step for this database. When I ran the script, everything seemed to work, but the engine didn't find anything during searches. :(( I had some suspicions, so I renamed db.words.db.work_weakcmpr to db.words.db_weakcmpr and... automagically everything really worked :))) Could someone please fix that script? :))

2. Which databases do I really need for updating? After running rundig I had the following databases:

total 45M
-rw-r--r-- 1 root  11M Nov 22 11:19 db.words.db
-rw-r--r-- 1 root  11M Nov 20 18:21 db.words.db.work
-rw-r--r-- 1 root  10M Nov 22 11:19 db.excerpts
-rw-r--r-- 1 root  10M Nov 20 18:21 db.excerpts.work
-rw-r--r-- 1 root 712k Nov 22 11:19 db.docdb
-rw-r--r-- 1 root 712k Nov 20 18:21 db.docdb.work
-rw-r--r-- 1 root 320k Nov 22 11:19 db.docs.index
-rw-r--r-- 1 root 320k Nov 20 18:21 db.docs.index.work
-rw-r--r-- 1 root  16k Nov 22 11:16 db.words.db.work_weakcmpr
-rw-r--r-- 1 root  16k Nov 22 11:42 db.words.db_weakcmpr

I used the sample updatedig script from 3.2.0b4, and after the dig phase (htdig -a -t -vv -s -c htdig.conf) I got 2 more databases:

-rw-r--r-- 1 root  47M Nov 22 12:59 db.worddump (47 MB!!!)
-rw-r--r-- 1 root 7.1M Nov 22 12:59 db.docs

The first one is an ASCII file and the second one is a data file (as reported by the "file" command :)) I tried to work out why I had a 47 MB file, and I noticed the -t flag passed to htdig.
But the htdig site mentions a db.wordlist database (or ASCII file) and says nothing about the ones I got :(( BTW, the updatedig script has a move command that moves db.wordlist.old to db.wordlist, but I never found these files. So I'd like more information about them. Do I really need a 47 MB file? Or have I misunderstood, and I only need the -a and -c flags to update my databases?

3. I modified the updatedig script to produce a report of every update. The first report has a lot of "not changed" entries and a few "changed" ones (...and "pushing"). The second and all following reports were totally different. They looked like the rundig report, with no more changed/not changed, just...

1616:1292:4:http://www.unina.it/universit/amministrazione/personale/mobilita.html: title: UniNa_Amminstrazione size = 6811
1617:877:4:http://www.unina.it/universit/amministrazione/statistiche/dal97/medicina.html: title: UniNa_Amminstrazione size = 44153

...and...

1622:1928:4:http://www.unina.it/universit/didattica/economia/PERFeco_sotto.html: title: UniNa_Ateneo
pushing http://www.unina.it/universit/didattica/economia/PERFeco_laterale.html +
pushing http://www.unina.it/universit/didattica/economia/PERFeco_centrale.html +
size = 871

Could someone explain why? Why don't I simply get changed (--> pushing) / not changed in my report?

4. Do I need a purge phase between digging and merging in the updatedig script? In my update report I got a lot of "Not found: http://www.unina.it/universit/....... Ref: http://www.unina.it/universit/....". Do I need to purge all these references?

5. I scheduled rundig to run once a month and updatedig to run once a day. I ran the first rundig manually, with the database directory completely empty. When cron runs rundig again, it will reindex from scratch, but do I need to remove the existing databases beforehand? Or will htdig delete them for me and then rebuild them?
With my schedule, there will be a day on which rundig and updatedig both run (one after the other). Do I really need to run updatedig right after rundig, or can I skip it on that day?

Thank you,
Pietro Palladino

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

----- End of forwarded message -----

