D'oh! That's what I get for coding into the wee hours of the morning on too much coffee and not enough food. :-)
I've attached corrections to the addnodes.sh and dedupe.sh scripts.

It occurred to me later that if addnodes.sh is run periodically without
*really* eliminating duplicate noderefs using the same version, then the
seednodes.ref file is going to just grow and grow and grow.

I've solved the problem by a sort of "brute force" method for now. In the
case of duplicate noderefs using the same version, only one is saved, the
others are mindlessly tossed in the bit bucket.

I've also improved the code in a few other places, removing some unnecessary
stuff, simplifying and speeding it up a good bit.

Enjoy!

On 17-Feb-2004 Conrad Sabatier wrote:
> OK, I've finally come up with a deduping script. It's not perfect, in that
> I didn't know what to do with duplicate nodes with the same version, so I
> just left them alone, but it's a start. Hopefully, over time, it'll work
> out OK, the idea being that as later versions of nodes are added, the older
> dupes will drop out.
>
> For this reason, I've modified the addnodes.sh script to just add
> everything in noderefs.txt (after first deduping it) to seednodes.ref, and
> then deduping seednodes.ref. The problem with the earlier version of
> addnodes.sh was that if a node was already present in seednodes.ref, the
> one in noderefs.txt would never get added, even if it was newer.
>
> I've also cleaned up and "normalized" all the other scripts, cleaning up
> the headers and adding usage tips, as well as made them safer by having
> them backup files before modifying them.
>
> Hope you'll find them useful.
>
> Conrad (my God, it's nearly 5 a.m.!)
>
> --
> Conrad Sabatier <[EMAIL PROTECTED]> - "In Unix veritas"

--
Conrad Sabatier <[EMAIL PROTECTED]> - "In Unix veritas"

#!/bin/sh
#
# dedupe.sh
#
# Dedupe nodes in input file (seednodes.ref or noderefs.txt)
#
# Author: dolphin
#
# Usage: ./dedupe.sh inputfile
#
# Put this script in your main freenet dir and run from there
# (companion script addnodes.sh expects to find it in current dir)
#
# Algorithm:
#
# Step 1: backup input file! :-)
#
# Step 2: read noderef records one at a time from input file,
#         saving version info to a temp file for each unique IP
#
# Step 3: remove all but the highest version number in each IP's
#         temp file
#
# Step 4: reread the input file one record at a time as above,
#         saving only the first node whose version is found in
#         its IP's temp file, discarding all others, writing
#         output to a temporary nodes file
#
# Step 5: move temporary nodes file to original input file
#
# Not terribly elegant, but it gets the job done :-)
#
# One little bugaboo: if a node has only multiple
# references to a single version, all but one are deleted.
# Didn't know how else to handle this without more
# to go on. :-)

if [ $# -ne 1 ]
then
    echo "Usage: ./dedupe.sh infile"
    exit 1
fi

infile=$1
cp "${infile}" "${infile}.bak"    #save a copy just in case!
outfile=/tmp/$(basename "${infile}").$$
rm -f /tmp/node.* "${outfile}"

#
#function: read a single noderef from input, save to temp file
#
read_one_noderef()
{
    rm -f /tmp/noderef.$$
    while read line
    do
        echo "$line" >> /tmp/noderef.$$
        if [ "$line" = "End" ]
        then
            break
        fi
    done
}

# Collect node version info
i=0
while :
do
    read_one_noderef
    if [ ! -f /tmp/noderef.$$ ]
    then
        break    #end of file
    fi
    physical_tcp=$(awk '/^physical.tcp=/ {split($0, a, "[=:]"); print a[2]}' /tmp/noderef.$$)
    version=$(awk -F',' '/^version=/ {print $4}' /tmp/noderef.$$)
    echo "${version}" >> /tmp/node.${physical_tcp}
    i=$((i + 1))
done < "$infile"

echo "$i node references found in $infile"
echo "Deduping..."

#
# unique-ify version info (determine latest version of a node)
#
for f in /tmp/node.*
do
    sort -urn -o ${f}.tmp $f
    head -n 1 ${f}.tmp > $f
done

i=0
while :
do
    read_one_noderef
    if [ ! -f /tmp/noderef.$$ ]
    then
        break    #end of file
    fi
    physical_tcp=$(awk '/^physical.tcp=/ {split($0, a, "[=:]"); print a[2]}' /tmp/noderef.$$)
    version=$(awk -F',' '/^version=/ {print $4}' /tmp/noderef.$$)
    # -x matches the whole line, so e.g. 506 does not match 5065;
    # quoting keeps an empty version from turning the filename into the pattern
    if grep -qx -e "$version" /tmp/node.${physical_tcp}
    then
        cat /tmp/noderef.$$ >> "$outfile"
        # empty the IP's temp file so later copies of this node are discarded
        cat /dev/null > /tmp/node.${physical_tcp}
        i=$((i + 1))
    fi
done < "$infile"

mv "$outfile" "$infile"
echo "Saved $i unique nodes to $infile"
rm -f /tmp/node*

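For comparison, the rule dedupe.sh applies (one noderef per IP, highest
version preferred, first copy wins on a tie) could probably also be done in a
single awk pass with no temp files. This is only a rough sketch, not the
attached script: dedupe_sketch is a made-up name, and it assumes, as dedupe.sh
does, that every record ends with an "End" line, that the IP is taken from the
physical.tcp= line, and that the version number is the fourth comma-separated
field of the version= line.

```shell
#!/bin/sh
# dedupe_sketch: hypothetical helper, not part of the attached scripts.
# Reads noderef records on stdin and writes one record per IP on stdout,
# keeping the record with the highest version number (first one on a tie).
dedupe_sketch()
{
    awk '
        { rec = rec $0 "\n" }                                 # accumulate current record
        /^physical\.tcp=/ { split($0, a, "[=:]"); ip = a[2] } # IP without the port
        /^version=/       { split($0, v, ","); ver = v[4] + 0 }
        $0 == "End" {
            if (!(ip in best)) {
                order[++count] = ip                           # remember first-seen order
                best[ip] = ver
                keep[ip] = rec
            } else if (ver > best[ip]) {
                best[ip] = ver                                # newer version replaces older
                keep[ip] = rec
            }
            rec = ""; ip = ""; ver = 0
        }
        END { for (i = 1; i <= count; i++) printf "%s", keep[order[i]] }
    '
}
```

It would be used along the lines of "dedupe_sketch < seednodes.ref >
seednodes.ref.new". The trade-off is that it holds one record per IP in
memory, which should be fine for a seednodes file but is worth keeping in mind.
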
#!/bin/sh
#
# addnodes.sh
#
# Add nodes from current routing table to seednodes.ref
# Dedupes seednodes.ref after adding nodes
#
# Author: dolphin
#
# Usage: ./addnodes.sh
#
# Put this script in your main freenet dir and run from there
# (expects to find noderefs.txt, seednodes.ref and dedupe.sh
# in current dir)
#
# Requires: wget

cd $(realpath $(dirname $0))

cp noderefs.txt noderefs.txt.bak
echo "Fetching noderefs.txt..."
wget -q -O noderefs.txt http://localhost:8888/servlet/nodestatus/noderefs.txt
./dedupe.sh noderefs.txt
cat noderefs.txt >> seednodes.ref
./dedupe.sh seednodes.ref
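
One thing that might be worth guarding against when addnodes.sh is run
periodically: as far as I know, wget -O truncates its output file before
fetching, so if the node isn't answering, noderefs.txt ends up empty (the .bak
copy is still there). A possible guard, purely as a sketch: download to a
scratch file and only move it into place if it has something in it. The
install_if_nonempty name and the noderefs.txt.new scratch file are made up for
illustration and are not part of the attached scripts.

```shell
#!/bin/sh
# install_if_nonempty: hypothetical helper, not part of addnodes.sh.
# Moves a freshly downloaded file into place only if it has content;
# otherwise discards it, leaves the existing file untouched and returns 1.
install_if_nonempty()
{
    new=$1
    target=$2
    if [ -s "$new" ]
    then
        mv "$new" "$target"
    else
        rm -f "$new"
        return 1
    fi
}

# Possible use inside addnodes.sh (same URL as the script uses):
#   wget -q -O noderefs.txt.new http://localhost:8888/servlet/nodestatus/noderefs.txt
#   install_if_nonempty noderefs.txt.new noderefs.txt || exit 1
```

With that in place a failed fetch would stop the run before anything is
appended to seednodes.ref.
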