On Fri, Jun 06, 2003 at 06:07:30PM -0400, Duke, Brian wrote:
>    page looks like wikiurl?PHPSESSIONID=... Is the session id causing
>    htdig to think that every url it gets is a new one?

The site's using sessions, eh? You'll need the cookie patch. Check out the
sidebar on: http://xtrinsic.com/geek/articles/language.phtml
(copied and pasted for your reading pleasure -- if the formatting seems
wrong, it probably is...check out the original page).
Dealing With Cookies

I had to install the cookies patch on v 3.1.6 to keep htdig from crawling
out of control. Here are some notes on installing that patch. To be
honest, I had my sys admin do this for me, although he was unable to get
it to work and I ended up copying the binary from another installation
that was built from source instead of from the debian package. I have NOT
tested either of these two methods myself; however, I believe they make
sense and should work. :)

    * The debian package does NOT seem to get along with this patch for
     Woody + stable. It was completely fine on my laptop though (Woody +
     unstable). YMMV.
    * The correctly patched htdig binary file has the following md5 hash:
     9237f812f9e30d6f482cffd0656304ed htdig
      (check by typing md5sum htdig)
    * First download the cookies patch (right click and Save This Link
     As...).
    * Note: this is a COMPRESSED patch. You need to gunzip it before you
     can use it, e.g. zcat cookies.0.gz > cookies.0, and then apply it with
     patch -p0 < cookies.0 [Thanks Gilles with a very old browser.]

Once you've installed the patch, don't forget to add the following to your
configuration file:

    * url_rewrite_rules: (.*)\\?SESSION=.* \\1 \
      (.*)\\&SESSION=.* \\1
    * disable_cookies: false
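
To see what those two rewrite rules do to a URL, here is a rough sketch
that mimics them with sed. It is only an illustration, not how htdig
applies the rules internally. The strip_session name and the example.com
URLs are made up, and SESSION presumably has to be changed to whatever
parameter name your site actually uses.

```shell
# strip_session URL: imitate the two url_rewrite_rules above with sed.
# (Hypothetical helper for illustration; htdig does the rewrite itself.)
strip_session() {
  printf '%s\n' "$1" |
    sed -E -e 's/(.*)\?SESSION=.*/\1/' \
           -e 's/(.*)&SESSION=.*/\1/'
}

# Session id as the only parameter: everything from the ? on is dropped.
strip_session 'http://example.com/wiki?SESSION=abc123'

# Session id after another parameter: everything from &SESSION on is dropped.
strip_session 'http://example.com/wiki?page=Home&SESSION=abc123'

# No session id: the URL passes through unchanged.
strip_session 'http://example.com/wiki?page=Home'
```

Note that both patterns end in .* so anything that follows the session
parameter in the URL is cut off along with it.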

How to install a patch

The following instructions are from Neil Kohl. Thanks Neil! :)

    * To apply the patch you need the htdig source code on your machine
     and you must have permission to modify and create files in the
     source directory.
    * Download and uncompress the patch, then copy it to the top of the
     htdig source tree. Then run the patch command from the top of the
     htdig source tree. patch applies the patches contained in the file
     to the appropriate source code files and is usually run like this:
      patch -b < patch-file-name
    * The -b flag creates backups of any files it touches with the
     extension .orig.
    * Note: if you don't use the -pnum flag (mentioned below) you'll be
     prompted to enter file names. See man patch for more information
     and look for the -pnum flag.
    * After you've run patch, you have to recompile and install the htdig
     binaries (run configure if you've never compiled this source before,
     then make all and make install as per the documentation).
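
If you have never used patch before, the steps above can be tried on a
scratch file first. This is just a sketch: demo.txt and fix.patch are
made-up names, nothing here touches a real htdig tree, and --dry-run
assumes GNU patch.

```shell
# Scratch demo only: work in a throwaway directory.
work=$(mktemp -d)
cd "$work" || exit 1

printf 'cookies disabled\n' > demo.txt
printf 'cookies enabled\n'  > demo.txt.new

# diff exits with status 1 when the files differ, which is expected here.
diff -u demo.txt demo.txt.new > fix.patch || true
rm demo.txt.new

# Preview first (GNU patch), then apply for real, keeping a backup.
patch --dry-run -p0 < fix.patch
patch -b -p0 < fix.patch

# demo.txt now holds the new text; demo.txt.orig is the backup made by -b.
cat demo.txt
cat demo.txt.orig
```

Running the preview first is a cheap way to find out whether a patch will
apply cleanly before any source file is changed.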

Patching on debian

Nate gives the following debian-specific instructions.

    * download the debianized source of htdig (the debian htdig package)
    * if you have source uris in /etc/apt/sources.list
      apt-get source htdig
    * if you don't have source uris, download the source and the debian
     patch by hand, extract them, cd to the htdig directory and apply the
     debian patch:
      patch -p1 <../filename.diff
    * then patch the source with your patch:
      patch -p1 <../filename.diff
    * you may need to reverse the order of the patches if your patch fails
     (a patch can fail if another patch has changed the source so much
     that it gets lost and can't find what to patch). Assuming it patches
     correctly, and you have a devel environment:
      apt-get build-dep htdig
      apt-get install fakeroot
      fakeroot ./debian/rules binary
    * (if you downloaded the debian patch manually you'll probably have to
     chmod +x debian/rules first)
    * if it all works out you should have one or more .deb's built at the
     end.
    * ask the admin to install them, but they also should mark the
     packages as held, otherwise the package system may automatically
     overwrite them:
      dpkg --get-selections >selections
      (edit selections, find htdig, change install to hold)
      dpkg --set-selections <selections
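
The hand edit of the selections file can also be done with a small
filter. This is a sketch only: hold_package is a made-up helper name, and
it assumes the usual "package, whitespace, state" layout that
dpkg --get-selections prints.

```shell
# hold_package NAME: read dpkg --get-selections output on stdin and
# change NAME's state from "install" to "hold"; other lines pass through.
# (Hypothetical helper for illustration.)
hold_package() {
  awk -v pkg="$1" '
    BEGIN { OFS = "\t" }
    $1 == pkg && $2 == "install" { $2 = "hold" }
    { print }
  '
}

# Try it on sample input rather than the live package database.
printf 'htdig\t\t\t\tinstall\nhtdig-doc\t\t\tinstall\n' | hold_package htdig
```

On a real system the admin would presumably pipe it the other way round
as root: dpkg --get-selections | hold_package htdig | dpkg --set-selections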



-- 
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
FAQ: http://htdig.sourceforge.net/FAQ.html