Morten Bo Johansen <[EMAIL PROTECTED]> writes:

> Andrew M. Bishop <[EMAIL PROTECTED]> wrote:
> 
> AMB> Morten Bo Johansen <[EMAIL PROTECTED]> writes:
> 
> >> Andrew M. Bishop <[EMAIL PROTECTED]> wrote:
> 
> > AMB>> No it does not, and rightly so since doing it this way will not add a
> > AMB>> User-Agent field which imdb expects.  The wwwoffle program does not
> > AMB>> add in a User-Agent of its own.  The CensorHeader section of the
> > AMB>> configuration file only censors existing headers; it does not add
> > AMB>> headers of its own.
> >> 
> >> Speaking of your TODO-list, I suppose there would be no chance
> >> of adding a little tweak to wwwoffle that would let one send
> >> customized headers on a url-configurable basis?

> I have defined a lot of small functions in my bashrc where I
> make wwwoffle register outgoing requests for information that I
> want from various sources and which I initiate from the command
> line: in casu, just writing e.g. 'actor bodil kjer' on the
> command line to have wwwoffle place an outgoing request for
> that actor with IMDb.com, fetch the page and enter the browser
> to read it is obviously more convenient than having to open a
> browser, access the bookmark of a cached copy of IMDb's search
> page, find the text entry field on that page, enter text and
> press "submit", exit the browser, fetch the page, enter the
> browser again to read it.

In this case all you need is for the wwwoffle program to send a
User-Agent header with every request that it makes to the WWWOFFLE
server.  Since CensorHeader only replaces headers that are already
present, that header can then be rewritten by a CensorHeader option in
the configuration file so that imdb.com gets a User-Agent that it
allows.  You don't need a general purpose addition of headers to the
outgoing requests.

All you need to do is change the source code in wwwoffle.c.
Everywhere that it says something like

  write_formatted(socket,"GET %s HTTP/1.0\r\n"
                         "Pragma: wwwoffle\r\n"
                         "Accept: */*\r\n"
                         "\r\n",
                  refresh);

you need to add in one extra line so that it now reads

  write_formatted(socket,"GET %s HTTP/1.0\r\n"
                         "Pragma: wwwoffle\r\n"
                         "User-Agent: wwwoffle\r\n"
                         "Accept: */*\r\n"
                         "\r\n",
                  refresh);
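
If you want to check the header layout before rebuilding wwwoffle, a
small stand-alone sketch like the one below formats the same request
text into a buffer.  The function build_request() is a hypothetical
helper made up for this illustration; it is not part of wwwoffle.c and
it uses snprintf() in place of write_formatted().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of wwwoffle.c: formats the request text
   that the patched write_formatted() call would send, so the headers
   can be inspected without a running WWWOFFLE server. */
static int build_request(char *buf, size_t size, const char *url)
{
    int n;

    assert(buf != NULL && url != NULL);

    n = snprintf(buf, size,
                 "GET %s HTTP/1.0\r\n"
                 "Pragma: wwwoffle\r\n"
                 "User-Agent: wwwoffle\r\n"   /* the one added line */
                 "Accept: */*\r\n"
                 "\r\n",
                 url);

    /* snprintf() returns the length it wanted to write, so a value of
       size or more means the request was truncated: report failure. */
    if (n < 0 || (size_t)n >= size)
        return -1;

    return n;
}
```

Calling build_request(buf, sizeof buf, "http://www.imdb.com/") on a
256 byte buffer and printing buf should show the User-Agent line
between the Pragma and Accept lines, which is the header that the
CensorHeader entry can then rewrite.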

Then you need an entry in the CensorHeader section of the
configuration file:

CensorHeader
{
<*:/*imdb.com/*> User-Agent = Mozilla/5.0 (X11; Linux i686; U;)
}

I am not sure that imdb.com will accept the header shown above, but it
should be easy enough to find one that it will accept.

-- 
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop                             [EMAIL PROTECTED]
                                      http://www.gedanken.demon.co.uk/

WWWOFFLE users page:
        http://www.gedanken.demon.co.uk/wwwoffle/version-2.7/user.html
