Attached is an initial version of a small shell script that passes
URLs from a urlview index on to wwwoffle, appending a string to them
or replacing a substring in them depending on the host.
Please feel free to modify or enhance it.

It goes to work if you put

   COMMAND leanpage.sh %s

in your ~/.urlview file.

This will allow you to fetch "printer friendly" versions of
pages instead of their beefed-up counterparts, and as a bonus
you will usually also get the information you want on a single
page, as for instance with nytimes.com.

There is of course a possibility that you will get an occasional
broken URL if the string is appended to a page for which no
text-based version exists, such as an index page, an image or the
like. In those few instances you can just edit the URL after the
fact and refetch, and the script could easily be modified to deal
with this too.
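
One way such a modification could look is sketched below. The
function name looks_like_article and the lists of image suffixes and
index file names are assumptions for illustration, not part of the
attached script.

```shell
# Hypothetical helper, not part of leanpage.sh: succeed (return 0) only
# when the URL looks like an article page that is worth rewriting.
looks_like_article() {
    rest=${1#http://}                       # drop the scheme if present
    case "$rest" in
        */)                       return 1 ;;  # directory, i.e. an index page
        *.gif|*.jpg|*.jpeg|*.png) return 1 ;;  # image (assumed suffix list)
        */index.html|*/index.htm) return 1 ;;  # explicit index page
        */*)                      return 0 ;;  # host plus a path: rewrite it
        *)                        return 1 ;;  # bare host, nothing to rewrite
    esac
}
```

The script could call this before the case statement and hand the
URL to wwwoffle unchanged whenever the check fails.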

It would be nice if something like this could be implemented
directly in wwwoffle, so one wasn't restricted to urlview. What
do you think of such an idea?

Have a nice week-end.


Morten


-- 
"All you need in this life is ignorance and confidence, and then
 success is sure."                                  (Mark Twain)
#!/bin/sh

## leanpage.sh: gets lean, "printer friendly" versions of fluffy pages,
## where available, by way of urlview(1).

if [ -z "$(echo "$1" | grep 'www\.')" ]
then
    host=$(expr "$1" : '.*http://\([^/]*\)')  # http://foo
else
    host=$(expr "$1" : '.*www\.\([^/]*\)')    # www.foo and http://www.foo
fi

case "$host" in
    nytimes.com)
        appendix="=&pagewanted=print"
        url=$1$appendix
        ;;
    eb.dk)
        appendix="&TemplateID=207"
        url=$1$appendix
        ;;
    linuxjournal.com)
        url=$(echo "$1" | sed 's/article\.php/print.php/g')
        ;;
    newsfactor.com)
        url=$(echo "$1" | sed 's/story/printer/g')
        ;;
    *)
        wwwoffle "$1" >/dev/null 2>&1 &
        exit 0
        ;;
esac

wwwoffle "$url" >/dev/null 2>&1 &
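
As a possible variant of the host extraction at the top of the
script, the same result can be sketched with POSIX parameter
expansion alone, without calling grep or expr. The function name
extract_host is an assumption for illustration. Unlike the original,
it only strips a "www." that directly follows the scheme.

```shell
# Hypothetical alternative to the grep/expr test in leanpage.sh:
# print the host part of a URL with "http://" and a leading "www." removed.
extract_host() {
    rest=${1#http://}             # drop the scheme if present
    rest=${rest#www.}             # drop a leading "www." if present
    printf '%s\n' "${rest%%/*}"   # keep everything before the first slash
}
```

With this in place the case statement could switch on
"$(extract_host "$1")" instead of "$host".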
