Hrvoje Niksic wrote:

> Maybe the right thing would be for `--post-data' to only apply to the
> URL it precedes, as in:
>
>     wget --post-data=foo URL1 --post-data=bar URL2 URL3
>
<snip>
> But I'm not at all sure that it's even possible to do this and keep
> using getopt!

I'll start by saying that I don't know enough about getopt to comment on
whether Hrvoje's suggestion will work.
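
To make the intended semantics concrete, here is a rough sketch in Python
rather than wget's C. It scans the arguments by hand and says nothing about
whether getopt can do the same. The function name and the tuple layout are
made up for illustration.

```python
def bind_post_data(argv):
    """Attach each --post-data=VALUE to the single URL that follows it."""
    requests = []
    pending = None                      # body waiting for the next URL
    for arg in argv:
        if arg.startswith("--post-data="):
            pending = arg.split("=", 1)[1]
        else:
            # Anything that is not --post-data is treated as a URL here.
            method = "POST" if pending is not None else "GET"
            requests.append((method, arg, pending))
            pending = None              # consumed; later URLs fall back to GET
    return requests
```

With the arguments from Hrvoje's example, URL1 would be POSTed with "foo",
URL2 with "bar", and URL3 would be fetched with a plain GET.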

It's hard to imagine a situation where wget's current behavior (sending the
same --post-data to every URL on the command line) makes sense. I'm sure
someone can come up with an example, but it's likely to be an unusual case.
I see the ability to POST a form as being most useful when a site requires
some kind of form-based authentication before it will serve other pages
within the site.

Some alternatives that occur to me follow.

Alternative #1. Only apply --post-data to the first URL on the command line.
(A simple solution that probably covers the majority of cases.)


Alternative #2. Allow POST and GET as keywords in the URL list so that:

wget POST http://www.somesite.com/post.cgi --post-data 'a=1&b=2' \
     GET http://www.somesite.com/getme.html

would explicitly specify which URL uses POST and which uses GET. If more
than one POST is specified, all use the same --post-data.
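
A rough sketch of how such a keyword list might be interpreted, in Python
for brevity. This is not wget code. The function name is hypothetical, and I
am assuming that a keyword applies only to the next URL and that a URL with
no keyword defaults to GET.

```python
def parse_requests(argv):
    """Pair each URL with the POST/GET keyword that precedes it."""
    post_data = None
    method = "GET"                      # assumed default when no keyword given
    pending = []                        # (method, url) in command-line order
    args = iter(argv)
    for arg in args:
        if arg == "--post-data":
            post_data = next(args, None)
            if post_data is None:
                raise ValueError("--post-data requires a value")
        elif arg.startswith("--post-data="):
            post_data = arg.split("=", 1)[1]
        elif arg in ("POST", "GET"):
            method = arg
        else:
            pending.append((method, arg))
            method = "GET"              # assumption: keyword covers one URL only
    # Every POST shares the one --post-data value, wherever it appeared.
    return [(m, url, post_data if m == "POST" else None) for m, url in pending]
```

Because the data is attached only after the whole list has been read, the
position of --post-data relative to the POST keyword does not matter.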


Alternative #3. Look for <form> tags and have --post-file specify the data
to be submitted to each form, keyed by the form's action URL:

--form-action=URL1 'a=1&b=2'
--form-action=URL2 'foo=bar'
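
As a sketch only, lines in that format could be read into a lookup table
keyed by action URL. This is Python, not wget code. The function name is
made up, and the file format is just the two-line example above.

```python
import shlex

def load_form_data(post_file_text):
    """Map each form action URL to the data that should be POSTed to it."""
    prefix = "--form-action="
    table = {}
    for line in post_file_text.splitlines():
        tokens = shlex.split(line)      # honors the single quotes around data
        if not tokens:
            continue                    # skip blank lines
        if len(tokens) != 2 or not tokens[0].startswith(prefix):
            raise ValueError("unexpected line: " + line)
        table[tokens[0][len(prefix):]] = tokens[1]
    return table
```

When a retrieved page contains a <form> whose action matches a key in the
table, the corresponding data would be submitted to it.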


Alternative #4. Allow complex sessions to be defined using a "session" file
such as:

wget --session=somefile --user-agent='my robot'

Options specified on the command line apply to every URL. If somefile
contained:

--post-data 'data=foo' POST URL1
--post-data 'data=bar' POST URL2
--referer=URL3 GET URL4

it would be logically equivalent to the following three commands:

wget --user-agent='my robot' --post-data 'data=foo' POST URL1
wget --user-agent='my robot' --post-data 'data=bar' POST URL2
wget --user-agent='my robot' --referer=URL3 GET URL4

with wget's state maintained across the session.
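
The expansion itself is simple. Here is a sketch in Python (not wget code)
with a hypothetical function name, and it assumes the session file holds one
request per line using ordinary shell quoting.

```python
import shlex

def expand_session(global_opts, session_text):
    """Turn each session-file line into the equivalent wget argument list."""
    commands = []
    for line in session_text.splitlines():
        tokens = shlex.split(line)      # strips the quotes around 'data=foo'
        if tokens:                      # ignore blank lines
            # Command-line options come first and apply to every request.
            commands.append(["wget", *global_opts, *tokens])
    return commands
```

This covers only the option merging. Keeping state such as cookies from one
request to the next is the part that separate wget invocations would not
give you.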

Tony
