Re: Python Web Servers and Page Retrievers

2007-04-11 Thread Max Erickson
"Collin Stocks" <[EMAIL PROTECTED]> wrote:

> I tried it, and when I checked the request through a proxy, I saw
> that it didn't really work, at least in the versions that I have
> (urllib v1.17 and urllib2 v2.5). It just added that header onto the
> end, so the request carried two User-Agent headers, each with a
> different value. I might add that my script IS able to retrieve
> search pages from Google, whereas both urllibs get FORBIDDEN with
> the headers that they use.
> 

I don't know enough about either library to argue about it, but here
is what I get following the Dive Into Python example (but querying
Google for a search):

>>> import urllib2
>>> opener = urllib2.build_opener()
>>> request = urllib2.Request('http://www.google.com/search?q=tesla+battery')
>>> request.add_header('User-Agent', 'OpenAnything/1.0 +http://diveintopython.org/')
>>> data = opener.open(request).read()
>>> data
'tesla battery - Google Search<
[snip rest of results page]

This is with Python 2.5 on Windows.
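
One way to look into the duplicate-header concern without a proxy is to inspect the request object before it is sent. What follows is only a minimal sketch, and it assumes Python 3, where urllib2 became urllib.request, so it is not the exact Python 2.5 setup discussed in this thread. The helper name build_request is made up for illustration. No network access is needed, because the request is built but never opened.

```python
import urllib.request

USER_AGENT = "OpenAnything/1.0 +http://diveintopython.org/"


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: return a Request with an explicit User-Agent."""
    request = urllib.request.Request(url)
    request.add_header("User-Agent", user_agent)
    return request


request = build_request("http://www.google.com/search?q=tesla+battery")

# add_header() stores the key via str.capitalize(), so it appears as
# "User-agent"; compare case-insensitively to catch every variant.
user_agent_headers = [
    (name, value)
    for name, value in request.header_items()
    if name.lower() == "user-agent"
]
print(user_agent_headers)
```

If the list holds exactly one entry, the request object itself carries a single User-Agent value. As far as I can tell from the CPython source, the opener adds its own default User-agent only when the request does not already have one, but that is worth confirming against the version in use.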


max

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python Web Servers and Page Retrievers

2007-04-11 Thread Subscriber123

And yes, I do have two email addresses that I use for Python-List.


Re: Python Web Servers and Page Retrievers

2007-04-11 Thread Collin Stocks

I tried it, and when I checked the request through a proxy, I saw that it
didn't really work, at least in the versions that I have (urllib v1.17 and
urllib2 v2.5). It just added that header onto the end, so the request
carried two User-Agent headers, each with a different value. I might add
that my script IS able to retrieve search pages from Google, whereas both
urllibs get FORBIDDEN with the headers that they use.
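
A check like the proxy test described above can also be run against a throwaway local server that records the headers it receives. This is a rough sketch under assumptions: it uses Python 3 (urllib.request and http.server) rather than the urllib2 v2.5 mentioned here, and the names RecordingHandler and seen_user_agents are invented for the example. It only shows what one particular Python version sends, so it does not settle what Python 2.5 did.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

USER_AGENT = "OpenAnything/1.0 +http://diveintopython.org/"
seen_user_agents = []  # filled in by the handler below


class RecordingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that records every User-Agent header received."""

    def do_GET(self):
        # get_all() returns one list entry per occurrence of the header.
        seen_user_agents.extend(self.headers.get_all("User-Agent", []))
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), RecordingHandler)
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

try:
    # An empty ProxyHandler keeps system proxy settings out of the check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request("http://127.0.0.1:%d/" % server.server_port)
    request.add_header("User-Agent", USER_AGENT)
    response_body = opener.open(request, timeout=5).read()
finally:
    server.shutdown()
    server.server_close()

print(seen_user_agents)
```

Two entries in the printed list would reproduce the duplicate-header behavior; a single entry would mean the custom value replaced the library default.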

On 4/8/07, Max Erickson <[EMAIL PROTECTED]> wrote:


Subscriber123 <[EMAIL PROTECTED]> wrote:
> urllib, or urllib2 for advanced users. For example, you can
> easily set your own headers when retrieving and serving pages,
> such as the User-Agent header which you cannot set in either
> urllib or urllib2.

Sure you can. See:

http://www.diveintopython.org/http_web_services/user_agent.html

(though the behavior changed in Python 2.3 so that setting the
user agent works better)


max



Re: Python Web Servers and Page Retrievers

2007-04-08 Thread Max Erickson
Subscriber123 <[EMAIL PROTECTED]> wrote:
> urllib, or urllib2 for advanced users. For example, you can
> easily set your own headers when retrieving and serving pages,
> such as the User-Agent header which you cannot set in either
> urllib or urllib2. 

Sure you can. See:

http://www.diveintopython.org/http_web_services/user_agent.html

(though the behavior changed in Python 2.3 so that setting the
user agent works better)
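
Besides setting the header on each request, the default can be changed once on the opener. The sketch below assumes Python 3, where urllib2's build_opener lives in urllib.request; in Python 2 the equivalent attribute would be on a urllib2 opener. The user-agent string is the one from the Dive Into Python example, and nothing is fetched here.

```python
import urllib.request

USER_AGENT = "OpenAnything/1.0 +http://diveintopython.org/"

opener = urllib.request.build_opener()

# addheaders is a list of (name, value) pairs that the opener applies to
# requests which do not already carry that header. Replacing the list
# swaps out the library's default User-agent for every request made
# through this opener.
opener.addheaders = [("User-agent", USER_AGENT)]

default_headers = dict(opener.addheaders)
print(default_headers)
```

Requests opened with opener.open() would then go out with this value unless a request sets its own User-Agent.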


max

