Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-24 Thread est
On Feb 25, 5:46 am, 7stud <[EMAIL PROTECTED]> wrote:
> How does the class HTTPRefererProcessor do anything useful for you?

Well, it's more browser-like. Maybe I should have snipped the
HTTPRefererProcessor code for this discussion.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-24 Thread 7stud
On Feb 24, 4:41 am, est <[EMAIL PROTECTED]> wrote:
> Wow, thank you Rob Wolfe! Your reply is the shortest yet the most
> helpful! I solved this problem with the following code.
>
> class HTTPRefererProcessor(urllib2.BaseHandler):
>     """Add Referer header to requests.
>
>     This only makes sense if you use each RefererProcessor for a single
>     chain of requests only (so, for example, if you use a single
>     HTTPRefererProcessor to fetch a series of URLs extracted from a single
>     page, this will break).
>
>     There's a proper implementation of this in module mechanize.
>
>     """
>     def __init__(self):
>         self.referer = None
>
>     def http_request(self, request):
>         if ((self.referer is not None) and
>             not request.has_header("Referer")):
>             request.add_unredirected_header("Referer", self.referer)
>         return request
>
>     def http_response(self, request, response):
>         self.referer = response.geturl()
>         return response
>
>     https_request = http_request
>     https_response = http_response
>
> def main():
>     cj = CookieJar()
>     opener = urllib2.build_opener(
>         urllib2.HTTPCookieProcessor(cj),
>         HTTPRefererProcessor(),
>     )
>     urllib2.install_opener(opener)
>
>     urllib2.urlopen(url1)
>     urllib2.urlopen(url2)
>
> if "__main__" == __name__:
>     main()
>
> And it's working great!
>
> Once again, thanks everyone!

How does the class HTTPRefererProcessor do anything useful for you?


Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-24 Thread est
On Feb 23, 2:42 am, Rob Wolfe <[EMAIL PROTECTED]> wrote:
> Google for urllib2.HTTPCookieProcessor.
>
> HTH,
> Rob

Wow, thank you Rob Wolfe! Your reply is the shortest yet the most
helpful! I solved this problem with the following code.

import urllib2
from cookielib import CookieJar

class HTTPRefererProcessor(urllib2.BaseHandler):
    """Add Referer header to requests.

    This only makes sense if you use each RefererProcessor for a single
    chain of requests only (so, for example, if you use a single
    HTTPRefererProcessor to fetch a series of URLs extracted from a single
    page, this will break).

    There's a proper implementation of this in module mechanize.

    """
    def __init__(self):
        self.referer = None

    def http_request(self, request):
        if ((self.referer is not None) and
            not request.has_header("Referer")):
            request.add_unredirected_header("Referer", self.referer)
        return request

    def http_response(self, request, response):
        self.referer = response.geturl()
        return response

    https_request = http_request
    https_response = http_response

def main():
    cj = CookieJar()
    opener = urllib2.build_opener(
        urllib2.HTTPCookieProcessor(cj),
        HTTPRefererProcessor(),
    )
    urllib2.install_opener(opener)

    # url1 and url2 stand for the pages to fetch, in order
    urllib2.urlopen(url1)
    urllib2.urlopen(url2)

if "__main__" == __name__:
    main()

And it's working great!

Once again, thanks everyone!
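
For readers on Python 3, here is a minimal sketch of what the Referer
handler does, assuming the Python 3 module layout (urllib2 became
urllib.request). FakeResponse and the example.com URLs are hypothetical
stand-ins so the handler can be exercised without any network access;
they are not part of the original post.

```python
import urllib.request


class HTTPRefererProcessor(urllib.request.BaseHandler):
    """Send the URL of the previous response as the Referer header."""

    def __init__(self):
        self.referer = None

    def http_request(self, request):
        # Only add the header if we have seen a response and the caller
        # has not set a Referer explicitly.
        if self.referer is not None and not request.has_header("Referer"):
            request.add_unredirected_header("Referer", self.referer)
        return request

    def http_response(self, request, response):
        # Remember where we just were, for the next request.
        self.referer = response.geturl()
        return response

    https_request = http_request
    https_response = http_response


class FakeResponse:
    """Hypothetical stand-in for a response; only geturl() is needed."""

    def __init__(self, url):
        self._url = url

    def geturl(self):
        return self._url


processor = HTTPRefererProcessor()
first = processor.http_request(urllib.request.Request("http://example.com/a"))
processor.http_response(first, FakeResponse("http://example.com/a"))
second = processor.http_request(urllib.request.Request("http://example.com/b"))
```

The first request carries no Referer; the second carries the URL of the
first response, which is the "browser-like" part. In real use the
handler would be passed to urllib.request.build_opener() alongside
HTTPCookieProcessor, as in main() above.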


Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-24 Thread est
On Feb 23, 5:57 am, 7stud <[EMAIL PROTECTED]> wrote:
> Examine this code and its output:
>
> class SmartRequest(object):
>     def __init__(self, id):
>         if not getattr(SmartRequest, 'cj', None):
>             SmartRequest.cj = "I'm a cookie jar. Created by request: %s" % id

The getattr approach is exactly what I was looking for, thanks!


On Feb 23, 2:05 pm, 7stud <[EMAIL PROTECTED]> wrote:
> On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
>
> > class SmartRequest():
>
> You should always define a class like this:
>
> class SmartRequest(object):
>
> unless you know of a specific reason not to.

Thanks for the advice!


Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-23 Thread Steve Holden
7stud wrote:
> On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
>> class SmartRequest():
>>
> 
> You should always define a class like this:
> 
> class SmartRequest(object):
> 
> 
> unless you know of a specific reason not to.
> 
> 
It's much easier, though, just to put

__metaclass__ = type

at the start of any module where you want exclusively new-style objects.
And I do agree that you should use exclusively new-style objects unless
you have a good reason not to, though thanks to Guido's hard work it
mostly doesn't matter.

$ cat test94.py
__metaclass__ = type

class Rhubarb:
    pass

rhubarb = Rhubarb()

print type(Rhubarb)
print type(rhubarb)


$ python test94.py
<type 'type'>
<class '__main__.Rhubarb'>

regards
  Steve
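
A note for readers on Python 3, as a small sketch rather than part of
the original message: there the old-style/new-style distinction is
gone, so neither the explicit (object) base nor __metaclass__ = type
is needed. Custard is a hypothetical second class added only for
comparison.

```python
class Rhubarb:            # no explicit base class
    pass


class Custard(object):    # explicit base, equivalent in Python 3
    pass


rhubarb = Rhubarb()

# Both classes are instances of type and inherit from object.
same_metaclass = type(Rhubarb) is type and type(Custard) is type
bases = Rhubarb.__bases__
```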
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/



Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-22 Thread 7stud
On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
>
> class SmartRequest():
>

You should always define a class like this:

class SmartRequest(object):


unless you know of a specific reason not to.




Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-22 Thread 7stud
On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I need urllib2 to perform a series of HTTP requests with the cookies from
> the PREVIOUS request (like our browsers usually do).
>

Cookies from a previous request made in the currently running
program?  Or cookies from requests that were made when you previously
ran the program?

>
> from cookielib import CookieJar
> class SmartRequest():
>     cj=CookieJar()
>     def __init__(self, strUrl, strContent=None):
>         self.Request    =   urllib2.Request(strUrl, strContent)
>         self.cj.add_cookie_header(self.Request)
>         self.Response   =   urllib2.urlopen(Request)
>         self.cj.extract_cookies(self.Response, self.Request)
>     def url
>     def read(self, intCount):
>         return self.Response.read(intCount)
>     def headers(self, strHeaderName):
>         return self.Response.headers[strHeaderName]
>
> The code does not work because each time SmartRequest is instantiated,
> the object 'cj' is cleared. How can I avoid that?
> The only (stupid) solution I have figured out is to use a global CookieJar
> object. Is there any way to handle all this INSIDE the class?
>

Examine this code and its output:

class SmartRequest(object):
    def __init__(self, id):
        if not getattr(SmartRequest, 'cj', None):
            SmartRequest.cj = "I'm a cookie jar. Created by request: %s" % id


r1 = SmartRequest(1)
r2 = SmartRequest(2)

print r1.cj
print r2.cj

--output:--
I'm a cookie jar. Created by request: 1
I'm a cookie jar. Created by request: 1
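
The same idea with a real cookie jar might look like the sketch below,
assuming Python 3 (cookielib became http.cookiejar). Attributes
assigned in a class body are created once, when the class statement
runs, and are shared by all instances. One caveat if the getattr test
above is reused with a real CookieJar: an empty jar has length 0 and so
tests as false, which would replace the jar on every call until it
holds a cookie. Comparing with None avoids that. The URLs are
placeholders.

```python
from http.cookiejar import CookieJar


class SmartRequest:
    # Class attribute: created lazily, then shared by every instance.
    cj = None

    def __init__(self, url):
        # Compare with None rather than testing truthiness: an empty
        # CookieJar has length 0 and therefore evaluates as false.
        if SmartRequest.cj is None:
            SmartRequest.cj = CookieJar()
        self.url = url


r1 = SmartRequest("http://example.com/first")
r2 = SmartRequest("http://example.com/second")

shared = r1.cj is r2.cj              # both instances see the same jar
empty_jar_is_falsy = not CookieJar() # why "if not jar" is a trap
```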


Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-22 Thread 7stud
On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I need urllib2 to perform a series of HTTP requests with the cookies from
> the PREVIOUS request (like our browsers usually do). Many people suggest I
> use some library (e.g. pycURL) instead, but I guess it's good practice
> for a Python beginner to DIY something rather than use existing tools.
>
> So my problem is how to extend the urllib2 class
>
> from cookielib import CookieJar
> class SmartRequest():
>     cj=CookieJar()
>     def __init__(self, strUrl, strContent=None):
>         self.Request    =   urllib2.Request(strUrl, strContent)
>         self.cj.add_cookie_header(self.Request)
>         self.Response   =   urllib2.urlopen(Request)
>         self.cj.extract_cookies(self.Response, self.Request)
>     def url
>     def read(self, intCount):
>         return self.Response.read(intCount)
>     def headers(self, strHeaderName):
>         return self.Response.headers[strHeaderName]
>
> The code does not work because each time SmartRequest is instantiated,
> the object 'cj' is cleared.

That's because every time you create a SmartRequest, this line
executes:

cj=CookieJar()

That creates a new, *empty* cookie jar, i.e. it has no knowledge of
any previously set cookies.

> How to avoid that?

If you read the docs on the cookielib module, and in particular
CookieJar objects, you will notice that CookieJar objects are
described in a section that is titled:  CookieJar and FileCookieJar
Objects.

Hmm...I wonder what the difference is between a CookieJar object and a
FileCookieJar Object?

--
FileCookieJar implements the following additional methods:

save(filename=None, ignore_discard=False, ignore_expires=False)
Save cookies to a file.

load(filename=None, ignore_discard=False, ignore_expires=False)
Load cookies from a file.


That seems promising.
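
A minimal sketch of the save/load round trip, assuming Python 3
(http.cookiejar) and LWPCookieJar, one of the FileCookieJar subclasses
in the standard library. The cookie values, the example.com domain and
the helper names make_cookie and round_trip are made up for
illustration. The cookie is given an expiry time because session
cookies (discard=True) are skipped by save() unless ignore_discard=True
is passed.

```python
import os
import tempfile
import time
from http.cookiejar import Cookie, LWPCookieJar


def make_cookie(name, value):
    # Hypothetical persistent cookie that expires in one hour.
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False,
        domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=int(time.time()) + 3600, discard=False,
        comment=None, comment_url=None, rest={})


def round_trip():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cookies.txt")

        # "First run" of the program: collect a cookie and save it.
        first_run = LWPCookieJar(path)
        first_run.set_cookie(make_cookie("session", "abc123"))
        first_run.save()

        # "Second run": a fresh jar loads what the first one wrote.
        second_run = LWPCookieJar(path)
        second_run.load()
        return [(c.name, c.value) for c in second_run]


restored = round_trip()
```

This covers cookies that must survive between runs of the program. For
cookies that only need to survive between requests in one run, a plain
CookieJar that outlives the individual requests is enough.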







Re: n00b with urllib2: How to make it handle cookie automatically?

2008-02-22 Thread Rob Wolfe
est <[EMAIL PROTECTED]> writes:

> Hi all,
>
> I need urllib2 to perform a series of HTTP requests with the cookies from
> the PREVIOUS request (like our browsers usually do). Many people suggest I
> use some library (e.g. pycURL) instead, but I guess it's good practice
> for a Python beginner to DIY something rather than use existing tools.
>
> So my problem is how to extend the urllib2 class
>
> from cookielib import CookieJar
> class SmartRequest():
>     cj=CookieJar()
>     def __init__(self, strUrl, strContent=None):
>         self.Request    =   urllib2.Request(strUrl, strContent)
>         self.cj.add_cookie_header(self.Request)
>         self.Response   =   urllib2.urlopen(Request)
>         self.cj.extract_cookies(self.Response, self.Request)
>     def url
>     def read(self, intCount):
>         return self.Response.read(intCount)
>     def headers(self, strHeaderName):
>         return self.Response.headers[strHeaderName]
>
> The code does not work because each time SmartRequest is instantiated,
> the object 'cj' is cleared. How can I avoid that?
> The only (stupid) solution I have figured out is to use a global CookieJar
> object. Is there any way to handle all this INSIDE the class?
>
> I am totally new to OOP & Python programming, so could anyone give me
> some suggestions? Thanks in advance.

Google for urllib2.HTTPCookieProcessor.

HTH,
Rob
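
For anyone following that pointer on Python 3, a minimal sketch of the
idea, assuming the renamed modules (urllib2 became urllib.request,
cookielib became http.cookiejar). FakeHTTPHandler is a hypothetical
stand-in for a real server so the example runs without network access:
it records the Cookie header it receives and always answers with a
made-up Set-Cookie header. The example.com URLs are placeholders.

```python
import email.message
import io
import urllib.request
import urllib.response
from http.cookiejar import CookieJar


class FakeHTTPHandler(urllib.request.HTTPHandler):
    """Hypothetical stand-in for a server: answers locally, no network."""

    def __init__(self):
        super().__init__()
        self.seen_cookie_headers = []

    def http_open(self, req):
        # Record the Cookie header the opener attached to this request.
        self.seen_cookie_headers.append(req.get_header("Cookie"))
        headers = email.message.Message()
        headers["Set-Cookie"] = "session=abc123; Path=/"
        resp = urllib.response.addinfourl(
            io.BytesIO(b"ok"), headers, req.get_full_url(), code=200)
        resp.msg = "OK"  # the opener's error processor reads this attribute
        return resp


def demo():
    jar = CookieJar()
    fake = FakeHTTPHandler()
    # HTTPCookieProcessor stores cookies from each response in the jar
    # and adds matching cookies to each later request.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar), fake)
    opener.open("http://example.com/login").read()
    opener.open("http://example.com/profile").read()
    return fake.seen_cookie_headers, jar


seen, jar = demo()
```

The first request goes out without a Cookie header; the second one
carries the cookie set by the first response. With a real server, the
FakeHTTPHandler argument is simply left out of build_opener().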


n00b with urllib2: How to make it handle cookie automatically?

2008-02-21 Thread est
Hi all,

I need urllib2 to perform a series of HTTP requests with the cookies from
the PREVIOUS request (like our browsers usually do). Many people suggest I
use some library (e.g. pycURL) instead, but I guess it's good practice
for a Python beginner to DIY something rather than use existing tools.

So my problem is how to extend the urllib2 class

from cookielib import CookieJar
class SmartRequest():
    cj=CookieJar()
    def __init__(self, strUrl, strContent=None):
        self.Request    =   urllib2.Request(strUrl, strContent)
        self.cj.add_cookie_header(self.Request)
        self.Response   =   urllib2.urlopen(Request)
        self.cj.extract_cookies(self.Response, self.Request)
    def url
    def read(self, intCount):
        return self.Response.read(intCount)
    def headers(self, strHeaderName):
        return self.Response.headers[strHeaderName]

The code does not work because each time SmartRequest is instantiated,
the object 'cj' is cleared. How can I avoid that?
The only (stupid) solution I have figured out is to use a global CookieJar
object. Is there any way to handle all this INSIDE the class?

I am totally new to OOP & Python programming, so could anyone give me
some suggestions? Thanks in advance.
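
As a rough illustration of the two CookieJar calls used above, here is
a sketch in Python 3 (where cookielib became http.cookiejar and
urllib2 became urllib.request). It is not the code from the post:
FakeResponse is a hypothetical stand-in for a server response so the
example runs offline, and the cookie value and URLs are made up. The
point it shows is that one jar has to outlive the individual requests.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; only info() is used."""

    def __init__(self, set_cookie):
        self._headers = email.message.Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


jar = CookieJar()  # one jar, kept alive across requests

# After the first request: store whatever cookies the response set.
first = urllib.request.Request("http://example.com/login")
jar.extract_cookies(FakeResponse("session=abc123; Path=/"), first)

# Before the second request: attach the stored cookies.
second = urllib.request.Request("http://example.com/profile")
jar.add_cookie_header(second)
cookie_header = second.get_header("Cookie")
```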