Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 25, 5:46 am, 7stud <[EMAIL PROTECTED]> wrote:
> How does the class HTTPRefererProcessor do anything useful for you?

Well, it's more browser-like. Maybe I should have snipped the
HTTPRefererProcessor code for this discussion.

-- http://mail.python.org/mailman/listinfo/python-list
Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 24, 4:41 am, est <[EMAIL PROTECTED]> wrote:
> Wow, thank you Rob Wolfe! Your reply is shortest yet most helpful! I
> solved this problem by the following code.
>
> class HTTPRefererProcessor(urllib2.BaseHandler):
>
> And it's working great!

How does the class HTTPRefererProcessor do anything useful for you?
Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 23, 2:42 am, Rob Wolfe <[EMAIL PROTECTED]> wrote:
> Google for urllib2.HTTPCookieProcessor.
>
> HTH,
> Rob

Wow, thank you Rob Wolfe! Your reply is the shortest yet the most helpful! I
solved this problem with the following code.

import urllib2
from cookielib import CookieJar

class HTTPRefererProcessor(urllib2.BaseHandler):
    """Add Referer header to requests.

    This only makes sense if you use each RefererProcessor for a single
    chain of requests only (so, for example, if you use a single
    HTTPRefererProcessor to fetch a series of URLs extracted from a single
    page, this will break).

    There's a proper implementation of this in module mechanize.

    """
    def __init__(self):
        self.referer = None

    def http_request(self, request):
        # Send the previous response's URL as Referer unless one is already set.
        if ((self.referer is not None) and
            not request.has_header("Referer")):
            request.add_unredirected_header("Referer", self.referer)
        return request

    def http_response(self, request, response):
        # Remember where we ended up, for the next request in the chain.
        self.referer = response.geturl()
        return response

    https_request = http_request
    https_response = http_response

def main():
    cj = CookieJar()
    opener = urllib2.build_opener(
        urllib2.HTTPCookieProcessor(cj),
        HTTPRefererProcessor(),
        )
    urllib2.install_opener(opener)

    # url1 and url2 stand for the pages to fetch, in order.
    urllib2.urlopen(url1)
    urllib2.urlopen(url2)

if "__main__" == __name__:
    main()

And it's working great!

Once again, thanks everyone!
Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 23, 5:57 am, 7stud <[EMAIL PROTECTED]> wrote:
> Examine this code and its output:
>
> class SmartRequest(object):
>     def __init__(self, id):
>         if not getattr(SmartRequest, 'cj', None):
>             SmartRequest.cj = "I'm a cookie jar. Created by request: %s" % id

The getattr method is exactly what I am looking for, thanks!

On Feb 23, 2:05 pm, 7stud <[EMAIL PROTECTED]> wrote:
> On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
> > class SmartRequest():
>
> You should always define a class like this:
>
> class SmartRequest(object):
>
> unless you know of a specific reason not to.

Thanks for the advice!
Re: n00b with urllib2: How to make it handle cookie automatically?
7stud wrote:
> On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
>> class SmartRequest():
>
> You should always define a class like this:
>
> class SmartRequest(object):
>
> unless you know of a specific reason not to.

It's much easier, though, just to put

__metaclass__ = type

at the start of any module where you want exclusively new-style
objects. And I do agree that you should use exclusively new-style
objects unless you have a good reason not to, though thanks to Guido's
hard work it mostly doesn't matter.

$ cat test94.py
__metaclass__ = type

class Rhubarb:
    pass

rhubarb = Rhubarb()

print type(Rhubarb)
print type(rhubarb)

$ python test94.py
<type 'type'>
<class '__main__.Rhubarb'>

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/
Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
>
> class SmartRequest():
>

You should always define a class like this:

class SmartRequest(object):

unless you know of a specific reason not to.
Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I need urllib2 do perform series of HTTP requests with cookie from
> PREVIOUS request(like our browsers usually do ).
>

Cookies from a previous request made in the currently running
program? Or cookies from requests that were made when you previously
ran the program?

>
> from cookielib import CookieJar
> class SmartRequest():
>     cj=CookieJar()
>     def __init__(self, strUrl, strContent=None):
>         self.Request = urllib2.Request(strUrl, strContent)
>         self.cj.add_cookie_header(self.Request)
>         self.Response = urllib2.urlopen(Request)
>         self.cj.extract_cookies(self.Response, self.Request)
>     def url
>     def read(self, intCount):
>         return self.Response.read(intCount)
>     def headers(self, strHeaderName):
>         return self.Response.headers[strHeaderName]
>
> The code does not work because each time SmartRequest is initiated,
> object 'cj' is cleared. How to avoid that?
> The only stupid solution I figured out is use a global CookieJar
> object. Is there anyway that could handle all this INSIDE the class?
>

Examine this code and its output:

class SmartRequest(object):
    def __init__(self, id):
        if not getattr(SmartRequest, 'cj', None):
            SmartRequest.cj = "I'm a cookie jar. Created by request: %s" % id

r1 = SmartRequest(1)
r2 = SmartRequest(2)

print r1.cj
print r2.cj

--output:--
I'm a cookie jar. Created by request: 1
I'm a cookie jar. Created by request: 1
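The getattr example creates the shared attribute lazily, on first use. As a rough sketch of the same idea, an attribute assigned directly in the class body is also evaluated only once, when the class is defined, so every instance sees the same object. The code below is written for Python 3 (an assumption; the thread uses Python 2), and Jar is a hypothetical stand-in for a real cookie jar, not part of any library.

```python
class Jar:
    """Hypothetical stand-in for a cookie jar; it only records names."""

    def __init__(self):
        self.items = []


class SmartRequest:
    # Evaluated once, when the class body runs, not once per instance.
    jar = Jar()

    def __init__(self, name):
        # self.jar resolves to the class attribute, so this mutates
        # the one jar shared by all instances.
        self.jar.items.append(name)


r1 = SmartRequest("first")
r2 = SmartRequest("second")

print(r1.jar is r2.jar)        # both names refer to the same Jar object
print(SmartRequest.jar.items)  # entries added by both instances
```

Rebinding the attribute on an instance (self.jar = Jar()) would hide the shared one for that instance only, which is one way a jar can appear to be "cleared".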
Re: n00b with urllib2: How to make it handle cookie automatically?
On Feb 21, 11:50 pm, est <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I need urllib2 do perform series of HTTP requests with cookie from
> PREVIOUS request(like our browsers usually do ). Many people suggest I
> use some library(e.g. pycURL) instead but I guess it's good practise
> for a python beginner to DIY something rather than use existing tools.
>
> So my problem is how to expand the urllib2 class
>
> from cookielib import CookieJar
> class SmartRequest():
>     cj=CookieJar()
>     def __init__(self, strUrl, strContent=None):
>         self.Request = urllib2.Request(strUrl, strContent)
>         self.cj.add_cookie_header(self.Request)
>         self.Response = urllib2.urlopen(Request)
>         self.cj.extract_cookies(self.Response, self.Request)
>     def url
>     def read(self, intCount):
>         return self.Response.read(intCount)
>     def headers(self, strHeaderName):
>         return self.Response.headers[strHeaderName]
>
> The code does not work because each time SmartRequest is initiated,
> object 'cj' is cleared.

That's because every time you run the program, this line executes:

    cj=CookieJar()

That creates a new, *empty* cookie jar, i.e. it has no knowledge of
any previously set cookies. (The line sits in the class body, so it
runs once when the class is defined rather than once per SmartRequest;
within a single run, all instances share the same jar.)

> How to avoid that?

If you read the docs on the cookielib module, and in particular
CookieJar objects, you will notice that CookieJar objects are
described in a section that is titled:

    CookieJar and FileCookieJar Objects

Hmm...I wonder what the difference is between a CookieJar object and a
FileCookieJar object?

--
FileCookieJar implements the following additional methods:

save(filename=None, ignore_discard=False, ignore_expires=False)
    Save cookies to a file.

load(filename=None, ignore_discard=False, ignore_expires=False)
    Load cookies from a file.

That seems promising.
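A minimal sketch of the save/load round trip described above, assuming Python 3, where cookielib is named http.cookiejar. It uses LWPCookieJar, one of the FileCookieJar subclasses in the standard library. The cookie is built by hand with made-up values (make_cookie and roundtrip are hypothetical helper names), so no network access is involved.

```python
import os
import tempfile
import time
from http.cookiejar import Cookie, LWPCookieJar


def make_cookie(name, value, domain="example.com"):
    """Build a persistent cookie by hand (illustrative values only)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False,
        expires=int(time.time()) + 3600,  # one hour from now
        discard=False,  # not a session cookie, so save() keeps it by default
        comment=None, comment_url=None, rest={},
    )


def roundtrip(filename):
    """Save a jar to filename, load it into a fresh jar, return name -> value."""
    jar = LWPCookieJar(filename)
    jar.set_cookie(make_cookie("session", "abc123"))
    jar.save()

    restored = LWPCookieJar(filename)
    restored.load()
    return {cookie.name: cookie.value for cookie in restored}


path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
print(roundtrip(path))
```

Session cookies (those without an expiry) are skipped by save() unless ignore_discard=True is passed, which matters when the server only sets session cookies.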
Re: n00b with urllib2: How to make it handle cookie automatically?
est <[EMAIL PROTECTED]> writes:

> Hi all,
>
> I need urllib2 do perform series of HTTP requests with cookie from
> PREVIOUS request(like our browsers usually do ). Many people suggest I
> use some library(e.g. pycURL) instead but I guess it's good practise
> for a python beginner to DIY something rather than use existing tools.
>
> So my problem is how to expand the urllib2 class
>
> from cookielib import CookieJar
> class SmartRequest():
>     cj=CookieJar()
>     def __init__(self, strUrl, strContent=None):
>         self.Request = urllib2.Request(strUrl, strContent)
>         self.cj.add_cookie_header(self.Request)
>         self.Response = urllib2.urlopen(Request)
>         self.cj.extract_cookies(self.Response, self.Request)
>     def url
>     def read(self, intCount):
>         return self.Response.read(intCount)
>     def headers(self, strHeaderName):
>         return self.Response.headers[strHeaderName]
>
> The code does not work because each time SmartRequest is initiated,
> object 'cj' is cleared. How to avoid that?
> The only stupid solution I figured out is use a global CookieJar
> object. Is there anyway that could handle all this INSIDE the class?
>
> I am totally new to OOP & python programming, so could anyone give me
> some suggestions? Thanks in advance

Google for urllib2.HTTPCookieProcessor.

HTH,
Rob
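For readers who want to see what that pointer leads to, here is a small sketch of HTTPCookieProcessor in use. It assumes Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar. To stay self-contained it starts a throwaway local server; EchoCookieHandler and fetch_twice are hypothetical names invented for this example, as is the session=abc123 cookie.

```python
import threading
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import HTTPCookieProcessor, ProxyHandler, build_opener


class EchoCookieHandler(BaseHTTPRequestHandler):
    """Test server: sets a cookie and echoes the Cookie header it received."""

    def do_GET(self):
        received = self.headers.get("Cookie", "")
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(received.encode("ascii"))

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_twice():
    """Return the Cookie header the server saw on the first and second request."""
    server = HTTPServer(("127.0.0.1", 0), EchoCookieHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]

    jar = CookieJar()
    opener = build_opener(
        ProxyHandler({}),          # bypass any system proxy for the loopback server
        HTTPCookieProcessor(jar),  # stores Set-Cookie and replays it automatically
    )
    try:
        first = opener.open(url).read().decode("ascii")
        second = opener.open(url).read().decode("ascii")
    finally:
        server.shutdown()
        server.server_close()
    return first, second
```

The first request carries no cookie; the second carries the one the server set, without any manual add_cookie_header or extract_cookies calls. The jar lives as long as the opener does, so reusing one opener for the whole series of requests is enough.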
n00b with urllib2: How to make it handle cookie automatically?
Hi all,

I need urllib2 to perform a series of HTTP requests with the cookies from
the PREVIOUS request (like our browsers usually do). Many people suggest I
use a library (e.g. pycURL) instead, but I guess it's good practice
for a Python beginner to DIY something rather than use existing tools.

So my problem is how to expand the urllib2 class:

from cookielib import CookieJar
class SmartRequest():
    cj=CookieJar()
    def __init__(self, strUrl, strContent=None):
        self.Request = urllib2.Request(strUrl, strContent)
        self.cj.add_cookie_header(self.Request)
        self.Response = urllib2.urlopen(Request)
        self.cj.extract_cookies(self.Response, self.Request)
    def url
    def read(self, intCount):
        return self.Response.read(intCount)
    def headers(self, strHeaderName):
        return self.Response.headers[strHeaderName]

The code does not work because each time SmartRequest is instantiated,
the object 'cj' is cleared. How can I avoid that?
The only stupid solution I have figured out is to use a global CookieJar
object. Is there any way to handle all this INSIDE the class?

I am totally new to OOP and Python programming, so could anyone give me
some suggestions? Thanks in advance.