Re: [Tutor] Read same instance twice

2008-10-27 Thread Kent Johnson
On Mon, Oct 27, 2008 at 4:03 PM, Øyvind <[EMAIL PROTECTED]> wrote:
> Hello.
>
> I am trying to gather some information from a webpage:
>
> side = urlopen("http://www.website.no")
> rawstr = r"""spy.target="_top">(.*?)$"""
> rawstr2 = r"""spy.target2="_top">(.*?)$"""
>
> compile_obj = re.compile(rawstr,  re.IGNORECASE| re.MULTILINE| re.VERBOSE
> | re.UNICODE)
> compile_obj2 = re.compile(rawstr2,  re.IGNORECASE| re.MULTILINE|
> re.VERBOSE | re.UNICODE)
>
> liste = self.compile_obj.findall(side.read())
>
> liste = self.compile_obj2.findall(side.read())
>
> It works like a dream getting the first info, but the second doesn't work.
> The instance is empty.

> What is the easiest way to pick up more information from the site without
> opening it more than once?

Just remember the data. It's like reading a file: once you have read to
the end, a second read() returns an empty string unless you re-open it.
But you can keep the (string) data from the first read and use it
however you want:
data = side.read()
liste = self.compile_obj.findall(data)
liste2 = self.compile_obj2.findall(data)
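
To see the effect without touching the network, here is a minimal sketch in Python 3 syntax. It assumes io.BytesIO as a stand-in for the object returned by urlopen(), and the HTML and the simplified patterns are made up for illustration:

```python
import io
import re

# Made-up page content; BytesIO plays the role of the urlopen() result.
html = (
    b'<a href="/a" target="_top">First</a>\n'
    b'<a href="/b" target2="_top">Second</a>\n'
)
side = io.BytesIO(html)

# Read once and keep the result as a string.
data = side.read().decode("utf-8")

# A second read() finds nothing left in the stream.
leftover = side.read()

# Simplified patterns, not the ones from the original question.
compile_obj = re.compile(r'target="_top">(.*?)</a>', re.IGNORECASE)
compile_obj2 = re.compile(r'target2="_top">(.*?)</a>', re.IGNORECASE)

# Both searches run against the remembered string.
liste = compile_obj.findall(data)
liste2 = compile_obj2.findall(data)
```

Here leftover is empty, which is what the second findall() in the question was given.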

Kent
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Read same instance twice

2008-10-27 Thread Brian C. Lane
Øyvind wrote:
> Hello.
> 
> I am trying to gather some information from a webpage:
> 
> side = urlopen("http://www.website.no")
> rawstr = r"""spy.target="_top">(.*?)$"""
> rawstr2 = r"""spy.target2="_top">(.*?)$"""
> 
> compile_obj = re.compile(rawstr,  re.IGNORECASE| re.MULTILINE| re.VERBOSE
> | re.UNICODE)
> compile_obj2 = re.compile(rawstr2,  re.IGNORECASE| re.MULTILINE|
> re.VERBOSE | re.UNICODE)
> 
> liste = self.compile_obj.findall(side.read())
> 
> liste = self.compile_obj2.findall(side.read())
> 
> It works like a dream getting the first info, but the second doesn't work.
> The instance is empty.
> 

That's because the first side.read() consumed the whole response and
passed it to the first regex, so the second side.read() has nothing left
to return.

Change to:

side = urlopen("http://www.website.no").read()

then:

liste = compile_obj.findall(side)
liste2 = compile_obj2.findall(side)

That reads the site's contents once, then you can do whatever you want
with it in your program. I'm not sure why you had the self. reference to
compile_obj, so mix to fit your circumstances :)
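
If the self. reference means the code lives inside a class, one possible arrangement is sketched below. The class name SpyScraper, the method name extract, and the sample text are hypothetical; the patterns are copied from the question, with re.VERBOSE and re.UNICODE left out for brevity:

```python
import re


class SpyScraper:
    """Hypothetical holder for the two compiled patterns."""

    def __init__(self):
        flags = re.IGNORECASE | re.MULTILINE
        self.compile_obj = re.compile(r'spy.target="_top">(.*?)$', flags)
        self.compile_obj2 = re.compile(r'spy.target2="_top">(.*?)$', flags)

    def extract(self, data):
        # data is the page text, already read once by the caller.
        liste = self.compile_obj.findall(data)
        liste2 = self.compile_obj2.findall(data)
        return liste, liste2


# Made-up text in place of the downloaded page.
sample = 'spy.target="_top">alpha\nspy.target2="_top">beta\n'
liste, liste2 = SpyScraper().extract(sample)
```

The caller would pass in the string from urlopen(...).read(), decoded to text on Python 3, so the page is still fetched only once.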

Brian
