Re: [Tutor] sort list alphabetically

2005-11-24 Thread lmac
Ok, that's the point. I think I meant case-sensitive. There are
some ways described here that will help me out.
Yes, the list is sorted when I print it out.
It was my fault, sorry guys.

Thanks a lot.

mac
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] sort list alphabetically

2005-11-23 Thread lmac
Hello,

I have a list of the dirs/files in the current path. When I use sort()
to sort the list alphabetically, the list still looks unsorted. How do
I use it?

dirs_files = os.listdir(os.getcwd())
print dirs_files
dirs_files.sort()
print dirs_files
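For reference: sort() does work here. Python's default string sort is case-sensitive, so all uppercase names come before lowercase ones and the result can look unsorted. A small sketch with invented filenames, in modern Python:

```python
names = ["Zebra.txt", "apple.py", "Mango.md"]

names.sort()                 # default sort is case-sensitive:
print(names)                 # uppercase initials come first

names.sort(key=str.lower)    # compare lowercased copies instead
print(names)                 # now in dictionary order
```

The same idea applies to the os.listdir() result: dirs_files.sort(key=str.lower).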

Thank you.


[Tutor] again... regular expression

2005-11-21 Thread lmac
Ok, there is an error I made: the links in the HTML page start with
good.php, so there was no way to ever find a link.

re_site = re.compile(r"good\.php.+'")
for a in file:
    z = re_site.search(a)
    if z != None:
        print z.group(0)


This gives me every line containing "good.php", but the match does not
stop at the first ' at the end; further along there are more tags and
text that end with ' too. So how can I tell the regex to stop at the
first ' found after good.php?

Thank you.
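For reference: the usual fix is either a non-greedy quantifier (.+?) or a negated character class ([^']*), both of which stop at the first '. A sketch on an invented sample line:

```python
import re

# invented sample line containing several single quotes
line = "href='good.php?id=1&x=2' class='link'"

greedy = re.compile(r"good\.php.+'")    # .+ is greedy: runs to the LAST '
first  = re.compile(r"good\.php[^']*")  # [^']* cannot cross a quote, so it
                                        # stops just before the FIRST '

print(greedy.search(line).group(0))  # good.php?id=1&x=2' class='link'
print(first.search(line).group(0))   # good.php?id=1&x=2
```

The non-greedy form r"good\.php.+?'" behaves like the second pattern but includes the closing ' in the match.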


> Hallo.
> I want to parse a website for links of this type:
> 
http://www.example.com/good.php?test=anything&egal=total&nochmal=nummer&so=site&seite=22">
> 
> -
> re_site = re.compile(r'http://\w+.\w+.\w+./good.php?.+">')
> for a in file:
>   z = re_site.search(a)
>   if z != None:
>   print z.group(0)
> 
> -
> 
> I still don't understand RE-Expressions. I tried some other expressions
>  but didn't get it work.
> 
> The End of the link is ">. So it should not be a problem to extract the
> link but it is.
> 
> Thank you for the help.
> 
> mac
> 



[Tutor] again... regular expression

2005-11-21 Thread lmac
Hello.
I want to parse a website for links of this type:

http://www.example.com/good.php?test=anything&egal=total&nochmal=nummer&so=site&seite=22">

-
re_site = re.compile(r'http://\w+.\w+.\w+./good.php?.+">')
for a in file:
    z = re_site.search(a)
    if z != None:
        print z.group(0)

-

I still don't understand regular expressions. I tried some other
expressions but didn't get them to work.

The end of the link is ">, so it should not be a problem to extract the
link, but it is.

Thank you for the help.

mac
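For reference: in the pattern above, the unescaped . and ? are regex metacharacters, which is part of why it misbehaves. A sketch of one way to match such a link, on an invented sample line:

```python
import re

# invented sample: the kind of link described above
line = 'href="http://www.example.com/good.php?test=anything&seite=22">'

# escape the literal dots and the '?', and stop at the closing ">
pat = re.compile(r'http://[\w.]+/good\.php\?[^"]+(?=">)')

m = pat.search(line)
print(m.group(0))  # http://www.example.com/good.php?test=anything&seite=22
```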



Re: [Tutor] code improvement for beginner ?

2005-10-10 Thread lmac
Danny Yoo wrote:
> 
> On Sat, 8 Oct 2005, lmac wrote:
> 
> 
>>Ok. Here we go. Wanted to start my page long ago. Now is the right time.
>>
>>http://daderoid.freewebspace24.de/python/python1.html
> 
> 
> Hi lmac,
> 
> I'll pick out some stuff that I see; I'm sure others will be happy to give
> comments too.  I'll try to make sure that all the criticism I give is
> constructive in nature, and if you have questions on any of it, please
> feel free to ask about it.
> 
> 
> I'll concentrate on the imgreg() function first.  The declaration of
> 'images' as a global variable looks a little weird.  I do see that
> 'imgreg' feeds into 'images'.  Not a major issue so far, but you might
> want to see if it's possible to do without the global, and explicitly pass
> in 'images' as another parameter to imgreg.  Globals just bother me on
> principle.  *grin*
> 
The thing is, I want to download the images after I have fetched the
pages. So I thought I would use a global variable so that it is still
in scope at the end of the script.
> 
> You may want to document what 'patt' and 'search' are meant to be.  A
> comment at the top of imgreg, like:
> 
>  """imgreg searches for a pattern 'patt' within the text 'search'.  If
>  a match exists, adds it to the set of images, and returns 1.  Else,
>  returns 0."""
> 
> will help a lot.  Documenting the intent of a function is important,
> because people are forgetful.  No programming language can prevent memory
> loss:  what we should try to do is to compensate for our forgetfulness.
> *grin*
> 
> 
> Looking at pageimgs(): I'm not sure what 't' means in the open statement:
> 
>   f = open(filename, "rt")
> 
> and I think that 't' might be a typo: I'm surprised that Python doesn't
> complain.  Can anyone confirm this?  I think you may have tried to do "r+"
> mode, but even then, you probably don't: you're just reading from the
> file, and don't need to write back to it.
> 
> 
> Looking further into pageimgs(): again, I get nervous about globals.  The
> use of the 'r1' global variable is mysterious.  I had to hunt around to
> figure out what it was it near the middle of the program.
> 
> If anything, I'd recommend naming your global variables with more meaning.
> A name like 'image_regex_patterns' will work better than 'r1'.  Also, it
> looks like pageimgs() is hardcoded to assume 'r1' has three regular
> expressions in it, as it calls imgreg three times for each pattern in r1.
> 
>   if imgreg(r1[0],a) == 1:
>   continue
>   if imgreg(r1[1],a) == 1:
>   continue
>   imgreg(r1[2],a)
> 
> and that looks peculiar.  Because this snippet of code is also
> copy-and-pasted around line 106, it appears to be a specific kind of
> conceptual task that you're doing to register images.
> 
> I think that the use of 'r1' and 'imgreg' should be intrinsically tied.
> I'd recommend revising imgreg() so that when we register images, we don't
> have to worry that we've called it on all the regular expressions in r1.
> That is, let imgreg worry about it, not clients: have imgreg go through
> r1[0:3] by itself.
> 
> 
> If we incorporate these changes, the result might look something like
> this:
> 
> ###
> image_regex_patterns = map(re.compile,
>[r'http://\w+.\w+.\w+/i.+.gif',
> r'http://\w+.\w+.\w+/i.+.jpg',
> r'http://\w+.\w+.\w+/i.+.png'])
This one is very good. I have stumbled over map() a few times but
didn't know how to use it. This makes it easier.
> def imgreg(search):
> """Given a search text, looks for image urls for registration.  If
> a new one can be found, returns 1.  Otherwise, returns 0.
> 
> Possible bug: does not register all images in the search text, but only
> the first new one it can find.
> """
> for r in image_regex_patterns:
> z = r.search(search)
> if z != None:
> x = z.group(0)
> if x not in images:
> images.append(x)
> return 1
> return 0
> ###
The purpose of storing the images in a global list was to download all
the images after the pages were saved, and to avoid downloading an
image again if it had already been saved.
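For reference: that "download each image only once" bookkeeping can also be written with a set, which makes the membership test cheap and the intent explicit. A minimal sketch with invented URLs:

```python
seen = set()

def register(url):
    """Return True if url is new (and record it), False if already seen."""
    if url in seen:
        return False
    seen.add(url)
    return True

urls = ["http://a/x.gif", "http://a/y.jpg", "http://a/x.gif"]
fresh = [u for u in urls if register(u)]
print(fresh)  # the duplicate x.gif appears only once
```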

Re: [Tutor] code improvement for beginner ?

2005-10-08 Thread lmac
Ok. Here we go. Wanted to start my page long ago. Now is the right time.

http://daderoid.freewebspace24.de/python/python1.html

Thank you.





[Tutor] code improvement for beginner ?

2005-10-08 Thread lmac
Hi there,
I wonder if I could post some of my scripts and have someone tell me
whether there is a better way to code the problem, in the manner of a
teaching class. ;-)

Or is this mailing list only for specific questions?

Thanks.



Re: [Tutor] find data in html file

2005-09-30 Thread lmac
>
>
>Message: 5
>Date: Fri, 30 Sep 2005 10:32:41 -0400
>From: Kent Johnson <[EMAIL PROTECTED]>
>Subject: Re: [Tutor] find data in html file
>Cc: tutor@python.org
>Message-ID: <[EMAIL PROTECTED]>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>lmac wrote:
>  
>
>>> It's not this simple. The whole thing is that i try to use ebay.de for 
>>> fetching websites
>>> when i give an articlenumber. The downloading of the site for a specific 
>>> article is no problem.
>>> But to get the data like price,bidders,shipment etc without the official 
>>> eBayAPI is hard.
>>> Maybe anyone has a solution made ?
>>> 
>>> Thanks anyway. I tried the htmllib. This is a very good lib but i don't get 
>>> it to work cos
>>> there is no  thing for the data i want to get. This is for html-tags. 
>>> And to store data
>>> in my own XML-files. (what i am goint to do when i get the data).
>>
>>
>
>You can try BeautifulSoup which is designed for screen-scraping:
>http://www.crummy.com/software/BeautifulSoup/
>
>But looking at the source for an eBay page it looks challenging. I wonder why 
>you don't use the eBay API to get the information you want? It seems to be 
>free for up to 10,000 requests a month and there is a python package to access 
>it.
>
>Kent
>
>
>
>--
>
>Message: 6
>Date: Fri, 30 Sep 2005 15:55:26 +0100
>From: paul brian <[EMAIL PROTECTED]>
>Subject: Re: [Tutor] find data in html file
>To: Python Tutor 
>Message-ID:
>   <[EMAIL PROTECTED]>
>Content-Type: text/plain; charset=ISO-8859-1
>
>  
>
>>> But to get the data like price,bidders,shipment etc without the official
>>> eBayAPI is hard.
>>> Maybe anyone has a solution made ?
>>
>>
>
> Ebay specifically change around their HTML codes, tags and formatting
> (in quite a clever way) to stop people doing exactly what you are
> trying to do. I think it changes every month.
>
> Like people say, use the API - You need to become an "ebay developer"
> (signup) and can use your own code or the python-ebay thing for free
> in "the sandbox", but must pay $100 or so to have your code verified
> as "not likey to scrunch our servers" before they give you a key for
> the real world.
>
> Its a bit of a pain, so i just hacked turbo-ebay a while back and made
> do.  Worked quite well really.
>
>
>--
>--
>Paul Brian
>m. 07875 074 534
>t. 0208 352 1741
>
>
>  
>
I will look into it (BeautifulSoup). At eBay I now have a developer
account, but if I read it correctly, that is only the sandbox, not the
real eBay database, which means I have no access to actual ongoing
auctions. Am I right? 10,000 is more than enough.
The other thing is that I want to write this under Linux; I use only
Linux for Internet surfing etc., and the eBay API is a Windows DLL. Of
course pyEbay works under Linux too.

Thanks for the tips. I think I'll throw in the towel and do everything
by hand.

;-)





Re: [Tutor] find data in html file

2005-09-30 Thread lmac
Date: Wed, 28 Sep 2005 09:25:53 +0100
From: Ed Singleton <[EMAIL PROTECTED]>
Subject: Re: [Tutor] find data in html file
To: tutor@python.org
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1

On 27/09/05, lmac <[EMAIL PROTECTED]> wrote:

>> Hi there,
>> i have a base-question. If i want to read some kind of data out of a line
>> which i know the start-tag and the end-tag in an html-file how do i
>> recognize
>> if it's more than one line ?
>>
>> Example:
>>
>> Some textlinktext . DATA  etc.
>>
>> I would use >text as the starting tag to localize the beginning of the DATA.
>> And then  as the ending tag of the DATA. But if there is \n then
>> there are more than
>> one line.
>  
>

Hopefully it's just a typo or something, but you appear to have your
ending </tr> and </td> tags the wrong way round.

You should be closing the cell before you close the row.

How do you want to get the data out?  This case is simple enough that
you could do a lazy (non-greedy) regex statement for it.  Something
like "([\s|\S]+?)" would do it.

Ed

It's not this simple. The whole thing is that I am trying to use
ebay.de to fetch pages when I give it an article number. Downloading
the page for a specific article is no problem, but getting data like
price, bidders, shipment etc. without the official eBay API is hard.
Maybe somebody already has a ready-made solution?

Thanks anyway. I tried htmllib. It is a very good lib, but I can't get
it to work because there is no  thing for the data I want to get; it
is for HTML tags. And I want to store the data in my own XML files
(which is what I am going to do once I have the data).







[Tutor] find data in html file

2005-09-27 Thread lmac
Hi there,
I have a basic question: if I want to read some data out of a line in
an HTML file, and I know the start tag and the end tag, how do I
recognize whether it spans more than one line?

Example:

Some textlinktext . DATA  etc.

I would use >text as the starting tag to localize the beginning of the
DATA, and then  as the ending tag of the DATA. But if there is a \n,
then it spans more than one line.

I hope I explained well what I am going for. English is not my native
language.

Thank you.



[Tutor] find() function and tuple. Always returns -1.

2005-08-26 Thread lmac
Hi there,

I have a problem with a tuple and the find() function. I know the
keywords I am looking for are in the document, but find() always
returns -1.

Thanks for the help.

fedora_user



#!/usr/bin/python
# -*- coding: utf_8 -*-

import string
import sys
import os
import urllib


anf_bez = ('Startpreis:','Restzeit:','Angebotsbeginn:','Übersicht:',
           'Artikelstandort:','Versand nach:','Artikelnummer:','Kategorie')
end_bez = ('','MESZ','MESZ','Gebote','','','','')

# article number whose info is to be saved
artikelno = `sys.argv[1:2]`
artikelno = artikelno[2:-2]

if len(artikelno) != 0:

    TARGET_DIR = "/opt/internet/eBay/"
    EBAY_HTTP = "http://cgi.ebay.de/ws/eBayISAPI.dll?ViewItem&item="
    EBAY_PAGE = EBAY_HTTP + artikelno

    SAVE_PAGE = ""
    SAVE_PAGE = SAVE_PAGE + "eBay-artikel" + artikelno + ".html"
    SAVE_PAGE = os.path.join(TARGET_DIR,SAVE_PAGE)

    # download and save the page
    urllib.urlretrieve(EBAY_PAGE,SAVE_PAGE)

    # open the saved page and search it
    file = open(SAVE_PAGE,"rt")
    for a in file:
        asi = 0  # search index into 'anf_bez'
        esi = 0  # search index into 'end_bez'
        while asi < 8:
            anf = -1
            end = -1
            anf = a.find( anf_bez[asi] )   # <-- always returns -1, never finds anything?
            if anf != -1:
                end = a[anf].find( end_bez[esi] )
                if end != -1:
                    print a[anf:end]

            asi = asi+1
            esi = esi+1
            print asi,esi

    print EBAY_PAGE
    print SAVE_PAGE
else:
    print "Pass the article number as an argument."
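One detail worth noting, independent of why find() misses the keywords: a[anf] is a single character, so a[anf].find(...) searches a one-character string; the slice a[anf:] is what searches the rest of the line. A self-contained illustration on an invented line:

```python
line = "Startpreis: 10,50 EUR MESZ"

anf = line.find("Startpreis:")
assert line[anf] == "S"           # a single character, not the rest of the line

end = line[anf:].find("EUR")      # search the slice from anf onward
print(line[anf:anf + end])        # the text before "EUR"
```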





[Tutor] long int in list as argument for seek() function

2005-06-08 Thread lmac
Hi there,

I want to use a long int from a list which I got from my function
find_lineno(), but I get this error and I don't understand why I
cannot use this long as an argument. Where do I find good
documentation on errors, so that I can completely understand what the
heck is going on?

Many thanks.

ERROR:
---
Traceback (most recent call last):
  File "./extrmails.py", line 42, in ?
inputfile.seek(0,li)
IOError: [Errno 22] Invalid argument
---


CODE-START:
-

inputfile = open("mails","rt")

# --
def reset_inputfile():
    inputfile.seek(0,0)

# --
def find_lineno(string):
    f = -1
    a = "start"
    found_lines = []
    reset_inputfile()

    while len(a) != 0:
        a = inputfile.readline()
        f = a.find(string)
        if f != -1:
            found_lines.append(inputfile.tell())

    return found_lines

# --

from_lineno = find_lineno("From:")
subj_lineno = find_lineno("Subject:")

print len(subj_lineno)
print len(from_lineno)

reset_inputfile()

for li in subj_lineno:
    inputfile.seek(0,li)    # <-- ???
    ...
    ..
--
CODE-END
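For reference: file.seek() takes (offset, whence), in that order. seek(0, li) therefore passes li as the whence flag, and any value outside 0/1/2 is invalid, which matches the IOError above; seek(li, 0) is what was intended. (inputfile.tell() returns byte offsets, which is exactly what seek() wants as its first argument.) A small demonstration on an invented file, in modern Python:

```python
import os, tempfile

# build a tiny stand-in for the "mails" file (contents invented)
path = os.path.join(tempfile.mkdtemp(), "mails")
with open(path, "w") as out:
    out.write("From: a\nSubject: b\n")

inputfile = open(path, "rb")
offset = 8  # byte position where the second line starts

try:
    inputfile.seek(0, offset)        # wrong order: offset lands in 'whence'
except (ValueError, OSError) as e:
    print("seek(0, offset) rejected:", e)

inputfile.seek(offset, 0)            # correct: offset first, whence second
print(inputfile.readline())          # b'Subject: b\n'
inputfile.close()
```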
