[Tutor] extract plain english words from html
hi. i have a ton of html files from which i want to extract the plain english words, and then write those words into a single text file.

example:

<... all kinds html tags ...> this is text

from the above, i want to extract the string 'this is text' and write it out to a text file. note that all of the html files have the same format, i.e. the text is always surrounded by the same html tags. also, i am sorting through thousands of html files, so whatever i do needs to be fast.

any ideas?

marc
---
The apocalyptic vision of a criminally insane charismatic cult leader
http://www.marcbuehler.net
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
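One way this could be approached, sketched with only the standard library: subclass `html.parser.HTMLParser`, collect the text nodes, and write one line of text per input file. The names `TextExtractor`, `extract_text` and `extract_all` are made up for this illustration, and the code assumes Python 3 and UTF-8 input. Since the exact surrounding tags are not given in the question, the sketch keeps all visible text rather than the text of one specific tag.

```python
from html.parser import HTMLParser
import glob


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_text(html):
    """Return the text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Chunks are joined with spaces, so adjacent inline tags get separated.
    return " ".join(" ".join(parser.chunks).split())


def extract_all(pattern, out_path):
    """Write the extracted text of every file matching pattern to out_path."""
    with open(out_path, "w", encoding="utf-8") as out:
        for name in sorted(glob.glob(pattern)):
            with open(name, encoding="utf-8", errors="replace") as f:
                out.write(extract_text(f.read()) + "\n")
```

Calling `extract_all("*.html", "words.txt")` would process every HTML file in the current directory (both names are placeholders). If the surrounding tags really are identical in every file, a precompiled regular expression targeting just those tags may be faster, at the cost of being more brittle.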
[Tutor] passing variable to python script
hi. i want to pass an argument (a number) to a python script when running it:

> python script.py

i want to be able to use that number within script.py as a parameter. how do i set this up?

marc
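Command-line arguments are available in `sys.argv`, a list of strings in which `sys.argv[0]` is the script name and the arguments follow. A minimal sketch, assuming the number is an integer and is the first argument; the helper name `parse_number` is invented for this example:

```python
import sys


def parse_number(argv):
    """Return the first command-line argument converted to an int."""
    if len(argv) < 2:
        raise SystemExit("usage: python script.py NUMBER")
    # Arguments always arrive as strings, so convert explicitly.
    return int(argv[1])


# Inside script.py this would be used as:
#     number = parse_number(sys.argv)
```

Running `python script.py 42` would then give `number` the integer value 42. For anything beyond one or two positional arguments, the standard `argparse` module is probably the more comfortable choice.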
[Tutor] how to alter list content
hi. i create a list of all JPG files with:

>>> list = glob.glob('*.JPG')

the content of 'list' is now:

>>> print list
['DSC1.JPG', 'DSC2.JPG', 'DSC3.JPG']

but what i want is this type of list:

['DSC1', 'DSC2', 'DSC3']

i.e. the names w/o the file extension. what's the easiest way of doing this?

marc
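A short sketch of one common approach: `os.path.splitext` splits a filename into a `(root, extension)` pair, and a list comprehension applies it to every name. The function name `strip_extensions` is made up for illustration, and the code is written for Python 3, where `print` is a function.

```python
import os


def strip_extensions(filenames):
    """Return the filenames with their final extension removed."""
    return [os.path.splitext(name)[0] for name in filenames]


names = strip_extensions(['DSC1.JPG', 'DSC2.JPG', 'DSC3.JPG'])
print(names)  # ['DSC1', 'DSC2', 'DSC3']
```

It is also worth choosing a variable name other than `list`, since that name shadows the built-in `list` type for the rest of the session.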
[Tutor] how to extract number of files from directory
hi. i'm new to Python ... i would like to extract the number of JPG files from the current directory and use that number as a parameter in my python script. i tried:

a = os.system('ls *JPG | wc -l')

when i do:

print a

i get '0'. what am i missing?

marc
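The '0' is the return value of `os.system`, which is the exit status of the command (0 meaning success), not its output; the count printed by `wc -l` goes straight to the terminal. A sketch that avoids the shell entirely by counting the matches from `glob.glob`; the helper name `count_files` and its `directory` parameter are assumptions made for this example:

```python
import glob
import os


def count_files(pattern, directory="."):
    """Return how many paths in directory match the glob pattern."""
    return len(glob.glob(os.path.join(directory, pattern)))


# Number of JPG files in the current directory.
a = count_files("*.JPG")
```

If the output of a shell command is genuinely needed, `subprocess.run(..., capture_output=True)` in Python 3 returns it as data instead of printing it.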