> Which method is best and most pythonic to scrape text data with
> minimal formatting?
Use the HTMLParser module.
> I want to change the above to:
>
> Trigger: Debate on budget in Feb-Mar. New moves to
> cut medical costs by better technology.
>
> Since I wanted some practice in regex, I starte
> Is there a better way for raw_input to accept both caps and lower case
> letters than:
>
> def aFunction():
>     action = raw_input("Perform an action?(y,n): ")
>     if action == 'y' or action == 'Y':
if action in 'yY':
>         anotherFunction()
>     elif action == 'n' or action == 'N':
Which method is best and most pythonic to scrape text data with
minimal formatting?
I'm trying to read a large html file and strip out most of the markup,
but leaving the simple formatting like , , and . For example:
Trigger:
Debate on budget in Feb-Mar. New moves to cut medical costs by better
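A minimal sketch of the HTMLParser approach suggested in this thread. The names TagFilter and strip_markup are hypothetical, and the set of tags to keep (b, i, p) is an assumption, since the archive dropped the tags named in the original question. The import fallback is only there so the same code runs on Python 2 and Python 3.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2

class TagFilter(HTMLParser):
    """Collect text, passing through only a small set of tags."""

    def __init__(self, keep):
        HTMLParser.__init__(self)
        self.keep = set(keep)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped even on tags that are kept.
        if tag in self.keep:
            self.parts.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in self.keep:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

def strip_markup(html, keep=("b", "i", "p")):
    # keep is an assumed default; adjust it to the tags you care about.
    parser = TagFilter(keep)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

sample = '<div><b>Trigger:</b> <font size="2">Debate on budget</font></div>'
cleaned = strip_markup(sample)
```

Letting the parser find the tag boundaries avoids most of the edge cases a hand-written regex would have to handle.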
Jeff Shannon said unto the world upon 2005-02-15 21:20:
On Tue, 15 Feb 2005 17:19:37 -0500, Brian van den Broek
<[EMAIL PROTECTED]> wrote:
My Node class defines a _parse method which separates out the node
header, and sends those lines to a _parse_metadata method. This is
where the elif chain occu
Oops, you probably want to do this then-
for i in range( 0, 3 ):
    oThread = Thread( target=mainFunction ).start()
    while oThread:
        print 'sleeping 3 seconds'
        time.sleep( 3 )
An if generally has an implicit else: pass clause as I
think of it, so it will just keep
That is an attempt to catch the death of the thread. I guess I'm not
taking the right steps ;-)
Bernard
Liam Clarke wrote:
I'm sorry, but when does oThread get the value 1?
If you're testing for its existence via a True/False thing, try
if oThread:
But otherwise, I'm not sure what you're expec
On Feb 16, 2005, at 02:36, Liam Clarke wrote:
I'm sorry, but when does oThread get the value 1?
If you're testing for its existence via a True/False thing, try
if oThread:
But otherwise, I'm not sure what you're expecting to get.
Once again, you hit the spot, Liam. It seems that a Thread object
I'm sorry, but when does oThread get the value 1?
If you're testing for its existence via a True/False thing, try
if oThread:
But otherwise, I'm not sure what you're expecting to get.
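For what it's worth, Thread.start() returns None, so a name bound to its result can never be used to watch the thread. A minimal sketch of one way to wait for threads to finish, assuming a stand-in main_function and a results dict that are not part of the original code:

```python
import threading
import time

def main_function(results, index):
    # Stand-in for the real work (hypothetical).
    time.sleep(0.1)
    results[index] = "done"

results = {}
threads = []
for i in range(3):
    t = threading.Thread(target=main_function, args=(results, i))
    t.start()            # start() returns None, so keep t itself
    threads.append(t)

for t in threads:
    t.join()             # blocks until that thread has finished
```

join() does the waiting, so no polling loop with sleep is needed.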
On Tue, 15 Feb 2005 20:58:15 -0500, Bernard Lebel
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have already m
On Feb 16, 2005, at 01:58, Bernard Lebel wrote:
Now, I have a list of "jobs", each job being a windows bat file that
launches an executable and performs a rendering task. So I have this
queue of jobs, and would like to launch one only when the previous one
has finished, and in a separate window.
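A rough sketch of running queued jobs strictly one after another with the subprocess module. The job list here is hypothetical: each entry runs a tiny Python command in place of a real .bat file, so the example works on any platform.

```python
import subprocess
import sys

# Hypothetical job list; each entry stands in for one .bat file.
jobs = [
    [sys.executable, "-c", "print('render job 1')"],
    [sys.executable, "-c", "print('render job 2')"],
]

exit_codes = []
for command in jobs:
    # call() waits for the process to exit before returning,
    # so the next job cannot start until this one is done.
    exit_codes.append(subprocess.call(command))
```

On Windows, passing creationflags=subprocess.CREATE_NEW_CONSOLE to call() should give each job its own console window; that flag is Windows-only and is not used above.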
On Tue, 15 Feb 2005 17:19:37 -0500, Brian van den Broek
<[EMAIL PROTECTED]> wrote:
> My Node class defines a _parse method which separates out the node
> header, and sends those lines to a _parse_metadata method. This is
> where the elif chain occurs -- each line of the metadata starts with a
> ta
Hello,
I have already messed a little with simple thread programming, which took
this form:
from threading import Thread
def mainFunction():
    pass
Thread( target=mainFunction ).start()
Now, I have a list of "jobs", each job being a windows bat file that
launches an executable and performs
Well, thanks everyone who answered, much clearer now.
Bernard
Max Noel wrote:
In a slightly more generic fashion (everybody started dropping
examples), the goal of an integer (Euclidean) division (say, a / b) is
to express an integer as such:
a = b * quotient + remainder
Where all the n
On Wed, 16 Feb 2005, Tony Meyer wrote:
> >> Is there a better way for raw_input to accept both caps and
> >> lower case letters than:
> [...]
> >>if action == 'y' or action == 'Y':
> >
> > if action in 'yY':
> >     dostuff()
> [...]
> > Although, that does mean that if a user enters 'nN' they'l
Liam Clarke said unto the world upon 2005-02-15 18:08:
Hi Brian, why not take it the next step and
for key in metadata_dict:
    if data.startswith(key):
        exec('''self.%s = """%s"""''' %(metadata_dict[key],
                                         data[len(key):]))
# tripl
In a slightly more generic fashion (everybody started dropping
examples), the goal of an integer (Euclidean) division (say, a / b) is
to express an integer as such:
a = b * quotient + remainder
Where all the numbers used are integers, and 0 <= remainder < b.
When you perform integer di
>> Is there a better way for raw_input to accept both caps and
>> lower case letters than:
[...]
>>if action == 'y' or action == 'Y':
>
> if action in 'yY':
>     dostuff()
[...]
> Although, that does mean that if a user enters 'nN' they'll
> get no, but that shouldn't be a huge problem, and it i
if action in 'yY':
    dostuff()
elif action in 'nN':
    doothersutff()
Although, that does mean that if a user enters 'nN' they'll get no,
but that shouldn't be a huge problem, and if it does, you can just do
an if len(action) != 1...
HTH
Liam Clarke
On Tue, 15 Feb 2005 15:16:37 -0800, Luke Jordan
Hi all, thanks to all for running such a great list.
Is there a better way for raw_input to accept both caps and lower case
letters than:
def aFunction():
    action = raw_input("Perform an action?(y,n): ")
    if action == 'y' or action == 'Y':
        anotherFunction()
    elif action == 'n' or act
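One thing to watch with the action in 'yY' test: it is a substring check, so an empty string (the user just pressing Enter) also counts as yes. A small sketch of normalizing the answer first; parse_yes_no is a hypothetical helper, and accepting the full words "yes" and "no" is an extra assumption:

```python
def parse_yes_no(answer):
    """Return True for yes, False for no, None for anything else."""
    normalized = answer.strip().lower()
    if normalized in ("y", "yes"):
        return True
    if normalized in ("n", "no"):
        return False
    return None

# The substring pitfall: the empty string is "in" every string.
empty_matches = "" in "yY"
```

The caller can keep asking while the helper returns None, passing it whatever raw_input returned.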
Hi Brian, why not take it the next step and
> for key in metadata_dict:
>     if data.startswith(key):
>         exec('''self.%s = """%s"""''' %(metadata_dict[key],
>                                          data[len(key):]))
> # triple quotes as there may be quotes in meta
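As a possible alternative to building a statement for exec, setattr() assigns an attribute by name and needs no quoting tricks, because the value is never turned into source code. A minimal sketch; the tag strings and attribute names in metadata_dict are made up for illustration, and Node here is only a stand-in for the class discussed in the thread:

```python
# Hypothetical mapping from metadata tag to attribute name.
metadata_dict = {"dt=": "date", "ti=": "title"}

class Node(object):
    def __init__(self):
        self.date = None
        self.title = None

    def update(self, data):
        for key, attr_name in metadata_dict.items():
            if data.startswith(key):
                # Store everything after the tag, quotes and all.
                setattr(self, attr_name, data[len(key):])
                break

node = Node()
node.update('ti=He said "hi"')
```

This also replaces the long elif chain with a single loop over the dictionary.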
On Tue, 15 Feb 2005 14:26:52 -0800 (PST), Da
>
> Hi Bernard,
>
> Another familiar example of modulo is checking to see if a number is even
> or odd:
>
Since Danny got it started with the examples, I'll give another
canonical example of the use of the modulus operator. Imagine that
we're trying
Brian van den Broek wrote:
[snip text]
class A:
    def __init__(self):
        self.something = None
        self.something_else = None
        self.still_another_thing = None
    def update(self, data):
        for key in metadata_dict:
            if data.startswith(key):
                exec('''self.%s = ""
> A remainder is what's left over after a division:
>
> 10/3 = 3 remainder 1
> 12/5 = 2 remainder 2
> 27/3 = 9 remainder 0
>
> and the modulus operator (which is % in python) gives you that remainder:
>
> 10%3 = 1
> 12%5 = 2
> 27%3 = 0
Hi Bernard,
Another familiar example of modulo is checking
Hi all,
I'm still plugging away at my project of writing code to process
treepad files. (This was the task which I posted about in the recent
"help with refactoring needed -- which approach is more Pythonic?"
thread.)
My present problem is how best to reorganize a long (20 elements) elif
chain
A remainder is what's left over after a division:
10/3 = 3 remainder 1
12/5 = 2 remainder 2
27/3 = 9 remainder 0
and the modulus operator (which is % in python) gives you that remainder:
10%3 = 1
12%5 = 2
27%3 = 0
See http://mathworld.wolfram.com/Remainder.html and
http://mathworld.wolfram.com/
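The remainder examples above can be checked with the built-in divmod(), which returns the quotient and remainder together. A short sketch; check_division and is_even are hypothetical helper names:

```python
def check_division(a, b):
    # divmod returns (quotient, remainder) in one call.
    quotient, remainder = divmod(a, b)
    # The defining identity: a = b * quotient + remainder.
    assert a == b * quotient + remainder
    return quotient, remainder

def is_even(n):
    # A number is even when dividing by 2 leaves no remainder.
    return n % 2 == 0

examples = [check_division(10, 3), check_division(12, 5), check_division(27, 3)]
```

For a positive divisor b, Python keeps the remainder in the range 0 <= remainder < b, even when a is negative.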
Hi,
I'm reading a Python book right now (Learning Python, a great book!), and there
are a few terms that are brought up a few times but without any explanation.
So what are:
- "remainders" (in the context of remainders-of-division modulus for numbers)
- "modulus" (in the same context; I have a
Problem solved. Thanks
--- Kent Johnson <[EMAIL PROTECTED]> wrote:
> Try it with non-greedy matches. You are matching
> everything from the first
> in one match. Also I think you want to escape the .
> before (you want just paragraphs that end
> in a period?)
>
> pattern = re.compile(""" hr
Try it with non-greedy matches. You are matching everything from the first
in one match. Also I think you want to escape the . before (you want just paragraphs that end
in a period?)
pattern = re.compile("""(.*?)\.""", re.DOTALL)
Kent
Ron Nixon wrote:
Trying to scrape a newspaper site for arti
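To illustrate the greedy versus non-greedy difference, here is a small sketch. The <p> tags and the sample page are assumptions, since the archive stripped the markup from the original pattern:

```python
import re

# Hypothetical page fragment with two paragraphs.
page = "<p>First paragraph.</p><p>Second paragraph.</p>"

# Greedy: .* runs to the LAST </p>, swallowing both paragraphs.
greedy = re.findall(r"<p>(.*)</p>", page, re.DOTALL)

# Non-greedy: .*? stops at the FIRST </p> it can.
lazy = re.findall(r"<p>(.*?)</p>", page, re.DOTALL)
```

Here greedy holds a single match spanning both paragraphs, while lazy holds one match per paragraph.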
Coupla nits:
On Tue, 15 Feb 2005 14:39:30 -0500, Kent Johnson <[EMAIL PROTECTED]> wrote:
> from string import punctuation
> from time import time
>
>
> words = open(r'D:\Personal\Tutor\ArtOfWar.txt').read().split()
Another advantage of the first method is that it allows a more elegant
word coun
Trying to scrape a newspaper site for articles using
this code which was done with help from the list:
import urllib, re
pattern = re.compile("""(.*).""", re.DOTALL)
page = urllib.urlopen("http://www.startribune.com").read()
for headline, body in pattern.findall(page):
    print body
It should g
Ryan Davis wrote:
Here's one way to iterate over that to get the counts. I'm sure there are
dozens.
###
>>> x = 'asdf foo bar foo'
>>> counts = {}
>>> for word in x.split():
...     counts[word] = x.count(word)
...
>>> counts
{'foo': 2, 'bar': 1, 'asdf': 1}
###
The dictionary takes care of duplicates. If you are
On Tue, 15 Feb 2005 18:03:57 +, Max Noel <[EMAIL PROTECTED]> wrote:
>
> On Feb 15, 2005, at 17:19, Ron Nixon wrote:
>
> > Thanks to everyone who replied to my post. All of your
> > suggestions seem to work. My thanks
> >
> > Ron
>
> Watch out, though, for all of this to work flawless
> Other than using several print statements to look for
> separate words like this, is there a way to do it so
> that I get an individual count of each word:
>
> word1 xxx
> word2 xxx
> words xxx
The classic approach is to create a dictionary.
Add each word as you come to it and increment the val
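A minimal sketch of the dictionary approach described here; count_words is a hypothetical name. Counting whole words this way also sidesteps a quirk of str.count(), which counts substrings, so 'foo' would be found inside 'foobar' as well.

```python
def count_words(text):
    counts = {}
    for word in text.split():
        # get() supplies 0 the first time a word is seen.
        counts[word] = counts.get(word, 0) + 1
    return counts

word_counts = count_words("asdf foo bar foo")

# Substring counting gives a different answer for the same word.
substring_hits = "foo foobar".count("foo")
```

For a file, the same function can be fed the result of open('text.txt').read().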
Ahem, we heard you the first time!
:-)
Alan G.
- Original Message -
From: "l4 l'l1" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, February 15, 2005 10:47 AM
Subject: [Tutor] Variables
> How can I do it with several variables?
>
>
> I = "John"
> print "%s used to love pizza" % I
>
> About
> I'm already hitting my conceptual troubles however, as I'm visualising
> each table as a 'card'.
Maybe but the cards are made up of rows, each row with fields.
Think like a spreadsheet. Each sheet can have references to
other sheets - like Tabs in Excel
> dimensional. But what I was wonderi
> How can I do it with several variables?
> I = "John"
> print "%s used to love pizza" % I
wrap them in parens:
>>> a,b = 3,4
>>> print "%d x %d = %d" % (a,b, a*b)
> About 10 or more...
Same technique but you might find it easier to use labels to
identify the fields.
>>> sum = a+b
>>> print
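A short sketch of the labelled style of % formatting, where each field names a dictionary key instead of relying on position. The person dictionary and its keys are invented for the example:

```python
# Hypothetical data; the keys become the labels in the format string.
person = {"name": "John", "food": "pizza", "years": 10}

sentence = "%(name)s used to love %(food)s for %(years)d years" % person
```

With ten or more values, labels make it much harder to put a value in the wrong slot, and the same key can be reused in several places.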
On Feb 15, 2005, at 17:19, Ron Nixon wrote:
Thanks to everyone who replied to my post. All of your
suggestions seem to work. My thanks
Ron
Watch out, though, for all of this to work flawlessly you first have
to remove all punctuation (either with regexes or with multiple
foo.replace('[symbol]',
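One possible way to handle the punctuation without regexes or a chain of replace() calls is to strip it from the ends of each word with string.punctuation. This is a sketch under that assumption; normalize and count_clean_words are hypothetical names, and lowercasing is an extra choice so that "Foo" and "foo" count together:

```python
from string import punctuation

def normalize(word):
    # Remove punctuation from both ends only; "don't" keeps its apostrophe.
    return word.strip(punctuation).lower()

def count_clean_words(text):
    counts = {}
    for raw in text.split():
        word = normalize(raw)
        if word:  # skip tokens that were nothing but punctuation
            counts[word] = counts.get(word, 0) + 1
    return counts

clean_counts = count_clean_words("Foo, foo. Bar! don't -")
```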
Thanks to everyone who replied to my post. All of your
suggestions seem to work. My thanks
Ron
--- Ryan Davis <[EMAIL PROTECTED]> wrote:
> You could use split() to split the contents of the
> file into a list of strings.
>
> ###
> >>> x = 'asdf foo bar foo'
> >>> x.split()
> ['asdf', 'foo', 'b
On Tue, 15 Feb 2005, Ron Nixon wrote:
> I know that you can do this to get a count of how many times a word
> appears in a file
>
>
> f = open('text.txt').read()
> print f.count('word')
>
> Other than using several print statements to look for separate words
> like this, is there a way to do i
Ron Nixon wrote:
f = open('text.txt').read()
print f.count('word')
Other than using several print statements to look for
separate words like this, is there a way to do it so
that I get an individual count of each word:
word1 xxx
word2 xxx
words xxx
etc.
Someone else might offer a better way of find
Ron Nixon wrote:
I know that you can do this to get a count of how
many times a word appears in a file
f = open('text.txt').read()
print f.count('word')
Other than using several print statements to look for
separate words like this, is there a way to do it so
that I get an individual cou
You could use split() to split the contents of the file into a list of strings.
###
>>> x = 'asdf foo bar foo'
>>> x.split()
['asdf', 'foo', 'bar', 'foo']
###
Here's one way to iterate over that to get the counts. I'm sure there are
dozens.
###
>>> x = 'asdf foo bar foo'
>>> counts = {}
>>> for
Ron,
is there a way to do it so
> that I get an individual count of each word:
>
> word1 xxx
> word2 xxx
> words xxx
>
> etc.
Ron, I'm gonna throw some untested code at you. Let me know if you
understand it or not:
word_counts = {}
for line in f:
    for word in line.split():
        if word in
I know that you can do this to get a count of how
many times a word appears in a file
f = open('text.txt').read()
print f.count('word')
Other than using several print statements to look for
separate words like this, is there a way to do it so
that I get an individual count of each word:
word1
I am forwarding this mail, since Matt Hauser did not include tutor@python.org in his reply.
Thank you,
Vishnu.
-Original Message-
From: Matt Hauser [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 15, 2005 7:28 PM
To: Vishnu
Subject: Re: [Tutor] Variables
#Create a list of people
whoLovesPizza = ["
Liam,
I think what you want is called a view. A view is a memory based table
defined by a query as follows:
CREATE VIEW myview (
column1,
column2,
... )
AS
BEGIN
SELECT * FROM table1
END;
In this example, you can now SELECT * FROM myview, and get table1. You
can put joined tables or
I don't think you can do exactly that. But SQL does have powerful capabilities to do selects on
multiple tables at once. It's called a 'join' and it is very common.
For example, suppose you have a customer database with a Customer table:
cust_id   cust_name
111       Liam Clarke
222       Kent Johnson
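A sketch of a join, using the sqlite3 module as a stand-in for the Firebird database discussed in the thread. The Customer rows come from the example above; the Orders table, its columns, and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT)")
# Orders is a made-up second table that refers back to Customer.
conn.execute("CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, item TEXT)")

conn.executemany("INSERT INTO Customer VALUES (?, ?)",
                 [(111, "Liam Clarke"), (222, "Kent Johnson")])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(1, 111, "widget"), (2, 222, "gadget"), (3, 111, "sprocket")])

# The join pairs each order with the customer whose cust_id matches.
rows = conn.execute(
    "SELECT c.cust_name, o.item "
    "FROM Customer c JOIN Orders o ON o.cust_id = c.cust_id "
    "ORDER BY o.order_id"
).fetchall()
conn.close()
```

Each row of the result combines columns from both tables, which is what makes the tables feel less like separate cards.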
Hi,
Method-I:
=
I1 = "John1"
I2 = "John2"
I3 = "John3"
print "%s, %s and %s used to love pizza" % (I1,I2,I3)
Method-II:
=
use dictionaries,
name = {}
name["I1"] = "John1"
name["I2"] = "John2"
name["I3"] = "John3"
print "%(I1)s, %(I2)s and %(I3)s used to love pizza" % name
HTH,
Vish
.
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Ack, sorry, just found the advanced tutorial and 'joins'.
Sorry.
*embarrassed*
(Also wondering if I spelt embarrassed right.)
Regards,
Liam Clarke
On Tue, 15 Feb 2005 23:26:06 +1300, Liam Clarke <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Working my way through the basics of SQL, and I can see th
Danny Yoo wrote:
On Mon, 14 Feb 2005, Bill Kranec wrote:
I'm using Kinterbasdb to access a Firebird database through Python, and
when I retrieve a row with a datetime value, I get a tuple like:
>>> myCursor.execute( 'SELECT * FROM table' )
>>> for row in myCursor.fetchall():
        print row
(, 'v
How can I do it with several variables?
I = "John"
print "%s used to love pizza" % I
About 10 or more...
HELP plz :)
How can I do it with several variables?
I = "John"
print "%s used to love pizza" % I
About 10 or more...
HELP plz :)
a = "foo"
b = "bar"
c = "duck"
print "I will say only this - %s to your %s and no %s" % (a, b, c)
I will say only this - foo to your bar and no duck
And so forth.
On Tue, 15 Feb 2005 19:07:56 +0900, ì ìì <[EMAIL PROTECTED]> wrote:
> How can I do it with several variables?
>
> I = "John"
> pr
Hi,
Working my way through the basics of SQL, and I can see that it's very
powerful for searching by different criteria.
I'm already hitting my conceptual troubles however, as I'm visualising
each table as a 'card'.
Always my problems, too much imagination. But yeah, it seems very 1
dimensional.
How can I do it with several variables?
I = "John"
print "%s used to love pizza" % I
About 10 or more...
HELP plz :)
55 matches