Re: [Tutor] Question regarding parsing HTML with BeautifulSoup

2007-01-04 Thread Shuai Jiang (Runiteking1)


Hi,

Wow, that's much more elegant than the idea I thought of.

Thank you very much Kent!

Marshall

On 1/3/07, Kent Johnson <[EMAIL PROTECTED]> wrote:


Shuai Jiang (Runiteking1) wrote:
> Hello,
>
> I'm working on a program that needs to parse a financial document on the
> internet using BeautifulSoup. Because of the nature of the information, it
> is all grouped as a table. I needed to get 3 types of info and have
> succeeded quite well using BeautifulSoup, but encountered problems on the
> third one.
>
> My question is: is there an easy way to parse an HTML table's column
> using BeautifulSoup? I copied the table here and I need to extract the
> EPS. The numbers are every sixth one from the   tag, e.g. 2.27, 1.86,
> 1.61...

Here is one way, found with a little experimenting at the command prompt:

In [1]: data = '''

   ...: '''

In [3]: from BeautifulSoup import BeautifulSoup as BS

In [4]: soup = BS(data)

In [11]: for tr in soup.table.findAll('tr'):
   ....:     print tr.contents[11].string
   ....:
   ....:
EPS
2.27
1.86
1.61
1.27
1.18
0.84
0.73
0.46
0.2
0.0

Kent






--
I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as
equals.
   Sir Winston Churchill
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Question regarding parsing HTML with BeautifulSoup

2007-01-03 Thread Kent Johnson

Shuai Jiang (Runiteking1) wrote:
> Hello,
>
> I'm working on a program that needs to parse a financial document on the
> internet using BeautifulSoup. Because of the nature of the information, it
> is all grouped as a table. I needed to get 3 types of info and have
> succeeded quite well using BeautifulSoup, but encountered problems on the
> third one.
>
> My question is: is there an easy way to parse an HTML table's column
> using BeautifulSoup? I copied the table here and I need to extract the
> EPS. The numbers are every sixth one from the   tag, e.g. 2.27, 1.86,
> 1.61...

Here is one way, found with a little experimenting at the command prompt:

In [1]: data = '''

   ...: '''

In [3]: from BeautifulSoup import BeautifulSoup as BS

In [4]: soup = BS(data)

In [11]: for tr in soup.table.findAll('tr'):
   ....:     print tr.contents[11].string
   ....:
   ....:
EPS
2.27
1.86
1.61
1.27
1.18
0.84
0.73
0.46
0.2
0.0
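
The table itself did not survive in this archive, so the session above cannot be rerun as shown. The index 11 presumably counts the whitespace strings between cells as well as the cells themselves, which would make it the sixth cell of each row. As a rough sketch of the same idea (take the sixth cell of every row), here is a version that uses only the standard library's html.parser instead of BeautifulSoup. The sample table, its column names other than EPS, and the names ColumnExtractor and extract_column are all made up for illustration; only the EPS values 2.27 and 1.86 come from the original post.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the table that was lost from the archive.
# Only the EPS header and the values 2.27 and 1.86 come from the thread.
SAMPLE = """
<table>
  <tr><th>Year</th><th>Sales</th><th>Income</th><th>Assets</th><th>Equity</th><th>EPS</th><th>Div</th></tr>
  <tr><td>2005</td><td>10</td><td>2</td><td>30</td><td>12</td><td>2.27</td><td>0.5</td></tr>
  <tr><td>2004</td><td>9</td><td>1</td><td>28</td><td>11</td><td>1.86</td><td>0.4</td></tr>
</table>
"""


class ColumnExtractor(HTMLParser):
    """Collect the text of one column (0-based index) from every table row."""

    def __init__(self, column):
        super().__init__()
        self.column = column
        self.values = []
        self._cell_index = -1  # position of the current cell within its row
        self._in_cell = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new row starts, so cell counting starts over.
            self._cell_index = -1
        elif tag in ("td", "th"):
            self._cell_index += 1
            self._in_cell = True
            self._text = []

    def handle_data(self, data):
        # Text between cells (whitespace, newlines) is ignored.
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._in_cell = False
            if self._cell_index == self.column:
                self.values.append("".join(self._text).strip())


def extract_column(html, column):
    """Return the stripped text of the given column for each row."""
    parser = ColumnExtractor(column)
    parser.feed(html)
    parser.close()
    return parser.values


# Column 5 is the sixth cell in each row, i.e. EPS in the sample table.
print(extract_column(SAMPLE, 5))
```

Because only td and th tags are counted, the result does not depend on how much whitespace sits between the cells, which is the main difference from indexing into contents.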

Kent


