Re: [Tutor] Feedparser and google news/google reader

2010-03-11 Thread DK
DK  gmail.com> writes:

> 
> I didn't include the code because it's really just two lines, apologies:
> 

Well it turns out I AM a moron. Calling len(x) gives me the number of keys in 
the dictionary, not the number of entries. I should have used len(x['entries']).
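(For anyone hitting the same snag: feedparser's result behaves like a dictionary whose top-level keys include 'feed', 'entries', and 'bozo', so len() counts keys, not items. A plain-dict sketch of the same shape makes the difference obvious; the real object is a FeedParserDict, but len() behaves the same way.)

```python
# Stand-in for the dictionary-like object feedparser.parse() returns;
# the keys below mirror feedparser's layout, the values are made up.
parsed = {
    'feed': {'title': 'shared items from Google Reader'},
    'entries': [{'title': 'item %d' % i} for i in range(150)],
    'bozo': 0,
}

print(len(parsed))             # counts top-level keys: 3
print(len(parsed['entries']))  # counts the actual news items: 150
```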

Using the shared URL from my google reader and adding ?n=150, I was able to 
pull all 150 news items saved in my reader folder.

The URL looks like this: 
http://www.google.com/reader/public/atom/user/USERID#HERE/label/HIE?n=155

I'm still not sure how to pull more google news search items from the 
associated rss feed, so if anyone has a clue, I'd appreciate a hint 
(though I suppose we've left python land and entered google land now).

Apologies for the static, 

dk

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Feedparser and google news/google reader

2010-03-11 Thread DK
Alan Gauld  btinternet.com> writes:

> 
> I have no idea if this is relevant and without code I suspect we will all
> be guessing blindly but...
> 
> Have you checked Google's terms of use? I know they make it hard to
> screen scrape their search engine so they may have similar limits on
> their feeds. Just a thought.
> 

I didn't include the code because it's really just two lines, apologies:

import feedparser

feed = 'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&as_scoring=r&as_maxm=3&q=health+information+exchange&as_qdr=a&as_drrb=q&as_mind=8&as_minm=2&cf=all&as_maxd=100&output=rss'

x = feedparser.parse(feed)

len(x)  # <- this always yields 10 items

In the meantime, I'll check the terms of service.
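One way to confirm the cap lives in the feed itself rather than in feedparser is to count the <item> elements directly with the standard library. Here's a sketch using a toy three-item document; a real Google News RSS response has the same shape, just capped at ten items server-side:

```python
import xml.etree.ElementTree as ET

# Toy RSS document standing in for a Google News response.
rss = """<rss version="2.0"><channel>
<title>toy feed</title>
<item><title>one</title></item>
<item><title>two</title></item>
<item><title>three</title></item>
</channel></rss>"""

root = ET.fromstring(rss)
items = root.findall('./channel/item')
print(len(items))  # however many <item> elements the server sent: 3
```

If this count is 10 on the live feed, the limit is Google's, not feedparser's.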





Re: [Tutor] Feedparser and google news/google reader

2010-03-10 Thread Alan Gauld

"David Kim"  wrote

> me CRAZY. I can't seem to pull more than 10 items from a google news
> feed.
>
> For example, I'd like to pull 1000 google news items (using some search
> term, let's say 'lightsabers'). The associated atom feed url, however,
> only holds ten items. And it's hard to do some of the clustering
> exercises with only ten items!


I have no idea if this is relevant and without code I suspect we will all
be guessing blindly but...

Have you checked Google's terms of use? I know they make it hard to
screen scrape their search engine so they may have similar limits on
their feeds. Just a thought.

Alan G.




[Tutor] Feedparser and google news/google reader

2010-03-10 Thread David Kim
I have been working through some of the examples in the Programming
Collective Intelligence book by Toby Segaran. I highly recommend it, btw.

Anyway, one of the simple exercises required is using feedparser to pull in
RSS/Atom feeds from different sources (before doing more interesting
things). The algorithm stuff I pretty much follow, but one thing is driving
me CRAZY. I can't seem to pull more than 10 items from a google news feed.
For example, I'd like to pull 1000 google news items (using some search
term, let's say 'lightsabers'). The associated atom feed url, however, only
holds ten items. And it's hard to do some of the clustering exercises with
only ten items!

Anyway, I imagine this must be a straightforward thing and I'm being a
moron, but I don't know where else to ask this question. I did see some
posts about an n=100 term one can add to the url (the limit seems to be 100
items), but it only seems to affect the webpage view and not the feed. I've
also tried subscribing to the feed in Google Reader and making the feed
public, but I seem to be running into the same problem. Is this a feedparser
thing or a google thing?

The url I'm using is
http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&as_scoring=r&as_maxm=3&q=health+information+exchange&as_qdr=a&as_drrb=q&as_mind=8&as_minm=2&cf=all&as_maxd=100&output=rss
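For what it's worth, the n=100 trick mentioned above can be spliced into a query string programmatically rather than by hand. A sketch with the standard library (urllib.parse is Python 3; on 2.6 the same functions live in urlparse/urllib). Whether Google honors 'n' on the RSS output is exactly the open question here, and 'n' is just the parameter people mention, not a documented API:

```python
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

# A shortened version of the query URL from this message.
url = ('http://news.google.com/news?pz=1&cf=all&ned=us&hl=en'
       '&q=health+information+exchange&output=rss')

parts = urlparse(url)
query = parse_qs(parts.query)
query['n'] = ['100']  # the item-count parameter mentioned in the thread

new_url = urlunparse(parts._replace(query=urlencode(query, doseq=True)))
print(new_url)
```

This keeps every existing parameter intact and just adds (or overwrites) the one you're experimenting with.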

Can anyone help me? I'm tearing my hair out and want to choke my computer.
It's probably not relevant, but I'm running Snow Leopard and Python 2.6
(actually EPD 6.1).