RC 1 is done.
This is how it looks now: http://easycaptures.com/1076798580
All changes are in my branch.

On Wed, Jul 29, 2009 at 15:34, Udi h Bauman <[email protected]> wrote:

>
> I was going to suggest YQL term extraction, which is quite good. But
> be sure to catch up on today's news regarding Y! & MS, which makes any
> use of Yahoo!'s APIs a very risky bet.
>
>
> Udi
>
>
> On 7/29/09, Refael <[email protected]> wrote:
> >
> > I think I've hit the jackpot.
> >
> > Using the Yahoo Term Extractor on 20 random articles I get:
> > [(u'django', 14),
> >  (u'google', 4),
> >  (u'python', 3),
> >  (u'oxford', 2),
> >  (u'ruby on rails', 2),
> >  (u'migrations', 2),
> >  (u'apps', 2),
> >  (u'iteration', 2),
> >  (u'snippets', 2),
> >  (u'models', 2),
> >  (u'running', 2),
> >  (u'google maps', 2),
> >  (u'unit tests', 1),
> >  (u'geek night', 1),
> >  (u'ali', 1),
> >  (u'dba', 1),
> >  (u'celebrities', 1),
> >  (u'data models', 1),
> >  (u'vagas para', 1),
> >  (u'admin interface', 1),
> >  (u'nbsp', 1),
> >  (u'password xxx', 1),
> >  (u'internet explorer', 1),
> >  (u'volta', 1),
> >  (u's\xe3o paulo', 1),
> >  (u'long time', 1),
> >  (u'larson', 1),
> >  (u'staging', 1),
> >  (u'capabilities', 1),
> >  (u'blog', 1),
> >  (u'pra valer', 1),
> >  (u'dict', 1),
> >  (u'search software', 1),
> >  (u'advice', 1),
> >  (u'interactive map', 1),
> >  (u'crash', 1),
> >  (u'banco de dados', 1),
> >  (u'keyword arguments', 1),
> >  (u'export library', 1),
> >  (u'core management', 1),
> >  (u'fantasy sport', 1),
> >  (u'submission', 1),
> >  (u'foi', 1),
> >  (u'html javascript', 1),
> >  (u'last time', 1),
> >  (u'cms', 1),
> >  (u'database name', 1),
> >  (u'enthusiasts', 1),
> >  (u'map', 1),
> >  (u'cairo', 1),
> >  (u'creation', 1),
> >  (u'sync', 1),
> >  (u'meta', 1),
> >  (u'sem', 1),
> >  (u'inkscape', 1),
> >  (u'pylons', 1),
> >  (u'pdf export', 1),
> >  (u'abc', 1),
> >  (u'install software', 1),
> >  (u'exit 1', 1),
> >  (u'uma', 1),
> >  (u'irc', 1),
> >  (u'dias', 1),
> >  (u'exercise', 1),
> >  (u'best project', 1),
> >  (u'time one', 1),
> >  (u'reason', 1),
> >  (u'interface', 1),
> >  (u'webapp', 1),
> >  (u'bottom line', 1),
> >  (u'database engine', 1),
> >  (u'friends houses', 1),
> >  (u'looking at the environment', 1),
> >  (u'launch', 1),
> >  (u'content types', 1),
> >  (u'ajax', 1),
> >  (u'discussion groups', 1),
> >  (u'new game', 1),
> >  (u'new features', 1),
> >  (u'aptitude', 1),
> >  (u'para quem', 1),
> >  (u'fun parties', 1),
> >  (u'few days', 1),
> >  (u'jay graves', 1),
> >  (u'interact', 1),
> >  (u'private league', 1),
> >  (u'lot', 1),
> >  (u'hollywood', 1),
> >  (u'checkout', 1),
> >  (u'public presentation', 1),
> >  (u'game model', 1),
> >  (u'fun things', 1),
> >  (u'south project', 1),
> >  (u'slides', 1),
> >  (u'freelancer', 1),
> >  (u'object oriented', 1),
> >  (u'sphinx', 1),
> >  (u'insights', 1),
> >  (u'scratchpad', 1),
> >  (u'initial release', 1),
> >  (u'rio', 1),
> >  (u'super models', 1),
> >  (u'presentation program', 1),
> >  (u'browser', 1),
> >  (u'debugging', 1),
> >  (u'positive reaction', 1),
> >  (u'initial development', 1),
> >  (u'business logic', 1),
> >  (u'representative locator', 1),
> >  (u'traditional fantasy', 1),
> >  (u'implementations', 1),
> >  (u'raw', 1),
> >  (u'absolute url', 1),
> >  (u'o tempo', 1),
> >  (u'technology', 1),
> >  (u'greenpeace', 1),
> >  (u'html css', 1),
> >  (u'pdftk', 1),
> >  (u'line test', 1),
> >  (u'nas', 1),
> >  (u'functionality', 1),
> >  (u'import user', 1),
> >  (u'sqlite3', 1),
> >  (u'webdesigner', 1),
> >  (u'server os', 1),
> >  (u'record time', 1),
> >  (u'quote', 1),
> >  (u'first installment', 1),
> >  (u'test automation conference', 1),
> >  (u'sys', 1),
> >  (u'fantasy game', 1),
> >  (u'pool', 1),
> >  (u'first name last name', 1),
> >  (u'design patterns', 1),
> >  (u'modes', 1),
> >  (u'driven development', 1),
> >  (u'os system', 1),
> >  (u'databases', 1),
> >  (u'output variables', 1),
> >  (u'cookbook', 1)
> > ]
> >
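A tally like the one above could be produced along these lines. This is only a minimal sketch: it assumes the extractor output is already available as one list of strings per article, and that a term counts once per article. The function name `tally_terms` and the sample data are made up for illustration and do not come from the branch.

```python
from collections import Counter


def tally_terms(terms_per_article):
    """Count in how many articles each extracted term appears."""
    counts = Counter()
    for terms in terms_per_article:
        # Lower-case and de-duplicate so a term counts once per article.
        counts.update({t.strip().lower() for t in terms if t.strip()})
    return counts


# Made-up sample standing in for the extractor output of three articles.
sample = [
    ["Django", "unit tests", "python"],
    ["django", "google maps"],
    ["Django", "Python", "django"],
]

top = tally_terms(sample).most_common(2)
print(top)
```

Terms that show up only once (most of the list above) could then be dropped with a simple threshold on the count.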
> > On Jul 13, 7:49 pm, Imri Goldberg <[email protected]> wrote:
> >> My shneckel:
> >> 1. Have a simple cull list (take the 5 minutes to write it, and it will
> >>    do 80% of the work)
> >> 2. Use TF/IDF
> >>
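The two suggestions above (a cull list, then TF/IDF) can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the project: the `CULL` set, the tokenizer, the function names and the three sample titles are all hypothetical, and the weighting is the plain textbook form, term frequency times log(N / document frequency).

```python
import math
import re
from collections import Counter

# Hypothetical cull list; a real one would be longer and hand-tuned.
CULL = {"have", "your", "us", "some", "like", "the", "a", "to", "is", "and", "for"}


def tokenize(text):
    """Lower-case, split into words, and drop anything on the cull list."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in CULL]


def tfidf_tags(docs, top_n=3):
    """Return the top_n highest-scoring words for each document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each word occurs.
    df = Counter()
    for words in tokenized:
        df.update(set(words))
    n = len(docs)
    result = []
    for words in tokenized:
        tf = Counter(words)
        scores = {w: (c / len(words)) * math.log(n / df[w]) for w, c in tf.items()}
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:top_n])
    return result


# Made-up titles standing in for FeedItem title/summary text.
docs = [
    "Django migrations with South",
    "Django unit tests for your models",
    "Django and Google Maps",
]
tags = tfidf_tags(docs, top_n=2)
print(tags)
```

Note that a word occurring in every document gets a score of zero with this weighting. That pushes down filler words that slipped past the cull list, but it pushes down 'django' as well, so a ubiquitous term that is still wanted as a tag would need separate handling.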
> >>
> >>
> >> On Mon, Jul 13, 2009 at 7:02 PM, Refael <[email protected]> wrote:
> >>
> >> > I've run the data through Whoosh, and now the hardest part is to cull
> >> > the words.
> >> > For example these are the top 10 word counts:
> >> > (u'django', 15051),
> >> > (u'have', 4066),
> >> > (u'your', 3770),
> >> > (u'us', 3311),
> >> > (u'python', 2738),
> >> > (u'some', 2713),
> >> > (u'site', 2501),
> >> > (u'code', 2359),
> >> > (u'like', 2335),
> >> > (u'project', 2327),
> >>
> >> > Any ideas how to sort out relevant tags?
> >>
> >> > On Jun 25, 4:36 pm, benny daon <[email protected]> wrote:
> >> > > Hi all, I've got a project going with the aim of improving
> >> > > djangoproject.com.
> >> > > So far I've forked the original code, cleaned it up, added buildout
> >> > > so installation will be a breeze, and added django-south so we can
> >> > > easily upgrade the database.
> >> > > Jacob KM sent me a link to a dump of the current database, which I
> >> > > included in the migration script so the code pulls the dump and uses
> >> > > it to create the database and add all the rows. There are almost 5000
> >> > > rows in the model, pointing to Django-related posts. The next step is
> >> > > to extract common tags from the title and summary fields of the
> >> > > FeedItem.
> >> > > A friend recommended I use Solr or Lucene for this job, which makes
> >> > > sense. My issue is that I have never used them before. If you know
> >> > > what needs to be done and have some time, please assign this ticket -
> >> > > http://bitbucket.org/daonb/django-website/issue/3/ - to yourself, fork
> >> > > the code, do it, and send me a 'pull request'.
> >>
> >> > > Thanks,
> >>
> >> > > Benny.
> >>
> >> > > BTW - there's much more to do in this project. Please feel free to
> >> > > open tickets with suggestions/bugs or better yet - send code. Jacob
> >> > > said he will use it on the live site.
> >>
> >> --
> >> Imri Goldberg
> >> --------------------------------------
> >> www.algorithm.co.il/blogs/
> >> --------------------------------------
> >> -- insert signature here ----

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"PyWeb-IL" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/pyweb-il?hl=en
-~----------~----~----~----~------~----~------~--~---

_______________________________________________
Python-il mailing list
[email protected]
http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
