I've run the data through Whoosh, and now the hardest part is culling
the words.
For example, these are the top 10 word counts:
(u'django', 15051),
(u'have', 4066),
(u'your', 3770),
(u'us', 3311),
(u'python', 2738),
(u'some', 2713),
(u'site', 2501),
(u'code', 2359),
(u'like', 2335),
(u'project', 2327),

Any ideas on how to pick out the relevant tags?
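
One possible starting point, as a rough sketch only: drop common English
words with a stop list, then drop words that describe the whole corpus rather
than any single post. The counts are the ones listed above; STOP_WORDS,
TOO_GENERIC, min_length and the cull() helper are all my own assumptions and
are not part of Whoosh or of the django-website code.

```python
# Top-10 counts copied from the list above.
word_counts = [
    ("django", 15051),
    ("have", 4066),
    ("your", 3770),
    ("us", 3311),
    ("python", 2738),
    ("some", 2713),
    ("site", 2501),
    ("code", 2359),
    ("like", 2335),
    ("project", 2327),
]

# Assumption: a tiny hand-made stop list; a real one would be much longer.
STOP_WORDS = frozenset({"have", "your", "us", "some", "like"})

# Assumption: words that apply to nearly every post and so make poor tags.
TOO_GENERIC = frozenset({"django"})


def cull(counts, stop_words=STOP_WORDS, generic=TOO_GENERIC, min_length=3):
    """Return the (word, count) pairs that survive the filters, order kept."""
    return [
        (word, count)
        for word, count in counts
        if len(word) >= min_length
        and word not in stop_words
        and word not in generic
    ]


candidate_tags = cull(word_counts)
print(candidate_tags)
```

With the lists above this leaves python, site, code and project. Whether
"site" or "code" are useful tags is a judgment call, so the two sets would
need tuning by hand against the full word list.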



On Jun 25, 4:36 pm, benny daon <[email protected]> wrote:
> Hi all, I've got a project going with the aim of improving djangoproject.com.
> So far I've forked the original code, cleaned it up, added buildout so
> installation will be a breeze, and added django-south so we can easily
> upgrade the database.
> Jacob KM sent me a link to a dump of the current database, which I included
> in the migration script so the code pulls the dump and uses it to create the
> database and add all the rows. There are almost 5000 rows in the model,
> pointing to Django-related posts. The next step is to extract common tags
> from the title and summary fields of the FeedItem.
> A friend recommended I use Solr or Lucene for this job, which makes sense. My
> issue is that I've never used them before. If you know what needs to be done
> and have some time, please assign this ticket 
> -http://bitbucket.org/daonb/django-website/issue/3/- to yourself, fork the
> code, do it, and send me a 'pull request'.
>
> Thanks,
>
> Benny.
>
> BTW - there's much more to do in this project. Please feel free to open
> tickets with suggestions/bugs or better yet - send code. Jacob said he will
> use it in the live site.