On Mon, Jul 27, 2009 at 11:14 PM, alex finn<fin...@gmail.com> wrote:
>
> Hello,
>
> I'm working on a huge django-based application that heavily utilizes
> full-text search. Up until now I've been using an external search
> service built using SOLR that is deployed separately.
> The problem is with such an approach I need to maintain 2 different
> environments - one for python/django and one for java/solr. It's not a
> big issue but I started to look for an alternative that will allow me
> to build search functionality in the django application. I've found
> several solutions so far:
> 1. pylucene
> 2. sphinx
> 3. whoosh
>
> Now the problem is that I have no real experience with any of those
> and would like to have some advice from those who tried any of the
> options (or maybe there is anything else available?).
> PyLucene seems to be using the java lucene library, which may
> unnecessarily affect the application performance as it "embeds a Java
> VM with Lucene into a Python process".
>
> I found some links on the internet saying that sphinx cannot handle
> non-ascii character sets well, which is critical for me.
>
> Whoosh seems to be a little amateur - did not see any links on real
> world usage.
>
> Did you try one of these options? Or do you know any other solutions
> for the problem?

As one data point - I use Sphinx fairly extensively at work. During
development, I looked at Lucene as well, in the form of Solr (which is
a nice wrapper around Lucene).

Unfortunately, my experience has been that it's not as easy as saying
"which one is better". Both have their advantages, and both have
limitations. It depends upon the feature set that you are looking for
in your searches. In particular, issues like:
 - Frequency with which your corpus will update
 - How much you need to update items in your index
 - The type of queries you want to issue (simple words? documents
between two dates?)
 - The type of aggregate queries you want to issue (e.g., number of
documents with a given author)

Regarding non-ascii character sets - Sphinx can handle them fine, as
long as it is correctly configured. There is documentation, but it is
adequate, not fantastic. A little experimentation will probably be
required before you can be confident you've got things right.
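
As a rough illustration of what that configuration can involve, here is
a sketch of the relevant sphinx.conf directives. It assumes a Sphinx
0.9.x install indexing from MySQL; the source name, index name, path and
table are hypothetical, and the connection settings are left out. The
charset_table shown covers only English and Russian and would need
extending for other scripts.

```
# sphinx.conf fragment (hypothetical names; connection settings omitted)
source documents_src
{
    type          = mysql
    # Ask MySQL to hand the text to the indexer as UTF-8.
    sql_query_pre = SET NAMES utf8
    sql_query     = SELECT id, title, body FROM documents
}

index documents
{
    source        = documents_src
    path          = /var/data/sphinx/documents
    # Treat the incoming text as UTF-8 rather than a single-byte encoding.
    charset_type  = utf-8
    # Characters accepted as part of a word, with upper case folded to lower.
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
}
```

Characters that are missing from charset_table are treated as word
separators, so a table that is too narrow will quietly split non-ascii
words. Searching for a few known non-ascii terms after reindexing is a
quick way to check the setup.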

My recommendation in short is that Sphinx is probably worth a try. It
isn't perfect, and it won't suit every user, but what it does, it does
quite well, and it's a lot less heavyweight than a J2EE server.

Yours,
Russ Magee %-)
