On 9/17/2010 7:22 PM, Chris Hostetter wrote:
a) not really.   assuming you have no problem modifying the indexing code
in the way you want, and are primarily worried about searching from
various clients, then the most straightforward approach is probably to
use RewriteRules (or something equivalent) to do regex replacements in your
query strings before solr ever sees them.

That's an interesting idea. I am using haproxy, and it might be able to do that. We don't have various clients; the index is pretty much used only by our web applications. One set of apps (the one we are phasing out) uses code that was actually written for our old search engine's HTTP interface. We hacked together a shim that translates the old query syntax and uses XSLT to reformat Solr's output for it. The other set of apps is Java, using SolrJ.
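For illustration, here is a rough sketch of the kind of regex replacement being suggested, wherever it ends up living (proxy, shim, or application code). It is only a sketch under assumptions: the function name rewrite_query_string is made up, the direction of the rewrite (ft_text to catchall) is my guess at what would be needed while ft_text is gone, and limiting it to the q and fq parameters is also an assumption.

```python
import re
from urllib.parse import parse_qsl, urlencode

# Assumption: queries that still name the ft_text field should be pointed
# at catchall instead. The lookbehind keeps the pattern from matching the
# tail of a longer field name such as my_ft_text.
FIELD_PREFIX = re.compile(r"(?<![\w.])ft_text:")

# Assumption: only these parameters carry query syntax worth rewriting.
QUERY_PARAMS = ("q", "fq")


def rewrite_query_string(query_string, replacement="catchall:"):
    """Return the query string with ft_text: field prefixes replaced.

    The string is decoded first so the regex sees a literal colon
    rather than %3A, then re-encoded afterwards.
    """
    pairs = parse_qsl(query_string, keep_blank_values=True)
    rewritten = [
        (name, FIELD_PREFIX.sub(replacement, value) if name in QUERY_PARAMS else value)
        for name, value in pairs
    ]
    return urlencode(rewritten)


if __name__ == "__main__":
    print(rewrite_query_string("q=ft_text%3Asolr+AND+title%3Afoo&rows=10"))
```

Decoding before matching is the main design point: a rule applied to the raw URL would have to cope with both ":" and "%3A", which is easy to get wrong in a proxy-level rewrite.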

b) i'm not sure if you realize that you can't make your index smaller by
removing a field from your schema -- not unless you also reindex all of
the documents that (used to) have a value in that field.  depending on your
priorities, doing this twice (once to remove ft_text, and then once again
later to add ft_text back and remove catchall) may not be the best use of
your time/resources -- it might be more productive to accelerate your
switch to using dismax, and only do the reindexing once to eliminate your
catchall field.

I do know that I have to reindex. It's a process that only takes about six hours. Afterwards, instead of only a little more than half of each index fitting into the disk cache, it'll be about three quarters. As it might be a few months before we can start effectively using dismax, I'm OK with doing rebuilds twice.

Thanks,
Shawn
