I added your comments to the wiki: http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage?action=diff&version=11
On Jan 3, 2008, at 10:38 AM, Jens Kraemer wrote:

> Hi!
>
> On Wed, Jan 02, 2008 at 02:30:23PM -0500, John Bachir wrote:
>> The documentation* states that when using a single index for multiple
>> models, the default_field list should be set to the same thing for
>> all models.
>>
>> However, in my application, all my models have very different fields
>> and this is not possible. I still want the results returned sorted by
>> term frequency across all indexed content in each model.
>
> Short answer:
>
> It's safe for you to specify the same large :default_field list
> containing fields from all models in all your acts_as_ferret calls.
> aaf doesn't use this list but only hands it through to Ferret's query
> parser which uses it to expand queries that have no fields specified.
>
>> What is the purpose of default_field? Under what multi-model
>> circumstance, if any, is it not necessary to use it?
>
> Long answer:
>
> The default_field option determines which fields Ferret will search
> for when there is no explicit field specified in a query.
>
> Suppose your index has the fields :id and :text (with id being
> untokenized). With an empty default_field value (or '*', which means
> the same), and a :or_default value of false (as aaf sets it) you get
> parsed queries like this:
>
> 'tree'
> --> 'id:tree text:tree'
>
> 'some tree' (meaning some AND tree because or_default == false)
> --> '+(id:some) +(id:tree text:tree)'
>
> With 'some' being a stop word, one would expect the second query to
> yield the same result as the first one, but since the query is run
> against all fields, including :id, which is untokenized and therefore
> has no analyzer, we end up querying our id field with a required term
> query and get no result at all.
>
> I remember there has been some debate about this topic a year ago or
> so, and in theory it would be possible for Ferret to parse queries the
> other way around to work around this issue, but afair Dave brought up
> some good reasons to leave it as it is.
>
> The solution is to tell Ferret which fields to search when no fields
> are specified for a query (or part of a query) with the :default_field
> option. Usually aaf does this automatically by collecting all
> tokenized fields from the model. Now with a shared index there are n
> models but one index, so here we need to have a joint list of all
> tokenized fields across all these models for the :default_field
> parameter.
>
> Since aaf is called in every single model, I didn't find an easy way
> to build this list automatically and decided to leave it up to the
> user to specify this list in the acts_as_ferret calls of every model.
> Not really DRY indeed. Patches welcome ;-)
>
> Here's a small script reproducing the issue:
> http://pastie.caboo.se/134443
>
> So to summarize:
>
> You need to specify :default_field if you're using
> :single_index => true in combination with :or_default => false (aaf
> default) and you have queries that may contain stop words and that are
> not constrained to a list of fields specified in the query string.
>
> Cheers,
> Jens
>
> --
> Jens Krämer
> http://www.jkraemer.net/ - Blog
> http://www.omdb.org/ - The new free film database
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk
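To make the expansion described above concrete, here is a small toy model in plain Ruby. It is only a sketch: it does not call Ferret or acts_as_ferret, and the stop word list, the FIELDS layout, the model names and the expand helper are all made up for illustration. It mimics the two parsed queries from the message and shows how restricting the search to tokenized fields lets the stop word drop out. The last part builds one joint field list for several models sharing an index, which is the list each model would pass as its :default_field option (how exactly that option is passed depends on your aaf version).

```ruby
require 'set'

# Toy model only: this is NOT Ferret's query parser. Stop words, field
# layout and output format are assumptions chosen to mirror the examples
# in the message above.
STOP_WORDS = Set.new(%w[some the a an and or]).freeze

# Hypothetical index layout: :id is untokenized, :text is tokenized.
FIELDS = { id: { tokenized: false }, text: { tokenized: true } }.freeze

# Expand a query that names no fields over +search_fields+, requiring
# every term (the :or_default => false behaviour).
def expand(query, search_fields)
  clauses = query.split.map do |term|
    search_fields
      # Only tokenized fields are analyzed, so only they drop stop words.
      .reject { |f| FIELDS.fetch(f)[:tokenized] && STOP_WORDS.include?(term) }
      .map { |f| "#{f}:#{term}" }
  end
  clauses.reject!(&:empty?)
  return clauses.first.to_a.join(' ') if clauses.size <= 1

  clauses.map { |c| "+(#{c.join(' ')})" }.join(' ')
end

# Searching every field: the stop word survives as a required clause on :id.
puts expand('tree', %i[id text])
puts expand('some tree', %i[id text])

# Searching tokenized fields only: the stop word disappears.
puts expand('some tree', %i[text])

# Joint list for a shared index, built once from hypothetical per-model
# tokenized fields and reused by every model.
TOKENIZED_FIELDS = {
  'Article' => %i[title body],
  'Comment' => %i[body author]
}.freeze
SHARED_DEFAULT_FIELD = TOKENIZED_FIELDS.values.flatten.uniq.freeze
puts SHARED_DEFAULT_FIELD.inspect
```

Keeping the joint list in one constant means each model refers to the same list instead of repeating it, which takes some of the sting out of the "not really DRY" part.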

