Quite a bit of this is over my head at this point. I should NOT have duplicate fields in the column. I wonder how that affects things.
Dennis Gearon

Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
otherwise we all die.

--- On Fri, 12/17/10, Dyer, James <james.d...@ingrambook.com> wrote:

> From: Dyer, James <james.d...@ingrambook.com>
> Subject: RE: A schema inside a Solr Schema (Schema in a can)
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Friday, December 17, 2010, 9:43 AM
>
> There's also one "gotcha" we've experienced when searching across
> multi-valued fields: SOLR will match across field occurrences. In the
> example below, if you were to search q=contrib_name:(james AND smith),
> you will get this record back. It matches one name from one contributor
> and another name from a different contributor. This is not what our
> users want.
>
> As a work-around, I am converting these to phrase queries with slop:
> "james smith"~50 ... Just use a slop # smaller than your
> positionIncrementGap and bigger than the # of terms entered. This will
> prevent the cross-field matches yet allow the words to occur in any
> order.
>
> The problem with this approach is that Lucene doesn't support wildcards
> in phrases. Unlucky for us, because our app automatically adds a
> wildcard to every term entered in Contributor searching. So when we
> convert to SOLR we will have to disable this "feature" for multi-word
> queries. I experimented with the double metaphone filter (too many
> false positive matches) and edge n-gram filter (could make the index
> very big) to alleviate this loss of functionality. Currently I have it
> set up to index each name as the full name plus the first initial.
> (so "j dyer" would match but not "ja dyer")
> If this is considered not-good-enough, we can probably see about doing
> the edge n-grams several characters out...
>
> If anyone else has any other ideas I should try, please do speak up.
> Thank you.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
> -----Original Message-----
> From: Dyer, James
> Sent: Friday, December 17, 2010 10:59 AM
> To: solr-user@lucene.apache.org
> Subject: RE: A schema inside a Solr Schema (Schema in a can)
>
> Dennis,
>
> I may be misunderstanding your question, but think I've just worked
> through something similar. We're indexing book metadata, and a book can
> have more than one Contributor. We want to store the contributor's
> name, their Role and their id (from our rel db). With our old system,
> we had to do something like this:
>
> contrib: dyer, james|author|123
> contrib: smith, sam|editor|456
>
> But Lucene/Solr will guarantee that multivalued fields return in
> exactly the same order you put them in. So with SOLR we can do this:
>
> contrib_name: dyer, james
> contrib_name: smith, sam
> contrib_role: author
> contrib_role: editor
> contrib_id:123
> contrib_id:456
>
> The trick is to be very careful you put everything in the same order
> (it's easy if it is all from the same SQL query from a relational
> database). If one of the data elements is a NULL you have to use a
> placeholder (like an empty string or a zero).
>
> Another option is to use a dynamic field:
>
> contrib_123: dyer, james
> contrib_456: smith, sam
>
> The problem here is if you want to display and use a fieldlist (fl=),
> you cannot use wildcards (ex: fl=contrib_* doesn't work). Same for
> searching (q=, qf=). You can only use dynamic fields if you know the
> fieldname at runtime you need to deal with.
>
> Both of these options might be more work for your app to deal with than
> the delimiter approach.
> And, in our case, we could stick with the delimiter field and store it
> and then have a separate indexed field that just has the name (as this
> is all we search on). You could even just have 1 field if you used a
> fancy analysis sequence that would only index the element(s) you
> wanted indexed...
>
> James Dyer
>
> -----Original Message-----
> From: Dennis Gearon [mailto:gear...@sbcglobal.net]
> Sent: Friday, December 17, 2010 12:43 AM
> To: solr-user@lucene.apache.org
> Subject: A schema inside a Solr Schema (Schema in a can)
>
> Is it possible to put name value pairs of any type in a native Solr
> Index field type? Like JSON/XML/YML?
>
> The reason that I ask, since you asked, is I want my main index schema
> to be a base object, and another multivalue column to be the attributes
> of base object inherited descendants.
>
> Is there any other way to do this?
>
> What are the limitations in searching and indexing documents with
> multivalue fields?
>
> Dennis Gearon
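
For anyone reading the archive later, here is a rough Python sketch of the two tricks James describes in the quoted mail: keeping parallel multivalued fields in the same order (with placeholders for NULLs), and turning a multi-word query into a phrase query with slop. The function names, the dict layout, and the default gap of 100 are my own assumptions for illustration, not anything from James's app or a Solr client library, and the query builder does no escaping of special characters.

```python
def to_parallel_fields(contributors):
    """Split contributor records into parallel multivalued fields.

    Every record appends exactly one value to each field, so position i
    in each list always refers to the same contributor.
    """
    doc = {"contrib_name": [], "contrib_role": [], "contrib_id": []}
    for c in contributors:
        # Placeholders (empty string / zero) keep the lists aligned
        # when a value is missing (NULL in the database).
        doc["contrib_name"].append(c.get("name") or "")
        doc["contrib_role"].append(c.get("role") or "")
        doc["contrib_id"].append(c.get("id") or 0)
    return doc


def from_parallel_fields(doc):
    """Zip the parallel fields back into one tuple per contributor."""
    return list(zip(doc["contrib_name"], doc["contrib_role"], doc["contrib_id"]))


def slop_phrase_query(field, user_input, position_increment_gap=100, slop=50):
    """Build field:"term1 term2"~slop for multi-word input.

    The slop must be bigger than the number of terms entered and smaller
    than the field's positionIncrementGap (100 here is an assumed value;
    check your own schema).
    """
    terms = user_input.split()
    if not terms:
        return ""
    if len(terms) == 1:
        # A single term cannot match across values, so no phrase is needed.
        return f"{field}:{terms[0]}"
    if not (len(terms) < slop < position_increment_gap):
        raise ValueError("slop must be > number of terms and < positionIncrementGap")
    return f'{field}:"{" ".join(terms)}"~{slop}'


if __name__ == "__main__":
    rows = [
        {"name": "dyer, james", "role": "author", "id": 123},
        {"name": "smith, sam", "role": None, "id": 456},  # NULL role
    ]
    doc = to_parallel_fields(rows)
    print(doc)
    print(from_parallel_fields(doc))
    print(slop_phrase_query("contrib_name", "james smith"))
```

The point of the range check is just to fail loudly if someone sets the slop at or above the gap, since that would bring the cross-contributor matches back.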