RE: schema-based Index-time field boosting
Aaaargh! OK, I would like a document with (eg.) a title containing a term to score higher than one with (eg.) a summary containing the same term, all other things being equal. You seem to be arguing against field boosting in general, and I don't understand why! May as well let this drop since we don't seem to be talking about the same thing . . . but thanks anyway, Ian.

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: 30 November 2009 23:05
To: solr-user@lucene.apache.org
Subject: RE: schema-based Index-time field boosting

: I am talking about field boosting rather than document boosting, ie. I
: would like some fields (say eg. title) to be louder than others,
: across ALL documents. I believe you are at least partially talking
: about document boosting, which clearly applies on a per-document basis.

index time boosts are all the same -- it doesn't matter if they are field boosts or document boosts -- a document boost is just a field boost for every field in the document.

: If it helps, consider a schema version of the following, from
: org.apache.solr.common.SolrInputDocument:
:
: /**
:  * Adds a field with the given name, value and boost. If a field with
:  * the name already exists, then it is updated to the new value and boost.
:  *
:  * @param name Name of the field to add
:  * @param value Value of the field
:  * @param boost Boost value for the field
:  */
: public void addField(String name, Object value, float boost) ...
:
: Where a constant boost value is applied consistently to a given field.
: That is what I was mistakenly hoping to achieve in the schema. I still
: think it would be a good idea BTW. Regards,

But now we're right back to what I was trying to explain before: index time boost values like these are only used as a multiplier in the fieldNorm.
When included as part of the document data, you can specify a fieldBoost for fieldX of docA that's greater than the boost for fieldX of docB, and that will make docA score higher than docB when fieldX contains the same number of matches and is the same length -- but if you apply a constant boost of B to fieldX for every doc (which is what a feature to hardcode boosts in schema.xml might give you), then the net effect would be zero when scoring docA and docB, because the fieldNorms for fieldX in both docs would include the exact same multiplier.

-Hoss
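Hoss's point can be sketched numerically. The sketch below is illustrative only, not Lucene's exact norm encoding (real Lucene also quantizes the norm to a byte): fieldNorm is roughly lengthNorm times the index-time field boost, so a constant boost applied to every document scales all scores by the same factor and leaves their ordering untouched.

```python
import math

def field_norm(num_terms, field_boost=1.0):
    """Illustrative fieldNorm: lengthNorm (1/sqrt(terms)) times the
    index-time field boost. Real Lucene also quantizes this to a byte."""
    return (1.0 / math.sqrt(num_terms)) * field_boost

# Per-document boosts differ: docA outranks docB for the same match count.
score_a = 1.0 * field_norm(10, field_boost=2.0)
score_b = 1.0 * field_norm(10, field_boost=1.0)
assert score_a > score_b

# The same constant boost on every document changes nothing relative:
plain   = [field_norm(10), field_norm(10)]
boosted = [field_norm(10, 5.0), field_norm(10, 5.0)]
assert (plain[0] > plain[1]) == (boosted[0] > boosted[1])  # still a tie
```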
Re: schema-based Index-time field boosting
Ian - now you're talking *term* boosting, which is a dynamic query-time factor, not something specified at index time. Here's what I suggest as a starting point for this sort of thing, in Solr request format:

http://localhost:8983/solr/select?defType=dismax&q=apple&qf=name^2+manu

where the term apple is queried against both the name and manu(facturer) fields, and matches in the name field get boosted by a factor of 2. This is using the dismax query parser. Index-time boosts are becoming less and less favorable -- there is rarely any need for them, given the more flexible dynamic control you can have over scoring at query time. And I'm sure Hoss isn't arguing against field boosting, given he's one of the gurus behind the magic of dismax. He's simply saying that if you apply a constant boost to all documents, you've effectively done nothing.

Erik

On Dec 3, 2009, at 3:37 AM, Ian Smith wrote:

Aaaargh! OK, I would like a document with (eg.) a title containing a term to score higher than one on (eg.) a summary containing the same term, all other things being equal. [...]
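Erik's example request can be built programmatically; a minimal sketch with Python's standard urllib, assuming the stock example host and fields from his URL (urlencode takes care of escaping the `^` and the space in the qf value):

```python
from urllib.parse import urlencode

# Mirrors Erik's example: boost "name" matches 2x over "manu" matches
# at query time using the dismax parser. Host and fields are from the
# stock Solr example, not a real deployment.
params = {
    "defType": "dismax",
    "q": "apple",
    "qf": "name^2 manu",   # per-field query-time boosts
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```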
Behavior of filter query
Hi, I am working on Solr 1.4 and am trying to implement a multi-select feature on my site. I don't want the faceting counts, only results. I tried different variations of my query:

http://localhost:8080/solr/select?q=*:*&fq=product_category:Mobiles+%2Bproperty_make:(Nokia%20OR%20Sony-Ericsson)+%2Bproperty_bodyType:(CandyBar%20OR%20Slider)
Result: 47

http://localhost:8080/solr/select?q=Mobiles&qt=store&fq=property_make:(Nokia%20OR%20Sony-Ericsson)&fq=property_bodyType:(CandyBar%20OR%20Slider)
Result: 47

http://localhost:8080/solr/select?q=Mobiles&qt=store&fq=property_make:(Nokia%20OR%20Sony-Ericsson)+%2Bproperty_bodyType:(CandyBar%20OR%20Slider)
Result: 148

The problem is that, to the best of my knowledge, all three queries should give the same results. Why is the third one behaving differently? By the way, in the third query the result set includes entries that match either clause, i.e. it is behaving like OR, not AND. Also, if anyone could tell me which is the best performance-wise...
For more insight:

Schema:

<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

<field name="family_id" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="product_category" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="img_url" type="string" indexed="false" stored="true"/>
<field name="title" type="text" indexed="true" stored="true"/>
<field name="price" type="float" stored="true"/>
<field name="sdp_url" type="string" indexed="false" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="false"/>
<field name="property_keywords" type="string" stored="false" indexed="true" multiValued="true"/>
<field name="property_features" type="string" stored="true" indexed="true" multiValued="true"/>
<dynamicField name="property_*" type="string" stored="true" indexed="true"/>
<field name="text" type="lowercase" indexed="true" stored="false" multiValued="true"/>
<copyField source="property_*" dest="text"/>

Handler:

<requestHandler name="store" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <str name="qf">product_category</str>
    <str name="mm">100%</str>
  </lst>
  <lst name="appends">
    <str name="fq">inStock:true</str>
  </lst>
</requestHandler>

I am looking for a quick response; any kind of help would be highly appreciated. Regards, Gunjan

-- View this message in context: http://old.nabble.com/Behavior-of-filter-query-tp26623237p26623237.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Behavior of filter query
Guess I found the problem. Using fq=%2B bla blo seems to fix it. Can anyone tell me why it is necessary? Also, which query is the best among them all?

-Gunjan

gunjan_versata wrote:

Hi, I am working on Solr 1.4 and am trying to implement a multi-select feature on my site. I don't want the faceting counts, only results. [...]

-- View this message in context: http://old.nabble.com/Behavior-of-filter-query-tp26623237p26623470.html Sent from the Solr - User mailing list archive at Nabble.com.
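The likely reason the %2B is needed (my reading, not stated in the thread): in a URL query string, a literal `+` is decoded as a space by the servlet container, so Lucene's required operator silently disappears from the fq value unless it is sent percent-encoded as %2B. A sketch with Python's urllib:

```python
from urllib.parse import unquote_plus, quote

# In a URL query string, a literal '+' decodes to a space, so the
# Lucene "required" operator vanishes from the fq value. %2B survives
# decoding as a literal '+':
raw = "property_make:(Nokia)+%2Bproperty_bodyType:(Slider)"
decoded = unquote_plus(raw)
print(decoded)  # property_make:(Nokia) +property_bodyType:(Slider)

# Encoding a filter query safely from its unescaped form ('+' becomes
# %2B, spaces become %20; parens and colons left readable):
fq = "+property_make:(Nokia OR Sony-Ericsson) +property_bodyType:(CandyBar OR Slider)"
encoded = quote(fq, safe="():")
```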
How to instruct MoreLikeThisHandler to sort results
Hi Folks, is there any way to instruct MoreLikeThisHandler to sort results? I noticed that the MLT handler recognizes faceting parameters, among others, but it ignores the sort parameter. Best, Sascha
Facet query with special characters
Hello, I've encountered some strange behaviour in Solr facet querying, and I've not been able to find anything on this on the web. Perhaps someone can shed some light on it?

The problem: when performing a facet query where part of the value portion contains a special character (a minus sign in this case), the query returns zero results unless I put a wildcard (*) at the end. Here is my query:

This produces zero 'numFound':
http://localhost:8983/solr/select/?wt=xml&indent=on&rows=20&q=((signature:3083 AND host:pds-comp.domain)) AND _time:[091119124039 TO 091203124039]&facet=true&facet.field=host&facet.field=sourcetype&facet.field=user&facet.field=signature

This produces 28 'numFound':
http://localhost:8983/solr/select/?wt=xml&indent=on&rows=20&q=((signature:3083 AND host:pds-comp.domain*)) AND _time:[091119124039 TO 091203124039]&facet=true&facet.field=host&facet.field=sourcetype&facet.field=user&facet.field=signature

(Note: all hit results are for <host>pds-comp.domain</host> -- there are no other characters in the resulting field values.)

I've tried escaping the minus sign in various ways, encoding etc., but nothing seems to work. Can anyone help? Many thanks, Peter
Re: Retrieving large num of docs
Hi Hoss, I was experimenting with various queries to solve this problem, and in one such test I remember that requesting only the ID did not change the retrieval time. To be sure, I tested it again using the curl command today, and it confirms my previous observation. Also, enableLazyFieldLoading is set to true in my solrconfig.

Another general observation (off topic) is that having a moderately large multivalued text field (~200 entries) in the index seems to slow down the search significantly. I removed the 2 multivalued text fields from my index and my search got ~10 times faster. :)

- Raghu

On Thu, Dec 3, 2009 at 2:14 AM, Chris Hostetter hossman_luc...@fucit.org wrote:

: I think I solved the problem of retrieving 300 docs per request for now. The
: problem was that I was storing 2 moderately large multivalued text fields
: though I was not retrieving them during search time. I reindexed all my
: data without storing these fields. Now the response time (time for Solr to
: return the http response) is very close to the QTime Solr is showing in the

Hmmm, two comments:

1) the example URL from your previous mail...

: http://localhost:1212/solr/select/?rows=300&q=%28ResumeAllText%3A%28%28%28%22java+j2ee%22+%28java+j2ee%29%29%29%5E4%29%5E1.0%29&start=0&wt=python

...doesn't match your earlier statement that you are only returning the id field (there is no fl param in that URL) ... are you certain you weren't returning those large stored fields in the response?

2) assuming you were actually using an fl param to limit the fields, make sure you have this setting in your solrconfig.xml...

<enableLazyFieldLoading>true</enableLazyFieldLoading>

...that should make it pretty fast to return only a few fields of each document, even if you do have some large stored fields that aren't being returned.

-Hoss
weird behavior between 2 environments
I have 2 environments; one works great for this query:

my OS X environment:
http://localhost:8983/solr/select?q=countryName:%22Bosnia%20and%20Herzegovina%22 - returns 2 results

my Linux environment:
http://localhost:8983/solr/select?q=countryName:%22Bosnia%20and%20Herzegovina%22 - returns 0 results

Same configs, same index etc., both using Solr 1.4. In the Linux env, if I run this query: /solr/select?q=id:96465437

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="q">id:96465437</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      ...
      <str name="countryName">Bosnia and Herzegovina</str>
    </doc>
  </result>
</response>

So the records are in the index. I checked the admin; they are indexed using the same type (text), and I cannot see any differences. Any idea why it works on one env and not the other? Anything I can check in admin to get to the bottom of this? thanks Joel
Problem with searching with first capital letter
I have a problem with Solr searching. When I search for the query dog*, everything is OK, but when the query is Dog* (with a capital first letter), I get no results. Any advice? My config:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

-- View this message in context: http://old.nabble.com/Problem-with-searching-with-first-capital-letter-tp26627677p26627677.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: weird behavior between 2 environments
Are you querying both systems from the same browser / client? Try adding debugQuery=true and see if the query parses the same for both (it could be the browser/client doing extra escaping or something).

-Yonik
http://www.lucidimagination.com

On Thu, Dec 3, 2009 at 10:14 AM, Joel Nylund jnyl...@yahoo.com wrote:

I have 2 environments; one works great for this query... [...]
Re: schema-based Index-time field boosting
An index-time boost means this document is a better answer, regardless of the query. To weight title matches higher than summary matches, use field boosts at query time. They work great. There is no need to modify Solr to get that behavior.

wunder

On Dec 3, 2009, at 12:37 AM, Ian Smith wrote:

Aaaargh! OK, I would like a document with (eg.) a title containing a term to score higher than one on (eg.) a summary containing the same term, all other things being equal. [...]
RE: Facet query with special characters
Hello Solr Forum, I believe I have found a solution (workaround?) for performing an explicit (non-wildcarded) field query with values that contain special (escaped) characters. Instead of:

field:value-with-escape-chars

change this to:

field:[value-with-escape-chars TO value-with-escape-chars]

(Note that for SolrJ, using QueryParser.escape() ultimately turns this into:

field:[\"value\-with\-escape\-chars\" TO \"value\-with\-escape\-chars\"])

If the value being queried has no special characters (e.g. host:localhost), the above is not necessary, which leads me to believe this is more of a workaround than the 'supported way'. Please do correct me/clarify if you know differently, or know of a better/more efficient method. In early tests with 200,000+ hits, there appears to be no performance hit for using the range form. Not sure whether this affects performance for millions+ hits. Thanks, Peter

From: pete...@hotmail.com
To: solr-user@lucene.apache.org
Subject: Facet query with special characters
Date: Thu, 3 Dec 2009 13:29:45 +

Hello, I've encountered some strange behaviour in Solr facet querying, and I've not been able to find anything on this on the web. [...]
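Peter mentions SolrJ's QueryParser.escape(). For readers not on Java, a rough Python equivalent is sketched below; the character list follows the Lucene 2.x query-parser syntax documentation, and this is an approximation, not the library's exact implementation (Lucene treats && and || as two-character operators but escapes each character individually anyway):

```python
import re

def escape_query_value(value: str) -> str:
    """Backslash-escape Lucene query-parser special characters, similar
    in spirit to org.apache.lucene.queryParser.QueryParser.escape().
    Character list per the Lucene 2.x docs: + - & | ! ( ) { } [ ] ^ " ~ * ? : \\"""
    return re.sub(r'([+\-&|!(){}\[\]^"~*?:\\])', r'\\\1', value)

print(escape_query_value("pds-comp.domain"))  # pds\-comp.domain
```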
Re: weird behavior between 2 environments
The schemas probably aren't the same. Looks like one has position increments enabled for the stopword filter in the field type, and one doesn't.

-Yonik
http://www.lucidimagination.com

On Thu, Dec 3, 2009 at 11:00 AM, Joel Nylund jnyl...@yahoo.com wrote:

same client. Here are the debug results; something interesting is going on. I don't understand solr/lucene well enough to understand it -- see below.

not working env (linux):

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2</int>
    <lst name="params">
      <str name="debugQuery">true</str>
      <str name="q">countryName:"Bosnia and Herzegovina"</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="debug">
    <str name="rawquerystring">countryName:"Bosnia and Herzegovina"</str>
    <str name="querystring">countryName:"Bosnia and Herzegovina"</str>
    <str name="parsedquery">PhraseQuery(countryName:"bosnia herzegovina")</str>
    <str name="parsedquery_toString">countryName:"bosnia herzegovina"</str>
    <lst name="explain"/>
    <str name="QParser">LuceneQParser</str>
    <lst name="timing">
      <double name="time">2.0</double>
      <lst name="prepare">
        <double name="time">1.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">1.0</double></lst>
        <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
      </lst>
      <lst name="process">
        <double name="time">1.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
      </lst>
    </lst>
  </lst>
</response>

working env (osx):

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">54</int>
    <lst name="params">
      <str name="q">countryName:"Bosnia and Herzegovina"</str>
      <str name="debugQuery">true</str>
    </lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="countryName">Bosnia and Herzegovina</str>
      <str name="id">83964763</str>
    </doc>
    <doc>
      <str name="countryName">Bosnia and Herzegovina</str>
      <str name="id">96465437</str>
    </doc>
  </result>
  <lst name="debug">
    <str name="rawquerystring">countryName:"Bosnia and Herzegovina"</str>
    <str name="querystring">countryName:"Bosnia and Herzegovina"</str>
    <str name="parsedquery">PhraseQuery(countryName:"bosnia ? herzegovina")</str>
    <str name="parsedquery_toString">countryName:"bosnia ? herzegovina"</str>
    <lst name="explain">
      <str name="83964763">
        15.619301 = fieldWeight(countryName:"bosnia herzegovina" in 260955), product of:
          1.0 = tf(phraseFreq=1.0)
          24.990881 = idf(countryName: bosnia=2 herzegovina=2)
          0.625 = fieldNorm(field=countryName, doc=260955)
      </str>
      <str name="96465437">
        15.619301 = fieldWeight(countryName:"bosnia herzegovina" in 275091), product of:
          1.0 = tf(phraseFreq=1.0)
          24.990881 = idf(countryName: bosnia=2 herzegovina=2)
          0.625 = fieldNorm(field=countryName, doc=275091)
      </str>
    </lst>
    <str name="QParser">LuceneQParser</str>
    <lst name="timing">
      <double name="time">53.0</double>
      <lst name="prepare">
        <double name="time">24.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
      </lst>
      <lst name="process">
        <double name="time">27.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
        <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">27.0</double></lst>
      </lst>
    </lst>
  </lst>
</response>

On Dec 3,
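The difference Yonik points at can be simulated outside Solr. With position increments enabled on the stopword filter, removing "and" leaves a positional hole between the surrounding tokens, so index-side and query-side positions no longer line up if only one side has the setting. The sketch below is a toy model of token positions, not Lucene's actual matcher:

```python
def analyze(text, stopwords, enable_position_increments):
    """Toy analyzer: lowercase, drop stopwords, track token positions."""
    out, pos = [], 0
    for word in text.lower().split():
        if word in stopwords:
            if enable_position_increments:
                pos += 1            # leave a hole where the stopword was
            continue
        out.append((word, pos))
        pos += 1
    return out

stops = {"and"}
with_inc = analyze("Bosnia and Herzegovina", stops, True)
without_inc = analyze("Bosnia and Herzegovina", stops, False)
print(with_inc)     # [('bosnia', 0), ('herzegovina', 2)]
print(without_inc)  # [('bosnia', 0), ('herzegovina', 1)]
# A phrase query needs query-side and index-side positions to agree,
# so mixing the two settings between index and query yields 0 hits --
# matching the "bosnia ? herzegovina" vs "bosnia herzegovina" parses
# in the two debug outputs above.
```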
replacing all the Field names with a default string of the Document
Hello All, I am in a situation where I need to replace all the field names (with a default string VALUE) in a document before it is indexed. But I need the document as it is (with unchanged field names) for all the other parts, like analysis, payload assigning, etc., so I cannot modify the document before it is sent to IndexWriter. I am wondering whether there is any place where the field names can be replaced just before they are added to the index, but after all the analysis work is done. Thanks.
Re: Problem with searching with first capital letter
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

"On wildcard and fuzzy searches, no text analysis is performed on the search word."

HTH
Erick

On Thu, Dec 3, 2009 at 10:19 AM, Yurish yuris...@inbox.lv wrote:

I have a problem with Solr searching. When I search for the query dog*, everything is OK, but when the query is Dog* (with a capital first letter), I get no results. Any advice? [...]
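Since wildcard terms bypass analysis, a common client-side workaround (an assumption here, not something stated in the thread) is to lowercase the prefix yourself before building the query, matching what LowerCaseFilterFactory did at index time:

```python
def prepare_wildcard_query(field, prefix):
    """Lowercase a wildcard prefix client-side, since Solr/Lucene skip
    text analysis for wildcard terms (so 'Dog*' never matches the
    lowercased indexed token 'dog')."""
    return f"{field}:{prefix.lower()}*"

print(prepare_wildcard_query("title", "Dog"))  # title:dog*
```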
Issues with alphanumeric search terms
Hi, my Solr deployment gives correct results for normal search terms like john. But when I search for john55 or 55, it returns all documents, including those which contain neither john nor 55. Below is the fieldtype defined for this field:

<fieldType name="mytype" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

Are there any other tokenizers or filters that need to be set for alphanumeric/number search?

-- View this message in context: http://old.nabble.com/Issues-with-alphanumeric-search-terms-tp26629048p26629048.html Sent from the Solr - User mailing list archive at Nabble.com.
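A likely cause (my reading, not confirmed in the thread): LowerCaseTokenizer only emits runs of letters, so digits are silently discarded at both index and query time. A rough emulation (ASCII-only here, whereas Lucene uses Character.isLetter):

```python
import re

def lowercase_tokenize(text):
    """Emulates Lucene's LowerCaseTokenizer: emit maximal runs of
    letters, lowercased; digits and punctuation act as token breaks
    and are dropped entirely. (ASCII-only approximation.)"""
    return [m.group(0).lower() for m in re.finditer(r"[A-Za-z]+", text)]

print(lowercase_tokenize("john"))    # ['john']
print(lowercase_tokenize("john55"))  # ['john']  -- the 55 is gone
print(lowercase_tokenize("55"))      # []        -- nothing to match
# A query that analyzes to no terms at all can degenerate to matching
# everything, which would explain "55" returning all documents.
```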
Re: weird behavior between 2 environments
thanks, that was it

Joel

On Dec 3, 2009, at 11:06 AM, Yonik Seeley wrote:

The schemas probably aren't the same. Looks like one has position increments enabled for the stopword filter in the field type, and one doesn't. -Yonik http://www.lucidimagination.com [...]
debugging javascript DIH
is there a way to print to std out or anything from my javascript DIH transformer? thanks Joel
RE: no error delta fail with DataImportHandler
Unfortunately that isn't it. I have tried id, product_id, and PRODUCT_ID, and they all produce the same result. It finds the modified item, but then does nothing. INFO: Running ModifiedRowKey() for Entity: product Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Creating a connection for entity product with URL: jdbc:oracle:oci:@dev.eline.com Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Time taken for getConnection(): 283 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta INFO: Completed ModifiedRowKey for Entity: product rows obtained : 1 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta INFO: Completed DeletedRowKey for Entity: product rows obtained : 0 Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta INFO: Completed parentDeltaQuery for Entity: product Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder doDelta INFO: Delta Import completed successfully Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder execute INFO: Time taken = 0:0:0.404 From: noble.p...@corp.aol.com Date: Thu, 3 Dec 2009 12:50:15 +0530 Subject: Re: no error delta fail with DataImportHandler To: solr-user@lucene.apache.org the deltaQuery select 'product_id' and your deltaImportQuery uses ${dataimporter.delta.id} I guess it should have been ${dataimporter.delta. product_id} On Wed, Dec 2, 2009 at 11:52 PM, Thomas Woodard gtfo...@hotmail.com wrote: I'm trying to get delta indexing set up. My configuration allows a full index no problem, but when I create a test delta of a single record, the delta import finds the record but then does nothing. I can only assume I have something subtly wrong with my configuration, but according to the wiki, my configuration should be valid. 
What I am trying to do is have a single delta detected on the top level entity trigger a rebuild of everything under that entity, the same as the first example in the wiki. Any help would be greatly appreciated. dataConfig dataSource name=prodcat driver=oracle.jdbc.driver.OracleDriver url=jdbc:oracle:oci:@XXX user=XXX password=XXX autoCommit=false transactionIsolation=TRANSACTION_READ_COMMITTED/ document entity name=product dataSource=prodcat query= select dp.product_id, dp.display_name, dp.long_description, gp.orientation from dcs_product dp, gl_product gp where dp.product_id = gp.product_id transformer=ClobTransformer,HTMLStripTransformer deltaImportQuery=select dp.product_id, dp.display_name, dp.long_description, gp.orientation from dcs_product dp, gl_product gp where dp.product_id = gp.product_id AND dp.product_id = '${dataimporter.delta.id}' deltaQuery=select product_id from gl_product_modified where last_modified > TO_DATE('${dataimporter.last_index_time}', '-mm-dd hh:mi:ss') rootEntity=false pk=PRODUCT_ID !-- COLUMN NAMES ARE CASE SENSITIVE.
THEY NEED TO BE ALL CAPS OR EVERYTHING FAILS -- field column=PRODUCT_ID name=product_id/ field column=DISPLAY_NAME name=name/ field column=LONG_DESCRIPTION name=long_description clob=true stripHTML=true / field column=ORIENTATION name=orientation/ entity name=sku dataSource=prodcat query=select ds.sku_id, ds.sku_type, ds.on_sale, '${product.PRODUCT_ID}' || '_' || ds.sku_id as unique_id from dcs_prd_chldsku dpc, dcs_sku ds where dpc.product_id = '${product.PRODUCT_ID}' and dpc.sku_id = ds.sku_id rootEntity=true pk=PRODUCT_ID, SKU_ID field column=SKU_ID name=sku_id/ field column=SKU_TYPE name=sku_type/ field column=ON_SALE name=on_sale/ field column=UNIQUE_ID name=unique_id/ entity name=catalog dataSource=prodcat query=select pc.catalog_id from gl_prd_catalog pc, gl_sku_catalog sc where pc.product_id = '${product.PRODUCT_ID}' and sc.sku_id = '${sku.SKU_ID}' and pc.catalog_id = sc.catalog_id pk=SKU_ID, CATALOG_ID field column=CATALOG_ID name=catalogs/ /entity entity name=price dataSource=prodcat query=select ds.list_price as price from dcs_sku ds where ds.sku_id = '${sku.SKU_ID}' and ds.on_sale = 0 UNION select ds.sale_price as price from dcs_sku ds where ds.sku_id = '${sku.SKU_ID}' and ds.on_sale = 1 pk=SKU_ID
Concurrent Merge Scheduler MaxThread Count
I'm having trouble getting Solr to use more than one thread during index optimizations. I have the following in my solrconfig.xml: mergeScheduler class=org.apache.lucene.index.ConcurrentMergeScheduler int name=maxThreadCount6/int /mergeScheduler I had the same problem some time ago, but upgrading to Solr 1.4 solved the problem. Now it's happening again, with Solr 1.4. No matter what I set in the maxThreadCount, I only see one thread performing the merge when I profile the application. Any idea what could be wrong? Here are the details on my Solr deployment: Solr Specification Version: 1.4.0.2009.10.14.08.05.59 Solr Implementation Version: nightly exported - yonik - 2009-10-14 08:05:59 Lucene Specification Version: 2.9.1-dev Lucene Implementation Version: 2.9.1-dev 824988 - 2009-10-13 21:47:13 Thanks, Gio.
Re: How to instruct MoreLikeThisHandler to sort results
I had opened a Jira issue and submitted a patch for this: https://issues.apache.org/jira/browse/SOLR-1545 Bill On Thu, Dec 3, 2009 at 7:47 AM, Sascha Szott sz...@zib.de wrote: Hi Folks, is there any way to instruct MoreLikeThisHandler to sort results? I noticed that MLTHandler recognizes faceting parameters, among others, but it ignores the sort parameter. Best, Sascha
Re: deleteById without solrj?
http://wiki.apache.org/solr/UpdateXmlMessages#A.22delete.22_by_ID_and_by_Query On Thu, Dec 3, 2009 at 11:57 AM, Joel Nylund jnyl...@yahoo.com wrote: Is there a url based approach to delete a document? thanks Joel
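For reference, the wiki page above boils down to POSTing a small XML message to the update handler. A minimal sketch of those messages (the update URL and the document id are hypothetical, and a separate commit message is still needed afterwards):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical update URL -- adjust host/port/core for your deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"

def delete_by_id_xml(doc_id):
    """XML message that deletes one document by its uniqueKey value."""
    return "<delete><id>%s</id></delete>" % escape(doc_id)

def delete_by_query_xml(query):
    """XML message that deletes every document matching a query."""
    return "<delete><query>%s</query></delete>" % escape(query)

# These strings would be POSTed to SOLR_UPDATE_URL with
# Content-Type: text/xml, followed by a <commit/> message, e.g.:
#   curl "$SOLR_UPDATE_URL" --data-binary '<delete><id>42</id></delete>' \
#        -H 'Content-Type: text/xml'
msg = delete_by_id_xml("42")
print(msg)
ET.fromstring(msg)  # sanity check: the message is well-formed XML
```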
Re: java.lang.NumberFormatException: For input string:
its strange i had a dismaxhandler and it had an empty value for ps field i added a default value like 100 and the error disappeared. markrmiller wrote: Can you share the config files? darniz wrote: Hello All, i am getting this exception when i start solr. when i use hte original schema file and config file it is fine, but when we put our own schema file it gives the error. i made sure we dont have any documents in our index. Still we get this error, any idea Cant figure out which field is causing the problem SEVERE: java.lang.NumberFormatException: For input string: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Integer.parseInt(Integer.java:468) at java.lang.Integer.valueOf(Integer.java:553) at org.apache.solr.common.util.DOMUtil.addToNamedList(DOMUtil.java:132) at org.apache.solr.common.util.DOMUtil.nodesToNamedList(DOMUtil.java:98) at org.apache.solr.common.util.DOMUtil.childNodesToNamedList(DOMUtil.java:88) at org.apache.solr.common.util.DOMUtil.addToNamedList(DOMUtil.java:142) at org.apache.solr.common.util.DOMUtil.nodesToNamedList(DOMUtil.java:98) at org.apache.solr.common.util.DOMUtil.childNodesToNamedList(DOMUtil.java:88) at org.apache.solr.core.PluginInfo.init(PluginInfo.java:54) at org.apache.solr.core.SolrConfig.readPluginInfos(SolrConfig.java:220) at org.apache.solr.core.SolrConfig.loadPluginInfo(SolrConfig.java:212) at org.apache.solr.core.SolrConfig.init(SolrConfig.java:184) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594) at org.mortbay.jetty.servlet.Context.startContext(Context.java:139) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218) 
at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117) at org.mortbay.jetty.Server.doStart(Server.java:210) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.mortbay.start.Main.invokeMain(Main.java:183) at org.mortbay.start.Main.start(Main.java:497) at org.mortbay.start.Main.main(Main.java:115) -- View this message in context: http://old.nabble.com/java.lang.NumberFormatException%3A-For-input-string%3A-%22%22-tp26631247p26632600.html Sent from the Solr - User mailing list archive at Nabble.com.
SnapPuller executing on master?
Hello, If I'm on a master (Solr 1.4), why would I see the following in the logs, if I'm running the master Solr JVM with -Denable.master=true: SEVERE: Master at: http://solrbox:8080/solr/foocore/replication is not available. Index fetch failed. Exception: Read timed out I checked the sources, and this comes from SnapPuller. SnapPuller is a slave thing. Why would this be executed on the master? Thanks, Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
Re: dismax query syntax to replace standard query
I believe you need to use the fq parameter with dismax (not to be confused with qf) to add a filter query in addition to the q parameter. So your text search value goes in q parameter (which searches on the fields you configure) and the rest of the query goes in the fq. Would that work? On Thu, Dec 3, 2009 at 7:28 PM, javaxmlsoapdev vika...@yahoo.com wrote: I have configured dismax handler to search against both title description fields now I have some other attributes on the page e.g. status, name etc. On the search page I have three fields for user to input search values 1)Free text search field (which searchs against both title description) 2)Status (multi select dropdown) 3)name(single select dropdown) I want to form query like textField1:value AND status:(Male OR Female) AND name:abc. I know first (textField1:value searchs against both title description as that's how I have configured dixmax in the configuration) but not sure how I can AND other attributes (in my case status name) note; standadquery looks like following (w/o using dixmax handler) title:testdescription:testname:JoestatusName:(Male OR Female) -- View this message in context: http://old.nabble.com/dismax-query-syntax-to-replace-standard-query-tp26631725p26631725.html Sent from the Solr - User mailing list archive at Nabble.com.
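To make the suggestion above concrete, here is a sketch of how such a request could be assembled. The field names (status, name) come from the poster's message; the handler name is an assumption about their solrconfig.xml:

```python
from urllib.parse import urlencode, parse_qs

# The free-text value goes in q (dismax searches its configured qf fields);
# each structured constraint goes in its own fq parameter.
# qt=dismax assumes a handler with that name exists in solrconfig.xml.
params = [
    ("q", "test"),
    ("qt", "dismax"),
    ("fq", "status:(Male OR Female)"),
    ("fq", "name:abc"),
]
query_string = urlencode(params)
print("/solr/select?" + query_string)

# Multiple fq parameters are ANDed together by Solr, which gives the
# "free text AND status AND name" behaviour the poster is after.
```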
Re: Retrieving large num of docs
Hm, hm, interesting. I was looking into something like this the other day (BIG indexed+stored text fields). After seeing enableLazyFieldLoading=true in solrconfig and after seeing fl didn't include those big fields, I thought: hm, so Lucene/Solr will not be pulling those large fields from disk, OK. You are saying that this may not be true based on your experiment? And what I'm calling your experiment means that you reindexed the same data, but without the 2 multi-valued text fields... and that was the only change you made and got roughly a 10x search performance improvement? Sorry for repeating your words, just trying to confirm and understand. Thanks, Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: Raghuveer Kancherla raghuveer.kanche...@aplopio.com To: solr-user@lucene.apache.org Sent: Thu, December 3, 2009 8:43:16 AM Subject: Re: Retrieving large num of docs Hi Hoss, I was experimenting with various queries to solve this problem and in one such test I remember that requesting only the ID did not change the retrieval time. To be sure, I tested it again using the curl command today and it confirms my previous observation. Also, the enableLazyFieldLoading setting is set to true in my solrconfig. Another general observation (off topic) is that having a moderately large multi valued text field (~200 entries) in the index seems to slow down the search significantly. I removed the 2 multi valued text fields from my index and my search got ~10 times faster. :) - Raghu On Thu, Dec 3, 2009 at 2:14 AM, Chris Hostetter wrote: : I think I solved the problem of retrieving 300 docs per request for now. The : problem was that I was storing 2 moderately large multivalued text fields : though I was not retrieving them during search time. I reindexed all my : data without storing these fields.
Now the response time (time for Solr to return the http response) is very close to the QTime Solr is showing in the ... Hmmm, two comments: 1) the example URL from your previous mail... http://localhost:1212/solr/select/?rows=300q=%28ResumeAllText%3A%28%28%28%22java+j2ee%22+%28java+j2ee%29%29%29%5E4%29%5E1.0%29start=0wt=python ...doesn't match your earlier statement that you are only returning the id field (there is no fl param in that URL) ... are you certain you weren't returning those large stored fields in the response? 2) assuming you were actually using an fl param to limit the fields, make sure you have enableLazyFieldLoading set to true in your solrconfig.xml ...that should make it pretty fast to return only a few fields of each document, even if you do have some jumbo stored fields that aren't being returned. -Hoss
Stopping Starting
Hello All I am just starting out today with solr and looking for some advice, but I first have a problem. I ran the start command, ie. user:~/solr/example$ java -jar start.jar Which worked perfectly, and I started to explore the interface. But my terminal window dropped and it has stopped working. If I try to restart it I'm getting errors and it's still not working. Error like: 2009-12-03 21:55:41.785::WARN: EXCEPTION java.net.BindException: Address already in use So how can I stop and restart the service? Hope you can help get me going again. Thank you Lee
Re: dismax query syntax to replace standard query
For more complex queries (including full well-formed lucene queries), there's the newly committed extended dismax parser in 1.5-dev (trunk) http://issues.apache.org/jira/browse/SOLR-1553 -Yonik http://www.lucidimagination.com On Thu, Dec 3, 2009 at 4:14 PM, Ian Sugar iansu...@gmail.com wrote: I believe you need to use the fq parameter with dismax (not to be confused with qf) to add a filter query in addition to the q parameter. So your text search value goes in q parameter (which searches on the fields you configure) and the rest of the query goes in the fq. Would that work? On Thu, Dec 3, 2009 at 7:28 PM, javaxmlsoapdev vika...@yahoo.com wrote: I have configured dismax handler to search against both title description fields now I have some other attributes on the page e.g. status, name etc. On the search page I have three fields for user to input search values 1)Free text search field (which searchs against both title description) 2)Status (multi select dropdown) 3)name(single select dropdown) I want to form query like textField1:value AND status:(Male OR Female) AND name:abc. I know first (textField1:value searchs against both title description as that's how I have configured dixmax in the configuration) but not sure how I can AND other attributes (in my case status name) note; standadquery looks like following (w/o using dixmax handler) title:testdescription:testname:JoestatusName:(Male OR Female) -- View this message in context: http://old.nabble.com/dismax-query-syntax-to-replace-standard-query-tp26631725p26631725.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stopping Starting
On Thu, Dec 3, 2009 at 4:57 PM, Lee Smith l...@weblee.co.uk wrote: Hello All I am just starting out today with solr and looking for some advice but I first have a problem. I ran the start command ie. user:~/solr/example$ java -jar start.jar Which worked perfect and started to explore the interface. But my terminal window dropped and I it has stopped working. If i try and restart it Im getting errors and its still not working. error like: 2009-12-03 21:55:41.785::WARN: EXCEPTION java.net.BindException: Address already in use So how can I stop and restart the service ? Try and find the java process and kill it? ps -elf | grep java kill pid If no other Java processes are running under user, then killall java is a quick way to do it (Linux has killall... not sure about other systems). -Yonik http://www.lucidimagination.com
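The BindException means the old JVM is still holding Jetty's listen port, which is why the second start.jar fails. A toy reproduction of the condition with plain sockets (no Solr involved) shows the same failure mode:

```python
import errno
import socket

# The first socket plays the role of the old Jetty JVM still holding the port.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))          # port 0 -> the OS picks a free port
first.listen(1)
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))  # same port -> fails like Jetty's startup
    in_use = False
except OSError as e:
    in_use = (e.errno == errno.EADDRINUSE)
finally:
    second.close()
    first.close()  # killing the old process frees the port the same way

print("port was in use:", in_use)
```

Killing the leftover java process (as suggested above) releases the port, after which start.jar binds normally.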
Re: Windows 7 / Java 64bit / solr 1.4 - solr.solr.home problem
I just tested this on trunk and it works. Crazy. I'm on Vista 64 Java 1.6.0_14-b08. I have had security configuration problems and disabled some of it. You might try removing the unpacked solr from example/work/. Have you tried the full path? c:/nginx/... I run this from trunk/example. What directory is solr/jetty? On Wed, Dec 2, 2009 at 1:17 AM, Vladan Popovic vladanpopo...@be-o.com wrote: Hi, I installed solr 1.4 on Windows7 64bit with Java 1.6.0_17-b04 64bit and when I run the command: java -Dsolr.solr.home=multicore -jar start.jar I get the following error message: PS C:\nginx\solr\jetty java -Dsolr.solr.home=c:\nginx\solr\solr\multicore -jar start.jar Exception in thread main java.lang.NoClassDefFoundError: /solr/home=c:\nginx\solr\solr\multicore Caused by: java.lang.ClassNotFoundException: .solr.home=c:\nginx\solr\solr\multicore at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClassInternal(Unknown Source) Could not find the main class: .solr.home=c:\nginx\solr\solr\multicore. Program will exit. This is what I did to try to solve the problem but failed: 1. I googled quite a lot but didn't see anyone with a similar problem. 2. I tried running java in compatibility modes Vista XP and running it as administrator but none of it worked. 3. I installed Java 32 bit but it didn't solve the problem 4. I tested on Windows Vista 32 bit and Mac OSX snow tiger and it works without a problem 5. I tried to analyze the java log files on Windows 7 x64 but there was nothing there ... Basically whatever home other than the default (solr/) doesn't work. Has anyone else encountered this problem? and would anyone have a clue how to trouble shoot Any help would be appreciated. 
Best regards Vladan -- Lance Norskog goks...@gmail.com
Re: question about schemas
You can make a separate facet field which contains a range of buckets: 10, 20, 50, or 100 means that the field has a value 0-10, 11-20, 21-50, or 51-100. You could use a separate filter query with values for these buckets. Filter queries are very fast in Solr 1.4 and this would limit your range query execution to documents which match the buckets. But, in general, this is a shopping cart database and Solr/Lucene may not be the best fit for this problem. If you want to do numerical analysis on your shopping carts, check out KNIME: www.knime.org . It's wonderful. On Wed, Dec 2, 2009 at 8:38 AM, gdeconto gerald.deco...@topproducer.com wrote: I dont believe there is any way to link values in one multivalue field to values in other multivalue fields. Re where each doc contains the customer info and info for ALL products that the customer might have (likely done via dynamicfields): one thing you might want to consider is that this solution might lead to performance issues if you need to do range queries such as q=((Width1:[50 TO *] AND Density1:[7 to *]) OR (Width2:[50 TO *] AND Density2:[7 TO *]) OR …) I had a similar problem a while back, and basically had similar options. In my tests, this particular option became slower as I increased the number of products (and so the number of unique values for each product field). If you come up with a solution, let me know. also, another option might be to encode the product information (ie using a field delimiter, something like CSV) and then storing it into a multivalue field for each customer. I dont know how you would search that data tho (maybe by having a unique delimiter for each field?) -- View this message in context: http://old.nabble.com/question-about-schemas-tp26600956p26611997.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
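The bucketing described above can be sketched as follows; the bucket edges are the ones from the example (10, 20, 50, 100), and the field name width_bucket is made up for illustration:

```python
# Any monotone list of edges works; these match the example in the thread.
BUCKETS = [10, 20, 50, 100]

def bucket_label(value):
    """Map a numeric value to the smallest bucket edge that covers it."""
    for edge in BUCKETS:
        if value <= edge:
            return str(edge)
    return "%d+" % BUCKETS[-1]

# At index time the label would be stored in its own field, e.g.
#   doc["width_bucket"] = bucket_label(width)
# At query time a cheap filter query pre-restricts the range query, e.g.
#   fq=width_bucket:(50 OR 100)&q=width:[50 TO *]
print(bucket_label(7), bucket_label(42), bucket_label(250))
```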
Re: how is score computed with hsin functionquery?
If you use the DataImportHandler you can add your own Javascript code to do the degree-radian conversion. On Wed, Dec 2, 2009 at 8:54 AM, gdeconto gerald.deco...@topproducer.com wrote: Grant Ingersoll-6 wrote: ... Yep. Also note that I added deg() and rad() functions, but for the most part is probably better to do the conversion during indexing. ... as it is not possible for me to convert my data from deg to rad during import (since queries are done using degrees), and manually putting in rad() stmts into my hsin query seems awkward, I was looking at ways to have solr do it for me. One idea I had was to leverage the existing hsin code (see below). My problem is that HaversineFunction expects ValueSource and I do not see a way to convert my radian values back to ValueSource (pls excuse my ignorance on this). Any help/ideas appreciated public class MyValueSourceParser extends ValueSourceParser { public ValueSource parse(FunctionQParser fp) throws ParseException { ValueSource source = fp.parseValueSource(); ValueSource x1 = fp.parseValueSource(); ValueSource y1 = fp.parseValueSource(); ValueSource x2 = fp.parseValueSource(); ValueSource y2 = fp.parseValueSource(); double radius = fp.parseDouble(); return HaversineFunction(Math.toRadians(x1), Math.toRadians(y1), Math.toRadians(x2), Math.toRadians(y2), radius); // ** how do I convert the rad param values to ValueSource ** } } -- View this message in context: http://old.nabble.com/how-is-score-computed-with-hsin-functionquery--tp26504265p26612289.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
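For reference, the conversion being discussed is simply Math.toRadians applied to each coordinate before the haversine formula runs. A numeric sketch of that (standard haversine; the 6371 km Earth radius is an assumption):

```python
import math

def haversine_km(lat1_deg, lon1_deg, lat2_deg, lon2_deg, radius_km=6371.0):
    """Great-circle distance. Inputs are in degrees and are converted to
    radians first -- the step the poster wants folded into the ValueSource."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (lat1_deg, lon1_deg, lat2_deg, lon2_deg))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Same point -> zero; one degree of latitude is roughly 111 km.
print(haversine_km(45.0, 10.0, 45.0, 10.0))
print(round(haversine_km(45.0, 10.0, 46.0, 10.0), 1))
```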
Re: how is score computed with hsin functionquery?
Lance Norskog-2 wrote: If you use the DataImportHandler you can add your own Javascript code to do the degree-radian conversion. Thx Lance, but I am not sure what you mean -- View this message in context: http://old.nabble.com/how-is-score-computed-with-hsin-functionquery--tp26504265p26634948.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Webinar: An Introduction to Basics of Search and Relevancy with Apache Solr hosted by Lucid Imagination
All of the Lucid webinars are available: http://www.lucidimagination.com/Community/Hear-from-the-Experts/Podcasts-and-Videos On Wed, Dec 2, 2009 at 2:14 PM, Sulman sulmansar...@gmail.com wrote: Yeah, screen cast or video of talk will be good for those who missed it. Highly recommend it.. Paul Rosen wrote: Is there, or will there be, a screencast of this available? I'm sorry to have missed it. Tom Hill wrote: In this introductory technical presentation, renowned search expert Mark Bennett, CTO of Search Consultancy New Idea Engineering, will present practical tips and examples to help you quickly get productive with Solr, including: * Working with the web command line and controlling your inputs and outputs * Understanding the DISMAX parser * Using the Explain output to tune your results relevance * Using the Schema browser Wednesday, December 2, 2009 11:00am PST / 2:00pm EST Click here to sign up: http://www.eventsvc.com/lucidimagination/120209?trk=WR-DEC2009-AP -- View this message in context: http://old.nabble.com/Webinar%3A-An-Introduction-to-Basics-of-Search-and-Relevancy-with--Apache-Solr-hosted-by-Lucid-Imagination-tp26487883p26617451.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
Re: High add/delete rate and index fragmentation
#1: Yes, compared to relational DBs, Solr/Lucene in general are biased towards slow indexing and fast queries. It automatically merges segments and keeps fragmentation down. The rate of merging can be controlled. #2: The standard architecture is with a master that only does indexing and one or more slaves that only handle queries. The slaves poll the master for index updates regularly. Solr 1.4 has a built-in system for this. #3: The standard architecture puts the query servers behind a load balancer. It's the load balancer's job to watch for query servers coming on and off line. An alternate architecture has multiple servers which do both indexing and queries in the same index. This provides the shortest pipeline time from receiving the data to making it available for search. On Wed, Dec 2, 2009 at 2:43 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Rodrigo, It sounds like you're asking about near realtime search support, I'm not sure. So here's a few ideas. #1 How often do you need to be able to search on the latest updates (as opposed to updates from lets say, 10 minutes ago)? To topic #2, Solr provides master slave replication. The optimize would happen on the master and the new index files replicated to the slave(s). #3 is a mixed bag at this point, and there is no official solution, yet. Shell scripts, and a load balancer could kind of work. Check out SOLR-1277 or SOLR-1395 for progress along these lines. Jason On Wed, Dec 2, 2009 at 11:53 AM, Rodrigo De Castro rodr...@sacaluta.com wrote: We are considering Solr to store events which will be added and deleted from the index at a very fast rate. Solr will be used, in this case, to find the right event we need to process (since they may have several attributes and we may search the best match based on the query attributes).
Our understanding is that the common use cases are those wherein the read rate is much higher than writes, and deletes are not as frequent, so we are not sure Solr handles our use case very well or if it is the right fit. Given that, I have a few questions: 1 - How does Solr/Lucene degrade with the fragmentation? That would probably determine the rate at which we would need to optimize the index. I presume that it depends on the rate of insertions and deletions, but would you have any benchmark on this degradation? Or, in general, how has been your experience with this use case? 2 - Optimizing seems to be a very expensive process. While optimizing the index, how much does search performance degrade? In this case, having a huge degradation would not allow us to optimize unless we switch to another copy of the index while optimize is running. 3 - In terms of high availability, what has been your experience detecting failure of master and having a slave taking over? Thanks, Rodrigo -- Lance Norskog goks...@gmail.com
Re: Behavior of filter query
The second makes 2 large docsets and ands them together against the query. The third makes the intersection of the two filter queries, The third will use the least memory and run the fastest. The first is the slowest because it has to find each docid and check it against the filter query docset. I think. On Thu, Dec 3, 2009 at 1:55 AM, gunjan_versata gunjanga...@gmail.com wrote: Guess I found the problem.. By using fq=%2B bla blo seems to be fixing the problem... Can anyone tell me why is it necessary? also, which query is the best amongst all? -Gunjan gunjan_versata wrote: Hi, I am working on SOLR 1.4. And am trying to implement the multi select feature on my site. I dont want the faceting counts but only results. I tried diff variations in my query : They are: http://localhost:8080/solr/select?q=*:*fq=product_category:Mobiles+%2Bproperty_make:(Nokia%20OR%20Sony-Ericsson)+%2Bproperty_bodyType:(CandyBar%20OR%20Slider) Result : 47 http://localhost:8080/solr/select?q=Mobilesqt=storefq=property_make:(Nokia%20OR%20Sony-Ericsson)fq=property_bodyType:(CandyBar%20OR%20Slider) Result : 47 http://localhost:8080/solr/select?q=Mobilesqt=storefq=property_make:(Nokia%20OR%20Sony-Ericsson)+%2Bproperty_bodyType:(CandyBar%20OR%20Slider) Result : 148 Now problem is, on the basis of my knowledge, I expected all three queries to give out the same results. Then, why is 3rd one behaving differently? By the way, in the third query, the result set includes entries which have either of them, ie it is behaving like OR, not AND. Also, if anyone could tell me, performance wise which is the best... 
For more insight: Schema:

<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

<field name="family_id" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="product_category" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="img_url" type="string" indexed="false" stored="true"/>
<field name="title" type="text" indexed="true" stored="true"/>
<field name="price" type="float" stored="true"/>
<field name="sdp_url" type="string" indexed="false" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="false"/>
<field name="property_keywords" type="string" stored="false" indexed="true" multiValued="true"/>
<field name="property_features" type="string" stored="true" indexed="true" multiValued="true"/>
<dynamicField name="property_*" type="string" stored="true" indexed="true"/>
<field name="text" type="lowercase" indexed="true" stored="false" multiValued="true"/>
<copyField source="property_*" dest="text"/>

Handler:

<requestHandler name="store" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <str name="qf">product_category</str>
    <str name="mm">100%</str>
  </lst>
  <lst name="appends">
    <str name="fq">inStock:true</str>
  </lst>
</requestHandler>

I am looking out for a quick response. Any kind of help would be highly appreciated. Regards, Gunjan -- View this message in context: http://old.nabble.com/Behavior-of-filter-query-tp26623237p26623470.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
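One plausible reading of the OR-like behaviour of the third query (my interpretation, assuming the default OR operator): in a Lucene boolean query, a clause without a + (%2B) prefix is only SHOULD, and once any MUST clause is present the SHOULD clauses stop constraining the result set, so the un-prefixed property_make clause no longer intersects. A toy Python sketch of those clause semantics (not Lucene code; the documents and predicates are made up for illustration):

```python
# Toy model of Lucene BooleanQuery MUST/SHOULD semantics (illustration only).
# With the default OR operator, a plain clause is SHOULD (optional scoring hint)
# once any other clause is MUST; a clause prefixed with + (%2B in a URL) is MUST.

def matches(doc, must, should):
    """A doc matches if every MUST clause holds; when there are no MUST
    clauses, at least one SHOULD clause must hold instead."""
    if not all(pred(doc) for pred in must):
        return False
    if must:
        return True          # SHOULD clauses only influence scoring here
    return any(pred(doc) for pred in should)

docs = [
    {"make": "Nokia", "bodyType": "CandyBar"},   # satisfies both clauses
    {"make": "Apple", "bodyType": "Slider"},     # satisfies only bodyType
]

make_ok = lambda d: d["make"] in ("Nokia", "Sony-Ericsson")
body_ok = lambda d: d["bodyType"] in ("CandyBar", "Slider")

# fq=make:(...) +bodyType:(...)  ->  make is SHOULD, bodyType is MUST
mixed = [d for d in docs if matches(d, must=[body_ok], should=[make_ok])]
# fq=+make:(...) +bodyType:(...) ->  both MUST, a true intersection
both  = [d for d in docs if matches(d, must=[make_ok, body_ok], should=[])]

print(len(mixed), len(both))  # 2 1 -- the mixed form admits the Apple doc too
```

This would also explain why prefixing every clause with %2B restored the intersection: it turns every clause into MUST.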
Re: Facet query with special characters
Backslash is the escape character for colon, hyphen, parentheses, brackets, curly braces, tilde, caret. (Did I miss any?) On Thu, Dec 3, 2009 at 7:59 AM, Peter 4U pete...@hotmail.com wrote: Hello Solr Forum, I believe I have found a solution (workaround?) for performing an explicit (non-wildcarded) field query with values that contain special (escaped) characters. Instead of: field:value-with-escape-chars change this to: field:[value-with-escape-chars TO value-with-escape-chars] (Note that for SolrJ, use QueryParser.escape() to ultimately turn this into: field:[\value\-with\-escape\-chars\ TO \value\-with\-escape\-chars\]) If the value being queried has no special characters (e.g. host:localhost), the above is not necessary, which leads me to believe this is more of a workaround than the 'supported way'. Please do correct me/clarify if you know differently, or know of a better/more efficient method. In early tests with 200,000+ hits, there appears to be no performance hit for using the range form. Not sure if this affects performance for millions+ hits. Thanks, Peter From: pete...@hotmail.com To: solr-user@lucene.apache.org Subject: Facet query with special characters Date: Thu, 3 Dec 2009 13:29:45 + Hello, I've encountered some strange behaviour in Solr facet querying, and I've not been able to find anything on this on the web. Perhaps someone can shed some light on this? The problem: When performing a facet query where part of the value portion has a special character (a minus sign in this case), the query returns zero results unless I put a wildcard (*) at the end.
Here is my query: This produces zero 'numFound':

http://localhost:8983/solr/select/?wt=xml&indent=on&rows=20&q=((signature:3083 AND host:pds-comp.domain)) AND _time:[091119124039 TO 091203124039]&facet=true&facet.field=host&facet.field=sourcetype&facet.field=user&facet.field=signature

This produces 28 'numFound':

http://localhost:8983/solr/select/?wt=xml&indent=on&rows=20&q=((signature:3083 AND host:pds-comp.domain*)) AND _time:[091119124039 TO 091203124039]&facet=true&facet.field=host&facet.field=sourcetype&facet.field=user&facet.field=signature

(Note: all hit results are for <host>pds-comp.domain</host> - there are no other characters in the resulting field values) I've tried escaping the minus sign in various ways, encoding etc., but nothing seems to work. Can anyone help? Many thanks, Peter _ Add your Gmail and Yahoo! Mail email accounts into Hotmail - it's easy http://clk.atdmt.com/UKM/go/186394592/direct/01/ -- Lance Norskog goks...@gmail.com
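For readers without SolrJ's QueryParser.escape() at hand, a minimal client-side escaping sketch in Python (the character list follows the Lucene query-parser specials mentioned above; verify it against your Lucene version, as the exact set has varied across releases):

```python
# Hedged sketch of client-side escaping for Lucene/Solr query syntax, similar
# in spirit to Lucene's QueryParser.escape() (Java). Not an official API.

LUCENE_SPECIALS = set('\\+-!():^[]"{}~*?|&')

def escape(value: str) -> str:
    """Backslash-escape every Lucene query-parser special character."""
    return ''.join('\\' + ch if ch in LUCENE_SPECIALS else ch for ch in value)

print(escape("pds-comp.domain"))  # pds\-comp.domain
print(escape("a:b(c)"))           # a\:b\(c\)
```

With the hyphen escaped this way, the plain field:value form may work without resorting to the range-query workaround, though the range form remains a valid fallback.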
Re: how is score computed with hsin functionquery?
http://wiki.apache.org/solr/DataImportHandler http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer The DIH is a tool inside solr that scripts pulling documents from different data sources, transforming them and then indexing them. On Thu, Dec 3, 2009 at 3:10 PM, gdeconto gerald.deco...@topproducer.com wrote: Lance Norskog-2 wrote: If you use the DataImportHandler you can add your own Javascript code to do the degree-radian conversion. Thx Lance, but I am not sure what you mean -- View this message in context: http://old.nabble.com/how-is-score-computed-with-hsin-functionquery--tp26504265p26634948.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
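For context, the conversion such a ScriptTransformer would perform per row is just degrees × π/180. In DIH this would be a few lines of JavaScript inside data-config.xml; the arithmetic itself, sketched in Python:

```python
import math

# The per-row conversion a DIH ScriptTransformer would apply: degrees -> radians.
# (Sketch only; field and function names here are illustrative, not DIH API.)
def to_radians(degrees: float) -> float:
    return degrees * math.pi / 180.0

print(to_radians(180.0))  # 3.141592653589793
```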
Re: Windows 7 / Java 64bit / solr 1.4 - solr.solr.home problem
I tried it on Vista 32, Java 1.6.0_17-b04, and it works without a problem. Actually on all other computers in the office there is no problem - I am the only one using Windows 7. I did try with the full path and it didn't work as well. Here's the result:

Directory: C:\nginx\solr

Mode    LastWriteTime       Length  Name
----    -------------       ------  ----
d       12/2/2009 10:40 AM          etc
d       12/2/2009 10:40 AM          lib
d       12/2/2009 12:02 PM          logs
d       12/4/2009 9:25 AM           multicore
d       12/2/2009 12:02 PM          solr
d       12/2/2009 10:40 AM          webapps
-a---   10/26/2009 4:27 PM  16411   start.jar

PS C:\nginx\solr> java -Dsolr.solr.home=c:/nginx/solr/multicore -jar start.jar
Exception in thread "main" java.lang.NoClassDefFoundError: /solr/home=c:/nginx/solr/multicore
Caused by: java.lang.ClassNotFoundException: .solr.home=c:.nginx.solr.multicore
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
Could not find the main class: .solr.home=c:/nginx/solr/multicore. Program will exit.
PS C:\nginx\solr>

The jetty folder that you asked for was actually a config where I tried to move files around to check the dependencies - however this is certainly not the problem because that same config works smoothly on Vista. I also did try to remove everything and reinstall from scratch (solr, java) but still the same problem. I removed the example/work folder but still the same. Lance Norskog wrote: I just tested this on trunk and it works. Crazy. I'm on Vista 64, Java 1.6.0_14-b08. I have had security configuration problems and disabled some of it. You might try removing the unpacked solr from example/work/. Have you tried the full path? c:/nginx/... I run this from trunk/example. What directory is solr/jetty?
On Wed, Dec 2, 2009 at 1:17 AM, Vladan Popovic vladanpopo...@be-o.com wrote: Hi, I installed solr 1.4 on Windows 7 64bit with Java 1.6.0_17-b04 64bit, and when I run the command: java -Dsolr.solr.home=multicore -jar start.jar I get the following error message:

PS C:\nginx\solr\jetty> java -Dsolr.solr.home=c:\nginx\solr\solr\multicore -jar start.jar
Exception in thread "main" java.lang.NoClassDefFoundError: /solr/home=c:\nginx\solr\solr\multicore
Caused by: java.lang.ClassNotFoundException: .solr.home=c:\nginx\solr\solr\multicore
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
Could not find the main class: .solr.home=c:\nginx\solr\solr\multicore. Program will exit.

This is what I did to try to solve the problem, but failed:
1. I googled quite a lot but didn't see anyone with a similar problem.
2. I tried running java in compatibility modes (Vista, XP) and running it as administrator, but none of it worked.
3. I installed 32-bit Java but it didn't solve the problem.
4. I tested on Windows Vista 32 bit and Mac OS X Snow Leopard and it works without a problem.
5. I tried to analyze the java log files on Windows 7 x64 but there was nothing there ...

Basically, any solr home other than the default (solr/) doesn't work. Has anyone else encountered this problem? And would anyone have a clue how to troubleshoot it? Any help would be appreciated. Best regards Vladan
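The PS C:\...> prompt indicates PowerShell, and a ClassNotFoundException naming ".solr.home=..." suggests the shell broke the unquoted -D token apart before java ever saw it (my reading; the thread itself never reaches a resolution). A hedged workaround sketch: quote the whole option so PowerShell passes it through intact:

```shell
# Workaround sketch, assuming the 'PS C:\...' prompt above is PowerShell,
# which can split an unquoted -Dname=value token into pieces:
java "-Dsolr.solr.home=c:/nginx/solr/multicore" -jar start.jar
```

From cmd.exe the unquoted form is passed through as one argument, which would explain why the identical command line works on the other machines.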
Document Decay
Hi, I'm looking for a way to have the score of documents decay over time. I want older documents to have a lower score than newer documents. I noted the ReciprocalFloatFunction class. In an example it seemed to be doing just this when you set the function to: recip(ms(NOW,mydatefield),3.16e-11,1,1) This is supposed to degrade the score to half its value if mydatefield is 1 year older than the current date. My question is: is it making the document score go down to 0.5, or is it making the document score half of its original value? I.e., if the document has score 0.8, will the score be 0.4 or 0.5 after using this function? Also, are there better alternatives to deal with document decay? Thanks, Stephen
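The arithmetic behind the quoted example is easy to check by hand: recip(x,m,a,b) computes a/(m*x+b), and with m = 3.16e-11, a = b = 1 and x the document age in milliseconds, m*x reaches 1 at roughly one year, giving 0.5. A quick sketch:

```python
# recip(x, m, a, b) = a / (m*x + b), the formula behind Solr's
# ReciprocalFloatFunction. With m = 3.16e-11 and a = b = 1, the value is
# ~1.0 for a brand-new document and ~0.5 at one year (one year ~ 3.16e10 ms).

def recip(x: float, m: float, a: float, b: float) -> float:
    return a / (m * x + b)

MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # ~3.156e10 ms

print(recip(0, 3.16e-11, 1, 1))            # 1.0 (fresh document)
print(recip(MS_PER_YEAR, 3.16e-11, 1, 1))  # ~0.5 (one year old)
```

As for whether the 0.8 document ends up at 0.4 or 0.5: that depends on how the function is combined with the relevance score. Used as a multiplicative boost it scales the existing score (0.8 → ~0.4); used as the query's only scoring function, its raw value (~0.5) is the score. This interpretation is mine, not confirmed in the thread.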
Re: Issues with alphanumeric search terms
h, I don't think you want LowerCaseTokenizerFactory.. from: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LowerCaseTokenizerFactory Creates org.apache.lucene.analysis.LowerCaseTokenizer. Creates tokens by lowercasing all letters and dropping non-letters. Example: I can't == i, can, t also see: http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/analysis/LowerCaseTokenizer.html This seems consistent with this part of your debug query: str name=rawquerystring(phone: 650 AND rowtype:contacts)/str str name=querystring(phone:650 AND rowtype:contacts)/str str name=parsedquery+rowtype:contacts/str str name=parsedquery_toString+rowtype:contacts/str Note that the number portion of your original query is completely missing from the parsed query... How do you want your input tokenized? Maybe you want a WhitespaceTokenizer and a LowerCase *filter*? HTH Erick On Thu, Dec 3, 2009 at 2:05 PM, con convo...@gmail.com wrote: Yes. I meant all the indexed documents. With debugQuery=on, i got the following result: response - lst name=responseHeader int name=status0/int int name=QTime1/int - lst name=params str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q(phone:650 AND rowtype:contacts)/str str name=wtxml/str str name=rows1/str str name=version2.2/str /lst /lst - result name=response numFound=104 start=0 - doc str name=ADDRESS /str str name=CITY /str str name=COUNTRY /str date name=CREATEDTIME2009-09-22T06:50:36.943Z/date str name=NAMEAdam/str str name=emaila...@abc.com/str str name=firstnameAdam/str str name=lastnamesmith/str str name=localeen_US/str str name=phone /str str name=rowtypecontacts/str /doc /result - lst name=debug str name=rawquerystring(phone:650 AND rowtype:contacts)/str str name=querystring(phone:650 AND rowtype:contacts)/str str name=parsedquery+rowtype:contacts/str str name=parsedquery_toString+rowtype:contacts/str - lst name=explain - str name=1030422en_US 0.99043053 = (MATCH) fieldWeight(rowtype:contacts in 0), 
product of: 1.0 = tf(termFreq(rowtype:contacts)=1) 0.99043053 = idf(docFreq=104, maxDocs=104) 1.0 = fieldNorm(field=rowtype, doc=0) /str /lst str name=QParserLuceneQParser/str - lst name=timing double name=time1.0/double - lst name=prepare double name=time0.0/double - lst name=org.apache.solr.handler.component.QueryComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.FacetComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.MoreLikeThisComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.HighlightComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.StatsComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.DebugComponent double name=time0.0/double /lst /lst - lst name=process double name=time1.0/double - lst name=org.apache.solr.handler.component.QueryComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.FacetComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.MoreLikeThisComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.HighlightComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.StatsComponent double name=time0.0/double /lst - lst name=org.apache.solr.handler.component.DebugComponent double name=time1.0/double /lst /lst /lst /lst /response Erick Erickson wrote: Hmmm, what does debugQuery=on show? And did you mean documents here? it will return all the search terms Best Erick On Thu, Dec 3, 2009 at 11:40 AM, con convo...@gmail.com wrote: Hi My solr deployment is giving correct results for normal search terms like john. But when i search with john55 or 55 it will return all the search terms, including those which neither contains john nor 55. Below is the fieldtype defined for this field. 
<fieldType name="mytype" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

Are there any other tokenizers or filters that need to be set for alphanumeric/number search? -- View this message in context: http://old.nabble.com/Issues-with-alphanumeric-search-terms-tp26629048p26629048.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Issues-with-alphanumeric-search-terms-tp26629048p26631343.html Sent from the Solr - User mailing list archive at Nabble.com.
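Erick's point - that LowerCaseTokenizer drops non-letters, so the digits in "john55" and "650" never reach the index or the query - can be illustrated with a toy model (ASCII-only Python sketch, not the actual Lucene tokenizers, which work on Unicode letter classes):

```python
import re

# Toy models of the two analysis chains under discussion (illustration only).
# LowerCaseTokenizer keeps runs of letters and drops everything else;
# a whitespace tokenizer plus a lowercase filter preserves the digits.

def lowercase_tokenizer(text):
    """Approximates solr.LowerCaseTokenizerFactory: letters only, lowercased."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def whitespace_then_lowercase(text):
    """Approximates WhitespaceTokenizer + LowerCaseFilter."""
    return [t.lower() for t in text.split()]

print(lowercase_tokenizer("john55"))        # ['john'] -- the 55 is gone
print(whitespace_then_lowercase("john55"))  # ['john55']
print(lowercase_tokenizer("650"))           # [] -- nothing left to match
```

An empty token stream for "650" is also consistent with the debug output above, where the phone:650 clause vanishes entirely from the parsed query.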
Re: Stopping Starting
On Thu, Dec 3, 2009 at 5:01 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Thu, Dec 3, 2009 at 4:57 PM, Lee Smith l...@weblee.co.uk wrote: Hello All, I am just starting out today with solr and looking for some advice, but I first have a problem. I ran the start command, i.e. user:~/solr/example$ java -jar start.jar which worked perfectly, and I started to explore the interface. But my terminal window dropped and it has stopped working. If I try to restart it I'm getting errors and it's still not working. Errors like: 2009-12-03 21:55:41.785::WARN: EXCEPTION java.net.BindException: Address already in use So how can I stop and restart the service? Try to find the java process and kill it: ps -elf | grep java kill <pid> If no other Java processes are running under user, then killall java is a quick way to do it (Linux has killall... not sure about other systems). -Yonik http://www.lucidimagination.com On Ubuntu, CentOS and some other Linux distros, you can run this command: pkill -f start.jar OR pkill -f java if there are no other java processes running -- Good Enough is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
Re: no error delta fail with DataImportHandler
Probably you can try out this: http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta It may give you more info on what is happening. On Thu, Dec 3, 2009 at 10:58 PM, Thomas Woodard gtfo...@hotmail.com wrote: Unfortunately that isn't it. I have tried id, product_id, and PRODUCT_ID, and they all produce the same result. It finds the modified item, but then does nothing.

INFO: Running ModifiedRowKey() for Entity: product
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity product with URL: jdbc:oracle:oci:@dev.eline.com
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 283
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: product rows obtained : 1
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: product rows obtained : 0
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: product
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
Dec 3, 2009 9:25:25 AM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.404

From: noble.p...@corp.aol.com Date: Thu, 3 Dec 2009 12:50:15 +0530 Subject: Re: no error delta fail with DataImportHandler To: solr-user@lucene.apache.org The deltaQuery selects 'product_id' and your deltaImportQuery uses ${dataimporter.delta.id}. I guess it should have been ${dataimporter.delta.product_id}. On Wed, Dec 2, 2009 at 11:52 PM, Thomas Woodard gtfo...@hotmail.com wrote: I'm trying to get delta indexing set up. My configuration allows a full index no problem, but when I create a test delta of a single record, the delta import finds the record but then does nothing.
I can only assume I have something subtly wrong with my configuration, but according to the wiki, my configuration should be valid. What I am trying to do is have a single delta detected on the top level entity trigger a rebuild of everything under that entity, the same as the first example in the wiki. Any help would be greatly appreciated.

<dataConfig>
  <dataSource name="prodcat" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:oci:@XXX"
      user="XXX" password="XXX" autoCommit="false" transactionIsolation="TRANSACTION_READ_COMMITTED"/>
  <document>
    <entity name="product" dataSource="prodcat"
        query="select dp.product_id, dp.display_name, dp.long_description, gp.orientation
               from dcs_product dp, gl_product gp
               where dp.product_id = gp.product_id"
        transformer="ClobTransformer,HTMLStripTransformer"
        deltaImportQuery="select dp.product_id, dp.display_name, dp.long_description, gp.orientation
               from dcs_product dp, gl_product gp
               where dp.product_id = gp.product_id
               AND dp.product_id = '${dataimporter.delta.id}'"
        deltaQuery="select product_id from gl_product_modified
               where last_modified > TO_DATE('${dataimporter.last_index_time}', '-mm-dd hh:mi:ss')"
        rootEntity="false"
        pk="PRODUCT_ID">
      <!-- COLUMN NAMES ARE CASE SENSITIVE.
           THEY NEED TO BE ALL CAPS OR EVERYTHING FAILS -->
      <field column="PRODUCT_ID" name="product_id"/>
      <field column="DISPLAY_NAME" name="name"/>
      <field column="LONG_DESCRIPTION" name="long_description" clob="true" stripHTML="true"/>
      <field column="ORIENTATION" name="orientation"/>
      <entity name="sku" dataSource="prodcat"
          query="select ds.sku_id, ds.sku_type, ds.on_sale, '${product.PRODUCT_ID}' || '_' || ds.sku_id as unique_id
                 from dcs_prd_chldsku dpc, dcs_sku ds
                 where dpc.product_id = '${product.PRODUCT_ID}'
                 and dpc.sku_id = ds.sku_id"
          rootEntity="true" pk="PRODUCT_ID, SKU_ID">
        <field column="SKU_ID" name="sku_id"/>
        <field column="SKU_TYPE" name="sku_type"/>
        <field column="ON_SALE" name="on_sale"/>
        <field column="UNIQUE_ID" name="unique_id"/>
        <entity name="catalog" dataSource="prodcat"
            query="select pc.catalog_id
                   from gl_prd_catalog pc, gl_sku_catalog sc
                   where pc.product_id = '${product.PRODUCT_ID}' and sc.sku_id = '${sku.SKU_ID}' and pc.catalog_id = sc.catalog_id"
            pk="SKU_ID, CATALOG_ID">
          <field column="CATALOG_ID" name="catalogs"/>
        </entity>
        <entity name="price" dataSource="prodcat"
            query="select ds.list_price as price
                   from dcs_sku ds
                   where ds.sku_id = '${sku.SKU_ID}'
                   and ds.on_sale = 0
                   UNION
                   select
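The symptom (one modified row found, zero documents rebuilt) is consistent with Noble's diagnosis: deltaQuery returns a column named product_id (upper-cased to PRODUCT_ID by the Oracle driver), so ${dataimporter.delta.id} in the deltaImportQuery resolves to nothing. A toy sketch of that substitution failure (Python, not DIH code; the queries and row are illustrative):

```python
import re

# Toy model of DIH's ${dataimporter.delta.<column>} substitution (not DIH code;
# names here are illustrative). The delta row's column names come back from
# Oracle upper-cased, so only a matching key resolves; anything else is empty.

delta_row = {"PRODUCT_ID": "42"}  # what deltaQuery returned for the test row

def resolve(query: str, row: dict) -> str:
    # Replace ${dataimporter.delta.X} with the row's value for X, or '' if absent
    return re.sub(r"\$\{dataimporter\.delta\.(\w+)\}",
                  lambda m: row.get(m.group(1), ""), query)

bad  = resolve("where product_id = '${dataimporter.delta.id}'", delta_row)
good = resolve("where product_id = '${dataimporter.delta.PRODUCT_ID}'", delta_row)
print(bad)   # where product_id = ''   -> selects nothing, so no docs rebuilt
print(good)  # where product_id = '42'
```

The delta import can then "complete successfully" while indexing nothing, because the resolved deltaImportQuery simply matches no rows.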
[PECL-DEV] [ANNOUNCEMENT] solr-0.9.8 (beta) Released
The new PECL package solr-0.9.8 (beta) has been released at http://pecl.php.net/. Release notes - - Fixed config.w32 for Windows build support. (Pierre, Pierrick) - Windows .dll now available at http://downloads.php.net/pierre (Pierre) - Fixed Bug #16943 Segmentation Fault from solr_encode_string() during attempt to retrieve solrXmlNode-children-content when solrXmlNode-children is NULL (Israel) - Disabled Expect header in libcurl (Israel) - Disabled Memory Debugging when normal debug is enabled (Israel) - Added list of contributors to the project (README.CONTRIBUTORS) Package Info - It effectively simplifies the process of interacting with Apache Solr using PHP5 and it already comes with built-in readiness for the latest features available in Solr 1.4. The extension has features such as built-in, serializable query string builder objects which effectively simplifies the manipulation of name-value pair request parameters across repeated requests. The response from the Solr server is also automatically parsed into native php objects whose properties can be accessed as array keys or object properties without any additional configuration on the client-side. Its advanced HTTP client reuses the same connection across multiple requests and provides built-in support for connecting to Solr servers secured behind HTTP Authentication or HTTP proxy servers. It is also able to connect to SSL-enabled containers. Please consult the documentation for more details on features. Related Links - Package home: http://pecl.php.net/package/solr Changelog: http://pecl.php.net/package-changelog.php?package=solr Download: http://pecl.php.net/get/solr-0.9.8.tgz Documentation: http://www.php.net/manual/en/book.solr.php Authors - Israel Ekpo ie...@php.net (lead) -- Good Enough is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
Re: Problem with searching with first capital letter
Thanks for your reply. Now I understand my problem. Erick Erickson wrote: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters "On wildcard and fuzzy searches, no text analysis is performed on the search word." HTH Erick On Thu, Dec 3, 2009 at 10:19 AM, Yurish yuris...@inbox.lv wrote: I have a problem with SOLR searching. When I am searching query: dog* everything is ok, but when the query is Dog* (with first capital letter), I get no results. Any advice? My config:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

-- View this message in context: http://old.nabble.com/Problem-with-searching-with-first-capital-letter-tp26627677p26627677.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Problem-with-searching-with-first-capital-letter-tp26627677p26635779.html Sent from the Solr - User mailing list archive at Nabble.com.
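Since wildcard terms skip analysis while the indexed terms were lowercased by LowerCaseFilterFactory, a common client-side workaround (my suggestion, not from this thread) is to normalize the wildcard term before sending it. A minimal sketch, assuming a lowercasing index analyzer like the one above:

```python
# Wildcard and fuzzy terms are not analyzed, so "Dog*" can never match terms
# that were lowercased at index time. Normalize the term on the client first.
# (Sketch only; assumes the index analyzer lowercases everything.)

def prepare_wildcard_query(term: str) -> str:
    return term.lower()

print(prepare_wildcard_query("Dog*"))  # dog*
```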
Re: Issues with alphanumeric search terms
I have added filter class=solr.WordDelimiterFilterFactory catenateAll=1 / to both index and query but still getting same behaviour. Is there any other that i am missing? con wrote: Yes. I meant all the indexed documents. With debugQuery=on, i got the following result: response − lst name=responseHeader int name=status0/int int name=QTime1/int − lst name=params str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q(phone:650 AND rowtype:contacts)/str str name=wtxml/str str name=rows1/str str name=version2.2/str /lst /lst − result name=response numFound=104 start=0 − doc str name=ADDRESS /str str name=CITY /str str name=COUNTRY /str date name=CREATEDTIME2009-09-22T06:50:36.943Z/date str name=NAMEAdam/str str name=emaila...@abc.com/str str name=firstnameAdam/str str name=lastnamesmith/str str name=localeen_US/str str name=phone /str str name=rowtypecontacts/str /doc /result − lst name=debug str name=rawquerystring(phone:650 AND rowtype:contacts)/str str name=querystring(phone:650 AND rowtype:contacts)/str str name=parsedquery+rowtype:contacts/str str name=parsedquery_toString+rowtype:contacts/str − lst name=explain − str name=1030422en_US 0.99043053 = (MATCH) fieldWeight(rowtype:contacts in 0), product of: 1.0 = tf(termFreq(rowtype:contacts)=1) 0.99043053 = idf(docFreq=104, maxDocs=104) 1.0 = fieldNorm(field=rowtype, doc=0) /str /lst str name=QParserLuceneQParser/str − lst name=timing double name=time1.0/double − lst name=prepare double name=time0.0/double − lst name=org.apache.solr.handler.component.QueryComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.FacetComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.MoreLikeThisComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.HighlightComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.StatsComponent double name=time0.0/double /lst − lst 
name=org.apache.solr.handler.component.DebugComponent double name=time0.0/double /lst /lst − lst name=process double name=time1.0/double − lst name=org.apache.solr.handler.component.QueryComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.FacetComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.MoreLikeThisComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.HighlightComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.StatsComponent double name=time0.0/double /lst − lst name=org.apache.solr.handler.component.DebugComponent double name=time1.0/double /lst /lst /lst /lst /response Erick Erickson wrote: Hmmm, what does debugQuery=on show? And did you mean documents here? it will return all the search terms Best Erick On Thu, Dec 3, 2009 at 11:40 AM, con convo...@gmail.com wrote: Hi My solr deployment is giving correct results for normal search terms like john. But when i search with john55 or 55 it will return all the search terms, including those which neither contains john nor 55. Below is the fieldtype defined for this field. fieldType name=mytype class=solr.TextField analyzer type=index tokenizer class=solr.LowerCaseTokenizerFactory/ /analyzer analyzer type=query tokenizer class=solr.LowerCaseTokenizerFactory/ /analyzer /fieldType Is there any other tokenizers or filters need to be set for alphanumeric/Number search? -- View this message in context: http://old.nabble.com/Issues-with-alphanumeric-search-terms-tp26629048p26629048.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Issues-with-alphanumeric-search-terms-tp26629048p26635781.html Sent from the Solr - User mailing list archive at Nabble.com.