Multiple default search fields or catchall field?
Hi,

I'm indexing feeds and websites referenced by the feeds. So I have as text fields:

- title - from the feed entries' title
- description - from the feed entries' description
- text - the websites' text

When the user doesn't define a default search field, then all three fields should be used for search. And I need to have highlighting. However it should still be possible to search only in title or description.

- Do I need a catchall text field with content copied from all text fields?
- Do I need to store the content in the catchall field as well as in the individual fields to get highlighting in every case?
- Isn't it a big waste of hard disc space to store the content two times?

Thanks for any help,

Thomas Koch, http://www.koch.ro
Re: Apache solr for multiple searches
Bhuvi HN wrote:
> Can we have one single instance of the Apache Solr running for both
> searches, like job search and resume search?

Yes, you want to run a multicore (multiple index) setup - see:
http://wiki.apache.org/solr/CoreAdmin

--
View this message in context: http://old.nabble.com/Apache-solr-for-multiple-searches-tp26551563p26690643.html
Sent from the Solr - User mailing list archive at Nabble.com.
How to setup dynamic multicore replication
Hi,

I need some help setting up dynamic multicore replication. We are changing our setup from a replicated single-core index with multiple document types, as described on the wiki[1], to a dynamic multicore setup. We need this so that we can display facets with a zero count that are unique to the document 'type'.

So when indexing new documents we want to create new cores on the fly using the CoreAdminHandler through SolrJ. What I can't figure out is how I set up solr.xml and solrconfig.xml so that a core is automatically replicated from the master to its slaves once it's created.

I have a solr.xml that starts like this:

  <?xml version='1.0' encoding='UTF-8'?>
  <solr persistent="true">
    <cores adminPath="/admin/cores">
    </cores>
  </solr>

and the replication part of solrconfig.xml.

master:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">startup</str>
      <str name="replicateAfter">optimize</str>
      <str name="confFiles">schema.xml</str>
    </lst>
  </requestHandler>

slave:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://localhost:8081/solr/replication</str>
      <str name="pollInterval">00:00:20</str>
    </lst>
  </requestHandler>

I think I should change the masterUrl in the slave configuration to something like:

  <str name="masterUrl">http://localhost:8081/solr/${solr.core.name}/replication</str>

so that the replication automatically finds the correct core's replication handler. But how do I tell the slaves that a new core is created, and that they should start replicating it too?

Thanks in advance.

Thijs

[1] http://wiki.apache.org/solr/MultipleIndexes#Flattening_Data_Into_a_Single_Index
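As a side note for readers of this thread: the "create new cores on the fly" step above is just an HTTP call against the CoreAdmin handler (SolrJ's CoreAdmin requests wrap the same thing). A minimal sketch of building such a request - the host, core name and instanceDir below are hypothetical examples, and nothing is actually sent:

```python
from urllib.parse import urlencode

def core_create_url(base_url, core_name, instance_dir):
    """Build a CoreAdmin CREATE request URL (nothing is sent here)."""
    params = urlencode({
        "action": "CREATE",           # CoreAdmin command
        "name": core_name,            # name of the new core
        "instanceDir": instance_dir,  # directory holding conf/ etc.
        "config": "solrconfig.xml",
        "schema": "schema.xml",
    })
    return f"{base_url}/admin/cores?{params}"

url = core_create_url("http://localhost:8081/solr", "feeds2009", "feeds2009")
print(url)
```

The same URL works from a browser or curl for quick experiments.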
RE: why no results?
Hi Regan,

I am using STRING fields only for values that in most cases will be used to FACET on. I suggest using TEXT fields as per the default examples. Also, remember that if you do not specify solr.LowerCaseFilterFactory, your search has just become case sensitive. I struggled with that one before, so make sure what you are indexing is what you are searching for.

* Stick to the default examples that are provided with the SOLR distro and you should be fine.

  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- in this example, we will only use synonyms at query time
      <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
      -->
      <!-- Case insensitive stop word removal. enablePositionIncrements=true
           ensures that a 'gap' is left to allow for accurate phrase queries. -->
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

Jaco Olivier

-----Original Message-----
From: regany [mailto:re...@newzealand.co.nz]
Sent: 08 December 2009 06:15
To: solr-user@lucene.apache.org
Subject: Re: why no results?

Tom Hill-7 wrote:
> Try solr.TextField instead.

Thanks Tom, I've replaced the types section above with...

  <types>
    <fieldtype name="string" class="solr.TextField" sortMissingLast="true" omitNorms="true"/>
  </types>

deleted my index, restarted Solr and re-indexed my documents - but the search still returns nothing. Do I need to change the type in the fields sections as well?

regan

Please consider the environment before printing this email. This transmission is for the intended addressee only and is confidential information. If you have received this transmission in error, please delete it and notify the sender. The content of this e-mail is the opinion of the writer only and is not endorsed by Sabinet Online Limited unless expressly stated otherwise.
RE: why no results?
Hi,

Try changing your TEXT field to type text:

  <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

That is your problem... also use the text type as per the default examples in the SOLR distro :)

Jaco Olivier

-----Original Message-----
From: regany [mailto:re...@newzealand.co.nz]
Sent: 08 December 2009 05:44
To: solr-user@lucene.apache.org
Subject: why no results?

hi all - newbie solr question - I've indexed some documents and can search / receive results using the following schema - BUT ONLY when searching on the id field. If I try searching on the title, subtitle, body or text field I receive NO results. Very confused. Can anyone see anything obvious I'm doing wrong?

Regan.

  <?xml version="1.0" ?>
  <schema name="core0" version="1.1">
    <types>
      <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    </types>
    <fields>
      <!-- general -->
      <field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
      <field name="title" type="string" indexed="true" stored="true" multiValued="false"/>
      <field name="subtitle" type="string" indexed="true" stored="true" multiValued="false"/>
      <field name="body" type="string" indexed="true" stored="true" multiValued="false"/>
      <field name="text" type="string" indexed="true" stored="false" multiValued="true"/>
    </fields>
    <!-- field to use to determine and enforce document uniqueness. -->
    <uniqueKey>id</uniqueKey>
    <!-- field for the QueryParser to use when an explicit fieldname is absent -->
    <defaultSearchField>text</defaultSearchField>
    <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
    <solrQueryParser defaultOperator="OR"/>
    <!-- copyFields group fields into one single searchable indexed field for speed. -->
    <copyField source="title" dest="text"/>
    <copyField source="subtitle" dest="text"/>
    <copyField source="body" dest="text"/>
  </schema>
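To spell out the suggestion in Jaco's reply: the string type (solr.StrField) keeps each value as a single unanalyzed token, so multi-word searches against title, subtitle, body or text find nothing, while the single-token id still matches. A sketch of the corrected fragment - the analyzer chain is omitted here, copy the full "text" fieldType from the example schema shipped with Solr:

```xml
<types>
  <!-- keep string for exact-match fields like id -->
  <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
  <!-- tokenized full-text type; take the tokenizer/filter chain
       from the example schema in the Solr distribution -->
  <fieldtype name="text" class="solr.TextField" positionIncrementGap="100"/>
</types>
<fields>
  <field name="id"    type="string" indexed="true" stored="true" required="true"/>
  <field name="title" type="text" indexed="true" stored="true"/>
  <field name="text"  type="text" indexed="true" stored="false" multiValued="true"/>
</fields>
```

After a schema change like this, delete the index and re-index, as regan did.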
Re: How to setup dynamic multicore replication
On Tue, Dec 8, 2009 at 2:43 PM, Thijs <vonk.th...@gmail.com> wrote:
> Hi
> I need some help setting up dynamic multicore replication. We are changing
> our setup from a replicated single core index with multiple document types,
> as described on the wiki[1], to a dynamic multicore setup. We need this so
> that we can display facets with a zero count that are unique to the
> document 'type'.

If you go by that wiki link, then there is no need to have multiple cores. It basically says that, in some cases, it is possible to flatten multiple indexes into one index. Am I missing something?

--
Regards,
Shalin Shekhar Mangar.
Re: How to setup dynamic multicore replication
On Tue, Dec 8, 2009 at 2:43 PM, Thijs <vonk.th...@gmail.com> wrote:
> Hi
> I need some help setting up dynamic multicore replication. We are changing
> our setup from a replicated single core index with multiple document types,
> as described on the wiki[1], to a dynamic multicore setup. We need this so
> that we can display facets with a zero count that are unique to the
> document 'type'.
>
> So when indexing new documents we want to create new cores on the fly
> using the CoreAdminHandler through SolrJ. What I can't figure out is how I
> set up solr.xml and solrconfig.xml so that a core is automatically
> replicated from the master to its slaves once it's created.
>
> [solr.xml and replication config snipped]
>
> I think I should change the masterUrl in the slave configuration to
> something like:
>
>   <str name="masterUrl">http://localhost:8081/solr/${solr.core.name}/replication</str>
>
> so that the replication automatically finds the correct core's replication
> handler.

If you have dynamically created cores this is the solution.

> But how do I tell the slaves a new core is created, and that they should
> start replicating it too?
>
> Thanks in advance.
> Thijs
>
> [1] http://wiki.apache.org/solr/MultipleIndexes#Flattening_Data_Into_a_Single_Index

--
-----------------------------------------------------
Noble Paul | Systems Architect | AOL | http://aol.com
Re: Apache solr for multiple searches
On Tue, Dec 8, 2009 at 2:28 PM, regany <re...@newzealand.co.nz> wrote:
> Bhuvi HN wrote:
>> Can we have one single instance of the Apache Solr running for both
>> searches, like job search and resume search?
>
> Yes, you want to run a multicore (multiple index) setup - see:
> http://wiki.apache.org/solr/CoreAdmin

Or you could combine them into the same index. That is usually the easier solution. See
http://wiki.apache.org/solr/MultipleIndexes#Flattening_Data_Into_a_Single_Index

--
Regards,
Shalin Shekhar Mangar.
Re: How to setup dynamic multicore replication
If I for example do:

  /select?q=type:book&facet=true&facet.mincount=0&facet.field=title

the titles that are returned for the facet query also contain titles that are of type dvd, while I only want the unique titles for type book.

On 8-12-2009 12:09, Shalin Shekhar Mangar wrote:
> If you go by that wiki link, then there is no need to have multiple cores.
> It basically says that, in some cases, it is possible to flatten multiple
> indexes into one index. Am I missing something?
Re: How to setup dynamic multicore replication
But the slave never gets the message that a core is created... at least not in my setup... So it never starts replicating...

On 8-12-2009 12:13, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> On Tue, Dec 8, 2009 at 2:43 PM, Thijs <vonk.th...@gmail.com> wrote:
> [original question and config snipped]
>
> If you have dynamically created cores this is the solution.
test
test
Tika and DIH integration (https://issues.apache.org/jira/browse/SOLR-1358)
Hi,

I am looking into using Solr for indexing a large database that has documents (mostly pdf and msoffice) stored as CLOBs in several tables. It is my understanding that the DIH as provided in Solr 1.4 cannot index these CLOBs yet, and that SOLR-1358 should provide exactly this. So I was wondering what the most 'recommended' way is of solving this. Should it be done with a custom text extractor of some sort, set on the column/field?

Thanks,
Jorg
Re: edismax using bigrams instead of phrases?
On Mon, Dec 7, 2009 at 5:45 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
> it would be a mistake to have a pf1 field that was an alias for pf ... as
> it stands the pf param in dismax is analogous to a pf* or pf-Infinity

Of course -- I was... well, let's just pretend I was drunk. How about pfInf or pfAll?

--
Bill Dueber
Library Systems Programmer
University of Michigan Library
RE: Facet query with special characters
Hello Hoss,

Many thanks for your answer. That's very interesting. So, are you saying this is an issue on the index side, rather than the query side? Note that I am (supposed to be) indexing/searching without analysis tokenization (if that's the correct term) - i.e. field values like 'pds-comp.domain' shouldn't be (and I believe aren't) broken up as in 'pds', 'comp', 'domain' etc. (e.g. using the 'text_ws' fieldtype). What would be your opinion on the best way to index/analyze/not-analyze such fields?

Thanks!
Peter

> Date: Mon, 7 Dec 2009 15:30:47 -0800
> From: hossman_luc...@fucit.org
> To: solr-user@lucene.apache.org
> Subject: Re: Facet query with special characters
>
> : When performing a facet query where part of the value portion has a
> : special character (a minus sign in this case), the query returns zero
> : results unless I put a wildcard (*) at the end.
>
> check your analysis configuration for this fieldtype, in particular look
> at what debugQuery produces for your parsed query, and look at what
> analysis.jsp says it will do at query time with the input string
> pds-comp.domain ... because it sounds like you have a disconnect between
> how the text is indexed and how it is searched. adding a * to your input
> query forces it to make a WildcardQuery which doesn't use analysis, so
> you get a match on the literal token.
>
> in short: i suspect your problem has nothing to do with query string
> escaping, and everything to do with field tokenization.
>
> -Hoss
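For readers wondering how to index such values: one common recipe (a general sketch, not something prescribed in this thread) is to keep the whole value as a single token but lowercase it, so 'pds-comp.domain' matches literally without being split on the hyphen or dot. The type name string_ci below is made up for illustration:

```xml
<!-- Sketch: whole value as one token, lowercased, so matching is
     literal but case-insensitive. "string_ci" is a hypothetical name. -->
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this, the same analysis runs at index and query time, so no disconnect of the kind Hoss describes can arise for these fields.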
RE: How to setup dynamic multicore replication
Hi,

In my environment I create cores on the fly, then replicate the core to all of the slaves. I first create the core on the master and persist the solr.xml via the CoreAdmin API. I then do the same on each of my slaves. After loading / committing / optimizing the data on the master I send the replication request to each of the slaves. So each slave's replication handler

  http://slave_host_port/solr/core_name/replication

gets a request to fetch the index, which includes the master url

  http://master_host_port/solr/core_name/replication

The slave's solrconfig.xml has no mention of the master as it is all done programmatically. You need to specify the core name in the url, and if you haven't created the core on the master it will result in an error.

I don't create a new core every time I update, but I do have the slaves fetch the index after every update. My first attempt to set the polling did not seem to work, and I have not had a chance to revisit. I have not found a way to persist the solrconfig.xml with the updates to the slave list, so the control / management is within my application.

Hope this high-level overview helps.

Joe

> Date: Tue, 8 Dec 2009 12:42:12 +0100
> From: vonk.th...@gmail.com
> To: solr-user@lucene.apache.org
> Subject: Re: How to setup dynamic multicore replication
>
> But the slave never gets the message that a core is created... at least
> not in my setup... So it never starts replicating...
>
> [earlier messages snipped]
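Joe's flow (create the core on the master and on each slave, then point each slave's replication handler at the master's per-core handler) can be sketched as plain HTTP requests. All hosts and the core name below are placeholders from his message, and only the URLs are built here, nothing is actually sent:

```python
from urllib.parse import urlencode, quote

MASTER = "http://master_host:8080/solr"          # placeholder hosts
SLAVES = ["http://slave1:8080/solr", "http://slave2:8080/solr"]
CORE = "core_name"                               # placeholder core name

def create_core(base, core):
    # CoreAdmin CREATE, as done programmatically via the CoreAdmin API
    return "%s/admin/cores?%s" % (
        base, urlencode({"action": "CREATE", "name": core, "instanceDir": core}))

def fetch_index(slave_base, core, master_base):
    # Ask the slave's replication handler to pull from the master's
    # per-core replication handler.
    master_url = "%s/%s/replication" % (master_base, core)
    return "%s/%s/replication?command=fetchindex&masterUrl=%s" % (
        slave_base, core, quote(master_url, safe=""))

requests = [create_core(MASTER, CORE)]
requests += [create_core(s, CORE) for s in SLAVES]
requests += [fetch_index(s, CORE, MASTER) for s in SLAVES]
for r in requests:
    print(r)
```

The fetchindex command with an explicit masterUrl is what lets the slave's solrconfig.xml stay silent about the master, as Joe describes.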
Re: Multiple default search fields or catchall field?
See below.

On Tue, Dec 8, 2009 at 3:48 AM, Thomas Koch <tho...@koch.ro> wrote:
> Hi,
> I'm indexing feeds and websites referenced by the feeds. So I have as text
> fields: title, description and text. When the user doesn't define a
> default search field, then all three fields should be used for search. And
> I need to have highlighting. However it should still be possible to search
> only in title or description.
>
> - Do I need a catchall text field with content copied from all text
>   fields?

This is a common way to do this. You could also write custom code to munge the query, but there's no need to go there as a first option; I'd only think about this if you have problems with the catchall approach.

> - Do I need to store the content in the catchall field as well as in the
>   individual fields to get highlighting in every case?

No. You don't display the catchall field, so you don't need to store it.

> - Isn't it a big waste of hard disc space to store the content two times?

Disk space is cheap. Whether you care really depends upon how much data you're storing. 100M - who cares? 100G - lotsa people care. But you don't have to, so it's a moot point.

HTH
Erick

> Thanks for any help,
> Thomas Koch, http://www.koch.ro
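Put together, the schema pieces implied by Erick's answers look roughly like this (field names are from the thread; "catchall" is my own label, and the analyzer details are omitted). The individual fields are stored so highlighting works on them; the catchall is only indexed:

```xml
<fields>
  <field name="title"       type="text" indexed="true" stored="true"/>
  <field name="description" type="text" indexed="true" stored="true"/>
  <field name="text"        type="text" indexed="true" stored="true"/>
  <!-- searched by default, never displayed, so not stored -->
  <field name="catchall"    type="text" indexed="true" stored="false" multiValued="true"/>
</fields>
<copyField source="title"       dest="catchall"/>
<copyField source="description" dest="catchall"/>
<copyField source="text"        dest="catchall"/>
<defaultSearchField>catchall</defaultSearchField>
```

copyField only duplicates the indexed terms, not a second stored copy, which is why storing the originals once is enough for highlighting.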
Re: How to setup dynamic multicore replication
Hi,

Thanks. That was my second option. But I was hoping that the master and slaves could find that out for themselves, as now my 'updater software' also has to know about all the slaves (and maybe even their state), which it previously had no idea about. This way I can't just plug in an 'empty' slave that knows where its master is and have it pull in all the required cores and indexes.

Thijs

On 8-12-2009 14:25, Joe Kessel wrote:
> In my environment I create cores on the fly, then replicate the core to
> all of the slaves. I first create the core on the master and persist the
> solr.xml via the CoreAdmin API. I then do the same on each of my slaves.
> After loading / committing / optimizing the data on the master I send the
> replication request to each of the slaves.
>
> [rest of message snipped]
Re: how to do auto-suggest case-insensitive match and return original case field values
Hi again,

Just pinging again to any Solr experts out there... sorry that my previous message was a bit long (I wanted to fully explain what I've already done and where the exact difficulty arises)... but to summarize:

Does anyone know how to use Solr querying with faceting to do an auto-suggest that searches case-insensitively yet returns the original mixed-case values?

Thanks for any help,
Leandro
Re: # in query
Thanks Erick,

I looked more into this, but I'm still stuck. I have this field indexed using text_rev. I looked at the Luke analysis for this field, but I'm unsure how to read it.

When I query the field by the id I get:

  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">5405255</str>
      <str name="textTitle">###'s test blog</str>
    </doc>
  </result>

If I try to query even multiple ### I get nothing. Here is what the Luke handler says (btw, when I used id instead of docid on Luke I got a NullPointerException: /admin/luke?docid=5405255 vs /admin/luke?id=5405255):

  <lst name="textTitle">
    <str name="type">text_rev</str>
    <str name="schema">ITS---</str>
    <str name="index">ITS--</str>
    <int name="docs">290329</int>
    <int name="distinct">401016</int>
    <lst name="topTerms">
      <int name="&#1;golb">49362</int>
      <int name="blog">49362</int>
      <int name="&#1;ecapsym">29426</int>
      <int name="myspace">29426</int>
      <int name="&#1;s">8773</int>
      <int name="s">8773</int>
      <int name="&#1;ed">8033</int>
      <int name="de">8033</int>
      <int name="com">6884</int>
      <int name="&#1;moc">6884</int>
    </lst>
    <lst name="histogram">
      <int name="1">308908</int>
      <int name="2">34340</int>
      <int name="4">21916</int>
      <int name="8">14474</int>
      <int name="16">9122</int>
      <int name="32">5578</int>
      <int name="64">3162</int>
      <int name="128">1844</int>
      <int name="256">910</int>
      <int name="512">464</int>
      <int name="1024">182</int>
      <int name="2048">72</int>
      <int name="4096">26</int>
      <int name="8192">12</int>
      <int name="16384">2</int>
      <int name="32768">2</int>
      <int name="65536">2</int>
    </lst>
  </lst>

solr/select?q=textTitle:%23%23%23 gets no results.

I have the same field indexed as an alphaOnlySort, and it gives me lots of results, but not the ones I want. Any other ideas?

thanks
Joel

On Dec 7, 2009, at 3:42 PM, Erick Erickson wrote:
> Well, the very first thing I would do is examine the field definition in
> your schema file. I suspect that the tokenizers and/or filters you're
> using for indexing and/or querying are doing something to the # symbol.
> Most likely stripping it. If you're just searching for the single-letter
> term #, I *think* the query parser silently just drops that part of the
> clause out, but check on that.
>
> The second thing would be to get a copy of Luke and examine your index to
> see if what you *think* is in your index actually is there.
>
> HTH
> Erick
>
> On Mon, Dec 7, 2009 at 3:28 PM, Joel Nylund <jnyl...@yahoo.com> wrote:
>> ok thanks, sorry my brain wasn't working, but even when I url encode it,
>> I dont get any results, is there something special I have to do for solr?
>>
>> thanks
>> Joel
>>
>> On Dec 7, 2009, at 3:20 PM, Paul Libbrecht wrote:
>>> Sure you have to escape it! %23
>>> otherwise the browser considers it as a separator between the URL for
>>> the server (on the left) and the fragment identifier (on the right)
>>> which is not sent to the server.
>>> You might want to read about URL-encoding; escaping with backslash is a
>>> shell-thing, not a thing for URLs!
>>>
>>> paul
>>>
>>> Le 07-déc.-09 à 21:16, Joel Nylund a écrit :
>>>> Hi,
>>>> How can I put a # sign in a query, do I need to escape it?
>>>> For example I want to query books with title that contain #
>>>> No work so far:
>>>> http://localhost:8983/solr/select?q=textTitle:#
>>>> http://localhost:8983/solr/select?q=textTitle:#
>>>> http://localhost:8983/solr/select?q=textTitle:\#
>>>> Getting org.apache.lucene.queryParser.ParseException: Cannot parse
>>>> 'textTitle:\': Lexical error at line 1, column 12. Encountered: <EOF>
>>>> after : and sometimes just no response.
>>>> thanks
>>>> Joel
Re: Tika and DIH integration (https://issues.apache.org/jira/browse/SOLR-1358)
We are very close to resolving SOLR-1358, so you may be able to use it.

On Tue, Dec 8, 2009 at 5:32 PM, Jorg Heymans <jorg.heym...@gmail.com> wrote:
> Hi,
> I am looking into using Solr for indexing a large database that has
> documents (mostly pdf and msoffice) stored as CLOBs in several tables. It
> is my understanding that the DIH as provided in Solr 1.4 cannot index
> these CLOBs yet, and that SOLR-1358 should provide exactly this. So I was
> wondering what the most 'recommended' way is of solving this. Should it be
> done with a custom text extractor of some sort, set on the column/field?
> Thanks,
> Jorg

--
-----------------------------------------------------
Noble Paul | Systems Architect | AOL | http://aol.com
Re: # in query
In Luke, there's a tab that will let you go to a document ID. From there you can see all the fields in a particular document, and examine what the actual tokens stored are. Until and unless you know what tokens are being indexed, you simply can't know what your queries should look like... *Assuming* that the ### are getting indexed and *assuming* your tokenizer tokenized on, whitespace, and *assuming* that by text_rev you are talking about ReversedWildcardFilterFactory, I wouldn't expect a search to match if it wasn't exactly: s'###. But as you see, there's a long chain of assumptions there any one of which may be violated by your schema. So please post the relevant portions of your schema to make it easier to help. Best Erick On Tue, Dec 8, 2009 at 9:54 AM, Joel Nylund jnyl...@yahoo.com wrote: Thanks Eric, I looked more into this, but still stuck: I have this field indexed using text_rev I looked at the luke analysis for this field, but im unsure how to read it. When I query the field by the id I get: result name=response numFound=1 start=0 - doc str name=id5405255/str str name=textTitle###'s test blog/str /doc /result If I try to query even multiple ### I get nothing. 
Here is what luke handler says: (btw when I used id instead of docid on luke I got a nullpointer exception /admin/luke?docid=5405255 vs /admin/luke?id=5405255) lst name=textTitle str name=typetext_rev/str str name=schemaITS---/str str name=indexITS--/str int name=docs290329/int int name=distinct401016/int - lst name=topTerms int name=#1;golb49362/int int name=blog49362/int int name=#1;ecapsym29426/int int name=myspace29426/int int name=#1;s8773/int int name=s8773/int int name=#1;ed8033/int int name=de8033/int int name=com6884/int int name=#1;moc6884/int /lst - lst name=histogram int name=1308908/int int name=234340/int int name=421916/int int name=814474/int int name=169122/int int name=325578/int int name=643162/int int name=1281844/int int name=256910/int int name=512464/int int name=1024182/int int name=204872/int int name=409626/int int name=819212/int int name=163842/int int name=327682/int int name=655362/int /lst /lst solr/select?q=textTitle:%23%23%23 - gets no results. I have the same field indexed as a alphaOnlySort, and it gives me lots of results, but not the ones I want. Any other ideas? thanks Joel On Dec 7, 2009, at 3:42 PM, Erick Erickson wrote: Well, the very first thing I would is examine the field definition in your schema file. I suspect that the tokenizers and/or filters you're using for indexing and/or querying is doing something to the # symbol. Most likely stripping it. If you're just searching for the single-letter term #, I *think* the query parser silently just drops that part of the clause out, but check on that. The second thing would be to get a copy of Luke and examine your index to see if what you *think* is in your index actually is there. HTH Erick On Mon, Dec 7, 2009 at 3:28 PM, Joel Nylund jnyl...@yahoo.com wrote: ok thanks, sorry my brain wasn't working, but even when I url encode it, I dont get any results, is there something special I have to do for solr? 
thanks Joel

On Dec 7, 2009, at 3:20 PM, Paul Libbrecht wrote: Sure you have to escape it! %23. Otherwise the browser considers it a separator between the URL for the server (on the left) and the fragment identifier (on the right), which is not sent to the server. You might want to read about URL-encoding; escaping with backslash is a shell thing, not a thing for URLs! paul

On Dec 7, 2009, at 9:16 PM, Joel Nylund wrote: Hi, how can I put a # sign in a query? Do I need to escape it? For example, I want to query books with titles that contain #. None of these work so far:

http://localhost:8983/solr/select?q=textTitle:#
http://localhost:8983/solr/select?q=textTitle:\#

Getting org.apache.lucene.queryParser.ParseException: Cannot parse 'textTitle:\': Lexical error at line 1, column 12. Encountered: EOF after :, and sometimes just no response. thanks Joel
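Paul's point about percent-encoding can be illustrated with a short Python sketch (solr_query_url is a hypothetical helper; the field name textTitle and the example host come from this thread):

```python
from urllib.parse import quote

# '#' starts a URL fragment, so it must be percent-encoded as %23
# before the query string ever reaches Solr; a raw '#' truncates the URL.
def solr_query_url(field, value, base="http://localhost:8983/solr/select"):
    # safe="" makes quote() encode every reserved character in the value
    return f"{base}?q={field}:{quote(value, safe='')}"

print(solr_query_url("textTitle", "###"))
# -> http://localhost:8983/solr/select?q=textTitle:%23%23%23
```

The encoded URL no longer contains a raw #, so the whole query reaches the server intact.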
Re: Exception encountered during replication on slave... Any clues?
Hi Noble: When I hit the masterUrl from the slave box at http://localhost:8080/postingsmaster/replication I get the following XML response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <str name="status">OK</str>
  <str name="message">No command</str>
</response>

And then when I look in the logs, I see the exception that I mentioned. What exactly does this error mean, that replication is not available? By the way, when I go to the admin URL for the slave and click on replication, I see a screen with the master URL listed (as above) and the word "unreachable" after it. And, of course, the same exception shows up in the Tomcat logs. Thanks, - Bill

--
From: Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com Sent: Monday, December 07, 2009 9:20 PM To: solr-user@lucene.apache.org Subject: Re: Exception encountered during replication on slave... Any clues? Are you able to hit http://localhost:8080/postingsmaster/replication using a browser from the slave box? If you are able to hit it, what do you see?

On Tue, Dec 8, 2009 at 3:42 AM, William Pierce evalsi...@hotmail.com wrote: Just to make doubly sure, per tck's suggestion, I went in and explicitly added the port in the masterUrl so that it now reads: http://localhost:8080/postingsmaster/replication Still getting the same exception... I am running Solr 1.4 on Ubuntu karmic, using Tomcat 6 and Java 1.6. Thanks, - Bill

--
From: William Pierce evalsi...@hotmail.com Sent: Monday, December 07, 2009 2:03 PM To: solr-user@lucene.apache.org Subject: Re: Exception encountered during replication on slave... Any clues? tck, thanks for your quick response. I am running on the default port (8080). If I copy that exact string given in the masterUrl and execute it in the browser I get a response from Solr:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <str name="status">OK</str>
  <str name="message">No command</str>
</response>

So the masterUrl is reachable/accessible so far as I am able to tell. Thanks, - Bill

--
From: TCK moonwatcher32...@gmail.com Sent: Monday, December 07, 2009 1:50 PM To: solr-user@lucene.apache.org Subject: Re: Exception encountered during replication on slave... Any clues? Are you missing the port number in the master's URL? -tck

On Mon, Dec 7, 2009 at 4:44 PM, William Pierce evalsi...@hotmail.com wrote: Folks: I am seeing this exception in my logs that is causing my replication to fail. I start with a clean slate (empty data directory). I index the data on the postingsmaster using the dataimport handler and it succeeds. When the replication slave attempts to replicate, it encounters this error:

Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller fetchLatestIndex
SEVERE: Master at: http://localhost/postingsmaster/replication is not available. Index fetch failed. Exception: Invalid version or the data in not in 'javabin' format

Any clues as to what I should look for to debug this further? Replication is enabled as follows. The postingsmaster solrconfig.xml looks as follows:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Replicate on 'optimize'; it can also be 'commit' -->
    <str name="replicateAfter">commit</str>
    <!-- If configuration files need to be replicated, give the names here, comma separated -->
    <str name="confFiles"></str>
  </lst>
</requestHandler>

The postings slave solrconfig.xml looks as follows:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Fully qualified URL for the replication handler of the master -->
    <str name="masterUrl">http://localhost/postingsmaster/replication</str>
    <!-- Interval at which the slave should poll the master. Format is HH:mm:ss.
         If this is absent, the slave does not poll automatically,
         but a snappull can be triggered from the admin or the HTTP API -->
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>

Thanks, - Bill

--
Noble Paul | Systems Architect | AOL | http://aol.com
RE: Facet query with special characters
: Note that I am (supposed to be) indexing/searching without analysis
: tokenization (if that's the correct term) - i.e. field values like
: 'pds-comp.domain' shouldn't be (and I believe aren't) broken up as in
: 'pds', 'comp' 'domain' etc. (e.g. using the 'text_ws' fieldtype).
	...
: What would be your opinion on the best way to index/analyze/not-analyze such fields?

A whitespace tokenizer is probably the best bet, but in order to be certain what's going on, you would need to look at a few things (and if you wanted help from other people, you would need to post those things) that I mentioned before:

: check your analysis configuration for this fieldtype, in particular look
: at what debugQuery produces for your parsed query, and look at what
: analysis.jsp says it will do at query time with the input string
: pds-comp.domain ... because it sounds like you have a disconnect between
: how the text is indexed and how it is searched. adding a * to your

...so what does your schema look like, what is the output from debugQuery, what is the output from analysis.jsp, etc...

-Hoss
Re: KStem download
Hi guys, I still have this problem. I got the fresh release of Apache Solr 1.4, added the declaration of KStemmer in my schema.xml, and put the two jar files under the \example\lib folder. Looking at the error, I somehow think it's not able to find the Solr home. If I make a nightly distribution build, upload the war file in Tomcat, specify the solr.home property in the Tomcat webapp to point to the example\solr folder, and place a lib folder in example\solr into which I copied the two jar files, then Tomcat works fine.

SEVERE: Could not start SOLR. Check solr/home property java.lang.NoClassDefFoundError: org/apache/solr/util/plugin/ResourceLoaderAware at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:621) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124) at java.net.URLClassLoader.defineClass(URLClassLoader.java:260) at java.net.URLClassLoader.access$000(URLClassLoader.java:56) at java.net.URLClassLoader$1.run(URLClassLoader.java:195) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at java.lang.ClassLoader.loadClass(ClassLoader.java:252) at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:375) at java.lang.ClassLoader.loadClass(ClassLoader.java:300) at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:592) at java.lang.ClassLoader.loadClass(ClassLoader.java:252) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:357) at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:388) at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84) at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141) at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:835) at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58) at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:424) at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:414) at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141) at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:456) at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:95) at org.apache.solr.core.SolrCore.init(SolrCore.java:520) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594) at org.mortbay.jetty.servlet.Context.startContext(Context.java:139) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218) at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117) at org.mortbay.jetty.Server.doStart(Server.java:210) at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.mortbay.start.Main.invokeMain(Main.java:183) at org.mortbay.start.Main.start(Main.java:497) at org.mortbay.start.Main.main(Main.java:115) Caused by: java.lang.ClassNotFoundException:
Re: how to do auto-suggest case-insensitive match and return original case field values
: In my web application I want to set up auto-suggest as you type
: functionality which will search case-insensitively yet return the original
: case terms. It doesn't seem like TermsComponent can do this as it can only
: return the lowercase indexed terms you are searching against, not the
	...
: which provides useful sorting by and returning of term frequency counts in
: your index. How does one get this same information with regular Solr Query?
: I set up the following prefix query, searching by the indexed lowercased
: field and returning the other:

The type of approach you are describing (doing a prefix based query for autosuggest) probably won't work very well unless your index is 100% designed just for the autosuggest ... if it's an index about products, and you're just using one of the fields for autosuggest, you aren't going to get good autosuggest results because the same word is going to appear in multiple products. What you need is an index of *words* that you want to autosuggest, with fields indicating how important those words are, that you can use in a function query (this replaces the term freq that TermsComponent would use). The fact that your test field is multivalued and stores wildly different things in each doc is an example of what I mean. Have you considered the possibility of just indexing the lowercase value concatenated with the regular case value using a special delimiter, and then returning to your TermsComponent based solution? Index PowerPoint as powerpoint|PowerPoint and just split on the | character when you get the data back from your prefix based term lookup. -Hoss
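Hoss's delimiter trick can be sketched in a few lines of Python (a sketch only, assuming the | character never occurs in the terms themselves; index_form and original_case are hypothetical helpers, not Solr APIs):

```python
DELIM = "|"  # assumed never to appear inside a term

def index_form(term):
    # store the lowercased form first so prefix lookups are case-insensitive
    return term.lower() + DELIM + term

def original_case(indexed):
    # split the display form back out when a term comes back from the lookup
    return indexed.split(DELIM, 1)[1]

terms = [index_form(t) for t in ["PowerPoint", "powerpoint", "Excel"]]
# a lowercase prefix matches every case variant, yet the original
# casing is still recoverable from the stored term
matches = [original_case(t) for t in terms if t.startswith("power")]
print(matches)  # -> ['PowerPoint', 'powerpoint']
```

The same idea applies whether the terms come from TermsComponent or from a faceted prefix lookup.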
Re: java.lang.NoSuchMethodError: org.apache.commons.httpclient.HttpConnectionManager.getParams()Lorg/apache/commons/httpclient/params/HttpConnectionManagerParams;
: Strangely I don't get this error when I execute this code from the command line.
: This error only occurs when I access it from a web application. Secondly,
: this same method works fine with another web application. Both web
	...
: java.lang.NoSuchMethodError:
: org.apache.commons.httpclient.HttpConnectionManager.getParams()Lorg/apache/commons/httpclient/params/HttpConnectionManagerParams;
: at
: org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.setDefaultMaxConnectionsPerHost(CommonsHttpSolrServer.java:455)

...if you only get this problem in some environments, then I suspect there is something wrong with those environments (and clearly not with the code). I would start by checking every jar in all of your environments and making sure you don't have multiple copies of the same jar (in different versions) mistakenly installed. -Hoss
indexing XML with solr example webapp - out of java heap space
Hi! I downloaded Solr and am trying to index an XML file. This XML file is huge (500 MB). When I try to index it using the post.jar tool in example\exampledocs, I get an out-of-Java-heap-space error in the SimplePostTool application. Any ideas how to fix this? Passing in -Xms1024M does not fix it. Feroze.
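One way around a posting tool's memory limit is to split the big file into smaller update batches before posting. A sketch of the idea (split_add_file is a hypothetical helper, and it assumes the file is a flat Solr update file of the shape <add><doc>...</doc>...</add>):

```python
import xml.etree.ElementTree as ET
from io import BytesIO

def split_add_file(src, batch_size=1000):
    """Yield serialized <add> batches of at most batch_size <doc> elements,
    streaming the source so the full file is never held in memory."""
    batch = []
    for _event, elem in ET.iterparse(src, events=("end",)):
        if elem.tag == "doc":
            batch.append(ET.tostring(elem, encoding="unicode"))
            elem.clear()  # release the parsed subtree
            if len(batch) == batch_size:
                yield "<add>" + "".join(batch) + "</add>"
                batch = []
    if batch:
        yield "<add>" + "".join(batch) + "</add>"

# tiny synthetic file: five <doc> elements, split into batches of two
sample = b"<add>" + b"".join(
    b'<doc><field name="id">%d</field></doc>' % i for i in range(5)
) + b"</add>"
batches = list(split_add_file(BytesIO(sample), batch_size=2))
print(len(batches))  # -> 3 (two docs, two docs, one doc)
```

Each yielded batch can then be posted as its own small update request instead of one 500 MB file.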
Re: Solr Admin XPath
Wild shots in the dark:
* remove the whitespace around the = characters
* replace the single-quote characters with double-quote characters

: XPathExpression reqPerSec = xpath.compile(/solr/solr-info/QUERYHANDLER/entry[name = 'dismax']/stats/st...@name = 'avgRequestsPerSecond']);
	...
: This doesn't throw any errors, and the XPath works just fine in /any/ XPath tester I try... except Java.

-Hoss
Re: # in query
ok, I just realized I was using the Luke handler; I didn't know there was a fat client, I assume that's what you are talking about. I downloaded the lukeall.jar, ran it, pointed it to my index, and found the document in question. I didn't see how it was tokenized, but I clicked the reconstruct & edit button; this gives me a tab that has the tokens per field, and for this field it shows:

s|s, ecapsym|myspace, golb|blog

title is: ###'s myspace blog

schema is:

<!-- A general unstemmed text field that indexes tokens normally and also
     reversed (via ReversedWildcardFilterFactory), to enable more efficient
     leading wildcard queries. -->
<fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="textTitle" type="text_rev" indexed="true" stored="true" required="false" multiValued="false"/>

thanks Joel

On Dec 8, 2009, at 11:14 AM, Erick Erickson wrote: In Luke, there's a tab that will let you go to a document ID.
Re: WELCOME to solr-user@lucene.apache.org
(FYI: in the future please start a new thread with an appropriate subject line when you ask questions -- you probably would have gotten a lot more responses from people interested in Tika and SolrCell if they could tell that this email was about SolrCell.)

: I found that Tika reads the html and extracts metadata like <meta name="id"
: content="12"> from my htmls, but my documents already have an id set by
: literal.id=10.
:
: I tried to map the id from Tika by fmap.id=ignored_ but it also ignores my
: literal.id

Hmmm, yeah: that seems like an odd order of operations, but it's documented on the wiki so evidently it's intentional...
http://wiki.apache.org/solr/ExtractingRequestHandler#Order_of_field_operations

My best suggestions:
* use the capture param to restrict what gets extracted (it's probably possible to write an XPath query that selects everything *except* metadata[id])
* change the name of your uniqueKey field to be something other than id so it's less likely to collide with a value from the document.

I also opened two Jira issues that you may want to post comments in...
https://issues.apache.org/jira/browse/SOLR-1633
https://issues.apache.org/jira/browse/SOLR-1634

-Hoss
Case Insensitive search not working
Hello, I tried to force case-insensitive search by having the following setting in my schema.xml file, which I guess is standard for case-insensitive searches:

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

However, when I perform searches on San Jose and san jose, I get 16 and 0 responses back respectively. Is there anything else I am missing here?
--
View this message in context: http://old.nabble.com/Case-Insensitive-search-not-working-tp26699734p26699734.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to do auto-suggest case-insensitive match and return original case field values
Just updated SOLR-1625 to support regexp hints. https://issues.apache.org/jira/browse/SOLR-1625 Cheers, Uri

Chris Hostetter wrote: The type of approach you are describing (doing a prefix based query for autosuggest) probably won't work very well unless your index is 100% designed just for the autosuggest ...
Re: Case Insensitive search not working
Did you rebuild the index? Changing the analyzer for the index doesn't affect already-indexed documents. Tom

On Tue, Dec 8, 2009 at 11:57 AM, insaneyogi3008 insaney...@gmail.com wrote: Hello, I tried to force case insensitive search by having the following setting in my schema.xml file ...
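What the text_ws chain in the question does to both sides of a search can be mimicked in a couple of lines (a rough sketch, not Solr's actual code; analyze is a hypothetical stand-in for the whitespace tokenizer plus lowercase filter):

```python
def analyze(text):
    # mimic the text_ws chain: whitespace tokenizer + lowercase filter
    return [tok.lower() for tok in text.split()]

# both the index side and the query side must pass through the same chain;
# documents indexed before the schema change still hold the old tokens,
# which is why a reindex is needed
indexed = analyze("San Jose")   # what a *re*indexed document contains
query   = analyze("SAN JOSE")   # what the query side produces
print(indexed == query)  # -> True, once the index is rebuilt
```

Documents indexed under the old analyzer keep their original mixed-case tokens, so lowercase queries can only match documents added after the rebuild.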
Re: # in query
Sorry, I usually think of things in Lucene land and reflexively think of the fat client. Anyway, here's your problem, I think: WordDelimiterFilterFactory. See:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

It's losing the # altogether, as indicated by the tokens you saw:

s|s, ecapsym|myspace, golb|blog

Not a # in sight. It's kind of subtle, but on the above page, this phrase implies that all non-alphanumerics are dropped: "(by default, all non alpha-numeric characters)".

title is: ###'s myspace blog

I'm assuming that the title (if you're looking at it in Luke) is giving back your stored value. The tokens are what count during searching; storing and indexing are orthogonal. HTH Erick

On Tue, Dec 8, 2009 at 2:25 PM, Joel Nylund jnyl...@yahoo.com wrote: ok, I just realized I was using the luke handler, didnt know there was a fat client, I assume thats what you are talking about. I downloaded the lukeall.jar, ran it, pointed to my index, found the document in question, didn't see how it was tokenized, but I clicked the reconstruct edit button, this gives me a tab that has the tokens per field, for this field it shows: s|s, ecapsym|myspace, golb|blog ...
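Erick's diagnosis can be approximated in a few lines (a rough sketch only; the real WordDelimiterFilterFactory also splits on intra-word boundaries it configures separately and can catenate parts, which this ignores):

```python
import re

def word_delimiter_parts(token):
    # crude approximation of WordDelimiterFilterFactory with
    # generateWordParts=1: runs of non-alphanumeric characters act
    # as delimiters and are discarded entirely
    return [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]

print(word_delimiter_parts("###'s"))  # -> ['s'] -- the '#' run vanishes
print(word_delimiter_parts("wi-fi"))  # -> ['wi', 'fi']
```

This is why the index holds only the token s for the title ###'s: by the time the filter is done, there is no # left to match.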
About fsv (sort field values)
I am tracing QueryComponent.java and would like to know the purpose of the doFSV function. I don't understand what fsv are for. I have tried some queries with fsv=true and some extra info appears in the response:

<lst name="sort_values"/>

But I don't know what it is for and can't find much info out there. I read:

// The query cache doesn't currently store sort field values, and SolrIndexSearcher doesn't
// currently have an option to return sort field values. Because of this, we
// take the documents given and re-derive the sort values.

Is it for cache purposes? Thanks in advance!
--
View this message in context: http://old.nabble.com/About-fsv-%28sort-field-falues%29-tp26700729p26700729.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: About fsv (sort field values)
On Tue, Dec 8, 2009 at 4:04 PM, Marc Sturlese marc.sturl...@gmail.com wrote: I am tracing QueryComponent.java and would like to know the purpose of the doFSV function. Don't understand what fsv are for. Have tried some queries with fsv=true and some extra info appears in the response: <lst name="sort_values"/>

It's currently an internal feature (i.e. back compat is not guaranteed) used for merging search results in a distributed search. It contains the sort values (i.e. what was used to sort the documents) for everything but score. -Yonik http://www.lucidimagination.com
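Yonik's description of merging on sort values can be sketched with plain Python (the shard data is hypothetical, and heapq.merge stands in for the coordinator's merge step; the point is that the raw sort values are all the coordinator needs):

```python
import heapq

# Hypothetical shard responses: each shard returns (sort_value, doc_id)
# pairs already ordered by the sort field, ascending. The coordinator
# merges on the sort values alone -- which is what the fsv block
# supplies -- without re-fetching or re-scoring any document.
shard1 = [(5, "a"), (9, "c")]
shard2 = [(7, "b"), (12, "d")]

merged = list(heapq.merge(shard1, shard2))
rows = 3  # like rows=3 in the original request
print([doc for _, doc in merged[:rows]])  # -> ['a', 'b', 'c']
```

Without the sort values in each shard's response, the coordinator would have to re-derive them itself, which is exactly what the doFSV comment in QueryComponent.java alludes to.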
Re: how to do auto-suggest case-insensitive match and return original case field values
Hello, thanks for the reply (see below).

hossman wrote: The type of approach you are describing (doing a prefix based query for autosuggest) probably won't work very well unless your index is 100% designed just for the autosuggest ... if it's an index about products, and you're just using one of the fields for autosuggest, you aren't going to get good autosuggest results because the same word is going to appear in multiple products. What you need is an index of *words* that you want to autosuggest, with fields indicating how important those words are that you can use in a function query (this replaces the term freq that TermsComponent would use). The fact that your test field is multivalued and stores wildly different things in each doc is an example of what I mean.

I am using Solr to index biological annotations about proteins (which are my documents). There is no tokenization or special analysis of the annotation text strings, as they are not free text; each annotation is a single token. Also, for the purpose of my auto-suggest and searching there are actually no different types of annotations; that's why they all go into the same multivalued field for each protein document. I want to use the auto-suggest and search to help biologists (who know the annotation terminology) find all the protein documents with the annotation they are thinking of, and to suggest what is available as they type. The thing is that in my field letter case can be important in defining the meaning of an annotation, but the biologist might not remember the exact case. Therefore I want them to be able to type in whatever case, and the auto-suggest will pull up, as they type, annotations with the correct case to assist them. Let's just take the fundamental question, independent of any example: is it possible to do a case-insensitive prefix search using faceting (to get the term suggestions) that also returns the original mixed case of *all* those terms listed in lowercase in the facet list?
In the only other post I saw in this forum on this topic, a user seemed to think this was easily doable, but I don't think they actually tried it, because the faceted search doesn't seem possible; you run into all these problems. It just isn't something Solr/Lucene can actually do the way it is organized.

hossman wrote: Have you considered the possibility of just indexing the lowercase value concatenated with the regular case value using a special delimiter, and then returning to your TermsComponent based solution? Index PowerPoint as powerpoint|PowerPoint and just split on the | character when you get the data back from your prefix based term lookup.

I think this is a good workaround, will definitely try it! leandro
--
View this message in context: http://old.nabble.com/how-to-do-auto-suggest-case-insensitive-match-and-return-original-case-field-values-tp26636365p2670.html
Sent from the Solr - User mailing list archive at Nabble.com.
do copyField's need to exist as Fields?
Hello! (solr newbie alert) I want to pass 4 fields into Solr:

1. id (unique)
2. title
3. subtitle
4. body

but only want to index and store 2:

1. id (unique)
2. text (copyField of id, title, subtitle, body)

The search then searches on text, and returns only matching id's. When I set up the 2 fields, and the copyFields, it doesn't seem to work. I'm guessing for a copyField to work you need to have fields with the same name already set. Is there a different way I should be setting it up to achieve the above?? regan -- View this message in context: http://old.nabble.com/do-copyField%27s-need-to-exist-as-Fields--tp26701706p26701706.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: java.lang.NumberFormatException: For input string:
: its strange i had a dismaxhandler and it had an empty value for ps field : i added a default value like 100 and the error disappeared. I really wish the java compiler had an option so we could say when compiling our code, treat this list of unchecked exceptions like checked exceptions so we could prevent code that doesn't catch NumberFormatException from ever getting committed. I've got a patch that will improve the error message on this in the future... https://issues.apache.org/jira/browse/SOLR-1635 : SEVERE: java.lang.NumberFormatException: For input string: : at : java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) : at java.lang.Integer.parseInt(Integer.java:468) : at java.lang.Integer.valueOf(Integer.java:553) -Hoss
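Until SOLR-1635 lands, client code feeding parameters like dismax's ps can guard against this itself. A defensive-parsing sketch (the method and default of 100 mirror the thread but are illustrative, not Solr internals):

```java
// An empty string makes Integer.parseInt throw NumberFormatException,
// exactly the failure described above; fall back to a default instead.
public class ParamParse {
    static int parseOrDefault(String value, int def) {
        if (value == null || value.trim().isEmpty()) {
            return def;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return def;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("", 100));   // 100, no exception
        System.out.println(parseOrDefault("42", 100)); // 42
    }
}
```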
Re: do copyField's need to exist as Fields?
regany wrote: Is there a different way I should be setting it up to achieve the above?? Think I figured it out. I set up the fields so they are present, but get ignored except for the text field, which gets indexed...

<field name="id" type="text" indexed="true" stored="true" multiValued="false" required="true" />
<field name="title" stored="false" indexed="false" multiValued="true" type="text" />
<field name="subtitle" stored="false" indexed="false" multiValued="true" type="text" />
<field name="body" stored="false" indexed="false" multiValued="true" type="text" />
<field name="text" type="text" indexed="true" stored="false" multiValued="true" />

and then copyField the first 4 fields to the text field:

<copyField source="id" dest="text" />
<copyField source="title" dest="text" />
<copyField source="subtitle" dest="text" />
<copyField source="body" dest="text" />

Seems to be working!? :drunk: -- View this message in context: http://old.nabble.com/do-copyField%27s-need-to-exist-as-Fields--tp26701706p26702224.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr usage with Auctions/Classifieds?
hello! just wondering if anyone is using Solr as their search for an auction / classified site, and if so how have you managed your setup in general? ie. searching against listings that may have expired etc. regan -- View this message in context: http://old.nabble.com/Solr-usage-with-Auctions-Classifieds--tp26702828p26702828.html Sent from the Solr - User mailing list archive at Nabble.com.
Multiple Facet prefixes on the same facet field in one request?
Hey all, Is there any way in Solr 1.4/1.5 to perform multiple facet prefixes on the same facet field in one request? Ex. On field 'Foo' I want to perform a facet prefix of A* and B*, so I can get a facet response of all terms prefixed with A and all terms prefixed with B, either grouped together in the same facet result list or in separate facet lists labeled by the prefix. Currently, I perform one request per facet prefix, and I am hoping that there is some cryptic way using local params that I am missing that will allow me to do this. Robert. -- View this message in context: http://old.nabble.com/Multiple-Facet-prefixes-on-the-same-facet-field-in-one-request--tp26702997p26702997.html Sent from the Solr - User mailing list archive at Nabble.com.
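For reference, the one-request-per-prefix workaround described above looks like this (host, port, and query are assumptions; `facet.prefix` is the standard faceting parameter):

```
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=Foo&facet.prefix=A
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=Foo&facet.prefix=B
```

The two facet lists then have to be merged (or kept separate) client-side.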
Packaging installing SOLR on linux
Hello, At the risk of asking a highly general question, can anybody give me pointers or best practices on how best one can package SOLR and its associated files as a Linux rpm, so that this core/instance can be ported to multiple instances? If anybody has experience working on such a system, the knowledge will be very useful. With Regards Sri -- View this message in context: http://old.nabble.com/Packaging---installing-SOLR-on-linux-tp26703295p26703295.html Sent from the Solr - User mailing list archive at Nabble.com.
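Not an official recipe, but a minimal RPM .spec skeleton along these lines is a common starting point — every name, path, and version below is an assumption, not a tested package:

```
Name:           mysolr
Version:        1.4.0
Release:        1
Summary:        Solr war and core configuration for deployment under Tomcat
License:        ASL 2.0
BuildArch:      noarch

%description
Packages a Solr war file and its conf/ directory so the same
core/instance can be installed identically on multiple machines.

%install
mkdir -p %{buildroot}/opt/solr/conf
cp -r %{_sourcedir}/conf/* %{buildroot}/opt/solr/conf/
cp %{_sourcedir}/apache-solr-1.4.0.war %{buildroot}/opt/solr/

%files
/opt/solr
```

The per-host pieces (solr.solr.home, dataDir, ports) are usually kept out of the package and supplied by the container configuration.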
Re: Replicating multiple cores
Yes. I'd highly recommend using the Java replication though. Is there a reason? I understand it's new etc, however I think one issue with it is it's somewhat non-native access to the filesystem. Can you illustrate a real world advantage other than the enhanced admin screens? On Mon, Dec 7, 2009 at 11:13 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Dec 8, 2009 at 11:48 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: If I've got multiple cores on a server, I guess I need multiple rsyncd's running (if using the shell scripts)? Yes. I'd highly recommend using the Java replication though. -- Regards, Shalin Shekhar Mangar.
Solr Cell and Spellchecking.
Following Erik Hatcher's post about using SolrCell and acts_as_solr { http://www.lucidimagination.com/blog/2009/02/17/acts_as_solr_cell/ }, I have been able to index a rich document stream and retrieve its id. No worries. However, I have the SpellCheckComponent set up to build on commit (buildOnCommit=true). Alas, the rich document text is not being added to the spellchecker dictionary. Is there something special I need to do within the SolrConfig.xml or within the acts_as_solr ruby classes? - thanks in advance for any ideas - Mike Boyle
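For reference, a spellchecker dictionary is built from one source field named in solrconfig.xml, so the extracted (Solr Cell) text has to land in that field. A sketch of the configuration involved — component, spellchecker, and field names are assumptions:

```xml
<!-- Sketch only: the dictionary is built from the "text" field, so the
     extracted rich-document content must be indexed (or copyField'ed)
     into that same field for it to appear in suggestions. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">text</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```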
bool default - if missing when updating uses current or default value?
hello, if I have a boolean fieldType (solr.BoolField) with a default value of true, and I insert a new document, I understand that the boolean value will be set to TRUE. But if I update an existing document, and I don't pass in a value for the boolean field, will Solr keep the existing boolean value unchanged, or will it set the boolean value using the default again - i.e. true? regan -- View this message in context: http://old.nabble.com/bool-default---if-missing-when-%22updating%22-uses-current-or-default-value--tp26703630p26703630.html Sent from the Solr - User mailing list archive at Nabble.com.
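The setup being asked about looks like this in schema.xml (the field name here is hypothetical; `default` is the standard schema attribute):

```xml
<!-- a boolean field with a schema-level default; "visible" is an assumed name -->
<field name="visible" type="boolean" indexed="true" stored="true" default="true"/>
```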
Re: Multiindexing
A core is one index. I think you mean: 3-5 indexes in different cores. Since you want to search across them, they should have the same schema. There is a feature called Distributed Search that searches across multiple indexes. There is no administration support for indexing parts of one data set into multiple indexes. You have to set up all solr instances and index parts of the data into each one with your own scripting. Does this help? On 12/7/09, Jörg Agatz joerg.ag...@googlemail.com wrote: Hi Users.. I need help with multi-indexing in Solr. I want one core, and 3 to 5 different indices, so I can search simultaneously in all or in some of them. I found the help in the wiki, but it doesn't help. http://wiki.apache.org/solr/MultipleIndexes?highlight=%28multi%29 There is nothing there about multi-indexing in Solr, and the Solr 1.4 book also shows no way to use more than one index in one core/instance? King -- Lance Norskog goks...@gmail.com
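Lance's Distributed Search suggestion boils down to listing the per-core endpoints in the shards parameter of a single query; a sketch (hosts and core names are assumptions):

```
http://localhost:8983/solr/core0/select?q=foo&shards=localhost:8983/solr/core0,localhost:8983/solr/core1,localhost:8983/solr/core2
```

Each shard entry is host:port/path without the http:// scheme, and the query is fanned out to every listed core with the merged results returned as one response.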
Re: question about schemas
I don't know. The common way to do this in Solr is the full denormalization technique, but that blows up in this case. This is not an easy problem space to implement in Solr. Data warehousing star schema techniques may be more appropriate. On 12/7/09, solr-user solr-u...@hotmail.com wrote: Lance Norskog-2 wrote: You can make a separate facet field which contains a range of buckets: 10, 20, 50, or 100 means that the field has a value 0-10, 11-20, 21-50, or 51-100. You could use a separate filter query with values for these buckets. Filter queries are very fast in Solr 1.4 and this would limit your range query execution to documents which match the buckets. Lance, I am afraid that I do not see how to use this suggestion. Which of the three (four?) suggested schemas would I be using? How would these range facets prevent the potential issues I found such as getting product facets instead of customer facets, or having very large numbers of ANDs and ORs, and so forth. -- View this message in context: http://old.nabble.com/question-about-schemas-tp26600956p26679922.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
Re: Solr Search in stemmed and non stemmed mode
Short answer: the standard query handler is right for carefully designed queries. The dismax query handler is right for putting a 'search box' in a web page for regular users. On 12/7/09, khalid y kern...@gmail.com wrote: Thanks, I'll read the mail archive. Your suggestion is like mine but without the DisMax handler. I'm going to read about this handler. I have one field text and another text_unstemmed where I copy all other fields. I'm writing my custom query handler which checks if quotes exist and switches to the right field. Going to read... Thanks 2009/12/7 Erick Erickson erickerick...@gmail.com Try searching the mail archive for stemmer exact match or similar, this has been discussed multiple times and you'll get more complete discussions way faster One suggestion is to use two fields, one for the stemmed version and one for the original, then use whichever field you need via the DisMax handler (more detail in the mail archive). Best Erick On Mon, Dec 7, 2009 at 10:02 AM, khalid y kern...@gmail.com wrote: Hi !! I'm looking for a way to have two indexes in solr, one stemmed and another non stemmed. Why? It's simple :-) My users can do queries like: - banking marketing = it returns all documents matching bank*** and market*** - "banking" marketing = it returns all documents matching banking (exactly) and market*** The second request needs me to be able to switch between stemmed and non stemmed when the user writes a keyword with quotes. The optimal solution is: solr can gracefully mix results from stemmed and non stemmed indexes, with a good score calculation etc... The near optimal solution is: if solr sees quotes, it switches into non stemmed mode for all keywords in the query. I have an idea but I prefer to listen to the community voice before proposing it. I'll expose it in my next post. If someone has a graceful idea to do this crap :-) Thanks -- Lance Norskog goks...@gmail.com
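Erick's two-field suggestion, as a schema.xml sketch (field and type names are assumptions, not a tested schema):

```xml
<!-- one stemmed field for loose matching, one unstemmed copy for exact/quoted terms -->
<field name="text" type="text_stemmed" indexed="true" stored="false" multiValued="true"/>
<field name="text_exact" type="text_unstemmed" indexed="true" stored="false" multiValued="true"/>
<!-- copyField copies the raw source value before analysis, keeping both in sync -->
<copyField source="text" dest="text_exact"/>
```

Quoted terms are then directed at text_exact and unquoted terms at text — either by a custom handler as khalid describes, or by weighting both fields in DisMax's qf.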
Re: search on tomcat server
Solr comes with an example solr installation in the example/ directory. Run this, look at the README.txt file, index the xml files in example/exampledocs, and do queries like 'disk' and 'memory'. And read example/conf/schema.xml and example/conf/solrconfig.xml. Most of the details of what solr does and how to set it up will be clear. On 12/7/09, Sascha Szott sz...@zib.de wrote: Hi Jill, just to make sure your index contains at least one document, what is the output of http://localhost:8080/solr/select?q=*:*&debugQuery=true&echoParams=all Best, Sascha Jill Han wrote: In fact, I just followed the instructions titled as Tomcat On Windows. Here are the updates on my computer 1. -Dsolr.solr.home=C:\solr\example 2. changed dataDir to <dataDir>C:\solr\example\data</dataDir> in solrconfig.xml at C:\solr\example\conf 3. created solr.xml at C:\Tomcat 5.5\conf\Catalina\localhost:

<?xml version="1.0" encoding="utf-8"?>
<Context docBase="c:/solr/example/apache-solr-1.3.0.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="c:/solr/example" override="true"/>
</Context>

I restarted Tomcat, went to http://localhost:8080/solr/admin/, entered video in the Query String field, and got:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="rows">10</str>
      <str name="start">0</str>
      <str name="indent">on</str>
      <str name="q">video</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0" />
</response>

My questions are 1. is the setting correct? 2. where does solr start to search for words entered in the Query String field? 3. how can I make the result page look like a general search result page - e.g. showing not found, or a URL if found - instead of returning XML? 
Thanks a lot for your help, Jill -Original Message- From: William Pierce [mailto:evalsi...@hotmail.com] Sent: Friday, December 04, 2009 12:56 PM To: solr-user@lucene.apache.org Subject: Re: search on tomcat server Have you gone through the solr tomcat wiki? http://wiki.apache.org/solr/SolrTomcat I found this very helpful when I did our solr installation on tomcat. - Bill -- From: Jill Han jill@alverno.edu Sent: Friday, December 04, 2009 8:54 AM To: solr-user@lucene.apache.org Subject: RE: search on tomcat server I went through all the links on http://wiki.apache.org/solr/#Search_and_Indexing and still have no clue how to proceed. 1. do I have to do some implementation in order to get solr to search docs on the tomcat server? 2. if I have files, such as .doc, .docx, .pdf, .jsp, .html, etc. under windows xp, c:/tomcat/webapps/test1, /webapps/test2, what should I do to make solr search those directories? 3. since I am using tomcat, instead of jetty, is there any demo that shows the solr searching features, and a real search result? Thanks, Jill -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Monday, November 30, 2009 10:40 AM To: solr-user@lucene.apache.org Subject: Re: search on tomcat server On Mon, Nov 30, 2009 at 9:55 PM, Jill Han jill@alverno.edu wrote: I got solr running on the tomcat server, http://localhost:8080/solr/admin/ After I enter a search word, such as, solr, then hit the Search button, it will go to http://localhost:8080/solr/select/?q=solr&version=2.2&start=0&rows=10&indent=on and display:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="rows">10</str>
      <str name="start">0</str>
      <str name="indent">on</str>
      <str name="q">solr</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0" />
</response>

My question is what is the next step to search files on the tomcat server? Looks like you have not added any documents to Solr. See the Indexing Documents section at http://wiki.apache.org/solr/#Search_and_Indexing -- Regards, Shalin Shekhar Mangar. -- Lance Norskog goks...@gmail.com
Re: Replicating multiple cores
On Wed, Dec 9, 2009 at 6:14 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Yes. I'd highly recommend using the Java replication though. Is there a reason? I understand it's new etc, however I think one issue with it is it's somewhat non-native access to the filesystem. Can you illustrate a real world advantage other than the enhanced admin screens? Complexity is the main problem w/ rsync based replication. You have to manage so many processes and monitor them separately. The other problem is managing snapshots. These snapshots need to be cleaned up every now and then, and you do not have enough info on what is happening or has happened. On Mon, Dec 7, 2009 at 11:13 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Dec 8, 2009 at 11:48 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: If I've got multiple cores on a server, I guess I need multiple rsyncd's running (if using the shell scripts)? Yes. I'd highly recommend using the Java replication though. -- Regards, Shalin Shekhar Mangar. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Re: Replicating multiple cores
Complexity is the main problem I agree, replicating multiple cores otherwise means multiple rsyncd processes, and true enough that management of shell scripts multiplies in complexity. 2009/12/8 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: On Wed, Dec 9, 2009 at 6:14 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Yes. I'd highly recommend using the Java replication though. Is there a reason? I understand it's new etc, however I think one issue with it is it's somewhat non-native access to the filesystem. Can you illustrate a real world advantage other than the enhanced admin screens? Complexity is the main problem w/ rsync based replication. You have to manage so many processes and monitor them separately. The other problem is managing snapshots. These snapshots need to be cleaned up every now and then, and you do not have enough info on what is happening or has happened. On Mon, Dec 7, 2009 at 11:13 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Dec 8, 2009 at 11:48 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: If I've got multiple cores on a server, I guess I need multiple rsyncd's running (if using the shell scripts)? Yes. I'd highly recommend using the Java replication though. -- Regards, Shalin Shekhar Mangar. -- - Noble Paul | Systems Architect| AOL | http://aol.com
Enumerating wildcard terms
Is it possible to enumerate all terms that match a specified wildcard filter term, similar to Lucene's WildcardTermEnum API? For example, if I search abc* then I should be able to access all the terms abc1, abc2, abc3... that exist in the index. What would be the best approach to achieve this functionality? -- Nipen Mark
Re: Enumerating wildcard terms
Mark, The TermsComponent should do the trick for you. http://wiki.apache.org/solr/TermsComponent Erik On Dec 9, 2009, at 7:46 AM, Mark N wrote: Is it possible to enumerate all terms that match a specified wildcard filter term, similar to Lucene's WildcardTermEnum API? For example, if I search abc* then I should be able to access all the terms abc1, abc2, abc3... that exist in the index. What would be the best approach to achieve this functionality? -- Nipen Mark
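For prefix enumeration specifically, the solrconfig.xml wiring looks roughly like this (the handler and component names are assumptions; `terms.fl` and `terms.prefix` are the documented TermsComponent parameters):

```xml
<!-- register the component and expose it on a dedicated handler -->
<searchComponent name="termsComponent" class="solr.TermsComponent"/>
<requestHandler name="/terms" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>termsComponent</str>
  </arr>
</requestHandler>
```

A prefix lookup would then be a request like /terms?terms.fl=myfield&terms.prefix=abc, returning the matching indexed terms abc1, abc2, abc3...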
RE: do copyField's need to exist as Fields?
Hi Regan, Something I noticed on your setup... The ID field in your setup I assume to be your unique ID for the book or journal (the ISSN or something). Try making this a string, as text is not the ideal field type to use for unique IDs:

<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true" />

Congrats on figuring out SOLR fields - I suggest getting the SOLR 1.4 Book.. It really saved me a 1000 questions on this mailing list :) Jaco Olivier -Original Message- From: regany [mailto:re...@newzealand.co.nz] Sent: 09 December 2009 00:48 To: solr-user@lucene.apache.org Subject: Re: do copyField's need to exist as Fields? regany wrote: Is there a different way I should be setting it up to achieve the above?? Think I figured it out. I set up the fields so they are present, but get ignored except for the text field, which gets indexed...

<field name="id" type="text" indexed="true" stored="true" multiValued="false" required="true" />
<field name="title" stored="false" indexed="false" multiValued="true" type="text" />
<field name="subtitle" stored="false" indexed="false" multiValued="true" type="text" />
<field name="body" stored="false" indexed="false" multiValued="true" type="text" />
<field name="text" type="text" indexed="true" stored="false" multiValued="true" />

and then copyField the first 4 fields to the text field:

<copyField source="id" dest="text" />
<copyField source="title" dest="text" />
<copyField source="subtitle" dest="text" />
<copyField source="body" dest="text" />

Seems to be working!? :drunk: -- View this message in context: http://old.nabble.com/do-copyField%27s-need-to-exist-as-Fields--tp26701706p26702224.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to do auto-suggest case-insensitive match and return original case field values
Uri Boness wrote: Just updated SOLR-1625 to support regexp hints. https://issues.apache.org/jira/browse/SOLR-1625 Cheers, Uri This is perfect, exactly what is needed to make this functionality possible. Is the patch already in trunk? thanks, leandro -- View this message in context: http://old.nabble.com/how-to-do-auto-suggest-w--case-insensitive-search-and-suggesting-original-mixed-case-field-values-tp26636365p26706241.html Sent from the Solr - User mailing list archive at Nabble.com.