Wildcard Search
Hi, I am facing a weird issue while searching. I am searching for the word *system*, and it displays all the records which contain system, systems etc. But when I try to search *systems*, it only returns those records which have systems-, systems/ etc. It is treating the wildcard as one or more characters and not zero characters, so it is not returning records which have systems as a single word. Is there any way to resolve this? Please suggest. Thanks, Amit Garg -- View this message in context: http://www.nabble.com/Wildcard-Search-tp23440795p23440795.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is it possible to writing solr result on disk from the server side?
Thanks Paul! Yes, I saw it. In fact, if I understand correctly, it could be a solution, but a little bit too complicated for what I want to do. Currently my client is not in Java, and I still need a client/server model because it's a web application and Solr has to keep running and waiting for queries continuously. So even if it seems possible with EmbeddedSolrServer, sharing the XML results in a file will be a faster solution for me, also with regard to development time.

Noble Paul നോബിള് नोब्ळ्-2 wrote: did you consider using an EmbeddedSolrServer?

On Thu, May 7, 2009 at 8:25 PM, arno13 arnaud.gaudi...@healthonnet.org wrote: Do you know if it's possible to write Solr results directly to a hard disk on the server side, rather than using an HTTP connection to transfer the results? While the query time is very fast in Solr, I want to do this because of the time taken to transfer the results between the client and the Solr server when you have a lot of 'rows'. For instance, for 10'000 rows the query time could be 50 ms, but it takes 19 s to get the results from the server. As my client and server are on the same system, I could get the results faster directly from the hard disk (or better, a RAM disk). Is it possible to configure Solr for that? Regards,
Organizing multiple searchers around overlapping subsets of data
I have one type of document, but different searchers, each of which is interested in a different subset of the documents, which are different configurations of TV channels {A,B,C,D}.

* Application S1 is interested in all channels, i.e. {A,B,C,D}.
* Application S2 is interested in {A,B,C}.
* Application S3 is interested in {A,C,D}.
* Application S4 is interested in {B,D}.

As can be seen from this simplified example, the subsets are not disjoint, but do have considerable overlaps. The total data volume is only about 200 MB. There are four searchers, and they may become ten or a dozen. The set elements an application may or may not be interested in (the channels, {A,B,C,D} in this example) number not just four but about 150, each of which has about 1000 documents.

What is the best way to organize this?

(a) Set up a different core for each application, i.e. going multi-core, thereby incurring a good deal of redundancy but simplifying searches?
(b) Apply filter queries to select documents from only, say, 60, 80 or 110 out of 150 channels?
(c) Something else I'm not aware of?

Am I right in suspecting that multi-core makes less sense with increasing overlap and hence redundancy? Michael Ludwig
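For option (b), each application's channel set can be expressed as an ordinary filter query against a single shared index; Solr caches each distinct fq in its filterCache, so even a filter over 60-110 channels is computed once and reused. A minimal sketch, assuming each document carries an indexed field named "channel" (a hypothetical name, not from the original post):

```ruby
# Hedged sketch of option (b): one shared index, one filter query per
# application. The field name "channel" is illustrative.
def channel_filter(channels)
  # Produces e.g. "channel:(A OR B OR C)" for use as an fq parameter.
  "channel:(#{channels.join(' OR ')})"
end
```

Application S2 would then send its queries with fq=channel:(A OR B OR C) while sharing the same 200 MB index with every other application, avoiding the multi-core redundancy entirely.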
Re: StatsComponent and 1.3
I'm guessing that manipulating the client end, acts_as_solr, is an easier approach than backporting server-side functionality, especially as you will have to migrate forward at some point. Out of curiosity, which version of acts_as_solr are you using? The plugin has moved homes a couple of times, and I have heard and found that the version by Mathias Meyer at http://github.com/mattmatt/acts_as_solr/tree/master is the best. I've used it with 1.4 trunk with no issues, and Mathias has been very responsive. Eric

On May 7, 2009, at 10:25 PM, David Shettler wrote: Foreword: I'm not a Java developer :) OSVDB.org and datalossdb.org make use of Solr pretty extensively via acts_as_solr. I found myself with a real need for some of the StatsComponent stuff (mainly the sum feature), so I pulled down a nightly build and played with it. StatsComponent proved perfect, but... the nightly build output seems to be different, and thus incompatible with acts_as_solr. Now, I realize this is more or less an acts_as_solr issue, but is it possible, with some degree of effort (obviously), for me to essentially port some of the functionality of StatsComponent to 1.3 myself? It's that, or waiting for 1.4 to come out and someone developing support for it in acts_as_solr, or fixing what I have myself so acts_as_solr works with the output. I'm just trying to gauge the easiest solution :) Any feedback or suggestions would be grand. Thanks, Dave Open Security Foundation

- Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal
Re: Core Reload issue
As long as your indexed documents contain the stop words, you will continue to see the stop words in the results.

On Fri, May 8, 2009 at 11:24 AM, Sagar Khetkade sagar.khetk...@hotmail.com wrote: From my understanding, re-indexing the documents is a different thing. If you have the stop word filter for a field type, say text, then after reloading the core, if I type in a query which is a stop word only, it would get parsed by the stop word filter, which would eventually not search against the index. But in my case I am getting results containing the stop word; hence the issue. ~Sagar

From: noble.p...@gmail.com Date: Tue, 5 May 2009 10:09:29 +0530 Subject: Re: Core Reload issue To: solr-user@lucene.apache.org If you change the conf files and re-index the documents, the change must be reflected. Are you sure you re-indexed?

On Tue, May 5, 2009 at 10:00 AM, Sagar Khetkade sagar.khetk...@hotmail.com wrote: Hi, I came across a strange problem while reloading the core in a multicore scenario. In the config of one of the cores I am making changes to the synonym and stopword files and then reloading the core. The core gets reloaded, but the changes in the stopword and synonym files do not get reflected when I query. The filters for index and query are the same. I face this problem even if I re-index the documents. But when I restart the servlet container in which Solr is embedded, the problem does not resurface. My ultimate goal is/was to have the changes made to the text files inside the config folder take effect. Is this the expected behaviour or some problem on my side? Could anyone suggest a possible workaround? Thanks in advance! Regards, Sagar Khetkade

-- Noble Paul | Principal Engineer | AOL | http://aol.com
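For reference, the behaviour described here follows from where analysis happens: a stop word filter configured on the index and query analyzers only affects newly parsed queries and newly indexed documents. A hypothetical schema.xml fragment (field type and file names are illustrative, not Sagar's actual config):

```xml
<!-- Hedged sketch: stop words applied at both index and query time.
     Reloading the core re-reads stopwords.txt for future queries and
     updates, but terms already written to the index stay searchable
     until the documents are re-indexed. -->
<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```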
Re: Is it possible to writing solr result on disk from the server side?
Solr does not need any special configuration to do this. Just fire your query once and save the results XML/JSON into a file or in memory. When you need them again, just read it back from disk/memory.

On Fri, May 8, 2009 at 1:21 PM, arno13 arnaud.gaudi...@healthonnet.org wrote: Thanks Paul! Yes, I saw it. In fact, if I understand correctly, it could be a solution, but a little bit too complicated for what I want to do. Currently my client is not in Java, and I still need a client/server model because it's a web application and Solr has to keep running and waiting for queries continuously. So even if it seems possible with EmbeddedSolrServer, sharing the XML results in a file will be a faster solution for me, also with regard to development time.

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: What are the Unicode encodings supported by Solr?
KK wrote: I'd like to know about the different Unicode [or any other?] encodings supported by Solr for posting docs [through Solrj in my case]. Is it just UTF-8 and UCN that are supported, or are other character encodings like NCR (decimal), NCR (hex) etc. supported as well?

Any numerical character reference (NCR), decimal or hexadecimal, is valid UTF-8 as long as it maps to a valid Unicode character.

I found that for most of the pages the encoding is UTF-8 [in this case searching works fine], but for others the encoding is some other character encoding [like NCR (dec), NCR (hex), or maybe something else; I don't have much idea about this].

Whatever the encoding is, your application needs to know what it is when dealing with bytes read from the network.

So when I fetch the page content through Java methods using InputStreamReaders, and after stripping various tags, what I obtain is raw text with some encoding not supported by Solr.

Did you make sure not to rely on your platform default encoding (Charset) when constructing the InputStreamReader? If in doubt, take a look at the InputStreamReader constructors. Michael Ludwig
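As a concrete illustration of the point about NCRs: numeric character references are just markup for Unicode code points, so they can be decoded into plain UTF-8 text on the client before posting to Solr. A minimal sketch (the helper name is made up, not part of Solr or Solrj):

```ruby
# Hedged sketch: turn decimal ("&#27979;") and hexadecimal ("&#x6D4B;")
# numeric character references into UTF-8 characters before indexing.
# Array#pack('U') encodes a Unicode code point as UTF-8.
def decode_ncrs(text)
  text.gsub(/&#x([0-9a-fA-F]+);/) { [$1.to_i(16)].pack('U') }
      .gsub(/&#(\d+);/)           { [$1.to_i].pack('U') }
end
```

After this pass the document is plain UTF-8, which is what should be handed to the InputStreamReader/Solr pipeline.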
Re: Solr spring application context error
Paul/Erik, thanks for your reply. I have the jar file containing the plugin code and the applicationContext.xml in the solr/home/lib directory. It is instantiating the plugin code, but it is not loading the applicationContext.xml file from the solr/home/lib dir. But when I copied the jar file containing the applicationContext.xml file into the solr.war file's WEB-INF/lib dir and placed the solr.war file in Tomcat's webapps dir, it worked. As Erik said, Solr may only load the xml from the solr.war file? Please let me know if there is any way to do this by placing the applicationContext.xml file in solr/home/lib. Thanks, Raju

Noble Paul നോബിള് नोब्ळ्-2 wrote: a point to keep in mind is that all the plugin code and everything else must be put into the solrhome/lib directory. Where have you placed the file com/mypackage/applicationContext.xml?

On Fri, May 8, 2009 at 12:19 AM, Raju444us gudipal...@gmail.com wrote: I have configured Solr using Tomcat. Everything works fine. I overrode QParserPlugin and configured it. The overridden QParserPlugin has a dependency on another project, say project1, so I made a jar of that project and copied the jar to the solr/home lib dir. The project1 project uses Spring. It has a factory class which loads the beans. I am using this factory class in QParserPlugin to get a bean. When I start my Tomcat, the factory class loads fine, but the problem is it's not loading the beans, and I am getting the exception org.springframework.beans.factory.BeanDefinitionStoreException: IOException parsing XML document from class path resource [com/mypackage/applicationContext.xml]; nested exception is java.io.FileNotFoundException: class path resource [com/mypackage/applicationContext.xml] cannot be opened because it does not exist. Do I need to do something else? Can anybody please help me?

Thanks, Raju

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Solr spring application context error
I've run into this in the past as well. It's fairly annoying. Anyone know why the limitation? Why aren't we passing the ClassLoader that's loading Solr classes as the parent to the lib dir plugin classloader? - Mark

Erik Hatcher wrote: This is probably because Solr loads its extensions from a custom class loader, but if that class then needs to access things from the classpath, it is only going to see the built-in WEB-INF/lib classes, not solr/home lib JAR files. Maybe there is a Spring way to point it at that lib directory also? This is the kinda pain we get, it seems, when reinventing a container, unfortunately. Erik

On May 7, 2009, at 2:49 PM, Raju444us wrote: I have configured Solr using Tomcat. Everything works fine. I overrode QParserPlugin and configured it. The overridden QParserPlugin has a dependency on another project, say project1, so I made a jar of that project and copied the jar to the solr/home lib dir. The project1 project uses Spring. It has a factory class which loads the beans. I am using this factory class in QParserPlugin to get a bean. When I start my Tomcat, the factory class loads fine, but the problem is it's not loading the beans, and I am getting the exception org.springframework.beans.factory.BeanDefinitionStoreException: IOException parsing XML document from class path resource [com/mypackage/applicationContext.xml]; nested exception is java.io.FileNotFoundException: class path resource [com/mypackage/applicationContext.xml] cannot be opened because it does not exist. Do I need to do something else? Can anybody please help me? Thanks, Raju

-- Mark http://www.lucidimagination.com
Re: Is it possible to writing solr result on disk from the server side?
It's what I do from the client side; however, I don't know how to do this from the server side (Solr). Sorry if I wasn't clear enough.

Noble Paul നോബിള് नोब्ळ्-2 wrote: Solr does not need any special configuration to do this. Just fire your query once and save the results XML/JSON into a file or in memory. When you need them again, just read it back from disk/memory.

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Wildcard Search
Are you by any chance stemming the field when you index? Erick

On Fri, May 8, 2009 at 2:29 AM, dabboo ag...@sapient.com wrote: Hi, I am facing a weird issue while searching. I am searching for the word *system*, and it displays all the records which contain system, systems etc. But when I try to search *systems*, it only returns those records which have systems-, systems/ etc. It is treating the wildcard as one or more characters and not zero characters, so it is not returning records which have systems as a single word. Is there any way to resolve this? Please suggest. Thanks, Amit Garg
Re: Wildcard Search
Yes, that's correct. I have applied EnglishPorterFilterFactory at index time as well. Do you think I should remove it and do the indexing again?

Erick Erickson wrote: Are you by any chance stemming the field when you index? Erick
Re: Wildcard Search
My *guess* is that what you're seeing is that wildcard searches are not analyzed, in this case not run through the stemmer. So your index only contains system and the funky variants (e.g. systems/). I don't really understand why you'd get systems/ in your index, but I'm assuming that your filter chain doesn't remove things like slashes. So you have system and systems/ in your index, but not systems due to stemming, so searching for systems* translates into something like systems OR systems/ OR ..., and since no documents contain systems, you don't get them as hits.

All that said, you need to revisit your indexing parameters to make what happens fit your expectations; you may need to introduce filters that remove odd stuff like slashes. I'd advise getting a copy of Luke and pointing it at your index in order to see what *really* gets put in it.

Best Erick

On Fri, May 8, 2009 at 9:25 AM, dabboo ag...@sapient.com wrote: Yes, that's correct. I have applied EnglishPorterFilterFactory at index time as well. Do you think I should remove it and do the indexing again?
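One common way to reconcile stemming with wildcard search is a parallel unstemmed field populated via copyField, with wildcard queries directed at that field only. A hypothetical schema.xml sketch (all field and type names here are illustrative, not from Amit's actual schema):

```xml
<!-- Hedged sketch: keep the stemmed field for ordinary search, and copy
     the raw text into an unstemmed, lowercased field used only for
     wildcard queries, so "systems" survives intact in the index.
     The fieldType belongs in <types>, field/copyField in <fields>. -->
<fieldType name="text_unstemmed" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- no EnglishPorterFilterFactory here -->
  </analyzer>
</fieldType>

<field name="body_wild" type="text_unstemmed" indexed="true" stored="false"/>
<copyField source="body" dest="body_wild"/>
```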
Re: Solr spring application context error
I am having the same problem. Please let me know if anyone finds an answer to this. Thank you, Sachin

markrmiller wrote: I've run into this in the past as well. It's fairly annoying. Anyone know why the limitation? Why aren't we passing the ClassLoader that's loading Solr classes as the parent to the lib dir plugin classloader? - Mark

Erik Hatcher wrote: This is probably because Solr loads its extensions from a custom class loader, but if that class then needs to access things from the classpath, it is only going to see the built-in WEB-INF/lib classes, not solr/home lib JAR files. Maybe there is a Spring way to point it at that lib directory also? This is the kinda pain we get, it seems, when reinventing a container, unfortunately. Erik

-- Mark http://www.lucidimagination.com
Re: bug? No highlighting results with dismax and q.alt=*:*
Possibly this issue is related: https://issues.apache.org/jira/browse/SOLR-825 Though it seems that might affect the standard handler, while what I'm seeing is more specific to the dismax handler. -Peter

On Thu, May 7, 2009 at 8:27 PM, Peter Wolanin peter.wola...@acquia.com wrote: For the Drupal Apache Solr Integration module, we are exploring the possibility of doing facet browsing - since we are using dismax as the default handler, this would mean issuing a query with an empty q and falling back to q.alt='*:*' or some other q.alt that matches all docs. However, I notice when I do this that we do not get any highlights back in the results despite defining a highlight alternate field. In contrast, if I force the standard request handler then I do get text back from the highlight alternate field:

select/?q=*:*&qt=standard&hl=true&hl.fl=body&hl.alternateField=body&hl.maxAlternateFieldLength=256

However, I then lose the nice dismax features of weighting the results using bq and bf parameters. So, is this a bug or the intended behavior? The relevant fragment of the solrconfig.xml is this:

<requestHandler name="partitioned" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="q.alt">*:*</str>
    <!-- example highlighter config, enable per-query with hl=true -->
    <str name="hl">true</str>
    <str name="hl.fl">body</str>
    <int name="hl.snippets">3</int>
    <str name="hl.mergeContiguous">true</str>
    <!-- instructs Solr to return the field itself if no query terms are found -->
    <str name="f.body.hl.alternateField">body</str>
    <str name="f.body.hl.maxAlternateFieldLength">256</str>
  </lst>
</requestHandler>

Full solrconfig.xml and other files: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/apachesolr/?pathrev=DRUPAL-6--1

-- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
Re: Backups using Java-based Replication (forced snapshot)
I was thinking the same last week, as I was tailoring the snapshooter.sh script. The data directory should be kept for the temp snapshot, as a way to ensure linking occurs on the same device. snapshooter.sh line 87:

name=${data_dir}/${snap_name}

I think only this needs to be configurable for the final move.

Grant Ingersoll-6 wrote: On the page http://wiki.apache.org/solr/SolrReplication, it says the following: Force a snapshot on master. This is useful to take periodic backups. Command: http://master_host:port/solr/replication?command=snapshoot This then puts the snapshot under the data directory. Perfectly reasonable thing to do. However, is it possible to have it take in a directory location and store the snapshot there? For instance, I may want to have it write to a specific directory that is being watched for backup data. Thanks, Grant -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
solr + wordpress
Somebody has written an article on integrating Solr with WordPress: http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/ -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: bug? No highlighting results with dismax and q.alt=*:*
I have experienced it before... maybe you can manage something similar to your q.alt using the params q and qf. Highlighting will work in that case (I sorted it out doing that).

Peter Wolanin-2 wrote: Possibly this issue is related: https://issues.apache.org/jira/browse/SOLR-825 Though it seems that might affect the standard handler, while what I'm seeing is more specific to the dismax handler. -Peter

On Thu, May 7, 2009 at 8:27 PM, Peter Wolanin peter.wola...@acquia.com wrote: For the Drupal Apache Solr Integration module, we are exploring the possibility of doing facet browsing - since we are using dismax as the default handler, this would mean issuing a query with an empty q and falling back to q.alt='*:*' or some other q.alt that matches all docs. However, I notice when I do this that we do not get any highlights back in the results despite defining a highlight alternate field. In contrast, if I force the standard request handler then I do get text back from the highlight alternate field: select/?q=*:*&qt=standard&hl=true&hl.fl=body&hl.alternateField=body&hl.maxAlternateFieldLength=256 However, I then lose the nice dismax features of weighting the results using bq and bf parameters. So, is this a bug or the intended behavior?

-- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
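The q/qf workaround suggested here amounts to keeping dismax but supplying an explicit query instead of falling through to q.alt. A hedged example request (parameter values are illustrative, not from the Drupal module's actual config):

```
select/?q=drupal&qf=body&defType=dismax&hl=true&hl.fl=body&f.body.hl.alternateField=body
```

With a non-empty q, the highlighter has query terms to work with, so both the bq/bf weighting and the highlighting apply.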
Re: preImportDeleteQuery
I'm using full-import, not delta-import. I tried it with delta-import, and it would work, except that I'm querying for a large number of documents, so I can't afford the cost of deltaImportQuery for each document. It sounds like $deleteDocId will work; I just need to update from 1.3 to trunk. Thanks!

Noble Paul നോബിള് नोब्ळ्-2 wrote: are you doing a full-import or a delta-import? for delta-import there is an option of deletedPkQuery which should meet your needs
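For reference, the deletion hook discussed above sits on the DataImportHandler entity. A hypothetical data-config.xml sketch (table and column names are made up):

```xml
<!-- Hedged sketch of a DIH entity. deletedPkQuery runs during
     delta-import and returns only the primary keys of rows whose
     documents should be removed, avoiding a per-document
     deltaImportQuery for the surviving rows. -->
<entity name="item" pk="id"
        query="SELECT id, title FROM item WHERE deleted = 0"
        deletedPkQuery="SELECT id FROM item WHERE deleted = 1">
  <field column="id" name="id"/>
  <field column="title" name="title"/>
</entity>
```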
Re: solr + wordpress
I actually wrote a plugin that integrates Solr with WordPress. http://www.mattweber.org/2009/04/21/solr-for-wordpress/ http://wordpress.org/extend/plugins/solr-for-wordpress/ https://launchpad.net/solr4wordpress Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 8, 2009, at 10:10 AM, Noble Paul നോബിള് नोब्ळ् wrote: Somebody has writte an articles on integrating Solr with wordpress http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/ -- - Noble Paul | Principal Engineer| AOL | http://aol.com
how to pronounce solr
Hi, My company is evaluating different open-source indexing and search software, and we are seriously considering Solr. One of my colleagues pronounces it differently than I do, and I have no basis for correcting him. Is Solr pronounced SOLerrr (emphasis on the first syllable), or pirate-like, SolAhhRrr (emphasis on the R)? This coworker has just come from a big meeting with various managers where the technology came up, and I'm afraid my battle over this very important matter may already have been lost. thank you, Charles
RE: Initialising of CommonsHttpSolrServer in Spring framwork
Ranjeeth, did you figure out how to do this? If yes, can you share how you did it? An example bean definition in XML would be helpful. --Sachin

Funtick wrote: Use a constructor and pass the URL parameter. Nothing Spring-related... Create a Spring bean with attributes 'MySolr', 'MySolrUrl', and an 'init' method... 'init' will create an instance of CommonsHttpSolrServer. Configure Spring...

I am using Solr 1.3 and Solrj as a Java client. I am integrating Solrj into the Spring framework and facing a problem: Spring is not initializing the CommonsHttpSolrServer class. How can I define this class to get an instance of SolrServer to invoke further methods on?
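A sketch of the wiring Funtick describes, passing the URL straight to the constructor rather than via a separate init method (the bean id and URL are illustrative):

```xml
<!-- Hypothetical Spring bean definition for SolrJ 1.3's
     CommonsHttpSolrServer; its constructor takes the Solr base URL. -->
<bean id="solrServer"
      class="org.apache.solr.client.solrj.impl.CommonsHttpSolrServer">
  <constructor-arg value="http://localhost:8983/solr"/>
</bean>
```

Any class that needs to query Solr can then have the solrServer bean injected instead of constructing CommonsHttpSolrServer itself.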
Re: how to pronounce solr
This is the funniest e-mail I've had all day. SOLer is the typical pronunciation, but I've heard solAR as well. It's the description of pirate-like that made me chuckle. -Sean

Charles Federspiel wrote: Hi, My company is evaluating different open-source indexing and search software, and we are seriously considering Solr. One of my colleagues pronounces it differently than I do, and I have no basis for correcting him. Is Solr pronounced SOLerrr (emphasis on the first syllable), or pirate-like, SolAhhRrr (emphasis on the R)? This coworker has just come from a big meeting with various managers where the technology came up, and I'm afraid my battle over this very important matter may already have been lost. thank you, Charles
Re: StatsComponent and 1.3
On May 7, 2009, at 10:25 PM, David Shettler wrote: I found myself with a real need for some of the StatsComponent stuff (mainly the sum feature), so I pulled down a nightly build and played with it. StatsComponent proved perfect, but... the nightly build output seems to be different, and thus incompatible with acts_as_solr. Could you give some more details on what seems different and incompatible with acts_as_solr? You can query the StatsComponent from Ruby using the solr-ruby library. Using the example from the wiki at http://wiki.apache.org/solr/StatsComponent , it points to http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&rows=0&indent=true

require 'solr'
solr = Solr::Connection.new
solr.send(Solr::Request::Select.new(:standard,
  :q => '*:*',
  :stats => true,
  'stats.field' => ['price', 'popularity'],
  :rows => 0))

Which outputs this (in irb; raw response trimmed for readability):

=> #<Solr::Response::Select:0x141692c
  @header={"QTime"=>1, "params"=>{"stats"=>"true", "qt"=>"standard", "stats.field"=>["price", "popularity"], "q"=>"*:*", "rows"=>"0", "wt"=>"ruby"}, "status"=>0},
  @data={"response"=>{"start"=>0, "docs"=>[], "numFound"=>26},
    "stats"=>{"stats_fields"=>{
      "price"=>{"min"=>0.0, "max"=>2199.0, "sum"=>5251.26995, "count"=>15, "missing"=>11, "sumOfSquares"=>6038619.160315, "mean"=>350.084664, "stddev"=>547.737557906113},
      "popularity"=>{"min"=>0.0, "max"=>10.0, "sum"=>90.0, "count"=>26, "missing"=>0, "sumOfSquares"=>628.0, "mean"=>3.4615384615384617, "stddev"=>3.5578731762756157}}},
    "responseHeader"=>{"QTime"=>1, "params"=>{"stats"=>"true", "qt"=>"standard", "stats.field"=>["price", "popularity"], "q"=>"*:*", "rows"=>"0", "wt"=>"ruby"}, "status"=>0}}>

Is it possible, with some degree of effort (obviously), for me to essentially port some of the functionality of StatsComponent to 1.3 myself? It's that, or waiting for 1.4 to come out and someone developing support for it into acts_as_solr, or fixing what I have in acts_as_solr to work with the new output. I'm just trying to gauge the easiest solution :) I'm unclear on what the discrepancies are, so not quite sure how to help just yet. As Eric asked, what version/branch of acts_as_solr are you using? Erik
Re: how to pronounce solr
On Fri, May 8, 2009 at 2:07 PM, Charles Federspiel charles.federsp...@gmail.com wrote: Hi, My company is evaluating different open-source indexing and search software and we are seriously considering Solr. One of my colleagues pronounces it differently than I do, and I have no basis for correcting him. Is Solr pronounced SOL-er (emphasis on the first syllable), or, pirate-like, sol-ARRR (emphasis on the R)? This coworker has just come from a big meeting with various managers where the technology came up, and I'm afraid my battle over this very important matter may already have been lost. Thank you, It's pronounced Solar. However you choose to pronounce solar, of course, is up to you or your regionalism. But that's what explains the sun logo and the people who make puns about solr energy ;). It's also the third question in the FAQ: http://wiki.apache.org/solr/FAQ#head-0076d43a3911cf40a231e9ecf7df5303ccee0dc7. Just in case you need documented proof to argue your point. Unless of course you wanted it to be a pirate word. Perhaps you should send him or her an eye-patch in case a correction in this matter would hurt your colleague's feelings. On second thought, maybe not. Jon Gorman
Re: JVM exception_access_violation
I updated to Java 6 update 13 and have been running problem-free for just over a month. I'll continue this thread if I run into any problems that seem to be related. Yonik Seeley-2 wrote: I assume that you're not using any Tomcat native libs? If you are, try removing them... if not (and the crash happened more than once in the same place), then it looks like a JVM bug rather than flaky hardware, and the easiest path forward would be to try the latest Java 6 (update 12).
Re: Is it possible to writing solr result on disk from the server side?
On Thu, May 7, 2009 at 10:55 AM, arno13 arnaud.gaudi...@healthonnet.org wrote: Do you know if it's possible to write Solr results directly to hard disk on the server side, rather than transferring them over an HTTP connection? If you have something like a CSV file (or any other file type that Solr accepts over HTTP), you can instruct that the body be read directly from disk instead. http://wiki.apache.org/solr/UpdateCSV -Yonik http://www.lucidimagination.com
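To make Yonik's suggestion concrete, here is a small sketch of how such a request could be constructed. The URL shape follows the UpdateCSV wiki page; the base URL and file path are illustrative assumptions, and `stream.file` only works when remote streaming is enabled in solrconfig.xml:

```python
from urllib.parse import urlencode

def csv_update_url(base_url, server_side_path):
    # "stream.file" asks Solr to read the file from the *server's* own disk,
    # so the CSV body is never transferred over HTTP.
    # (Requires enableRemoteStreaming="true" in solrconfig.xml.)
    params = urlencode({
        "stream.file": server_side_path,
        "stream.contentType": "text/csv;charset=utf-8",
        "commit": "true",
    })
    return base_url + "/update/csv?" + params

print(csv_update_url("http://localhost:8983/solr", "/data/books.csv"))
```

You would then fetch this URL with any HTTP client; only the small response travels over the wire, not the file itself.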
Re: Autocommit blocking adds? AutoCommit Speedup?
Any pointers to this newer, more concurrent behavior in Lucene? I can try an experiment where I downgrade the iwCommit lock to the iwAccess lock to allow updates to happen during commit. Would you expect that to work? Thanks for bootstrapping me on this. Jim Yonik Seeley-2 wrote: On Thu, May 7, 2009 at 8:37 PM, Jim Murphy jim.mur...@pobox.com wrote: Interesting. So is there a JIRA ticket open for this already? Any chance of getting it into 1.4? No ticket currently open, but IMO it could make it for 1.4. It's seriously kicking our butts right now. We write into our masters with ~50ms response times until we hit the autocommit; then add/update response time is 10-30 seconds. Ouch. It's probably been made a little worse lately since Lucene now does fsync on index files before writing the segments file that points to those files. A necessary evil to prevent index corruption. I'd be willing to work on submitting a patch given a better understanding of the issue. Great, go for it! -Yonik http://www.lucidimagination.com
Re: Autocommit blocking adds? AutoCommit Speedup?
On Fri, May 8, 2009 at 4:27 PM, Jim Murphy jim.mur...@pobox.com wrote: Any pointers to this newer more concurrent behavior in lucene? At the API level we care about, IndexWriter.commit() instead of close(). Also, we shouldn't have to worry about other parts of the code closing the writer on us, since things like deleteByQuery no longer need to close the writer to work. core.getSearcher()... if we don't lock until it's finished, then what could currently happen is that you could wind up with a newer version of the index than you thought you might. I think this should be fine though. We'd need to think about what type of synchronization may be needed for postCommit and postOptimize hooks too. Here's the relevant code:

iwCommit.lock();
try {
  log.info("start " + cmd);
  if (cmd.optimize) {
    openWriter();
    writer.optimize(cmd.maxOptimizeSegments);
  }
  closeWriter();
  callPostCommitCallbacks();
  if (cmd.optimize) {
    callPostOptimizeCallbacks();
  }
  // open a new searcher in the sync block to avoid opening it
  // after a deleteByQuery changed the index, or in between deletes
  // and adds of another commit being done.
  core.getSearcher(true, false, waitSearcher);

-Yonik http://www.lucidimagination.com
Re: no subject aka Replication Stall
2009/5/7 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: BTW, if the timeout occurs it resumes from the point where the failure happened. It retries 5 times before giving up. Sweet! I was just going to ask what happens on a timeout. Have you tested this out (say kill the master in the middle of replicating a big file)? -Yonik http://www.lucidimagination.com
Re: Autocommit blocking adds? AutoCommit Speedup?
Created issue: https://issues.apache.org/jira/browse/SOLR-1155 Jim Murphy wrote: Any pointers to this newer more concurrent behavior in lucene? I can try an experiment where I downgrade the iwCommit lock to the iwAccess lock to allow updates to happen during commit. Would you expect that to work? Thanks for bootstrapping me on this. Jim
Re: Autocommit blocking adds? AutoCommit Speedup?
Yonik Seeley-2 wrote: ...your code snippet elided and edited below... Don't take this code as correct (or even compiling), but is this the essence? I moved shared access to the writer inside the read lock and kept the other non-commit bits under the write lock. I'd need to rethink the locking in a more fundamental way, but is this close to the idea?

public void commit(CommitUpdateCommand cmd) throws IOException {
  if (cmd.optimize) {
    optimizeCommands.incrementAndGet();
  } else {
    commitCommands.incrementAndGet();
  }

  Future[] waitSearcher = null;
  if (cmd.waitSearcher) {
    waitSearcher = new Future[1];
  }

  boolean error = true;

  iwCommit.lock();
  try {
    log.info("start " + cmd);
    if (cmd.optimize) {
      closeSearcher();
      openWriter();
      writer.optimize(cmd.maxOptimizeSegments);
    }
  } finally {
    iwCommit.unlock();
  }

  iwAccess.lock();
  try {
    writer.commit();
  } finally {
    iwAccess.unlock();
  }

  iwCommit.lock();
  try {
    callPostCommitCallbacks();
    if (cmd.optimize) {
      callPostOptimizeCallbacks();
    }
    // open a new searcher in the sync block to avoid opening it
    // after a deleteByQuery changed the index, or in between deletes
    // and adds of another commit being done.
    core.getSearcher(true, false, waitSearcher);
    // reset commit tracking
    tracker.didCommit();
    log.info("end_commit_flush");
    error = false;
  } finally {
    iwCommit.unlock();
    addCommands.set(0);
    deleteByIdCommands.set(0);
    deleteByQueryCommands.set(0);
    numErrors.set(error ? 1 : 0);
  }

  // if we are supposed to wait for the searcher to be registered, then we should do it
  // outside of the synchronized block so that other update operations can proceed.
  if (waitSearcher != null && waitSearcher[0] != null) {
    try {
      waitSearcher[0].get();
    } catch (InterruptedException e) {
      SolrException.log(log, e);
    } catch (ExecutionException e) {
      SolrException.log(log, e);
    }
  }
}
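The iwAccess/iwCommit split being discussed is essentially a read-write lock: many concurrent adds may share the writer (readers of the lock), while a commit needs it exclusively (the writer of the lock). A minimal, language-neutral sketch of that pattern, illustrative only and not Solr's actual code:

```python
import threading

class ReadWriteLock:
    """Many holders in shared mode (like iwAccess for adds/updates) or
    exactly one holder in exclusive mode (like iwCommit for commits)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of shared holders
        self._writer = False    # exclusive holder present?

    def acquire_read(self):
        with self._cond:
            while self._writer:          # adds wait only while a commit runs
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting commit may proceed

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

# Usage sketch: two "adds" hold the lock together; a "commit" then
# acquires it exclusively once both have released.
rw = ReadWriteLock()
rw.acquire_read()
rw.acquire_read()
rw.release_read()
rw.release_read()
rw.acquire_write()
rw.release_write()
```

The subtlety Jim's patch wrestles with is exactly what this sketch glosses over: which steps of commit() (searcher open, post-commit callbacks, counter resets) must still happen under the exclusive lock to keep the index view consistent.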
Re: Autocommit blocking adds? AutoCommit Speedup?
Can we move this to patch files within the JIRA issue please? It will make it easier to review and help out as a patch to current trunk. --j Jim Murphy wrote: Don't take this code as correct (or even compiling), but is this the essence? I moved shared access to the writer inside the read lock and kept the other non-commit bits under the write lock. [proposed commit() code quoted in full in the previous message]
Re: Initialising of CommonsHttpSolrServer in Spring framework
I am giving you a detailed sample of my Spring usage:

<bean id="solrHttpClient" class="org.apache.commons.httpclient.HttpClient">
  <property name="httpConnectionManager">
    <bean class="org.apache.commons.httpclient.MultiThreadedHttpConnectionManager">
      <property name="maxConnectionsPerHost" value="10"/>
      <property name="maxTotalConnections" value="10"/>
    </bean>
  </property>
</bean>

<bean id="mySearchImpl" class="com.me.search.MySearchSolrImpl">
  <property name="core1">
    <bean class="org.apache.solr.client.solrj.impl.CommonsHttpSolrServer">
      <constructor-arg value="http://localhost/solr/core1"/>
      <constructor-arg ref="solrHttpClient"/>
    </bean>
  </property>
  <property name="core2">
    <bean class="org.apache.solr.client.solrj.impl.CommonsHttpSolrServer">
      <constructor-arg value="http://localhost/solr/core2"/>
      <constructor-arg ref="solrHttpClient"/>
    </bean>
  </property>
</bean>

Hope this helps. Cheers, Avlesh On Sat, May 9, 2009 at 12:39 AM, sachin78 tendulkarsachi...@gmail.com wrote: Ranjeeth, did you figure out how to do this? If yes, can you share how you did it? An example bean definition in XML would be helpful. --Sachin
Re: Solrconfig.xml
On Thu, May 7, 2009 at 10:06 AM, Francis Yakin fya...@liquid.com wrote: No error, attached is solrconfig.xml files( one is from 1.2.0 that works and the other is 1.3.0 that doesn't work) Francis, it seems the attached files were eaten up by the mailing list. Can you re-send or put them up online somewhere (e.g. on pastebin.com)? -- Regards, Shalin Shekhar Mangar.
Re: Control segment size
On Fri, May 8, 2009 at 1:30 AM, vivek sar vivex...@gmail.com wrote: I did set maxMergeDocs to 10M, but I still see a couple of index files over 30G, which doesn't match the max number of documents. Here are some numbers:
1) My total index size = 66GB
2) Total number of documents = 200M
3) 1M docs = ~300MB
4) So 10M docs should be roughly 3-4GB.
As you can see, a couple of files are huge. Are those documents or index files? How can I control the file size so that no single file grows beyond 10GB? No, there is no way to limit an individual file to a specific size. -- Regards, Shalin Shekhar Mangar.
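For reference, the merge-related knobs mentioned in this thread live in the indexDefaults/mainIndex section of solrconfig.xml. A sketch with illustrative values (the numbers are assumptions, not recommendations):

```
<indexDefaults>
  <mergeFactor>10</mergeFactor>
  <!-- Cap the number of documents per merged segment. This bounds segment
       growth indirectly, but (as noted above) it does not cap the byte
       size of any individual index file. -->
  <maxMergeDocs>10000000</maxMergeDocs>
  <ramBufferSizeMB>32</ramBufferSizeMB>
</indexDefaults>
```

Because maxMergeDocs counts documents rather than bytes, segments holding unusually large documents can still produce files far bigger than the back-of-the-envelope 3-4GB estimate.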
Re: no subject aka Replication Stall
On Sat, May 9, 2009 at 2:23 AM, Yonik Seeley yo...@lucidimagination.com wrote: 2009/5/7 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com: BTW, if the timeout occurs it resumes from the point where the failure happened. It retries 5 times before giving up. Sweet! I was just going to ask what happens on a timeout. Have you tested this out (say, kill the master in the middle of replicating a big file)? Actually yes, though not by killing the master (then the replication will abort). I modified the master code to close the connection after transferring x MB of data. The slave retried and completed the replication. -Yonik http://www.lucidimagination.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Autocommit blocking adds? AutoCommit Speedup?
First cut of the updated handler now in: https://issues.apache.org/jira/browse/SOLR-1155 Needs review from those that know Lucene better, and a double check for errors in locking or other areas of the code. Thanks. --j jayson.minard wrote: Can we move this to patch files within the JIRA issue please? It will make it easier to review and help out as a patch to current trunk. --j Jim Murphy wrote: Don't take this code as correct (or even compiling), but is this the essence? [proposed commit() code quoted in full in an earlier message]
Re: French and SpellingQueryConverter
On Fri, May 8, 2009 at 2:14 AM, Jonathan Mamou ma...@il.ibm.com wrote: Hi, It does not seem to be related to FrenchStemmer; the stemmer does not split a word into 2 words. I have checked with other words, and SpellingQueryConverter always splits words containing a special character. I think the issue is in the SpellingQueryConverter class: Pattern.compile("(?:(?!(\\w+:|\\d+)))\\w+"). According to http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html, \w is a word character: [a-zA-Z_0-9]. I think special characters should also be added to the regex. If you use the spellcheck.q parameter for specifying the spelling query, then the field's analyzer will be used (in this case, FrenchAnalyzer). If you use the q parameter, then the SpellingQueryConverter is used. -- Regards, Shalin Shekhar Mangar.
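The splitting behavior is easy to reproduce: Java's \w matches only [a-zA-Z_0-9] by default, so an accented character terminates a \w+ match. A small Python sketch, using re.ASCII to mimic Java's default \w semantics:

```python
import re

# With an ASCII-only \w (Java's default), the accented character
# splits the word into two tokens.
ascii_tokens = re.findall(r"\w+", "système", re.ASCII)
print(ascii_tokens)

# When \w includes Unicode word characters (Python's default, or
# Java's UNICODE_CHARACTER_CLASS flag), the word stays whole.
unicode_tokens = re.findall(r"\w+", "système")
print(unicode_tokens)
```

This is why the converter breaks "système" into "syst" and "me": the fix is to widen the character class (or enable Unicode-aware \w) rather than change the stemmer.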