Search Query (Should I use fq)
I am looking to write a query in which a user will enter two conditions, i.e. search for description:text where category:someCategory. What's the best way to query it?

1. q = (description:text) AND (category:someCategory)
2. q = (description:text) AND (fq=category:someCategory)

Or is there a better way than the ones written above? Thanks, Rahul -- View this message in context: http://lucene.472066.n3.nabble.com/Search-Query-Should-I-use-fq-tp3620521p3620521.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Search Query (Should I use fq)
I am looking to write a query in which a user will enter two conditions, i.e. search for description:text where category:someCategory. What's the best way to query it? 1. q = (description:text) AND (category:someCategory) or 2. q = (description:text) AND (fq=category:someCategory) or is there a better way than the ones written above? Your second example is actually invalid. The correct syntax is: q=description:text&fq=category:someCategory. It seems that fq is more appropriate for category:someCategory, if that query would possibly be issued again: filter queries are cached. http://wiki.apache.org/solr/CommonQueryParameters#fq
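For the archives, here is a minimal sketch of the corrected request built programmatically. The host, port and core path are just the stock example values, not anything from the thread:

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your deployment.
base = "http://localhost:8983/solr/select"

params = {
    "q": "description:text",        # the relevance-scored part of the request
    "fq": "category:someCategory",  # cached filter, contributes no score
    "wt": "json",
}
url = base + "?" + urlencode(params)
print(url)
```

Because the filter is a separate `fq` parameter rather than an AND clause inside `q`, Solr can cache and reuse it across requests that share the same category.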
Re: 3.5 QueryResponseWriter
Den 30.12.2011 06:03, skrev Chris Hostetter: : Looks like you've experienced the issue described with fixes here: : http://www.lucidimagination.com/search/document/48b9e75fe68be4b7 but specifically, since you've already copied the jar file in question, and are now getting a class not found for the *base class*, it suggests you have a different problem. : What I have done this far is basically just to copy the /example/solr : folder, install the webapp .war file in a tomcat instance and start up. : At first it complained about the VelocityResponseWriter, so I created : a /lib folder in $SOLR_HOME and added the velocity jar from dist. That : seemed to take care of the VRW error. But now I get a : NoClassDefFoundError which says something about QueryResponseWriter. So... that suggests that it is loading VRW at a higher (or lower, depending on how you look at it) classloader than where it loads the rest of the solr jars. If you are using the example solr setup, then it sounds like you copied the jar to example/lib (which is where the jetty jars live) instead of example/solr/lib (which would be a new lib folder in the $SOLR_HOME dir). Unfortunately, people frequently get these confused, which is one of the reasons I have started encouraging people to just use the <lib/> declarations in their solrconfig.xml file instead of making a single lib dir in $SOLR_HOME. (But either way, you'll need to remove the copy of the VRW jar you've got loading in the system classpath before either approach will work.) -Hoss Well, what I did was to create a lib directory within $SOLR_HOME ( $SOLR_HOME/lib ), and that is where I put the VRW jar found in the dist folder. Then what I did to the solrconfig was basically to uncomment all of the lib statements and use <lib dir="../lib" />. The solrconfig is placed as normal in $SOLR_HOME/conf. Is it wrong to do so? To me this QueryResponseWriter thing doesn't necessarily have anything to do with VRW.
Are all of the libraries from the /dist and contrib folders necessary for startup? Because my routine in setting up solr is to only copy the /solr folder from /example into my tomcat environment, so everything above the /solr folder does not exist. When I want to use additional features I mainly copy the needed jar files into $SOLR_HOME/lib as explained above. So this jetty lib folder you are talking about does not exist for me.
Getting results in (reverse) order they were indexed
Is there any possible way to get the results back from Solr in the reverse order they were indexed (i.e. the documents that were most recently added should be first in the result)? I know I can add an indexedAt=NOW field of type date and sort on it in desc order. But if I have a paginated web application giving 10 results each time, every time the user goes to the next page, Solr has to re-evaluate all the results, sort the whole data set on date and return the 10 relevant documents, which I think is a lot of overhead. Is there a good approach to deal with this problem?
Re: Search Query (Should I use fq)
Thanks for your help iorixxx. If you can help me solve one of my other questions as well, that would be great: http://lucene.472066.n3.nabble.com/Getting-results-in-reverse-order-they-were-indexed-td3620577.html
RE: Solr, SQL Server's LIKE
The problem with wildcard searches is that the input is not analyzed. For English this might not be such a problem (except if you expect case-insensitive search), but then again, you don't get that with LIKE either. Ngrams bring that and more. What I think is often forgotten when comparing LIKE and Solr search is: Solr's analyzers allow not only case-insensitive search but also other analysis such as removing diacritics, and this is also applied when sorting (you have to create a separate index in the DB as well, if you want that). Say you have the following names: 'Van Hinden', 'van Hinden', 'Música', 'Musil'.

like 'mu%' - no hits
like 'Mu%' - 1 hit
like 'van%' - 1 hit
like 'hin%' - no hits

With Solr's whitespace or standard tokenizer, ngrams, and a diacritics and lowercase filter (no wildcard search):

'mu'/'Mu' - 2 hits, sorted ignoring case and diacritics
'van' - 2 hits
'hin' - 2 hits

(This is written down from experience. I haven't checked those examples explicitly.) Cheers, Chantal

On Fri, 2011-12-30 at 02:00 +0100, Chris Hostetter wrote: : Thanks. I know I'll be able to utilize some of Solr's free text : searching capabilities in other search types in this project. The : product manager wants this particular search to exactly mimic LIKE%. ... : Ex: If I search Albatross I want Albert to be excluded completely, : rather than having a low score. Please be specific about the types of queries you want, i.e. we need more than one example of the type of input you want to provide, the type of matches you want to see for that input, and the type of matches you want to get back. In your first message you said you need to match company titles pretty exactly, but then seem to contradict yourself by saying SQL's LIKE command fits the bill -- even though the SQL LIKE command exists specifically for in-exact matches on field values.
Based on your one example above of Albatross, you don't need anything special: don't use ngrams, don't use stemming, don't use fuzzy anything -- just search for Albatross and it will match Albatross but not Albert. If you want Albatross to match Albatross Road, use some basic tokenization. If all you really care about is prefix searching (which seems suggested by your LIKE% comment above, which I'm guessing is shorthand for something similar to LIKE 'ABC%'), so that queries like abc and abcd both match abcdef and abcd but neither of them matches abd, then just use prefix queries (i.e. abcd*) -- they should be plenty efficient for your purposes. You only need to worry about ngrams when you want to efficiently match in the middle of a string (i.e. TITLE LIKE '%ABC%'). -Hoss
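Chantal's and Hoss's points can be sketched together in a few lines: a toy, in-memory comparison of a case-sensitive SQL-style LIKE 'prefix%' against a prefix match over lowercased, diacritic-stripped tokens (roughly what a lowercase plus ASCII-folding analyzer chain does at index time; this is an illustration, not Solr code):

```python
import unicodedata

def normalize(s):
    # lowercase + strip diacritics, roughly what a lowercase filter
    # plus an ASCII-folding filter would do at index time
    nfkd = unicodedata.normalize("NFKD", s)
    return "".join(c for c in nfkd if not unicodedata.combining(c)).lower()

names = ["Van Hinden", "van Hinden", "Música", "Musil"]

def like_prefix(prefix):
    # case-sensitive SQL-style LIKE 'prefix%' on the raw value
    return [n for n in names if n.startswith(prefix)]

def analyzed_prefix(prefix):
    # prefix match against any normalized token, like a prefix query
    # on a tokenized, folded Solr field
    p = normalize(prefix)
    return [n for n in names
            if any(tok.startswith(p) for tok in normalize(n).split())]

print(like_prefix("mu"))      # [] -- misses 'Música' and 'Musil'
print(analyzed_prefix("mu"))  # ['Música', 'Musil']
print(analyzed_prefix("hin")) # ['Van Hinden', 'van Hinden']
```

This reproduces Chantal's hit counts: the raw LIKE misses on case and diacritics, while the analyzed prefix match finds both variants.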
RE: Best practices for installing and maintaining Solr configuration
I actually have read that and I have Solr up and running on Tomcat. I didn't realize that it was example/ (including Jetty, etc.) that was being recommended against, rather than $SOLR_HOME, which I created by copying example/solr/. Thanks for the tips on upgrading. I'll keep that in our documentation. Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848 Software Engineer II | Element K | www.elementk.com -----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, December 29, 2011 8:18 AM To: solr-user@lucene.apache.org Subject: Re: Best practices for installing and maintaining Solr configuration This should help: http://wiki.apache.org/solr/SolrTomcat The difference here is that you're not copying the example directory, you're copying the example/solr directory. And this is basically just to get the configuration files and directory structure right. You're not copying executables, jars, wars, or any of that stuff from example. You get the war file from the dist directory, and that should contain all the executables etc. As to your other questions: 1. If at all possible, upping the match version and reindexing are good things to do. 2. It's also a good idea to update the config files. Alternatively, you can diff the config files between releases to see what the changes are and selectively add them to your config file. But you should test, test, test before rolling out into prod. My rule of thumb for upgrading is to just not upgrade minor releases unless there are compelling reasons. The CHANGES.txt file will identify major additions. There are good reasons not to get too far behind on major (i.e. 3.x -> 4.x) releases, the primary one being that Solr only makes an effort to be backwards-compatible through one major release: 1.4 can be read by 3.x (there was no 2.x Solr release), but no attempt will be made by 4.x code to read 1.x indexes.
Hope this helps, Erick. On Wed, Dec 28, 2011 at 8:49 AM, Brandon Ramirez brandon_rami...@elementk.com wrote: Hi List, I've seen several Solr developers mention the fact that people often copy example/ to become their Solr installation and that that is not recommended. We are rebuilding our search functionality to use Solr and will be deploying it in a few weeks. I have read the README, several wiki articles, the mailing list, and browsed the Solr distribution. The example/ directory seems to be the only configuration I can find. So I have to ask: what is the recommended way to install Solr? What about maintaining it? For example, is it wise to up the luceneMatchVersion and re-index with every upgrade? When new configuration options are added in new versions of Solr, should we worry about updating our configuration to include them? I realize these may be vague questions and the answers could be case-by-case, but some general or high-level documentation may help. Thanks! Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848 Software Engineer II | Element K | www.elementk.com
Solr memory usage
I have Solr running on a single machine with 8GB of RAM. Right now I have about 1.5 million documents indexed, which has produced a 30GB index. When I look in top, the tomcat process hosting Solr says that it's using 38GB VIRT, 6.6GB RES, and 2GB SHR. The machine is showing a completely full swap file and very little memory free. Is this because Solr is trying to load the entire index into memory? The searches are still responsive, so it doesn't seem to be affecting performance. Thanks.
Re: NoClassDefFoundError: org/apache/solr/common/params/SolrParams
Thanks. I'm still working on this issue with no success so far. I'm reinstalling right now my whole development environment, for I have probably messed with it while attempting to find the reason for this error message. Bruno On 12/29/2011 08:27 PM, Dyer, James wrote: The SolrParams class is in the solrj.jar file so you should verify that this is in the classpath. Also see if it is listed in the manifest.mf file in the war's META-INF dir. If you're running this on a server within Eclipse and letting Eclipse do the deploy, my experience is it can be frustrating at times to get Eclipse to get the dependencies right. In this case look at the Java EE Module Dependencies screen in Eclipse. I often resort to hand-editing the org.eclipse.wst.common.component file in the project's .settings directory. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Bruno Adam Osiek [mailto:baos...@gmail.com] Sent: Thursday, December 29, 2011 4:17 PM To: solr-user@lucene.apache.org Subject: NoClassDefFoundError: org/apache/solr/common/params/SolrParams Hi, I'm trying to deploy a Solrj based application into JBoss AS 7 using Eclipse Indigo. 
When deploying it I get the following error message: ERROR [org.jboss.msc.service.fail] (MSC service thread 1-4) MSC1: Failed to start service jboss.deployment.unit.SolrIntegration.war.POST_MODULE: org.jboss.msc.service.StartException in service jboss.deployment.unit.SolrIntegration.war.POST_MODULE: Failed to process phase POST_MODULE of deployment SolrIntegration.war at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:121) at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1824) at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1759) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [:1.7.0_02] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [:1.7.0_02] at java.lang.Thread.run(Thread.java:722) [:1.7.0_02] *Caused by: java.lang.NoClassDefFoundError: org/apache/solr/common/params/SolrParams* at java.lang.Class.getDeclaredConstructors0(Native Method) [:1.7.0_02] at java.lang.Class.privateGetDeclaredConstructors(Class.java:2404) [:1.7.0_02] at java.lang.Class.getConstructor0(Class.java:2714) [:1.7.0_02] at java.lang.Class.getConstructor(Class.java:1674) [:1.7.0_02] at org.jboss.as.web.deployment.jsf.JsfManagedBeanProcessor.deploy(JsfManagedBeanProcessor.java:105) at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:115) ... 
5 more *Caused by: java.lang.ClassNotFoundException: org.apache.solr.common.params.SolrParams* from [Module deployment.SolrIntegration.war:main from Service Module Loader] at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:191) at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:361) at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:333) at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:310) at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:103) ... 11 more === I have searched with no success for a solution. I've managed to deploy successfully *solr.war* into JBoss. Any help will be welcomed. Regards.
Re: 3.5 QueryResponseWriter
: Well, what I did was to create a lib directory within $SOLR_HOME ( : $SOLR_HOME/lib ), and that is where I put the VRW jar found in the dist : folder. Then what I did to the solrconfig was basically to uncomment all of the : lib statements and use <lib dir="../lib" />. The solrconfig is placed as : normal in $SOLR_HOME/conf. Is it wrong to do so? Hmmm... that may be the cause of the problem -- you don't need to do both. Either add a <lib/> directive pointed at where you put VRW *OR* put it in $SOLR_HOME/lib ... don't do both (that may confuse your JVM classloader ... I'd have to sit down and really think through what's happening there). : To me this QueryResponseWriter thing doesn't necessarily have anything to do : with VRW. Are all of the libraries from the /dist and contrib folder necessary It has everything to do with VRW... VelocityResponseWriter is a subclass of QueryResponseWriter. When VRW is loaded, the JVM will attempt to resolve its entire Java class inheritance tree, using the classloader that VRW was loaded with, which then delegates up the tree of classloaders (looking at other specified lib dirs, the solr.war, the container classloader, the bootloader, etc.). If the classpath is wonky (e.g. VRW exists in two places) then you can get errors like this even if classloader X has already loaded a base class (like QRW), if it's not consulted in the delegation. : for startup? Because my routine in setting up solr is to only copy the /solr : folder from /example and into my tomcat environment. So everything above the Well, for starters, you should not copy example/solr "into my tomcat environment" ... the only thing that needs copying into tomcat is the solr.war (either by putting it in the webapps directory, or by pointing to it from a context file), and then solr.war is the only thing that needs to know about (your copy of) the example/solr directory (either using a system property or JNDI).
But as I said: classpaths are a pain in the ass. If you actually *copy* example/solr somewhere in your tomcat installation dir, it's *possible* that you have done so in a way that tomcat is finding things like the VRW jar in a higher classpath before it ever even loads the solr.war and the QRW base class. : /solr folder does not exist. When I want to use additional features i mainly : copy the needed jar files into $SOLR_HOME/lib as explained above. So this That should work fine ... as long as: a) you aren't telling Solr to load the same jars more than once (see above about not needing <lib/> if you use $SOLR_HOME/lib) ... you can check whether that's happening by looking at your log messages on Solr startup and seeing if there are duplicates in the list of jars Solr says it's adding to the classpath; b) tomcat isn't already loading these jars ... this is harder to recognize, but the safe way to do it is to keep all of these jars the hell away from tomcat. -Hoss
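For reference, the <lib/> approach discussed above is a one-line directive in solrconfig.xml. The dir value here is the one from the thread ($SOLR_HOME/lib relative to a config in $SOLR_HOME/conf); the point is to load extra jars from exactly one place:

```xml
<!-- in solrconfig.xml: load extra jars from one place; if you use
     this, do NOT also drop the same jars into $SOLR_HOME/lib -->
<lib dir="../lib" />
```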
Re: Getting results in (reverse) order they were indexed
: Is there any possible way to get the results back from Solr in the reverse : order they were indexed (i.e. the documents that were most recently added : should be the first in the result) : : I know I can add an indexedAt=NOW field of type date and sort on it in desc : order. : : But if I have a paginated web application giving 10 results each time, every : time the user goes to the next page, Solr has to re-evaluate all the results, : sort the whole data set on date and return the 10 relevant documents. Which : I think is a lot of overhead. The overhead you are speculating about is largely imagined ... it can be problematic if you have users paging extremely deep into the results, but for normal user traffic you aren't going to see any problems with the computational effort of loading page two (if you use the example Solr configs and display 10 results per page, Solr will automatically cache pages 1-5 for you when page #1 is requested -- see queryResultWindowSize for details). If you don't want to use a specific indexedAt field, you can use sort=_docid_ desc, but it's not 100% guaranteed to be the reverse of the order added if you use a MergePolicy that re-orders segments. (And even this will have the same imaginary overhead for loading page #2, 3, 4, etc. as if you sort on a field -- all this type of approach saves you is the overhead of the FieldCache.) https://wiki.apache.org/solr/CommonQueryParameters#sort -Hoss
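On the client side, pagination with a sort really is just a start offset per page; Solr (with its queryResultCache and queryResultWindowSize) handles the rest. A sketch of the per-page query string, using the indexedAt field name proposed in the question:

```python
from urllib.parse import urlencode

def page_params(page, rows=10):
    # Each page differs only in its start offset; the sort parameter
    # asks Solr for newest-first order on the indexedAt date field.
    return urlencode({
        "q": "*:*",
        "sort": "indexedAt desc",     # assumes an indexedAt date field exists
        "start": (page - 1) * rows,   # 0-based offset of the first result
        "rows": rows,
    })

print(page_params(1))
print(page_params(3))
```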
RE: Solr, SQL Server's LIKE
Hoss, Thanks. You've answered my question. To clarify, what I should have asked for instead of 'exact' was 'not fuzzy'. For some reason it didn't occur to me that I didn't need n-grams to use the wildcard. Being asked to clarify what I meant made me realize that the n-grams are the source of all my current problems. :) Thanks! Devon Baumgarten -----Original Message----- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Thursday, December 29, 2011 7:00 PM To: solr-user@lucene.apache.org Subject: RE: Solr, SQL Server's LIKE : Thanks. I know I'll be able to utilize some of Solr's free text : searching capabilities in other search types in this project. The : product manager wants this particular search to exactly mimic LIKE%. ... : Ex: If I search Albatross I want Albert to be excluded completely, : rather than having a low score. Please be specific about the types of queries you want, i.e. we need more than one example of the type of input you want to provide, the type of matches you want to see for that input, and the type of matches you want to get back. In your first message you said you need to match company titles pretty exactly, but then seem to contradict yourself by saying SQL's LIKE command fits the bill -- even though the SQL LIKE command exists specifically for in-exact matches on field values. Based on your one example above of Albatross, you don't need anything special: don't use ngrams, don't use stemming, don't use fuzzy anything -- just search for Albatross and it will match Albatross but not Albert. If you want Albatross to match Albatross Road, use some basic tokenization. If all you really care about is prefix searching (which seems suggested by your LIKE% comment above, which I'm guessing is shorthand for something similar to LIKE 'ABC%'), so that queries like abc and abcd both match abcdef and abcd but neither of them matches abd, then just use prefix queries (i.e. abcd*) -- they should be plenty efficient for your purposes.
You only need to worry about ngrams when you want to efficiently match in the middle of a string (i.e. TITLE LIKE '%ABC%'). -Hoss
Re: a question on jmx solr exposure
: Well, we don't use the multicore feature of SOLR, so in our case SOLR : instances are just separate web-apps. The web-app loading order probably then : affects which app gets hold of a jmx 'pipe'. A feature was added in SOLR-1843 specifically to help address this potential collision, by allowing you to override the rootName used in your <jmx/> declaration (by default it's solr/${corename}), but looking at the issue now I see it was only committed to trunk... https://issues.apache.org/jira/browse/SOLR-1843 ...even though this looks like a fairly straightforward candidate to merge back to 3x. I'll look into it. -Hoss
Re: MLT as a nested query
: is it possible to use MLT as a nested query? I tried the following: : select?q=field1:foo field2:bar AND _query_:"{!mlt fl=mltField mindf=1 mintf=1 mlt.match.include=false}selectField:baz" MLT functionality exists in two forms: as a component that decorates results produced by another search (similar to highlighting and faceting), and as a handler that produces a main result set based on an MLT query (so highlighting and faceting happen to the results of the MLT itself)... https://wiki.apache.org/solr/MoreLikeThis In order for what you are describing to work, someone would have to implement an MLT QParser, but no one has ever attempted that to my knowledge. I have considered looking into it, and I suspect it would be somewhat straightforward to do, but only for single-node instances -- there is no way I know of for a QParser to sanely generate a query like MLT based on the terms of distributed shards. -Hoss
Re: reposting highlighting questions
: I am new to solr/xml/xslt, and trying to figure out how to display : search query fields highlighted in html. I can enable the highlighting : in the query, and I think I get the correct xml response back (see : below: I search using 'Contents' and the highlighting is shown with : <strong> and </strong>). However, I cannot figure out what to add to the : xslt file to transform it into html. I think it is a question of defining I think you are looking for the disable-output-escaping="yes" option on your <xsl:value-of select="..."/> expression to echo out the highlighted string. Hard to be sure since you didn't actually provide any example of the XSLT or XPath you are trying to use. -Hoss
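A minimal sketch of such a template. The match expression is a guess at the usual layout of Solr's highlighting section (an lst named "highlighting" wrapping str elements); adjust it to your actual response XML:

```xml
<!-- emit the <strong>...</strong> snippet markup literally
     instead of escaping it as &lt;strong&gt; -->
<xsl:template match="lst[@name='highlighting']//str">
  <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
```

Note that disable-output-escaping only works when the XSLT processor controls serialization to the output; some pipelines ignore it.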
Re: Decimal Mapping problem
: Try to cast the MySQL decimal data type to string, i.e.: : : CAST( IF(drt.discount IS NULL,'0',(drt.discount/100)) AS CHAR) as discount : (or CAST AS TEXT) ...to clarify here, the values you are seeing are what happens when the DB returns to DIH a value in a type it doesn't understand -- in this case it's a byte array. DIH isn't sure what to do with this byte array, so it just calls the Java toString() method on it. Casting that byte array to something DIH understands (like a string) is one way to solve the problem, but the other would be to use some SQL expression that always returns a consistent type, so the SQL server knows what type to declare in its response -- in your example you are sometimes returning a string (if NULL, you return the string '0') and sometimes returning a number (if not null, drt.discount/100). Use SQL that always returns a number, and this problem will also go away. -Hoss
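A sketch of the always-a-number variant (MySQL syntax, using the column from the thread): IFNULL substitutes a numeric 0 rather than the string '0', so both branches share one declared type.

```sql
-- always return a number, never the string '0', so the driver
-- reports one consistent column type to DIH
SELECT IFNULL(drt.discount, 0) / 100 AS discount
```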
Highlighting with prefix queries and maxBooleanClause
This question has come up a few times, but I've yet to see a good solution. Basically, if I have highlighting turned on and do a query for q=*, I get an error that maxBooleanClauses has been exceeded. Granted, this is a silly query, but a user might do something similar. My expectation is that queries that work when highlighting is OFF should continue working when it is ON. What's the best solution for queries like this? Is it simply to catch the error and then up maxBooleanClauses? Or to turn off highlighting when this error occurs? Or am I doing something altogether wrong? This is the query I'm using to cause the error: http://localhost:8983/solr/select/?q=*&start=0&rows=20&hl=true&hl.fl=text Changing hl to false makes the query go through. I'm using Solr 4.0.0-dev. The traceback is: SEVERE: org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024 at org.apache.lucene.search.ScoringRewrite$1.checkMaxClauseCount(ScoringRewrite.java:68) at org.apache.lucene.search.ScoringRewrite$ParallelArraysTermCollector.collect(ScoringRewrite.java:159) at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:81) at org.apache.lucene.search.ScoringRewrite.rewrite(ScoringRewrite.java:114) at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:312) at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:155) at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:144) at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:384) at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:216) at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:184) at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:205) at 
org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:511) at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:402) at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:121) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1478) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Thanks, Mike
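If you do decide to simply raise the limit, the knob lives in solrconfig.xml (1024 is the default; the raised value here is arbitrary). This is a workaround for the highlighter's query rewrite, not a fix:

```xml
<!-- default is 1024; raise with care, every clause costs memory and time -->
<maxBooleanClauses>4096</maxBooleanClauses>
```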
Re: issues with WordDelimiterFilter
: I'm having an issue with the way the WordDelimiterFilter parses compound : words. My field declaration is simple, looks like this: : : <analyzer type="index"> : <tokenizer class="solr.WhitespaceTokenizerFactory"/> : <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"/> : <filter class="solr.LowerCaseFilterFactory"/> : </analyzer> You haven't said anything about what your query-time analyzer looks like -- based on your other comments, I'm going to assume it just uses WhitespaceTokenizer and the lowercase filter w/o WDF at all -- but if you don't have any query analyzer declared, that means the analyzer above is used in both cases, which is most likely not what you want. : : When indexing 'fokker-planck' I do get the tokens for fokker, : : planck, and fokker-planck. But in that case the fokker-planck token : : is followed by a 'planck' token. The analysis looks like this. That is expected -- when WDF splits up a token (and keeps the original) it puts the first of the split tokens at the same position as the original, and each other split token follows in subsequent positions -- positions in token streams are simple integer increments, so there is no way to say that the split fokker and planck tokens appear in that sequence *and* that they both appear at the same position as the original fokker-planck. : So in the case where fokker-planck is the first token there should be no : second token, it's already been used if the first was matched. That type of logic (hierarchical sequences of tokens) is just not possible with Lucene. : The problem manifests itself when doing phrase searches... : : Fokker-Planck equations won't find the exact phrase, Fokker-Planck : equations, because it sees the term planck as between Fokker-Planck and : equations. Hope that makes sense! Should I submit this as a bug? For phrase queries like this to work when using WDF, it's necessary to use some slop in your phrase query (to overcome the position gaps introduced by the split-out tokens) ...
either that, or turn off preserveOriginal and use a query analyzer that also splits at query time. : As it stands it would return a true hit (erroneously I believe) on the : phrase search fokker planck, so really all 3 tokens should be returned Hmmm... if you do *not* want a phrase search for fokker planck to match documents containing fokker-planck, then why are you using WDF at all? : at offset 0 and there should be no second token so phrase searches are : preserved. If all the tokens wound up in the exact same position, then a phrase query for fokker planck would still match this document (so it wouldn't solve your problem), but you would also get matches for things like the phrase planck fokker -- which is not likely what *anyone* would expect. -Hoss
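The position mechanics Hoss describes can be sketched with a toy model. This is an illustration following his description (original and first split part share a position, remaining parts take subsequent ones), not Lucene code; it shows why the unslopped phrase misses and slop 1 recovers it:

```python
# Toy model of WDF position assignment with preserveOriginal=1.
def wdf_positions(text):
    out, pos = [], 0
    for tok in text.lower().split():
        parts = tok.split("-")
        if len(parts) > 1:
            out.append((pos, tok))        # preserved original token
            out.append((pos, parts[0]))   # first split part, same position
            for p in parts[1:]:
                pos += 1                  # later parts take later positions
                out.append((pos, p))
        else:
            out.append((pos, tok))
        pos += 1
    return out

def phrase_match(tokens, phrase, slop=0):
    # crude in-order phrase check: each successive phrase term may sit
    # up to 1 + slop positions after the previous matched term
    for start in (p for p, t in tokens if t == phrase[0]):
        prev, ok = start, True
        for term in phrase[1:]:
            nxt = [p for p, t in tokens
                   if t == term and prev < p <= prev + 1 + slop]
            if not nxt:
                ok = False
                break
            prev = min(nxt)
        if ok:
            return True
    return False

toks = wdf_positions("fokker-planck equations")
# index side: fokker-planck@0, fokker@0, planck@1, equations@2
# query side (no WDF) tokenizes to ["fokker-planck", "equations"]:
print(phrase_match(toks, ["fokker-planck", "equations"]))          # False
print(phrase_match(toks, ["fokker-planck", "equations"], slop=1))  # True
print(phrase_match(toks, ["fokker", "planck", "equations"]))       # True
```

The split-out planck token occupies position 1 between fokker-planck@0 and equations@2, which is exactly the gap the slop bridges.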
Re: [Solr Event Listener plug-in] Execute query search from SolrCore - Java Code
Ok, I have made progress: I built my architecture and I execute queries inside the postCommit method, and they are launched as I want. But the core can't see the recently updated documents, and the commit only finishes after the postCommit method has ended! But I have to see the recently updated documents; this is the principal need of my plugin. How can I search the recently indexed documents? How can I open the new searcher, and where? Inside postCommit seems to be no good... Any suggestion? 2011/12/29 Alessandro Benedetti benedetti.ale...@gmail.com Hi guys, I'm developing a custom SolrEventListener, and inside the postCommit() method I need to execute some queries and collect results. In my SolrEventListener class, I have a SolrCore object (org.apache.solr.core.SolrCore) and a list of queries (Strings). How can I use the SolrCore to optimally parse the queries (I have to parse them like the Solr query parser does) and launch them? I'm fighting with the searchers and execute methods in the SolrCore object, but I don't know which is the best way to do this... Cheers -- Benedetti Alessandro Personal Page: http://tigerbolt.altervista.org Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Re: [Solr Event Listener plug-in] Execute query search from SolrCore - Java Code
I have tried to open a new searcher and make a forced commit inside the postCommit method of the listener, but it caused many issues. How can I complete the commit and then call the postCommit method of the listener, with its logic (a lot of queries on the last-committed docs) running afterwards? Cheers

2011/12/31 Alessandro Benedetti benedetti.ale...@gmail.com

Ok, I have made progress: I built my architecture, and I execute queries inside the postCommit method, and they are launched as I want. But the core can't see the recently updated documents, and the commit ends only after the postCommit method has ended! I need to see the recently updated documents; that is the principal purpose of my plugin. How can I search the recently indexed documents? How can I open the new searcher, and where? Inside postCommit does not seem to be the right place... Any suggestion?

2011/12/29 Alessandro Benedetti benedetti.ale...@gmail.com

Hi guys, I'm developing a custom SolrEventListener, and inside the postCommit() method I need to execute some queries and collect results. In my SolrEventListener class, I have a SolrCore object (org.apache.solr.core.SolrCore) and a list of queries (Strings). How can I use the SolrCore to optimally parse the queries (I have to parse them like the Solr query parser does) and launch them? I'm fighting with the Searchers and execute methods on the SolrCore object, but I don't know which is the best way to do this... Cheers

--
Benedetti Alessandro
Personal Page: http://tigerbolt.altervista.org

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?
William Blake - Songs of Experience -1794 England
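One likely reason the queries run inside postCommit cannot see the new documents is that the event fires before the freshly opened searcher is registered. The newSearcher event, registered in the query section of solrconfig.xml, instead hands the listener's newSearcher(newSearcher, currentSearcher) callback a searcher that does see the just-committed documents. A sketch of the registration; the listener class name and its argument are hypothetical:

```xml
<!-- solrconfig.xml, inside the <query> section: run the custom listener
     against the newly opened searcher instead of at postCommit time -->
<listener event="newSearcher" class="com.example.QueryRunningListener">
  <str name="queriesFile">queries.txt</str>
</listener>
```

The same listener class can also be registered for the firstSearcher event if the queries should run at core startup as well.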
Re: How to test solr filter
: References: 1324729256338-3610466.p...@n3.nabble.com
:   4ef69412.3040...@r.email.ne.jp
: In-Reply-To: 4ef69412.3040...@r.email.ne.jp
: Subject: How to test solr filter

https://people.apache.org/~hossman/#threadhijack

Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to an existing message; instead, start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to, so your question is hidden in that thread and gets less attention. It also makes following discussions in the mailing-list archives particularly difficult. -Hoss
Re: Best practices for installing and maintaining Solr configuration
: I've seen several Solr developers mention the fact that people often
: copy example/ to become their solr installation and that that is not
: recommended. We are rebuilding our search functionality to use Solr and
: will be deploying it in a few weeks.

do you have specific examples from the mailing list of the recommendations you are seeing? in my opinion there's nothing wrong with copying the solr example to use as the basis for your own configs -- that's why it's there. Where people tend to run into problems (in my opinion) is...

a) they never change the configs. those configs are examples. they showcase features, and the comments suggest best practices. that doesn't mean you need to shoe-horn your data into the declared fields, and it doesn't mean you *have* to use dynamic fields in order to add a field with a new name. you should feel free to customize the configs to meet your needs.

b) they assume that when upgrading, they should throw out their old configs and copy the newer example configs. you should by all means *consult* the new configs, because there may be new features in there that you should consider, or new comments suggesting new best practices, but if you completely throw out your old configs you *have* to reindex. some people either don't realize that and are surprised when they get weird errors, or take it for granted that they should *always* reindex even though it isn't necessary.

My advice: when starting a new collection from scratch: base the configs on the example from the current version of solr you are using, and customize.

when upgrading solr: consult the new example config, and cut/paste anything you think you would like to have into your existing configs, considering the implications of reindexing (ie: changing field types).
when adding a new collection to an existing installation: decide if it's really different from what you've already got, in which case base the configs off of the current example; or if it's similar to some collection you already have, base the configs off of those.

: What about maintaining it? For example, Is it wise to up the
: luceneMatchVersion and re-index with every upgrade? When new

only if you have read the CHANGES.txt and feel like there is something about the new features or modified behavior that suggests to you that you want those changes -- if you are happy with the existing behavior, leave it alone. The flip side is that if for some reason you decide you need to re-index everything, then you should consider bumping the luceneMatchVersion up to get what is now considered the correct behavior of those classes -- but just like upgrading, you should test these behavior changes and verify they are really what you want. -Hoss
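The luceneMatchVersion Hoss describes is a single element near the top of solrconfig.xml. A sketch for a 3.5 install; the value shown is illustrative, and the point is to leave it at whatever your index was built with until you are prepared to reindex:

```xml
<!-- solrconfig.xml: controls which version's analysis/scoring behavior
     Lucene emulates. Bump it (e.g. to LUCENE_35) only alongside a full
     reindex, after checking CHANGES.txt for behavior changes. -->
<luceneMatchVersion>LUCENE_35</luceneMatchVersion>
```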
hl.maxAnalyzedChars seems not to work
Hi, My situation is that highlighting a text field (about 2M per document) costs too much time (1s). So I want to limit the number of characters the highlighter analyzes. These are my highlighting parameters, which seem to have no problem:

<bool name="hl">true</bool>
<str name="hl.fl">text</str>
<int name="hl.snippets">1</int>
<int name="hl.maxAnalyzedChars">200</int>
<str name="f.text.hl.alternateField">text</str>
<int name="hl.maxAlternateFieldLength">100</int>

The parameter hl.maxAlternateFieldLength seems not to work...
Re: Solr memory usage
Hi Bai, Solr doesn't try to load the whole index into memory, no. You can control how much memory Tomcat uses with the -Xmx Java command-line parameter. Otis

Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html

From: Bai Shen baishen.li...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, December 30, 2011 9:16 AM
Subject: Solr memory usage

I have solr running on a single machine with 8GB of ram. Right now I have about 1.5 million documents indexed, which has produced a 30GB index. When I look in top, the tomcat process which is hosting solr says that it's using 38GB of VIRT, 6.6G RES, and 2GB SHR. The machine is showing a completely full swap file and very little memory free. Is this because solr is trying to load the entire index into memory? The searches are still responsive, so it doesn't seem to be affecting performance. Thanks.
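Otis's -Xmx suggestion is usually applied through Tomcat's setenv.sh rather than on the command line. A minimal sketch; the 4g figure is an illustrative value, not a recommendation for this particular index. (The large VIRT number in top typically reflects index files mapped or cached by the OS outside the Java heap, which is why capping the heap and letting the OS cache the index is the usual approach.)

```shell
# $CATALINA_HOME/bin/setenv.sh -- read by Tomcat's startup scripts.
# -Xmx caps the JVM heap; the OS page cache still holds index data
# outside the heap, so don't give the JVM all of the machine's RAM.
CATALINA_OPTS="-Xms512m -Xmx4g"
export CATALINA_OPTS
```

After editing setenv.sh, restart Tomcat for the new heap settings to take effect.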