Re: Question on query syntax
: Solr can process the query which has NOT operator ("-") in the head. : If Solr finds it, Solr adds MatchAllDocsQuery automatically : in front of that query as follows: that's not strictly true ... Solr doesn't *add* a MatchAllDocsQuery if the query is entirely prohibitive; instead Solr executes a MatchAllDocsQuery and then filters that by the DocSet returned by the "absolute value" of the original query. the end result should be functionally equivalent, but this approach caches better (both "-text:foo" and "text:foo" are cached the same) ... the downside is that the debugging info for purely prohibitive queries is currently incorrect (see SOLR-119) -Hoss
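Hoss's description can be modeled with plain set operations (an illustrative sketch only; `all_docs` and `matching` are stand-ins for Lucene DocSets, not Solr APIs):

```python
# Model of how a purely prohibitive query like "-text:foo" is handled:
# Solr executes MatchAllDocsQuery, then filters out the DocSet of the
# positive ("absolute value") query "text:foo".
all_docs = {1, 2, 3, 4, 5}   # every doc in the index
matching = {2, 4}            # DocSet for the positive query "text:foo"

# filter MatchAllDocs by the complement of the positive DocSet
result = all_docs - matching

# functionally equivalent to prepending MatchAllDocsQuery to "-text:foo"
assert result == {1, 3, 5}
```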
Re: Need question to configure Log4j for solr
: The one issue I ran into was with daily rolling log files - maybe I : missed it, but I didn't find that functionality in the JDK logging : package, however it is in log4j. : : I'm not advocating a change, just noting this. We worked around it by : leveraging Resin's support for wrapping a logger (set up for daily : rolling log files) around a webapp. as i recall Resin doesn't wrap JDK logging -- they provide subclasses of java.util.logging.Handler that do what they want (log rotation, writing to syslog, etc...) and provide their own LogManager subclass so their logging can be configured in their resin.conf. (basically they do all the things the java logging spec was designed to let people do to have control over logging without needing any third party logging frameworks ... it's just too bad JDK logging doesn't include all the nice Handlers and Formatters and configuration helper utilities that would make the API seem more useful to people comparing it with log4j and the other abstraction frameworks) -Hoss
Re: Question on query syntax
On 12-Jul-07, at 6:25 PM, Lance Lance wrote: A simplified version of the problem: text -(collection:pile1) works, while text (-collection:pile1) finds zero records. see my other message. You cannot create a (sub)query with only prohibited clauses. The second query asks: Q = find docs containing 'text' or matching X; X = find docs that don't match 'collection:pile1' # invalid query note that if "-collection:pile1" is the main query, Solr detects this case and handles it. -Mike
RE: Question on query syntax
A simplified version of the problem: text -(collection:pile1) works, while text (-collection:pile1) finds zero records. lance _ From: Lance Lance [mailto:[EMAIL PROTECTED] Sent: Thursday, July 12, 2007 5:58 PM To: 'solr-user@lucene.apache.org' Subject: Question on query syntax Are there any known bugs in the syntax parser? We're using lucene-2.2.0 and Solr 1.2. We have documents with searchable text and a field 'collection'. This query works as expected, finding everything except for collections 'pile1' and 'pile2'. text -(collection:pile1 OR collection:pile2) When we apply De Morgan's Law, we get 0 records: text (-collection:pile1 AND -collection:pile2) This should return all records, but it returns nothing: text (-collection:pile1 OR -collection:pile2) Thanks, Lance
Re: Question on query syntax
Lance, I think you are right. I met the same problem before. > -(collection:pile1 OR collection:pile2) Solr can process the query which has NOT operator ("-") in the head. If Solr finds it, Solr adds MatchAllDocsQuery automatically in front of that query as follows: MatchAllDocsQuery -(collection:pile1 OR collection:pile2) Then Lucene can process this query properly. However, Solr doesn't add MatchAllDocsQuery if the query doesn't have the NOT operator in the head. To avoid this problem, you can add "*:*" at the front of your query: (*:* -collection:pile1 AND -collection:pile2) (*:* -collection:pile1 OR -collection:pile2) Thank you, Koji Lance Lance wrote: Are there any known bugs in the syntax parser? We're using lucene-2.2.0 and Solr 1.2. We have documents with searchable text and a field 'collection'. This query works as expected, finding everything except for collections 'pile1' and 'pile2'. text -(collection:pile1 OR collection:pile2) When we apply De Morgan's Law, we get 0 records: text (-collection:pile1 AND -collection:pile2) This should return all records, but it returns nothing: text (-collection:pile1 OR -collection:pile2) Thanks, Lance
RE: Question on query syntax
Ok, here's a simpler version: _ From: Lance Lance [mailto:[EMAIL PROTECTED] Sent: Thursday, July 12, 2007 5:58 PM To: 'solr-user@lucene.apache.org' Subject: Question on query syntax Are there any known bugs in the syntax parser? We're using lucene-2.2.0 and Solr 1.2. We have documents with searchable text and a field 'collection'. This query works as expected, finding everything except for collections 'pile1' and 'pile2'. text -(collection:pile1 OR collection:pile2) When we apply De Morgan's Law, we get 0 records: text (-collection:pile1 AND -collection:pile2) This should return all records, but it returns nothing: text (-collection:pile1 OR -collection:pile2) Thanks, Lance
Re: Question on query syntax
On 12-Jul-07, at 5:58 PM, Lance Lance wrote: Are there any known bugs in the syntax parser? We're using lucene-2.2.0 and Solr 1.2. We have documents with searchable text and a field 'collection'. This query works as expected, finding everything except for collections 'pile1' and 'pile2'. text -(collection:pile1 OR collection:pile2) When we apply De Morgan's Law, we get 0 records: text (-collection:pile1 AND -collection:pile2) This should return all records, but it returns nothing: text (-collection:pile1 OR -collection:pile2) Lucene's "boolean" operators are not true boolean operators. Instead, every clause is one of: OPTIONAL, REQUIRED, PROHIBITED. For a query (or parenthesized subquery) to match, all REQUIRED clauses must match, zero PROHIBITED clauses must match, and if there are no REQUIRED clauses, at least one OPTIONAL clause must match. You cannot have only PROHIBITED clauses. Now, the syntax for each is (nothing), +, -, and they can be applied to entire subqueries using brackets: +hello -(goodbye -night) returns docs that have hello, and do not have (goodbye without night). In lucene, AND/OR/NOT are syntactic sugar that translates clauses to the above form. However, it imperfectly matches people's (rational) expectations of how boolean operators work. Also, brackets _create subqueries_, not just group operators. I suggest that AND and OR never be used programmatically, if possible. Try these alternatives: docs (must) containing 'text' that do not match (col=pile1 or col=pile2): text -(collection:pile1 collection:pile2) same as above: text -collection:pile1 -collection:pile2 docs (must) contain 'text' and (must) match (col=pile1 or col=pile2): +text +(collection:pile1 collection:pile2) Note that in the last example the + is necessary before 'text' because otherwise it would be optional, not required (as there are other required clauses). -Mike
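Mike's matching rule can be sketched as a toy predicate (illustrative Python only, not Lucene code; `doc_terms` stands in for a document's indexed terms):

```python
# Toy model of Lucene clause semantics: a doc matches iff all REQUIRED
# clauses match, no PROHIBITED clause matches, and (when there are no
# REQUIRED clauses) at least one OPTIONAL clause matches.
def matches(doc_terms, required=(), optional=(), prohibited=()):
    if any(t in doc_terms for t in prohibited):
        return False
    if required:
        return all(t in doc_terms for t in required)
    return any(t in doc_terms for t in optional)

# "text -pile1": matches a doc with 'text' and no 'pile1'
assert matches({"text"}, optional=["text"], prohibited=["pile1"])
assert not matches({"text", "pile1"}, optional=["text"], prohibited=["pile1"])

# a (sub)query with only PROHIBITED clauses matches nothing --
# which is why "text (-collection:pile1)" finds zero records
assert not matches({"text"}, prohibited=["pile1"])
```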
Re: Need question to configure Log4j for solr
: the troubles come when you integrate third-party stuff depending on : log4j (as I currently do). Having said this you have a strong point when : looking at http://www.qos.ch/logging/classloader.jsp there have been several discussions about changing the logger used by Solr ... the best summation i can give to these discussions is: * JDK logging is universal * using any other logging framework would add a dependency without adding functionality The one issue I ran into was with daily rolling log files - maybe I missed it, but I didn't find that functionality in the JDK logging package, however it is in log4j. I'm not advocating a change, just noting this. We worked around it by leveraging Resin's support for wrapping a logger (set up for daily rolling log files) around a webapp. -- Ken -- Ken Krugler Krugle, Inc. +1 530-210-6378 "If you can't find it, you can't fix it"
Question on query syntax
Are there any known bugs in the syntax parser? We're using lucene-2.2.0 and Solr 1.2. We have documents with searchable text and a field 'collection'. This query works as expected, finding everything except for collections 'pile1' and 'pile2'. text -(collection:pile1 OR collection:pile2) When we apply De Morgan's Law, we get 0 records: text (-collection:pile1 AND -collection:pile2) This should return all records, but it returns nothing: text (-collection:pile1 OR -collection:pile2) Thanks, Lance
Re: Deleting from a very active index
I was going to say... that exception should never happen since solr controls and synchronizes adds/deletes at a higher layer (with only one solr instance accessing an index, we don't really need lucene level locking at all). One major cause of this is a crash/restart of the JVM leaving a stale lock file behind. Those can be removed automatically at startup with a tweak in solrconfig.xml -Yonik On 7/12/07, Matthew Runo <[EMAIL PROTECTED]> wrote: It looks like somehow the write.lock got hung. I manually removed the lock, and now things are good. Very strange.
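The solrconfig.xml tweak Yonik refers to is presumably the unlockOnStartup flag in the mainIndex section; treat the exact element placement as something to verify against your own Solr 1.2 example config:

```xml
<mainIndex>
  <!-- remove a stale write.lock left behind by a crashed JVM at startup -->
  <unlockOnStartup>true</unlockOnStartup>
</mainIndex>
```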
Re: Deleting from a very active index
It looks like somehow the write.lock got hung. I manually removed the lock, and now things are good. Very strange. ++ | Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833 ++ On Jul 12, 2007, at 1:32 PM, Matthew Runo wrote: Hello! I'm trying to remove a whole brand from our search index, but at the same time we're also running an import for others. This means the index is extremely active at this time. I am getting a lock timeout error, but not sure what to do about it... should I just keep trying till it can get the lock to do the delete? [EMAIL PROTECTED]:/home/mruno]$ curl http://search1.zappos.com:8080/solr/update --silent --data-binary "<delete><query>brand:Harley-Davidson</query></delete>" -H 'Content-type:text/xml; charset=utf-8' org.apache.solr.core.SolrException: Error deleting doc# 966 at org.apache.solr.update.UpdateHandler $DeleteHitCollector.collect(UpdateHandler.java:175) at org.apache.lucene.search.Scorer.score(Scorer.java:49) at org.apache.lucene.search.IndexSearcher.search (IndexSearcher.java:146) at org.apache.solr.search.SolrIndexSearcher.search (SolrIndexSearcher.java:407) at org.apache.lucene.search.Searcher.search(Searcher.java:118) at org.apache.solr.update.DirectUpdateHandler2.deleteByQuery (DirectUpdateHandler2.java:343) at org.apache.solr.handler.XmlUpdateRequestHandler.update (XmlUpdateRequestHandler.java:260) at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate (XmlUpdateRequestHandler.java:355) at org.apache.solr.servlet.SolrUpdateServlet.doPost (SolrUpdateServlet.java:58) at javax.servlet.http.HttpServlet.service(HttpServlet.java: 710) at javax.servlet.http.HttpServlet.service(HttpServlet.java: 803) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter (ApplicationFilterChain.java:269) at org.apache.catalina.core.ApplicationFilterChain.doFilter (ApplicationFilterChain.java:188) at org.apache.solr.servlet.SolrDispatchFilter.doFilter (SolrDispatchFilter.java:185) at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter (ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter (ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke (StandardWrapperValve.java:210) at org.apache.catalina.core.StandardContextValve.invoke (StandardContextValve.java:174) at org.apache.catalina.core.StandardHostValve.invoke (StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke (ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke (StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service (CoyoteAdapter.java:151) at org.apache.coyote.http11.Http11Processor.process (Http11Processor.java:870) at org.apache.coyote.http11.Http11BaseProtocol $Http11ConnectionHandler.processConnection(Http11BaseProtocol.java: 665) at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket (PoolTcpEndpoint.java:528) at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt (LeaderFollowerWorkerThread.java:81) at org.apache.tomcat.util.threads.ThreadPool $ControlRunnable.run(ThreadPool.java:685) at java.lang.Thread.run(Thread.java:619) Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/opt/solr/data/index/write.lock at org.apache.lucene.store.Lock.obtain(Lock.java:70) at org.apache.lucene.index.IndexReader.acquireWriteLock (IndexReader.java:626) at org.apache.lucene.index.IndexReader.deleteDocument (IndexReader.java:660) at org.apache.solr.update.UpdateHandler $DeleteHitCollector.collect(UpdateHandler.java:170) ... 27 more ++ | Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833 ++
Deleting from a very active index
Hello! I'm trying to remove a whole brand from our search index, but at the same time we're also running an import for others. This means the index is extremely active at this time. I am getting a lock timeout error, but not sure what to do about it... should I just keep trying till it can get the lock to do the delete? [EMAIL PROTECTED]:/home/mruno]$ curl http://search1.zappos.com:8080/solr/update --silent --data-binary "<delete><query>brand:Harley-Davidson</query></delete>" -H 'Content-type:text/xml; charset=utf-8' org.apache.solr.core.SolrException: Error deleting doc# 966 at org.apache.solr.update.UpdateHandler $DeleteHitCollector.collect(UpdateHandler.java:175) at org.apache.lucene.search.Scorer.score(Scorer.java:49) at org.apache.lucene.search.IndexSearcher.search (IndexSearcher.java:146) at org.apache.solr.search.SolrIndexSearcher.search (SolrIndexSearcher.java:407) at org.apache.lucene.search.Searcher.search(Searcher.java:118) at org.apache.solr.update.DirectUpdateHandler2.deleteByQuery (DirectUpdateHandler2.java:343) at org.apache.solr.handler.XmlUpdateRequestHandler.update (XmlUpdateRequestHandler.java:260) at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate (XmlUpdateRequestHandler.java:355) at org.apache.solr.servlet.SolrUpdateServlet.doPost (SolrUpdateServlet.java:58) at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) at javax.servlet.http.HttpServlet.service(HttpServlet.java:803) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter (ApplicationFilterChain.java:269) at org.apache.catalina.core.ApplicationFilterChain.doFilter (ApplicationFilterChain.java:188) at org.apache.solr.servlet.SolrDispatchFilter.doFilter (SolrDispatchFilter.java:185) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter (ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter (ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke (StandardWrapperValve.java:210) at 
org.apache.catalina.core.StandardContextValve.invoke (StandardContextValve.java:174) at org.apache.catalina.core.StandardHostValve.invoke (StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke (ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke (StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service (CoyoteAdapter.java:151) at org.apache.coyote.http11.Http11Processor.process (Http11Processor.java:870) at org.apache.coyote.http11.Http11BaseProtocol $Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket (PoolTcpEndpoint.java:528) at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt (LeaderFollowerWorkerThread.java:81) at org.apache.tomcat.util.threads.ThreadPool $ControlRunnable.run(ThreadPool.java:685) at java.lang.Thread.run(Thread.java:619) Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/opt/solr/data/index/write.lock at org.apache.lucene.store.Lock.obtain(Lock.java:70) at org.apache.lucene.index.IndexReader.acquireWriteLock (IndexReader.java:626) at org.apache.lucene.index.IndexReader.deleteDocument (IndexReader.java:660) at org.apache.solr.update.UpdateHandler $DeleteHitCollector.collect(UpdateHandler.java:170) ... 27 more ++ | Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833 ++
Re: snappuller copying to wrong directory?
That change doesn't have anything to do with where snappuller places the snapshots. Is the environment variable data_dir set up correctly in conf/scripts.conf? That's where snappuller puts the snapshots. Bill On 7/12/07, Kevin Lewandowski <[EMAIL PROTECTED]> wrote: I've been running solr replication for several months with no issues but recently had an instance where snappuller was running for about 1.5 hours. rsync was still active, so it was still copying data. I also noticed that there was a snapshot.200707 directory inside of the main index directory. I'm running an early version of snappuller. Could there have been any changes to fix a problem like this? I noticed this one in svn: revision 529471 "avoid recursive find, test for maxdepth support, filter snapshot names on master: SOLR-207" thanks, Kevin
Re: custom sorting for multivalued field
: Is it possible to assign a custom sorting value for : each of the values in the multivalued field? So that : the document gets sorted differently, depending on the : matched value in the multivalued field. Sorting happens completely independently from matching ... there is no mechanism available in the underlying Lucene code to allow the sorting logic to know why a particular document is a match. : The other approach would be to store each : document/keyword pair as a separate document with the : sorting value as an explicit field. Is it possible to : filter the results on the Solr end (based on the : relevancy of the matched keyword), so that the same : original document doesn't appear in the result set : twice? can you elaborate a bit more on what exactly it is you are trying to achieve? ...i'm having a hard time understanding the motivation for sorting on a keyword field where the sort order is on the keyword that matches .. for simple single word queries all documents will sort identically, for multi-word queries you might as well just search on each word separately and concatenate the result sets in order -- except in the case where a single document matches on more than one of your query terms, but you've already said you just want it to appear once ... but why would you want documents in this order in the first place? my first assumption would be that you just want docs which match on a very rare keyword to come first; if you are only doing searches on this keyword field, then regular sort by score should do what you want ... but you might want to omitNorms and maybe change the coordFactor in your similarity. -Hoss
Re: Deleting from index via web
On 12-Jul-07, at 6:33 AM, vanderkerkoff wrote: I/my boss and me worked it out. The delete function in solr.py looks like this

def delete(self, id):
    xstr = '<delete><id>' + self.escapeVal(`id`) + '</id></delete>'
    return self.doUpdateXML(xstr)

As we're not passing an integer it gets all c*nty booby, technical term. So if I rewrite the delete to be like this

def delete(self, id):
    xstr = '<delete><id>' + id + '</id></delete>'
    print xstr
    return self.doUpdateXML(xstr)

It works fine. There's no need for escapeVal, as I know the words I'll be sending prior to the ID; in fact, I'm not sure why escapeVal is in there at all if you can't send it non-integer values. Maybe someone can enlighten us. I would suggest replacing it with self.escapeVal(unicode(id)) -- backticks are equivalent to repr(), which does the wrong thing for strings. -Mike
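Mike's repr() point can be checked directly (Python 3 shown, where str() plays the role unicode() played in the Python 2 client; the XML wrapper is Solr's standard delete-by-id format):

```python
# Why backticks (== repr()) break string ids: repr() wraps strings in
# quotes, so the update XML ends up containing <id>'news:39'</id>.
doc_id = "news:39"
assert repr(doc_id) == "'news:39'"   # spurious quotes around the id
assert str(doc_id) == "news:39"      # what the delete actually needs

xstr = "<delete><id>" + str(doc_id) + "</id></delete>"
assert xstr == "<delete><id>news:39</id></delete>"
```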
Re: Need question to configure Log4j for solr
Check two related discussions: http://www.nabble.com/logging---slf4j--tf3366438.html#a9366144 where I suggested using slf4j and: http://www.nabble.com/Changing-Logging-in-Solr-to-Apache-Commons-Logging-tf3484843.html#a9744439 I'm still for switching to slf4j, but am not pushing it, as JDK logging is fine. Siegfried Goeschl wrote: Hi folks, would using commons-logging be an improvement? It is a common requirement to hook up different logging infrastructure .. Cheers, Siegfried Goeschl Erik Hatcher wrote: On Jul 11, 2007, at 9:07 PM, solruser wrote: How do I configure solr to use log4j logging. I am able to configure tomcat 5.5.23 to use log4j. But I could not get solr to use log4j. I have 3 contexts of solr running in tomcat which refer to a war file in commons. Solr uses standard JDK logging. I'm sure it could be bridged to log4j somehow, but rather I'd recommend you just configure JDK logging how you'd like. Erik
Re: Need question to configure Log4j for solr
: the troubles come when you integrate third-party stuff depending on : log4j (as I currently do). Having said this you have a strong point when : looking at http://www.qos.ch/logging/classloader.jsp there have been several discussions about changing the logger used by Solr ... the best summation i can give to these discussions is:
* JDK logging is universal
* using any other logging framework would add a dependency without adding functionality
* there are too many different frameworks, each with their own pros/cons and supporters/objectors, such that switching to any of them would be an uphill social battle as well as a code effort expenditure.
* as a webapp, Solr runs in a Servlet Container - any third party logging framework we might pick could have bad interactions with some Servlet Containers (ie: classloader issues, etc...) but all servlet containers must be able to handle JDK logging.
Some reading that should be considered mandatory before any further discussion... http://www.nabble.com/logging---slf4j--tf3366438.html#a9366144 http://www.nabble.com/Changing-Logging-in-Solr-to-Apache-Commons-Logging-tf3484843.html#a9782039 Specifically with regard to commons-logging, note the last paragraph of this URL... http://wiki.apache.org/jakarta-commons/Commons_Logging_FUD "...In fact, there are very limited circumstances in which Commons Logging is useful. If you're building a stand-alone application, don't use commons-logging. ..." -Hoss
snappuller copying to wrong directory?
I've been running solr replication for several months with no issues but recently had an instance where snappuller was running for about 1.5 hours. rsync was still active, so it was still copying data. I also noticed that there was a snapshot.200707 directory inside of the main index directory. I'm running an early version of snappuller. Could there have been any changes to fix a problem like this? I noticed this one in svn: revision 529471 "avoid recursive find, test for maxdepth support, filter snapshot names on master: SOLR-207" thanks, Kevin
Re: Embedded Solr with Java 1.4.x
Solr requires 1.5. It uses generics and a bunch of other 1.5 code. Jery Cook wrote: QUESTION: Jeryl Cook ^ Pharaoh ^ http://pharaohofkush.blogspot.com/ I need to make solr work with java 1.4; the organization I work for has not approved java 1.5 for the network... Before I download the source code and see if this is possible, what do you guys think the level of effort will be? [Jery Cook]
Re: Facet Field Limits
On 7/12/07, Andrew Nagy <[EMAIL PROTECTED]> wrote: My question is: Is there a way to change the limit per field? Let's say on facet 2 I would like to display 10 values instead of 5 like the other facets. From the wiki: http://wiki.apache.org/solr/SimpleFacetParameters Parameters These are the parameters used to drive the Simple Faceting behavior; note that some parameters may be overridden on a per-field basis with the following syntax: * f.<fieldName>.<FacetParam>=<value> eg. f.category.facet.limit=5 -Yonik
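Putting Yonik's answer together for Andrew's case, a request might look like this (the field names here are hypothetical; the f.<field>.facet.limit syntax is from the wiki page above):

```text
q=*:*&facet=true
  &facet.field=author&facet.field=subject
  &facet.limit=5
  &f.subject.facet.limit=10
```

Every faceted field gets at most 5 values, except subject, which gets up to 10.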
Facet Field Limits
Hello, I would like to generate a list of facets, let's say on 5 fields. I have the facet limit set to 5 so that for each of the 5 fields there will only be up to 5 values. My question is: Is there a way to change the limit per field? Let's say on facet 2 I would like to display 10 values instead of 5 like the other facets. Thanks! Andrew
Re: Embedded Solr with Java 1.4.x
Oh, and please don't cross-post :-) On 7/12/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 7/12/07, Jery Cook <[EMAIL PROTECTED]> wrote: > http://pharaohofkush.blogspot.com/ > I need to make solr work with java 1.4; the organization I work for has not > approved java 1.5 for the network... Before I download the source code and > see if this is possible, what do you guys think the level of effort will be? 1) push your organization to get into the 21st century ;-) 2) start with some of the tools available that can convert 1.5 classes to 1.4 If neither (1) nor (2) works, the effort level would probably be substantial. -Yonik
Re: Embedded Solr with Java 1.4.x
On 7/12/07, Jery Cook <[EMAIL PROTECTED]> wrote: http://pharaohofkush.blogspot.com/ I need to make solr work with java 1.4; the organization I work for has not approved java 1.5 for the network... Before I download the source code and see if this is possible, what do you guys think the level of effort will be? 1) push your organization to get into the 21st century ;-) 2) start with some of the tools available that can convert 1.5 classes to 1.4 If neither (1) nor (2) works, the effort level would probably be substantial. -Yonik
Embedded Solr with Java 1.4.x
QUESTION: Jeryl Cook ^ Pharaoh ^ http://pharaohofkush.blogspot.com/ I need to make solr work with java 1.4; the organization I work for has not approved java 1.5 for the network... Before I download the source code and see if this is possible, what do you guys think the level of effort will be? [Jery Cook]
Re: Need question to configure Log4j for solr
Hi Erik, the troubles come when you integrate third-party stuff depending on log4j (as I currently do). Having said this, you have a strong point when looking at http://www.qos.ch/logging/classloader.jsp Cheers, Siegfried Goeschl Erik Hatcher wrote: On Jul 12, 2007, at 9:03 AM, Siegfried Goeschl wrote: would using commons-logging be an improvement? It is a common requirement to hook up different logging infrastructure .. My personal take on it is that *adding* a dependency to keep functionality the same isn't an improvement. JDK logging, while not having as many bells and whistles as Commons Logging, log4j, etc, is plenty good enough and keeps us away from many of the logging JARmageddon headaches. I'm not against a logging change should others have different opinions with a strong case for improvement. Erik
Re: Need question to configure Log4j for solr
On Jul 12, 2007, at 9:03 AM, Siegfried Goeschl wrote: would using commons-logging be an improvement? It is a common requirement to hook up different logging infrastructure .. My personal take on it is that *adding* a dependency to keep functionality the same isn't an improvement. JDK logging, while not having as many bells and whistles as Commons Logging, log4j, etc, is plenty good enough and keeps us away from many of the logging JARmageddon headaches. I'm not against a logging change should others have different opinions with a strong case for improvement. Erik
RE: How to run the Embedded Solr Sample
1 we have to set the solr home in the main function manually, because there is some problem setting -Dsolr.solr.home=... in the java command parameters; it looks like SolrCore didn't read the parameter. I'm not sure about this problem, so we write in the main function: Config.setInstanceDir("E:/apache-solr-1.2.0/example/solr"); 2 there are some libs needed to run the EmbeddedSolr application, too. So we copy these libs to our lib folder and add them to the java build path:
apache-solr-1.2.0/dist/apache-solr-1.2.0.jar
apache-solr-1.2.0/lib/lucene-core-2007-05-20_00-04-53.jar
apache-solr-1.2.0/lib/lucene-analyzers-2007-05-20_00-04-53.jar
apache-solr-1.2.0/lib/lucene-snowball-2007-05-20_00-04-53.jar
apache-solr-1.2.0/lib/lucene-highlighter-2007-05-20_00-04-53.jar
apache-solr-1.2.0/lib/xpp3-1.1.3.4.O.jar
Do not use Lucene 2.1's libs but 2.2; 2.1 is not supported by apache-solr-1.2.0. -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: July 11, 2007 21:00 To: solr-user@lucene.apache.org Subject: Re: How to run the Embedded Solr Sample > > How can i run this program? > In apache site they said its like sample "example" program. If so where i > have to place this file in tomcat? > If you are running tomcat, this is *not* the way to use solr. Using tomcat, check: http://wiki.apache.org/solr/SolrTomcat
Re: Deleting from index via web
I/my boss and me worked it out. The delete function in solr.py looks like this

def delete(self, id):
    xstr = '<delete><id>' + self.escapeVal(`id`) + '</id></delete>'
    return self.doUpdateXML(xstr)

As we're not passing an integer it gets all c*nty booby, technical term. So if I rewrite the delete to be like this

def delete(self, id):
    xstr = '<delete><id>' + id + '</id></delete>'
    print xstr
    return self.doUpdateXML(xstr)

It works fine. There's no need for escapeVal, as I know the words I'll be sending prior to the ID; in fact, I'm not sure why escapeVal is in there at all if you can't send it non-integer values. Maybe someone can enlighten us. -- View this message in context: http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11560068 Sent from the Solr - User mailing list archive at Nabble.com.
Re: Need question to configure Log4j for solr
Hi folks, would be using commons-logging an improvement? It is a common requirement to hook up different logging infrastructure .. Cheers, Siegfried Goeschl Erik Hatcher wrote: On Jul 11, 2007, at 9:07 PM, solruser wrote: How do I configure solr to use log4j logging. I am able to configure tomcat 5.5.23 to use log4j. But I could not get solr to use log4j. I have 3 context of solr running in tomcat which refers to war file in commons. Solr uses standard JDK logging. I'm sure it could be bridged to log4j somehow, but rather I'd recommend you just configure JDK logging how you'd like. Erik
Re: Deleting from index via web
ok, I'm now printing out the xstr variable that the delete in solr.py uses when it's trying to delete. it's coming out like this <delete><id>'news:39'</id></delete> Those quotes look suspicious. Going to work out how to switch more debugging on in solr now so I can see what's going on exactly
Re: Deleting from index via web
Different tactic now, adding like this

idstring = "news:%s" % self.id
c.add(id=idstring, url_t=e_url, body_t=body4solr, title_t=title4solr, summary_t=summary4solr, contact_name_t=contactname4solr)
c.commit(optimize=True)

Goes in fine, search results show an ID of news:36. Delete like this

delidstring = "news:%s" % self.id
c.delete(id=delidstring)
c.commit(optimize=True)

still no joy
Re: Deleting from index via web
Done some more digging about this, here's my delete code

def delete(self):
    from solr import SolrConnection
    c = SolrConnection(host='localhost:8983', persistent=False)
    e_url = '/news/' + self.created_at.strftime("%Y/%m/%d") + '/' + self.slug
    e_url = e_url.encode('ascii','ignore')
    c.delete(id=e_url)
    c.commit(optimize=True)

I get this back from jetty INFO: delete(id '/news/2007/07/12/pilly') 0 1 It's not deleting the record from the index though, even if I restart jetty. I'm wondering if I can use URLs as IDs now.
Re: Need question to configure Log4j for solr
On Jul 11, 2007, at 9:07 PM, solruser wrote: How do I configure solr to use log4j logging. I am able to configure tomcat 5.5.23 to use log4j. But I could not get solr to use log4j. I have 3 context of solr running in tomcat which refers to war file in commons. Solr uses standard JDK logging. I'm sure it could be bridged to log4j somehow, but rather I'd recommend you just configure JDK logging how you'd like. Erik
RE: A few questions regarding multi-word synonyms and parameters encoding
Hello, > but honestly i haven't really tried anything like this ... the code for parsing the synonyms.txt file probably splits the individual synonyms on whitespace to produce multiple tokens which might screw you up ... you may need to get creative (perhaps use a PatternReplaceFilter to encode your spaces as "_" before the SynonymFilter and then another one to convert the "_" back to " " after the Synonym filter ... kludgy but it might work) I had to build exactly this recently, but without solr and only lucene. I chose to create a CompressFilter as the last filter, to reduce all tokens into one single token (since these were facet fields I knew there were only a couple of tokens, and not thousands; compressing thousands into a single token might be a problem (not sure)). So for building synonyms on facet fields which can contain multiple tokens, I would add your own SynonymAnalyzer that compresses tokens, and when a compressed token is found in a synonym map, replace the token with the synonym. 
So, in your SynonymAnalyzer, something like:

    private Map synonyms; // initialize it

    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = super.tokenStream(fieldName, reader);
        if (fieldName.equals("synonym_field")) {
            result = new CompressFilter(result, synonyms);
        } else if (fieldName.equals("compressed_field")) {
            result = new CompressFilter(result);
        }
        return result;
    }

and your CompressFilter:

    public CompressFilter(TokenStream in, Map synonyms) {
        super(in);
        this.synonyms = synonyms;
    }

    public CompressFilter(TokenStream in) {
        super(in);
    }

    public Token next() throws IOException {
        Token t = input.next();
        if (t == null) {
            return null;
        }
        // concatenate every token in the stream into a single buffer
        StringBuffer sb = new StringBuffer();
        while (t != null) {
            sb.append(t.termText());
            t = input.next();
        }
        // if a synonym map was given, emit the synonym (or nothing at all)
        if (synonyms != null) {
            if (synonyms.containsKey(sb.toString())) {
                sb = new StringBuffer((String) synonyms.get(sb.toString()));
            } else {
                return null; // synonym not found
            }
        }
        return new Token(sb.toString(), 0, sb.toString().length());
    }

I am not sure though how easy it is to put this into Solr, but I suppose
it isn't hard. Obviously, I am also not sure what happens with the
CompressFilter when there are *many* tokens in the "synonym_field" field.

Regards Ard

> : Now I want to create a link for each of these values so that the user
> : can filter the results by that title by clicking on the link. For
> : example, if I click on "Software Engineer", the results are now
> : narrowed down to just include records with "Software Engineer" in
> : their title. Since the "title" field can contain special chars like
> : '+', '&' ..., I really can't find a clean way to do this. At the
> : moment, I replace all the spaces by '+' and it seems to work for
> : words like "Software Engineer" (converted to "Software+Engineer").
> : However, "C++ Programmer" is converted to "C+++Programmer", and it
> : doesn't seem to work (returns no results). Any ideas?
>
> for starters you need to URL encode *all* of the characters, not just
> the spaces ...
> space escapes to "+" but only because "+" escapes to %2B.
>
> second, if you are dealing with multi-word values like this in your
> facets, you need to make sure to quote them when doing fq queries too
> (before url encoding) ... so if you have a facet.field "skills" that
> lists "C++ Programmer" as the value, the fq query you want to use
> would be...
>
> skills:"C++ Programmer"
>
> when you URL encode that it should become...
>
> fq=skills%3A%22C%2B%2B+Programmer%22
>
> ...use the echoParams=explicit&debugQuery=true params to see exactly
> what your params look like when they've been URL decoded and what your
> query objects look like once they've been parsed.
>
> -Hoss
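The encoding Hoss describes is easy to sanity-check from Python's standard library; quote_plus does exactly the space-to-'+' and '+'-to-%2B escaping described (the fq value is the one from the example above):

```python
from urllib.parse import quote_plus

# space -> '+', ':' -> %3A, '"' -> %22, '+' -> %2B
fq = quote_plus('skills:"C++ Programmer"')
print("fq=" + fq)
# fq=skills%3A%22C%2B%2B+Programmer%22
```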
RE: Question about synonyms
Hello,

> Briefly, what I'm looking for is a query that launches something like
> this:
>
> Given the user search expression "A B C D"
>
> Generated Lucene query:
> (myfield:I OR myfield:J OR myfield:O OR myfield:K)
>
> if someone knows a way to reach this goal, please tell me how; I'm
> actually tearing my hair out on this issue and I'd really appreciate
> some help!!

IMHO, it does not make very much sense to me to rewrite a PhraseQuery
like "A B C D" into a boolean OR query! But, if you really insist that it
should work like this, I think it won't be too hard: you said
myfield:("A B C D") is translated into PhraseQuery(myfield:"I J O K").
So, I think you should start from there, get the term(s) out of the
translated PhraseQuery, and rewrite that one into a boolean OR query.

I am by the way curious how the PhraseQuery works in combination with
synonyms, because if my phrase is "A B C D", will it first look for a
synonym for "A B C D", then (if not found) for "A B C", then for
"B C D", then for "A B" and "C D", and then for the individual terms?
I think when the phrase grows, the number of combinations grows pretty
fast, doesn't it?

Regards Ard

> Thank you, and thanks to the solr team for this amazing product that
> really improved the performance of our search engine by x100!
>
> Laurent
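For what it's worth, once you have the terms out of the rewritten PhraseQuery, building the OR query string itself is trivial; a sketch in Python (the field and term names are the ones from the example above, and the helper is mine):

```python
def or_query(field, terms):
    # builds e.g. (myfield:I OR myfield:J OR myfield:O OR myfield:K)
    return "(%s)" % " OR ".join("%s:%s" % (field, t) for t in terms)

print(or_query("myfield", ["I", "J", "O", "K"]))
# (myfield:I OR myfield:J OR myfield:O OR myfield:K)
```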
Deleting from index via web
Hello everyone. We're adding records to our 1.1 index through Django and
Python like so, using the jetty app. This is in the save definition:

    from solr import SolrConnection
    c = SolrConnection(host='localhost:8983', persistent=False)
    c.add(id=e_url, url_t=e_url, body_t=body4solr, title_t=title4solr,
          summary_t=summary4solr, contact_name_t=contactname4solr)
    c.commit(optimize=True)

I need to write a script to remove the item from the index in the delete
function. Do I need to create all the items like I do on the add, or can
I just somehow say delete all the records where id=e_url? Something like:

    from solr import SolrConnection
    c = SolrConnection(host='localhost:8983', persistent=False)
    c.delete(* where id=e_url)  # pseudocode
    c.commit(optimize=True)

Any help as always is greatly appreciated.

--
View this message in context: http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11556220
Sent from the Solr - User mailing list archive at Nabble.com.
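Solr's update handler does accept a delete-by-query message, so deleting everything matching a field value is possible without reconstructing the document. A sketch of the XML you'd post to /update (the url_t field name comes from the add call above; whether the Python client wraps this directly is an assumption worth checking, so the XML is built by hand here):

```python
from xml.sax.saxutils import escape

def delete_by_query_xml(field, value):
    # builds e.g. <delete><query>url_t:"/news/..."</query></delete>,
    # which is POSTed to Solr's /update handler
    return '<delete><query>%s:"%s"</query></delete>' % (field, escape(value))

print(delete_by_query_xml("url_t", "/news/2007/07/12/pilly"))
# <delete><query>url_t:"/news/2007/07/12/pilly"</query></delete>
```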