Re: Performance when indexing or cold cache
Hi Walter, did you find a way to sort out your issue? I would be very interested. Thanks a lot,

Walter Underwood wrote:
>
> We've had some performance problems while Solr is indexing and also when it
> starts with a cold cache. I'm still digging through our own logs, but I'd
> like to get more info about this, so any ideas or info are welcome.
>
> We have four Solr servers on dual CPU PowerPC machines, 2G of heap, about
> 100-300 queries/second, 250K docs, Tomcat 6.0.10, not fronted by Apache.
> We don't use facets, we sort by score. In general use, there are six
> different request handlers called to build a page. Here is one; they
> are all very similar.
>
> 0.01
> exact^8.0 exact_alt^6.0 exact_base^8.0 title^4.0 title_alt^3.0
> title_base^4.0 phonetic_hi^1.0
> exact^12.0 exact_alt^9.0 exact_base^12.0 title^6.0 title_alt^4.0
> title_base^6.0 phonetic_hi^1.5
> popularity^2.0
> id,type,movieid,personid,genreid,score
> 1
> 100
> (pushstatus:A AND (type:movie OR type:person))
>
> wunder

--
View this message in context: http://www.nabble.com/Performance-when-indexing-or-cold-cache-tp13348420p22984912.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Any tips for indexing large amounts of data?
OK, but how do people handle frequent updates to a large database with a lot of queries on it? Do they turn off the slave during the warmup?

Noble Paul നോബിള് नोब्ळ् wrote:
>
> On Thu, Apr 9, 2009 at 8:51 PM, sunnyfr wrote:
>>
>> Hi Otis,
>> How did you manage that? I've 8 core machine with 8GB of ram and 11GB index
>> for 14M docs and 5 update every 30mn but my replication kill everything.
>> My segments are merged too often sor full index replicate and cache lost and
>> I've no idea what can I do now?
>> Some help would be brilliant,
>> btw im using Solr 1.4.
>>
>
> sunnnyfr , whether the replication is full or delta , the caches are
> lost completely.
>
> you can think of partitioning the index into separate Solrs and
> updating one partition at a time and perform distributed search.
>
>> Thanks,
>>
>> Otis Gospodnetic wrote:
>>>
>>> Mike is right about the occasional slow-down, which appears as a pause and
>>> is due to large Lucene index segment merging. This should go away with
>>> newer versions of Lucene where this is happening in the background.
>>>
>>> That said, we just indexed about 20MM documents on a single 8-core machine
>>> with 8 GB of RAM, resulting in nearly 20 GB index. The whole process took
>>> a little less than 10 hours - that's over 550 docs/second. The vanilla
>>> approach before some of our changes apparently required several days to
>>> index the same amount of data.
>>>
>>> Otis
>>> --
>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>
>>> - Original Message
>>> From: Mike Klaas
>>> To: solr-user@lucene.apache.org
>>> Sent: Monday, November 19, 2007 5:50:19 PM
>>> Subject: Re: Any tips for indexing large amounts of data?
>>>
>>> There should be some slowdown in larger indices as occasionally large
>>> segment merge operations must occur. However, this shouldn't really
>>> affect overall speed too much.
>>>
>>> You haven't really given us enough data to tell you anything useful.
>>> I would recommend trying to do the indexing via a webapp to eliminate
>>> all your code as a possible factor. Then, look for signs to what is
>>> happening when indexing slows. For instance, is Solr high in cpu, is
>>> the computer thrashing, etc?
>>>
>>> -Mike
>>>
>>> On 19-Nov-07, at 2:44 PM, Brendan Grainger wrote:
>>>
>>> Hi,
>>>
>>> Thanks for answering this question a while back. I have made some of the
>>> suggestions you mentioned, ie not committing until I've finished indexing.
>>> What I am seeing though, is as the index gets larger (around 1Gb),
>>> indexing is taking a lot longer. In fact it slows down to a crawl. Have
>>> you got any pointers as to what I might be doing wrong?
>>>
>>> Also, I was looking at using MultiCore solr. Could this help in some way?
>>>
>>> Thank you
>>> Brendan
>>>
>>> On Oct 31, 2007, at 10:09 PM, Chris Hostetter wrote:
>>>
>>> > : I would think you would see better performance by allowing auto commit
>>> > : to handle the commit size instead of reopening the connection all the
>>> > : time.
>>> >
>>> > if your goal is "fast" indexing, don't use autoCommit at all ... just
>>> > index everything, and don't commit until you are completely done.
>>> >
>>> > autoCommitting will slow your indexing down (the benefit being that more
>>> > results will be visible to searchers as you proceed)
>>> >
>>> > -Hoss
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Any-tips-for-indexing-large-amounts-of-data--tp13510670p22973205.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
> --
> --Noble Paul

--
View this message in context: http://www.nabble.com/Any-tips-for-indexing-large-amounts-of-data--tp13510670p22986152.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: multiple tokenizers needed
The only thing that comes to mind in a short-term way is writing two TokenFilter implementations that wrap the second and third tokenizers.

On Apr 9, 2009, at 11:00 PM, Ashish P wrote:

> I want to analyze a text based on pattern ";" and separate on whitespace,
> and it is a Japanese text so use CJKAnalyzer + tokenizer also. In short I
> want to do:
>
> Can anyone please tell me how to achieve this?? Because the above syntax
> is not at all possible.
>
> --
> View this message in context: http://www.nabble.com/multiple-tokenizers-needed-tp22982382p22982382.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
Re: Question on Solr Distributed Search
On Fri, Apr 10, 2009 at 7:50 AM, vivek sar wrote: > Just an update. I changed the schema to store the unique id field, but > I still get the connection reset exception. I did notice that if there > is no data in the core then it returns the 0 result (no exception), > but if there is data and you search using "shards" parameter I get the > connection reset exception. Can anyone provide some tip on where can I > look for this problem? > > Did you re-index after changing the field to stored? -- Regards, Shalin Shekhar Mangar.
QueryElevationComponent : hot update of elevate.xml
Hello !

Browsing the mailing-list's archives did not help me find the answer, hence the question asked directly here.

Some context first: integrating Solr with a CMS ( eZ Publish ), we chose to support Elevation. The idea is to be able to 'elevate' any object from the CMS. This can be achieved through eZ Publish's back office, with a dedicated Elevate administration GUI; the configuration is stored in the CMS temporarily, and then synchronized frequently and/or on demand onto Solr. This synchronisation is currently done as follows:

1. Generate the elevate.xml based on the stored configuration
2. Replace elevate.xml in Solr's dataDir
3. Commit.

It appears that when having elevate.xml in Solr's dataDir, and solely in this case, committing triggers a reload of elevate.xml. This does not happen when elevate.xml is stored in Solr's conf dir.

This method has one main issue though: eZ Publish needs to have access to the same filesystem as the one on which Solr's dataDir is stored. This is not always the case when the CMS is clustered for instance --> show stopper :(

Hence the following idea / RFC: how about extending the Query Elevation system with the possibility to push an updated elevate.xml file/XML through HTTP? This would update the file where it is actually located, and trigger a reload of the configuration.

Not being very knowledgeable about Solr's API ( yet ! ), I cannot figure out whether this would be possible, how this would be achievable ( which type of plugin for instance ) or even be valid?

Thanks a lot in advance for your thoughts,
--
Nicolas
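Step 1 of the synchronisation (generating elevate.xml from the stored configuration) is straightforward to script. A minimal sketch in Python, assuming the stored configuration can be reduced to a {query text: [doc ids]} mapping; the function name and the example data are hypothetical:

```python
import xml.etree.ElementTree as ET

def build_elevate_xml(config):
    """Build an elevate.xml document from a {query text: [doc ids]}
    mapping, following the standard elevate.xml layout:
    <elevate><query text="..."><doc id="..."/></query></elevate>"""
    root = ET.Element("elevate")
    for query_text, doc_ids in config.items():
        query = ET.SubElement(root, "query", text=query_text)
        for doc_id in doc_ids:
            ET.SubElement(query, "doc", id=str(doc_id))
    return ET.tostring(root, encoding="unicode")

# Hypothetical configuration pulled from the CMS back office
payload = build_elevate_xml({"ipod": ["MA147LL/A", "IW-02"]})
```

The generated string is what would then be written into dataDir (or, per the RFC above, POSTed to Solr over HTTP).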
Re: Any tips for indexing large amounts of data?
they don't usually turn off the slave, but it is not a bad idea if you can take it offline. It is a logistical headache. BTW, do you have a very good cache hit ratio? Then it makes sense to autowarm.
--Noble

On Fri, Apr 10, 2009 at 4:07 PM, sunnyfr wrote:
>
> ok but how people do for a frequent update for a large dabase and lot of
> query on it ?
> do they turn off the slave during the warmup ??

--
--Noble Paul
Re: multiple tokenizers needed
Or have the indexing client split the data at these delimiters and just use the CJKAnalyzer. Erik On Apr 10, 2009, at 7:30 AM, Grant Ingersoll wrote: The only thing that comes to mind in a short term way is writing two TokenFilter implementations that wrap the second and third tokenizers On Apr 9, 2009, at 11:00 PM, Ashish P wrote: I want to analyze a text based on pattern ";" and separate on whitespace and it is a Japanese text so use CJKAnalyzer + tokenizer also. in short I want to do: Can anyone please tell me how to achieve this?? Because the above syntax is not at all possible. -- View this message in context: http://www.nabble.com/multiple-tokenizers-needed-tp22982382p22982382.html Sent from the Solr - User mailing list archive at Nabble.com. -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
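Erik's suggestion (split the data client-side and let the CJKAnalyzer handle the rest) could be sketched like this, assuming the delimiters are ";" plus whitespace as in the original question; the function name is made up for illustration:

```python
import re

def pre_tokenize(text):
    """Split a raw field value on ';' and whitespace on the client side,
    so a single CJKAnalyzer-backed field type is enough in Solr."""
    return [piece for piece in re.split(r"[;\s]+", text) if piece]

# The pieces can then be indexed as a multi-valued field, or re-joined
# with spaces before posting the document.
pieces = pre_tokenize("東京;大阪 名古屋")
```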
Re: QueryElevationComponent : hot update of elevate.xml
On Apr 10, 2009, at 7:48 AM, Nicolas Pastorino wrote:

> Hence the following idea / RFC : How about extending the Query Elevation
> system with the possibility to push an updated elevate.xml file/XML
> through HTTP ? This would update the file where it is actually located,
> and trigger a reload of the configuration.

Perhaps look at implementing a custom RequestHandler: http://wiki.apache.org/solr/SolrRequestHandler

Maybe it could POST the new elevate.xml and then save it to the right place and call commit...

ryan
Re: Additive filter queries
That would work, but the other part of our problem comes in when we then try to facet on the resulting set. If we filter by size 1, for example, and then facet Width again, we get facet results that have no size 1's, because we have not taught Solr what 1_W means, etc. I think field collapsing might solve this for us, maybe.

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833

On Apr 9, 2009, at 5:23 PM, Chris Hostetter wrote:

: Right now a document looks like this:
:
: 1598548
: 12545
: Adidas
: 1, 2, 3, 4, 5, 6, 7
: AA, A, B, W, W,
: Brown
:
: If we went down a level, it could look like..
:
: 1598548
: 12545
: 654641654684
: Adidas
: 1
: AA
: Brown

If you want results at the "product" level then you don't have to have one *doc* per legal size+width pair ... you just need one *term* per valid size+width pair

1, 2, 3, 4, 5, 6, 7
AA, A, B, W, W,
1_W 2_W 3_B 3_W 4_AA 4_A 4_B 4_W 4_WW 5_W 5_ 6_ 7_

a search for size 4 clogs would look like...

q=clogs&fq=size:4&facet.field=opts&f.opts.facet.prefix=4_

...and the facet counts for "opts" would tell me what widths were available (and how many).

for completeness you typically want to index the pairs in both directions (1_W and W_1 ... typically in separate fields) so the user can filter by either option first ... for something like size+color this makes sense, but i'm guessing with shoes no one expects to narrow by "width" until they've narrowed by size first.

-Hoss
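Hoss's one-term-per-valid-pair scheme is easy to generate at indexing time. A rough sketch, with a hypothetical function name and made-up stock data:

```python
def build_option_terms(available_pairs):
    """Build composite option terms in both orderings (size_width and
    width_size) for the size+width pairs actually in stock, so the UI
    can narrow by either dimension first using facet.prefix."""
    size_width = ["%s_%s" % (s, w) for s, w in available_pairs]
    width_size = ["%s_%s" % (w, s) for s, w in available_pairs]
    return size_width, width_size

# Hypothetical in-stock pairs for one product
sw, ws = build_option_terms([("4", "AA"), ("4", "W"), ("5", "W")])
# sw goes into an 'opts' field; faceting with f.opts.facet.prefix=4_
# then yields only the widths available in size 4, which addresses the
# "no size 1's in the Width facet" problem described above.
```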
Index Version Number
Is it possible for a Solr client to determine if the index has changed since the last time it performed a query? For example, is it possible to query the current Lucene indexVersion? Thanks in advance for your help, Richard
Re: Question on Solr Distributed Search
yes - it's all new indexes. I can search them individually, but adding "shards" throws "Connection Reset" error. Is there any way I can debug this or any other pointers? -vivek On Fri, Apr 10, 2009 at 4:49 AM, Shalin Shekhar Mangar wrote: > On Fri, Apr 10, 2009 at 7:50 AM, vivek sar wrote: > >> Just an update. I changed the schema to store the unique id field, but >> I still get the connection reset exception. I did notice that if there >> is no data in the core then it returns the 0 result (no exception), >> but if there is data and you search using "shards" parameter I get the >> connection reset exception. Can anyone provide some tip on where can I >> look for this problem? >> >> > Did you re-index after changing the field to stored? > -- > Regards, > Shalin Shekhar Mangar. >
Re: Help with relevance failure in Solr 1.3
If you don't see the attachments, you can get them here: http://wunderwood.org/solr/

wunder

On 4/10/09 10:56 AM, "Walter Underwood" wrote:

> We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, and
> I would appreciate any ideas.
>
> Occasionally, a server will start returning results with really poor
> relevance. Single term queries work fine, but multi-term queries are
> scored based on the most common term (lowest IDF).
>
> I don't see anything in the logs when this happens. We have a monitor
> doing a search for the 100 most popular movies once per minute to
> catch this, so we know when it was first detected.
>
> I'm attaching two explain outputs, one for the query "changeling" and
> one for "the changeling".
>
> We are running Solr 1.3 with Lucene 2.4.0, and have added a fuzzy query
> using JaroWinkler matching.
>
> I'd appreciate ideas about where to look, what debug output to try, etc.
>
> wunder
Help with relevance failure in Solr 1.3
We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, and I would appreciate any ideas.

Occasionally, a server will start returning results with really poor relevance. Single term queries work fine, but multi-term queries are scored based on the most common term (lowest IDF).

I don't see anything in the logs when this happens. We have a monitor doing a search for the 100 most popular movies once per minute to catch this, so we know when it was first detected.

I'm attaching two explain outputs, one for the query "changeling" and one for "the changeling".

We are running Solr 1.3 with Lucene 2.4.0, and have added a fuzzy query using JaroWinkler matching.

I'd appreciate ideas about where to look, what debug output to try, etc.

wunder
Re: Help with relevance failure in Solr 1.3
On Apr 10, 2009, at 1:56 PM, Walter Underwood wrote:

> We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, and
> I would appreciate any ideas.
>
> Occasionally, a server will start returning results with really poor
> relevance. Single term queries work fine, but multi-term queries are
> scored based on the most common term (lowest IDF).
>
> I don't see anything in the logs when this happens. We have a monitor
> doing a search for the 100 most popular movies once per minute to
> catch this, so we know when it was first detected.
>
> I'm attaching two explain outputs, one for the query "changeling" and
> one for "the changeling".

I'm not sure what exactly you are asking, so bear with me...

Are you saying that "the changeling" normally returns results just fine and then periodically it will "go bad", or are you saying you don't understand why "the changeling" scores differently from "changeling"? In looking at the explains, it is weird that in the "the changeling" case, the term changeling doesn't even show up as a term.

Can you share your dismax configuration? That will be easier to parse than trying to make sense of the debug query parsing.

-Grant
Re: Index Version Number
This info is available via the Luke request handler, I believe: http://localhost:8983/solr/admin/luke/

In there, I see version, current, and optimized information. See also http://wiki.apache.org/solr/LukeRequestHandler

HTH,
Grant

On Apr 10, 2009, at 11:58 AM, Richard Wiseman wrote:

> Is it possible for a Solr client to determine if the index has changed
> since the last time it performed a query? For example, is it possible to
> query the current Lucene indexVersion?
>
> Thanks in advance for your help,
> Richard

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
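For a client, the interesting part is pulling the version value out of the Luke handler's XML response. A sketch, assuming a response shaped like the fragment below (the sample is illustrative, not captured from a real server):

```python
import xml.etree.ElementTree as ET

# Illustrative fragment of a Luke handler response; a real response has
# many more fields, but the <lst name="index"> block is what matters.
SAMPLE = """<response>
  <lst name="index">
    <int name="numDocs">250000</int>
    <long name="version">1239414838000</long>
    <bool name="current">true</bool>
  </lst>
</response>"""

def index_version(luke_xml):
    """Pull the Lucene index version out of a Luke handler response.
    A client can cache this value and treat a change as 'the index
    has changed since my last query'."""
    root = ET.fromstring(luke_xml)
    node = root.find("./lst[@name='index']/long[@name='version']")
    return int(node.text) if node is not None else None
```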
Question on StreamingUpdateSolrServer
Hi, I was using CommonsHttpSolrServer for indexing, but having two threads writing (10K batches) at the same time was throwing, "ProtocolException: Unbuffered entity enclosing request can not be repeated. " I switched to StreamingUpdateSolrServer (using addBeans) and I don't see the problem anymore. The speed is very fast - getting around 25k/sec (single thread), but I'm facing another problem. When the indexer using StreamingUpdateSolrServer is running I'm not able to send any url request from browser to Solr web app. I just get blank page. I can't even get to the admin interface. I'm also not able to shutdown the Tomcat running the Solr webapp when the Indexer is running. I've to first stop the Indexer app and then stop the Tomcat. I don't have this problem when using CommonsHttpSolrServer. Here is how I'm creating it, server = new StreamingUpdateSolrServer(url, 1000,3); I simply call server.addBeans(...) on it. Is there anything else I need to do to make use of StreamingUpdateSolrServer? Why does Tomcat become unresponsive when Indexer using StreamingUpdateSolrServer is running (though, indexing happens fine)? Thanks, -vivek
Re: Help with relevance failure in Solr 1.3
Normally, both "changeling" and "the changeling" work fine. This one server is misbehaving like this for all multi-term queries. Yes, it is VERY weird that the term "changeling" does not show up in the explain. A server will occasionally "go bad" and stay in that state. In one case, two servers went bad and both gave the same wrong results. Here is the dismax config. "groups" means "movies". The title* fields are stemmed and stopped, the "exact*" fields are not. dismax none 0.01 exact^6.0 exact_alt^6.0 exact_base~jw_0.7_1^8.0 exact_alias^8.0 title^3.0 title_alt^3.0 title_base^4.0 exact^9.0 exact_alt^9.0 exact_base^12.0 exact_alias^12.0 title^3.0 title_alt^4.0 title_base^6.0 search_popularity^100.0 1 100 id,type,movieid,personid,genreid type:group OR type:person wunder On 4/10/09 12:51 PM, "Grant Ingersoll" wrote: > > On Apr 10, 2009, at 1:56 PM, Walter Underwood wrote: > >> We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, >> and >> I would appreciate any ideas. >> >> Ocassionally, a server will start returning results with really poor >> relevance. Single term queries work fine, but multi-term queries are >> scored based on the most common term (lowest IDF). >> >> I don't see anything in the logs when this happens. We have a monitor >> doing a search for the 100 most popular movies once per minute to >> catch this, so we know when it was first detected. >> >> I'm attaching two explain outputs, one for the query "changeling" and >> one for "the changeling". > > > I'm not sure what exactly you are asking, so bear with me... > > Are you saying that "the changeling" normally returns results just > fine and then periodically it will "go bad" or are you saying you > don't understand why "the changeling" scores differently from > "changeling"? In looking at the explains, it is weird that in the > "the changeling" case, the term changeling doesn't even show up as a > term. > > Can you share your dismax configuration? 
That will be easier to parse > than trying to make sense of the debug query parsing. > > -Grant
Re: Question on StreamingUpdateSolrServer
I also noticed that the Solr app has over 6000 file handles open - "lsof | grep solr | wc -l" - shows 6455 I've 10 cores (using multi-core) managed by the same Solr instance. As soon as start up the Tomcat the open file count goes up to 6400. Few questions, 1) Why is Solr holding on to all the segments from all the cores - is it because of auto-warmer? 2) How can I reduce the open file count? 3) Is there a way to stop the auto-warmer? 4) Could this be related to "Tomcat returning blank page for every request"? Any ideas? Thanks, -vivek On Fri, Apr 10, 2009 at 1:48 PM, vivek sar wrote: > Hi, > > I was using CommonsHttpSolrServer for indexing, but having two > threads writing (10K batches) at the same time was throwing, > > "ProtocolException: Unbuffered entity enclosing request can not be repeated. > " > > I switched to StreamingUpdateSolrServer (using addBeans) and I don't > see the problem anymore. The speed is very fast - getting around > 25k/sec (single thread), but I'm facing another problem. When the > indexer using StreamingUpdateSolrServer is running I'm not able to > send any url request from browser to Solr web app. I just get blank > page. I can't even get to the admin interface. I'm also not able to > shutdown the Tomcat running the Solr webapp when the Indexer is > running. I've to first stop the Indexer app and then stop the Tomcat. > I don't have this problem when using CommonsHttpSolrServer. > > Here is how I'm creating it, > > server = new StreamingUpdateSolrServer(url, 1000,3); > > I simply call server.addBeans(...) on it. Is there anything else I > need to do to make use of StreamingUpdateSolrServer? Why does Tomcat > become unresponsive when Indexer using StreamingUpdateSolrServer is > running (though, indexing happens fine)? > > Thanks, > -vivek >
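One quick sanity check alongside the lsof count is the per-process open-file limit the server is running under; if the 6400 handles approach the soft limit, refused connections can look exactly like blank pages. A Unix-only sketch using Python's resource module (this reads the limits of the current process; for Tomcat itself you would inspect its own environment, e.g. /proc/<pid>/limits on Linux):

```python
import resource

def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE limits for this process.
    When open segment files plus sockets approach the soft limit, the
    container can start refusing new connections."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = nofile_limits()
```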
Re: logging
If you use the off-the-shelf .war, it *should* be the same. (If not, we need to fix it.)

If you are building your own .war, how SLF4J behaves depends on what implementation is in the runtime path. If you want to use log4j logging, put slf4j-log4j12.jar in your classpath and you should be all set.

On Apr 9, 2009, at 4:56 PM, Kevin Osborn wrote:

> We built our own webapp that used the Solr JARs. We used Apache
> Commons/log4j logging and just put log4j.properties in the Resin conf
> directory. The commons-logging and log4j jars were put in the Resin lib
> directory. Everything worked great and we got log files for our code only.
>
> So, I upgraded to Solr 1.4 and I no longer get my log file. I assume it
> has something to do with Solr 1.4 using SLF4J instead of JDK logging, but
> it seems like my code would be independent of that. Any ideas?
Re: logging
Or for my quick and dirty method (this was just a test), I just removed the jcl-over-slf4j JAR, and it worked like normal.

From: Ryan McKinley
To: solr-user@lucene.apache.org
Sent: Friday, April 10, 2009 3:16:30 PM
Subject: Re: logging

If you use the off the shelf .war, it *should* be the same. (if not, we need to fix it) If you are building your own .war, how SLF4J behaves depends on what implementation is in the runtime path. If you want to use log4j logging, put in the slf4j-log4j.jar in your classpath and you should be all set.

On Apr 9, 2009, at 4:56 PM, Kevin Osborn wrote:

> We built our own webapp that used the Solr JARs. We used Apache
> Commons/log4j logging and just put log4j.properties in the Resin conf
> directory. The commons-logging and log4j jars were put in the Resin lib
> directory. Everything worked great and we got log files for our code only.
>
> So, I upgraded to Solr 1.4 and I no longer get my log file. I assume it
> has something to do with Solr 1.4 using SLF4J instead of JDK logging, but
> it seems like my code would be independent of that. Any ideas?
maxCodeLength in PhoneticFilterFactory
i have this version of solr running:

Solr Implementation Version: 1.4-dev 747554M - bwhitman - 2009-02-24 16:37:49

and am trying to update a schema to support 8 code length metaphone instead of 4 via this (committed) issue: https://issues.apache.org/jira/browse/SOLR-813

So I changed the schema to this (knowing that I have to reindex). But when I do, queries fail with:

Error initializing DoubleMetaphone class org.apache.commons.codec.language.DoubleMetaphone
  at org.apache.solr.analysis.PhoneticFilterFactory.init(PhoneticFilterFactory.java:90)
  at org.apache.solr.schema.IndexSchema$6.init(IndexSchema.java:821)
  at org.apache.solr.schema.IndexSchema$6.init(IndexSchema.java:817)
  at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:149)
  at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:831)
  at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58)
  at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:425)
  at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:410)
  at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
  at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:452)
  at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:95)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:501)
  at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121)
PHP Remove From Index/Search By Fields
Hey, how could I write some code in PHP to place in a button that removes a returned item from the index? In turn, is it possible to copy all of the XML elements from said item and place them in a document somewhere locally once it's been removed? Finally, there is one default search field. How do you search on multiple different fields in PHP? If I wanted to search by all of the fields indexed, is that easy to code? What changes do I need to make in the XML schema? Thanks so much for any help!

--
View this message in context: http://www.nabble.com/PHP-Remove-From-Index-Search-By-Fields-tp22996701p22996701.html
Sent from the Solr - User mailing list archive at Nabble.com.
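For the delete part: Solr's update handler accepts a small XML body, so the button only needs to POST one of the payloads below to /solr/update (followed by a commit). Sketched here in Python for brevity; the same strings can be POSTed from PHP with curl. If you need a local copy of the document, fetch its stored fields with a query *before* deleting, since Solr does not return them on delete.

```python
from xml.sax.saxutils import escape

def delete_by_id(doc_id):
    """Build the XML body Solr's update handler expects for deleting a
    single document by its uniqueKey; POST it to /solr/update and then
    POST <commit/> to make the removal visible to searches."""
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))

def delete_by_query(query):
    """Same idea, but removing every document matching a query."""
    return "<delete><query>%s</query></delete>" % escape(query)
```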
special characters in Solr search query.
Hi,

There is a strange issue while querying the Solr indexes. If my query contains special characters like [ ] ! < > etc., it throws a query parse exception. From my application interface I am able to handle the special characters, but the issue is that when the document I am going to index contains any of these special characters, it throws a query parse exception. Can anyone give a pointer on this? Thanks in advance.

Regards,
Sagar Khetkade
Re: special characters in Solr search query.
On Sat, Apr 11, 2009 at 10:13 AM, Sagar Khetkade wrote: > > There is a strange issue while querying on the Solr indexes. If my query > contains the special characters like [ ] !<> etc. It is throwing the query > parse exception. From my application interface I am able to handle the > special characters but the issue is while the document which I am going to > index contains any of these special characters it is throwing query parse > exception. Can anyone give pointer over this? > Thanks in advance. > You need to escape those characters. Look at http://lucene.apache.org/java/2_4_1/queryparsersyntax.html#Escaping%20Special%20Characters If you are using Solrj, this should be done automatically. Solrj calls ClientUtils.escapeQueryChars under the hood. -- Regards, Shalin Shekhar Mangar.
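For clients that don't go through Solrj, the same escaping is easy to reproduce by hand. A sketch in Python; the character set below is modeled on Solrj's ClientUtils.escapeQueryChars and may differ slightly between versions:

```python
# Lucene query-syntax characters that need backslash escaping, per the
# query parser syntax docs linked above (set is an assumption modeled
# on ClientUtils.escapeQueryChars, not copied from a specific release).
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;')

def escape_query_chars(s):
    """Backslash-escape query-syntax special characters in user input
    before embedding it in a Solr query string."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in s)
```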
Re: Question on StreamingUpdateSolrServer
On Sat, Apr 11, 2009 at 3:29 AM, vivek sar wrote:

> I also noticed that the Solr app has over 6000 file handles open -
>
> "lsof | grep solr | wc -l" - shows 6455
>
> I've 10 cores (using multi-core) managed by the same Solr instance. As
> soon as start up the Tomcat the open file count goes up to 6400. Few
> questions,
>
> 1) Why is Solr holding on to all the segments from all the cores - is
> it because of auto-warmer?

You have 10 cores, so Solr opens 10 indexes, each of which contains multiple files. That is one reason. Apart from that, Tomcat will keep some file handles for incoming connections.

> 2) How can I reduce the open file count?

Are they causing a problem? Tomcat will log messages when it cannot accept incoming connections if it runs out of available file handles. But if you are experiencing issues, you can increase the file handle limit or you can set useCompoundFile=true in solrconfig.xml.

> 3) Is there a way to stop the auto-warmer?
> 4) Could this be related to "Tomcat returning blank page for every
> request"?

It could be. Check the Tomcat and Solr logs.

--
Regards,
Shalin Shekhar Mangar.
Re: sorlj search
On Wed, Feb 6, 2008 at 10:51 AM, Tevfik Kiziloren wrote: > > Caused by: org.apache.solr.common.SolrException: parsing error >at > > org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:138) >at > > org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:99) >at > > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:317) >at > > org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:84) >... 29 more > Caused by: java.lang.RuntimeException: this must be known type! not: int >at > > org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:217) >at > > org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:235) >at > > org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:123) > Which version of Solr and Solrj client are you using? -- Regards, Shalin Shekhar Mangar.