Something like 'bf' or 'bq' with MoreLikeThis
I'm looking for a way to improve the relevancy of my MLT results. For my index of movies, the MoreLikeThisHandler does a great job of returning related documents based on the fields I specify, like 'genre'. But within my "bands" of results (groups of documents that all have the same score because they all match on the mlt.fl and mlt.qf params), there's nothing to sort the results /within/ those bands.

A good way to fix this would be a boost function like bf=recip(rord(created_at),1,1000,1000), so that newer movies show up higher, but I don't think the MLT handler supports bf or bq. Is there something similar I could use that would accomplish the same thing, maybe using the _val_: hook somewhere?
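(For reference, here's the _val_: hook I have in mind as it works on a normal query -- a sketch only, where created_at is a date field from my schema and the handler is the stock /select:

  q=genre:action _val_:"recip(rord(created_at),1,1000,1000)"

The question is whether anything equivalent can be attached to the query the MLT handler generates.)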
Re: Multiple Property Substitution
*bump* I'm also curious if something like this is possible. Being able to nest property substitution variables, especially when using multiple cores, would be a really slick feature.

Zach Friedland wrote
> Has anyone found a way to have multiple properties (override & default)?
> What I'd like to create is a default property with an override property
> that usually wouldn't be set, but would be set as a JVM parameter if I
> want to turn off replication on a particular index on a particular
> server. I tried this syntax but it didn't work:
>
>   <str name="enable">${Solr.enable.slave.core.override:${Solr.enable.slave.default:false}}</str>
>
> Thanks
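(For context, a single level of substitution with a literal default does work fine; it's only the nested form that doesn't. A minimal working sketch from a slave's solrconfig.xml, with the property name just as an example:

  <lst name="slave">
    <str name="enable">${Solr.enable.slave:true}</str>
  </lst>

started with java -DSolr.enable.slave=false ... to turn replication off on one box.)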
Re: omitTermFreq only?
iorixxx wrote
>> Thing is, having a custom Similarity and setting tf=1.0f will turn off
>> term frequencies globally, which is not what I need; I'd like to do it
>> per field.
>
> I think it is possible to use different similarities for different
> fields: https://issues.apache.org/jira/browse/SOLR-2338

Ahh... guess I'll have to wait for Solr 4.
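(In Solr 4 / SOLR-2338 terms, the per-field setup should look roughly like this in schema.xml -- a sketch only, and com.example.NoTfSimilarityFactory is a made-up name for the kind of custom Similarity discussed elsewhere in this thread:

  <!-- let fieldTypes declare their own similarity -->
  <similarity class="solr.SchemaSimilarityFactory"/>

  <fieldType name="text_no_tf" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
    <similarity class="com.example.NoTfSimilarityFactory"/>
  </fieldType>
)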
Re: omitTermFreq only?
I know I'm kind of reopening a closed thread, but I now have the same requirement to omit term frequencies only, while still keeping the ability to run phrase queries on a field. Thing is, having a custom Similarity and setting tf=1.0f will turn off term frequencies globally, which is not what I need; I'd like to do it per field.

For the sake of simplicity, I'm using the dismax parser with qf=name^10 description^5 body^1 and pf=name description body. I'd like to turn off tf for the /name/ field but leave it on for /description/ and /body/, while allowing all of them to keep positions so that phrase queries work. Unfortunately, setting omitTermFreqAndPositions="true" on the /name/ field also breaks phrase queries against /name/.

Are there any tricks to doing this? I've thought of a custom Similarity plus a copyField of name (/name_phrase/) that keeps termFreqAndPositions, using only that field in pf instead of /name/, but that won't really work either. I also tried omitTermFreqAndPositions="true" together with omitPositions="false", but that's an invalid combination.

Markus Jelsma-2 wrote
> A dirty hack is to return 1.0f for each tf > 0. Just a couple of lines
> of code for a custom similarity class.
>
>> Hello,
>>
>> I was wondering if there is a way we can omit only the Term Frequency
>> in Solr?
>>
>> omitTermFreqAndPositions=true wouldn't work for us since we need the
>> positions for supporting phrase queries.
>>
>> Thanks,
>> -Jibo
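(For the record, the "dirty hack" Markus describes really is only a few lines against the Lucene 3.x API. A sketch, with the package name as a placeholder; it gets wired in globally via <similarity class="com.example.FlatTfSimilarity"/> in schema.xml, which is exactly the global-only limitation described above:

  package com.example;

  import org.apache.lucene.search.DefaultSimilarity;

  // Clamps tf to 1 so term frequency stops influencing scores. Positions
  // are untouched, so phrase queries keep working -- but this applies to
  // every field in the index, not per field.
  public class FlatTfSimilarity extends DefaultSimilarity {
      @Override
      public float tf(float freq) {
          return freq > 0 ? 1.0f : 0.0f;
      }
  }
)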
ord/rord with a function
Is it possible for ord/rord to work with a function? I'm attempting to use rord with a spatial function as a bf, like the following:

  bf=rord(geodist())

If there's no way for this to work, is there a way to simulate the same behavior?

For some background, I have two sets of documents: one set applies to a location in NY and another in LA. I want to boost documents that are closer to where the user is searching from, but I only need the two sets to be ranked 1 and 2. In other words, the actual distance should not be used to boost the documents, just whether you are closer or farther. We may add more locations in the future, so I'd like to be able to rank the locations from closest to furthest. I need some way to rank the distances, and rord is the right idea, but it doesn't seem to work with functions.

I'm running Solr 3.4, btw.
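(For completeness: geodist() pulls its inputs from the sfield and pt params, so the full request I'm attempting looks roughly like this, where location and the coordinates are placeholders for my real field and the user's position:

  q=*:*&sfield=location&pt=34.0522,-118.2437&bf=rord(geodist())
)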
Re: edismax doesn't obey 'pf' parameter
I'm observing strange results, with both the correct and the incorrect behavior happening depending on which field I put in the 'pf' param. I wouldn't think this should be analyzer-specific, but is it? If I try:

  http://localhost:8080/solr/collection1/select?qt=%2Fsearch&q=mickey%20mouse&debugQuery=on&defType=edismax&pf=blah_exact&qf=blah

it looks correct:

  mickey mouse
  mickey mouse
  +((DisjunctionMaxQuery((blah:mickey)) DisjunctionMaxQuery((blah:mouse)))~2) DisjunctionMaxQuery((blah_exact:"mickey mouse"))
  +(((blah:mickey) (blah:mouse))~2) (blah_exact:"mickey mouse")

However, if I put in the field I actually want, the phrase portion of the query just drops off completely:

  http://localhost:8080/solr/collection1/select?qt=%2Fsearch&q=mickey%20mouse&debugQuery=on&defType=edismax&pf=name_exact&qf=name

Results:

  mickey mouse
  mickey mouse
  +((DisjunctionMaxQuery((name:mickey)) DisjunctionMaxQuery((name:mouse)))~2) ()
  +(((name:mickey) (name:mouse))~2) ()

The name_exact field's analyzer uses KeywordTokenizer, but again, I would think this query is being formed too early in the process for that to matter at this point.
edismax doesn't obey 'pf' parameter
If I switch back and forth between defType=dismax and defType=edismax, edismax doesn't seem to obey my pf parameter. I dug through the code a bit, and in ExtendedDismaxQParserPlugin (Solr 3.4/3.5), the part that is supposed to add the phrase query is here:

  Query phrase = pp.parse(userPhraseQuery.toString());

The code in the parse method tries to create a Query against a null field, and then the phrase does not get added to the mainQuery. Is this a known bug, or am I missing something in my configuration? My config is very simple:

  <requestHandler name="/search" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <str name="defType">edismax</str>
      <str name="qf">name</str>
      <str name="pf">name_exact^2</str>
      <str name="fl">id,name,image_url,url</str>
      <str name="q.alt">*:*</str>
    </lst>
  </requestHandler>
Re: Selective Result Grouping
Created an issue in JIRA for this feature: https://issues.apache.org/jira/browse/SOLR-2884

Martijn v Groningen-2 wrote:
> Ok I think I get this. I think this can be achieved if one could specify
> a filter inside a group, so that only documents that pass the filter get
> grouped. For example, only group documents with the value image in the
> mimetype field. This filter should be specified per group command. Maybe
> we should open an issue for this?
Re: Selective Result Grouping
Martijn v Groningen-2 wrote:
> When using the group.field option, values must be the same, otherwise
> they don't get grouped together. Maybe fuzzy grouping would be nice.
> Grouping videos and images based on mimetype should be easy, right?
> Videos have a mimetype that starts with video/ and images have a
> mimetype that starts with image/. Storing the mime type's subtype and
> type in separate fields and grouping on the type field would do the job.
> Of course you need to know the mimetype during indexing, but solutions
> like Apache Tika can do that for you.

I'm not necessarily interested in grouping by mimetype (that's an analysis issue); I simply used videos and images as an example. I'm also not sure what you mean by fuzzy grouping. My goal is to have collapsing be more selective about what gets grouped.

As a more specific example, I have a field called 'type' with the following possible values:

  type
  ----
  image
  video
  webpage

Basically, I want to be able to collapse all the images into a single result so that they don't fill up the first page of results. This is not possible with the current grouping implementation, because if you set group.field=type, it groups everything. I do not want to collapse videos or webpages, only images. I've attached a screenshot of Google's SRP to help explain what I mean:

http://lucene.472066.n3.nabble.com/file/n3471548/Screen_Shot_2011-11-01_at_11.52.04_AM.png

Hopefully that makes more sense. If it's still not clear, I can email you privately.
Relevance for MoreLikeThis
I'm using the MoreLikeThisHandler (http://wiki.apache.org/solr/MoreLikeThisHandler) to find similar documents. It doesn't immediately appear that there is any way to tweak the relevance of the similar results: by default, they are sorted by how *similar* they are to the original document. However, I'd like to apply a little more relevance tuning to those results, perhaps by boosting documents with higher ratings. Anyone know if this is possible? If it isn't, I may suggest adding an mlt.bf or mlt.bq parameter to allow it. Thoughts?
Re: Selective Result Grouping
Not necessarily collapse.type=adjacent; that only applies when two docs with the same field value appear next to each other. I'm more concerned with the case where we only want to group docs of a certain type (no matter where the subsequent docs may appear), leaving the rest of the documents ungrouped. The current grouping functionality using group.field is basically all-or-nothing: either all documents are grouped by the field value or none are. So there is no way to, for example, collapse just the videos or images like Google does. You're correct that it would be difficult to support this in a sharded environment, but like most other features, it could be available on a single shard first and then work toward supporting a sharded setup.
Selective Result Grouping
I'd like to suggest the ability to collapse results in a way more similar to the old SOLR-236 patch, which the current grouping functionality doesn't provide. I need the ability to collapse only certain results based on the value of a field, leaving all other results intact. As an example, consider the following documents:

  ID  TYPE
  1   doc
  2   image
  3   image
  4   doc

My desired behavior is to collapse results where TYPE:image, producing a result set like the following:

  1
  2  (collapsed, count=2)
  4

Currently, using the Result Grouping feature, I can only produce the result set below:

  1  (grouped, count=2)
  2  (grouped, count=2)

I'd like to propose repurposing the 'group.query' parameter to achieve this behavior. Currently, the group.query parameter behaves exactly like an 'fq' (at least in terms of the results that are produced); I have yet to come up with a scenario where group.query could not be accomplished using the other group params plus fq. A pair of requests illustrating the equivalence follows below.

I'm hoping to collect some thoughts on the subject before submitting a ticket to JIRA. Thoughts?
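(To make the fq equivalence concrete, a sketch: as far as I can tell, these two requests return the same documents today, just wrapped differently in the response:

  q=*:*&group=true&group.query=TYPE:image
  q=*:*&fq=TYPE:image

which is why I think group.query could be repurposed without losing anything.)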
Question Query Detection Strategies?
Hi All,

Has anyone tackled the challenge of question detection in search using Solr? A lot of my users don't do simple keyword searches, but rather ask questions as their queries. For example:

  what are the business hours?
  who is the ceo?
  what's the weather?
  more information about joe

Are there any strategies beyond adding 'who', 'what', 'when', and 'where' to the stopwords? If anyone has a link to a good writeup about handling this type of searching, that'd be awesome!
Re: Grouping / Collapse Query
I guess that's a possible solution, but I have two concerns: 1) it puts the burden of sorting on the client instead of on Solr, where it belongs; and 2) I'd need to request more results than I want to display in order to guarantee I can populate an entire page of results after compensating for the grouping.
Grouping / Collapse Query
I'm messing around with the field collapsing in 4.x (http://wiki.apache.org/solr/FieldCollapsing). Is it currently possible to group by a field with a certain value only, leaving all the others ungrouped, using the group.query param? It currently doesn't seem to work the way I want it to.

For example, I have documents that all have a "type" field, with possible values: picture, video, game, other. I want to group only the pictures and leave all other documents ungrouped. If I query something like:

  q=dogs&group=true&group.query=type:picture

I ONLY get pictures back. This seems to behave more like an 'fq'. What I want is a result set that looks like this:

  1. doc 1, type=video
  2. doc 2, type=game
  3. doc 3, type=picture, + 3 other pictures
  4. doc 4, type=video
  5. doc 5, type=video
  ...

I've also tried:

  q=dogs&group=true&group.query=type:picture&group.query=-type:video -type:game

but this doesn't work either, because the ordering of the groups doesn't reconstruct the correct overall order in which the results should be displayed.
Re: Fuzzy Query Param
I'm using Solr trunk. If it's Levenshtein/edit distance, that's great; that's what I want. It just didn't seem to be officially documented anywhere, so I wanted to find out for sure. Thanks for confirming.
CopyField into another CopyField?
In Solr, is it possible to 'chain' copyFields so that you can copy the value of one into another? A sketch of what I mean is below. The point being, every time I add a new field to the autocomplete field, I want it to automatically also be added to ac_spellcheck without having to declare the copy twice.
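(The chain I have in mind would look something like this, where name stands in for any real source field:

  <copyField source="name" dest="autocomplete"/>
  <!-- and then, ideally, the dest of one copyField feeding another: -->
  <copyField source="autocomplete" dest="ac_spellcheck"/>
)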
Fuzzy Query Param
According to the docs on Lucene query syntax: "Starting with Lucene 1.9 an additional (optional) parameter can specify the required similarity. The value is between 0 and 1; with a value closer to 1, only terms with a higher similarity will be matched."

I was messing around with this and started doing queries with values greater than 1, and it seemed to be doing something. However, I haven't been able to find any documentation on this. What happens when specifying a fuzzy query with a value > 1?

  "tiger"~2
  "animal"~3
Re: Analyzer creates PhraseQuery
Thanks guys. Both the PositionFilterFactory and the autoGeneratePhraseQueries=false solutions solved the issue.
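(For anyone who finds this thread later, here is roughly what the two fixes look like in schema.xml; the fieldType name is mine, the analyzer details are elided, and either change alone was enough:

  <!-- fix 1: stop the parser from auto-generating phrase queries -->
  <fieldType name="autocomplete" class="solr.TextField" autoGeneratePhraseQueries="false">
    <analyzer type="query">
      ...
      <!-- fix 2: flatten token positions at query time -->
      <filter class="solr.PositionFilterFactory"/>
    </analyzer>
  </fieldType>
)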
Analyzer creates PhraseQuery
I have an analyzer set up in my schema (the snippet didn't survive the archive). What's happening is that if I index a term like "toys and dolls" and then search for "to", I get no matches. The debug output in Solr gives me:

  to
  to
  PhraseQuery(autocomplete:"t o to")
  autocomplete:"t o to"

which means the Lucene query parser is turning it into a PhraseQuery for some reason. The explain output seems to confirm that this PhraseQuery is what's causing my document not to match:

  0.0 = (NON-MATCH) weight(autocomplete:"t o to" in 82), product of:
    1.0 = queryWeight(autocomplete:"t o to"), product of:
      6.684934 = idf(autocomplete: t=60 o=68 to=14)
      0.1495901 = queryNorm
    0.0 = fieldWeight(autocomplete:"t o to" in 82), product of:
      0.0 = tf(phraseFreq=0.0)
      6.684934 = idf(autocomplete: t=60 o=68 to=14)
      0.1875 = fieldNorm(field=autocomplete, doc=82)

But why? This seems like it should match, and indeed the Solr analysis tool highlights the matches (see image), so something isn't lining up right:

http://lucene.472066.n3.nabble.com/file/n3116288/Screen_shot_2011-06-27_at_7.55.49_PM.png

In case you're wondering, I'm trying to implement a semi-advanced autocomplete feature that goes beyond what a simple EdgeNGram analyzer could do.
Update JSON Invalid
I'm looking at the wiki article about updating the index with JSON, and the format doesn't seem well-formed to me: http://wiki.apache.org/solr/UpdateJSON

Technically, yes, it's valid JSON, but most libraries treat JSON objects as maps, and with multiple "add" elements as keys you cannot properly deserialize it. As an example, try putting this into jsonlint.com and notice that it trims off one of the docs:

  {
    "add": {"doc": {"id": "TestDoc1", "title": "test1"}},
    "add": {"doc": {"id": "TestDoc2", "title": "another test"}}
  }

Is there something I'm just not seeing? Should we consider cleaning up this format, possibly using some JSON arrays so that it makes more sense from a JSON perspective? A sketch of what I mean is below.
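(For instance, an array form along these lines would round-trip through any standard JSON parser without losing documents:

  [
    {"id": "TestDoc1", "title": "test1"},
    {"id": "TestDoc2", "title": "another test"}
  ]
)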
Re: SolrCore / Index Searcher Instances
Makes sense. However, one of the reasons I was asking is that we've configured Solr to use a RAMDirectory, and it appears to load the index into memory twice. I suspect the first load is for warming the firstSearcher and the second is for warming the newSearcher. It makes our JVM memory requirements more than 2x the index size, which for us is a lot, since the index is 8GB. I'm wondering why it either a) loads the index twice, or b) never seems to release the second RAMDirectory copy from memory.
SolrCore / Index Searcher Instances
This may seem like a stupid question, but why do we see two instances of SolrIndexSearcher on the info/stats pages? The reason I ask is that we've implemented SOLR-465 to try to serve our index from a RAMDirectory, but it appears that our index is being loaded into memory twice, as our JVM heap size requirement is more than 2x our index size on disk.

Does Solr actually create two instances of SolrCore / SolrIndexSearcher on startup? If one is used for warming, why isn't it destroyed when the warming is finished?

http://lucene.472066.n3.nabble.com/file/n1599373/Screen_shot_2010-09-28_at_4.40.19_PM.png
Re: Solr 1.4 - stats page slow
Apologies if this was already resolved, but we just deployed Solr 1.4.1, and the stats page takes over a minute to load for us as well. It also began causing OutOfMemory errors, so we've had to refrain from hitting the page. From what I gather, it is the fieldCache section that's causing it. Was there ever an official fix, or a recommendation on how to stop the stats page from computing the fieldCache entries? If we could just skip that part, I think we'd be good to go, since I find the page very useful otherwise.
Re: Plugin Performance Issues
Interesting... I guess I had logically assumed that type="index" meant the analyzer wasn't used at query time, but I see now why that's not possible.

Here's the thing, though: we had one field defined using this fieldType, and we deployed the new schema to Solr when we started seeing the issue. However, we had not yet released the code that would use the new field (obviously we have to make the change on the Solr end before the code, so we roll the two out asynchronously, offset by a few days). So the field of that fieldType wasn't even being queried against.

The problem would be pretty easy for us to reproduce, but I don't think our sysadmins would appreciate experimenting with our production Solr servers. We can really only reproduce it in our live environment, because that's the only environment getting regular traffic (100 qps), so I guess you could say it is traffic-related.

Some other notes: we have an index distributed across 3 shards, and we pick up snapshots from the master server about once per hour, so commits during snapinstalling might affect it, but the timeline of the memory growth doesn't really line up with those commits.

Anyway, I know it all seems like a mystery, and I apologize if I seem vague, but the issue really is that simple. Hopefully, if someone else ever experiences it, they can come up with a better explanation. Until then, we've decided to deploy our custom classes "the old way", by exploding the war and placing the jars inside it. Not nearly as convenient, but we haven't experienced any problems doing it this way (same code and config, btw; since the only difference is using the lib directory vs. not, that's most likely where the problem lies).

Thanks for your help.

hossman wrote
> : ...
> : only do indexing on the master server. However, with this schema in
> : place on the slaves, as well as our custom.jar in the solrHome/lib
> : directory, we run into these issues where the memory usage grows and
> : grows without explanation.
>
> ...even if you only do indexing on the master, having a single analyzer
> defined for a field means it's used at both index and query time (even
> though you say 'type="index"'), so a memory leak in either of your
> custom factories could cause a problem on a query box.
>
> This however concerns me...
>
> : fact, in a previous try, we had simply dropped one of our custom
> : plugin jars into the lib directory but forgot to deploy the new
> : solrconfig or schema files that referenced the classes in there, and
> : the issue still occurred.
>
> ...this i can't think of a rational explanation for. Can you elaborate
> on what you can do to create this problem .. ie: does the memory usage
> grow even when solr doesn't get any requests? or does it happen when
> searches are executed? or when commits happen? etc...
>
> If the problem is as easy to reproduce as you describe, can you please
> generate some heap dumps against a server that isn't processing any
> queries -- one from when the server first starts up, and one from when
> the server crashes from an OOM (there's a JVM option for generating
> heap dumps on OOM that i can't think of off the top of my head)
>
> -Hoss
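(For what it's worth, the JVM option Hoss couldn't recall is presumably -XX:+HeapDumpOnOutOfMemoryError, optionally with -XX:HeapDumpPath=/some/dir to control where the dump is written.)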
Re: Plugin Performance Issues
Here is where our custom class is referenced in the schema (a sketch of the fieldType is at the end of this message): we built our own field type, used at index time, that essentially acts as a sort of KeywordTokenizer, but removing stopwords.

We share one schema.xml between master and slave servers for convenience, but we only do indexing on the master server. However, with this schema in place on the slaves, as well as our custom.jar in the solrHome/lib directory, we run into these issues where the memory usage grows and grows without explanation. We've been through this before (earlier in this thread) with a custom spelling implementation, and we ran into the same problem. We gave up on that fix, and this was our very next attempt at deploying custom code using Solr's plugin capability. Unfortunately, we got the same results. In fact, in a previous try, we had simply dropped one of our custom plugin jars into the lib directory but forgot to deploy the new solrconfig or schema files that referenced the classes in it, and the issue still occurred.

Anyway, for now we've been able to work around it by packaging the solr.war with our custom jars in WEB-INF/lib. Although this is more proper anyway, it's not nearly as convenient as dropping jars into an external lib directory and letting Solr pick up our classes that way. I'm still curious whether this is unique to our environment or whether there's a bug in Solr's classloading for the plugin functionality.

Grant Ingersoll-6 wrote
> I would guess that your code is being used. I'm not sure what you mean
> by it "was only referenced in the schema". That implies usage to me. Is
> it a new field type? What is your plugin doing?
>
> Have you tried setting breakpoints at method entry points in your
> plugin and starting up Solr w/ a debugger attached?
>
> -Grant
>
> On Oct 28, 2009, at 4:54 PM, entdeveloper wrote:
>
>> This is an issue we experienced a while back. We once again tried to
>> load a custom class as a plugin jar from the lib directory and began
>> experiencing severe memory problems again. The code in our jar wasn't
>> being used at all... the class was only referenced in the schema. I
>> find it strange that no one else has experienced this, but we're not
>> doing anything particularly complex, which is still leading me to
>> believe that there is something strange going on with Solr's class
>> loading for this lib directory. Perhaps it is something specific to
>> our environment (specs below)?
>>
>> java version "1.6.0_05"
>> Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
>> Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)
>>
>> Tomcat 6.0.16
>>
>> Linux 2.6.9-35.ELsmp #1 SMP Thu Jun 1 14:31:29 PDT 2006 x86_64 x86_64
>> x86_64 GNU/Linux
>>
>> Max heap set to 1GB.
>>
>> With the jars in the plugin directory, RAM usage increases by 1.5 -
>> 2GB, increasing at about 200MB/hr.
>>
>> hossman wrote:
>>> : I'm not entirely convinced that it's related to our code, but it
>>> : could be. Just trying to get a sense if other plugins have had
>>> : similar problems, just by the nature of using Solr's resource
>>> : loading from the /lib directory.
>>>
>>> Plugins aren't something that every Solr user uses -- but enough
>>> people use them that if there was a fundamental memory leak just from
>>> loading plugin jars i'm guessing more people would be complaining.
>>>
>>> I use plugins in several solr instances, and i've never noticed any
>>> problems like you describe -- but i don't personally use tomcat.
>>>
>>> Otis is right on the money: you need to use profiling tools to really
>>> look at the heap and see what's taking up all that ram.
>>>
>>> Alternately: a quick way to rule out the special plugin class loader
>>> would be to embed your custom handler directly into the solr.war
>>> ("The Old Way" on the SolrPlugins wiki) ... if you still have
>>> problems, then the cause isn't the plugin classloader.
>>>
>>> -Hoss
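(Roughly, the fieldType in question looks like this; the class name is a placeholder for our real factory:

  <fieldType name="keyword_nostop" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="com.example.StopwordKeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>
)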
Re: Plugin Performance Issues
This is an issue we experienced a while back. We once again tried to load a custom class as a plugin jar from the lib directory and began experiencing severe memory problems again. The code in our jar wasn't being used at all... the class was only referenced in the schema. I find it strange that no one else has experienced this, but we're not doing anything particularly complex, which is still leading me to believe that there is something strange going on with Solr's class loading for this lib directory. Perhaps it is something specific to our environment (specs below)?

  java version "1.6.0_05"
  Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
  Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)

  Tomcat 6.0.16

  Linux 2.6.9-35.ELsmp #1 SMP Thu Jun 1 14:31:29 PDT 2006 x86_64 x86_64 x86_64 GNU/Linux

  Max heap set to 1GB.

With the jars in the plugin directory, RAM usage increases by 1.5 - 2GB, growing at about 200MB/hr.

hossman wrote:
> : I'm not entirely convinced that it's related to our code, but it
> : could be. Just trying to get a sense if other plugins have had
> : similar problems, just by the nature of using Solr's resource loading
> : from the /lib directory.
>
> Plugins aren't something that every Solr user uses -- but enough people
> use them that if there was a fundamental memory leak just from loading
> plugin jars i'm guessing more people would be complaining.
>
> I use plugins in several solr instances, and i've never noticed any
> problems like you describe -- but i don't personally use tomcat.
>
> Otis is right on the money: you need to use profiling tools to really
> look at the heap and see what's taking up all that ram.
>
> Alternately: a quick way to rule out the special plugin class loader
> would be to embed your custom handler directly into the solr.war ("The
> Old Way" on the SolrPlugins wiki) ... if you still have problems, then
> the cause isn't the plugin classloader.
>
> -Hoss
Lucene versions in upcoming solr release
It's my understanding that Solr 1.4, which is to be released any day now, will be based on Lucene 2.9. It's also my understanding that Lucene 3.0 will be released very shortly as well. Is there a plan to update Solr to use Lucene 3.0 shortly after that? I'm just trying to decide whether we should wait, if there are going to be two releases very close to one another.