Re: DIH: alternative approach to deltaQuery
Hi, ok since it didn't seem like there was interest in documenting this approach on the wiki, I have simply documented it on my blog: http://pooteeweet.org/blog/1827 regards, Lukas Kahwe Smith m...@pooteeweet.org
Re: DIH: alternative approach to deltaQuery
Thank you very much Shawn. Paul On Fri, Sep 17, 2010 at 12:11 PM, Shawn Heisey elyog...@elyograg.org wrote: On 9/17/2010 3:01 AM, Paul Dhaliwal wrote: Another feature missing in DIH is the ability to pass parameters into your queries. If one could pass a named or positional parameter for an entity query, it would give them a lot of freedom to optimize their delta or full load queries. One can even get creative with entity and delta queries that take ranges and pass timestamps that depend on external sources. Paul, If I understand what you are saying, this ability already exists. I am using it with Solr 1.4.1. I sent some detailed information on how to do it to the list early last month: http://www.mail-archive.com/solr-user@lucene.apache.org/msg40466.html Shawn
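For what it's worth, the mechanism Shawn's linked post describes is (as far as I recall) DIH's request-scoped variables: any parameter appended to the /dataimport URL becomes available in queries as ${dataimporter.request.paramName}. A hedged sketch of a data-config.xml fragment; the table, column, and parameter names here are made up for illustration:

```xml
<!-- data-config.xml fragment; "item", "updated_at" and "last_ts" are example names -->
<entity name="item"
        query="SELECT id, name FROM item
               WHERE updated_at &gt; '${dataimporter.request.last_ts}'"/>
```

Invoked as e.g. /dataimport?command=full-import&amp;clean=false&amp;last_ts=2010-09-01, so an external scheduler can decide the cutoff timestamp instead of relying on deltaQuery.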
Help: java.lang.OutOfMemoryError: PermGen space
For the second time we had the error java.lang.OutOfMemoryError: PermGen space and Solr stopped responding. We use the default Jetty installation with jdk1.6.0_21. After the last time I tried to set the garbage collector right; these are my settings: -D64 -server -Xms892m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:-HeapDumpOnOutOfMemoryError -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled As far as I understood, -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled should also clean up the PermGen space. What can we do? At the moment Solr is never stopped and runs all the time. Maybe we should do a regular (daily) restart; then the problem should be fixed. But how can we adjust the garbage collector settings so that the PermGen space does not run out... markus
Re: Help: java.lang.OutOfMemoryError: PermGen space
See http://stackoverflow.com/questions/88235/how-to-deal-with-java-lang-outofmemoryerror-permgen-space-error and the links there. There seems to be no good solution :-/ The only reliable solution is to restart before you run out of PermGen space (use jvisualvm to monitor it). And try to increase -XX:MaxPermSize to make the restart interval longer; using JRebel or something like that should probably help too. Regards, Peter. the second time we had the error java.lang.OutOfMemoryError: PermGen space and solr stopped responding. we use the default jetty installation with jdk1.6.0_21. after the last time i tried to set the garbage collector right these are my settings: -D64 -server -Xms892m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:-HeapDumpOnOutOfMemoryError -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled as far as i thought, -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled should also cleanup the PermGen space. what can we do? ok, at the moment solr is not stopped and is running all time. maybe we should do a regular (daily) restart, then the problem should be fixed. but how can we adjust the garbage settings, so that the PermGen space is not running out of space... markus -- http://jetwick.com twitter search prototype
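Putting Peter's suggestion together with the flags already in use, a launch command along these lines is the usual starting point. This is a sketch, not a tested configuration: -XX:MaxPermSize defaults to 64m on HotSpot, and the 256m below is a guess you would tune after watching PermGen usage in jvisualvm:

```shell
# Example Jetty launch combining the existing flags with an explicit PermGen cap.
# 256m is an assumed value, not a recommendation for this particular install.
java -D64 -server -Xms892m -Xmx2048m \
     -XX:MaxPermSize=256m \
     -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
     -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled \
     -jar start.jar
```

Note that CMS class unloading only helps if classes actually become unreachable; if something (e.g. a leaked classloader on redeploy) pins them, only a larger cap and periodic restarts buy time.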
Solr UIMA integration
Hi all, I am working on integrating Apache UIMA as an UpdateRequestProcessor for Apache Solr and I am now at the first working snapshot. I put the code on Google Code [1] and you can take a look at the tutorial [2]. I would be glad to donate it to the Apache Solr project, as I think it could be a useful module to trigger automatic content extraction while indexing documents. At the moment the UIMAUpdateRequestProcessor base implementation can automatically extract a document's sentences, language, keywords, concepts and named entities using Apache UIMA's HMMTagger, OpenCalaisAnnotator and AlchemyAPIAnnotator components (but it can be easily expanded). Any feedback is welcome. Have a nice day. Tommaso [1] : http://code.google.com/p/solr-uima/ [2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial
Restrict possible results based on relational information
Hi List, this is my first message on this list, so if there's something missing/incorrect, please let me know :) The current problem, described in short words followed by a short example, is the following: users can send private messages, and the selection of recipients is done via auto-complete. Therefore we need to restrict the possible results based on the user's confirmed contacts - but I have absolutely no idea how to do that :/ Add all confirmed contacts to the index and use it like a type of relation? Pass the list of confirmed contacts together with the query? Let's say we have John Doe, who creates a new message. Typing doe should suggest Jane Doe, Thomas Doe - but not Another Doe, who is also a user, but none of his confirmed contacts. Maybe we also get John Doe as a possible match, but that should be okay in the first place - if we could exclude the user himself as well, that's of course better. Every user record has an id and additional fields for firstname and lastname. Confirmed contacts are, simply explained, records with fields from:user-id to:user-id, with no additional information about the type of relationship or anything. But none of this relationship information is currently submitted to the Solr index. If you need more information to answer this not-very-concrete question (and I'm sure I've missed some relevant info), just ask, please :) Regards Stefan
Re: Restrict possible results based on relational information
hi Stefan users can send privates messages, the selection of recipients is done via auto-complete. therefore we need to restrict the possible results based on the users confirmed contacts - but i have absolutely no idea how to do that :/ Add all confirmed contacts to the index, and use it like a type of relation? pass the list of confirmed contacts together with the query? This does not sound like a search query because: 1. you know the user 2. you know his/her list of confirmed contacts If both statements are true, the list of confirmed contacts should be accessible via a JSON-URL call so that you can load it into an autocomplete dropdown. Solr need not be involved in this case (but you can of course store the list of confirmed contacts in a multivalued field per user if you need it for other searches or faceting). Cheers, Chantal
Solr Analyzer results before the actual query.
Hi to all the forum from a new subscriber, I’m working on the server-side search solution of the company I’m currently employed with. I have a problem at the moment: when I submit a search to Solr I want to see the “Analyzer results” of the search terms (query), with all the filters applied to them as defined in types.xml. I want the analyzer result displayed BEFORE the actual search is performed, so I can decide at that point whether to run the proper search or leave the user with no results. The problem is more or less described in this issue: https://issues.apache.org/jira/browse/SOLR-261. In summary: is it possible to get the analyzer results (in code) before running the actual Solr search? I'm quite new to Solr, so maybe this issue has already been discussed in another thread, but I'm unable to find it at the moment. If anybody has any clue on how to do that, any suggestion will be more than welcome. Thanks very much in advance for your answer. Best wishes. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Analyzer-results-before-the-actual-query-tp1528692p1528692.html Sent from the Solr - User mailing list archive at Nabble.com.
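One way to get analysis results without searching is Solr's field analysis request handler, which (in Solr 1.4's example solrconfig.xml) is mapped at /analysis/field and runs a value through a field type's analyzer chain. A sketch of building such a request; the base URL and the field type name "text" are assumptions about your setup:

```python
from urllib.parse import urlencode

def analysis_url(base_url, field_type, value):
    """Build a request to Solr's field analysis handler, which returns the
    token stream produced by a field type's analyzer chain, without running
    an actual search."""
    params = {
        "analysis.fieldtype": field_type,   # type name from schema.xml
        "analysis.fieldvalue": value,       # the text to analyze
        "wt": "json",
    }
    return base_url.rstrip("/") + "/analysis/field?" + urlencode(params)

# Hypothetical local instance and field type:
url = analysis_url("http://localhost:8983/solr", "text", "Running QUICKLY")
```

Your application can fetch that URL first, inspect the resulting tokens, and only then decide whether to issue the real query.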
NGram and word boundaries?
I've got a question regarding NGramFilterFactory. It seems to work very well, but I've had trouble getting it to work with other filters. Specifically, if I have an index analyzer that uses a StandardTokenizerFactory to tokenize and follows it up with an NGramFilterFactory, it does a fine job of handling ngrams, but it doesn't respect word boundaries: queries will match across whitespace. Using a modified example of the monitor.xml file for the example, if I have a field containing the text Dell Widescreen UltraSharp 3007WFP, and I provide the search query en U, it will match. I'd like to have the NGramFilterFactory match only _within_ words: how can I go about doing that? I'd like to avoid having to manually pre-process the query. I can provide detailed schema and examples if they'd help.. thanks! -harry
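To make the desired behavior concrete: since the tokenizer splits on whitespace first and the ngram filter then operates per token, grams should never span a word boundary. Here's a toy Python model of that tokenize-then-ngram pipeline (a sketch of the concept, not Solr's actual code); if your queries still match across whitespace, the mismatch is usually on the query side, e.g. multiple gram tokens being combined with the default OR operator:

```python
def word_ngrams(text, min_gram=2, max_gram=3):
    """Generate character n-grams within each whitespace-separated token,
    mimicking tokenize-first-then-ngram: grams never cross word boundaries."""
    grams = set()
    for token in text.lower().split():
        for n in range(min_gram, max_gram + 1):
            for i in range(len(token) - n + 1):
                grams.add(token[i:i + n])
    return grams

grams = word_ngrams("Dell Widescreen UltraSharp 3007WFP")
# "en" occurs inside "widescreen"; the cross-word fragment "n u" never appears.
```

Checking whether the query "en U" analyzes to one gram or several (e.g. via the analysis page) would tell you which side is producing the cross-word match.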
Re: Solr for statistical data
On Thu, Sep 16, 2010 at 11:48 AM, Peter Karich peat...@yahoo.de wrote: Hi Kjetil, is this custom component (which performs group by + calcs stats) somewhere available? I would like to do something similar. Would you mind sharing it if it isn't already available? The grouping stuff sounds similar to https://issues.apache.org/jira/browse/SOLR-236 where you can have mem problems too ;-) or see: https://issues.apache.org/jira/browse/SOLR-1682 Thanks for the links! These patches seem to provide somewhat similar functionality, I'll investigate if they're implemented in a similar way too. We've developed this component for a client, so while I'd like to share it I can't make any promises. Sorry. Any tips or similar experiences? you want to decrease memory usage? Yes. Specifically, I would like to keep the heap at 4 GB. Unfortunately I'm still seeing some OutOfMemoryErrors so I might have to up the heap size again. I guess what I'm really wondering is if there's a way to keep memory use down, while at the same time not sacrificing the performance of our queries. The queries have to run through all values for a field in order to calculate the sum, so it's not enough to just cache a few values. The code which fetches values from the index uses FieldCache.DEFAULT.getStringIndex for a field, and then indexes like this:

FieldType fieldType = searcher.getSchema().getFieldType(fieldName);
fieldType.indexedToReadable(stringIndex.lookup[stringIndex.order[documentId]]);

Is there a better way to do this? Thanks. ---Kjetil
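For readers unfamiliar with the structure being discussed: a FieldCache StringIndex holds each distinct term once in a lookup array, plus a per-document array of indices into it. A toy Python model (illustrative only, with made-up data) showing why the memory cost scales with document count plus distinct-term count, which is what makes high-cardinality fields expensive to sum over:

```python
# Toy model of Lucene's FieldCache StringIndex:
#   lookup[k]  - the k-th distinct indexed term (slot 0 conventionally = no value)
#   order[doc] - index into `lookup` for that document's term
lookup = ["", "2009", "2010"]   # distinct terms, stored once
order = [1, 2, 2, 1, 2]         # one small int per document

def value_for_doc(doc_id):
    """Equivalent of stringIndex.lookup[stringIndex.order[documentId]]."""
    return lookup[order[doc_id]]

values = [value_for_doc(d) for d in range(len(order))]
```

The per-document cost is one int either way; the variable part is the distinct-term table, so reducing the number of unique values (e.g. coarser buckets) is typically what shrinks the cache.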
Re: Searching solr with a two word query
Here is my raw query: q=opening+excellent+AND+presentation_id%3A294+AND+type%3Ablobversion=1.3json.nl=maprows=10start=0wt=xmlhl=truehl.fl=texthl.simple.pre=span+class%3Dhlhl.simple.post=%2Fspanhl.fragsize=0hl.mergeContiguous=falsedebugQuery=on and here is what I get on the debugQuery: lst name=debug − str name=rawquerystring opening excellent AND presentation_id:294 AND type:blob /str − str name=querystring opening excellent AND presentation_id:294 AND type:blob /str − str name=parsedquery all_text:open +all_text:excel +presentation_id:294 +type:blob /str − str name=parsedquery_toString all_text:open +all_text:excel +presentation_id:#0;Ħ +type:blob /str − lst name=explain − str name=1435675blob 3.1143723 = (MATCH) sum of: 0.46052343 = (MATCH) weight(all_text:open in 4457), product of: 0.5531408 = queryWeight(all_text:open), product of: 5.3283896 = idf(docFreq=162, maxDocs=12359) 0.10381013 = queryNorm 0.8325609 = (MATCH) fieldWeight(all_text:open in 4457), product of: 1.0 = tf(termFreq(all_text:open)=1) 5.3283896 = idf(docFreq=162, maxDocs=12359) 0.15625 = fieldNorm(field=all_text, doc=4457) 0.74662465 = (MATCH) weight(all_text:excel in 4457), product of: 0.7043054 = queryWeight(all_text:excel), product of: 6.7845535 = idf(docFreq=37, maxDocs=12359) 0.10381013 = queryNorm 1.0600865 = (MATCH) fieldWeight(all_text:excel in 4457), product of: 1.0 = tf(termFreq(all_text:excel)=1) 6.7845535 = idf(docFreq=37, maxDocs=12359) 0.15625 = fieldNorm(field=all_text, doc=4457) 1.7987071 = (MATCH) weight(presentation_id:#0;Ħ in 4457), product of: 0.43211576 = queryWeight(presentation_id:#0;Ħ), product of: 4.1625586 = idf(docFreq=522, maxDocs=12359) 0.10381013 = queryNorm 4.1625586 = (MATCH) fieldWeight(presentation_id:#0;Ħ in 4457), product of: 1.0 = tf(termFreq(presentation_id:#0;Ħ)=1) 4.1625586 = idf(docFreq=522, maxDocs=12359) 1.0 = fieldNorm(field=presentation_id, doc=4457) 0.108517066 = (MATCH) weight(type:blob in 4457), product of: 0.10613751 = queryWeight(type:blob), 
product of: 1.0224196 = idf(docFreq=12084, maxDocs=12359) 0.10381013 = queryNorm 1.0224196 = (MATCH) fieldWeight(type:blob in 4457), product of: 1.0 = tf(termFreq(type:blob)=1) 1.0224196 = idf(docFreq=12084, maxDocs=12359) 1.0 = fieldNorm(field=type, doc=4457) /str − str name=1436129blob 2.06395 = (MATCH) product of: 2.7519336 = (MATCH) sum of: 0.84470934 = (MATCH) weight(all_text:excel in 4911), product of: 0.7043054 = queryWeight(all_text:excel), product of: 6.7845535 = idf(docFreq=37, maxDocs=12359) 0.10381013 = queryNorm 1.199351 = (MATCH) fieldWeight(all_text:excel in 4911), product of: 1.4142135 = tf(termFreq(all_text:excel)=2) 6.7845535 = idf(docFreq=37, maxDocs=12359) 0.125 = fieldNorm(field=all_text, doc=4911) 1.7987071 = (MATCH) weight(presentation_id:#0;Ħ in 4911), product of: 0.43211576 = queryWeight(presentation_id:#0;Ħ), product of: 4.1625586 = idf(docFreq=522, maxDocs=12359) 0.10381013 = queryNorm 4.1625586 = (MATCH) fieldWeight(presentation_id:#0;Ħ in 4911), product of: 1.0 = tf(termFreq(presentation_id:#0;Ħ)=1) 4.1625586 = idf(docFreq=522, maxDocs=12359) 1.0 = fieldNorm(field=presentation_id, doc=4911) 0.108517066 = (MATCH) weight(type:blob in 4911), product of: 0.10613751 = queryWeight(type:blob), product of: 1.0224196 = idf(docFreq=12084, maxDocs=12359) 0.10381013 = queryNorm 1.0224196 = (MATCH) fieldWeight(type:blob in 4911), product of: 1.0 = tf(termFreq(type:blob)=1) 1.0224196 = idf(docFreq=12084, maxDocs=12359) 1.0 = fieldNorm(field=type, doc=4911) 0.75 = coord(3/4) /str − str name=1435686blob 1.9903867 = (MATCH) product of: 2.653849 = (MATCH) sum of: 0.74662465 = (MATCH) weight(all_text:excel in 4468), product of: 0.7043054 = queryWeight(all_text:excel), product of: 6.7845535 = idf(docFreq=37, maxDocs=12359) 0.10381013 = queryNorm 1.0600865 = (MATCH) fieldWeight(all_text:excel in 4468), product of: 1.0 = tf(termFreq(all_text:excel)=1) 6.7845535 = idf(docFreq=37, maxDocs=12359) 0.15625 = fieldNorm(field=all_text, doc=4468) 1.7987071 = (MATCH) 
weight(presentation_id:#0;Ħ in 4468), product of: 0.43211576 = queryWeight(presentation_id:#0;Ħ), product of: 4.1625586 = idf(docFreq=522, maxDocs=12359) 0.10381013 = queryNorm 4.1625586 = (MATCH) fieldWeight(presentation_id:#0;Ħ in 4468), product of: 1.0 = tf(termFreq(presentation_id:#0;Ħ)=1) 4.1625586 = idf(docFreq=522, maxDocs=12359) 1.0 = fieldNorm(field=presentation_id, doc=4468) 0.108517066 = (MATCH) weight(type:blob in 4468), product of: 0.10613751 = queryWeight(type:blob), product of: 1.0224196 =
SolrCloud new....
Hi all, I have 4 instances of Solr on 4 systems; each system has a single instance of Solr. I want the results from all these servers. I came to know about SolrCloud, read about it, and worked through the example, and it was working as given in the wiki. I am using Solr 1.4 and Apache Tomcat. In order to implement cloud from the Solr trunk, what procedure should be followed? 1) Should I copy the libraries from cloud to trunk? 2) Should I keep the cloud module on every system? 3) I am not using any cores in Solr; it is a single Solr on every system. Can SolrCloud support that? 4) The example is given with Jetty. Is it done the same way in Tomcat? Regards, satya
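Worth noting before reaching for SolrCloud: stock Solr 1.4 already supports distributed search across independent instances via the shards request parameter, with no extra modules. A sketch of building such a request; the host names are hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical host list: four standalone Solr 1.4 instances. The instance
# receiving this request fans the query out to every shard listed and merges
# the results.
shards = ",".join("host%d:8080/solr" % i for i in range(1, 5))
params = urlencode({"q": "*:*", "shards": shards})
query = "/select?" + params
```

Each document must live in exactly one shard and unique keys must not overlap; if that holds, this gives merged results across all four systems today, whether the servlet container is Jetty or Tomcat.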
Re: Solr for statistical data
I don't know if this thread might help with your problems any, but it might give some pointers: http://lucene.472066.n3.nabble.com/Tuning-Solr-caches-with-high-commit-rates-NRT-td1461275.html --Thomas On Mon, Sep 20, 2010 at 7:58 AM, Kjetil Ødegaard kjetil.odega...@gmail.com wrote: On Thu, Sep 16, 2010 at 11:48 AM, Peter Karich peat...@yahoo.de wrote: Hi Kjetil, is this custom component (which performs group by + calcs stats) somewhere available? I would like to do something similar. Would you mind sharing it if it isn't already available? The grouping stuff sounds similar to https://issues.apache.org/jira/browse/SOLR-236 where you can have mem problems too ;-) or see: https://issues.apache.org/jira/browse/SOLR-1682 Thanks for the links! These patches seem to provide somewhat similar functionality, I'll investigate if they're implemented in a similar way too. We've developed this component for a client, so while I'd like to share it I can't make any promises. Sorry. Any tips or similar experiences? you want to decrease memory usage? Yes. Specifically, I would like to keep the heap at 4 GB. Unfortunately I'm still seeing some OutOfMemoryErrors so I might have to up the heap size again. I guess what I'm really wondering is if there's a way to keep memory use down, while at the same time not sacrificing the performance of our queries. The queries have to run through all values for a field in order to calculate the sum, so it's not enough to just cache a few values. The code which fetches values from the index uses FieldCache.DEFAULT.getStringIndex for a field, and then indexes like this: FieldType fieldType = searcher.getSchema().getFieldType(fieldName); fieldType.indexedToReadable(stringIndex.lookup[stringIndex.order[documentId]]); Is there a better way to do this? Thanks. ---Kjetil
Re: Calculating distances in Solr using longitude latitude
Hi Dennis, Good suggestion, but I see that most of that is Solr 4.0 functionality, which has not been released yet. How can I still use the longitude latitude functionality (LatLonType)? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Calculating-distances-in-Solr-using-longitude-latitude-tp1524297p1529097.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searching solr with a two word query
Here's an excellent description of the Lucene query operators and how they differ from strict boolean logic: http://www.gossamer-threads.com/lists/lucene/java-user/47928 But the short form (and boy, doesn't the fact that the URL escapes spaces as '+', which is also a Lucene operator, make looking at these interesting) is that the first term is essentially a SHOULD clause in a Lucene BooleanQuery and is matching your docs all by itself. HTH Erick On Mon, Sep 20, 2010 at 8:58 AM, n...@frameweld.com wrote: Here is my raw query: q=opening+excellent+AND+presentation_id%3A294+AND+type%3Ablobversion=1.3json.nl=maprows=10start=0wt=xmlhl=truehl.fl=texthl.simple.pre=span+class%3Dhlhl.simple.post=%2Fspanhl.fragsize=0hl.mergeContiguous=falsedebugQuery=on and here is what I get on the debugQuery: [debugQuery output quoted in the previous message snipped]
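Erick's point about the first term being an optional SHOULD clause suggests the usual fix: group and require the free-text terms explicitly rather than relying on the default operator. A sketch of assembling such a query string (the term list and field values are just the ones from this thread):

```python
from urllib.parse import quote_plus

# With the default OR operator, "opening excellent AND ..." leaves the first
# word optional. Requiring both words explicitly avoids relying on q.op:
terms = ["opening", "excellent"]
required = " AND ".join(terms)
q = "(%s) AND presentation_id:294 AND type:blob" % required
raw = "q=" + quote_plus(q)   # URL-encoded form for the request
```

If "either word, but rank docs with both higher" is the goal instead, keeping OR between the words but wrapping only the filters in required clauses, e.g. (opening excellent) AND presentation_id:294 AND type:blob, is the other common shape.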
Re: Solr UIMA integration
Hi Tommaso, Really cool what you've done. Looking forward to testing it, and I'm sure it's a welcome contribution to Solr. You can easily contribute your code by opening a JIRA issue and attaching a patch file. BTW have you considered making the output field names configurable on a per-instance basis? It could be done as follows:

processor class=org.apache.solr.uima.processor.UIMAProcessorFactory
  str name=concept_fieldconcept/str
  str name=language_fieldlanguage/str
  str name=keyword_fieldkeyword/str
  ...
/processor

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 20. sep. 2010, at 12.35, Tommaso Teofili wrote: Hi all, I am working on integrating Apache UIMA as an UpdateRequestProcessor for Apache Solr and I am now at the first working snapshot. I put the code on GoogleCode [1] and you can take a look at the tutorial [2]. I would be glad to donate it to the Apache Solr project, as I think it could be a useful module to trigger automatic content extraction while indexing documents. At the moment the UIMAUpdateRequestProcessor base implementation can automatically extract document's sentences, language, keywords, concepts and named entities using Apache UIMA's HMMTagger, OpenCalaisAnnotator and AlchemyAPIAnnotator components (but it can be easily expanded). Any feedback is welcome. Have a nice day. Tommaso [1] : http://code.google.com/p/solr-uima/ [2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial
Re: Restrict possible results based on relational information
Hi, You could simply create an autocomplete Solr core with a simple schema consisting of id, from, to. Let the fieldType of from be string, and in the fieldType of to you can use StandardTokenizer, WordDelimiterFilter and EdgeNGramFilter.

add
  doc
    field name=idjohn@mycompany.com-jane.doe@mycompany.com/field
    field name=fromjohn@mycompany.com/field
    field name=toJane Doe (jane@mycompany.com)/field
  /doc
  doc
    field name=idjohn@mycompany.com-thomas.doe@mycompany.com/field
    field name=fromjohn@mycompany.com/field
    field name=toThomas Doe (thomas@mycompany.com)/field
  /doc
  doc
    field name=idpeter@mycompany.com-another.doe@mycompany.com/field
    field name=frompeter@mycompany.com/field
    field name=toAnother Doe (another@mycompany.com)/field
  /doc
/add

Now, if your autocomplete query is like this: wt=jsonfl=tofq=from:john@mycompany.comq={!q.op=AND df=to}do your response will be a list of valid recipients where the from field is the current user. By using EdgeNGramFilter in the to field, you get the effect of an automatic wildcard search, since John Doe will be indexed as (conceptually) J Jo Joh John D Do Doe -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 20. sep. 2010, at 12.36, Stefan Matheis wrote: Hi List, this is my first message on this list, so if there's something missing/incorrect, please let me know :) the current problem, described in short words followed by an short example, is the following one: users can send privates messages, the selection of recipients is done via auto-complete. therefore we need to restrict the possible results based on the users confirmed contacts - but i have absolutely no idea how to do that :/ Add all confirmed contacts to the index, and use it like a type of relation? pass the list of confirmed contacts together with the query? let's say we have John Doe which creates a new message. typing doe should suggest Jane Doe, Thomas Doe - but not Another Doe, which is also a user, but none of his confirmed Contacts. 
Maybe we get also John Doe as possible match, but that should be okay in the first place - if we could exclude the user himself also, that's of course better. every user-record has an id, additional fields for firstname and lastname. confirmed contacts are simply explained records with field from:user-id to:user-id, actually with no additional information about type of relationship or something. but nothing of this relationship-information is currently submitted to the solr-index. if you need more information to answer this not-very-concrete question (and i'm sure, i've missed some relevant info) just ask, please :) Regards Stefan
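The request in Jan's suggestion can be assembled like this. Note the restriction on the from field is a filter query (fq); the base URL and parameter values are the hypothetical ones from the example:

```python
from urllib.parse import urlencode

def suggest_url(base, current_user, prefix):
    """Autocomplete request against the suggested schema: restrict candidate
    docs to the current user's contacts with an fq filter, and match the
    typed prefix against the edge-ngrammed `to` field."""
    params = {
        "q": "{!q.op=AND df=to}%s" % prefix,   # local params: AND ops, default field `to`
        "fq": "from:%s" % current_user,        # only this user's confirmed contacts
        "fl": "to",
        "wt": "json",
    }
    return base.rstrip("/") + "/select?" + urlencode(params)

url = suggest_url("http://localhost:8983/solr", "john@mycompany.com", "do")
```

Excluding the user himself (Stefan's secondary wish) would just be one more filter, e.g. an additional fq of -to_id:&lt;current-user-id&gt; if the recipient's id is stored on the doc.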
Re: Solr UIMA integration
Looks like a great scraping engine technology :-) Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Mon, 9/20/10, Tommaso Teofili tommaso.teof...@gmail.com wrote: From: Tommaso Teofili tommaso.teof...@gmail.com Subject: Solr UIMA integration To: solr-user@lucene.apache.org Date: Monday, September 20, 2010, 3:35 AM Hi all, I am working on integrating Apache UIMA as un UpdateRequestProcessor for Apache Solr and I am now at the first working snapshot. I put the code on GoogleCode [1] and you can take a look at the tutorial [2]. I would be glad to donate it to the Apache Solr project, as I think it could be a useful module to trigger automatic content extraction while indexing documents. At the moment the UIMAUpdateRequestProcessor base implementation can automatically extract document's sentences, language, keywords, concepts and named entities using Apache UIMA's HMMTagger, OpenCalaisAnnotator and AlchemyAPIAnnotator components (but it can be easily expanded). Any feedback is welcome. Have a nice day. Tommaso [1] : http://code.google.com/p/solr-uima/ [2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial
Re: Calculating distances in Solr using longitude latitude
Hmmm, I am about to put an engineer on our search engine requirements with the assumption that latitude/longitude is available in the current release of Solr (not knowing what that is). I have been handing the whole Solr thing off to him, except enough info for me to understand and interface with his work. So I don't have that answer. Can someone else answer him? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Mon, 9/20/10, PeterKerk vettepa...@hotmail.com wrote: From: PeterKerk vettepa...@hotmail.com Subject: Re: Calculating distances in Solr using longitude latitude To: solr-user@lucene.apache.org Date: Monday, September 20, 2010, 6:53 AM Hi Dennis, Good suggestion, but I see that most of that is Solr 4.0 functionality, which has not been released yet. How can I still use the longitude latitude functionality (LatLonType)? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Calculating-distances-in-Solr-using-longitude-latitude-tp1524297p1529097.html Sent from the Solr - User mailing list archive at Nabble.com.
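One common workaround before the trunk spatial support: store lat/lon as plain numeric fields, filter with a bounding-box range query in Solr, and compute the exact great-circle distance client-side. A haversine sketch (mean earth radius 6371 km is an assumption; any radius convention works as long as it's consistent):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees,
    using the haversine formula with mean earth radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

d = haversine_km(0.0, 0.0, 0.0, 1.0)   # one degree of longitude at the equator
```

The bounding box over-selects slightly near its corners, so the client-side distance check is what enforces the true radius.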
Re: Solr starting problem
Are you trying to implement custom code or is this a stock release? Because if you're just trying to move a stock release over, it'd be much simpler to unpack the distribution on the Linux machine and go. It might be worth doing anyway, just to compare the differences and see what's causing your problem. But it looks like your problem is in your Jetty configuration; I'm really guessing that you can't start your Jetty servlet container at all. HTH Erick On Mon, Sep 20, 2010 at 11:19 AM, Yavuz Selim YILMAZ yvzslmyilm...@gmail.com wrote: I use solr in windows without any problem, I 'm trying to run solr in linux, ( copy all files from windows to linux ), but I'm given exceptions when I try to start solr (java -jar start.jar) java.lang.ClassNotFoundException: org.mortbay.xml.xmlConfiguration at java.net.URLClassLoader.findClass(URLClassLoader.java:378) at java.lang.ClassLoader.loadClass(ClassLoader.java:570) at java.lang.ClassLoader.loadClass(ClassLoader.java:502) at jorg.mortbay.start.Main.start(Main.java:534) at jorg.mortbay.start.Main.start(Main.java:441) at jorg.mortbay.start.Main.Main(Main.java:119) I controlled all jar files, problem looks related with jetty, but I can't find any solution. Any ideas? Thnx. -- Yavuz Selim YILMAZ
Re: Searching solr with a two word query
I'm missing what you really want out of your query; your phrase "either word as a single result" just isn't connecting in my grey matter. Could you give some example inputs and outputs that demonstrate what you want? Best Erick On Mon, Sep 20, 2010 at 11:41 AM, n...@frameweld.com wrote: I noticed that my defaultOperator is OR, and that does have an effect on what comes up. If I were to change that to AND, it's an exact match to my query, but I would like similar matches with either word as a single result. Is there another value I can use? Or maybe I should use another query parser? Thanks. - Noel -Original Message- From: Erick Erickson erickerick...@gmail.com Sent: Monday, September 20, 2010 10:05am To: solr-user@lucene.apache.org Subject: Re: Searching solr with a two word query Here's an excellent description of the Lucene query operators and how they differ from strict boolean logic: http://www.gossamer-threads.com/lists/lucene/java-user/47928 But the short form (and boy, doesn't the fact that the URL escapes spaces as '+', which is also a Lucene operator, make looking at these interesting) is that the first term is essentially a SHOULD clause in a Lucene BooleanQuery and is matching your docs all by itself. 
HTH Erick On Mon, Sep 20, 2010 at 8:58 AM, n...@frameweld.com wrote: Here is my raw query: q=opening+excellent+AND+presentation_id%3A294+AND+type%3Ablobversion=1.3json.nl=maprows=10start=0wt=xmlhl=truehl.fl=texthl.simple.pre=span+class%3Dhlhl.simple.post=%2Fspanhl.fragsize=0hl.mergeContiguous=falsedebugQuery=on and here is what I get on the debugQuery: [debugQuery output quoted earlier in the thread snipped]
logging for solr
I'm running an old version of Solr (1.2) on Apache Tomcat 5.5.25. Right now the logs all go to the catalina.out file, which keeps growing until it triggers disk space warnings, so I have to shut down the servers periodically to clear it out. I've tried looking around for instructions on configuring the logging for Solr, but I'm not having much luck. Can someone please point me in the right direction? If I can get it into rolling logfiles, I can just have a cron job remove the old ones and not have to restart to do cleanup. Please don't tell me to upgrade the software -- it is not an option at this point. I'm sure the latest versions handle this better, but right now I am unable to upgrade Solr or Tomcat. Thanks! -- Chris
Re: Searching solr with a two word query
Say if I had a two word query that was "opening excellent", I would like it to return something like:

opening excellent
opening
opening
opening
excellent
excellent
excellent

Instead of:

opening excellent
excellent
excellent
excellent

If I did a search, I would like the first word alone to also show up in the results, because currently my results show both words in one result and only the second word for the rest of the results. I've done a search on each word by itself, and there are results for them. Thanks.
- Noel
Re: logging for solr
It is quite easy to modify the default behaviour. Solr simply uses the JVM's default java.util.logging configuration, which can be set with a startup parameter or defined externally in ../tomcat/conf/logging.properties. It is enough to remove all contents of ../tomcat/conf/logging.properties (back it up first) and write: .level = SEVERE This change flips the root logger on the admin page from unset to SEVERE; of course you can use WARNING or INFO instead. You can observe the changes at http://localhost:8080/solr/admin/logging (or simply ~/admin/logging). Details are here: http://wiki.apache.org/tomcat/Logging_Tutorial http://tomcat.apache.org/tomcat-6.0-doc/logging.html Jak
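For the rolling-logfile part of the question specifically, java.util.logging's FileHandler can rotate files by itself. A minimal logging.properties sketch follows -- the path, size limit, and generation count are placeholder values, and note that anything written directly to stdout/stderr will still land in catalina.out:

```
handlers = java.util.logging.FileHandler
.level = WARNING

# Rotate at roughly 10 MB, keeping 5 generations: solr.0.log .. solr.4.log
java.util.logging.FileHandler.pattern = /var/log/tomcat/solr.%g.log
java.util.logging.FileHandler.limit = 10000000
java.util.logging.FileHandler.count = 5
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
```

With rotation handled by the JVM, a cron job only needs to delete old generations rather than restart Tomcat.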
Re: logging for solr
Thanks Jak! That was just what I was looking for! -- Chris
Re: Searching solr with a two word query
It will probably be clearer if you don't use the pseudo-boolean operators and just use + for required terms. If you look at your debug output, you see your query becomes:

all_text:open +all_text:excel +presentation_id:294 +type:blob

Note that all_text:open does not have a + sign, but all_text:excel does. So all_text:open is not required, but all_text:excel is. I think this is because AND marks both of its operands as required (which puts the + on +all_text:excel), while "opening" has no explicit operator, so it gets the default OR, which marks that term as optional. What I would suggest you do is:

opening excellent +presentation_id:294 +type:blob

which I think is much clearer. I think you could also do

opening excellent presentation_id:294 AND type:blob

but it's non-obvious how the result will differ from

opening excellent AND presentation_id:294 AND type:blob

so I wouldn't use either of the last two. Tom

p.s. Not sure what is going on with the last lines of your debug output for the query. Is that really what shows up after the presentation ID? I see Euro, hash mark, zero, semicolon, and H with stroke: str name=parsedquery_toString all_text:open +all_text:excel +presentation_id:€#0;Ħ +type:blob /str

On Mon, Sep 20, 2010 at 12:46 PM, n...@frameweld.com wrote: Say if I had a two word query that was opening excellent, I would like it to return something like: opening excellent opening opening opening excellent excellent excellent Instead of: opening excellent excellent excellent excellent If I did a search, I would like the first word alone to also show up in the results, because currently my results show both words in one result and only the second word for the rest of the results. I've done a search on each word by itself, and there are results for them. Thanks.
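Tom's point about mixed operators can be illustrated without a running Solr. The sketch below is a deliberately simplified model of how the classic Lucene query parser assigns occurrence flags when AND is mixed with the default OR operator -- it is not the actual parser, just an illustration of the rule "AND promotes both of its operands to required, everything else stays optional":

```python
# Simplified model of occur-flag assignment in a Lucene-style query
# parser with defaultOperator=OR. Illustrative sketch, not the real
# parser (it ignores grouping, NOT, phrases, and duplicate terms).

def occur_flags(tokens):
    """tokens: terms and operators in order, e.g. ['a', 'b', 'AND', 'c'].
    Returns a list of (term, flag) pairs, flag being 'MUST' or 'SHOULD'."""
    terms = [t for t in tokens if t not in ('AND', 'OR')]
    flags = {term: 'SHOULD' for term in terms}   # default OR -> optional
    for i, tok in enumerate(tokens):
        if tok == 'AND':                         # AND requires both operands
            flags[tokens[i - 1]] = 'MUST'
            flags[tokens[i + 1]] = 'MUST'
    return [(t, flags[t]) for t in terms]

# Noel's query: "opening excellent AND presentation_id:294 AND type:blob"
query = ['opening', 'excellent', 'AND', 'presentation_id:294',
         'AND', 'type:blob']
for term, flag in occur_flags(query):
    print(term, flag)
```

Run on Noel's query, this reproduces the parsed form from the debug output: only "opening" comes out SHOULD, which is why documents matching just the second word dominate the results.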
RE: Re: Calculating distances in Solr using longitude latitude
You know, if there were some sort of hexagonal/pentagonal, soccer-ball coordinate system for the Earth, all you'd need is an entry's distance to each of the 6/5 facets of the cell it was in, the distance between any two facets, and the distance from the endpoint to all its facets. A giant table of precomputed distances, or some numbering system of coordinates that automatically gave the two facets and the distance between the faces, would be even better. Then just look up the distances and add them. Still waiting for the coordinate system though :-). If one could get it to 10 meters resolution, wow. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Mon, 9/20/10, Markus Jelsma markus.jel...@buyways.nl wrote: From: Markus Jelsma markus.jel...@buyways.nl Subject: RE: Re: Calculating distances in Solr using longitude latitude To: solr-user@lucene.apache.org Date: Monday, September 20, 2010, 1:00 PM Hi, In the early Solr 1.3 days we had an index of leisure-time objects that included geographical coordinates. Based on certain conditions we had to display a specific list of nearby objects. We simply implemented some great-circle calculations, such as the distance between two points [1], aggregated the nearby objects, and sent them to our index. The drawback is that for each addition to the index you have to recalculate all the other nearby objects, which takes a while. The good thing is that, in production, the system isn't slowed down by these calculations, so it's very fast.
[1]: http://williams.best.vwh.net/avform.htm#Dist Cheers, -Original message- From: Dennis Gearon gear...@sbcglobal.net Sent: Mon 20-09-2010 19:42 To: solr-user@lucene.apache.org; Subject: Re: Calculating distances in Solr using longitude latitude Hmmm, I am about to put an engineer on our search engine requirements with the assumption that latitude/longitude support is available in the current release of Solr (not knowing what that is). I have been handing the whole Solr thing off to him, keeping just enough info for me to understand and interface with his work. So I don't have that answer. Can someone else answer him? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Mon, 9/20/10, PeterKerk vettepa...@hotmail.com wrote: From: PeterKerk vettepa...@hotmail.com Subject: Re: Calculating distances in Solr using longitude latitude To: solr-user@lucene.apache.org Date: Monday, September 20, 2010, 6:53 AM Hi Dennis, Good suggestion, but I see that most of that is Solr 4.0 functionality, which has not been released yet. How can I still use the longitude/latitude functionality (LatLonType)? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Calculating-distances-in-Solr-using-longitude-latitude-tp1524297p1529097.html Sent from the Solr - User mailing list archive at Nabble.com.
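The great-circle distance Markus precomputes outside Solr (reference [1], the Aviation Formulary) is straightforward to implement. A sketch using the haversine form, assuming coordinates in decimal degrees; the city coordinates are illustrative:

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in
    decimal degrees, using the haversine form of the formula."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Amsterdam -> Paris, roughly 430 km
print(round(great_circle_km(52.37, 4.90, 48.86, 2.35)))
```

As the thread notes, the cost is not the formula itself but running it against every existing object whenever a new one is indexed.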
Re: Calculating distances in Solr using longitude latitude
There is a third-party add-on for Solr 1.4 called LocalSolr. It has a different API than the upcoming SpatialSearch stuff, and will probably not live on in future releases. The LatLonType stuff is definitely only on the trunk, not even 3.x. PeterKerk wrote: Hi Dennis, Good suggestion, but I see that most of that is Solr 4.0 functionality, which has not been released yet. How can I still use the longitude latitude functionality (LatLonType)? Thanks!
Re: Solr for statistical data
Does this do what you want? http://wiki.apache.org/solr/StatsComponent I can see that group by is a possible enhancement to this component. Kjetil Ødegaard wrote: Hi all, we're currently using Solr 1.4.0 in a project for statistical data, where we group and sum a number of double values. Probably not what most people use Solr for, but it seems to be working fine for us :-) We do have some challenges, especially with memory use, so I thought I'd check here if anybody has done something similar. Some details: - The index is currently around 30 GB and growing. The data is indexed directly from a database, each row ends up as a document. I think we have around 100 million documents now, the largest core is about 40 million. The data is split in different cores for different statistics data. - Heap size is currently 4 GB. We're currently running all the cores in a single JVM on WebSphere (WAS) 6.1. We have a couple of GB left for OS disk cache. Initially we used a 1 GB heap, so we had to split cores in different shards in order to avoid OutOfMemoryErrors because of the FieldCache (I think). - The grouping is done by a custom Solr component which takes parameters that specify which fields to group by (like in SQL) and sums up values for the group. This uses the FieldCache for speedy retrieval. We did a PoC on using Documents instead, but this seemed to go a lot slower. I've done a memory dump and the combined FieldCache looks to be about 3 GB (taken with a grain of salt since I'm not sure all the data was cached). I guess this is different from normal Solr searches since we have to process all the documents in a core in order to calculate results, we can't just return the first 10 (or whatever) documents. Any tips or similar experiences? ---Kjetil
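Kjetil's custom component essentially performs a SQL-style GROUP BY ... SUM over per-document field values read from the FieldCache. Stripped of all the Solr plumbing, the core aggregation looks like the sketch below (an illustration of the idea, not the actual component; field names are made up):

```python
from collections import defaultdict

def group_sum(group_values, measure_values):
    """Aggregate a measure per group, as GROUP BY ... SUM would.

    group_values and measure_values are parallel arrays with one entry
    per document -- the shape the Lucene FieldCache hands back for a
    field, which is why this is much faster than loading Documents."""
    totals = defaultdict(float)
    for key, value in zip(group_values, measure_values):
        totals[key] += value
    return dict(totals)

# Per-document field values, as if read from the FieldCache:
regions = ['north', 'south', 'north', 'east', 'south']
amounts = [10.0, 4.5, 2.5, 7.0, 0.5]
print(group_sum(regions, amounts))
```

This also shows where the memory goes: every grouped and summed field must be resident in the FieldCache for the whole core, which is consistent with the ~3 GB cache Kjetil observed in the heap dump.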
Re: Solr Analyzer results before the actual query.
Yes. Look at the JSP page solr/admin/analysis.jsp. It makes calls to Solr that do exactly what you want, using the analysis component. Lance zackko wrote: Hi to all the forum from a new subscriber. I'm working on the server-side search solution of the company I'm currently employed with, and I have a problem at the moment: when I submit a search to Solr, I want to see the analyzer results for the query terms, with all the filters applied as defined in types.xml. I want the analyzer's result displayed BEFORE the actual search is performed, so I can decide at that point whether to run the proper search or leave the user with no results. The problem is more or less described in this issue: https://issues.apache.org/jira/browse/SOLR-261. In summary: is it possible to get the analyzer results (in code) before running the actual Solr search? I'm quite new to Solr, so maybe this has already been discussed in another thread, but I'm unable to find it at the moment; if anybody has any clue on how to do that, any suggestion will be more than welcome. Thanks very much in advance. Best wishes.
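If you need the analysis programmatically rather than via the JSP, newer Solr versions (1.4 onward, as far as I know) ship a FieldAnalysisRequestHandler that returns the per-filter token output as a normal Solr response. A sketch of the request, assuming the handler is registered at /analysis/field in solrconfig.xml -- the host, port, and field type name are placeholders, so check your own config before relying on this path:

```
http://localhost:8983/solr/analysis/field
    ?analysis.fieldtype=text
    &analysis.fieldvalue=opening+excellent
    &wt=json
```

Your application can issue this request first, inspect the resulting tokens, and only then decide whether to run the real search.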