MISSING LICENSE
Hi

Just tried to run "ant clean test" on the latest code from trunk. I get a lot of MISSING LICENSE messages - e.g.

[licenses] MISSING LICENSE for the following file:
[licenses]   .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-3.3.3.jar
[licenses] Expected locations below:
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-ASL.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-BSD.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-BSD_LIKE.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-CDDL.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-CPL.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-EPL.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-MIT.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-MPL.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-PD.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-SUN.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-COMPOUND.txt
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-FAKE.txt

$ ant -version
Apache Ant(TM) version 1.8.2 compiled on October 14 2011

What might be wrong?

Regards, Per Steffensen
Performance (responsetime) on request
Hi,

I've got two virtual machines in the same subnet at the same hosting provider. On one machine my web application is running, on the second a Solr instance. In Solr I use the following:

<fieldType name="text_auto" class="solr.TextField">
  <analyzer type="index">
    <!--<tokenizer class="solr.KeywordTokenizerFactory"/>-->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>-->
  </analyzer>
  <analyzer type="query">
    <!--<tokenizer class="solr.KeywordTokenizerFactory"/>-->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
</fieldType>

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

If I search from my web application in my autosuggest box, I get response times of ~500ms per request. Is it possible to tune Solr so that I get faster results? I have no special cache configuration, and I don't know what to configure here.

Thanks,
Ramo
SOLR Query Intersection
Hi,

I am trying to compare three independent queries, take the intersection among them, and draw a Venn diagram using Google Charts. Using OR I can get the union of the 3 fields, and using AND I can get the intersection among the three. Is it possible to get both the union and the intersections among the fields in the same query?

For example: I have 3 values under the multi-valued field "browsers": Google, Firefox and IE. I need to find the number of documents having only Google, only Firefox, etc., the number of documents having all three, and the pairwise intersections like Google+IE and Google+Firefox.

Is it possible to do this with query intersections, or do I need to write separate queries for all of the above? If not, please suggest how it can be achieved.

Thanks
Balaji

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Query-Intersection-tp3818756p3818756.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR Query Intersection
It sounds like facets: http://wiki.apache.org/solr/SolrFacetingOverview . Doesn't it?

On Mon, Mar 12, 2012 at 1:16 PM, balaji mcabal...@gmail.com wrote:
[original message quoted in full - trimmed]

--
Sincerely yours
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
List of recommendation engines with solr
Hi All,

I would like a list of recommendation engines that can be integrated with Solr, and also a suggestion for the best one among them. Any comments would be appreciated!!

Thanks,
Rohan

--
View this message in context: http://lucene.472066.n3.nabble.com/List-of-recommendation-engines-with-solr-tp3818917p3818917.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to index doc file in solr?
Hi Erick,

Thanks for the valuable comments on this. I have a few sets of Word .doc files, and I would like to index the metadata part, including the content of the pages. Is there any way to complete this task?

Need your comments on this.

Thanks,
Rohan

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-doc-file-in-solr-tp3806543p3818938.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to ignore indexing of duplicated documents?
http://wiki.apache.org/solr/Deduplication -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-ignore-indexing-of-duplicated-documents-tp3814858p3818973.html Sent from the Solr - User mailing list archive at Nabble.com.
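The wiki page above configures deduplication as an update processor chain in solrconfig.xml. A minimal sketch, closely following the wiki example (the signature field and the list of fields to hash are placeholders you would adapt to your own schema):

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- documents whose hashed fields are identical get the same signature -->
    <str name="signatureField">id</str>
    <!-- with overwriteDupes=true, a duplicate signature overwrites the earlier doc -->
    <bool name="overwriteDupes">true</bool>
    <str name="fields">name,features,cat</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain still has to be attached to the update request handler (the wiki shows how) before it takes effect.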
SolrCore error
Hi,

I'm getting some exceptions while shutting down the hybris server; the exception details are specified in the file attached to this mail. Please try to resolve it as soon as possible.

Thanks & Regards,
Nikhila Pala
Systems engineer
Infosys Technologies Limited

[standard corporate e-mail disclaimer trimmed]
Re: Solr 4.0
Hi Robert,

See http://wiki.apache.org/solr/Solr4.0

The developer community is working towards a 4.0-Alpha release expected in a few months, however no dates are fixed. Many already use a snapshot version of TRUNK. You are free to do so, at your own risk.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 12. mars 2012, at 03:15, Robert Yu wrote:

What's the status of Solr 4.0? Has anyone started to use it? I heard it supports real-time index updates; I'm interested in this feature.

Thanks,
Robert Yu
Platform Service - Backend
Morningstar Shenzhen Ltd.
Morningstar. Illuminating investing worldwide.
+86 755 3311-0223 voice
+86 137-2377-0925 mobile
+86 755 - fax
robert...@morningstar.com
8FL, Tower A, Donghai International Center (or East Pacific International Center)
7888 Shennan Road, Futian district, Shenzhen, Guangdong province, China 518040
http://cn.morningstar.com

[standard confidentiality notice trimmed]
Re: SOLR Query Intersection
Hi Mikhail,

Yes, I am trying to get the facet counts for all of these and populate the chart, but the comparison between the values is what I am wondering about. Will facets handle all three possible scenarios?

Thanks
Balaji

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Query-Intersection-tp3818756p3819111.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Does the lucene support the substring search?
Returning to the post, I would like to know whether Lucene supports substring search or not. As you can see, one field of my document is a long string field without any spaces, which means tokenization doesn't help here. Suppose I want to search for the string TARCSV in my documents and return the sample record from my document set. I tried both wildcard search and fuzzy search, but neither seems to work. I am not sure whether I did everything right in the indexing and parsing stages. Does anyone have experience with substring search?

Yes it is possible. Two different approaches are described in a recent thread: http://search-lucene.com/m/Wicj8UB0gl2

One of them uses both trailing and leading wildcards, e.g. q=*TARCSV*. The other approach makes use of NGramFilterFactory at index time only.

It seems that you will be dealing with extremely long tokens. It is a good idea to increase maxTokenLength (default value is 255, see SOLR-2188); tokens longer than this are silently ignored.
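One way to realize the NGramFilterFactory approach at index time only is sketched below; field/type names and gram sizes are illustrative, not from the thread. A query like q=field:tarcsv then matches without any wildcards, because every substring (up to maxGramSize) was indexed as its own term, and the query-side analyzer deliberately omits the NGram filter:

```xml
<fieldType name="substring_match" class="solr.TextField">
  <analyzer type="index">
    <!-- the field is one long token without spaces, so keep it whole... -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- ...and index every substring of length 3..10 as its own term -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="10"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off: the index grows quickly with maxGramSize, and query strings longer than maxGramSize will not match any single indexed gram.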
Re: SOLR Query Intersection
Hi,

I got your point. Are you suggesting that I run the various combinations using the *facet.query* param?

Thanks
Balaji

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Query-Intersection-tp3818756p3819165.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR Query Intersection
I've done exactly this, rendering Venn diagrams using Google Charts from Solr. See my presentation here:

http://www.slideshare.net/erikhatcher/rapid-prototyping-with-solr-5675936

See slides 26-29, even with full code in the slides, but the code is also available here:

https://github.com/erikhatcher/solr-rapid-prototyping/tree/master/ApacheCon2010

And, yup, facet.query was leveraged for this.

Erik

On Mar 12, 2012, at 05:16, balaji wrote:
[original message quoted in full - trimmed]
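With facet.query you can fetch all seven raw counts in one request (one facet.query per single field value, one per pairwise AND, plus the triple AND); the exclusive Venn regions then fall out by inclusion-exclusion. A sketch, with made-up numbers standing in for the facet responses:

```python
# Hypothetical counts as returned by seven facet.query parameters;
# the numbers are invented for illustration.
counts = {
    "G": 100, "F": 80, "I": 60,    # browsers:Google / :Firefox / :IE
    "GF": 40, "GI": 30, "FI": 20,  # pairwise AND queries
    "GFI": 10,                     # browsers:Google AND browsers:Firefox AND browsers:IE
}

def only(a, a_b, a_c, abc):
    """Docs matching set a but neither of the other two (inclusion-exclusion)."""
    return a - a_b - a_c + abc

only_google = only(counts["G"], counts["GF"], counts["GI"], counts["GFI"])
only_firefox = only(counts["F"], counts["GF"], counts["FI"], counts["GFI"])
# union of all three, for the outer boundary of the Venn diagram
union = (counts["G"] + counts["F"] + counts["I"]
         - counts["GF"] - counts["GI"] - counts["FI"] + counts["GFI"])

print(only_google, only_firefox, union)  # 40 30 160
```

The same arithmetic gives the exclusive pairwise regions (e.g. Google+Firefox but not IE is GF - GFI).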
Re: MISSING LICENSE
On 3/12/2012 1:24 AM, Per Steffensen wrote:

$ ant -version
Apache Ant(TM) version 1.8.2 compiled on October 14 2011

What might be wrong?

If you check lucene/BUILD.txt in your source, it says to use ant 1.7.1 or later, but not 1.8.x. This is from a recent trunk checkout:

Basic steps:
  0) Install JDK 1.6 (or greater), Ant 1.7.1+ (not 1.6.x, not 1.8.x)
  1) Download Lucene from Apache and unpack it
  2) Connect to the top-level of your Lucene installation
  3) Install JavaCC (optional)
  4) Run ant

A previous message on the mailing list about the missing license messages (from 2012-02-23) says that some work has been done to get it working with ant 1.8, but it's not done yet. Can you downgrade or install the older release in an alternate location?

It looks like ant 1.8 has been out for two years, so newer operating systems are going to be shipping with it and it may become difficult to get the older ant release. I know from my own systems that CentOS/RHEL 6 is still using ant 1.7.1.

Thanks,
Shawn
Re: Performance (responsetime) on request
If you look at the solr admin page / statistics of cache, you could check the evictions of the different types of cache. If some of them are larger than zero, try minimizing them by increasing the corresponding cache params in the solrconfig.xml.

On Mon, Mar 12, 2012 at 10:12 AM, Ramo Karahasan ramo.karaha...@googlemail.com wrote:
[original message quoted in full - trimmed]

--
Regards,
Dmitry Kan
Re: MISSING LICENSE
Shawn Heisey skrev:
"If you check lucene/BUILD.txt in your source, it says to use ant 1.7.1 or later, but not 1.8.x." [quoted build steps trimmed]

Ok, thanks. Didn't catch that. I have another checkout of Solr trunk from about 2 weeks ago where I didn't see the problem?!? In that one I am able to run "ant test" etc. without license problems.

"Can you downgrade or install the older release in an alternate location?"

I'm sure I will manage, now that I know what the problem is. Thanks again.
Re: MISSING LICENSE
Over-aggressive license checking code doesn't like jars in extraneous directories (like the "work" directory that the war is exploded into under exampleB). Delete exampleB and the build should work.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10

On Mon, Mar 12, 2012 at 3:24 AM, Per Steffensen st...@designware.dk wrote:
[original message quoted in full - trimmed]
Re: List of recommendation engines with solr
On 12 March 2012 16:30, Rohan rohan_kumb...@infosys.com wrote: Hi All, I would require list of recs engine which can be integrated with solr and also suggest best one out of this. any comments would be appriciated!! What exactly do you mean by that? Why is integration with Solr a requirement, and what do you expect to gain by such an integration? Best also probably depends on the context of your requirements. There are a variety of open-source recommendation engines. If you are looking at something from Apache, and in Java, Mahout might be a good choice. Regards, Gora
Re: Faster Solr Indexing
How have you determined that it's the Solr add? By timing the call on the SolrJ side or by looking at the machine where Solr is running? This is the very first thing you have to answer. You can get a rough idea with any simple profiler (say Activity Monitor on a Mac, Task Manager on a Windows box). The point is just to see whether the indexer machine is being well utilized. I'd guess it's not, actually. One quick experiment would be to try using StreamingUpdateSolrServer (SUSS), which has the capability of having multiple threads fire at Solr at once. It is possible that your time is spent waiting for I/O.

Once you have that question answered, you can refine. But until you know which side of the wire the problem is on, you're flying blind.

Both Yandong and Peyman: These times are quite surprising. Running everything locally on my laptop, I'm indexing between 5-7K documents/second. The source is the Wikipedia dump. I'm particularly surprised by the difference Yandong is seeing based on the various analysis chains. The first thing I'd back off is the MaxPermSize. 512M is huge for this parameter. If you're getting that kind of time differential and your CPU isn't pegged, you're probably swapping, in which case you need to give the processes more memory. I'd just take the MaxPermSize out completely as a start.

Not sure if you've seen this page, something there might help:
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed

But throw a profiler at the indexer as a first step, just to see where the problem is, CPU or I/O.

Best
Erick

On Sat, Mar 10, 2012 at 4:09 PM, Peyman Faratin pey...@robustlinks.com wrote:

Hi

I am trying to index 12MM docs faster than is currently happening in Solr (using solrj). We have identified Solr's add method as the bottleneck (and not commit - which is tuned ok through mergeFactor and maxRamBufferSize and jvm ram). Adding 1000 docs is taking approximately 25 seconds. We are making sure we add and commit in batches. And we've tried both CommonsHttpSolrServer and EmbeddedSolrServer (assuming removing http overhead would speed things up with embedding) but the difference is marginal.

The docs being indexed are on average 20 fields long, mostly indexed but none stored. The major size contributors are two fields:

- content, and
- shingledContent (populated using copyField of content).

The length of the content field is (likely) gaussian distributed (few large docs of 50-80K tokens, but the majority around 2K tokens). We use shingledContent to support phrase queries and content for unigram queries (following the advice of the Solr Enterprise Search Server book - p. 305, section "The Solution: Shingling").

Clearly the size of the docs is a contributor to the slow adds (confirmed by removing these 2 fields, resulting in halving the indexing time). We've tried compressed=true also but that is not working.

Any guidance on how to support our application logic (without having to change the schema too much) and speed up the indexing (from the current 212 days for 12MM docs) would be much appreciated.

thank you

Peyman
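Erick's point about keeping the wire busy applies to batching too: fewer, larger HTTP round trips. A rough stdlib-only sketch of building batched Solr XML update messages (document fields and batch size are invented; a real indexer would POST each payload to /update and commit once at the end, or let SUSS drive several threads):

```python
import xml.etree.ElementTree as ET

def to_add_xml(docs):
    """Render one batch of documents as a Solr XML update message (<add><doc>...)."""
    add = ET.Element("add")
    for doc in docs:
        d = ET.SubElement(add, "doc")
        for name, value in doc.items():
            ET.SubElement(d, "field", name=name).text = str(value)
    return ET.tostring(add, encoding="unicode")

def batches(docs, size):
    """Yield fixed-size slices so each round trip carries many documents."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

# Invented sample documents; 2500 docs in batches of 1000 -> 3 payloads.
docs = [{"id": n, "content": "body %d" % n} for n in range(2500)]
payloads = [to_add_xml(batch) for batch in batches(docs, 1000)]
print(len(payloads))  # 3
```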
AW: Performance (responsetime) on request
Hi,

these are the results from the solr admin page for cache:

name: queryResultCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats:
  lookups : 376
  hits : 246
  hitratio : 0.65
  inserts : 130
  evictions : 0
  size : 130
  warmupTime : 0
  cumulative_lookups : 2994
  cumulative_hits : 1934
  cumulative_hitratio : 0.64
  cumulative_inserts : 1060
  cumulative_evictions : 409

name: fieldCache
class: org.apache.solr.search.SolrFieldCacheMBean
version: 1.0
description: Provides introspection of the Lucene FieldCache, this is **NOT** a cache that is managed by Solr.
stats:
  entries_count : 0
  insanity_count : 0

name: documentCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats:
  lookups : 13416
  hits : 11787
  hitratio : 0.87
  inserts : 1629
  evictions : 1089
  size : 512
  warmupTime : 0
  cumulative_lookups : 100012
  cumulative_hits : 86959
  cumulative_hitratio : 0.86
  cumulative_inserts : 13053
  cumulative_evictions : 11914

name: fieldValueCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=10000, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
stats:
  lookups : 0
  hits : 0
  hitratio : 0.00
  inserts : 0
  evictions : 0
  size : 0
  warmupTime : 0
  cumulative_lookups : 0
  cumulative_hits : 0
  cumulative_hitratio : 0.00
  cumulative_inserts : 0
  cumulative_evictions : 0

name: filterCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=512, initialSize=512, minSize=460, acceptableSize=486, cleanupThread=false)
stats:
  lookups : 0
  hits : 0
  hitratio : 0.00
  inserts : 0
  evictions : 0
  size : 0
  warmupTime : 0
  cumulative_lookups : 0
  cumulative_hits : 0
  cumulative_hitratio : 0.00
  cumulative_inserts : 0
  cumulative_evictions : 0

Is there something to be optimized?

Thanks,
Ramo

-----Original Message-----
From: Dmitry Kan [mailto:dmitry@gmail.com]
Sent: Monday, 12 March 2012 15:06
To: solr-user@lucene.apache.org
Subject: Re: Performance (responsetime) on request

[quoted reply and earlier message trimmed]
Re: Performance (responsetime) on request
you can optimize the documentCache by setting maxSize to some decent value, like 2000. Also configure some meaningful warming queries in the solrconfig. When increasing the cache size, monitor the RAM usage, as that can start increasing as well.

Do you / would you need to use filter queries? Those can speed up search as well through the usage of the filterCache.

Dmitry

On Mon, Mar 12, 2012 at 5:12 PM, Ramo Karahasan ramo.karaha...@googlemail.com wrote:
[cache statistics and earlier messages quoted in full - trimmed]
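Concretely, both suggestions land in solrconfig.xml. A sketch (the sizes are illustrative and should be tuned against the eviction counters above; the warming query is a placeholder for a typical autosuggest request):

```xml
<!-- documentCache showed 1089 evictions at maxSize=512; give it more room -->
<documentCache class="solr.LRUCache"
               size="2048"
               initialSize="1024"
               autowarmCount="0"/>

<!-- run a typical query against every new searcher to pre-warm caches -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">some typical prefix</str><str name="rows">10</str></lst>
  </arr>
</listener>
```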
Re: Strange behavior with search on empty string and NOT
Because Lucene query syntax is not a strict Boolean logic system. There's a good explanation here: http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/

Adding debugQuery=on to your search is your friend. You'll see that your query (at least on 3.5, going to /solr/select) returns this as the parsed query:

<str name="parsedquery">-name:foobar</str>

Solr really doesn't have semantics for empty strings (or NULL for that matter), so the empty clause just gets dropped out.

Best
Erick

On Sun, Mar 11, 2012 at 11:36 PM, Lan dung@gmail.com wrote:
I am curious why solr results are inconsistent for the query below for an empty string search on a TextField.

q=name: returns 0 results
q=name: AND NOT name:FOOBAR returns all results in the solr index. Shouldn't it return 0 results too?

Here is the debugQuery:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="debugQuery">on</str>
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">name: AND NOT name:BLAH232282</str>
      <str name="rows">0</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="3790790" start="0"/>
  <lst name="debug">
    <str name="rawquerystring">name: AND NOT name:BLAH232282</str>
    <str name="querystring">name: AND NOT name:BLAH232282</str>
    <str name="parsedquery">-PhraseQuery(name:"blah 232282")</str>
    <str name="parsedquery_toString">-name:"blah 232282"</str>
    <lst name="explain"/>
    <str name="QParser">LuceneQParser</str>
    <lst name="timing">
      <double name="time">1.0</double>
      <lst name="prepare">
        <double name="time">1.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent">
          <double name="time">1.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.FacetComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.StatsComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.DebugComponent">
          <double name="time">0.0</double>
        </lst>
      </lst>
      <lst name="process">
        <double name="time">0.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.FacetComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.StatsComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.DebugComponent">
          <double name="time">0.0</double>
        </lst>
      </lst>
    </lst>
  </lst>
</response>

--
View this message in context: http://lucene.472066.n3.nabble.com/Strange-behavior-with-search-on-empty-string-and-NOT-tp3818023p3818023.html
Sent from the Solr - User mailing list archive at Nabble.com.
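Erick's point about pure-negative queries can be sketched in a few lines. The following is a toy Python model of the set semantics, not Lucene's actual parser; the doc ids and field values are made up for illustration:

```python
# Toy model of Lucene/Solr set semantics for negative clauses.
# In "name: AND NOT name:FOOBAR", the empty clause "name:" is dropped,
# leaving a pure-negative query. Classic Boolean retrieval has no base
# set to subtract from, so it would match nothing; Solr's special
# top-level handling of pure negatives is why it returns everything.

docs = {1: "foobar", 2: "widget", 3: "gadget"}  # doc id -> name (hypothetical)

def match(field_value, term):
    return field_value == term

def query_not(base_ids, excluded_term):
    """Subtract matches of excluded_term from an explicit base set."""
    return {d for d in base_ids if not match(docs[d], excluded_term)}

# The unambiguous explicit form: q=*:* AND NOT name:foobar
all_ids = set(docs)
result = query_not(all_ids, "foobar")
print(sorted(result))  # [2, 3]
```

Writing the base set explicitly (`*:* AND NOT name:foobar`) avoids relying on the special-case handling entirely.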
Re: 3 Way Solr Join . . ?
I know it goes against the grain for a DB person, but... denormalize. Really. Solr does many things well, but whenever you start trying to make it do database-like stuff, you need to back up and re-think things.

Simplest thing: try indexing one record for each customer/purchase/complaint triplet. How many records are we talking about here anyway? 30-40M documents will probably perform admirably on even a small piece of hardware.

Best
Erick

On Mon, Mar 12, 2012 at 12:55 AM, Angelyna Bola angelyna.b...@gmail.com wrote:
Bill,

So sorry - my example is rapidly showing its shortcomings. The data I am actually working with is complex and obscure, so I was trying to think of an example that was easy to relate to but still has all the relevant characteristics. Let me try a better example.

Let's suppose a company is selling products and keeps track of complaints (which do not relate to any specific purchase):

Data:
Table #1: CUSTOMERS (parent table): City, State, Zip
Table #2: PURCHASES (child table with foreign key to CUSTOMERS): Date, Product Type, Quantity
Table #3: COMPLAINTS (child table with foreign key to CUSTOMERS): Date, Complaint Type, Complaint Text, Remediation

And the company wants to be able to query how their customers buy products and file complaints. The tricky part is that the company needs to be able to blend string queries with date-range queries and integer-range queries.

Query: CUSTOMERS in Vermont, and PURCHASES within the last 1 year with a Quantity > 75, and COMPLAINTS within the last 2 years with a Complaint Type = XYZ and Complaint Text containing the words ABC and EFG.

Problem: The problem with multi-valued fields is that I lose the ability to do range queries over numeric attributes (such as Quantity or Date) when they only relate to other specific attributes (such as Product or Service Type). With the Join feature in Solr trunk, I have no problem joining CUSTOMERS to PURCHASES, or alternatively joining CUSTOMERS to COMPLAINTS. But I do not see a way of joining across all three.
Hopefully I have done a better job with this example (appreciate your patience in trying to help me - I am not always the best at explaining). Angelyna
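Erick's denormalization suggestion can be sketched as a small preprocessing step run before indexing. This is an illustrative Python sketch with made-up field names and sample rows, not a recommended schema:

```python
# Flatten the three tables into one Solr document per
# customer/purchase/complaint triplet, so range queries on quantity or
# date line up with the string fields they belong to.

customers = [{"cust_id": 1, "state": "VT"}]
purchases = [{"cust_id": 1, "product": "XYZ", "quantity": 80}]
complaints = [{"cust_id": 1, "type": "XYZ", "text": "ABC and EFG"}]

def denormalize(customers, purchases, complaints):
    """Cross every customer with each of its purchases and complaints."""
    docs = []
    for c in customers:
        cps = [p for p in purchases if p["cust_id"] == c["cust_id"]] or [{}]
        ccs = [x for x in complaints if x["cust_id"] == c["cust_id"]] or [{}]
        # "or [{}]" keeps customers with no children in the output
        for p in cps:
            for x in ccs:
                doc = dict(c)
                doc.update({"purchase_" + k: v for k, v in p.items() if k != "cust_id"})
                doc.update({"complaint_" + k: v for k, v in x.items() if k != "cust_id"})
                docs.append(doc)
    return docs

flat = denormalize(customers, purchases, complaints)
print(len(flat))  # 1 triplet document for this sample
```

The document count is the product of purchases and complaints per customer, which is where the 30-40M estimate in the thread comes from.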
Re: SOLR Query Intersection
Hi, thank you, Erik and Mikhail - you saved my day.

Thanks
Balaji

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Query-Intersection-tp3818756p3819571.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to index doc file in solr?
Consider using SolrJ, possibly combined with Tika (which is what underlies Solr Cell). http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/

Although ExtractingRequestHandler also has the capability of indexing metadata, if you map the fields. See: http://wiki.apache.org/solr/ExtractingRequestHandler

Best
Erick

On Mon, Mar 12, 2012 at 11:09 AM, Rohan rohan_kumb...@infosys.com wrote:
Hi Erick,

Thanks for the valuable comments on this. I have a set of Word doc files and I would like to index the metadata as well as the content of the pages, so is there any way to complete this task? Need your comments on this.

Thanks,
Rohan

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-doc-file-in-solr-tp3806543p3818938.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: I wanna subscribe this maillist
Please follow the instructions here: http://lucene.apache.org/solr/discussion.html Best Erick On Mon, Mar 12, 2012 at 2:35 AM, 刘翀 lc87...@gmail.com wrote: I wanna subscribe this maillist
AW: Performance (responsetime) on request
Hi,

thanks for your advice. Do you have any documentation on that? I'm not sure how and where to configure this stuff, and what impact it has.

Thanks, Ramo

-Original Message-
From: Dmitry Kan [mailto:dmitry@gmail.com]
Sent: Monday, 12 March 2012 16:21
To: solr-user@lucene.apache.org
Subject: Re: Performance (responsetime) on request

You can optimize the documentCache by setting maxSize to some decent value, like 2000. Also configure some meaningful warming queries in the solrconfig. When increasing the cache size, monitor the RAM usage, as that can start increasing as well. Do you / would you need to use filter queries? Those can speed up search as well, through the usage of the filterCache.

Dmitry

On Mon, Mar 12, 2012 at 5:12 PM, Ramo Karahasan ramo.karaha...@googlemail.com wrote:
Hi,

these are the results from the solr admin page for cache:

name: queryResultCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats:
  lookups: 376
  hits: 246
  hitratio: 0.65
  inserts: 130
  evictions: 0
  size: 130
  warmupTime: 0
  cumulative_lookups: 2994
  cumulative_hits: 1934
  cumulative_hitratio: 0.64
  cumulative_inserts: 1060
  cumulative_evictions: 409

name: fieldCache
class: org.apache.solr.search.SolrFieldCacheMBean
version: 1.0
description: Provides introspection of the Lucene FieldCache; this is **NOT** a cache that is managed by Solr.
stats:
  entries_count: 0
  insanity_count: 0

name: documentCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats:
  lookups: 13416
  hits: 11787
  hitratio: 0.87
  inserts: 1629
  evictions: 1089
  size: 512
  warmupTime: 0
  cumulative_lookups: 100012
  cumulative_hits: 86959
  cumulative_hitratio: 0.86
  cumulative_inserts: 13053
  cumulative_evictions: 11914

name: fieldValueCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=1, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
stats:
  lookups: 0
  hits: 0
  hitratio: 0.00
  inserts: 0
  evictions: 0
  size: 0
  warmupTime: 0
  cumulative_lookups: 0
  cumulative_hits: 0
  cumulative_hitratio: 0.00
  cumulative_inserts: 0
  cumulative_evictions: 0

name: filterCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=512, initialSize=512, minSize=460, acceptableSize=486, cleanupThread=false)
stats:
  lookups: 0
  hits: 0
  hitratio: 0.00
  inserts: 0
  evictions: 0
  size: 0
  warmupTime: 0
  cumulative_lookups: 0
  cumulative_hits: 0
  cumulative_hitratio: 0.00
  cumulative_inserts: 0
  cumulative_evictions: 0

Is there something to be optimized?

Thanks, Ramo

-Original Message-
From: Dmitry Kan [mailto:dmitry@gmail.com]
Sent: Monday, 12 March 2012 15:06
To: solr-user@lucene.apache.org
Subject: Re: Performance (responsetime) on request

If you look at the solr admin page / statistics of cache, you could check the evictions of the different types of cache. If some of them are larger than zero, try minimizing them by increasing the corresponding cache params in the solrconfig.xml.

On Mon, Mar 12, 2012 at 10:12 AM, Ramo Karahasan ramo.karaha...@googlemail.com wrote:
Hi,

I've got two virtual machines in the same subnet at the same hosting provider. On one machine my web application is running, on the second a solr instance.
In solr I use the following field types:

<fieldType name="text_auto" class="solr.TextField">
  <analyzer type="index">
    <!-- <tokenizer class="solr.KeywordTokenizerFactory"/> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/> -->
  </analyzer>
  <analyzer type="query">
    <!-- <tokenizer class="solr.KeywordTokenizerFactory"/> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
</fieldType>

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

If I search from my web application in my autosuggest box, I get response times of ~500ms per request. Is it possible to tune solr so that I get faster results? I have no special cache configuration, and I don't know what to configure here.

Thanks,
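One place to look besides the caches is the analysis chain itself. The sketch below is a rough re-implementation in Python of what an edge n-gram filter with minGramSize=2 and maxGramSize=25 emits per token; it is for illustration only, not Solr's code:

```python
# With the filter in BOTH the index and query analyzers, every query
# token expands into many terms, which adds work per request.

def edge_ngrams(token, min_gram=2, max_gram=25):
    """Return the leading n-grams of a token, shortest first."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

print(edge_ngrams("solr"))                 # ['so', 'sol', 'solr']
print(len(edge_ngrams("autosuggestion")))  # 13 query terms for one token
```

A common tuning for autosuggest (worth testing against your own relevance needs) is to drop the EdgeNGramFilterFactory from the query-time analyzer only: the grams are already in the index, so a lowercased prefix like "aut" matches them directly and each request carries far fewer terms.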
Re: Zookeeper view not displaying on latest trunk
Jamie, would you mind giving the latest another try, to see if the Cloud tab is working as it should?

On Thursday, February 9, 2012 at 6:57 PM, Mark Miller wrote:
On Feb 9, 2012, at 12:09 PM, Jamie Johnson wrote:

To get this to work I had to modify my solr.xml to add a defaultCoreName; then everything worked fine on the old interface (/solr/admin). The new interface was still unhappy, and looking at the response that comes back I see the following:

{status: 404, error: "Zookeeper is not configured for this Solr Core. Please try connecting to an alternate zookeeper address."}

Does the new interface support multiple cores?

It should, but someone else wrote it, so I don't know offhand - sounds like an issue we need to look at.

Should the old interface require that defaultCoreName be set?

No - another thing we should look at.

On Thu, Feb 9, 2012 at 10:29 AM, Jamie Johnson jej2...@gmail.com (mailto:jej2...@gmail.com) wrote:
I'm looking at the latest code on trunk and it seems as if the zookeeper view does not work.
When trying to access the information I get the following in the log:

2012-02-09 10:28:49.030:WARN::/solr/zookeeper.jsp
java.lang.NullPointerException
	at org.apache.jsp.zookeeper_jsp$ZKPrinter.init(org.apache.jsp.zookeeper_jsp:55)
	at org.apache.jsp.zookeeper_jsp._jspService(org.apache.jsp.zookeeper_jsp:533)
	at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:109)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:389)
	at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:486)
	at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:380)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:280)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

- Mark Miller
lucidimagination.com (http://lucidimagination.com)
RE: solr 3.5 and indexing performance
Hi guys,

I have hit the same problem with Hunspell. Doing a few tests for 500,000 documents, I've got:

Hunspell from http://code.google.com/p/lucene-hunspell/ with the 3.4 version - 125 documents per second
Hunspell built from 4.0 trunk - 11 documents per second

All the tests were made on an 8-core CPU with 32 GB RAM and the index on SSD disks. For Solr 3.5 I've tried to change the JVM heap size, ramBufferSizeMB, and mergeFactor, but the speed of indexing was about 10-20 documents per second. Is it possible that there is some performance bug with Solr 4.0? According to the previous post, the problem exists in the 3.5 version as well.

Best regards
Agnieszka Kukałowicz

-Original Message-
From: mizayah [mailto:miza...@gmail.com]
Sent: Thursday, February 23, 2012 10:19 AM
To: solr-user@lucene.apache.org
Subject: Re: solr 3.5 and indexing performance

OK, I found it. It's because of Hunspell, which is now in Solr. Somehow, when I use it by myself in 3.4, it is a lot faster than the one from 3.5. I don't know about the differences, but is there any way I can use my old Google Hunspell jar?

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-3-5-and-indexing-performance-tp3766653p3769139.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Zookeeper view not displaying on latest trunk
I have not pulled the latest (I pulled a week or two ago) and it works on that version.

On Mon, Mar 12, 2012 at 11:40 AM, Stefan Matheis matheis.ste...@googlemail.com wrote:
Jamie, would you mind giving the latest another try, to see if the Cloud tab is working as it should?

On Thursday, February 9, 2012 at 6:57 PM, Mark Miller wrote:
On Feb 9, 2012, at 12:09 PM, Jamie Johnson wrote:

To get this to work I had to modify my solr.xml to add a defaultCoreName; then everything worked fine on the old interface (/solr/admin). The new interface was still unhappy, and looking at the response that comes back I see the following:

{status: 404, error: "Zookeeper is not configured for this Solr Core. Please try connecting to an alternate zookeeper address."}

Does the new interface support multiple cores?

It should, but someone else wrote it, so I don't know offhand - sounds like an issue we need to look at.

Should the old interface require that defaultCoreName be set?

No - another thing we should look at.

On Thu, Feb 9, 2012 at 10:29 AM, Jamie Johnson jej2...@gmail.com (mailto:jej2...@gmail.com) wrote:
I'm looking at the latest code on trunk and it seems as if the zookeeper view does not work.
Re: Knowing which fields matched a search
Paul, I would think debugQuery would make it slower too, wouldn't it? Where is the thread you are referring to? Is there a lucene jira ticket for this?

On Mar 11, 2012, at 9:38 AM, Paul Libbrecht wrote:

Russell, there's been a thread on that in the lucene world... it's not really perfect yet. In my experience, the debugQuery suggestion only gives you the explain monster, which is good for developers (only).

paul

On 11 March 2012 at 08:40, William Bell wrote:

debugQuery tells you.

On Fri, Mar 9, 2012 at 1:05 PM, Russell Black rbl...@fold3.com wrote:
When searching across multiple fields, is there a way to identify which field(s) resulted in a match without using highlighting or stored fields?

--
Bill Bell
billnb...@gmail.com
cell 720-256-8076
Relational data
Hi. I need to set up an index that has relational data. This index will be for houses to rent, where the user will search for date, price, holidays (by name), etc.

The problem is that the same house can have different prices for different dates. If I denormalize this data, I will show the same house multiple times in the result set, and I don't want this. So, for example:

House | Holiday     | Price per day
1     | Xmas        | $75.00
1     | July 4      | $50.00
1     | Valentine's | $15.00
2     | Xmas        | $50.00
2     | July 4      | $10.00

If I query for all data, I'll get 3 documents for the same house (house 1), but I just want to show it one time to the end-user. Is there some way to do this in Solr (without processing it in my app)?

Thanks

*"And you shall know the truth, and the truth shall set you free." (João 8:32)*
andre.maldon...@gmail.com
(11) 9112-4227
Re: Relational data
The problem is that the same house can have different prices for different dates. If I denormalize this data, I will show the same house multiple times in the result set, and I don't want this. So, for example:

House | Holiday     | Price per day
1     | Xmas        | $75.00
1     | July 4      | $50.00
1     | Valentine's | $15.00
2     | Xmas        | $50.00
2     | July 4      | $10.00

If I query for all data, I'll get 3 documents for the same house (house 1), but I just want to show it one time to the end-user. Is there some way to do this in Solr (without processing it in my app)?

http://wiki.apache.org/solr/FieldCollapsing could work.
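What field collapsing buys here can be modeled in a few lines. Below is a rough Python sketch of the effect of group=true&group.field=house&group.limit=1 on the example rows; the real work happens inside Solr, this just illustrates the collapsing idea:

```python
# Collapse denormalized house/holiday/price rows so each house appears
# once in the result list, keeping at most `limit` rows per group.

rows = [
    {"house": 1, "holiday": "Xmas", "price": 75.00},
    {"house": 1, "holiday": "July 4", "price": 50.00},
    {"house": 1, "holiday": "Valentine's", "price": 15.00},
    {"house": 2, "holiday": "Xmas", "price": 50.00},
    {"house": 2, "holiday": "July 4", "price": 10.00},
]

def collapse(rows, field, limit=1):
    """Group rows by `field`, preserving first-seen order of groups."""
    groups, order = {}, []
    for r in rows:
        key = r[field]
        if key not in groups:
            groups[key] = []
            order.append(key)
        if len(groups[key]) < limit:
            groups[key].append(r)
    return [groups[k] for k in order]

result = collapse(rows, "house")
print(len(result))  # 2 groups -> each house shown once
```

In Solr itself the equivalent request would look something like `q=...&group=true&group.field=house&group.limit=1`, with `group.sort` controlling which row represents each house.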
Trouble indexing word documents
Hello,

I'm running Solr inside Tomcat and I'm trying to index a Word .doc using curl, and I get the following error:

bash-3.2# curl "http://localhost:8585/solr/update/extract?literal.id=1&commit=true" -F myfile=@troubleshooting_performance.doc

HTTP Status 500 - lazy loading error

org.apache.solr.common.SolrException: lazy loading error
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:257)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:239)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
	at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
	at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:389)
	at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:423)
	at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:459)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:248)
	... 16 more
Caused by: java.lang.ClassNotFoundException: solr.extraction.ExtractingRequestHandler
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
	... 19 more
Re: Relational data
You could use the grouping feature, depending on your needs: http://wiki.apache.org/solr/FieldCollapsing

2012/3/12 André Maldonado andre.maldon...@gmail.com
Hi. I need to set up an index that has relational data. This index will be for houses to rent, where the user will search for date, price, holidays (by name), etc.

The problem is that the same house can have different prices for different dates. If I denormalize this data, I will show the same house multiple times in the result set, and I don't want this. So, for example:

House | Holiday     | Price per day
1     | Xmas        | $75.00
1     | July 4      | $50.00
1     | Valentine's | $15.00
2     | Xmas        | $50.00
2     | July 4      | $10.00

If I query for all data, I'll get 3 documents for the same house (house 1), but I just want to show it one time to the end-user. Is there some way to do this in Solr (without processing it in my app)?

Thanks
Re: Trouble indexing word documents
Make sure the Solr Cell jar is in the classpath. You probably have a line like this in your solrconfig.xml:

<lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" />

Make sure that points to the right file.

On Mon, Mar 12, 2012 at 2:59 PM, rdancy rda...@wiley.com wrote:
Hello,

I'm running Solr inside Tomcat and I'm trying to index a Word .doc using curl, and I get the following error:

bash-3.2# curl "http://localhost:8585/solr/update/extract?literal.id=1&commit=true" -F myfile=@troubleshooting_performance.doc

HTTP Status 500 - lazy loading error
org.apache.solr.common.SolrException: lazy loading error
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:257)
	...
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'
	...
Caused by: java.lang.ClassNotFoundException: solr.extraction.ExtractingRequestHandler
	...
	... 19 more
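The regex in that lib element is matched against file names in the configured directory, so a quick way to sanity-check the pattern locally is to run it against the jar names you expect it to pick up. A Python sketch (the file names here are hypothetical):

```python
# Check which jar names in a dist directory the solrconfig.xml
# lib-element regex would select.

import re

pattern = re.compile(r"apache-solr-cell-\d.*\.jar")
files = [
    "apache-solr-cell-3.5.0.jar",   # should match
    "apache-solr-core-3.5.0.jar",   # different artifact, should not match
    "tika-core-1.0.jar",            # dependency, matched by other lib lines
]
matched = [f for f in files if pattern.match(f)]
print(matched)  # ['apache-solr-cell-3.5.0.jar']
```

If the Solr Cell jar name in your dist directory doesn't match the pattern (or the dir path is wrong relative to the core), the handler class can't be loaded, which is consistent with the ClassNotFoundException above.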
Solr Monitoring / Stats
Hi All,

I was wondering if anyone knows of a free tool to monitor multiple Solr hosts under one roof? I found some non-functioning Cacti/Munin trial implementations, but would really like more direct statistics of the JVM itself plus all Solr cores (i.e. requests/s, etc.). Does anyone know of one? Or does anyone have a set of JMX URLs that could be used to make e.g. Munin or Cacti use that data? I'm currently running psi-probe on each host to have at least some overview of what's going on within the JVM.

Thanks!
Alex
RE: Including an attribute value from a higher level entity when using DIH to index an XML file
I found an answer to my question, but it comes with a cost. With an XML file like this (this is simplified to remove extraneous elements and attributes): data user id=[id-num] message date=[date][message text]/message ... /user ... /data I can index the user id as a field in documents that represent each of the user's messages with this data-config expression: dataConfig dataSource type=FileDataSource encoding=UTF-8 / document entity name=message processor=XPathEntityProcessor stream=true forEach=/data/user/message | /data/user url=message-data.xml field column=id xpath=/data/user/@id commonField=true/ field column=date xpath=/data/user/message/@date dateTimeFormat=-MM-dd'T'hh:mm:ss/ field column=text xpath=/data/user/message / /entity /document /dataConfig I didn't realize that commonField would work for cases in which the previously encountered field is in an element that encompasses the other elements, but it does. The forEach value has to be /data/user/message | /data/user in order for the user id to be located, since it is not under /data/user/message. By specifying forEach=/data/user/message | /data/user I am saying that each /data/user or /data/user/message element is a document in the index, but I don't really want /data/user elements to be treated this way. As luck would have it, those documents are filtered out, only because date and text are required fields, and they have not been assigned values yet when a document is created for a /data/user element, so an exception is thrown. I could live with this, but it's kind of ugly. I don't see any other way of doing what I need to do with embedded XML elements though. I tried creating nested entities in the data-config file, but each one of them is required to have a url attribute, and I think that caused the input file to be read twice. 
The only other possibility I could see from reading the DataImportHandler documentation was to specify an XSL file and change the XML file's structure so that the user id attribute is moved down to be an attribute of the message element. I'm not sure it's worth doing something like that for what seems like a small problem, and I wonder how much it would slow down the import of a large XML file. Are there any other ways of handling cases like this, where an attribute of an outer element is to be included in an index document that corresponds to an element nested inside it? Thanks, Mike

-----Original Message-----
From: Mike O'Leary [mailto:tmole...@uw.edu]
Sent: Friday, March 02, 2012 3:30 PM
To: Solr-User (solr-user@lucene.apache.org)
Subject: Including an attribute value from a higher level entity when using DIH to index an XML file

I have an XML file that I would like to index, with a structure similar to this:

<data>
  <user id="[id-num]">
    <message date="[date]">[message text]</message>
    ...
  </user>
  ...
</data>

I would like the documents in the index to correspond to the messages in the XML file, and to have the user's [id-num] value stored as a field in each of the user's documents. I think this means that I have to define an entity for message that looks like this:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="message" processor="XPathEntityProcessor" stream="true"
            forEach="/data/user/message" url="message-data.xml">
      <field column="date" xpath="/data/user/message/@date" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss"/>
      <field column="text" xpath="/data/user/message" />
    </entity>
  </document>
</dataConfig>

but I don't know where to put the field definition for the user id. It would look like

<field column="id" xpath="/data/user/@id" />

I can't put it within the message entity, because that entity is defined with forEach="/data/user/message" and the id field's xpath value is outside the entity's scope. Putting the id field definition there causes a null pointer exception.
I don't think I want to create a user entity that the message entity is nested inside of, or is there a way to do that and still have the index documents correspond to messages from the file? Are there one or more attributes or attribute values that I haven't run across in my searching that provide a way to do what I need to do? Thanks, Mike
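What DIH's commonField="true" achieves here can be mimicked in plain Python, which also illustrates why the id has to be picked up at the /data/user level: the attribute lives on the parent element, and each message document simply inherits it. A standalone sketch with ElementTree; the sample values ("u1", the dates) are made up for illustration, the structure matches the question:

```python
import xml.etree.ElementTree as ET

# Same shape as the message-data.xml described in the question.
xml_text = """
<data>
  <user id="u1">
    <message date="2012-03-01T10:00:00">hello</message>
    <message date="2012-03-02T11:00:00">world</message>
  </user>
</data>
"""

docs = []
root = ET.fromstring(xml_text)
for user in root.findall("user"):
    for message in user.findall("message"):
        # The parent's id attribute is copied into every message document,
        # which is what commonField="true" does in the DIH config.
        docs.append({
            "id": user.get("id"),
            "date": message.get("date"),
            "text": message.text,
        })
```

Each entry in docs corresponds to one index document: one per message, each carrying the enclosing user's id.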
Re: MISSING LICENSE
Per: You've been working with SolrCloud, haven't you? Yonik's right on; removing exampleB is what I had to do with the exact same problem. Erick

On Mon, Mar 12, 2012 at 2:33 PM, Yonik Seeley yo...@lucidimagination.com wrote: Over-aggressive license-checking code doesn't like jars in extraneous directories (like the work directory that the war is exploded into under exampleB). Delete exampleB and the build should work. -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference, Boston May 7-10

On Mon, Mar 12, 2012 at 3:24 AM, Per Steffensen st...@designware.dk wrote: Hi, just tried to ant clean test on the latest code from trunk. I get a lot of MISSING LICENSE messages, e.g.:

[licenses] MISSING LICENSE for the following file:
[licenses] .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-3.3.3.jar
[licenses] Expected locations below:
[licenses] = .../solr/exampleB/work/Jetty_0_0_0_0_8900_solr.war__solr__dsbrc0/webapp/WEB-INF/lib/zookeeper-LICENSE-ASL.txt
[licenses] = (and likewise zookeeper-LICENSE-BSD.txt, -BSD_LIKE.txt, -CDDL.txt, -CPL.txt, -EPL.txt, -MIT.txt, -MPL.txt, -PD.txt, -SUN.txt, -COMPOUND.txt, -FAKE.txt in the same directory)

$ ant -version
Apache Ant(TM) version 1.8.2 compiled on October 14 2011

What might be wrong? Regards, Per Steffensen
query to some field in solr for multiple values
How can we query a single string-type field for multiple values? For example, I have a schema field like

<field name="id" type="string" indexed="true" stored="true" required="true" />

and I want to query the id field for multiple values, like q=id:['1', '5', '17']... In MySQL we would write the same query as

select * from table where id in (1, 5, 17);

How can we perform the same query in Solr on the id field? -- Thanks & Regards, Preetesh Dubey
Re: Relational data
Thanks, Ahmet and Tomás. It worked like a charm. *E conhecereis a verdade, e a verdade vos libertará. (João 8:32)* ("And you shall know the truth, and the truth shall set you free.") andre.maldon...@gmail.com (11) 9112-4227

On Mon, Mar 12, 2012 at 2:54 PM, Ahmet Arslan iori...@yahoo.com wrote: The problem is that the same house can have different prices for different dates. If I denormalize this data, I will show the same house multiple times in the result set, and I don't want this. So, for example:

House  Holiday      Price per day
1      Xmas         $75.00
1      July 4       $50.00
1      Valentine's  $15.00
2      Xmas         $50.00
2      July 4       $10.00

If I query for all the data, I'll get 3 documents for the same house (house 1), but I just want to show it once to the end user. Is there some way to do this in Solr (without processing it in my app)? http://wiki.apache.org/solr/FieldCollapsing could work.
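The FieldCollapsing approach that worked here boils down to adding grouping parameters to the query. A small sketch of the parameters involved; the field name house_id is an assumption about the schema, not something from the thread:

```python
from urllib.parse import urlencode

# Collapse results so each house appears once, keeping one
# representative holiday/price document per house.
# "house_id" is a hypothetical field name for illustration.
params = {
    "q": "*:*",
    "group": "true",          # enable result grouping / field collapsing
    "group.field": "house_id",
    "group.limit": "1",       # one document per house in the response
}
query_string = urlencode(params)
# Append query_string to the core's select URL to issue the request.
```

With these parameters, house 1's three holiday documents come back as a single group, so the application never sees the duplicates.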
Re: query to some field in solr for multiple values
I want to query on the id field for multiple values, like q=id:['1', '5', '17']... In MySQL we perform the same query like select * from table where id in (1, 5, 17). How can we perform the same query in Solr on the id field?

q=1 5 17&q.op=OR&df=id
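For completeness, the same IN-style lookup can also be written as a single field query, q=id:(1 OR 5 OR 17), without relying on q.op and df defaults. A quick sketch of building the URL-encoded parameter (the field name is the one from the question):

```python
from urllib.parse import urlencode

# Equivalent of SQL "WHERE id IN (1, 5, 17)" as an explicit Solr field query.
params = {"q": "id:(1 OR 5 OR 17)"}
query_string = urlencode(params)
# Append to the core's select URL, e.g. http://localhost:8983/solr/select?
```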
Re: Performance (responsetime) on request
This page should help you: http://wiki.apache.org/solr/SolrCaching -- Dmitry

On Mon, Mar 12, 2012 at 5:37 PM, Ramo Karahasan ramo.karaha...@googlemail.com wrote: Hi, thanks for your advice. Do you have any documentation on that? I'm not sure how and where to configure this stuff and what impact it has. Thanks, Ramo

-----Original Message-----
From: Dmitry Kan [mailto:dmitry@gmail.com]
Sent: Monday, 12 March 2012 16:21
To: solr-user@lucene.apache.org
Subject: Re: Performance (responsetime) on request

You can optimize the documentCache by setting maxSize to some decent value, like 2000. Also configure some meaningful warming queries in solrconfig. When increasing the cache size, monitor the RAM usage, as that can start increasing as well. Do you / would you need to use filter queries? Those can speed up search as well, through the use of the filterCache. Dmitry

On Mon, Mar 12, 2012 at 5:12 PM, Ramo Karahasan ramo.karaha...@googlemail.com wrote: Hi, these are the results from the solr admin page for the caches:

name: queryResultCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats: lookups : 376, hits : 246, hitratio : 0.65, inserts : 130, evictions : 0, size : 130, warmupTime : 0, cumulative_lookups : 2994, cumulative_hits : 1934, cumulative_hitratio : 0.64, cumulative_inserts : 1060, cumulative_evictions : 409

name: fieldCache
class: org.apache.solr.search.SolrFieldCacheMBean
version: 1.0
description: Provides introspection of the Lucene FieldCache; this is **NOT** a cache that is managed by Solr.
stats: entries_count : 0, insanity_count : 0

name: documentCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats: lookups : 13416, hits : 11787, hitratio : 0.87, inserts : 1629, evictions : 1089, size : 512, warmupTime : 0, cumulative_lookups : 100012, cumulative_hits : 86959, cumulative_hitratio : 0.86, cumulative_inserts : 13053, cumulative_evictions : 11914

name: fieldValueCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=1, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
stats: lookups : 0, hits : 0, hitratio : 0.00, inserts : 0, evictions : 0, size : 0, warmupTime : 0, cumulative_lookups : 0, cumulative_hits : 0, cumulative_hitratio : 0.00, cumulative_inserts : 0, cumulative_evictions : 0

name: filterCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=512, initialSize=512, minSize=460, acceptableSize=486, cleanupThread=false)
stats: lookups : 0, hits : 0, hitratio : 0.00, inserts : 0, evictions : 0, size : 0, warmupTime : 0, cumulative_lookups : 0, cumulative_hits : 0, cumulative_hitratio : 0.00, cumulative_inserts : 0, cumulative_evictions : 0

Is there something to be optimized? Thanks, Ramo

-----Original Message-----
From: Dmitry Kan [mailto:dmitry@gmail.com]
Sent: Monday, 12 March 2012 15:06
To: solr-user@lucene.apache.org
Subject: Re: Performance (responsetime) on request

If you look at the solr admin page's cache statistics, you can check the evictions of the different types of cache. If some of them are larger than zero, try minimizing them by increasing the corresponding cache params in solrconfig.xml.

On Mon, Mar 12, 2012 at 10:12 AM, Ramo Karahasan ramo.karaha...@googlemail.com wrote: Hi, I've got two virtual machines in the same subnet at the same hosting provider. On one machine my web application is running, on the second a Solr instance.
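A quick way to read a stats dump like the one above: caches with evictions greater than zero and a mediocre cumulative hit ratio are the candidates for a larger maxSize. A small sketch applying that rule of thumb to the numbers quoted; the 0.9 threshold is a judgment call for illustration, not a Solr default:

```python
# Cumulative numbers copied from the stats dump in the thread.
caches = {
    "queryResultCache": {"lookups": 2994,   "hits": 1934,  "evictions": 409},
    "documentCache":    {"lookups": 100012, "hits": 86959, "evictions": 11914},
    "filterCache":      {"lookups": 0,      "hits": 0,     "evictions": 0},
}

def needs_bigger_max_size(stats, target_hit_ratio=0.9):
    """Flag a cache that is both evicting entries and missing often."""
    if stats["lookups"] == 0:   # unused cache, nothing to tune
        return False
    hit_ratio = stats["hits"] / stats["lookups"]
    return stats["evictions"] > 0 and hit_ratio < target_hit_ratio

to_grow = [name for name, s in caches.items() if needs_bigger_max_size(s)]
```

On these numbers, both queryResultCache and documentCache are evicting while missing, which matches the advice to raise their maxSize, whereas the filterCache is simply unused because no fq parameters are being sent.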
In Solr I use the following fieldTypes:

<fieldType name="text_auto" class="solr.TextField">
  <analyzer type="index">
    <!-- <tokenizer class="solr.KeywordTokenizerFactory"/> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" /> -->
  </analyzer>
  <analyzer type="query">
    <!-- <tokenizer class="solr.KeywordTokenizerFactory" /> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
  </analyzer>
</fieldType>

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter
Re: Trouble indexing word documents
I see the line <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" /> but I don't see any Solr Cell jars, only Tika jars. I moved all the jars over to my classpath directory. I'm using version lucidworks-solr-3.2.0_01. -- View this message in context: http://lucene.472066.n3.nabble.com/Trouble-indexing-word-documents-tp3819949p3820472.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: MISSING LICENSE
Thank you both for your kind help. Regards, Steff

Erick Erickson wrote: Per: You've been working with SolrCloud, haven't you? Yonik's right on; removing exampleB is what I had to do with the exact same problem. Erick On Mon, Mar 12, 2012 at 2:33 PM, Yonik Seeley yo...@lucidimagination.com wrote: Over-aggressive license-checking code doesn't like jars in extraneous directories (like the work directory that the war is exploded into under exampleB). Delete exampleB and the build should work. -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference, Boston May 7-10
Re: Display of highlighted search result should start with the beginning of the sentence that contains the search string.
Hi Koji, I am Shyam's coworker. After looking into this issue, I believe the chopped-word problem has to do with the 'margin' field of the org.apache.lucene.search.vectorhighlight.SimpleFragListBuilder class. It is set to 6 by default. My understanding is that a margin value greater than zero results in a truncated word when the highlighted term is too close to the beginning of a document. I was able to reset the 'margin' field by creating my own version of org.apache.solr.highlight.SimpleFragListBuilder and passing zero for 'margin' when calling Lucene's SimpleFragListBuilder constructor. My testing shows the problem has been fixed. Do you concur? Now a couple of questions: I'm not sure what the purpose of this field is; could you give the use case for it? Also, could it be exposed as a parameter in Solr so it could be set to some other value? Thanks, Koorosh
Re: SolrCore error
Your attachment didn't come through; the mail server often strips this stuff. Please either inline it or put it up in some publicly accessible place. Best, Erick

On Sun, Mar 11, 2012 at 10:51 PM, Nikhila Pala nikhila_p...@infosys.com wrote: Hi, I'm getting some exceptions while shutting down the hybris server; the exception details are specified in the file attached to this mail. Please try to resolve it as soon as possible. Thanks & Regards, Nikhila Pala, Systems Engineer, Infosys Technologies Limited

CAUTION - Disclaimer: This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions are unlawful. This e-mail may contain viruses. Infosys has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. Infosys reserves the right to monitor and review the content of all messages sent to or from this e-mail address. Messages sent to or from this e-mail address may be stored on the Infosys e-mail system. ***INFOSYS End of Disclaimer INFOSYS***
Re: Trouble indexing word documents
It should be in lucidworks-solr-3.2.0_01/dist/lucidworks-solr-cell-3.2.0_01.jar; don't you have that one?

On Mon, Mar 12, 2012 at 5:44 PM, rdancy rda...@wiley.com wrote: I see the line <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" /> but I don't see any Solr Cell jars, only Tika jars. I moved all the jars over to my classpath directory. I'm using version lucidworks-solr-3.2.0_01.
Additional Query with MLT
Is there a way to provide an additional query constraint to the MLT component? My particular use case is I want to get similar documents, but limit them to the documents a user can actually see based on some authorization query. Is this currently possible?
Incomplete documents with parent child DB relationship
I'm new to Solr and have managed to get some basic indexing and querying working. However, I haven't been able to successfully index a parent-child database relationship. My db-data-config.xml is:

<dataConfig>
  <dataSource driver="com.ibm.as400.access.AS400JDBCDriver"
              url="jdbc:as400://FAB/SV95TNDTA;;naming=system;"
              user="SV95TNGLB" password="GLOBAL95TN" />
  <document>
    <entity name="client" query="SELECT #1ABCD, #1C8TX, #1AFTX, #1A7NA, #1A8NA FROM REP">
      <field column="#1ABCD" name="id" />
      <field column="#1C8TX" name="surname" />
      <field column="#1AFTX" name="forenames" />
      <field column="#1A7NA" name="ird_number" />
      <field column="#1A8NA" name="gst_number" />
      <entity name="idreference" query="select M6ABR from ABM6CPP where M6ABCD='${client.id}'">
        <field column="M6ABR" name="id_reference" />
      </entity>
    </entity>
  </document>
</dataConfig>

Most 'client' records will have one or more 'idreference' records. Solr seems to import the data successfully (see status below), but when I do a *:* search there are no 'id_reference' elements in any document (see bottom):

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
    </lst>
  </lst>
  <str name="command">status</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">13594</str>
    <str name="Total Rows Fetched">13593</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2012-03-13 13:15:07</str>
    <str name="">Indexing completed. Added/Updated: 13593 documents. Deleted 0 documents.</str>
    <str name="Committed">2012-03-13 13:15:36</str>
    <str name="Optimized">2012-03-13 13:15:36</str>
    <str name="Total Documents Processed">13593</str>
    <str name="Time taken">0:0:29.804</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.
</str>
</response>

<result name="response" numFound="13593" start="0" maxScore="1.0">
  <doc>
    <float name="score">1.0</float>
    <str name="forenames">John David</str>
    <str name="gst_number"></str>
    <str name="id">012345</str>
    <str name="ird_number"></str>
    <str name="surname">Sagers</str>
  </doc>
  <doc>
    <float name="score">1.0</float>
    <str name="forenames">Mark James</str>
    <str name="gst_number"></str>
    <str name="id">000426</str>
    <str name="ird_number"></str>
    <str name="surname">Kirby</str>
  </doc>
  ...
</result>

Any assistance would be greatly appreciated.
Can solr-langid(Solr3.5.0) detect multiple languages in one text?
Hi all, I am using solr-langid (Solr 3.5.0) to do language detection, and I hope multiple languages in one text can be detected. The example text is:

咖哩起源於印度。印度民間傳說咖哩是佛祖釋迦牟尼所創,由於咖哩的辛辣與香味可以幫助遮掩羊肉的腥騷,此舉即為用以幫助不吃豬肉與牛肉的印度人。在泰米爾語中,「kari」是「醬」的意思。在馬來西亞,kari也稱dal(當在mamak檔)。早期印度被蒙古人所建立的莫臥兒帝國(Mughal Empire)所統治過,其間從波斯(現今的伊朗)帶來的飲食習慣,從而影響印度人的烹調風格直到現今。 Curry (plural, Curries) is a generic term primarily employed in Western culture to denote a wide variety of dishes originating in Indian, Pakistani, Bangladeshi, Sri Lankan, Thai or other Southeast Asian cuisines. Their common feature is the incorporation of more or less complex combinations of spices and herbs, usually (but not invariably) including fresh or dried hot capsicum peppers, commonly called chili or cayenne peppers.

I want the text to be separated into two parts, with the part in Chinese going to text_zh-tw and the other to text_en. Can I do something like that? Thank you. Best Regards, Bing
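Since a language identifier generally assigns one language per field value, mixed text like the above usually has to be split before it reaches the detector. As a rough pre-processing sketch (this is not a solr-langid feature, just an illustration), the CJK and Latin portions can be separated by Unicode range:

```python
# Split a mixed Chinese/English string into CJK and non-CJK parts by
# code point. Crude on purpose: punctuation such as "。" lands in the
# non-CJK bucket, and finely interleaved scripts need smarter handling.

def split_cjk_latin(text):
    cjk, rest = [], []
    for ch in text:
        # CJK Unified Ideographs (plus Extension A) covers most Chinese.
        if "\u3400" <= ch <= "\u9fff":
            cjk.append(ch)
        else:
            rest.append(ch)
    return "".join(cjk), "".join(rest)

zh, en = split_cjk_latin("咖哩起源於印度。Curry is a generic term.")
```

The two resulting strings could then be fed to the text_zh-tw and text_en fields separately, with language detection applied to each part.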
Highlighting a font without bold or italic modes
How do you highlight terms in languages without boldface or italic modes? Maybe raise the text size a couple of sizes just for that word? -- Lance Norskog goks...@gmail.com
RE: List of recommendation engines with solr
Hi Gora, Thanks a lot for your valuable comments, really appreciated. Yeah, you got me correctly: I am looking at Mahout, as I am using Java as my business layer with Apache Solr. Thanks, Rohan

From: Gora Mohanty-3 [via Lucene] [mailto:ml-node+s472066n3819480...@n3.nabble.com]
Sent: Monday, March 12, 2012 8:28 PM
To: Rohan Ashok Kumbhar
Subject: Re: List of recommendation engines with solr

On 12 March 2012 16:30, Rohan wrote: Hi All, I would like a list of recommendation engines that can be integrated with Solr, and a suggestion of the best one. Any comments would be appreciated!!

What exactly do you mean by that? Why is integration with Solr a requirement, and what do you expect to gain by such an integration? Best also probably depends on the context of your requirements. There are a variety of open-source recommendation engines. If you are looking at something from Apache, and in Java, Mahout might be a good choice. Regards, Gora
RE: How to index doc file in solr?
Thanks Erick, really appreciated.

From: Erick Erickson [via Lucene] [mailto:ml-node+s472066n3819585...@n3.nabble.com]
Sent: Monday, March 12, 2012 9:05 PM
To: Rohan Ashok Kumbhar
Subject: Re: How to index doc file in solr?

Consider using SolrJ, possibly combined with Tika (which is what underlies Solr Cell). http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/ Although ExtractingRequestHandler has the capability of indexing metadata as well if you map the fields. See: http://wiki.apache.org/solr/ExtractingRequestHandler Best, Erick

On Mon, Mar 12, 2012 at 11:09 AM, Rohan wrote: Hi Erick, Thanks for the valuable comments on this. I have a few Word doc files and I would like to index the metadata as well as the content of the pages; is there any way to accomplish this? Need your comments on this. Thanks, Rohan
Re: Using multiple DirectSolrSpellcheckers for a query
Hi James/Robert, Thanks for the responses. Robert: What is it about the current APIs that makes this hard? How much/what kind of refactoring would open this up? James: I didn't quite understand the usage you suggested. I thought that the spellcheck.q param shouldn't include field names, etc., and that the purpose of specifying this param is to avoid the extra parsing of field names, etc., out of the q param to get the query terms for spell checking. This is based on this bit in the SpellCheckComponent wiki: "The spellcheck.q parameter is intended to be the original query, minus any extra markup like field names, boosts, etc." Did I misunderstand something? I agree that it's impossible to know whether the query run should be corrected to sun or running in the example I gave, but I'm asking more from the angle of how to avoid correcting terms that will be matched because they exist in other, more heavily processed fields that are being searched. Since the recommendation is to build spellcheck fields from minimally processed source fields, it seems like this would be a common problem. And another, kind of unrelated question: all the examples of spellcheck dictionaries I've seen in sample solrconfig.xml files have minPrefix set to 1. Is this for performance reasons? And with this setting, we wouldn't get run as a correction for eon, right? Thanks, Nalini

On Wed, Mar 7, 2012 at 11:04 AM, Robert Muir rcm...@gmail.com wrote: On Wed, Jan 25, 2012 at 12:55 PM, Nalini Kartha nalinikar...@gmail.com wrote: Is there any reason why Solr doesn't support using multiple spellcheckers for a query? Is it because of performance overhead? That's not the case really, see https://issues.apache.org/jira/browse/SOLR-2926 I think the issue is that the spellchecker APIs need to be extended to allow this to happen more easily; there is no real performance/technical/algorithmic issue, it's just a matter of refactoring the spellchecker APIs to allow this! -- lucidimagination.com