Solr3.6 DeleteByQuery with negated field not working
Hi,

I am trying to delete some documents in my index by query. When I select them with this negated query I get all the documents I want to delete, but when I use the same query in the deleteByQuery it does not work. I'm trying to delete all elements whose value ends with 'somename/'. When I use this for selection it works and I get exactly the right documents (about 10,000, so too many to delete one by one :) )

curl http://solrip:8080/solr/core/update/?commit=true -H 'Content-Type: text/xml' --data-binary '<update><delete><query>-field:*somename/</query></delete></update>'

And here the response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">11091</int></lst>
</response>

I tried to perform it in the browser too by using /update?stream.body ... but the result is the same, and there is no error in the Solr log.

I hope someone can help me ... I don't want to do this manually :)

Regards,
Markus
Solr3.6 DeleteByQuery not working with negated query
Hi,

I am trying to delete some documents in my index by query. When I just select them with this negated query, I get all the documents I want to delete, but when I use the same query in the deleteByQuery it does not work. I'm trying to delete all elements whose value ends with 'somename/'. When I use this for selection it works and I get exactly the right documents (about 10,000, so too many to delete one by one :) )

curl http://solrip:8080/solr/core/update/?commit=true -H 'Content-Type: text/xml' --data-binary '<update><delete><query>-field:*somename/</query></delete></update>'

And here the response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">11091</int></lst>
</response>

I tried to perform it in the browser too by using /update?stream.body ... but the result is the same, and there is no error in the Solr log.

I hope someone can help me ... I don't want to do this manually :)

Regards,
Markus
Re: Solr3.6 DeleteByQuery not working with negated query
Hi Markus,

Why do you think it's not deleting anything?

Thanks,
Patrick

On 22 Oct 2012 08:36, Markus.Mirsberger markus.mirsber...@gmx.de wrote:
Re: Solr3.6 DeleteByQuery not working with negated query
Hi Patrick,

Because I have the same number of documents in my index as before I performed the query. And when I use the negated query just to select the documents, I can see they are still there (and of course all the other documents too :) )

Regards,
Markus

On 22.10.2012 14:38, Patrick Plaatje wrote:
Re: Solr3.6 DeleteByQuery not working with negated query
Did you make sure to commit after the delete?

Patrick

On 22 Oct 2012 08:43, Markus.Mirsberger markus.mirsber...@gmx.de wrote:
Re: Easy question ? docs with empty geodata field
Amit,

Your guess was perfect and the result is exactly what I expected:

fq=-location_0_coordinate:[* TO *]

to get docs with no geo data.

Thx,
Jul

--
View this message in context: http://lucene.472066.n3.nabble.com/Easy-question-docs-with-empty-geodata-field-tp4014751p4015067.html
Sent from the Solr - User mailing list archive at Nabble.com.
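As a side note for anyone sending this filter from client code: the brackets, asterisks, and space in the open-ended range query all need URL encoding. A small illustration of composing the request parameters (host and handler are placeholders, and the request itself is not issued):

```python
from urllib.parse import urlencode

# Filter query matching documents with NO value in the geo field:
# the negated open-ended range query excludes every doc that has a value.
fq = "-location_0_coordinate:[* TO *]"

# urlencode percent-encodes the brackets/asterisks and turns spaces into '+'.
params = urlencode({"q": "*:*", "fq": fq, "wt": "json"})
url = "http://localhost:8983/solr/select?" + params
print(url)
```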
Re: Solr3.6 DeleteByQuery not working with negated query
Yes, I'm sure. I committed a second time too, just to be sure.
And I tried to delete just one entry with the same command but without a negated query, and that worked. I think the problem is that it's a negated query.

Markus

On 22.10.2012 14:46, Patrick Plaatje wrote:
Re: Solr3.6 DeleteByQuery not working with negated query
3.6 has some quirks around parsing pure negative queries sometimes. Try *:* -whatever.

BTW, a syntax I like for doing delete-by-query just in a raw URL is

http://localhost:8983/solr/collection1/update?commit=true&stream.body=<delete><query>*:* -store_0_coordinate:[* TO *]</query></delete>

The curl you used is, of course, fine. I just find the above easier.

Best
Erick

On Mon, Oct 22, 2012 at 4:22 AM, Markus.Mirsberger markus.mirsber...@gmx.de wrote:
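Erick's stream.body trick works in a browser or raw URL only if the XML payload is percent-encoded. Here is a small sketch of building such a URL (host, core, and field names are placeholders; this only constructs the string rather than issuing the request):

```python
from urllib.parse import quote

# The pure-negative query fails on Solr 3.6, so prefix the match-all
# query (*:*) and subtract the unwanted set, as Erick suggests.
delete_xml = "<delete><query>*:* -field:*somename/</query></delete>"

# Percent-encode the XML so it survives as a stream.body URL parameter.
url = ("http://localhost:8983/solr/core/update"
       "?commit=true&stream.body=" + quote(delete_xml, safe=""))
print(url)
```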
Re: Query related to Solr XML
LucidWorks is a commercial product supported by LucidWorks (the company). As Hatcher already said, you really should ask the question on the LucidWorks forum:

bq: It's best to ask LucidWorks related questions at http://support.lucidworks.com rather than in this e-mail list.

As for your issue, more information is needed in order to assist. Did you start the Solr XML crawler? Does your data source show that there are documents in the index? If you simply press search (with an empty query) do you see documents?

(best, again, to respond to these questions at the LucidWorks support site)

Best
Erick

On Mon, Oct 22, 2012 at 12:56 AM, leenajawale leenajawal...@gmail.com wrote:

I started the XML Crawler. The data source shows that there are documents. But still I am unable to search. Do I need to make any changes in the config.xml file of Solr?

--
View this message in context: http://lucene.472066.n3.nabble.com/Query-related-to-Solr-XML-tp4014711p4015033.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr3.6 DeleteByQuery not working with negated query
Hi Erick,

thanks a lot. That trick fixed it :)

Regards,
Markus

On 22.10.2012 15:43, Erick Erickson wrote:
uniqueKey not enforced
Hi,

I noticed a duplicate entry in my index and I am wondering how that can be, because I have a uniqueKey defined. I have the following defined in my schema.xml:

<?xml version="1.0" ?>
<schema name="main core" version="1.1">
  <types>
    <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <!-- other field types omitted here ... -->
  </types>
  <fields>
    <!-- general -->
    <!-- id computed as a combination of media id and path -->
    <field name="id" type="string" indexed="true" stored="true" multiValued="false" />
    <!-- other fields omitted here ... -->
  </fields>
  <!-- field to use to determine and enforce document uniqueness. -->
  <uniqueKey>id</uniqueKey>
  <!-- field for the QueryParser to use when an explicit fieldname is absent -->
  <defaultSearchField>name</defaultSearchField>
  <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
  <solrQueryParser defaultOperator="OR"/>
</schema>

And now I have two records which both have the value 4b34b883-a9d9-428a-92c3-ba1a69d96a70:/Düsendrögl in their id field. Is it the non-ASCII chars that cause the uniqueness enforcement to fail? I am using Solr 3.6.1.

Any ideas what's going on?

Thanks,
Robert
Re: Solr3.6 DeleteByQuery with negated field not working
Hi,

This is how we do it in our Solr 3.4 setup:

curl http://solrip:port/solr/update?commit=true --data-binary '<delete><query>here_goes_the_query</query></delete>' -H 'Content-type:text/xml'

i.e. no extra <update>, </update> tags surrounding the delete tags.

HTH,
Dmitry

On Mon, Oct 22, 2012 at 10:29 AM, Markus.Mirsberger markus.mirsber...@gmx.de wrote:

--
Regards,
Dmitry Kan
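If the delete payload Dmitry shows is generated from code rather than typed inline, building the XML programmatically avoids hand-escaping mistakes. A minimal sketch, using the placeholder query from his example:

```python
import xml.etree.ElementTree as ET

# Build the <delete><query>...</query></delete> payload programmatically;
# ElementTree escapes any special characters in the query text for us.
delete = ET.Element("delete")
query = ET.SubElement(delete, "query")
query.text = "here_goes_the_query"

payload = ET.tostring(delete, encoding="unicode")
print(payload)  # <delete><query>here_goes_the_query</query></delete>
```

The resulting string is what goes into curl's --data-binary body.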
RE: [External] Spatial Index (polygon)
Billy,

There's a great wiki page at http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 which gives an example of indexing polygons.

-----Original Message-----
From: Billy Newman [mailto:newman...@gmail.com]
Sent: Sunday, October 21, 2012 3:27 PM
To: solr-user@lucene.apache.org
Subject: [External] Spatial Index (polygon)

Is it possible to index polygons in Solr/Lucene 4? I know you can do a polygon search, but I am not sure if you can index polygons. Anyone know?

Thanks,
Billy
Re: uniqueKey not enforced
Which release of Solr?

Is this a single node Solr or distributed or cloud?

Is it possible that you added documents with the overwrite=false attribute? That would suppress the uniqueness test.

Is it possible that you added those documents before adding the uniqueKey element to your schema, or added uniqueKey but did not restart Solr before adding those documents?

One minor difference from the Solr example schema is that your id field does not have required=true. I don't think that should matter (Solr will force the uniqueKey field to be required in documents), but I am curious how you managed to get an id field different from the Solr example.

-- Jack Krupansky

-----Original Message-----
From: Robert Krüger
Sent: Monday, October 22, 2012 5:56 AM
To: solr-user@lucene.apache.org
Subject: uniqueKey not enforced
Re: Best and quickest Solr Search Front end
My experience for the easiest query is solr/itas (aka velocity solr).

paul

On 22 Oct 2012 at 11:15, Muwonge Ronald wrote:

Hi all,
I have done some crawls for certain URLs with Nutch and indexed them to Solr. I kindly request assistance in getting the best search interface. Could you please assist me on this with examples and guidelines? I looked at solr-php-client but failed.
Thnx
Ronny
Re: need help with exact match search
Hello Jack, that was it!

thx
mark

--
View this message in context: http://lucene.472066.n3.nabble.com/need-help-with-exact-match-search-tp4014832p4015103.html
Sent from the Solr - User mailing list archive at Nabble.com.
Phonetic filter factory for indian languages
I was trying to use the phonetic filter factory. I have tried all the encoders that are available with solr.PhoneticFilterFactory, but none of them supports Indian languages. Is there any other filter/method available so that I can get phonetic representations for Indian languages, e.g. Hindi, Tamil, Bengali, etc.? If not, how can we modify the existing filters to support these languages?

--
View this message in context: http://lucene.472066.n3.nabble.com/Phonetic-filter-factory-for-indian-languages-tp4015104.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Does SolrCloud support distributed IDFs?
Hi Mark,

Mark Miller wrote:
> Still waiting on that issue. I think Andrzej should just update it to trunk and commit - it's optional and defaults to off. Go vote :)

Sounds like the problem is already solved and the remaining work consists of code integration? Can somebody estimate how much work that would be?

-Sascha
Re: Best and quickest Solr Search Front end
Thanks, let me try it.

On Mon, Oct 22, 2012 at 3:13 PM, Paul Libbrecht p...@hoplahup.net wrote:
Re: uniqueKey not enforced
On Mon, Oct 22, 2012 at 2:08 PM, Jack Krupansky j...@basetechnology.com wrote:

> Which release of Solr?

3.6.1

> Is this a single node Solr or distributed or cloud?

single node, actually embedded in an application.

> Is it possible that you added documents with the overwrite=false attribute? That would suppress the uniqueness test.

no, I just used SolrServer.add(Collection<SolrInputDocument> docs)

> Is it possible that you added those documents before adding the uniqueKey element to your schema, or added uniqueKey but did not restart Solr before adding those documents?

no, the element has been there for months, and the index had been created from scratch just before the test

> One minor difference from the Solr example schema is that your id field does not have required=true. I don't think that should matter (Solr will force the uniqueKey field to be required in documents), but I am curious how you managed to get an id field different from the Solr example.

so am I ;-). I will add the required attribute, though. It cannot hurt.
Solr Implementation Plan and FTE for Install/Maintenance
All -

I'm a bit new to Solr and looking for documentation or guides on implementing Solr as an enterprise search solution in place of some other products we are currently using. Ideally, I'd like to find information about:

* General Solr server hardware requirements and approximate starting size for a 3 million document index
* Approximate time to set up and configure Solr for a 3 million document index
* Number of FTEs that folks typically see to set up and configure Solr
* Approximate number of FTEs necessary to maintain Solr on an ongoing basis

Any general FTE information, implementation timeline information, or cost comparison data you may have, I'd find extremely interesting. I've looked for this type of data in blogs and on the Lucene site but haven't been able to find much information in these areas.

Thanks!
Seth
Re: Occasional Solr performance issues
When Solr is slow, I'm seeing these in the logs:

[collection1] Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
[collection1] PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Googling, I found this in the FAQ:

"Typically the way to avoid this error is to either reduce the frequency of commits, or reduce the amount of warming a searcher does while it's on deck (by reducing the work in newSearcher listeners, and/or reducing the autowarmCount on your caches)"
http://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F

I happen to know that the script will try to commit once every 60 seconds. How does one reduce the work in newSearcher listeners? What effect will this have? What effect will reducing the autowarmCount on caches have?

Thanks.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
Re: Best and quickest Solr Search Front end
Further on that, in recent versions of Solr it's /browse, not the sillier /itas handler name.

As far as the best search front end, it's such an opinionated answer here. It all really depends on what technologies you'd like to deploy. The library world has created two nice front-ends that are more or less general purpose enough to use for other (non-library) schemas, with a bit of configuration. There's Blacklight (Ruby on Rails) and VuFind (PHP). As the initial creator of Blacklight, I'll toss in my vote for that one as the best :) But again, it depends on many factors what's the right choice for your environment.

You can learn more about Blacklight at http://projectblacklight.org/, and see many examples of it deployed in production here: https://github.com/projectblacklight/blacklight/wiki/Examples

Erik

On Oct 22, 2012, at 08:13, Paul Libbrecht wrote:
Re: Occasional Solr performance issues
Hello!

You can check if long warming is causing the overlapping searchers. Check the Solr admin panel and look at the cache statistics; there should be a warmupTime property.

Lowering the autowarmCount should lower the time needed to warm up; however, you can also look at your warming queries (if you have such) and see how long they take.

--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
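The arithmetic behind the warning is worth spelling out: a new searcher is warmed on every commit, so if warming takes longer than the interval between commits, warming searchers stack up until the maxWarmingSearchers limit is hit. A toy illustration of that relationship (not Solr's actual code):

```python
# Rough rule of thumb behind the "Overlapping onDeckSearchers" warning:
# if warming a searcher takes longer than the gap between commits,
# warming searchers pile up. (Illustration only, not Solr internals.)
def searchers_overlap(warmup_time_s: float, commit_interval_s: float) -> bool:
    return warmup_time_s > commit_interval_s

# Committing every 60 s is fine while warmup stays under a minute...
assert not searchers_overlap(warmup_time_s=20, commit_interval_s=60)
# ...but a 90 s warmup with 60 s commits means overlapping searchers.
assert searchers_overlap(warmup_time_s=90, commit_interval_s=60)
```

This is why the FAQ offers the two levers it does: either widen the commit interval or shrink the warmup time (smaller autowarmCount, lighter newSearcher listeners).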
Re: Occasional Solr performance issues
Are you using Solr 3X? The occasional long commit should no longer show up in Solr 4.

- Mark

On Mon, Oct 22, 2012 at 10:44 AM, Dotan Cohen dotanco...@gmail.com wrote:

I've got a script writing ~50 documents to Solr at a time, then committing. Each of these documents is no longer than 1 KiB of text, some much less. Usually the write-and-commit will take 1-2 seconds or less, but sometimes it can go over 60 seconds. During a recent time of over-60-second write-and-commits, I saw that the server did not look overloaded:

$ uptime
14:36:46 up 19:20, 1 user, load average: 1.08, 1.16, 1.16

$ free -m
             total       used       free     shared    buffers     cached
Mem:         14980       2091      12889          0        233       1243
-/+ buffers/cache:        613      14366
Swap:            0          0          0

Other than Solr, nothing is running on this machine other than stock Ubuntu Server services (no Apache, no MySQL). The machine is running on an Extra Large Amazon EC2 instance, with a virtual 4-core 2.4 GHz Xeon processor and ~16 GiB of RAM. The Solr home is on a mounted EBS volume.

What might make some queries take so long, while others perform fine?

Thanks.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 5:02 PM, Rafał Kuć r@solr.pl wrote:

> You can check if the long warming is causing the overlapping searchers. Check Solr admin panel and look at cache statistics, there should be warmupTime property.

Thank you, I have gone over the Solr admin panel twice and I cannot find the cache statistics. Where are they?

> Lowering the autowarmCount should lower the time needed to warm up, however you can also look at your warming queries (if you have such) and see how long they take.

Thank you, I will look at that!

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 5:27 PM, Mark Miller markrmil...@gmail.com wrote: Are you using Solr 3X? The occasional long commit should no longer show up in Solr 4. Thank you Mark. In fact, this is the production release of Solr 4. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com
Re: uniqueKey not enforced
And, are you using UUIDs or providing specific key values?

-- Jack Krupansky

-----Original Message-----
From: Robert Krüger
Sent: Monday, October 22, 2012 9:22 AM
To: solr-user@lucene.apache.org
Subject: Re: uniqueKey not enforced
Re: uniqueKey not enforced
On Mon, Oct 22, 2012 at 6:01 PM, Jack Krupansky j...@basetechnology.com wrote:

> And, are you using UUID's or providing specific key values?

specific key values
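Jack's earlier overwrite=false point can be pictured with a toy model: with the default overwrite=true, an add with an existing uniqueKey replaces the old document, while overwrite=false skips that check and lets duplicates in. This is only an analogy (a Python list standing in for the index, not Solr internals), and it also shows that a non-ASCII key poses no problem for exact string comparison:

```python
# Toy model of uniqueKey handling (an analogy, not Solr internals).
index = []

def add(doc, overwrite=True):
    if overwrite:
        # Default behaviour: drop any existing doc with the same id first.
        index[:] = [d for d in index if d["id"] != doc["id"]]
    index.append(doc)

# The non-ASCII key from the thread compares exactly like any other string.
key = "4b34b883-a9d9-428a-92c3-ba1a69d96a70:/Düsendrögl"
add({"id": key, "v": 1})
add({"id": key, "v": 2})
assert len(index) == 1 and index[0]["v"] == 2  # default: last add wins

add({"id": key, "v": 3}, overwrite=False)
assert len(index) == 2                         # overwrite=false -> duplicate
```

So duplicate ids in a real index point at something like overwrite=false adds or documents indexed before the uniqueKey took effect, rather than at the characters in the key.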
Re: Occasional Solr performance issues
On 10/22/2012 9:58 AM, Dotan Cohen wrote:
> Thank you, I have gone over the Solr admin panel twice and I cannot find the cache statistics. Where are they?

If you are running Solr4, you can see individual cache autowarming times here, assuming your core is named collection1:

http://server:port/solr/#/collection1/plugins/cache?entry=queryResultCache
http://server:port/solr/#/collection1/plugins/cache?entry=filterCache

The warmup time for the entire searcher can be found here:

http://server:port/solr/#/collection1/plugins/core?entry=searcher

If you are on an older Solr release, everything is in various sections of the stats page. Do a page search for warmup multiple times to see them all:

http://server:port/solr/corename/admin/stats.jsp

Thanks,
Shawn
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 7:29 PM, Shawn Heisey s...@elyograg.org wrote:

Thank you Shawn! I can see how I missed that data. I'm reviewing it now. Solr has a low barrier to entry, but quite a learning curve. I'm loving it!

I see that the server is using less than 2 GiB of memory, whereas it is a dedicated Solr server with 16 GiB of memory. I understand that I can increase the query and document caches to improve performance, but I worry that this will increase the warm-up time to unacceptable levels. What is a good strategy for increasing the caches yet preserving performance after an optimize operation?

Thanks.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
Re: Occasional Solr performance issues
Perhaps you can grab a snapshot of the stack traces when the 60 second delay is occurring? You can get the stack traces right in the admin ui, or you can use another tool (jconsole, visualvm, jstack cmd line, etc) - Mark On Mon, Oct 22, 2012 at 1:47 PM, Dotan Cohen dotanco...@gmail.com wrote: On Mon, Oct 22, 2012 at 7:29 PM, Shawn Heisey s...@elyograg.org wrote: On 10/22/2012 9:58 AM, Dotan Cohen wrote: Thank you, I have gone over the Solr admin panel twice and I cannot find the cache statistics. Where are they? If you are running Solr4, you can see individual cache autowarming times here, assuming your core is named collection1: http://server:port/solr/#/collection1/plugins/cache?entry=queryResultCache http://server:port/solr/#/collection1/plugins/cache?entry=filterCache The warmup time for the entire searcher can be found here: http://server:port/solr/#/collection1/plugins/core?entry=searcher Thank you Shawn! I can see how I missed that data. I'm reviewing it now. Solr has a low barrier to entry, but quite a learning curve. I'm loving it! I see that the server is using less than 2 GiB of memory, whereas it is a dedicated Solr server with 16 GiB of memory. I understand that I can increase the query and document caches to increase performance, but I worry that this will increase the warm-up time to unacceptable levels. What is a good strategy for increasing the caches yet preserving performance after an optimize operation? Thanks. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com -- - Mark
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller markrmil...@gmail.com wrote: Perhaps you can grab a snapshot of the stack traces when the 60 second delay is occurring? You can get the stack traces right in the admin ui, or you can use another tool (jconsole, visualvm, jstack cmd line, etc) Thanks. I've refactored so that the index is optimized once per hour, instead of after each batch of commits. But when I need to increase the optimize frequency in the future I will go through the stack traces. Thanks! In any case, the server has an extra 14 GiB of memory available. How might I make the best use of that for Solr, assuming both heavy reads and writes? Thanks. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com
Re: Occasional Solr performance issues
First, stop optimizing. You do not need to manually force merges. The system does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and might be the cause of your problem. Second, the OS will use the extra memory for file buffers, which really helps performance, so you might not need to do anything. This will work better after you stop forcing merges. A forced merge replaces every file, so the OS needs to reload everything into file buffers. wunder On Oct 22, 2012, at 12:55 PM, Dotan Cohen wrote: On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller markrmil...@gmail.com wrote: Perhaps you can grab a snapshot of the stack traces when the 60 second delay is occurring? You can get the stack traces right in the admin ui, or you can use another tool (jconsole, visualvm, jstack cmd line, etc) Thanks. I've refactored so that the index is optimized once per hour, instead after each dump of commits. But when I will need to increase the optmize frequency in the future I will go through the stack traces. Thanks! In any case, the server has an extra 14 GiB of memory available, how might I make the best use of that for Solr assuming both heavy reads and writes? Thanks. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com
Re: Occasional Solr performance issues
Has the Solr team considered renaming the optimize function to avoid leading people down the path of this antipattern? Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Mon, Oct 22, 2012 at 4:01 PM, Walter Underwood wun...@wunderwood.org wrote: First, stop optimizing. You do not need to manually force merges. The system does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and might be the cause of your problem. Second, the OS will use the extra memory for file buffers, which really helps performance, so you might not need to do anything. This will work better after you stop forcing merges. A forced merge replaces every file, so the OS needs to reload everything into file buffers. wunder On Oct 22, 2012, at 12:55 PM, Dotan Cohen wrote: On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller markrmil...@gmail.com wrote: Perhaps you can grab a snapshot of the stack traces when the 60 second delay is occurring? You can get the stack traces right in the admin ui, or you can use another tool (jconsole, visualvm, jstack cmd line, etc) Thanks. I've refactored so that the index is optimized once per hour, instead after each dump of commits. But when I will need to increase the optmize frequency in the future I will go through the stack traces. Thanks! In any case, the server has an extra 14 GiB of memory available, how might I make the best use of that for Solr assuming both heavy reads and writes? Thanks. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com
Re: Occasional Solr performance issues
Lucene already did that: https://issues.apache.org/jira/browse/LUCENE-3454 Here is the Solr issue: https://issues.apache.org/jira/browse/SOLR-3141 People over-use this regardless of the name. In Ultraseek Server, it was called force merge and we had to tell people to stop doing that nearly every month. wunder On Oct 22, 2012, at 1:39 PM, Michael Della Bitta wrote: Has the Solr team considered renaming the optimize function to avoid leading people down the path of this antipattern? Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Mon, Oct 22, 2012 at 4:01 PM, Walter Underwood wun...@wunderwood.org wrote: First, stop optimizing. You do not need to manually force merges. The system does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and might be the cause of your problem. Second, the OS will use the extra memory for file buffers, which really helps performance, so you might not need to do anything. This will work better after you stop forcing merges. A forced merge replaces every file, so the OS needs to reload everything into file buffers. wunder On Oct 22, 2012, at 12:55 PM, Dotan Cohen wrote: On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller markrmil...@gmail.com wrote: Perhaps you can grab a snapshot of the stack traces when the 60 second delay is occurring? You can get the stack traces right in the admin ui, or you can use another tool (jconsole, visualvm, jstack cmd line, etc) Thanks. I've refactored so that the index is optimized once per hour, instead after each dump of commits. But when I will need to increase the optmize frequency in the future I will go through the stack traces. Thanks! In any case, the server has an extra 14 GiB of memory available, how might I make the best use of that for Solr assuming both heavy reads and writes? Thanks. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com -- Walter Underwood wun...@wunderwood.org
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 4:39 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote: Has the Solr team considered renaming the optimize function to avoid leading people down the path of this antipattern? If it were never the right thing to do, it could simply be removed. The problem is that it's sometimes the right thing to do - but it depends heavily on the use cases and trade-offs. The best thing is to simply document what it does and the cost of doing it. -Yonik http://lucidworks.com
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 10:01 PM, Walter Underwood wun...@wunderwood.org wrote: First, stop optimizing. You do not need to manually force merges. The system does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and might be the cause of your problem. Thanks. Looking at the index statistics, I see that within minutes of running optimize, the stats say the index needs to be reoptimized. Though the index still reads and writes fine even in that state. Second, the OS will use the extra memory for file buffers, which really helps performance, so you might not need to do anything. This will work better after you stop forcing merges. A forced merge replaces every file, so the OS needs to reload everything into file buffers. I don't see that the memory is being used:

$ free -g
             total       used       free     shared    buffers     cached
Mem:            14          2         12          0          0          1
-/+ buffers/cache:           0         14
Swap:            0          0          0

-- Dotan Cohen http://gibberish.co.il http://what-is-what.com
Re: [/solr] memory leak prevent tomcat shutdown
any input on this? thanks Jie -- View this message in context: http://lucene.472066.n3.nabble.com/solr-memory-leak-prevent-tomcat-shutdown-tp4014788p4015265.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Occasional Solr performance issues
On Mon, Oct 22, 2012 at 10:44 PM, Walter Underwood wun...@wunderwood.org wrote: Lucene already did that: https://issues.apache.org/jira/browse/LUCENE-3454 Here is the Solr issue: https://issues.apache.org/jira/browse/SOLR-3141 People over-use this regardless of the name. In Ultraseek Server, it was called force merge and we had to tell people to stop doing that nearly every month. Thank you for those links. I commented on the Solr bug. There are some very insightful comments in there. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com
solr 4.1 compression
Can someone provide an example configuration showing how to use the new compression in Solr 4.1? http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene
Solr Cloud Questions
I have a few questions regarding Solr Cloud. I've been following it for quite some time but I believe it wasn't ever production ready. I see that with the release of 4.0 it's considered stable… is that the case? Can anyone out there share your experiences with Solr Cloud in a production environment?
Re: Occasional Solr performance issues
On 10/22/2012 3:11 PM, Dotan Cohen wrote: On Mon, Oct 22, 2012 at 10:01 PM, Walter Underwood wun...@wunderwood.org wrote: First, stop optimizing. You do not need to manually force merges. The system does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and might be the cause of your problem. Thanks. Looking at the index statistics, I see that within minutes of running optimize, the stats say the index needs to be reoptimized. Though the index still reads and writes fine even in that state. As soon as you make any change at all to an index, it's no longer optimized. Delete one document, add one document, anything. Most of the time you will not see a performance increase from optimizing an index that consists of one large segment and a bunch of very tiny segments or deleted documents. Second, the OS will use the extra memory for file buffers, which really helps performance, so you might not need to do anything. This will work better after you stop forcing merges. A forced merge replaces every file, so the OS needs to reload everything into file buffers. I don't see that the memory is being used:

$ free -g
             total       used       free     shared    buffers     cached
Mem:            14          2         12          0          0          1
-/+ buffers/cache:           0         14
Swap:            0          0          0

How big is your index, and did you run this right after a reboot? If you did, then the cache will be fairly empty, and Solr has only read enough from the index files to open the searcher. The number is probably too small to show up on a gigabyte scale. As you issue queries, the cached amount will get bigger. If your index is small enough to fit in the 14GB of free RAM that you have, you can manually populate the disk cache by going to your index directory and doing 'cat * > /dev/null' from the commandline or a script. The first time you do it, it may go slowly, but if you immediately do it again, it will complete VERY fast -- the data will all be in RAM.
The 'free -m' command in your first email shows cache usage of 1243MB, which suggests that maybe your index is considerably smaller than your available RAM. Having loads of free RAM is a good thing for just about any workload, but especially for Solr. Try running the free command without the -g so you can see those numbers in kilobytes. I have seen a tendency towards creating huge caches in Solr because people have lots of memory. It's important to realize that the OS is far better at the overall job of caching the index files than Solr itself is. Solr caches are meant to cache result sets from queries and filters, not large sections of the actual index contents. Make the caches big enough that you see some benefit, but not big enough to suck up all your RAM. If you are having warm time problems, make the autowarm counts low. I have run into problems with warming on my filter cache, because we have filters that are extremely hairy and slow to run. I had to reduce my autowarm count on the filter cache to FOUR, with a cache size of 512. When it is 8 or higher, it can take over a minute to autowarm. Thanks, Shawn
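For reference, a filter cache tuned along the lines Shawn describes might look like this in solrconfig.xml. This is a sketch, not his actual configuration; the cache class and the 512/4 numbers simply mirror the figures mentioned above and should be treated as a starting point for your own tuning:

```
<!-- Sketch: filterCache sized 512 with a deliberately small autowarm
     count, to keep searcher warmup time bounded when filters are slow. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="4"/>
```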
Re: Occasional Solr performance issues
On Tue, Oct 23, 2012 at 3:52 AM, Shawn Heisey s...@elyograg.org wrote: As soon as you make any change at all to an index, it's no longer optimized. Delete one document, add one document, anything. Most of the time you will not see a performance increase from optimizing an index that consists of one large segment and a bunch of very tiny segments or deleted documents. I've since realized that by experimentation. I've probably saved quite a few minutes of reading time by investing hours of experiment time! How big is your index, and did you run this right after a reboot? If you did, then the cache will be fairly empty, and Solr has only read enough from the index files to open the searcher. The number is probably too small to show up on a gigabyte scale. As you issue queries, the cached amount will get bigger. If your index is small enough to fit in the 14GB of free RAM that you have, you can manually populate the disk cache by going to your index directory and doing 'cat * > /dev/null' from the commandline or a script. The first time you do it, it may go slowly, but if you immediately do it again, it will complete VERY fast -- the data will all be in RAM. The cat trick to get the files in RAM is great. I would not have thought that would work for binary files. The index is small, much less than the available RAM, for the time being. Therefore, there was nothing to fill the cache with, I now understand. Both 'free' outputs were after the system had been running for some time. The 'free -m' command in your first email shows cache usage of 1243MB, which suggests that maybe your index is considerably smaller than your available RAM. Having loads of free RAM is a good thing for just about any workload, but especially for Solr. Try running the free command without the -g so you can see those numbers in kilobytes. I have seen a tendency towards creating huge caches in Solr because people have lots of memory.
It's important to realize that the OS is far better at the overall job of caching the index files than Solr itself is. Solr caches are meant to cache result sets from queries and filters, not large sections of the actual index contents. Make the caches big enough that you see some benefit, but not big enough to suck up all your RAM. I see, thanks. If you are having warm time problems, make the autowarm counts low. I have run into problems with warming on my filter cache, because we have filters that are extremely hairy and slow to run. I had to reduce my autowarm count on the filter cache to FOUR, with a cache size of 512. When it is 8 or higher, it can take over a minute to autowarm. I will have to experiment with the warming. Thank you for the tips. -- Dotan Cohen http://gibberish.co.il http://what-is-what.com
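The cat trick discussed in this exchange can be sketched as a small shell demo. The throwaway directory and the segment-file name below are stand-ins for illustration; in practice you would point INDEX_DIR at your real Solr index directory (the one containing the segment files):

```shell
# Demo of pre-loading index files into the OS page cache by reading them once.
# A throwaway directory with one fake segment file stands in for a real index.
INDEX_DIR=$(mktemp -d)
head -c 1048576 /dev/urandom > "$INDEX_DIR/_0.fdt"   # stand-in segment file

cat "$INDEX_DIR"/* > /dev/null   # first pass: reads from disk, fills the cache
cat "$INDEX_DIR"/* > /dev/null   # second pass: served from RAM, much faster
```

Running the two passes under `time` makes the effect visible: the second pass finishes almost instantly because the pages are already cached.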
Re: Data Writing Performance of Solr 4.0
Thanks for the replies. I think I'll take a look at NRT. (2012/10/21 4:42), Nagendra Nagarajayya wrote: You may want to look at realtime NRT for this kind of performance: https://issues.apache.org/jira/browse/SOLR-3816 You can download realtime NRT integrated with Apache Solr from here: http://solr-ra.tgels.org Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 10/18/2012 11:50 PM, higashihara_hdk wrote: Hello everyone. I have two questions. I am considering using Solr 4.0 to perform full searches on the data output in real-time by a Storm cluster (http://storm-project.net/). 1. In particular, I'm concerned whether Solr would be able to keep up with the 2000-message-per-second throughput of the Storm cluster. What kind of throughput would I be able to expect from Solr 4.0, for example on a Xeon 2.5GHz 4-core with HDD? 2. Also, how efficiently would Solr scale with clustering? Any pertinent information would be greatly appreciated. Hideki Higashihara
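As a point of comparison, stock Solr 4.0 also offers near-real-time visibility through soft commits, configured in solrconfig.xml. The snippet below is a hypothetical sketch, not a tested recommendation; the 1-second soft-commit and 60-second hard-commit intervals are assumptions to illustrate the shape of the config:

```
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to durable storage periodically,
       without opening a new searcher each time. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: make newly indexed documents searchable quickly (NRT). -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```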
searching a database element
Hi, I have indexed from a database. I have specified a field named laptop. In the database, laptop has the value Dell. I can search laptop:Dell with the following command:

http://localhost:8983/solr/db/select/?q=laptop:Dell&start=0&rows=4&fl=laptop

Can I search for just the query string dell without specifying that it is in the laptop field? Below are my data-config and schema.xml files:

data-config.xml:

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/camerasys" user="root" password="123"/>
  <document>
    <entity name="computer" query="SELECT * FROM computer">
      <field column="laptop" name="laptop"/>
    </entity>
  </document>
</dataConfig>

schema.xml:

<field name="laptop" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>laptop</uniqueKey>

Romita Saha
Re: searching a database element
Hi, I added <defaultSearchField>laptop</defaultSearchField> to the schema.xml file. However, the query http://.../solr/db/select?q=Dell&start=0&rows=4&fl=laptop is not able to find dell. Following is the response:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">dell</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

Thanks and regards, Romita Saha From: adityab aditya_ba...@yahoo.com To: solr-user@lucene.apache.org, Date: 10/23/2012 12:01 PM Subject: Re: searching a database element If I understand correctly, you are looking for the below attribute in schema.xml to be defined: <defaultSearchField>laptop</defaultSearchField> Your query can now be http://.../solr/db/select?q=Dell&start=0&rows=4&fl=laptop -- View this message in context: http://lucene.472066.n3.nabble.com/searching-a-database-element-tp4015293p4015296.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: searching a database element
Are you applying any analyzer/tokenizer for the fieldType 'string'? (I guess not.) Your query in the response shows 'dell', whereas your stored data is 'Dell'. If you want to search ignoring case, then you might need to apply LowerCaseFilterFactory as an analyzer filter on the field, and then perform the search. -- View this message in context: http://lucene.472066.n3.nabble.com/searching-a-database-element-tp4015293p4015298.html Sent from the Solr - User mailing list archive at Nabble.com.
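A minimal sketch of the kind of field type adityab suggests, for a Solr 3.x/4.x schema.xml. The string_lc name is illustrative, not from the thread; it keeps the whole value as one token (like the plain string type) but lowercases it so that dell matches Dell:

```
<!-- Hypothetical lowercased-string field type: single token, case-insensitive. -->
<fieldType name="string_lc" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="laptop" type="string_lc" indexed="true" stored="true" required="true"/>
```

After changing the field type you would need to reindex for the analysis to take effect at query time and index time.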
Re: searching a database element
Hi, Sorry for the typo in the previous mail. I am searching for dell actually. The query is http://.../solr/db/select?q=dell&start=0&rows=4&fl=laptop I am not applying any analyzer/tokenizer for the fieldType 'string'. I also want to share my solrconfig file with you:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Thanks and regards, Romita Saha From: adityab aditya_ba...@yahoo.com To: solr-user@lucene.apache.org, Date: 10/23/2012 12:19 PM Subject: Re: searching a database element Are you applying any analyzer/tokenizer for the fieldType 'string'? (I guess not.) Your query in the response shows 'dell', whereas your stored data is 'Dell'. If you want to search ignoring case, then you might need to apply LowerCaseFilterFactory as an analyzer filter on the field, and then perform the search. -- View this message in context: http://lucene.472066.n3.nabble.com/searching-a-database-element-tp4015293p4015298.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: searching a database element
Hi, It worked. I was specifying more than one field under defaultSearchField. Once I specified just the required field, it is able to do the search. Thanks a lot for your guidance. Romita From: Romita Saha romita.s...@sg.panasonic.com To: solr-user@lucene.apache.org, Date: 10/23/2012 12:31 PM Subject: Re: searching a database element Hi, Sorry for the typo in the previous mail. I am searching for dell actually. The query is http://.../solr/db/select?q=dell&start=0&rows=4&fl=laptop I am not applying any analyzer/tokenizer for the fieldType 'string'. I also want to share my solrconfig file with you:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Thanks and regards, Romita Saha From: adityab aditya_ba...@yahoo.com To: solr-user@lucene.apache.org, Date: 10/23/2012 12:19 PM Subject: Re: searching a database element Are you applying any analyzer/tokenizer for the fieldType 'string'? (I guess not.) Your query in the response shows 'dell', whereas your stored data is 'Dell'. If you want to search ignoring case, then you might need to apply LowerCaseFilterFactory as an analyzer filter on the field, and then perform the search. -- View this message in context: http://lucene.472066.n3.nabble.com/searching-a-database-element-tp4015293p4015298.html Sent from the Solr - User mailing list archive at Nabble.com.