Re: Partial results with not enough hits
Thanks for the response. I have increased the timeout, and it did not increase execution time or system load; the real problem was that I misused the timeout.

To give you a bit of perspective: we added the timeout to guarantee some level of QoS from the search engine. Our UI allows users to construct very complex queries and, what is worse, users do not always understand what they actually need. That could become a problem if we have lots of users doing it. In that case I do not want to run such a complex query for seconds; I want to return some result along with a warning to the user that she is doing something wrong. But clearly I set the timeout too low for that and started to harm even normal queries.

Anyway, thanks everyone for the replies. The issue is fixed, and I now understand much better how timeAllowed works (which was the reason I posted to this list in the first place).

Thanks!

-- Aleksey

On 12-11-22 06:37 AM, Otis Gospodnetic wrote:

> Hi,
>
> Maybe your goal should be to make your queries faster instead of fighting
> with timeouts, which are known not to work well. What is your hardware
> like? How about your queries? What do you see in debugQuery=true output?
>
> Otis
> --
> SOLR Performance Monitoring - http://sematext.com/spm
>
> On Nov 21, 2012 6:04 PM, "Aleksey Vorona" wrote:
>
>> In all of my queries I set the timeAllowed parameter, and my application
>> is prepared for partial results. However, whenever Solr returns a partial
>> result, it is a very bad result. For example, here is the execution log
>> of a test query with a strict time limit:
>>
>>   WARNING: Query: ; Elapsed time: 120. Exceeded allowed search time: 100 ms.
>>   INFO: [] webapp=/solr path=/select params={&timeAllowed=100} hits=189 status=0 QTime=119
>>
>> And here it is without such a strict limit:
>>
>>   INFO: [] webapp=/solr path=/select params={&timeAllowed=1} hits=582 status=0 QTime=124
>>
>> The total execution time differs by a mere 5 ms, but the partial result
>> contains only about a third of the full result. Is this the expected
>> behaviour? Does that mean I can never rely on partial results? I added
>> timeAllowed to protect against overly expensive wide queries, but I still
>> want to return something relevant to the user. This query returned 30% of
>> the full result, but I have other queries in the log where the partial
>> result is simply empty. Am I doing something wrong?
>>
>> P.S. I am using Solr 3.6.1; the index is 3 GB and easily fits in memory.
>> Load average on the Solr box is very low.
>>
>> -- Aleksey
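When timeAllowed is exceeded, Solr marks the response with a partialResults flag in the responseHeader, so a client can at least detect the situation instead of silently trusting a truncated hit count. A minimal Python sketch of that check, assuming the standard wt=json response format (the sample response body below is fabricated from the log figures in this thread):

```python
import json

def is_partial(response_text):
    """Return (partial, num_found) for a Solr wt=json response body.

    'partialResults' appears in the responseHeader when timeAllowed
    was exceeded and Solr returned a truncated result."""
    body = json.loads(response_text)
    header = body.get("responseHeader", {})
    partial = bool(header.get("partialResults", False))
    num_found = body.get("response", {}).get("numFound", 0)
    return partial, num_found

# Fabricated sample matching the hits=189 / QTime=119 log line above.
sample = '''{
  "responseHeader": {"status": 0, "QTime": 119, "partialResults": true},
  "response": {"numFound": 189, "start": 0, "docs": []}
}'''
partial, hits = is_partial(sample)
# partial -> True: warn the user rather than presenting 189 as the real count
```

A UI could use this to render the "please refine your search" warning discussed later in this list rather than showing a misleadingly small result set.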
Re: Partial results with not enough hits
Thank you! That seems to be the case: I tried executing queries without sorting and with only one document in the response, and I got execution times in the same range as before.

-- Aleksey

On 12-11-21 04:07 PM, Jack Krupansky wrote:

> It could be that the time needed just to get set up to return even the
> first result is high, and each additional document is then a minimal
> increment in time. Do a query with &rows=1 (or even 0) and see what the
> minimum query time is for your query, index, and environment.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Aleksey Vorona
> Sent: Wednesday, November 21, 2012 6:04 PM
> To: solr-user@lucene.apache.org
> Subject: Partial results with not enough hits
>
> In all of my queries I set the timeAllowed parameter, and my application
> is prepared for partial results. However, whenever Solr returns a partial
> result, it is a very bad result. For example, here is the execution log
> of a test query with a strict time limit:
>
>   WARNING: Query: ; Elapsed time: 120. Exceeded allowed search time: 100 ms.
>   INFO: [] webapp=/solr path=/select params={&timeAllowed=100} hits=189 status=0 QTime=119
>
> And here it is without such a strict limit:
>
>   INFO: [] webapp=/solr path=/select params={&timeAllowed=1} hits=582 status=0 QTime=124
>
> The total execution time differs by a mere 5 ms, but the partial result
> contains only about a third of the full result. Is this the expected
> behaviour? Does that mean I can never rely on partial results? I added
> timeAllowed to protect against overly expensive wide queries, but I still
> want to return something relevant to the user. This query returned 30% of
> the full result, but I have other queries in the log where the partial
> result is simply empty. Am I doing something wrong?
>
> P.S. I am using Solr 3.6.1; the index is 3 GB and easily fits in memory.
> Load average on the Solr box is very low.
>
> -- Aleksey
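Jack's rows=0 experiment is easy to script. A rough Python sketch, assuming the example Solr at localhost:8983; the sort field "price" is purely hypothetical, and the actual requests are left commented out since they need a live server:

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR = "http://localhost:8983/solr/select"   # assumed local example Solr

def timed_query(params):
    """Fetch a Solr select URL and return elapsed wall-clock seconds."""
    url = SOLR + "?" + urlencode(params)
    start = time.perf_counter()
    urlopen(url).read()
    return time.perf_counter() - start

base = {"q": "field:[10 TO 500]", "wt": "json"}      # hypothetical query
floor_params = dict(base, rows=0)                    # setup cost only
full_params = dict(base, rows=10, sort="price asc")  # hypothetical sort field

# floor = timed_query(floor_params)   # requires a running Solr
# full  = timed_query(full_params)
# If floor is close to full, per-document cost is negligible and the setup
# work (e.g. preparing sort values) dominates, as Jack suggests.
```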
Partial results with not enough hits
In all of my queries I set the timeAllowed parameter, and my application is prepared for partial results. However, whenever Solr returns a partial result, it is a very bad result. For example, here is the execution log of a test query with a strict time limit:

  WARNING: Query: ; Elapsed time: 120. Exceeded allowed search time: 100 ms.
  INFO: [] webapp=/solr path=/select params={&timeAllowed=100} hits=189 status=0 QTime=119

And here it is without such a strict limit:

  INFO: [] webapp=/solr path=/select params={&timeAllowed=1} hits=582 status=0 QTime=124

The total execution time differs by a mere 5 ms, but the partial result contains only about a third of the full result. Is this the expected behaviour? Does that mean I can never rely on partial results? I added timeAllowed to protect against overly expensive wide queries, but I still want to return something relevant to the user. This query returned 30% of the full result, but I have other queries in the log where the partial result is simply empty. Am I doing something wrong?

P.S. I am using Solr 3.6.1; the index is 3 GB and easily fits in memory. Load average on the Solr box is very low.

-- Aleksey
All-wildcard query performance
Hi,

Our application sometimes generates queries containing a constraint of the form:

  field:[* TO *]

I expected such a query to perform the same as one that omits the "field" constraint entirely. However, the performance of the two queries differs drastically (3 ms without the all-wildcard constraint, 200 ms with it). Could someone explain the source of the difference, please? I am fixing the application so it no longer generates such queries, obviously, but I would still like to understand the logic here.

We use Solr 3.6.1. Thanks.

-- Aleksey
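As far as I understand, the two queries are not equivalent: field:[* TO *] matches only documents that actually have a value for the field, and (in Lucene 3.x) it likely does so by enumerating every term of the field, which would explain the cost. When the field is known to be populated everywhere, the constraint can simply be dropped client-side. A hypothetical sketch of that cleanup, assuming queries are simple AND-joined clauses:

```python
import re

def strip_all_wildcard(query):
    """Drop no-op 'field:[* TO *]' clauses from an AND-joined query.

    Only safe when every document has the field; otherwise the clause
    changes the result set, not just the cost."""
    clauses = [c for c in re.split(r'\s+AND\s+', query)
               if not re.fullmatch(r'\w+:\[\*\s+TO\s+\*\]', c)]
    return " AND ".join(clauses) or "*:*"   # match-all if nothing is left

print(strip_all_wildcard("color:red AND size:[* TO *]"))  # -> color:red
print(strip_all_wildcard("size:[* TO *]"))                # -> *:*
```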
Re: Solr Replication and Autocommit
Thank you both for the responses!

-- Aleksey

On 12-09-27 03:51 AM, Erick Erickson wrote:

> I'll echo Otis, nothing comes to mind... Unless you were indexing stuff
> to the _slaves_, which you should never do, now or in the past.
>
> Erick
>
> On Thu, Sep 27, 2012 at 12:00 AM, Aleksey Vorona wrote:
>
>> Hi,
>>
>> I remember having some issues with replication and autocommit
>> previously. But now we are using Solr 3.6.1: are there any known issues,
>> or any other reasons to avoid autocommit while using replication? I
>> guess not, but I want confirmation from someone confident and competent.
>>
>> -- Aleksey
Solr Replication and Autocommit
Hi,

I remember having some issues with replication and autocommit previously. But now we are using Solr 3.6.1: are there any known issues, or any other reasons to avoid autocommit while using replication? I guess not, but I want confirmation from someone confident and competent.

-- Aleksey
Re: Search by field with the space in it
Thank you for that insight. I, myself, would have liked to remove the spaces, but that is not possible in this particular project. I see that I need to learn more about Lucene; hopefully that will help me avoid some of those headaches to come.

-- Aleksey

On 12-09-19 11:42 AM, Erick Erickson wrote:

> I would _really_ recommend that you re-do your schema and take spaces out
> of your field names. That may require that you change your indexing
> program to not send spaces in dynamic field names. This is the kind of
> thing that causes endless headaches as time goes forward. You don't
> _have_ to, but I predict you'll regret it if you don't.
>
> Best
> Erick
>
> On Wed, Sep 19, 2012 at 2:11 PM, Aleksey Vorona wrote:
>
>> On 12-09-19 11:04 AM, Ahmet Arslan wrote:
>>>> I have a field with a space in its name (it is a dynamic field). How
>>>> can I execute a search on it? I tried "q=aattr_box%20%type_sc:super"
>>>> and it did not work. The field name is "aattr_box type".
>>>
>>> How about q=aattr_box\ type_sc:super
>>
>> That works! Thank you!
>>
>> Sidenote: of course I urlencode the space.
>>
>> -- Aleksey
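The escaping rule from Ahmet's answer (backslash-escape the space for the query parser, then URL-encode the whole clause for the request) can be automated. A small Python sketch, using the field name from this thread:

```python
from urllib.parse import quote

def solr_field_query(field, value):
    """Build a q parameter for a field whose name contains spaces.

    Spaces inside the field name are backslash-escaped for the Lucene
    query parser; the whole clause is then URL-encoded for the request."""
    escaped = field.replace(" ", r"\ ")
    return "q=" + quote(f"{escaped}:{value}")

print(solr_field_query("aattr_box type_sc", "super"))
# -> q=aattr_box%5C%20type_sc%3Asuper  (decodes to: aattr_box\ type_sc:super)
```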
Re: Search by field with the space in it
On 12-09-19 11:04 AM, Ahmet Arslan wrote:

>> I have a field with a space in its name (it is a dynamic field). How can
>> I execute a search on it? I tried "q=aattr_box%20%type_sc:super" and it
>> did not work. The field name is "aattr_box type".
>
> How about q=aattr_box\ type_sc:super

That works! Thank you!

Sidenote: of course I urlencode the space.

-- Aleksey
Search by field with the space in it
Hi,

I have a field with a space in its name (it is a dynamic field). How can I execute a search on it? I tried "q=aattr_box%20%type_sc:super" and it did not work. The field name is "aattr_box type".

-- Aleksey
Re: Solr not allowing persistent HTTP connections
Thank you. I did the test with curl the same way you did, and it works. I still cannot get ab ("apache benchmark") to reuse connections to Solr; I'll investigate this further.

  $ ab -c 1 -n 100 -k 'http://localhost:8983/solr/select?q=*:*' | grep Alive
  Keep-Alive requests:    0

-- Aleksey

On 12-09-06 11:07 AM, Chris Hostetter wrote:

> : Some extra information. If I use curl and force it to use HTTP 1.0, it
> : is more visible that Solr doesn't allow persistent connections:
>
> a) solr has nothing to do with it, it's entirely something under the
> control of jetty & the client.
>
> b) i think you are introducing confusion by trying to force an HTTP/1.0
> connection -- Jetty supports Keep-Alive for HTTP/1.1, but maybe not for
> HTTP/1.0?
>
> If you use curl to request multiple URLs and just let curl & jetty do
> their normal behavior (w/o trying to bypass anything or manually add
> headers) you can see that keep-alive is in fact working...
>
> $ curl -v --keepalive 'http://localhost:8983/solr/select?q=*:*' 'http://localhost:8983/solr/select?q=foo'
> * About to connect() to localhost port 8983 (#0)
> *   Trying 127.0.0.1... connected
> > GET /solr/select?q=*:* HTTP/1.1
> > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> > Host: localhost:8983
> > Accept: */*
> < HTTP/1.1 200 OK
> < Content-Type: application/xml; charset=UTF-8
> < Transfer-Encoding: chunked
> <
> * Connection #0 to host localhost left intact
> * Re-using existing connection! (#0) with host localhost
> * Connected to localhost (127.0.0.1) port 8983 (#0)
> > GET /solr/select?q=foo HTTP/1.1
> > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> > Host: localhost:8983
> > Accept: */*
> < HTTP/1.1 200 OK
> < Content-Type: application/xml; charset=UTF-8
> < Transfer-Encoding: chunked
> <
> * Connection #0 to host localhost left intact
> * Closing connection #0
>
> -Hoss
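As Hoss points out, keep-alive is negotiated between the client and Jetty; Solr itself plays no part. The mechanics can be demonstrated with the Python standard library against a throwaway local HTTP/1.1 server. This is a stand-in server, not Solr; the sketch only shows that one HTTPConnection object carries multiple sequential requests over a single TCP connection:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"          # required for keep-alive
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):          # silence request logging
        pass

# Bind an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One HTTPConnection = one TCP connection; both GETs reuse it as long as
# the server speaks HTTP/1.1 and frames responses with Content-Length.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
for path in ("/solr/select?q=*:*", "/solr/select?q=foo"):
    conn.request("GET", path)
    resp = conn.getresponse()
    data = resp.read()                     # must drain before reusing
conn.close()
server.shutdown()
```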
Re: Solr not allowing persistent HTTP connections
Some extra information. If I use curl and force it to use HTTP 1.0, it is more visible that Solr doesn't allow persistent connections:

  $ curl -v -0 'http://localhost:8983/solr/select?q=*:*' -H 'Connection: Keep-Alive'
  * About to connect() to localhost port 8983 (#0)
  *   Trying ::1... connected
  > GET /solr/select?q=*:* HTTP/1.0
  > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
  > Host: localhost:8983
  > Accept: */*
  > Connection: Keep-Alive
  >
  < HTTP/1.1 200 OK
  < Content-Type: application/xml; charset=UTF-8
  * no chunk, no close, no size. Assume close to signal end
  < ...removed the rest of the response body...

-- Aleksey

On 12-09-05 03:54 PM, Aleksey Vorona wrote:

> Hi,
>
> Running the example Solr from the 3.6.1 distribution, I cannot get it to
> keep persistent HTTP connections:
>
>   $ ab -c 1 -n 100 -k 'http://localhost:8983/solr/select?q=*:*' | grep Keep-Alive
>   Keep-Alive requests:    0
>
> What should I change to fix that?
>
> P.S. We have the same issue in production with Jetty 7, but I thought it
> would be better to ask about the Solr example, since it is easier for
> anyone to reproduce the issue.
>
> -- Aleksey
Solr not allowing persistent HTTP connections
Hi,

Running the example Solr from the 3.6.1 distribution, I cannot get it to keep persistent HTTP connections:

  $ ab -c 1 -n 100 -k 'http://localhost:8983/solr/select?q=*:*' | grep Keep-Alive
  Keep-Alive requests:    0

What should I change to fix that?

P.S. We have the same issue in production with Jetty 7, but I thought it would be better to ask about the Solr example, since it is easier for anyone to reproduce the issue.

-- Aleksey
Re: Solr and query abortion
We are working on optimizing query performance. My concern was to ensure some stable QoS: given our API and UI layout, a user may generate an expensive query, and given the nature of the service, a user may want to "hack" it. Currently our search API is a good point at which to attempt a DoS against our server. Even though a search outage would not cause any real security concern, it would not be nice. That is why I wanted to put a hard limit on query complexity. Thank you for the hint on how to do it.

As a side note, search performance with Solr is great. Only during a serious load test am I able to see those long-running queries; when there is no load, even the most expensive query I have takes less than 100 ms to process. As you said, 2.5M docs is not a very big index.

Thanks again for the reply. I am not sure whether we will implement a custom component for Solr or put query-complexity estimation code in our application, but in any case your response was greatly appreciated, because I was worried I was missing something.

-- Aleksey

On 12-08-30 05:51 AM, Erick Erickson wrote:

> The first thing I'd do is run your query with &debugQuery=on and look at
> the "timings" section. That'll tell you what component is taking all the
> time and should help you figure out where the problem is. But worst-case
> you could implement a custom component to stop processing after some set
> number of responses.
>
> 2.5M docs isn't a very big index, so I'd look at the rest of the tuning
> knobs before jumping to a solution. Also be aware that the first time,
> for instance, a sort gets performed there's a lengthy hit for warming the
> caches, so you should disregard the first few queries or do appropriate
> autowarming.
>
> Best
> Erick
>
> On Wed, Aug 29, 2012 at 1:26 PM, Aleksey Vorona wrote:
>
>> Hi, we are running Solr 3.6.1 and see an issue in our load tests. Some
>> of the queries our load-test script produces result in a huge number of
>> hits; it may go as high as 90% of all the documents we have (2.5M).
>> Those are all range queries, and I see in the log that they take much
>> more time to execute. Since such a query does not make any sense from
>> the end user's perspective, I would like to limit its performance
>> impact.
>>
>> Is it possible to abort a query after a certain number of document hits
>> or a certain elapsed time and return an error? I would render that error
>> as a "Please refine your search" message to the end user in my
>> application. I know that many sites on the web do that, and I guess most
>> of them do it with Solr.
>>
>> I tried setting the timeAllowed limit, but for some reason I did not see
>> those query times go down. I suspect that most of the time is spent not
>> in the search phase (which is the only one respecting timeAllowed, as
>> far as I know) but in the sorting phase. And I still want to abort any
>> long-running query; otherwise they accumulate over time, pushing the
>> server's load average sky high and killing performance even for regular
>> queries.
>>
>> -- Aleksey
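The application-side query-complexity estimation mentioned above might look like the following rough Python sketch. The clause patterns, cost weights, and MAX_COMPLEXITY threshold are all hypothetical and would need calibration against real load-test logs:

```python
import re

RANGE_CLAUSE = re.compile(r'\w+:\[[^\]]*\]')   # e.g. price:[10 TO 500]
WILDCARD_TERM = re.compile(r'\w+:\S*\*')       # e.g. name:foo*

def complexity_score(query, range_cost=5, wildcard_cost=3, clause_cost=1):
    """Crude pre-flight heuristic: count clauses, penalize range and
    wildcard terms, which tend to match huge fractions of the index."""
    clauses = re.split(r'\s+(?:AND|OR)\s+', query)
    score = clause_cost * len(clauses)
    score += range_cost * len(RANGE_CLAUSE.findall(query))
    score += wildcard_cost * len(WILDCARD_TERM.findall(query))
    return score

MAX_COMPLEXITY = 12   # tuning knob, hypothetical value

def check(query):
    """Reject over-complex queries before they ever reach Solr."""
    if complexity_score(query) > MAX_COMPLEXITY:
        raise ValueError("Please refine your search")
    return query

check("color:red AND size:M")                  # passes
# check("a:[* TO *] AND b:[0 TO 9] AND c:x*")  # would raise ValueError
```

Rejected queries map naturally onto the "Please refine your search" message discussed above, without Solr ever seeing the expensive query.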
Re: Null Pointer Exception on DIH with MySQL
Thank you for the reply. We rebuilt Solr from sources, reinstalled it, and the problem went away. Since it was never reproducible on any other server, I blame some mysterious Java bytecode corruption on that server; an assumption I will never be able to verify, because we did not keep a copy of the previous binaries.

-- Aleksey

On 12-08-29 06:17 PM, Erick Erickson wrote:

> Not much information to go on here. Have you tried the DIH debugging
> console? See: http://wiki.apache.org/solr/DataImportHandler#interactive
>
> Best
> Erick
>
> On Mon, Aug 27, 2012 at 7:22 PM, Aleksey Vorona wrote:
>
>> We have Solr 3.6.1 running on Jetty (7.x), using DIH to get data from a
>> MySQL database. In one of our environments the import always fails with
>> an exception: http://pastebin.com/tG28cHPe
>>
>> It is a null pointer exception caused by the connection being null. I
>> have verified that I can connect from the Solr server to the MySQL
>> server via the command-line mysql client. Does anybody know anything
>> about this exception and how to fix it? I am not able to reproduce it in
>> any other environment.
>>
>> -- Aleksey
Re: Load Testing in Solr
On 12-08-29 11:44 AM, dhaivat dave wrote:

> Hello everyone. Does anyone know of a component or tool that can be used
> for testing Solr performance?

People were recommending https://code.google.com/p/solrmeter/ earlier.

-- Aleksey
Solr and query abortion
Hi, we are running Solr 3.6.1 and see an issue in our load tests. Some of the queries our load-test script produces result in a huge number of hits; it may go as high as 90% of all the documents we have (2.5M). Those are all range queries, and I see in the log that they take much more time to execute. Since such a query does not make any sense from the end user's perspective, I would like to limit its performance impact.

Is it possible to abort a query after a certain number of document hits or a certain elapsed time and return an error? I would render that error as a "Please refine your search" message to the end user in my application. I know that many sites on the web do that, and I guess most of them do it with Solr.

I tried setting the timeAllowed limit, but for some reason I did not see those query times go down. I suspect that most of the time is spent not in the search phase (which is the only one respecting timeAllowed, as far as I know) but in the sorting phase. And I still want to abort any long-running query; otherwise they accumulate over time, pushing the server's load average sky high and killing performance even for regular queries.

-- Aleksey
Null Pointer Exception on DIH with MySQL
We have Solr 3.6.1 running on Jetty (7.x), using DIH to get data from a MySQL database. In one of our environments the import always fails with an exception: http://pastebin.com/tG28cHPe

It is a null pointer exception caused by the connection being null. I have verified that I can connect from the Solr server to the MySQL server via the command-line mysql client. Does anybody know anything about this exception and how to fix it? I am not able to reproduce it in any other environment.

-- Aleksey