Re: Validate Query Syntax of Solr Request Before Sending
Hi, FYI, I found a solution. I'm using the SolrQueryParser (tadaa...). It needs the solrconfig.xml and solr.xml files in order to validate the query. With that in place I'm able to validate any query before sending it to the Solr server, thereby preventing unnecessary requests. /Christian -- View this message in context: http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2528183.html Sent from the Solr - User mailing list archive at Nabble.com.
Validate Query Syntax of Solr Request Before Sending
Hi, I wonder if it is possible to let the user build up a Solr query and have it validated by some Java API before sending it to Solr. Is there a parser that could help with that? I would like to help the user build a valid query as she types by showing messages like "The query is not valid" or perhaps even more advanced: "The parentheses are not balanced". Maybe one day it would also be possible to analyse the semantics of the query, like: "This query has a built-in inconsistency because the two dates you have specified require documents to be before AND after these dates". But this is far future... Regards, Christian Sonne Jensen -- View this message in context: http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2515797.html
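The "parentheses are not balanced" message can be produced on the client without any Solr classes. Below is a minimal sketch in plain Java; the class name QueryParenCheck and the message texts are made up for illustration, and it is not a replacement for a real query parser. It only looks at '(' and ')', and it assumes that parentheses inside double-quoted phrases or after a backslash should be ignored.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of Solr or Lucene.
final class QueryParenCheck {

    /**
     * Returns null when the parentheses are balanced, otherwise a short message.
     * Simplification: only '(' and ')' are checked; parentheses inside
     * double-quoted phrases or preceded by a backslash are ignored.
     */
    static String check(String query) {
        Deque<Integer> open = new ArrayDeque<>(); // positions of unmatched '('
        boolean inQuotes = false;
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes && c == '(') {
                open.push(i);
            } else if (!inQuotes && c == ')') {
                if (open.isEmpty()) {
                    return "Unmatched ')' at position " + i;
                }
                open.pop();
            }
        }
        if (inQuotes) {
            return "Unterminated quoted phrase";
        }
        if (!open.isEmpty()) {
            // peek() gives the innermost '(' that was never closed
            return "Unmatched '(' at position " + open.peek();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("title:(solr AND (lucene OR nutch)"));
    }
}
```

A check like this is cheap enough to run on every keystroke; the full syntax validation would still have to come from a real parser.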
How effective are faceted queries?
Hi, I was wondering if there are any known performance characteristics for facets. As I understand facets, they are subqueries that perform certain counts on the result set. This means that a facet will be evaluated on every shard along with the main query. But how is the facet query evaluated? If the result set is sorted, will the facet query take advantage of that when evaluating? Example: a search is done for all documents within a given range of dates by the field createdDate. The result set is sorted by that field. Would a facet query then be able to use this sorting when it counts how many documents were created per week, or per day for that matter? Kind regards, Christian Sonne Jensen -- View this message in context: http://lucene.472066.n3.nabble.com/How-effective-are-faceted-queries-tp2412689p2412689.html
Re: match count per shard and across shards
Hi, FYI: I figured out a solution myself. I wanted a smart way to get the shard count for a query (how many documents were found in each shard). The "smart" part was getting all these counts in just one query using faceting. I was asking if Solr could help with this, e.g. if it had some info about shards that I could facet on out of the box. But apparently it does not. In my situation, however, I can use my knowledge of how the shards are organised. They are organised chronologically, and I happen to know the date boundaries. My solution is simply to facet on those boundaries. In this way I can query once, include all known shards, and get their counts for the search. This may have a performance penalty, but at least for now it is a simple way. Christian Sonne Jensen -- View this message in context: http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2385061.html
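For illustration, faceting on the shard boundaries could look roughly like the sketch below. It only builds the parameter values; the class name ShardFacetQueries, the field name createdDate and the example dates are assumptions, not taken from the actual setup. Each value would be sent as its own facet.query parameter together with facet=true, and the counts come back under the facet queries in the response.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: one facet.query per chronologically organised shard.
final class ShardFacetQueries {

    /**
     * Builds one facet.query value per shard from chronologically ordered
     * boundaries; boundaries.get(i) to boundaries.get(i + 1) delimits shard i.
     */
    static List<String> build(String dateField, List<String> boundaries) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i + 1 < boundaries.size(); i++) {
            // Both ends are inclusive here, so a document stamped exactly on a
            // boundary is counted in two neighbouring ranges (a simplification).
            queries.add(dateField + ":[" + boundaries.get(i)
                    + " TO " + boundaries.get(i + 1) + "]");
        }
        return queries;
    }

    public static void main(String[] args) {
        // Assumed monthly boundaries, matching "a new shard per month".
        List<String> boundaries = Arrays.asList(
                "2010-11-01T00:00:00Z", "2010-12-01T00:00:00Z", "2011-01-01T00:00:00Z");
        System.out.println("facet=true");
        for (String q : build("createdDate", boundaries)) {
            System.out.println("facet.query=" + q);
        }
    }
}
```

The values still need URL encoding before they go into a request, or can be handed to a client library that does the encoding.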
Re: match count per shard and across shards
I'm not sure I understand this. What is the difference between multiple indexes and multiple shards? -- View this message in context: http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2382499.html
RE: match count per shard and across shards
Interesting idea. I must investigate whether this is feasible - e.g. how often a document will be reindexed from one shard to another - which can actually happen as a consequence of the way we configure our shards :-/ Thanks for the input! I was still hoping for a way to get that info from Solr. The idea is the same: facet on the Solr shard position of each document... -- View this message in context: http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2382495.html
Re: match count per shard and across shards
Indeed, the distribution across shards should be transparent. In fact, as a client I should not need to know anything about any shard. But as the current state of Solr (1.4) dictates an interface where you - as a client - must provide a list of shards, the responsibility has been shifted over to the client. Since we get so much data that we must add a new shard per month, we have to be shard-aware on the client side. My understanding of Solr is that the final response to a query is only finished when every shard in the query's shard list has been consulted. This means that the slowest ship defines the speed, so to speak. Or worse - if any shard in the list fails, then the response fails! What I hope to achieve is a way of cutting shards off the list for a query. If I more or less know how many hits a given query has in a shard, then I could control paging myself and only include the shards I know will have the documents in the shard list for the query. Otherwise I'm afraid of the performance when we get to have dozens of shards. So to summarise: We are developing a system where a given search will be performed again and again over time on an ever-increasing document base. The first time a search is done, it will be distributed across every shard in order to get a total from the beginning of time until the timestamp of the query's debut. This total is cached and thereafter maintained by querying the most recent shards from the last date until now. Mostly the documents come in chronological order, but occasionally they arrive out of order. The shards are organised by date intervals, and this means that every shard will from time to time be the target of more documents. This will induce a slight discrepancy between the cached total and the actual total. But this is a discrepancy that we can live with. But I would also like to know how many hits there are in each individual shard.
If I know this, then I can tailor-make a precise shard list for the query: because I know the offset and page size of the query, and I know how many documents are in each shard, I can calculate which shards to include. This is a lot of client-side administration, I know, but I guess - I hope - it will perform quite well... Is this idea crazy or what? -- View this message in context: http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2382411.html
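To make the calculation concrete, here is a rough sketch in plain Java of how the shard list could be derived from the offset, the page size and the cached per-shard hit counts. The class and method names are made up. It assumes the results are sorted by the same date field that partitions the shards, so that shard 0 holds the first hits, shard 1 the next ones, and so on. With cached counts that are slightly off, the selection can be slightly off as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side helper, not a Solr API.
final class ShardSelector {

    /**
     * Returns the indexes of the shards holding hits [offset, offset + pageSize).
     * hitsPerShard must be in the same order as the result sorting.
     */
    static List<Integer> shardsForPage(long[] hitsPerShard, long offset, long pageSize) {
        List<Integer> selected = new ArrayList<>();
        long end = offset + pageSize; // exclusive end of the requested page
        long shardStart = 0;          // global position of the shard's first hit
        for (int i = 0; i < hitsPerShard.length; i++) {
            long shardEnd = shardStart + hitsPerShard[i];
            // Keep the shard when its hit range overlaps the requested page.
            if (hitsPerShard[i] > 0 && shardStart < end && shardEnd > offset) {
                selected.add(i);
            }
            shardStart = shardEnd;
        }
        return selected;
    }

    /** Offset to request once the shards before firstSelectedShard are left out. */
    static long localOffset(long[] hitsPerShard, long offset, int firstSelectedShard) {
        long skipped = 0;
        for (int i = 0; i < firstSelectedShard; i++) {
            skipped += hitsPerShard[i];
        }
        return offset - skipped;
    }

    public static void main(String[] args) {
        long[] hits = {120, 80, 200}; // assumed cached counts for three shards
        List<Integer> shards = shardsForPage(hits, 190, 20);
        System.out.println("shards=" + shards
                + " start=" + localOffset(hits, 190, shards.get(0)));
    }
}
```

Note that once the earlier shards are dropped from the list, the start offset sent with the query has to be reduced by the hits that were skipped, which is what localOffset does.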
match count per shard and across shards
Hi, Is it possible to construct a Solr query that will return the total number of hits across all shards and at the same time return the number of hits per shard? I was thinking along the lines of a faceted search, but I'm not deep enough into Solr's capabilities and query parameters to figure it out. Regards, Christian Sonne Jensen -- View this message in context: http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2369627.html