Re: Validate Query Syntax of Solr Request Before Sending

2011-02-18 Thread csj

Hi,

FYI, I found out. I'm using the SolrQueryParser (tadaa...)

It needs the solrconfig.xml and the solr.xml files in order to validate the
query.

Then I'm able to validate any query before sending it to the Solr server,
thereby preventing unnecessary requests.

/Christian
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2528183.html
Sent from the Solr - User mailing list archive at Nabble.com.


Validate Query Syntax of Solr Request Before Sending

2011-02-16 Thread csj

Hi,

I wonder if it is possible to let the user build up a Solr query and have it
validated by some Java API before sending it to Solr.

Is there a parser that could help with that? I would like to help the user
build a valid query as she types by showing messages like "The query is
not valid" or perhaps even more advanced: "The parentheses are not
balanced".
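
The "parentheses are not balanced" message can be sketched on the client
without any Solr classes. This is only a rough illustration in Python, not a
replacement for a real query parser: the function name check_parentheses is
made up for this example, and it assumes that backslash escapes a character
and that parentheses inside double-quoted phrases do not count.

```python
def check_parentheses(query):
    """Return None if the parentheses balance, otherwise a short message.

    Assumptions (not taken from any Solr API): a backslash escapes the next
    character, and parentheses inside double-quoted phrases are ignored.
    """
    depth = 0
    in_quotes = False
    escaped = False
    for pos, ch in enumerate(query):
        if escaped:
            # The previous character was a backslash, so skip this one.
            escaped = False
            continue
        if ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return "Unexpected ')' at position %d" % pos
    if in_quotes:
        return "A quoted phrase is not closed"
    if depth > 0:
        return "%d opening parenthesis(es) not closed" % depth
    return None
```

A check like this only covers one class of syntax error; anything else still
has to go through a real parser.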

Maybe one day it would also be possible to analyse the semantics of the
query, like: "This query has a built-in inconsistency because the two dates
you have specified require documents to be before AND after these dates".
But that is far in the future...
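
For the date example, one possible approach is to intersect all the date
ranges that are ANDed together on the same field and report an inconsistency
when nothing is left. The sketch below (Python) assumes the ranges have
already been extracted from the query as (lower, upper) pairs, which is the
hard part and is not shown; intersect_ranges is a hypothetical helper name.

```python
from datetime import datetime


def intersect_ranges(ranges):
    """Intersect ANDed (lower, upper) date ranges on one field.

    None as a bound means "open". Returns the combined (lower, upper) range,
    or None when no document could satisfy all ranges at once.
    """
    lower, upper = None, None
    for lo, hi in ranges:
        if lo is not None and (lower is None or lo > lower):
            lower = lo  # the latest lower bound wins
        if hi is not None and (upper is None or hi < upper):
            upper = hi  # the earliest upper bound wins
    if lower is not None and upper is not None and lower > upper:
        return None  # "before AND after": nothing can match
    return (lower, upper)


# Example: documents must be after 1 March AND before 1 February.
conflict = intersect_ranges([
    (datetime(2011, 3, 1), None),
    (None, datetime(2011, 2, 1)),
])
```

Here conflict is None, which is the case where the user could be warned.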

Regards,

Christian Sonne Jensen
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2515797.html


How effective are faceted queries ?

2011-02-03 Thread csj

Hi,

I was wondering if there exist any performance characteristics for facets.
As I understand facets, they are subqueries that perform certain
counts on the result set. This means that a facet will be evaluated on every
shard along with the main query.

But how will the facet query be evaluated? If the result set is sorted, will the
facet query take advantage of that when evaluating?

Example: a search is done for all documents within a given range of dates by
the field createdDate. The result set is sorted by that field. Would a facet
query then be able to use this sorting when it counts how many documents
were created per week, or per day for that matter?
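
To make the example concrete, this is roughly the request I have in mind,
built here in Python. It uses the facet.date parameters (facet.date,
facet.date.start, facet.date.end, facet.date.gap); the dates and the helper
name weekly_facet_params are made up for illustration, and the sketch says
nothing about how Solr evaluates the facet internally.

```python
from urllib.parse import urlencode


def weekly_facet_params(field, start, end):
    """Build a query string that searches a date range, sorts by the same
    field, and asks for per-week counts via date faceting."""
    params = [
        ("q", "%s:[%s TO %s]" % (field, start, end)),
        ("sort", "%s asc" % field),
        ("facet", "true"),
        ("facet.date", field),
        ("facet.date.start", start),
        ("facet.date.end", end),
        ("facet.date.gap", "+7DAYS"),  # use "+1DAY" for per-day counts
    ]
    return urlencode(params)


# Illustrative values only.
query_string = weekly_facet_params(
    "createdDate", "2011-01-01T00:00:00Z", "2011-02-01T00:00:00Z"
)
```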

Kind regards,

Christian Sonne Jensen
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/How-effective-are-faceted-queries-tp2412689p2412689.html


Re: match count per shard and across shards

2011-01-30 Thread csj

Hi,

FYI:
I figured out a solution myself. I wanted a smart way to get the shard
count for a query (how many documents were found in each shard). The "smart"
part consisted of getting all these counts in just one query using faceting. I was
asking if Solr could help with this, e.g. whether it had some shard info that I
could facet on out of the box. But apparently it does not.

But in my situation I can use my knowledge of how the shards are organised.
They are organised chronologically, and I happen to know the date
boundaries. 

My solution is simply to facet on those boundaries. In this way I can query
once, include all known shards, and get their counts for the search. This
may have a performance penalty, but it is at least for now a simple way.
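
A minimal sketch of what such a request could look like, in Python. The
shard hosts, the date field name and the boundaries are all made up for
illustration; the sketch assumes each shard covers one date interval given by
its first and last instant, and it adds one facet.query per interval. Range
queries with square brackets include both ends, so the last instant of one
shard must not equal the first instant of the next.

```python
from urllib.parse import urlencode


def shard_count_params(query, date_field, shards):
    """Build request parameters that return one facet count per shard.

    shards: list of (host, first_instant, last_instant) tuples. This layout
    is hypothetical; it just mirrors "shards organised chronologically".
    """
    params = [
        ("q", query),
        ("rows", "0"),  # only the counts are needed here
        ("shards", ",".join(host for host, _, _ in shards)),
        ("facet", "true"),
    ]
    for _, first, last in shards:
        # One facet.query per shard's date interval.
        params.append(
            ("facet.query", "%s:[%s TO %s]" % (date_field, first, last))
        )
    return params


# Illustrative shard list only.
example_shards = [
    ("solr1:8983/solr", "2010-12-01T00:00:00Z", "2010-12-31T23:59:59.999Z"),
    ("solr2:8983/solr", "2011-01-01T00:00:00Z", "2011-01-31T23:59:59.999Z"),
]
example_params = shard_count_params("text:lucene", "createdDate", example_shards)
example_query_string = urlencode(example_params)
```

The total comes from the normal hit count of the response, and the count for
each interval should show up under the facet queries of the same response.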

Christian Sonne Jensen
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2385061.html


Re: match count per shard and across shards

2011-01-29 Thread csj

I'm not sure I understand this. What is the difference between multiple
indexes and multiple shards?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2382499.html


RE: match count per shard and across shards

2011-01-29 Thread csj

Interesting idea. I must investigate if this is a possibility - e.g. how often
will a document be reindexed from one shard to another - this can actually
happen as a consequence of the way we configure our shards :-/

Thanks for the input! I was still hoping for a way to get that info from
Solr. The idea is the same: facet on the Solr shard position of each
document...
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2382495.html


Re: match count per shard and across shards

2011-01-29 Thread csj

Indeed the distribution across shards should be transparent. In fact, as a
client I should not need to know anything about any shard. But as the
current state of Solr (1.4) dictates an interface where you - as a client -
must provide a list of shards, the responsibility has been shifted over
to the client.

Since we get so much data that we must add a new shard per month, we have to
be shard-aware on the client side. My understanding of Solr is that the
final response of a query is only finished when every shard in the query's
shard list has been consulted. This means that the slowest ship defines the
speed, so to speak. Or worse - if any shard in the list fails, then the
response fails!

What I hope to achieve is a way of cutting shards off the list for a query.
If I more or less know how many hits a given query has in a shard, then I
could control paging myself, and only include shards I know will have the
documents in the shard list for the query. Otherwise I'm worried about
performance when we get to have dozens of shards.

So to summarise: We are developing a system where a given search will be
performed again and again over time on an ever-increasing document base. The
first time a search is done, it will be distributed across every shard in
order to get a total from the beginning of time until the timestamp of
the query's debut. This total is cached and thereafter maintained by querying
the most recent shards from the last date until now.
Mostly the documents come in chronological order, but occasionally they
arrive out of order. The shards are organised by date intervals, and this
means that every shard will, from time to time, be the target of more
documents. This will induce a slight discrepancy between the cached total
and the actual total. But this is a discrepancy that we can live with.
But I would also like to know how many hits there are in each individual
shard. If I know this, then I can tailor-make a precise shard list for the
query: because I know the offset and page size of the query, and I know how
many hits are in each shard, I can calculate which shards to
include. This is a lot of client-side administration, I know, but I guess -
I hope - it will perform quite well...
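
The calculation I have in mind could be sketched like this (Python, with
made-up shard names and counts). It assumes the results are sorted the same
way the shards are organised, i.e. by date, so that every hit in an earlier
shard comes before every hit in a later shard, and it assumes the per-shard
counts are accurate; with slightly stale cached counts the page boundaries
would shift accordingly. The function name shards_for_page is hypothetical.

```python
def shards_for_page(shard_counts, offset, rows):
    """Pick the shards needed for one page of results.

    shard_counts: list of (shard, hits) in the same order as the result sort.
    Returns (shards to query, offset relative to the first selected shard).
    """
    selected = []
    local_offset = 0
    seen = 0  # hits in all shards before the current one
    page_end = offset + rows
    for shard, hits in shard_counts:
        shard_start, shard_end = seen, seen + hits
        # The shard is needed if its hit positions overlap the page.
        if hits > 0 and shard_start < page_end and shard_end > offset:
            if not selected:
                local_offset = offset - shard_start
            selected.append(shard)
        seen = shard_end
    return selected, local_offset


# Illustrative counts only: hits 0-119 in s1, 120-149 in s2, 150-349 in s3.
counts = [("s1", 120), ("s2", 30), ("s3", 200)]
page = shards_for_page(counts, offset=140, rows=20)
```

In this example the page covers hits 140-159, so only s2 and s3 are needed,
and the query would be sent to those two shards with a start of 20.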

Is this idea crazy or what?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2382411.html


match count per shard and across shards

2011-01-29 Thread csj

Hi,

Is it possible to construct a Solr query that will return the total number
of hits across all shards and, at the same time, the number of
hits per shard?

I was thinking along the lines of a faceted search, but I'm not deep enough
into Solr capabilities and query parameters to figure it out.

Regards,

Christian Sonne Jensen

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/match-count-per-shard-and-across-shards-tp2369627p2369627.html