Group count in SOLR 3.3

2012-08-22 Thread Roman Slavík

Hi guys,

we are using Solr 3.3 with SolrJ inside our Java project. In the current 
version we had to add grouping support, so we add parameters to the 
SolrQuery object like this:


query.setParam(GroupParams.GROUP, true);
query.setParam(GroupParams.GROUP_MAIN, true);
query.setParam(GroupParams.GROUP_FIELD, OUR_GROUP_FIELD);

and we get QueryResponse with results we need. Awesome!

But now I have one remaining problem: I don't know how to get the number of 
groups from the QueryResponse. I found that I must add the group.ngroups=true 
param to the query. So I did:


   query.setParam(GroupParams.GROUP_TOTAL_COUNT, true);

But the QueryResponse seems the same. There's no method like "getGroupCount()" 
and no group-count param in the header.


Am I doing something wrong? Or is it a Solr 3.3 problem? If we upgrade to a 
newer version, will it work?


Thanks for any advice!

Roman


Re: Weighted Search Results / Multi-Value Value's Not Aggregating Weight

2012-08-22 Thread David Radunz

Hey,

Please disregard this, I worked out what the actual problem was. I 
am going to post another query with something else I discovered.


Thanks :)

David

On 22/08/2012 7:24 PM, David Radunz wrote:

Hey,

I have been having some problems getting good search results when 
using weighting against many fields with multi-values. After quite a 
bit of testing, it seems to me that the problem (at least as far as 
my query is concerned) is that only one weighting is taken into 
account per field. For example, in a multi-value field, if we have 
"Comedy" and "Romance" with separate weightings for those, the one 
with the highest weighting is used (and not a combined weighting). 
Which means that searching for a romantic comedy returns "Alvin and the 
Chipmunks" (Family, Children, Comedy).


Query:

facet=on&fl=id,name,matching_genres,score,url_path,url_key,price,special_price,small_image,thumbnail,sku,stock_qty,release_date&sort=score+desc,retail_rating+desc,release_date+desc&start=&q=**+-sku:"1019660"+-movie_id:"1805"+-movie_id:"1806"+(series_names_attr_opt_id:"454282"^9000+OR+cat_id:"22"^9+OR+cat_id:"248"^9+OR+cat_id:"249"^9+OR+matching_genres:"Comedy"^9+OR+matching_genres:"Romance"^7+OR+matching_genres:"Drama"^5)&fq=store_id:"1"+AND+avail_status_attr_opt_id:"available"+AND+(format_attr_opt_id:"372619")&rows=4 



Now if I change matching_genres:"Romance"^7 to 
matching_genres:"Romance"^70 (adding a 0), suddenly the first 
result is "Sex and the City: The Movie / Sex and the City 2" (which, 
ironically, is "Drama", "Comedy", "Romance" - the very combination we 
are looking for).


So is there a way to structure my query so that all of the 
multi-value values are treated individually, aggregating the 
weighting/score?


Thanks in advance!

David






Re: Solr - Index Concurrency - Is it possible to have multiple threads write to same index?

2012-08-22 Thread ksu wildcats
Thanks for the reply Mikhail.

For our needs, speed is more important than flexibility, and we have huge
text files (e.g. blogs / articles, ~2 MB in size) that need to be read from our
filesystem and then stored in the index.

We have our app creating a separate core per client (dynamically), and there is
one instance of EmbeddedSolrServer for each core that's used for adding
documents to the index.
Each document has about 10 fields, and one of the fields has ~2 MB of data
stored (stored=true, analyzed=true).
Also, we have logic built into our webapp to dynamically create the Solr
config files (solrconfig & schema per core; filter/analyzer/handler values can
be different for each core) before creating an instance of EmbeddedSolrServer
for that core.
Another reason to go with EmbeddedSolrServer is to reduce the overhead of
transporting large data (~2 MB) over HTTP/XML.

We use this setup for building our master index, which then gets replicated
to slave servers using the replication scripts provided by Solr.
We also have the Solr admin UI integrated into our webapp (using the admin JSP
& handlers from the Solr admin UI).

We have been using this multi-core setup for more than a year now, and so far
we haven't run into any issues with EmbeddedSolrServer integrated into our
webapp.
However, I am now trying to figure out the impact if we allow multiple
threads to send requests to the EmbeddedSolrServer (same core) for adding docs
to the index simultaneously.

Our understanding was that EmbeddedSolrServer would give us better
performance than HTTP Solr for our needs.
It's quite possible that we are wrong and HTTP Solr would have given us
similar or better performance.

Also, based on documentation from the SolrWiki, I am assuming that the
EmbeddedSolrServer API is the same as the one used by HTTP Solr.
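For context, a hedged sketch of the concurrent-indexing pattern being discussed: a single EmbeddedSolrServer per core shared by a pool of worker threads, with one commit at the end. SolrServer instances are generally treated as thread-safe, but verify this against your Solr version; coreContainer, "client1", and docBatches are placeholders, not part of any real API beyond SolrJ itself:

```java
ExecutorService pool = Executors.newFixedThreadPool(4);
final EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "client1");
for (final List<SolrInputDocument> batch : docBatches) {
    pool.submit(new Runnable() {
        public void run() {
            try {
                server.add(batch); // concurrent adds to the same core
            } catch (Exception e) {
                // log and decide whether to retry or skip the batch
            }
        }
    });
}
pool.shutdown();
pool.awaitTermination(1, TimeUnit.HOURS); // enclosing method handles InterruptedException
server.commit(); // one commit at the end, not one per thread
```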

That said, can you please tell me if there is any specific downside to using
EmbeddedSolrServer that could cause issues for us down the line?

I am also interested in your comment below about indexing 1 million docs in a
few minutes. Ideally we would like to get to that speed.
I am assuming this depends on the size of the docs and the type of
analyzer/tokenizer/filters being used. Correct?
Can you please share (or point me to documentation on) how to get to this
speed for 1 million docs.
>>  - one million is a fairly small amount, in average it should be indexed
>> in few mins. I doubt that you really need to distribute indexing

Thanks
-K



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Index-Concurrency-Is-it-possible-to-have-multiple-threads-write-to-same-index-tp4002544p4002776.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Scalability of Solr Result Grouping/Field Collapsing: Millions/Billions of documents?

2012-08-22 Thread Lance Norskog
Yes, distributed grouping works, but grouping takes a lot of
resources. If you can avoid it in distributed mode, so much the better.

On Wed, Aug 22, 2012 at 3:35 PM, Tom Burton-West  wrote:
> Thanks Tirthankar,
>
> So the issue is memory use for sorting.  I'm not sure I understand how
> sorting of grouping fields  is involved with the defaults and field
> collapsing, since the default sorts by relevance not grouping field.  On
> the other hand I don't know much about how field collapsing is implemented.
>
> So far the few tests I've made haven't revealed any memory problems.  We
> are using very small string fields for grouping and I think that we
> probably only have a couple of cases where we are grouping more than a few
> thousand docs.   I will try to find a query with a lot of docs per group
> and take a look at the memory use using JConsole.
>
> Tom
>
>
> On Wed, Aug 22, 2012 at 4:02 PM, Tirthankar Chatterjee <
> tchatter...@commvault.com> wrote:
>
>>  Hi Tom,
>>
>> We had an issue where we are keeping millions of docs in a single node and
>> we were trying to group them on a string field which is nothing but full
>> file path… that caused SOLR to go out of memory…
>>
>> Erick has explained nicely in the thread as to why it won’t work and I had
>> to find another way of architecting it. 
>>
>> How do you think this is different in your case? If you want to group by a
>> string field with thousands of similar entries, I am guessing you will face
>> the same issue.
>>
>> Thanks,
>>
>> Tirthankar
>> ***Legal Disclaimer***
>> "This communication may contain confidential and privileged material for
>> the
>> sole use of the intended recipient. Any unauthorized review, use or
>> distribution
>> by others is strictly prohibited. If you have received the message in
>> error,
>> please advise the sender by reply email and delete the message. Thank you."
>> **
>>



-- 
Lance Norskog
goks...@gmail.com


Re: Solr Custom Filter Factory - How to pass parameters?

2012-08-22 Thread ksu wildcats
Thanks Erick.
I tried to do it all in the filter, but the problem I am running into there
is intercepting the final commit calls; in other words, I am unable to figure
out when the final commit will happen, so that I don't miss any data.
One option I tried is to increase the in-memory batch size and commit the
data from memory to the database in the "incrementToken" method, but this can
lead to missing data if the size of the final batch is less than the set
threshold.

I'll try using a SolrEventListener and see if that can help resolve the issues
I am running into.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Custom-Filter-Factory-How-to-pass-parameters-handle-PostProcessing-tp4002217p4002768.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Cloud assigning incorrect port to shards

2012-08-22 Thread Mark Miller
What container are you using?

Sent from my iPhone

On Aug 22, 2012, at 3:14 PM, "Buttler, David"  wrote:

> Hi,
> I have set up a Solr 4 beta cloud cluster.  I have uploaded a config 
> directory, and linked it with a configuration name.
> 
> I have started two Solr instances on two computers and added a couple of shards using 
> the Core Admin function on the admin page.
> 
> When I go to the admin cloud view, the shards all have the computer name and 
> port attached to them.  BUT, the port is the default port (8983), and not the 
> port that I assigned on the command line.  I can still connect to the correct 
> port, and not the reported port.  I anticipate that this will lead to errors 
> when I get to doing distributed query, as zookeeper seems to be collecting 
> incorrect information.
> 
> Any thoughts as to why the incorrect port is being stored in zookeeper?
> 
> Thanks,
> Dave


Re: Solr 4.0 Beta missing example/conf files?

2012-08-22 Thread Mark Miller
Yeah - we want to fix that for sure. 

Sent from my iPhone

On Aug 22, 2012, at 6:34 PM, Markus Jelsma  wrote:

> Hi,
> 
> I would think so. Perhaps something for:
> https://issues.apache.org/jira/browse/SOLR-3288 
> 
> 
> -Original message-
>> From:Tom Burton-West 
>> Sent: Wed 22-Aug-2012 22:35
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr 4.0 Beta missing example/conf files?
>> 
>> Thanks Markus!
>> 
>> Should the README.txt file in solr/example be updated to reflect this?
>> Is that something I need to enter a JIRA issue for?
>> 
>> Tom
>> 
>> On Wed, Aug 22, 2012 at 3:12 PM, Markus Jelsma
>> wrote:
>> 
>>> Hi - The example has been moved to collection1/
>>> 
>>> 
>>> 
>>> -Original message-
 From:Tom Burton-West 
 Sent: Wed 22-Aug-2012 20:59
 To: solr-user@lucene.apache.org
 Subject: Solr 4.0 Beta missing example/conf files?
 
 Hello,
 
 Usually in the example/solr file in Solr distributions there is a
>>> populated
 conf file.  However in the distribution I downloaded of solr 4.0.0-BETA,
 there is no /conf directory.   Has this been moved somewhere?
 
 Tom
 
 ls -l apache-solr-4.0.0-BETA/example/solr
 total 107
 drwxr-sr-x 2 tburtonw dlps0 May 29 13:02 bin
 drwxr-sr-x 3 tburtonw dlps   22 Jun 28 09:21 collection1
 -rw-r--r-- 1 tburtonw dlps 2259 May 29 13:02 README.txt
 -rw-r--r-- 1 tburtonw dlps 2171 Jul 31 19:35 solr.xml
 -rw-r--r-- 1 tburtonw dlps  501 May 29 13:02 zoo.cfg
 
>>> 
>> 


Re: Scalability of Solr Result Grouping/Field Collapsing: Millions/Billions of documents?

2012-08-22 Thread Tom Burton-West
Thanks Tirthankar,

So the issue is memory use for sorting.  I'm not sure I understand how
sorting of grouping fields  is involved with the defaults and field
collapsing, since the default sorts by relevance not grouping field.  On
the other hand I don't know much about how field collapsing is implemented.

So far the few tests I've made haven't revealed any memory problems.  We
are using very small string fields for grouping and I think that we
probably only have a couple of cases where we are grouping more than a few
thousand docs.   I will try to find a query with a lot of docs per group
and take a look at the memory use using JConsole.

Tom


On Wed, Aug 22, 2012 at 4:02 PM, Tirthankar Chatterjee <
tchatter...@commvault.com> wrote:

>  Hi Tom,
>
> We had an issue where we are keeping millions of docs in a single node and
> we were trying to group them on a string field which is nothing but full
> file path… that caused SOLR to go out of memory…
>
> Erick has explained nicely in the thread as to why it won’t work and I had
> to find another way of architecting it. 
>
> How do you think this is different in your case? If you want to group by a
> string field with thousands of similar entries, I am guessing you will face
> the same issue.
>
> Thanks,
>
> Tirthankar
>


RE: Solr 4.0 Beta missing example/conf files?

2012-08-22 Thread Markus Jelsma
Hi,

I would think so. Perhaps something for:
https://issues.apache.org/jira/browse/SOLR-3288 
 

-Original message-
> From:Tom Burton-West 
> Sent: Wed 22-Aug-2012 22:35
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.0 Beta missing example/conf files?
> 
> Thanks Markus!
> 
> Should the README.txt file in solr/example be updated to reflect this?
> Is that something I need to enter a JIRA issue for?
> 
> Tom
> 
> On Wed, Aug 22, 2012 at 3:12 PM, Markus Jelsma
> wrote:
> 
> > Hi - The example has been moved to collection1/
> >
> >
> >
> > -Original message-
> > > From:Tom Burton-West 
> > > Sent: Wed 22-Aug-2012 20:59
> > > To: solr-user@lucene.apache.org
> > > Subject: Solr 4.0 Beta missing example/conf files?
> > >
> > > Hello,
> > >
> > > Usually in the example/solr file in Solr distributions there is a
> > populated
> > > conf file.  However in the distribution I downloaded of solr 4.0.0-BETA,
> > > there is no /conf directory.   Has this been moved somewhere?
> > >
> > > Tom
> > >
> > > ls -l apache-solr-4.0.0-BETA/example/solr
> > > total 107
> > > drwxr-sr-x 2 tburtonw dlps0 May 29 13:02 bin
> > > drwxr-sr-x 3 tburtonw dlps   22 Jun 28 09:21 collection1
> > > -rw-r--r-- 1 tburtonw dlps 2259 May 29 13:02 README.txt
> > > -rw-r--r-- 1 tburtonw dlps 2171 Jul 31 19:35 solr.xml
> > > -rw-r--r-- 1 tburtonw dlps  501 May 29 13:02 zoo.cfg
> > >
> >
> 


Re: Solr 3.6.1: query performance is slow when asterisk is in the query

2012-08-22 Thread david3s
Ok, I'll take your suggestion, but I would still be really happy if the
wildcard searches behaved a little more intelligently (body:* not looking for
everything in the body) - more like when you do "q=*:*", which doesn't really
search for everything in every field.

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-1-query-performance-is-slow-when-asterisk-is-in-the-query-tp4002496p4002743.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.0 Beta missing example/conf files?

2012-08-22 Thread Tom Burton-West
Thanks Markus!

Should the README.txt file in solr/example be updated to reflect this?
Is that something I need to enter a JIRA issue for?

Tom

On Wed, Aug 22, 2012 at 3:12 PM, Markus Jelsma
wrote:

> Hi - The example has been moved to collection1/
>
>
>
> -Original message-
> > From:Tom Burton-West 
> > Sent: Wed 22-Aug-2012 20:59
> > To: solr-user@lucene.apache.org
> > Subject: Solr 4.0 Beta missing example/conf files?
> >
> > Hello,
> >
> > Usually in the example/solr file in Solr distributions there is a
> populated
> > conf file.  However in the distribution I downloaded of solr 4.0.0-BETA,
> > there is no /conf directory.   Has this been moved somewhere?
> >
> > Tom
> >
> > ls -l apache-solr-4.0.0-BETA/example/solr
> > total 107
> > drwxr-sr-x 2 tburtonw dlps0 May 29 13:02 bin
> > drwxr-sr-x 3 tburtonw dlps   22 Jun 28 09:21 collection1
> > -rw-r--r-- 1 tburtonw dlps 2259 May 29 13:02 README.txt
> > -rw-r--r-- 1 tburtonw dlps 2171 Jul 31 19:35 solr.xml
> > -rw-r--r-- 1 tburtonw dlps  501 May 29 13:02 zoo.cfg
> >
>


RE: Full Text Indexing for DOCX files

2012-08-22 Thread Nguyen, Vincent (CDC/OD/OADS) (CTR)
Thanks Jack, I'll give that version of SOLR a try.

Vincent Vu Nguyen
Web Applications Developer
Division of Science Quality and Translation
Office of the Associate Director for Science
Centers for Disease Control and Prevention (CDC)
404-498-0384 v...@cdc.gov
Century Bldg 2400
Atlanta, GA 30329 


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Wednesday, August 22, 2012 4:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Full Text Indexing for DOCX files

I've indexed Office 2007 .docx using Solr 3.6.

It sounds as if Solr 1.3 had an old release of Tika/POI. No big surprise there.

-- Jack Krupansky

-Original Message-
From: Nguyen, Vincent (CDC/OD/OADS) (CTR)
Sent: Wednesday, August 22, 2012 3:57 PM
To: solr-user@lucene.apache.org
Subject: Full Text Indexing for DOCX files

Has anyone been able to index DOCX files?  I get this error message when using 
Office 2007 documents:

(Location of error
unknown)org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied 
data appears to be in the Office 2007+ XML. POI only supports OLE2 Office 
documents

We're currently using Solr 1.3.

Vincent Vu Nguyen




Re: Full Text Indexing for DOCX files

2012-08-22 Thread Jack Krupansky

I've indexed Office 2007 .docx using Solr 3.6.

It sounds as if Solr 1.3 had an old release of Tika/POI. No big surprise 
there.


-- Jack Krupansky

-Original Message- 
From: Nguyen, Vincent (CDC/OD/OADS) (CTR)

Sent: Wednesday, August 22, 2012 3:57 PM
To: solr-user@lucene.apache.org
Subject: Full Text Indexing for DOCX files

Has anyone been able to index DOCX files?  I get this error message when 
using Office 2007 documents:


(Location of error 
unknown)org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied 
data appears to be in the Office 2007+ XML. POI only supports OLE2 Office 
documents


We're currently using Solr 1.3.

Vincent Vu Nguyen




Re: Does DIH commit during large import?

2012-08-22 Thread Erick Erickson
solrconfig.xml has a setting ramBufferSizeMB that can be set
to limit the memory consumed during indexing. When this limit
is reached, the buffers are flushed to the current segment. NOTE:
the segment is NOT closed, there is no implied commit here, and
the data will not be searchable until a commit happens.
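For reference, the settings discussed in this thread live in solrconfig.xml; a rough sketch of where they go (Solr 3.x layout, values illustrative rather than recommendations):

```xml
<!-- indexDefaults / mainIndex section of solrconfig.xml -->
<indexDefaults>
  <!-- flush in-memory buffers to the current segment at ~128 MB;
       this bounds indexing memory but is NOT a commit -->
  <ramBufferSizeMB>128</ramBufferSizeMB>
</indexDefaults>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- optional: auto-commit on whichever limit is hit first -->
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```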

Best
Erick

On Wed, Aug 22, 2012 at 7:10 AM, Alexandre Rafalovitch
 wrote:
> Thanks, I will look into autoCommit.
>
> I assume there are memory implications of not committing? Or is it
> just writing in a separate file and can theoretically do it
> indefinitely?
>
> Regards,
>Alex.
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)
>
>
> On Wed, Aug 22, 2012 at 2:42 AM, Lance Norskog  wrote:
>> Solr has a separate feature called 'autoCommit'. This is configured in
>> solrconfig.xml. You can set Solr to commit all documents every N
>> milliseconds or every N documents, whichever comes first. If you want
>> intermediate commits during a long DIH session, you have to use this
>> or make your own script that does commits.
>>
>> On Tue, Aug 21, 2012 at 8:48 AM, Shawn Heisey  wrote:
>>> On 8/21/2012 6:41 AM, Alexandre Rafalovitch wrote:

 I am doing an import of large records (with large full-text fields)
 and somewhere around 30 records DataImportHandler runs out of
 memory (Heap) on a TIKA import (triggered from custom Processor) and
 does roll-back. I am using store=false and trying some tricks and
 tracking possible memory leaks, but also have a question about DIH
 itself.

 What actually happens when I run DIH on a large (XML Source) job? Does
 it accumulate some sort of status in memory that it commits at the
 end? If so, can I do intermediate commits to drop the memory
 requirements? Or, will it help to do several passes over the same
 dataset and import only particular entries at a time? I am using the
 Solr 4 (alpha) UI, so I can see some of the options there.
>>>
>>>
>>> I use Solr 3.5 and a MySQL database for import, so my setup may not be
>>> completely relevant, but here is my experience.
>>>
>>> Unless you turn on autocommit in solrconfig, documents will not be
>>> searchable during the import.  If you have "commit=true" for DIH (which I
>>> believe is the default), there will be a commit at the end of the import.
>>>
>>> It looks like there's an out of memory issue filed on Solr 4 DIH with Tika
>>> that is suspected to be a bug in Tika rather than Solr.  The issue details
>>> talk about some workarounds for those who are familiar with Tika -- I'm not.
>>> The issue URL:
>>>
>>> https://issues.apache.org/jira/browse/SOLR-2886
>>>
>>> Thanks,
>>> Shawn
>>>
>>
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com


Full Text Indexing for DOCX files

2012-08-22 Thread Nguyen, Vincent (CDC/OD/OADS) (CTR)
Has anyone been able to index DOCX files?  I get this error message when using 
office 2007 documents:

(Location of error 
unknown)org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied 
data appears to be in the Office 2007+ XML. POI only supports OLE2 Office 
documents

We're currently using Solr 1.3.

Vincent Vu Nguyen




Cloud assigning incorrect port to shards

2012-08-22 Thread Buttler, David
Hi,
I have set up a Solr 4 beta cloud cluster.  I have uploaded a config directory, 
and linked it with a configuration name.

I have started two Solr instances on two computers and added a couple of shards using the 
Core Admin function on the admin page.

When I go to the admin cloud view, the shards all have the computer name and 
port attached to them.  BUT, the port is the default port (8983), and not the 
port that I assigned on the command line.  I can still connect to the correct 
port, and not the reported port.  I anticipate that this will lead to errors 
when I get to doing distributed query, as zookeeper seems to be collecting 
incorrect information.

Any thoughts as to why the incorrect port is being stored in zookeeper?

Thanks,
Dave


RE: Solr 4.0 Beta missing example/conf files?

2012-08-22 Thread Markus Jelsma
Hi - The example has been moved to collection1/

 
 
-Original message-
> From:Tom Burton-West 
> Sent: Wed 22-Aug-2012 20:59
> To: solr-user@lucene.apache.org
> Subject: Solr 4.0 Beta missing example/conf files?
> 
> Hello,
> 
> Usually in the example/solr file in Solr distributions there is a populated
> conf file.  However in the distribution I downloaded of solr 4.0.0-BETA,
> there is no /conf directory.   Has this been moved somewhere?
> 
> Tom
> 
> ls -l apache-solr-4.0.0-BETA/example/solr
> total 107
> drwxr-sr-x 2 tburtonw dlps0 May 29 13:02 bin
> drwxr-sr-x 3 tburtonw dlps   22 Jun 28 09:21 collection1
> -rw-r--r-- 1 tburtonw dlps 2259 May 29 13:02 README.txt
> -rw-r--r-- 1 tburtonw dlps 2171 Jul 31 19:35 solr.xml
> -rw-r--r-- 1 tburtonw dlps  501 May 29 13:02 zoo.cfg
> 


Solr 4.0 Beta missing example/conf files?

2012-08-22 Thread Tom Burton-West
Hello,

Usually in the example/solr file in Solr distributions there is a populated
conf file.  However in the distribution I downloaded of solr 4.0.0-BETA,
there is no /conf directory.   Has this been moved somewhere?

Tom

ls -l apache-solr-4.0.0-BETA/example/solr
total 107
drwxr-sr-x 2 tburtonw dlps0 May 29 13:02 bin
drwxr-sr-x 3 tburtonw dlps   22 Jun 28 09:21 collection1
-rw-r--r-- 1 tburtonw dlps 2259 May 29 13:02 README.txt
-rw-r--r-- 1 tburtonw dlps 2171 Jul 31 19:35 solr.xml
-rw-r--r-- 1 tburtonw dlps  501 May 29 13:02 zoo.cfg


Re: Scalability of Solr Result Grouping/Field Collapsing: Millions/Billions of documents?

2012-08-22 Thread Tom Burton-West
Hi Lance and Tirthankar,

We are currently using Solr 3.6.  I tried a search across our current 12
shards grouping by book id (record_no in our schema) and it seems to work
fine (the query with the actual urls for the shards changed is appended
below.)

I then searched for the record_no of the second group in the results to
confirm that the number of records being folded is correct. In both cases
the numFound is 505 so it seems as though the record counts for the group
are correct.  Then I tried the same search but changed the shards parameter
to limit the search to 1/2 of the shards and got numFound = 325.  This
shows that the items in the group are distributed between different shards.

What am I missing here?   What is it that you are saying does not work?

Tom
Field collapse query (IP address changed, newlines added, and shard
URLs simplified for readability):


http://solr-myhost.edu/serve-9/select?indent=on&version=2.2
&shards=shard1,shard2,shard3,shard4,shard5,shard6,...,shard12
&q=title:nature&fq=&start=0&rows=10&fl=id,author,title,volume_enumcron,score
&group=true&group.field=record_no&group.limit=2


Re: Solr Custom Filter Factory - How to pass parameters?

2012-08-22 Thread Erick Erickson
I'm reaching a bit here, haven't implemented one myself, but...

It seems like you're just dealing with some shared memory. So say
your filter records all the stuff you want to put into the DB. When
you put stuff into the shared memory, you probably have to figure
out when you should commit the batch (if you're indexing 100M docs,
you probably don't want to use up that much memory, but what do I know).
This is all done in the filter.

It seems like you could also create a SolrEventListener on
the PostCommit event
(see: http://wiki.apache.org/solr/SolrPlugins#SolrEventListener)
to put whatever remains in your list into your DB.

Of course you'd have to do some synchronization so multiple threads
played nice with each other. And you'd have to be sure to fire a commit
at the end of your indexing process if you wanted some certainty that
everything was tidied up. If some delay isn't a problem and you have
autocommit configured, then your event listener would be called when
the next autocommit happened.
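A minimal sketch of what such a listener might look like against the Solr 3.x API (FilterResultsHolder is a hypothetical shared-state class that the custom filter would populate; it is not part of Solr):

```java
public class DbFlushListener implements SolrEventListener {

    public void init(NamedList args) { }

    // called after each commit: drain whatever the filter
    // left in shared memory into the database
    public void postCommit() {
        FilterResultsHolder.flushToDatabase();
    }

    public void newSearcher(SolrIndexSearcher newSearcher,
                            SolrIndexSearcher currentSearcher) { }
}
```

It would then be registered in solrconfig.xml with something like
<listener event="postCommit" class="com.example.DbFlushListener"/>.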

FWIW
Erick

On Tue, Aug 21, 2012 at 8:19 PM, ksu wildcats  wrote:
> Jack
>
> Reading through the documentation for UpdateRequestProcessor, my
> understanding is that it's good for handling processing of documents before
> analysis.
> Is it true that processAdd (where we can have custom logic) is invoked once
> per document, before any of the analyzers get invoked?
>
> I couldn't figure out how I can use UpdateRequestProcessor to access the
> tokens stored in memory by CustomFilterFactory/CustomFilter.
>
> Can you please provide more information on how I can use
> UpdateRequestProcessor to handle any post processing that needs to be done
> after all documents are added to the index?
>
> Also does CustomFilterFactory/CustomFilter has any ways to do post
> processing after all documents are added to index?
>
> Here is the code I have for CustomFilterFactory/CustomFilter. This might
> help you understand what I am trying to do, and maybe there is a better way
> to do this.
> The main problem I have with this approach is that I am forced to write the
> results stored in memory (customMap) to the database per document, and if I
> have 1 million documents then that's 1 million DB calls. I am trying to
> reduce the number of calls made to the database by storing results in memory
> and writing them to the database once for every X documents.
>
> public class CustomFilterFactory extends BaseTokenFilterFactory {
>   public CustomFilter create(TokenStream input) {
>     String databaseName = getArgs().get("paramname");
>     return new CustomFilter(input, databaseName);
>   }
> }
>
> public class CustomFilter extends TokenFilter {
>   private TermAttribute termAtt;
>   private Map<String, Integer> customMap = new HashMap<String, Integer>();
>   private String databaseName = null;
>   private int commitSize; // batch threshold, set elsewhere
>
>   protected CustomFilter(TokenStream input, String databaseName) {
>     super(input);
>     termAtt = (TermAttribute) addAttribute(TermAttribute.class);
>     this.databaseName = databaseName;
>   }
>
>   public final boolean incrementToken() throws IOException {
>     if (!input.incrementToken()) {
>       writeResultsToDB();
>       return false;
>     }
>
>     if (addWordToCustomMap()) {
>       // do some analysis on the term and then populate customMap
>       // customMap.put(term, someValue);
>     }
>
>     if (customMap.size() > commitSize) {
>       writeResultsToDB();
>     }
>     return true;
>   }
>
>   boolean addWordToCustomMap() {
>     // custom logic - some validation on the term to determine
>     // whether it should be added to customMap
>   }
>
>   void writeResultsToDB() throws IOException {
>     // custom logic that reads data from customMap, does some
>     // analysis, and writes the results to the database
>   }
> }
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Custom-Filter-Factory-How-to-pass-parameters-tp4002217p4002531.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Which directories are required in Solr?

2012-08-22 Thread Geek Gamer
Hi,

Check out https://github.com/geek4377/jetty-solr

you can remove exampledocs from the list to get only the required dirs for
running solr.


On Wed, Aug 22, 2012 at 1:02 PM, Alexander Cougarman wrote:

> Hi. Which folders/files can be deleted from the default Solr package
> (apache-solr-3.6.1.zip) on Windows if all we'd like to do is index/store
> documents? Thanks.
>
> Sincerely,
> Alex
>
>


Re: Which directories are required in Solr?

2012-08-22 Thread Erick Erickson
Why do you care? I suspect that the example directory can be removed
assuming you're distributing the war file. But disk space is really cheap,
I suspect that tidying up the directories for aesthetic reasons isn't worth
the risk of removing something that you might need later...

Best
Erick

On Wed, Aug 22, 2012 at 3:32 AM, Alexander Cougarman  wrote:
> Hi. Which folders/files can be deleted from the default Solr package 
> (apache-solr-3.6.1.zip) on Windows if all we'd like to do is index/store 
> documents? Thanks.
>
> Sincerely,
> Alex
>


Re: Solr 3.6.1: query performance is slow when asterisk is in the query

2012-08-22 Thread david3s
Jack, sorry I forgot to answer you. We tried "[* TO *]" and the response
times are the same as doing a plain "*".



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-1-query-performance-is-slow-when-asterisk-is-in-the-query-tp4002496p4002708.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Edismax parser weird behavior

2012-08-22 Thread Jack Krupansky
Don't have an immediate answer for you on #1, but for #2, "mm" does not 
override explicit operators - "and" - it only applies to terms that are not 
the immediate operand of an explicit operator. Note that by default 
lower-case operators are enabled in edismax - "and" is treated as "AND" - 
you can set "lowercaseOperators=false" to avoid that.
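To illustrate the difference (hypothetical "auto"/"car" queries, parsing shown informally):

```
q=auto car                               -> auto OR car     (mm=0 applies)
q=auto and car                           -> auto AND car    ("and" acts as AND)
q=auto and car&lowercaseOperators=false  -> auto "and" car  ("and" is just a term)
```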


-- Jack Krupansky

-Original Message- 
From: amitesh116

Sent: Wednesday, August 22, 2012 8:13 AM
To: solr-user@lucene.apache.org
Subject: Edismax parser weird behavior

Hi, I am experiencing 2 strange behaviors in edismax:
edismax is configured to default to OR (using mm=0).
In total there are 700 results.
1. Search for *auto* = *50 results*
   Search for *NOT auto* gives *651 results*.
Mathematically, it should give only 650 results for *NOT auto*.

2. Search for *auto*  = 50 results
Search for *car =  100 results*
Search for *auto and car = 10 results*
Since we have set mm=0, it should behave like OR, and the results for auto and
car should be at least more than 100.

Please help me understand these two issues. Are these normal behaviors? Do I
need to tweak the query? Or do I need to look into the config or schema xml
files.

Thanks in Advance



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Edismax-parser-weird-behavior-tp4002626.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Solr 3.6.1: query performance is slow when asterisk is in the query

2012-08-22 Thread Jack Krupansky
You could also add a bodySize numeric (trie) field, which you can check for 
0 for empty/missing bodies.


And don't forget to check and see whether the "[* TO *]" range query might 
be faster.


-- Jack Krupansky

-Original Message- 
From: david3s

Sent: Wednesday, August 22, 2012 12:37 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6.1: query performance is slow when asterisk is in the 
query


Hello Chris, thanks a lot for your reply. But is there an alternative
solution? Because I see adding "has_body" as data duplication.

Imagine that in a relational DB you had to create extra columns because
you can't do something like "where body is not null"

If there's no other alternative I'll have to go with your suggestion that I
greatly appreciate.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-1-query-performance-is-slow-when-asterisk-is-in-the-query-tp4002496p4002698.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Index version & generation for Solr 3.5

2012-08-22 Thread Xin Li
Hi,

I ran into an issue lately with Index version & generation for Solr 3.5.

In Solr 1.4, the index version of the slave increments upon each
replication. However, I noticed it's not the case for Solr 3.5; the
index version would increase by 20 or 30 after replication. Does anyone
know why and any reference on the web for this?
The index generation does still increment after replication though.

Thanks,

Xin


Re: Solr 3.6.1: query performance is slow when asterisk is in the query

2012-08-22 Thread Michael Della Bitta
The name of the game for performance and functionality in Solr is quite
often *denormalization*, which might run against your RDBMS instincts,
but once you embrace it, you'll find that things go a lot more
smoothly.

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Aug 22, 2012 at 12:37 PM, david3s  wrote:
> Hello Chris, thanks a lot for your reply. But is there an alternative
> solution? Because I see adding "has_body" as data duplication.
>
> Imagine that in a relational DB you had to create extra columns because
> you can't do something like "where body is not null"
>
> If there's no other alternative I'll have to go with your suggestion that I
> greatly appreciate.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-3-6-1-query-performance-is-slow-when-asterisk-is-in-the-query-tp4002496p4002698.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Grammar for ComplexPhraseQueryParser

2012-08-22 Thread Ahmet Arslan
>    Does anyone have the grammar file (.jj
> file) for the complex phrase query
> parser. The patch from https://issues.apache.org/jira/browse/SOLR-1604 does
> not have the grammar file as part of it.

It does not have a separate grammar file. It just extends QueryParser.


Re: Solr 3.6.1: query performance is slow when asterisk is in the query

2012-08-22 Thread david3s
Hello Chris, thanks a lot for your reply. But is there an alternative
solution? Because I see adding "has_body" as data duplication.

Imagine that in a relational DB you had to create extra columns because
you can't do something like "where body is not null"

If there's no other alternative I'll have to go with your suggestion that I
greatly appreciate.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-1-query-performance-is-slow-when-asterisk-is-in-the-query-tp4002496p4002698.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr memory: CATALINA_OPTS in setenv.sh ?

2012-08-22 Thread Michael Della Bitta
Check your cores' "status" page and see if you're running the
MMapDirectory (you probably are.)

In that case, you probably want to devote even less RAM to Tomcat's
heap because the index files are being read out of memory-mapped pages
that don't reside on the heap, so you'd be devoting more memory to
caching them if you freed it up by lowering the heap.
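The trade-off Michael describes can be sketched with some toy arithmetic; the helper and every number below are assumptions for illustration, not an official sizing rule:

```java
public class HeapSizing {
    // Illustrative trade-off only -- not a recommendation.
    // RAM not given to the JVM heap stays available for the OS page cache,
    // which is where MMapDirectory's memory-mapped index reads come from.
    static long suggestHeapMb(long totalRamMb, long indexSizeMb) {
        // Reserve room for the index plus ~2 GB for the OS, capped at half of RAM.
        long reservedMb = Math.min(indexSizeMb + 2048, totalRamMb / 2);
        return Math.max(1024, totalRamMb - reservedMb); // keep at least a 1 GB heap
    }

    public static void main(String[] args) {
        // A 24 GB machine with an 8 GB index: leave ~10 GB outside the heap.
        System.out.println(suggestHeapMb(24576, 8192)); // -> 14336
    }
}
```

The point is simply that a smaller heap is not "wasted" RAM on an MMapDirectory setup; the real split should come from measuring cache hit rates and GC behavior.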

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Aug 22, 2012 at 12:05 PM, Bruno Mannina  wrote:
> Le 22/08/2012 16:57, Bruno Mannina a écrit :
>
>> Dear users,
>>
>> I try to know if my add in the setenv.sh (which I need to create because
>> it didn't exist) file has been set but when I click on the link Java
>> Properties on Admin Solr web page
>> I can't see the variable CATALINA_OPTS.
>>
>> In fact, I would like to know if my line added in the file setenv.sh is
>> ok:
>> |CATALINA_OPTS=||"-server -Xss7G -Xms14G -Xmx14G $CATALINA_OPTS
>> -XX:+UseConcMarkSweepGC -XX:NewSize=7G -XX:+UseParNewGC"|
>>
>> my setenv.sh file contains only this line (inside
>> /usr/share/tomcat6/bin/).
>>
>> How can I see if memory is well allocated?
>>
>> Other question: is |*-XX:NewSize=7G* is ok?|
>>
>> I have 24Go Ram (14G ~60%)
>>
> I changed the method, I edited the file tomcat6 in /etc/init.d and I modify
> the JAVA_OPTS var to:
> JAVA_OPTS="-server -Djava.awt.headless=true -Xms14G -Xmx14G"
>
> Do you think it's correct if I have 24Go Ram?
> Do you think something is missing ? like Xss or other ?
>
> I found many google pages but not really a page that explain how to choose
> the right configuration.
> I think there isn't a unique answer to this question.
>
> it seems there are several methods to adjust memory for JVM but what is the
> best ?


Query-side Join work in distributed Solr?

2012-08-22 Thread Timothy Potter
Just to clarify that query-side joins ( e.g. {!join from=id
to=parent_signal_id_s}id:foo ) do not work in a distributed mode yet?
I saw LUCENE-3759 as unresolved but also some some Twitter traffic
saying there was a patch available.

Cheers,
Tim
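For reference, the local-params join syntax from the message can be assembled like this (a sketch; the field names come from Tim's example, and per this thread the join is only expected to work within a single core, not across shards):

```java
public class JoinQuery {
    // Assemble a query-side join: "return documents whose `to` field matches
    // the `from` field of the documents selected by innerQuery".
    static String join(String from, String to, String innerQuery) {
        return "{!join from=" + from + " to=" + to + "}" + innerQuery;
    }

    public static void main(String[] args) {
        System.out.println(join("id", "parent_signal_id_s", "id:foo"));
        // -> {!join from=id to=parent_signal_id_s}id:foo
    }
}
```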


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-22 Thread Ravish Bhagdev
Commercial solutions often have a percentage that is meant to signify the
quality of a match.  Solr has a relative score, and you cannot tell by just
looking at this value whether a result is relevant enough to be on the first
page or not.  Score depends on "what else is in the index", so it is not easy
to normalize in the way you suggest.

Ravish
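Given that caveat (scores are only comparable within one result set), one client-side approximation of a percentage cut-off is to keep hits scoring at least some fraction of the top score in that same result set. A sketch; this is application code, not a Solr feature:

```java
import java.util.ArrayList;
import java.util.List;

public class ScoreFilter {
    // Keep only scores that are at least `fraction` of the best score in
    // THIS result set. Purely client-side; absolute scores mean nothing.
    static List<Double> aboveRelativeThreshold(List<Double> scores, double fraction) {
        double max = 0;
        for (double s : scores) max = Math.max(max, s);
        List<Double> kept = new ArrayList<>();
        for (double s : scores) {
            if (max > 0 && s / max >= fraction) kept.add(s);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(aboveRelativeThreshold(List.of(4.0, 2.5, 0.3), 0.5));
        // -> [4.0, 2.5]
    }
}
```

As Mou notes elsewhere in the thread, any fixed cut-off like this can leave some queries with zero visible results.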

On Wed, Aug 22, 2012 at 4:03 PM, Mou  wrote:

> Hi,
> I think that this totally depends on your requirements and thus applicable
> for a user scenario. Score does not have any absolute meaning, it is always
> relative to the query. If you want to watch some particular queries and
> want
> to show results with score above previously set threshold, you can use
> this.
>
> If I always have that x% threshold in place , there may be many queries
> which would not return anything and I certainly do not want that.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4002673.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr memory: CATALINA_OPTS in setenv.sh ?

2012-08-22 Thread Bruno Mannina

Le 22/08/2012 16:57, Bruno Mannina a écrit :

Dear users,

I am trying to check whether my addition to setenv.sh (which I needed 
to create because it didn't exist) has been applied, but when I click 
on the Java Properties link on the Solr admin web page,

I can't see the CATALINA_OPTS variable.

In fact, I would like to know if my line added in the file setenv.sh 
is ok:
|CATALINA_OPTS=||"-server -Xss7G -Xms14G -Xmx14G $CATALINA_OPTS 
-XX:+UseConcMarkSweepGC -XX:NewSize=7G -XX:+UseParNewGC"|


my setenv.sh file contains only this line (inside 
/usr/share/tomcat6/bin/).


How can I see if memory is well allocated?

Other question: is |*-XX:NewSize=7G*| ok?

I have 24 GB RAM (14G ≈ 60%)

I changed the method: I edited the tomcat6 file in /etc/init.d and 
modified the JAVA_OPTS variable to:

JAVA_OPTS="-server -Djava.awt.headless=true -Xms14G -Xmx14G"

Do you think it's correct if I have 24 GB RAM?
Do you think something is missing, like Xss or other options?

I found many Google pages but not really a page that explains how to 
choose the right configuration.

I think there isn't a unique answer to this question.

It seems there are several methods to adjust memory for the JVM, but 
which is the best?


Re: Scalability of Solr Result Grouping/Field Collapsing: Millions/Billions of documents?

2012-08-22 Thread Tom Burton-West
Hi Tirthankar,

Can you give me a quick summary of what won't work and why?
I couldn't figure it out from looking at your thread.  You seem to have a
different issue, but maybe I'm missing something here.

Tom

On Tue, Aug 21, 2012 at 7:10 PM, Tirthankar Chatterjee <
tchatter...@commvault.com> wrote:

> This wont work, see my thread on Solr3.6 Field collapsing
> Thanks,
> Tirthankar
>
>


Re: Scalability of Solr Result Grouping/Field Collapsing: Millions/Billions of documents?

2012-08-22 Thread Tom Burton-West
Hi Lance,

I don't understand enough of how the field collapsing is implemented, but I
thought it worked with distributed search.  Are you saying it only works if
everything that needs collapsing is on the same shard?

Tom

On Wed, Aug 22, 2012 at 2:41 AM, Lance Norskog  wrote:

> How do you separate the documents among the shards? Can you set up the
> shards such that one "collapse group" is only on a single shard? That
> you never have to do distributed grouping?
>
> On Tue, Aug 21, 2012 at 4:10 PM, Tirthankar Chatterjee
>  wrote:
> > This wont work, see my thread on Solr3.6 Field collapsing
> > Thanks,
> > Tirthankar
> >
> > -Original Message-
> > From: Tom Burton-West 
> > Date: Tue, 21 Aug 2012 18:39:25
> > To: solr-user@lucene.apache.org
> > Reply-To: "solr-user@lucene.apache.org" 
> > Cc: William Dueber; Phillip Farber
> > Subject: Scalability of Solr Result Grouping/Field Collapsing:
> >  Millions/Billions of documents?
> >
> > Hello all,
> >
> > We are thinking about using Solr Field Collapsing on a rather large scale
> > and wonder if anyone has experience with performance when doing Field
> > Collapsing on millions of or billions of documents (details below. )  Are
> > there performance issues with grouping large result sets?
> >
> > Details:
> > We have a collection of the full text of 10 million books/journals.  This
> > is spread across 12 shards with each shard holding about 800,000
> > documents.  When a query matches a journal article, we would like to
> group
> > all the matching articles from the same journal together. (there is a
> > unique id field identifying the journal).  Similarly when there is a
> match
> > in multiple copies of the same book we would like to group all results
> for
> > the same book together (again we have a unique id field we can group on).
> > Sometimes a short query against the OCR field will result in over one
> > million hits.  Are there known performance issues when field collapsing
> > result sets containing a million hits?
> >
> > We currently index the entire book as one Solr document.  We would like
> to
> > investigate the feasibility of indexing each page as a Solr document
> with a
> > field indicating the book id.  We could then offer our users the choice
> of
> > a list of the most relevant pages, or a list of the books containing the
> > most relevant pages.  We have approximately 3 billion pages.   Does
> anyone
> > have experience using field collapsing on this sort of scale?
> >
> > Tom
> >
> > Tom Burton-West
> > Information Retrieval Programmer
> > Digital Library Production Service
> > University of Michigan Library
> > http://www.hathitrust.org/blogs/large-scale-search
> > **Legal Disclaimer***
> > "This communication may contain confidential and privileged
> > material for the sole use of the intended recipient. Any
> > unauthorized review, use or distribution by others is strictly
> > prohibited. If you have received the message in error, please
> > advise the sender by reply email and delete the message. Thank
> > you."
> > *
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


RE: Co-existing solr cloud installations

2012-08-22 Thread Buttler, David
This is really nice.  Thanks for pointing it out.
Dave

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Tuesday, August 21, 2012 8:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Co-existing solr cloud installations

You can use a connect string of host:port/path to 'chroot' a path. I
think currently you have to manually create the path first though. See
the ZkCli tool (doc'd on SolrCloud wiki) for a simple way to do that.

I keep meaning to look into auto-creating it if it doesn't exist, but
have not gotten to it.
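The connect-string convention can be sketched as follows; the hostnames and the /solr4 path are made up, and in real code you would hand the whole string to the ZooKeeper/SolrCloud client, which applies the chroot itself:

```java
public class ZkChroot {
    // Split "host1:port,host2:port/chroot" into the server list and the
    // chroot path (empty string if no chroot is given).
    static String[] split(String connect) {
        int slash = connect.indexOf('/');
        if (slash < 0) {
            return new String[] { connect, "" };
        }
        return new String[] { connect.substring(0, slash), connect.substring(slash) };
    }

    public static void main(String[] args) {
        String[] parts = split("zk1:2181,zk2:2181,zk3:2181/solr4");
        System.out.println(parts[0]); // -> zk1:2181,zk2:2181,zk3:2181
        System.out.println(parts[1]); // -> /solr4
    }
}
```

Per Mark's caveat, the chroot node (/solr4 here) must already exist, e.g. created with the ZkCli tool, before Solr connects.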

- Mark

On Tue, Aug 21, 2012 at 4:46 PM, Buttler, David  wrote:
> Hi all,
> I would like to use a single zookeeper cluster to manage multiple Solr cloud 
> installations.  However, the current design of how Solr uses zookeeper seems 
> to preclude that.  Have I missed a configuration option to set a zookeeper 
> prefix for all of a Solr cloud configuration directories?
>
> If I look at the zookeeper data it looks like:
>
>  * /clusterstate.json
>  * /collections
>  * /configs
>  * /live_nodes
>  * /overseer
>  * /overseer_elect
>  * /zookeeper
> Is there a reason not to put all of these nodes under some user-configurable 
> higher-level node, such as /solr4?
> It could have a reasonable default value to make it just as easy to find as /.
>
> My current issue is that I have an old Solr cloud instance from back in the 
> Solr 1.5 days, and I don't expect that the new version and the old version 
> will play nice.
>
> Thanks,
> Dave
>


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-22 Thread Mou
Hi,
I think that this totally depends on your requirements and thus applicable
for a user scenario. Score does not have any absolute meaning, it is always
relative to the query. If you want to watch some particular queries and want
to show results with score above previously set threshold, you can use this. 

If I always have that x% threshold in place , there may be many queries
which would not return anything and I certainly do not want that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4002673.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: display SOLR Query in web page

2012-08-22 Thread Michael Della Bitta
It's not great to leak internal implementation details of your
application out like this, and it may be that someone more skilled at
exploiting things like this could find a way in.

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Aug 22, 2012 at 10:20 AM, Bernd Fehling
 wrote:
> I haven't spent time in trying anything, just entered a query and recognized
> that it showed up in the page source view.
> If they really escape everything it is not that dangerous?
>
> Actually I don't want to try anything with their page,
> they might not have any humor ;-)
>
> Bernd
>
>
> Am 22.08.2012 15:41, schrieb Michael Della Bitta:
>> Actually, I'm having a little trouble coming up with a
>> proof-of-concept exploit for this... it doesn't seem like Solr is
>> exposed directly, and it does seem like it's escaping submitted
>> content before redisplaying it on the page.
>>
>> I'm not crazy about leaking the raw query string into the HTML, but it
>> doesn't seem to lead to more than just that.
>>
>> Please let me know if I am missing something, it's still morningtime
>> here in the US and I haven't had enough coffee yet. :)
>>
>> Michael Della Bitta
>>
>> 
>> Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
>> www.appinions.com
>> Where Influence Isn’t a Game
>>
>>
>> On Wed, Aug 22, 2012 at 9:32 AM, Michael Della Bitta
>>  wrote:
>>> Ouch, not to mention the potential for XSS.
>>>
>>> I'll see if I can get in touch with someone.
>>>
>>> Michael Della Bitta
>>>
>>> 
>>> Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
>>> www.appinions.com
>>> Where Influence Isn’t a Game
>>>
>>>
>>> On Wed, Aug 22, 2012 at 3:40 AM, Bernd Fehling
>>>  wrote:
 Now this is very scary, while searching for "solr direct access per docid" 
 I got a hit
 from US Homeland Security Digital Library. Interested in what they have to 
 tell me
 about my search I clicked on the link to the page. First the page had 
 nothing unusual
 about it, but why I get the hit?
 http://www.hsdl.org/?collection/stratpol&id=4

 Inspecting the page source view shows that they have the solr query 
 displayed direct
 on their page as "span" with "style=display:none".
 -- snippet --
 

 *** SOLR Query *** — q=Collection:0 AND 
 (TabSection:("Congressional hearings and testimony", "Congressional
 reports", "Congressional resolutions", "Directives (presidential)", 
 "Executive orders", "Major Legislation", "Public laws", "Reports (CBO)",
 "Reports (CHDS)", "Reports (CRS)",...
 ...
 AND (Title_nostem:("China Forces Senior Intelligence Officer")^10 
 AlternateTitle_nostem:("China Forces Senior Intelligence
 Officer")^9)&sort=score
 desc&rows=30&start=0&indent=off&facet=on&facet.limit=1&facet.mincount=1&fl=AlternateTitle_text,Collection,CoverageCountry,CoverageState,Creator_nostem,DateLastModified,DateOfRecordEntry,Description_text,DisplayDate,DocID,ExternalDocId,ExternalDocSource,FileDate,FileExtension,FileSize,FileTitle_text,Format,Language,PublishDate,Publisher_text,Publisher_nostem,ReportNumber,ResourceType,RetrievedFrom,Rights,Subjects,Source,TabSection,Title_text,URL_text,Alternate_URL_text,CreatedBy,ModifiedBy,Notes&wt=phps&facet.field=Creator&facet.field=Format&facet.field=Language&facet.field=Publisher&facet.field=TabSection
 -- snippet --

 As you can see I have searched for "China Forces Senior Intelligence 
 Officer" so this is directly showing the
 query string.
 Do they know that there is also a delete by query?
 And there are also escape sequences?

 This is what I call scary.
 Maybe some of the US fellows can give them a hint and a helping hand.

 Regards
 Bernd
>
> --
> *
> Bernd FehlingUniversitätsbibliothek Bielefeld
> Dipl.-Inform. (FH)LibTec - Bibliothekstechnologie
> Universitätsstr. 25 und Wissensmanagement
> 33615 Bielefeld
> Tel. +49 521 106-4060   bernd.fehling(at)uni-bielefeld.de
>
> BASE - Bielefeld Academic Search Engine - www.base-search.net
> *


Solr memory: CATALINA_OPTS in setenv.sh ?

2012-08-22 Thread Bruno Mannina

Dear users,

I am trying to check whether my addition to setenv.sh (which I needed to 
create because it didn't exist) has been applied, but when I click on the 
Java Properties link on the Solr admin web page,

I can't see the CATALINA_OPTS variable.

In fact, I would like to know if my line added in the file setenv.sh is ok:
|CATALINA_OPTS=||"-server -Xss7G -Xms14G -Xmx14G $CATALINA_OPTS 
-XX:+UseConcMarkSweepGC -XX:NewSize=7G -XX:+UseParNewGC"|


my setenv.sh file contains only this line (inside /usr/share/tomcat6/bin/).

How can I see if memory is well allocated?

Other question: is |*-XX:NewSize=7G*| ok?

I have 24 GB RAM (14G ≈ 60%)


search is slow for URL fields of type String.

2012-08-22 Thread srinalluri
This is the string fieldType:

<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

These are the fields using the 'string' fieldType:

  <field name="url" type="string" indexed="true" stored="true"/>
  <field name="image_url" type="string" indexed="true" stored="true"/>

And this is the sample query:
/select/?q=url:http\://www.foxbusiness.com/personal-finance/2012/08/10/social-change-coming-from-gas-prices-to-rent-prices-and-beyond/
AND image_url:*

Each query like this takes around 400 milliseconds. What changes can I
make to the fieldType to improve query performance?
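Part of the fragility here is hand-escaping the URL inside the q parameter. Below is a sketch of an escaper in the spirit of SolrJ's ClientUtils.escapeQueryChars; the exact character set is an assumption and may differ between versions:

```java
public class QueryEscape {
    // Backslash-escape Lucene query syntax characters so a literal URL can
    // be used as a term in the q parameter. Modeled on (not identical to)
    // SolrJ's ClientUtils.escapeQueryChars.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/ ";

    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("http://www.example.com/a-b/"));
        // -> http\:\/\/www.example.com\/a\-b\/
    }
}
```

Alternatively, a filter query with the term query parser (fq={!term f=url}http://...) avoids query-syntax escaping entirely and is cacheable; whether that helps the 400 ms figure would need measuring.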

thanks
Srini



--
View this message in context: 
http://lucene.472066.n3.nabble.com/search-is-slow-for-URL-fields-of-type-String-tp4002662.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: display SOLR Query in web page

2012-08-22 Thread Bernd Fehling
I haven't spent time trying anything; I just entered a query and noticed
that it showed up in the page source view.
If they really escape everything, it is not that dangerous, is it?

Actually I don't want to try anything with their page,
they might not have any humor ;-)

Bernd


Am 22.08.2012 15:41, schrieb Michael Della Bitta:
> Actually, I'm having a little trouble coming up with a
> proof-of-concept exploit for this... it doesn't seem like Solr is
> exposed directly, and it does seem like it's escaping submitted
> content before redisplaying it on the page.
> 
> I'm not crazy about leaking the raw query string into the HTML, but it
> doesn't seem to lead to more than just that.
> 
> Please let me know if I am missing something, it's still morningtime
> here in the US and I haven't had enough coffee yet. :)
> 
> Michael Della Bitta
> 
> 
> Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
> www.appinions.com
> Where Influence Isn’t a Game
> 
> 
> On Wed, Aug 22, 2012 at 9:32 AM, Michael Della Bitta
>  wrote:
>> Ouch, not to mention the potential for XSS.
>>
>> I'll see if I can get in touch with someone.
>>
>> Michael Della Bitta
>>
>> 
>> Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
>> www.appinions.com
>> Where Influence Isn’t a Game
>>
>>
>> On Wed, Aug 22, 2012 at 3:40 AM, Bernd Fehling
>>  wrote:
>>> Now this is very scary, while searching for "solr direct access per docid" 
>>> I got a hit
>>> from US Homeland Security Digital Library. Interested in what they have to 
>>> tell me
>>> about my search I clicked on the link to the page. First the page had 
>>> nothing unusual
>>> about it, but why I get the hit?
>>> http://www.hsdl.org/?collection/stratpol&id=4
>>>
>>> Inspecting the page source view shows that they have the solr query 
>>> displayed direct
>>> on their page as "span" with "style=display:none".
>>> -- snippet --
>>> 
>>>
>>> *** SOLR Query *** — q=Collection:0 AND 
>>> (TabSection:("Congressional hearings and testimony", "Congressional
>>> reports", "Congressional resolutions", "Directives (presidential)", 
>>> "Executive orders", "Major Legislation", "Public laws", "Reports (CBO)",
>>> "Reports (CHDS)", "Reports (CRS)",...
>>> ...
>>> AND (Title_nostem:("China Forces Senior Intelligence Officer")^10 
>>> AlternateTitle_nostem:("China Forces Senior Intelligence
>>> Officer")^9)&sort=score
>>> desc&rows=30&start=0&indent=off&facet=on&facet.limit=1&facet.mincount=1&fl=AlternateTitle_text,Collection,CoverageCountry,CoverageState,Creator_nostem,DateLastModified,DateOfRecordEntry,Description_text,DisplayDate,DocID,ExternalDocId,ExternalDocSource,FileDate,FileExtension,FileSize,FileTitle_text,Format,Language,PublishDate,Publisher_text,Publisher_nostem,ReportNumber,ResourceType,RetrievedFrom,Rights,Subjects,Source,TabSection,Title_text,URL_text,Alternate_URL_text,CreatedBy,ModifiedBy,Notes&wt=phps&facet.field=Creator&facet.field=Format&facet.field=Language&facet.field=Publisher&facet.field=TabSection
>>> -- snippet --
>>>
>>> As you can see I have searched for "China Forces Senior Intelligence 
>>> Officer" so this is directly showing the
>>> query string.
>>> Do they know that there is also a delete by query?
>>> And there are also escape sequences?
>>>
>>> This is what I call scary.
>>> Maybe some of the US fellows can give them a hint and a helping hand.
>>>
>>> Regards
>>> Bernd

-- 
*
Bernd FehlingUniversitätsbibliothek Bielefeld
Dipl.-Inform. (FH)LibTec - Bibliothekstechnologie
Universitätsstr. 25 und Wissensmanagement
33615 Bielefeld
Tel. +49 521 106-4060   bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*


Re: Solr - case-insensitive search do not work

2012-08-22 Thread Ravish Bhagdev
Did you see my message about debugging parameters?  Try that and see what's
happening behind the scenes.

I can confirm that by default the queries are NOT case sensitive.

Ravish

On Wed, Aug 22, 2012 at 2:45 PM, meghana  wrote:

> Hi Ravish, the definition for text_en_splitting in the Solr default schema and
> in mine are the same... still it's not working... any idea?
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002645.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Problem to start solr-4.0.0-BETA with tomcat-6.0.20

2012-08-22 Thread Claudio Ranieri
Hi,

I tried to start solr-4.0.0-BETA with tomcat-6.0.20 but it does not work.
I copied the apache-solr-4.0.0-BETA.war to $TOMCAT_HOME/webapps. Then I copied 
the directory apache-solr-4.0.0-BETA\example\solr to C:\home\solr-4.0-beta and 
adjusted the file 
$TOMCAT_HOME\conf\Catalina\localhost\apache-solr-4.0.0-BETA.xml to point the 
solr/home to C:/home/solr-4.0-beta. With this configuration, when I startup 
tomcat I got:

SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion 
'LUCENE_40', valid values are: [LUCENE_20, LUCENE_21, LUCENE_22, LUCENE_23, 
LUCENE_24, LUCENE_29, LUCENE_30, LUCENE_31, LUCENE_32, LUCENE_33, LUCENE_34, 
LUCENE_35, LUCENE_36, LUCENE_CURRENT] or a string in format 'VV'

So I changed the line in solrconfig.xml:

<luceneMatchVersion>LUCENE_40</luceneMatchVersion>

to

<luceneMatchVersion>LUCENE_CURRENT</luceneMatchVersion>

So I got a new error:

Caused by: java.lang.ClassNotFoundException: solr.NRTCachingDirectoryFactory

This class is in apache-solr-core-4.0.0-BETA.jar, but for some reason the 
classloader does not load it. I then moved all jars in 
$TOMCAT_HOME\webapps\apache-solr-4.0.0-BETA\WEB-INF\lib to $TOMCAT_HOME\lib.
After this setup, I got a new error:

SEVERE: java.lang.ClassCastException: 
org.apache.solr.core.NRTCachingDirectoryFactory cannot be cast to 
org.apache.solr.core.DirectoryFactory

So I changed the line in solrconfig.xml:



to



So I got a new error:

Caused by: java.lang.ClassCastException: 
org.apache.solr.spelling.DirectSolrSpellChecker cannot be cast to 
org.apache.solr.spelling.SolrSpellChecker

How can I resolve the classloader problem?
How can I resolve the ClassCastException for NRTCachingDirectoryFactory and 
DirectSolrSpellChecker?
I cannot start Solr 4.0 beta with Tomcat.
Thanks,






Re: Solr - case-insensitive search do not work

2012-08-22 Thread meghana
Hi Ravish, the definition for text_en_splitting in the Solr default schema and
in mine are the same... still it's not working... any idea?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002645.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: display SOLR Query in web page

2012-08-22 Thread Michael Della Bitta
Actually, I'm having a little trouble coming up with a
proof-of-concept exploit for this... it doesn't seem like Solr is
exposed directly, and it does seem like it's escaping submitted
content before redisplaying it on the page.

I'm not crazy about leaking the raw query string into the HTML, but it
doesn't seem to lead to more than just that.

Please let me know if I am missing something, it's still morningtime
here in the US and I haven't had enough coffee yet. :)

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Aug 22, 2012 at 9:32 AM, Michael Della Bitta
 wrote:
> Ouch, not to mention the potential for XSS.
>
> I'll see if I can get in touch with someone.
>
> Michael Della Bitta
>
> 
> Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
> www.appinions.com
> Where Influence Isn’t a Game
>
>
> On Wed, Aug 22, 2012 at 3:40 AM, Bernd Fehling
>  wrote:
>> Now this is very scary, while searching for "solr direct access per docid" I 
>> got a hit
>> from US Homeland Security Digital Library. Interested in what they have to 
>> tell me
>> about my search I clicked on the link to the page. First the page had 
>> nothing unusual
>> about it, but why I get the hit?
>> http://www.hsdl.org/?collection/stratpol&id=4
>>
>> Inspecting the page source view shows that they have the solr query 
>> displayed direct
>> on their page as "span" with "style=display:none".
>> -- snippet --
>> 
>>
>> *** SOLR Query *** — q=Collection:0 AND 
>> (TabSection:("Congressional hearings and testimony", "Congressional
>> reports", "Congressional resolutions", "Directives (presidential)", 
>> "Executive orders", "Major Legislation", "Public laws", "Reports (CBO)",
>> "Reports (CHDS)", "Reports (CRS)",...
>> ...
>> AND (Title_nostem:("China Forces Senior Intelligence Officer")^10 
>> AlternateTitle_nostem:("China Forces Senior Intelligence
>> Officer")^9)&sort=score
>> desc&rows=30&start=0&indent=off&facet=on&facet.limit=1&facet.mincount=1&fl=AlternateTitle_text,Collection,CoverageCountry,CoverageState,Creator_nostem,DateLastModified,DateOfRecordEntry,Description_text,DisplayDate,DocID,ExternalDocId,ExternalDocSource,FileDate,FileExtension,FileSize,FileTitle_text,Format,Language,PublishDate,Publisher_text,Publisher_nostem,ReportNumber,ResourceType,RetrievedFrom,Rights,Subjects,Source,TabSection,Title_text,URL_text,Alternate_URL_text,CreatedBy,ModifiedBy,Notes&wt=phps&facet.field=Creator&facet.field=Format&facet.field=Language&facet.field=Publisher&facet.field=TabSection
>> -- snippet --
>>
>> As you can see I have searched for "China Forces Senior Intelligence 
>> Officer" so this is directly showing the
>> query string.
>> Do they know that there is also a delete by query?
>> And the are also escape sequences?
>>
>> This is what I call scary.
>> Maybe some of the US fellows can give them a hint and a helping hand.
>>
>> Regards
>> Bernd


Re: display SOLR Query in web page

2012-08-22 Thread Michael Della Bitta
Ouch, not to mention the potential for XSS.

I'll see if I can get in touch with someone.

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Aug 22, 2012 at 3:40 AM, Bernd Fehling
 wrote:
> Now this is very scary, while searching for "solr direct access per docid" I 
> got a hit
> from US Homeland Security Digital Library. Interested in what they have to 
> tell me
> about my search I clicked on the link to the page. First the page had nothing 
> unusual
> about it, but why I get the hit?
> http://www.hsdl.org/?collection/stratpol&id=4
>
> Inspecting the page source view shows that they have the solr query displayed 
> direct
> on their page as "span" with "style=display:none".
> -- snippet --
> 
>
> *** SOLR Query *** — q=Collection:0 AND 
> (TabSection:("Congressional hearings and testimony", "Congressional
> reports", "Congressional resolutions", "Directives (presidential)", 
> "Executive orders", "Major Legislation", "Public laws", "Reports (CBO)",
> "Reports (CHDS)", "Reports (CRS)",...
> ...
> AND (Title_nostem:("China Forces Senior Intelligence Officer")^10 
> AlternateTitle_nostem:("China Forces Senior Intelligence
> Officer")^9)&sort=score
> desc&rows=30&start=0&indent=off&facet=on&facet.limit=1&facet.mincount=1&fl=AlternateTitle_text,Collection,CoverageCountry,CoverageState,Creator_nostem,DateLastModified,DateOfRecordEntry,Description_text,DisplayDate,DocID,ExternalDocId,ExternalDocSource,FileDate,FileExtension,FileSize,FileTitle_text,Format,Language,PublishDate,Publisher_text,Publisher_nostem,ReportNumber,ResourceType,RetrievedFrom,Rights,Subjects,Source,TabSection,Title_text,URL_text,Alternate_URL_text,CreatedBy,ModifiedBy,Notes&wt=phps&facet.field=Creator&facet.field=Format&facet.field=Language&facet.field=Publisher&facet.field=TabSection
> -- snippet --
>
> As you can see I have searched for "China Forces Senior Intelligence Officer" 
> so this is directly showing the
> query string.
> Do they know that there is also a delete by query?
> And there are also escape sequences?
>
> This is what I call scary.
> Maybe some of the US fellows can give them a hint and a helping hand.
>
> Regards
> Bernd


Re: SpellCheck Component does not work for certain words

2012-08-22 Thread mechravi25
Hi,

Just a few things to add: I found that when we search for three or fewer
letters I'm not able to get any suggestions, and also when I search for
"finding", I don't get any suggestions related to it even though I have
search results for the same term.

But when I search for "findingg" I get suggestions for it, and one of the
suggestions is "finding"; in this case the search results are zero.

Can you tell me if this is the way the spell check is intended to work, or am
I going wrong somewhere?

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SpellCheck-Component-does-not-work-for-certain-words-tp4002573p4002636.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Scalability of Solr Result Grouping/Field Collapsing: Millions/Billions of documents?

2012-08-22 Thread Tirthankar Chatterjee
You can collapse in each shard as a separate query.

Lance Norskog  wrote:


How do you separate the documents among the shards? Can you set up the
shards such that one "collapse group" is only on a single shard? That
you never have to do distributed grouping?

On Tue, Aug 21, 2012 at 4:10 PM, Tirthankar Chatterjee
 wrote:
> This won't work; see my thread on Solr 3.6 field collapsing.
> Thanks,
> Tirthankar
>
> -Original Message-
> From: Tom Burton-West 
> Date: Tue, 21 Aug 2012 18:39:25
> To: solr-user@lucene.apache.org
> Reply-To: "solr-user@lucene.apache.org" 
> Cc: William Dueber; Phillip Farber
> Subject: Scalability of Solr Result Grouping/Field Collapsing:
>  Millions/Billions of documents?
>
> Hello all,
>
> We are thinking about using Solr Field Collapsing on a rather large scale
> and wonder if anyone has experience with performance when doing Field
> Collapsing on millions or billions of documents (details below).  Are
> there performance issues with grouping large result sets?
>
> Details:
> We have a collection of the full text of 10 million books/journals.  This
> is spread across 12 shards with each shard holding about 800,000
> documents.  When a query matches a journal article, we would like to group
> all the matching articles from the same journal together. (there is a
> unique id field identifying the journal).  Similarly when there is a match
> in multiple copies of the same book we would like to group all results for
> the same book together (again we have a unique id field we can group on).
> Sometimes a short query against the OCR field will result in over one
> million hits.  Are there known performance issues when field collapsing
> result sets containing a million hits?
>
> We currently index the entire book as one Solr document.  We would like to
> investigate the feasibility of indexing each page as a Solr document with a
> field indicating the book id.  We could then offer our users the choice of
> a list of the most relevant pages, or a list of the books containing the
> most relevant pages.  We have approximately 3 billion pages.   Does anyone
> have experience using field collapsing on this sort of scale?
>
> Tom
>
> Tom Burton-West
> Information Retrieval Programmer
> Digital Library Production Service
> University of Michigan Library
> http://www.hathitrust.org/blogs/large-scale-search
> **Legal Disclaimer***
> "This communication may contain confidential and privileged
> material for the sole use of the intended recipient. Any
> unauthorized review, use or distribution by others is strictly
> prohibited. If you have received the message in error, please
> advise the sender by reply email and delete the message. Thank
> you."
> *



--
Lance Norskog
goks...@gmail.com


Edismax parser weird behavior

2012-08-22 Thread amitesh116
Hi, I am experiencing 2 strange behaviors in edismax:
edismax is configured to default to OR (using mm=0).
In total there are 700 results.
1. Search for *auto* = *50 results*
   Search for *NOT auto* gives *651 results*.
Mathematically, it should give only 650 results for *NOT auto*.

2. Search for *auto* = *50 results*
   Search for *car* = *100 results*
   Search for *auto and car* = *10 results*
Since we have set mm=0, it should behave like OR, and *auto and car*
should return at least 100 results.

Please help me understand these two issues. Is this normal behavior? Do I
need to tweak the query? Or do I need to look into the solrconfig or
schema.xml files?

Thanks in Advance



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Edismax-parser-weird-behavior-tp4002626.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr - case-insensitive search do not work

2012-08-22 Thread Ravish Bhagdev
Also, try comparing your field configuration to Solr's default text field
and see if you can spot any differences.

Ravish

On Wed, Aug 22, 2012 at 1:09 PM, Ravish Bhagdev wrote:

> OK.  Try without quotes like myfield:cloud+university and see if it has
> any effect.
>
> Also, try both queries with debugging turned on and post the output of the
> same ( http://wiki.apache.org/solr/CommonQueryParameters#Debugging )
>
> It must be some field configuration issue or that double quotes are
> causing some analyzers to not work on your query.
>
> Hope this helps.
>
> Ravish
>
> On Wed, Aug 22, 2012 at 12:11 PM, meghana wrote:
>
>> @Ravish Bhagdev , Yes I am adding double quotes around my search , as
>> shown
>> in my post. Like,
>>
>> myfield:"cloud university"
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002610.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>


Re: Solr - case-insensitive search do not work

2012-08-22 Thread Ravish Bhagdev
OK.  Try without quotes like myfield:cloud+university and see if it has any
effect.

Also, try both queries with debugging turned on and post the output of the
same ( http://wiki.apache.org/solr/CommonQueryParameters#Debugging )

It must be some field configuration issue or that double quotes are causing
some analyzers to not work on your query.

Hope this helps.

Ravish

On Wed, Aug 22, 2012 at 12:11 PM, meghana wrote:

> @Ravish Bhagdev , Yes I am adding double quotes around my search , as shown
> in my post. Like,
>
> myfield:"cloud university"
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002610.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Runtime.exec() not working on Tomcat

2012-08-22 Thread Alexandre Rafalovitch
Could it be different 'current' working directories? What happens if
you hardcode the full path into the command and input/output files?

./convert.bin -> /Dev/Solr/bin/convert.bin, etc.

Also, you may want to use some file system observation tools to figure
out exactly what file is touched where. Look for dtrace on Unix-like
systems and for SysInternals ProcMon on Windows.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Wed, Aug 22, 2012 at 7:18 AM, 122jxgcn  wrote:
> I have the following code in my Apache Tika Maven project.
>
> This code works when I test locally, but fails when it's attached as an
> external jar in Apache Solr (the container is Tomcat).
>
> String cmd contains the command string that converts the file:
>
> ./convert.bin input.custom output.xml
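Following up on the working-directory suspicion: ProcessBuilder makes the working directory explicit and can merge stderr into stdout for debugging; a minimal sketch, where the /bin/sh command and the output.xml name are placeholders standing in for the actual converter:

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExecDemo {
    // Run a command in an explicit working directory and return the file
    // it is expected to produce there (no dependence on Tomcat's CWD).
    static File runIn(Path workDir, String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.directory(workDir.toFile());   // explicit working directory
        pb.redirectErrorStream(true);     // surface stderr with stdout
        Process ps = pb.start();
        int exitVal = ps.waitFor();
        System.out.println("exit value: " + exitVal);
        return new File(workDir.toFile(), "output.xml");
    }

    public static void main(String[] args) throws Exception {
        // Placeholder command standing in for "./convert.bin input.custom output.xml"
        Path dir = Files.createTempDirectory("convert");
        File out = runIn(dir, "/bin/sh", "-c", "echo ok > output.xml");
        System.out.println("output exists: " + out.exists());
    }
}
```

An exit value of 0 with a missing output file would then point at the command writing relative to a different directory than the one being checked.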


Runtime.exec() not working on Tomcat

2012-08-22 Thread 122jxgcn
I have the following code in my Apache Tika Maven project.

This code works when I test locally, but fails when it's attached as an
external jar in Apache Solr (the container is Tomcat).

String cmd contains the command string that converts the file:

./convert.bin input.custom output.xml

I checked that both convert.bin and input.custom exist.


String cmd; // as explained above
File out = new File(dir_path, "output.xml"); // dir_path is the directory path

Process ps = null;

try {
    ps = Runtime.getRuntime().exec(cmd); // execute command
    int exitVal = ps.waitFor();
    logger.info("Executing Runtime successful with exit value of " + exitVal);
    // exitVal is 0
} catch (Exception e) {
    logger.error("Exception in executing Runtime: " + e); // not reaching here
}

// I get "Out file does not exist", although I should get the proper output
if (out.exists()) logger.info("Out file exists");
else logger.info("Out file does not exist"); // reaches here

out.setWritable(true);
out.setReadable(true);
out.setExecutable(true);
out.deleteOnExit();

// I get FileNotFoundException here
InputStream xml_stream = new FileInputStream(out);


I'm really confused because I get the right result locally (Maven test), but
not when it is on Tomcat.

Any help please?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Runtime-exec-not-working-on-Tomcat-tp4002614.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Does DIH commit during large import?

2012-08-22 Thread Alexandre Rafalovitch
Thanks, I will look into autoCommit.

I assume there are memory implications of not committing? Or does it just
write to a separate file, so that it can theoretically continue
indefinitely?

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Wed, Aug 22, 2012 at 2:42 AM, Lance Norskog  wrote:
> Solr has a separate feature called 'autoCommit'. This is configured in
> solrconfig.xml. You can set Solr to commit all documents every N
> milliseconds or every N documents, whichever comes first. If you want
> intermediate commits during a long DIH session, you have to use this
> or make your own script that does commits.
>
> On Tue, Aug 21, 2012 at 8:48 AM, Shawn Heisey  wrote:
>> On 8/21/2012 6:41 AM, Alexandre Rafalovitch wrote:
>>>
>>> I am doing an import of large records (with large full-text fields)
>>> and somewhere around 30 records DataImportHandler runs out of
>>> memory (Heap) on a TIKA import (triggered from custom Processor) and
>>> does roll-back. I am using store=false and trying some tricks and
>>> tracking possible memory leaks, but also have a question about DIH
>>> itself.
>>>
>>> What actually happens when I run DIH on a large (XML Source) job? Does
>>> it accumulate some sort of status in memory that it commits at the
>>> end? If so, can I do intermediate commits to drop the memory
>>> requirements? Or, will it help to do several passes over the same
>>> dataset and import only particular entries at a time? I am using the
>>> Solr 4 (alpha) UI, so I can see some of the options there.
>>
>>
>> I use Solr 3.5 and a MySQL database for import, so my setup may not be
>> completely relevant, but here is my experience.
>>
>> Unless you turn on autocommit in solrconfig, documents will not be
>> searchable during the import.  If you have "commit=true" for DIH (which I
>> believe is the default), there will be a commit at the end of the import.
>>
>> It looks like there's an out of memory issue filed on Solr 4 DIH with Tika
>> that is suspected to be a bug in Tika rather than Solr.  The issue details
>> talk about some workarounds for those who are familiar with Tika -- I'm not.
>> The issue URL:
>>
>> https://issues.apache.org/jira/browse/SOLR-2886
>>
>> Thanks,
>> Shawn
>>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
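The autoCommit setting Lance describes lives in the update handler section of solrconfig.xml; a minimal sketch, where the thresholds are illustrative values rather than recommendations:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Commit automatically after N added docs or N milliseconds,
       whichever comes first -->
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

With this in place, a long DIH run gets intermediate commits without any external script.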


Re: Solr - case-insensitive search do not work

2012-08-22 Thread meghana
@Ravish Bhagdev, yes, I am adding double quotes around my search, as shown
in my post. Like:

myfield:"cloud university"





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002610.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr - case-insensitive search do not work

2012-08-22 Thread Ravish Bhagdev
<filter class="solr.LowerCaseFilterFactory"/> is already present in your
field type definition (it's there twice now)

Are you adding quotes around your query by any chance?

Ravish

On Wed, Aug 22, 2012 at 11:31 AM, meghana wrote:

> I want to apply case-insensitive search for field *myfield* in solr.
>
> I googled a bit and found that I need to apply
> *LowerCaseFilterFactory* to the field type, and that the field should be
> of type solr.TextField.
>
> I applied that in my *schema.xml* and re-indexed the data, but my
> search still seems to be case-sensitive.
>
> Below is the search that I perform.
> *
> http://localhost:8080/solr/select?q=myfield:"cloud
> university"&hl=on&hl.snippets=99&hl.fl=myfield*
>
> Below is definition for field type
>
> <fieldType name="text_en_splitting" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords_en.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords_en.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> and below is my field definition
>
> <field name="myfield" type="text_en_splitting" indexed="true"
>        stored="true" />
>
> Not sure , what is wrong with this. Please help me to resolve this.
>
> Thanks
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Solr - case-insensitive search do not work

2012-08-22 Thread meghana
I want to apply case-insensitive search for field *myfield* in solr.

I googled a bit and found that I need to apply *LowerCaseFilterFactory* to
the field type, and that the field should be of type solr.TextField.

I applied that in my *schema.xml* and re-indexed the data, but my search
still seems to be case-sensitive.

Below is the search that I perform.
*
http://localhost:8080/solr/select?q=myfield:"cloud
university"&hl=on&hl.snippets=99&hl.fl=myfield*

Below is definition for field type

 
  








  
  







  


and below is my field definition

 

Not sure what is wrong with this. Please help me resolve it.

Thanks




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605.html
Sent from the Solr - User mailing list archive at Nabble.com.


Weighted Search Results / Multi-Value Value's Not Aggregating Weight

2012-08-22 Thread David Radunz

Hey,

I have been having some problems getting good search results when
using weighting against many fields with multi-values. After quite a bit
of testing, it seems to me that the problem (at least as far as my query
is concerned) is that only one weighting is taken into account per field.
For example, in a multi-value field, if we have "Comedy" and "Romance"
with separate weightings for each, the one with the highest weighting is
used (not a combined weighting). Which means that searching for a
romantic comedy returns "Alvin and the Chipmunks" (Family, Children,
Comedy).


Query:

facet=on&fl=id,name,matching_genres,score,url_path,url_key,price,special_price,small_image,thumbnail,sku,stock_qty,release_date&sort=score+desc,retail_rating+desc,release_date+desc&start=&q=**+-sku:"1019660"+-movie_id:"1805"+-movie_id:"1806"+(series_names_attr_opt_id:"454282"^9000+OR+cat_id:"22"^9+OR+cat_id:"248"^9+OR+cat_id:"249"^9+OR+matching_genres:"Comedy"^9+OR+matching_genres:"Romance"^7+OR+matching_genres:"Drama"^5)&fq=store_id:"1"+AND+avail_status_attr_opt_id:"available"+AND+(format_attr_opt_id:"372619")&rows=4

Now if I change matching_genres:"Romance"^7 to
matching_genres:"Romance"^70 (adding a 0), suddenly the first result
is "Sex and the City: The Movie / Sex and the City 2" (which ironically
is "Drama", "Comedy", "Romance" - the very combination we are looking for).


So is there a way to structure my query so that all of the
multi-value values are treated individually, aggregating the
weighting/score?


Thanks in advance!

David



Re: Solr search – Tika extracted text from PDF not return highlighting snippet

2012-08-22 Thread anarchos78
Thanks for your reply,
I had tryied many things (copy field etc) with no succes. Notice that the
"pdfs" are stored as BLOB in mysql database. I am trying to use DIH in order
to fetch the binaries from DB. Is it possible?
Thanks!
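For what it's worth, DIH can stream a BLOB column straight into Tika by pairing a FieldStreamDataSource with the TikaEntityProcessor; a rough sketch of a data-config, where the JDBC URL, credentials, table, column, and Solr field names are all made-up placeholders:

```xml
<dataConfig>
  <!-- JDBC source for the rows; URL and credentials are placeholders -->
  <dataSource name="db" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="user" password="pass"/>
  <!-- Streams a BLOB field of the outer entity into the inner entity -->
  <dataSource name="blob" type="FieldStreamDataSource"/>
  <document>
    <entity name="doc" dataSource="db"
            query="SELECT id, pdf_blob FROM documents">
      <field column="id" name="id"/>
      <entity name="tika" dataSource="blob" processor="TikaEntityProcessor"
              dataField="doc.pdf_blob" format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The key pieces are the FieldStreamDataSource and the dataField attribute pointing at the outer entity's BLOB column.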



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-search-Tika-extracted-text-from-PDF-not-return-highlighting-snippet-tp3999647p4002587.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Use a different folder for schema.xml

2012-08-22 Thread Ravish Bhagdev
You can include one XML file into another, something like:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE schema [
  <!ENTITY resourcedb SYSTEM "resourcedb.xml">
]>
<schema>
  &resourcedb;
</schema>


- Ravish

On Wed, Aug 22, 2012 at 8:56 AM, Alexander Cougarman wrote:

> Thanks, Lance. Please forgive my ignorance, but what do you mean by soft
> links/XML include feature? Can you provide an example? Thanks again.
>
> Sincerely,
> Alex
>
> -Original Message-
> From: Lance Norskog [mailto:goks...@gmail.com]
> Sent: 22 August 2012 9:55 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Use a different folder for schema.xml
>
> It is possible to store the entire conf/ directory somewhere. To store
> only the schema.xml file, try soft links or the XML include feature:
> conf/schema.xml includes from somewhere else.
>
> On Tue, Aug 21, 2012 at 11:31 PM, Alexander Cougarman 
> wrote:
> > Hi. For our Solr instance, we need to put the schema.xml file in a
> different location than where it resides now. Is this possible? Thanks.
> >
> > Sincerely,
> > Alex
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


RE: Use a different folder for schema.xml

2012-08-22 Thread Alexander Cougarman
Thanks, Lance. Please forgive my ignorance, but what do you mean by soft 
links/XML include feature? Can you provide an example? Thanks again.

Sincerely,
Alex 

-Original Message-
From: Lance Norskog [mailto:goks...@gmail.com] 
Sent: 22 August 2012 9:55 AM
To: solr-user@lucene.apache.org
Subject: Re: Use a different folder for schema.xml

It is possible to store the entire conf/ directory somewhere. To store only the 
schema.xml file, try soft links or the XML include feature:
conf/schema.xml includes from somewhere else.

On Tue, Aug 21, 2012 at 11:31 PM, Alexander Cougarman  wrote:
> Hi. For our Solr instance, we need to put the schema.xml file in a different 
> location than where it resides now. Is this possible? Thanks.
>
> Sincerely,
> Alex
>



--
Lance Norskog
goks...@gmail.com


display SOLR Query in web page

2012-08-22 Thread Bernd Fehling
Now this is very scary: while searching for "solr direct access per docid" I
got a hit from the US Homeland Security Digital Library. Interested in what
they had to tell me about my search, I clicked on the link to the page. At
first the page had nothing unusual about it, so why did I get the hit?
http://www.hsdl.org/?collection/stratpol&id=4

Inspecting the page source shows that they have the Solr query displayed
directly on their page in a "span" with "style=display:none".
-- snippet --


*** SOLR Query *** — q=Collection:0 AND 
(TabSection:("Congressional hearings and testimony", "Congressional
reports", "Congressional resolutions", "Directives (presidential)", "Executive 
orders", "Major Legislation", "Public laws", "Reports (CBO)",
"Reports (CHDS)", "Reports (CRS)",...
...
AND (Title_nostem:("China Forces Senior Intelligence Officer")^10 
AlternateTitle_nostem:("China Forces Senior Intelligence
Officer")^9)&sort=score
desc&rows=30&start=0&indent=off&facet=on&facet.limit=1&facet.mincount=1&fl=AlternateTitle_text,Collection,CoverageCountry,CoverageState,Creator_nostem,DateLastModified,DateOfRecordEntry,Description_text,DisplayDate,DocID,ExternalDocId,ExternalDocSource,FileDate,FileExtension,FileSize,FileTitle_text,Format,Language,PublishDate,Publisher_text,Publisher_nostem,ReportNumber,ResourceType,RetrievedFrom,Rights,Subjects,Source,TabSection,Title_text,URL_text,Alternate_URL_text,CreatedBy,ModifiedBy,Notes&wt=phps&facet.field=Creator&facet.field=Format&facet.field=Language&facet.field=Publisher&facet.field=TabSection
-- snippet --

As you can see I have searched for "China Forces Senior Intelligence Officer" 
so this is directly showing the
query string.
Do they know that there is also delete-by-query?
And that there are also escape sequences?

This is what I call scary.
Maybe some of the US fellows can give them a hint and a helping hand.

Regards
Bernd


Highlighting is case sensitive when search with double quote

2012-08-22 Thread vrpar...@gmail.com
When I search with "abc cde", Solr returns results, but the highlighting
portion is as below:






and when I search with "ABC cde" the response is:





... ... ABC cde .





It seems the highlighting response is case-sensitive.

In both cases above, the other query parameters are the same.

How can I get a case-insensitive response?

Thanks,
Vishal Parekh





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Highlighting-is-case-sensitive-when-search-with-double-quote-tp4002576.html
Sent from the Solr - User mailing list archive at Nabble.com.


Which directories are required in Solr?

2012-08-22 Thread Alexander Cougarman
Hi. Which folders/files can be deleted from the default Solr package 
(apache-solr-3.6.1.zip) on Windows if all we'd like to do is index/store 
documents? Thanks.

Sincerely,
Alex 



SpellCheck Component does not work for certain words

2012-08-22 Thread mechravi25
Hi,

I'm using Solr 1.4.0 and tried to use the spellcheck component by making
the following changes in solrconfig.xml:


<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="spellcheckIndexDir">./spellchekerFile1</str>
    <str name="field">spell</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>


and included the below in the standard request handler to enable spellcheck:

<arr name="last-components">
  <str>spellcheck</str>
</arr>

We have used copyField for the spell field in schema.xml so that it
contains all the data from the other fields. The search works fine with
certain queries. When we try to search for "dep", it does not give any
suggestions.

But when we search for "depo" there are results which contain "dep" as
well. Can you tell me why no suggestions are fetched when we query "dep"?

Should we use solr.WordBreakSolrSpellChecker to get the intended results?
If so, can you guide us on the required config changes?

Please guide me on this

Thanks 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SpellCheck-Component-does-not-work-for-certain-words-tp4002573.html
Sent from the Solr - User mailing list archive at Nabble.com.