Hi folks,

I am new to Solr and am using it for a web application. I have been
experimenting with it and have a few questions that I was unable to
resolve by searching Google. Our portal allows users to upload content, and
the fields we index are title, description, transcript, and tags. Each
content item also has counters (hits, downloads, favorites) and an
automatically calculated rating. We have a master/slave configuration
(1 master, 2 slaves).

Solr version: 1.4.0
Java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
Hardware: 32 GiB RAM, 8 cores
Index Size: ~100 GiB


One of my use cases is finding related documents given a document ID. I
have been using the MoreLikeThisHandler, together with a DisMax query, to
generate related documents. Now I have to filter certain content out of the
results Solr gives me. So, if Solr returns a list of 20 related documents
for document ID X, I want to apply a filter so that none of these 20
documents contains any "blacklisted words". This is fairly straightforward
in a direct query using the NOT operator. Is it possible to implement
similar behavior in MoreLikeThisHandler?
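
To illustrate what I mean by the direct-query case, here is a minimal Python
sketch of how such a request could be built. The blacklist words, the choice
to check all four text fields, and the helper names are assumptions for
illustration only, not our production code.

```python
from urllib.parse import urlencode

# Text fields from our schema; checking all of them is an assumption.
TEXT_FIELDS = ["title", "description", "transcript", "tags"]


def build_exclusion_filter(blacklist, fields=TEXT_FIELDS):
    """Build a purely negative filter: one -field:"word" clause per pair."""
    clauses = []
    for field in fields:
        for word in blacklist:
            # Escape backslashes and quotes so the word stays one phrase.
            escaped = word.replace("\\", "\\\\").replace('"', '\\"')
            clauses.append('-%s:"%s"' % (field, escaped))
    return " ".join(clauses)


def build_select_query(user_query, blacklist, rows=20):
    """Return the query string for a DisMax select with the exclusion filter."""
    params = [
        ("q", user_query),
        ("defType", "dismax"),
        ("qf", " ".join(TEXT_FIELDS)),
        ("rows", rows),
    ]
    if blacklist:
        # The filter query is parsed separately from the DisMax user query.
        params.append(("fq", build_exclusion_filter(blacklist)))
    return urlencode(params)
```

The returned string would be appended to the select handler URL. What I am
looking for is the equivalent of that exclusion filter when the request goes
to MoreLikeThisHandler instead.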

Every week we perform a full index of all the documents, plus a nightly
incremental indexing. This is done by a script which reads data from MySQL
and sends the updates to Solr. Sometimes the script fails after updating
about 60% of the documents, before a commit has been performed. The next
cron run then adds some more documents and commits them. Will this commit
include the earlier uncommitted updates as well as the current ones? Are
those uncommitted changes (which are stored in a temp file) deleted after
some time? Is there a way to discard uncommitted changes?
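
To make the sequence concrete, here is a rough Python sketch of the update
messages one run of the script posts to the update handler. It is a
simplified model, not the real script: the "id" field name, the helper
names, and the fail_after parameter (which simulates the crash) are all
hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

COMMIT_MESSAGE = "<commit/>"


def add_message(doc):
    """Serialize one document (a dict of field -> value) as an <add> message."""
    fields = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields


def messages_for_run(docs, fail_after=None):
    """Yield the messages one run would post, in order.

    fail_after is hypothetical: it simulates the script dying after that
    many documents, so the final commit is never sent.
    """
    for count, doc in enumerate(docs):
        if fail_after is not None and count >= fail_after:
            return  # crash: adds were sent, commit was not
        yield add_message(doc)
    yield COMMIT_MESSAGE
```

A failed run corresponds to messages_for_run(docs, fail_after=n), which ends
without a commit; the next cron run corresponds to messages_for_run(more_docs),
which ends with one. My question is about what that second commit does with
the adds left over from the first run.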

Lately, Solr has started to perform slowly. Right after Solr is started it
is quick and responds to requests in ~100 ms. Gradually (very gradually) it
degrades to the point where the average response time of the last 10 queries
exceeds 5000 ms, and that is when requests start to pile up. As I am
composing this mail, an optimize command is running, which I hope will help,
but to what extent remains to be seen.

Finally, what happens if the schemas of the master and slave differ (a field
exists on the master that does not exist on the slave)? I expected
replication to show some kind of error, but it completed successfully.

Thanks,

Pranav
