Re: need help on solr search

2012-11-05 Thread jchen2000
Used mm parameter and it works!
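
A minimal sketch of what such a request might look like, assuming a dismax
handler queried over HTTP. The base URL, the field name "props", and the mm
value "50%" are illustrative assumptions, not values from this thread; the
terms ending in "x" stand in for the starred query terms of the earlier example.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url, terms, field, mm):
    """Build a Solr dismax select URL with a minimum-should-match setting."""
    params = {
        "defType": "dismax",     # use the dismax query parser
        "q": " ".join(terms),    # all query terms, space separated
        "qf": field,             # field to search (hypothetical name)
        "mm": mm,                # minimum number of terms that must match
        "fl": "id,score",        # return only the id and the score
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and path; adjust to the actual deployment.
url = build_dismax_url(
    "http://localhost:8983/solr/select",
    ["a1", "a2x", "a3", "a4x", "b1", "b2x", "c2", "d1x", "d2"],
    "props",
    "50%",
)
print(url)
```

Raising mm shrinks the candidate set that has to be scored, which is one
lever to try when tuning latency.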

I'm now preparing a perf test. Please share if anybody has a method for
optimizing dismax queries.

Thanks!

Jeremy


Otis Gospodnetic-5 wrote
 Hi,
 
 Have a look at your solrconfig.xml and look for your default operator. Also
 look at the docs for the mm parameter on the Wiki. Let us know if that does
 it for you.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-on-solr-search-tp4017191p4018397.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: need help on solr search

2012-11-01 Thread jchen2000
Seems like phrase query is close, but not exactly what we needed. Here is an
example assuming just one field:
the doc: a1 a2 a3 b1 b2 c1 c2 c3 c4 d1 d2
the query: a1 a2* a3 a4* b1 b2* c2 d1* d2

Both doc and query terms are ordered. We know that an "a" term should never
match with "b" or "c" terms. Obviously, if we treat all query terms with OR, we
could get the job done, but in a much slower way (also, the returned list
would be too long). So the question is: do we have a way to speed this query
up, or is customization code needed (and if so, how)?

Thanks,
Jeremy



--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-on-solr-search-tp4017191p4017630.html


Re: need help on solr search

2012-11-01 Thread Otis Gospodnetic
Hi,

I don't follow. :(  See inline.

On Thu, Nov 1, 2012 at 2:16 PM, jchen2000 jchen...@yahoo.com wrote:

 Seems like phrase query is close, but not exactly what we needed. Here is an
 example assuming just one field:
 the doc: a1 a2 a3 b1 b2 c1 c2 c3 c4 d1 d2
 the query: a1 a2* a3 a4* b1 b2* c2 d1* d2


Are those *s wildcards? Are they relevant here?  If not, remove them to
keep your example as simple as possible.

 Both doc and query terms are ordered. We know that an "a" term should never
 match with "b" or "c" terms. Obviously, if we treat all query terms with OR, we


What does "should never match with b or c terms" mean?  I see b1, b2*,
c2 in there. There are no "b" or "c" terms anywhere in the example doc or
query, just b1, b2*, c2.


 could get the job done, but in a much slower way (also, the returned list
 would be too long). So the question is: do we have a way to speed this query
 up, or is customization code needed (and if so, how)?


Is that query supposed to match that doc?
Ah, I think I see what you are trying to do/say.

You want ordered term matching (like in a phrase), but you cannot use AND
because you do not want all query terms to be required.  Correct?

If so, would "a1 a2* a3 a4* b1 b2* c2 d1* d2"~someBigSlop work?

I'm not sure if that would be faster than the OR query, but see if that
even does what you are after.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


Re: need help on solr search

2012-11-01 Thread jchen2000
Otis Gospodnetic-5 wrote
 You want ordered term matching (like in a phrase), but you cannot use AND
 because you do not want all query terms to be required.  Correct?

That's exactly right! Actually, none of the query terms is required, but we
need to base the similarity score on how many terms are matched. In addition,
since we have unique prefixes like "a", "b", "c", we guarantee that a1 would
never match anything in group b or c, etc.
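
To make the intended scoring concrete, here is a rough sketch that counts how
many query terms occur in the document, using the example doc and query from
earlier in the thread. The function name is hypothetical, and this is plain
Python for illustration, not how Solr computes its score; the starred terms
are treated as literal terms that are absent from the doc.

```python
def overlap_score(query_terms, doc_terms):
    """Return (count, matched) for the query terms that occur in the doc."""
    doc_set = set(doc_terms)
    # Keep query order; a term counts once per occurrence in the query.
    matched = [term for term in query_terms if term in doc_set]
    return len(matched), matched


doc = "a1 a2 a3 b1 b2 c1 c2 c3 c4 d1 d2".split()
query = "a1 a2* a3 a4* b1 b2* c2 d1* d2".split()

count, matched = overlap_score(query, doc)
print(count, "of", len(query), "query terms matched:", matched)
```

Because every term carries its group prefix, a term such as a1 can only ever
hit "a" terms in the doc, so no cross-group comparison is needed.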


Otis Gospodnetic-5 wrote
 If so, would "a1 a2* a3 a4* b1 b2* c2 d1* d2"~someBigSlop work?

This does not work, because a2* (meaning any term different from a2, not a
wildcard), a4*, etc. do not appear in the doc. A quoted proximity match still
seems to require all query terms to appear.

Thanks,
Jeremy





--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-on-solr-search-tp4017191p4017686.html


Re: need help on solr search

2012-11-01 Thread Otis Gospodnetic
Hi,

Have a look at your solrconfig.xml and look for your default operator. Also
look at the docs for the mm parameter on the Wiki. Let us know if that does
it for you.
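
As a rough illustration of what mm controls, the sketch below models how a
simple minimum-should-match spec maps to a required number of optional
clauses. It is a simplified model with a hypothetical function name, not
Solr's own code: it assumes only the plain integer and percentage forms and
ignores conditional specs such as "3<90%".

```python
def min_should_match(spec, optional_clauses):
    """Required number of matching clauses for a simple mm spec.

    Positive values say how many clauses must match; negative values say
    how many may be missing. Percentages are truncated toward zero.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        amount = abs(percent) * optional_clauses // 100
        required = amount if percent >= 0 else optional_clauses - amount
    else:
        value = int(spec)
        required = value if value >= 0 else optional_clauses + value
    # Never require fewer than 0 or more than the clauses available.
    return max(0, min(required, optional_clauses))


# For the 9-term example query in this thread:
for spec in ("3", "50%", "-2", "-25%"):
    print(spec, "->", min_should_match(spec, 9))
```

With 9 optional terms, "50%" requires 4 matches and "-25%" allows 2 misses,
so 7 must match.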

Otis
--
Performance Monitoring - http://sematext.com/spm



Re: need help on solr search

2012-10-31 Thread Erick Erickson
You need to provide significantly more information than you have.
What are your perf requirements? How big is your data set? What
kinds of searches are you talking about here? How are you
measuring response?

This really feels like an XY problem.

Best
Erick



Re: need help on solr search

2012-10-31 Thread jchen2000
Sure. Here are some more details:
1) We have 30M to 60M documents per node (right now we have 4 nodes,
but that will increase in the future). Documents are relatively small
(around 3K), but 99% of searches must return within 200 ms, as
measured by test drivers sitting right in front of the Solr servers.

2) The throughput requirement right now is about 300 qps. The machines we use
are quite powerful, with 16 cores, lots of memory, and SSD drives. We
haven't really achieved this throughput yet, but search latency is more of an
issue.

3) One property value may overlap with a value in a different property,
but we don't want to match those, so we prefixed terms with the property name.
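
The latency target in 1) can be checked from recorded timings roughly as
follows. This is a sketch with hypothetical helper names, assuming the test
driver collects one latency sample in milliseconds per query; it uses the
nearest-rank percentile definition, which is only one of several conventions.

```python
def percentile(samples_ms, pct):
    """Nearest-rank percentile (pct is an integer from 1 to 100)."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # Integer form of ceil(pct / 100 * n), avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_latency_target(samples_ms, pct=99, limit_ms=200):
    """True if the pct-th percentile latency is within limit_ms."""
    return percentile(samples_ms, pct) <= limit_ms


# Made-up sample data, for illustration only.
good_run = [50] * 98 + [150, 900]
bad_run = [50] * 97 + [250, 300, 900]
print(meets_latency_target(good_run), meets_latency_target(bad_run))
```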

Thanks,
Fang  



--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-on-solr-search-tp4017191p4017341.html


Re: need help on solr search

2012-10-31 Thread Otis Gospodnetic
Hi,

Not sure if I follow your requirements correctly, but it sounds like
you may be looking for phrase queries (as opposed to term/keyword
queries).

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html




need help on solr search

2012-10-30 Thread jchen2000
Hi Solr experts,

Our documents as well as queries consist of 10 properties in a particular
order. Because of stringent requirements on search latency, we grouped them
into only 2 fields with 5 properties each (we may use just 1 field; more than
3 fields seems too slow). Each property value is split into
fixed-length terms (like n-grams, hopefully to save search time) and prefixed
with the property name. What we want is to find out how similar the query is to
the documents by comparing terms. We can't use the default OR operator since
it's slow; we want to take advantage of the prefix and the defined order.

My questions are:
1) Can we do this simply through Solr configuration, and if so, how?
2) If we need to customize the Solr request handler or anything else, where do
we start?
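
For illustration, the term scheme described above might look roughly like the
sketch below. The details are assumptions, since the message does not give
them: overlapping character n-grams of length 3, an underscore between the
property name and the gram, and made-up property names and values.

```python
def prefixed_ngrams(prop, value, n=3):
    """Split a property value into fixed-length terms prefixed by its name."""
    if len(value) <= n:
        # Values no longer than n are kept as a single term.
        return [f"{prop}_{value}"]
    return [f"{prop}_{value[i:i + n]}" for i in range(len(value) - n + 1)]


def field_terms(properties, n=3):
    """Concatenate the terms of several (name, value) pairs, keeping order."""
    terms = []
    for prop, value in properties:
        terms.extend(prefixed_ngrams(prop, value, n))
    return terms


# Hypothetical property names and values.
print(field_terms([("city", "boston"), ("zip", "02134")]))
```

The prefix keeps a gram from one property from matching the same characters
in another property.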

Thanks a lot!

Jeremy



--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-on-solr-search-tp4017191.html