eDismax parser and the mm parameter

2014-03-30 Thread S.L
Hi All, I am planning to use the eDismax query parser in Solr to boost documents that contain a phrase in their fields. The edismax parser has an mm parameter. Since the query typed by the user could be of any length (i.e. >=1), I would like to set the mm value to 1.

Re: eDismax parser and the mm parameter

2014-03-30 Thread Jack Krupansky
1. Yes, the default for mm is 1. 2. It depends on what you are really trying to do - you haven't told us. Generally, mm=1 is equivalent to q.op=OR, and mm=100% is equivalent to q.op=AND. Generally, use q.op unless you really know what you are doing. Generally, the intent of mm is to set the

Re: eDismax parser and the mm parameter

2014-03-30 Thread Ahmet Arslan
Hi, Using mm=1 with (e)dismax is not a good idea. Your users will be unhappy, because there is no coord factor with this parser. The coord factor works like this: typically, a document that contains more of the query's terms will receive a higher score than another document with fewer query terms. I suggest you
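To make the coord idea above concrete, here is a minimal sketch, assuming the classic Lucene-style definition of coord as the fraction of query terms that a document matches. The function name and the example counts are illustrative only and are not taken from Solr.

```python
def coord(matched_terms: int, total_query_terms: int) -> float:
    """Fraction of the query's terms that a document contains.

    Assumption: this mirrors the classic definition (overlap / maxOverlap).
    A scorer that applies it multiplies the raw score by this factor, so
    documents matching more of the query terms are pushed upward.
    """
    if total_query_terms <= 0:
        raise ValueError("total_query_terms must be positive")
    if not 0 <= matched_terms <= total_query_terms:
        raise ValueError("matched_terms must be between 0 and total_query_terms")
    return matched_terms / total_query_terms


# Query "White Siberian Ginseng" has three terms.
full_match = coord(3, 3)     # document contains all three terms
partial_match = coord(1, 3)  # document contains only one of them
print(full_match, round(partial_match, 3))
```

Without such a factor, a document matching a single term is not penalised for the terms it is missing, which appears to be the concern with mm=1 raised above.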

Re: eDismax parser and the mm parameter

2014-03-30 Thread simpleliving...@gmail.com
Thanks Ahmet. So if it's a single-term query like 'Ginseng', what does mm=3 do to the query? I am guessing it would be reduced to 1 automatically in this case.
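A rough way to reason about the question above is to model how a simple mm value is resolved against the number of optional clauses. The sketch below is an approximation written for illustration, not Solr's code. It assumes that a plain integer is a required count, a negative integer is the number of clauses allowed to be missing, a percentage is rounded down, and the result is never allowed to exceed the number of optional clauses. Conditional expressions containing `<` are deliberately not handled.

```python
def effective_mm(spec: str, optional_clauses: int) -> int:
    """Approximate the number of optional clauses that must match.

    Assumptions (see lead-in): simple specs only, results capped to the
    range 0..optional_clauses.
    """
    spec = spec.strip()
    if "<" in spec:
        raise ValueError("conditional mm expressions are not handled in this sketch")

    if spec.endswith("%"):
        percent = int(spec[:-1])
        raw = optional_clauses * percent / 100
        # Negative percentages describe how many clauses may be missing.
        result = optional_clauses + int(raw) if raw < 0 else int(raw)
    else:
        count = int(spec)
        # Negative integers also describe how many clauses may be missing.
        result = optional_clauses + count if count < 0 else count

    # Never require more clauses than the query actually has.
    return max(0, min(result, optional_clauses))


for query in ("Ginseng", "Siberian Ginseng", "Red Siberian Ginseng"):
    clauses = len(query.split())
    print(query, "->", effective_mm("3", clauses))
```

Under these assumptions, mm=3 behaves like "all terms required" for queries of up to three terms, so the single-term query 'Ginseng' ends up requiring just that one term.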

Re: eDismax parser and the mm parameter

2014-03-30 Thread S.L
Thanks Jack! I understand the intent of the mm parameter. My question is that, since the query terms being provided are not of fixed length, I do not know what the mm should look like. For example, Ginseng and Siberian Ginseng are my search terms. The first one can have an mm up to 1 and the second one can have an

Re: eDismax parser and the mm parameter

2014-03-30 Thread Jack Krupansky
It still depends on your objective - which you haven't told us yet. Show us some use cases and detail what your expectations are for each use case. The edismax phrase boosting is probably a lot more useful than messing around with mm. Take a look at pf, pf2, and pf3. See:

Re: Context-aware suggesters in Solr

2014-03-30 Thread Alan Woodward
Thanks Areek. So looking at the code in trunk, exposing it to Solr looks to be pretty straightforward - just extending DocumentDictionaryFactory to take a 'contextField' parameter as well, and passing that on to the DocumentDictionary constructor. I'll give it a go! Thanks again. Alan

SolrCloud OR distributed Solr

2014-03-30 Thread Priti Solanki
Hello Member, Is there any difference between distributed Solr and SolrCloud? Consider that I have products for three countries. I have indexed one country's data, and its index size is 160 GB+. Now we have two other countries, and I am confused! My client asked me what the difference is if we procure

Re: SolrCloud OR distributed Solr

2014-03-30 Thread Gora Mohanty
On 30 March 2014 23:12, Priti Solanki pritiatw...@gmail.com wrote: Hello Member, Is there any difference between distributed solr solrCloud ? You might be confusing the older Solr distributed search with the new SolrCloud: * Older distributed search:

Re: SolrCloud OR distributed Solr

2014-03-30 Thread Erick Erickson
Distributed solr is simply the ability for Solr to take the incoming query and send it to multiple shards, then aggregate the response. Here a shard is a physical partition of a single logical index. The assumption is that you can't fit the entire index on a single machine and still get the
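The scatter-and-aggregate step described above can be pictured with a small sketch. This is a simplified illustration, not Solr's implementation: it assumes each shard has already returned its own top hits as (score, document id) pairs, and the shard names and hits are made up.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, document id)


def merge_shard_results(per_shard: Dict[str, List[Hit]], rows: int) -> List[Hit]:
    """Combine per-shard top hits into one global top-`rows` list, best score first."""
    all_hits = (hit for hits in per_shard.values() for hit in hits)
    return heapq.nlargest(rows, all_hits, key=lambda hit: hit[0])


# Hypothetical responses from three shards of one logical index.
responses = {
    "shard1": [(0.9, "us-1"), (0.4, "us-2")],
    "shard2": [(0.8, "uk-7"), (0.3, "uk-9")],
    "shard3": [(0.5, "in-3")],
}
print(merge_shard_results(responses, rows=3))
```

Each shard only needs to return its own best `rows` hits, because no document outside a shard's local top `rows` can appear in the global top `rows`.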

Re: zookeeper reconnect failure

2014-03-30 Thread Mark Miller
We don’t currently retry, but I don’t think it would hurt much if we did - at least briefly. If you want to file a JIRA issue, that would be the best way to get it in a future release.

Re: SOLR Cloud 4.6 - PERFORMANCE WARNING: Overlapping onDeckSearchers=2

2014-03-30 Thread Rishi Easwaran
RAM shouldn't be a problem. I have a box with 144GB RAM, running 12 instances with 4GB Java heap each. There are 9 instances writing to 1TB of SSD disk space. The other 3 are writing to SATA drives, and have autosoftcommit disabled.

Re: SOLR Cloud 4.6 - PERFORMANCE WARNING: Overlapping onDeckSearchers=2

2014-03-30 Thread Shawn Heisey
On 3/30/2014 2:59 PM, Rishi Easwaran wrote: RAM shouldn't be a problem. I have a box with 144GB RAM, running 12 instances with 4GB Java heap each. There are 9 instances writing to 1TB of SSD disk space. Other 3 are writing to SATA drives, and have autosoftcommit disabled. This brought up

Re: eDismax parser and the mm parameter

2014-03-30 Thread S.L
Jack, thanks again. I am searching Chinese medicine documents. As in the example I gave earlier, a user can search for Ginseng, Siberian Ginseng, or Red Siberian Ginseng. I certainly want to use the pf parameter (which is not driven by the mm parameter); however, for giving a higher score to documents

Re: eDismax parser and the mm parameter

2014-03-30 Thread Jack Krupansky
If you use pf, pf2, and pf3 and boost appropriately, the effects of mm will be dwarfed. The general goal is to assure that the top documents really are the best, not to necessarily limit the total document count. Focusing on the latter could be a real waste of time. It's still not clear why

Re: eDismax parser and the mm parameter

2014-03-30 Thread S.L
Jack, I misstated the problem. I am not using the OR operator as the default now (now that I think about it, it does not make sense to use the default operator OR along with the mm parameter). The reason I want to use pf and mm in conjunction is because of my understanding of the edismax parser and

Re: eDismax parser and the mm parameter

2014-03-30 Thread Jack Krupansky
The mm parameter is really only relevant when the default operator is OR or explicit OR operators are used. Again: Please provide your use case examples and your expectations for each use case. It really doesn't make a lot of sense to prematurely focus on a solution when you haven't clearly

Re: eDismax parser and the mm parameter

2014-03-30 Thread S.L
Thanks Jack, my use cases are as follows. 1. Search for Ginseng: everything related to ginseng should show up. 2. Search for White Siberian Ginseng: results with the whole phrase show up first, followed by results with 2 words from the phrase, followed by results with a single word from the phrase. 3. Fuzzy
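One possible way to approach use cases 1 and 2 is to let any single term match (mm=1) and rely on the phrase-boost parameters suggested earlier in the thread (pf, pf2, pf3) to push fuller phrase matches upward. The sketch below only builds the request URL. The base URL, the field names `title` and `text`, and all boost values are hypothetical placeholders that would need tuning against real data. The fuzzy use case is not covered.

```python
from urllib.parse import urlencode


def build_edismax_url(base_url: str, user_query: str) -> str:
    """Build a select URL that combines mm=1 with phrase boosting."""
    params = {
        "defType": "edismax",
        "q": user_query,
        "qf": "title text",         # hypothetical fields searched term by term
        "mm": "1",                  # a single matching term is enough to be returned
        "pf": "title^20 text^10",   # boost documents containing the whole query as a phrase
        "pf3": "title^15 text^8",   # boost documents containing three-word runs from the query
        "pf2": "title^10 text^5",   # boost documents containing two-word runs from the query
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr endpoint.
url = build_edismax_url(
    "http://localhost:8983/solr/collection1/select",
    "White Siberian Ginseng",
)
print(url)
```

With this shape, a one-word query such as Ginseng simply gets no phrase boost, so the same parameters can be sent regardless of query length.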

how to index 20 MB plain-text xml

2014-03-30 Thread Floyd Wu
I have many plain-text XML files that I convert to the Solr XML format. But every time I send them to Solr, I hit an OOM exception. How do I configure Solr to handle these big XML files? Please point me to a way. Thanks, floyd

Re: how to index 20 MB plain-text xml

2014-03-30 Thread Alexandre Rafalovitch
Without digging too deep into why exactly this is happening, here are the general options: 0. Are you actually committing? Check the messages in the logs and see if the records show up when you expect them to. 1. Are you actually trying to feed a 20 MB file to Solr? Maybe it's HTTP buffer that's