Re: MoreLikeThisHandler with mltipli input documents

2015-09-30 Thread Szűcs Roland
Hi Alessandro, Exactly. The response time varies, but let's take another concrete example. This is my call: http://localhost:8983/solr/bandwpl/mlt?q=id:10812=id This is my result: { "responseHeader":{ "status":0, "QTime":6232}, "response":{"numFound":4564,"start":0,"docs":[ {

Re: MoreLikeThisHandler with mltipli input documents

2015-09-30 Thread Szűcs Roland
Hi Alessandro, You are right. I forgot to mention one important factor. For 3000 Hungarian e-books the approach you mentioned is absolutely fine, as the response time is some 0.7 sec. But when I use the same MLT for 5600 Polish e-books the response time is 7 sec, which is definitely not acceptable

Re: MoreLikeThisHandler with mltipli input documents

2015-09-30 Thread Alessandro Benedetti
I still don't see why you quote the number of documents... If you have 5600 Polish books, but you use MLT only when you land on the page of a specific book... I think I am still missing the point! Does MLT on one Polish book take 7 seconds? 2015-09-30 9:10 GMT+01:00 Szűcs Roland

Re: MoreLikeThisHandler with mltipli input documents

2015-09-30 Thread Szűcs Roland
Hello Upayavira, We use the AJAX call, and it can work when it takes only a few seconds (even the 7 sec can be acceptable in this case), as customers first focus on the product page, and only if they are not satisfied with the e-book will they need the offer. I have just started to worry about what will

Re: MoreLikeThisHandler with mltipli input documents

2015-09-30 Thread Upayavira
Could you do the MLT as a separate (AJAX) request? The results would appear a little afterwards, while the user is already reading the page. Or, you could do offline clustering, in which case, overnight, you compare every document with every other, using a (likely non-Solr) clustering algorithm, and store
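
To illustrate the offline option, here is a minimal sketch of "compare every document with every other": it builds a term-frequency vector per document, computes pairwise cosine similarity, and keeps the top k neighbours for each id. This is not a Solr feature and not what anyone in the thread actually ran; `nearest_neighbours`, the whitespace tokenizer, and the sample texts are assumptions made for illustration only.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbours(docs: Dict[str, str], k: int = 3) -> Dict[str, List[str]]:
    """For every document id, return the ids of its k most similar documents.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    and the comparison is O(n^2), which is only reasonable for a nightly job
    over a few thousand documents.
    """
    vectors = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    result: Dict[str, List[str]] = {}
    for doc_id, vec in vectors.items():
        scored = [
            (cosine(vec, other_vec), other_id)
            for other_id, other_vec in vectors.items()
            if other_id != doc_id
        ]
        # Highest similarity first; ties are broken by id for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[doc_id] = [other_id for _, other_id in scored[:k]]
    return result
```

The resulting mapping could be stored alongside each document, so that the product page only needs a lookup instead of a live similarity query.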

Re: MoreLikeThisHandler with mltipli input documents

2015-09-30 Thread Alessandro Benedetti
This query time is still suspicious... Have you tried to play with the MLT params? Min term frequency? Min doc freq? You can reduce the number of terms in the query. Parameter: mlt.qf. Description: Query fields and their boosts, using the same format as that used by the DisMaxRequestHandler. These fields must
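
A rough sketch of how such a tuned request could be assembled. The parameter names `mlt.fl`, `mlt.mintf`, `mlt.mindf` and `mlt.maxqt` are standard Solr MoreLikeThis parameters; the helper `build_mlt_url`, the field list, and the specific values are assumptions chosen for illustration, not settings taken from the thread.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_id, fields, min_tf=2, min_df=5, max_query_terms=25):
    """Build a MoreLikeThisHandler URL for one source document.

    Hypothetical helper. Raising min_tf/min_df or lowering max_query_terms
    shrinks the generated query, which is the tuning suggested above.
    """
    params = {
        "q": f"id:{doc_id}",          # the source document
        "mlt.fl": ",".join(fields),   # fields used for similarity
        "mlt.mintf": min_tf,          # minimum term frequency in the source doc
        "mlt.mindf": min_df,          # minimum document frequency in the index
        "mlt.maxqt": max_query_terms, # maximum number of query terms
        "fl": "id",                   # return only ids
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Core name and document id are the ones quoted in the thread; the
    # tuning values are illustrative only.
    print(build_mlt_url("http://localhost:8983/solr/bandwpl/mlt", 10812,
                        ["title", "content"], min_tf=3, min_df=10,
                        max_query_terms=10))
```

Comparing `QTime` across a few such variants would show whether the number of query terms is what drives the slow responses.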

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Upayavira
Let's take a step back. So, you have 3000 or so docs, and you want to know which documents are similar to these. Why do you want to know this? What feature do you need to build that will use that information? Knowing this may help us to arrive at the right technology for you. For example, you

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Szűcs Roland
Hi Alessandro, My original goal was to get offline suggestions based on content similarity for every e-book we have. We wanted to run a bulk more-like-this calculation in the evening, when the usage of our site is low, and when we submit a new e-book. Real-time more like this can take a while as we

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Upayavira
If MoreLikeThis is slow for large documents that are indexed, have you enabled term vectors on the similarity fields? Basically, what more like this does is this: * decide on what terms in the source doc are "interesting", and pick the 25 most interesting ones * build and execute a boolean query
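
The two steps described above can be sketched roughly as follows. This is a simplified illustration of the idea (score terms by tf-idf, keep the top 25, OR them together), not Lucene's actual implementation; `interesting_terms`, `boolean_query`, the idf formula, and the whitespace tokenizer are assumptions made for this example.

```python
import math
from collections import Counter


def interesting_terms(source_text, doc_freq, num_docs,
                      max_terms=25, min_tf=2, min_df=5):
    """Pick the highest-scoring terms of a source document.

    doc_freq maps a term to the number of indexed documents containing it.
    Terms that are too rare in the source (min_tf) or in the index (min_df)
    are skipped; the rest are ranked by a simple tf * idf score.
    """
    tf = Counter(source_text.lower().split())
    scored = []
    for term, freq in tf.items():
        df = doc_freq.get(term, 0)
        if freq < min_tf or df < max(min_df, 1):
            continue
        idf = math.log(num_docs / df) + 1.0
        scored.append((freq * idf, term))
    # Best score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:max_terms]]


def boolean_query(field, terms):
    """Combine the selected terms into one OR query on a single field."""
    return " OR ".join(f"{field}:{term}" for term in terms)
```

With term vectors enabled, the term frequencies of the source document can be read directly rather than recomputed by re-analysing the stored text, which is why the question about term vectors matters for large documents.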

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Alessandro Benedetti
Hi Roland, what is your exact requirement? Do you basically want to build a "description" for a set of documents and then find documents in the index that are similar to this description? By default, based on my experience (and on the code), this is the entry point for the Lucene More Like This: >

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Szűcs Roland
Hello Upayavira, Thanks for dealing with my issue. I have already applied termVectors=true to all fields involved in the more-like-this calculation. I have just 3,000 documents, each of them represented by a relatively big term vector with more than 20,000 unique terms. If I run the more like

MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Roland Szűcs
Hi all, Is it possible to feed multiple Solr ids to a MoreLikeThisHandler? false details title,content 4 title^12 content^1 2 10 true json true When I call this: http://localhost:8983/solr/bandwhu/mlt?q=id:8=id it works fine. Is there any way to have a kind of "bulk" call of more like

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Alessandro Benedetti
Hi Roland, you said "The main goal is that when a customer is on the product page". But if you are on a product page, I guess you have the product id. If you have the product id, you can simply execute the MLT request with the single doc id as input. Why do you need to calculate beforehand?

Re: MoreLikeThisHandler with mltipli input documents

2015-09-29 Thread Szűcs Roland
Hello Upayavira, The main goal is that when a customer is on the product page of an e-book and does not like it for some reason, I want to immediately offer him or her alternative e-books on the same topic. If I expect the customer to click on a button like "similar e-books", I lose half of them as