Re: MLT Using a Query created in a different index
Thanks for that, Jack. So it's fair to say that if both the source and target corpora are large and diverse, then the impact of using a different index to create the query would be negligible.

P.

On 04/04/2013 06:49 PM, Jack Krupansky wrote:

The heart of MLT is examining the top result of a query (or maybe more than one), identifying the top terms from the top document(s), and then simply using those top terms for a subsequent query. The term ranking would of course depend on term frequency and other relevancy considerations - for the corpus of the original query. A rich query corpus will give great results, a weak corpus will give weak results - no matter how rich or weak the final target corpus is.

OTOH, if the target corpus really is representative of the source corpus, then results should be either good or terrible - the selected/query document may not have any representation in the target corpus.

-- Jack Krupansky

-----Original Message-----
From: Peter Lavin
Sent: Thursday, April 04, 2013 1:06 PM
To: java-user@lucene.apache.org
Subject: MLT Using a Query created in a different index

Dear Users,

I am doing some research where Lucene is integrated into agent technology. Part of this work involves running an MLT query against an index when the query was not created from a document in that index (i.e. the query is created, serialised and sent to the remote agent).

Can anyone point me towards any information on what the potential impact of doing this would be?

I'm assuming that if both indexes have similar sets of documents, the impact would be negligible. But what, for example, would be the impact of creating an MLT query from an index with only one or two documents for use in an index with many (say 100+) documents?

with thanks,
Peter

--
with best regards,
Peter Lavin,
PhD Candidate,
CAG - Computer Architecture Grid Research Group,
Lloyd Institute, 005,
Trinity College Dublin, Ireland.
+353 1 8961536

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: MLT Using a Query created in a different index
In a statistical sense, for the majority of documents, yes, but you could probably find quite a few outlier examples where the results from A to B or from B to A are significantly or even completely different, or even non-existent.

-- Jack Krupansky

-----Original Message-----
From: Peter Lavin
Sent: Friday, April 05, 2013 3:49 AM
To: java-user@lucene.apache.org
Subject: Re: MLT Using a Query created in a different index

Thanks for that, Jack. So it's fair to say that if both the source and target corpora are large and diverse, then the impact of using a different index to create the query would be negligible.

P.
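The "non-existent" outlier case can be illustrated with a small Python sketch. This is not Lucene code: `rank_by_overlap`, the whitespace tokenizer and the toy `INDEX_B` corpus are all assumptions made for illustration, and a real MLT query would be scored by the target index's similarity rather than by a plain count of shared terms.

```python
def rank_by_overlap(query_terms, target_corpus):
    """Rank target documents by how many distinct query terms they contain.

    Documents sharing no term with the query are omitted, much as they
    would not match an OR query built from those terms.
    """
    terms = set(query_terms)
    hits = []
    for doc_id, text in enumerate(target_corpus):
        shared = len(terms & set(text.lower().split()))
        if shared:
            hits.append((doc_id, shared))
    # Most shared terms first; ties broken by document position.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Hypothetical target index B.
INDEX_B = [
    "lucene index segments and merges",
    "query parsing in lucene",
    "agent platforms and messaging",
]

# Terms taken from a document whose topic is also covered in B.
typical = rank_by_overlap(["lucene", "index"], INDEX_B)
# Terms taken from a document whose topic B does not cover at all.
outlier = rank_by_overlap(["genome", "sequencing"], INDEX_B)

print(typical)  # [(0, 2), (1, 1)]
print(outlier)  # []
```

However carefully the terms were chosen in index A, the second query returns nothing in B, because B contains none of its vocabulary.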
MLT Using a Query created in a different index
Dear Users,

I am doing some research where Lucene is integrated into agent technology. Part of this work involves running an MLT query against an index when the query was not created from a document in that index (i.e. the query is created, serialised and sent to the remote agent).

Can anyone point me towards any information on what the potential impact of doing this would be?

I'm assuming that if both indexes have similar sets of documents, the impact would be negligible. But what, for example, would be the impact of creating an MLT query from an index with only one or two documents for use in an index with many (say 100+) documents?

with thanks,
Peter

--
with best regards,
Peter Lavin,
PhD Candidate,
CAG - Computer Architecture Grid Research Group,
Lloyd Institute, 005,
Trinity College Dublin, Ireland.
+353 1 8961536
Re: MLT Using a Query created in a different index
The heart of MLT is examining the top result of a query (or maybe more than one), identifying the top terms from the top document(s), and then simply using those top terms for a subsequent query. The term ranking would of course depend on term frequency and other relevancy considerations - for the corpus of the original query. A rich query corpus will give great results, a weak corpus will give weak results - no matter how rich or weak the final target corpus is.

OTOH, if the target corpus really is representative of the source corpus, then results should be either good or terrible - the selected/query document may not have any representation in the target corpus.

-- Jack Krupansky
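To make the dependence on the source corpus concrete, here is a minimal Python sketch of tf-idf term selection. It is not Lucene's MoreLikeThis implementation: `top_terms`, the whitespace tokenizer, the smoothed idf formula and the toy corpora are assumptions chosen for illustration. It ranks the terms of one document using document frequencies from the index the query is built in, first with a one-document index and then with a slightly richer one.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lower-cased whitespace tokens, no stop-word removal.
    return text.lower().split()


def top_terms(doc, source_corpus, k=2):
    """Return the k terms of `doc` with the highest tf * idf.

    Document frequencies come only from `source_corpus`, i.e. the index
    the query is created in, never from the index it is later run against.
    """
    n = len(source_corpus)
    df = Counter()
    for text in source_corpus:
        df.update(set(tokenize(text)))
    tf = Counter(tokenize(doc))

    def score(term):
        # Assumed smoothed idf; other weightings would behave similarly.
        idf = 1.0 + math.log(n / (df[term] + 1))
        return tf[term] * idf

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(tf, key=lambda t: (-score(t), t))[:k]


DOC = "lucene agent sends the query to the remote lucene index"

# Source index holding only the query document itself.
TINY = [DOC]

# Source index with a few more (hypothetical) documents.
RICH = [
    DOC,
    "the user sends the form to the server",
    "the index of the book lists the chapters",
    "the agent reads the report",
    "the query planner picks the index",
]

print(top_terms(DOC, TINY))  # ['lucene', 'the']
print(top_terms(DOC, RICH))  # ['lucene', 'remote']
```

With a single document every term has the same idf, so the ranking collapses to raw term frequency and a common word such as "the" makes it into the query. With more documents the rarer "remote" displaces it. Lucene's MoreLikeThis also exposes thresholds such as setMinDocFreq and setMinTermFreq; in a one- or two-document index it is worth checking that these do not filter out every candidate term.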