On Tue, 2010-10-26 at 15:48 +0200, Ron Mayer wrote:
And a third potential reason - it's arguably a feature instead of a bug
for some applications. Depending on how I organize my shards, "give me
the most relevant document from each shard for this search" seems like
it could be useful.
You can
Andrzej Bialecki wrote:
On 2010-10-25 11:22, Toke Eskildsen wrote:
On Thu, 2010-07-22 at 04:21 +0200, Li Li wrote:
But it shows a problem of distributed search without a common idf.
A doc will get a different score in different shards.
Bingo.
I really don't understand why this fundamental problem
On Thu, 2010-07-22 at 04:21 +0200, Li Li wrote:
But it shows a problem of distributed search without a common idf.
A doc will get a different score in different shards.
Bingo.
I really don't understand why this fundamental problem with sharding
isn't mentioned more often. Every time the advice use
On 2010-10-25 11:22, Toke Eskildsen wrote:
On Thu, 2010-07-22 at 04:21 +0200, Li Li wrote:
But it shows a problem of distributed search without a common idf.
A doc will get a different score in different shards.
Bingo.
I really don't understand why this fundamental problem with sharding
isn't
On Mon, 2010-10-25 at 11:50 +0200, Andrzej Bialecki wrote:
* there is an exact solution to this problem, namely to make two
distributed calls instead of one (first call to collect per-shard IDFs
for given query terms, second call to submit a query rewritten with the
global IDF-s). This
On 2010-10-25 13:37, Toke Eskildsen wrote:
On Mon, 2010-10-25 at 11:50 +0200, Andrzej Bialecki wrote:
* there is an exact solution to this problem, namely to make two
distributed calls instead of one (first call to collect per-shard IDFs
for given query terms, second call to submit a query
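The two-call approach quoted above can be sketched roughly as follows. This is only an illustration, not Solr code: the ShardStats holder and the method names are made up for the example, and the formula is modelled on Lucene's classic IDF, 1 + ln(numDocs / (docFreq + 1)). The first call would collect numDocs and docFreq for a term from each shard; summing them gives one IDF that the second call could apply on every shard.

```java
import java.util.Arrays;
import java.util.List;

public class GlobalIdfSketch {

    /** Per-shard statistics for one term (hypothetical holder, not a Solr class). */
    static final class ShardStats {
        final long numDocs; // documents in the shard
        final long docFreq; // documents in the shard that contain the term

        ShardStats(long numDocs, long docFreq) {
            this.numDocs = numDocs;
            this.docFreq = docFreq;
        }
    }

    /** IDF modelled on the classic Lucene formula: 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(long numDocs, long docFreq) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    /** Sums the per-shard counts and computes a single IDF shared by all shards. */
    static double globalIdf(List<ShardStats> shards) {
        long numDocs = 0;
        long docFreq = 0;
        for (ShardStats s : shards) {
            numDocs += s.numDocs;
            docFreq += s.docFreq;
        }
        return idf(numDocs, docFreq);
    }

    public static void main(String[] args) {
        List<ShardStats> shards = Arrays.asList(
                new ShardStats(1000, 9),    // the term is rare on this shard
                new ShardStats(1000, 499)); // the term is common on this shard
        for (ShardStats s : shards) {
            System.out.println("local idf  = " + idf(s.numDocs, s.docFreq));
        }
        System.out.println("global idf = " + globalIdf(shards));
    }
}
```

With purely local statistics the same term gets a much higher IDF on the shard where it is rare, which is why the same document can score differently depending on where it lives. The price of the global value is the extra round trip before the real query.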
--
View this message in context:
http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp983533p995407.html
Sent from the Solr - User mailing list archive at Nabble.com.
Where is the link to this patch?
2010/7/24 Yonik Seeley yo...@lucidimagination.com:
On Fri, Jul 23, 2010 at 2:23 PM, MitchK mitc...@web.de wrote:
why do we not send the output of TermsComponent of every node in the
cluster to a Hadoop instance?
Since TermsComponent does the map-part of the
The Solr version I used is 1.4.
2010/7/26 Li Li fancye...@gmail.com:
Where is the link to this patch?
2010/7/24 Yonik Seeley yo...@lucidimagination.com:
On Fri, Jul 23, 2010 at 2:23 PM, MitchK mitc...@web.de wrote:
why do we not send the output of TermsComponent of every node in the
distributed IDF
(as in the mentioned JIRA issue) to normalize the scoring of your results.
But the problem mentioned in this mailing list post has nothing to do
with that...
Regards
- Mitch
On Fri, Jul 23, 2010 at 2:23 PM, MitchK mitc...@web.de wrote:
why do we not send the output of TermsComponent of every node in the
cluster to a Hadoop instance?
Since TermsComponent does the map-part of the map-reduce concept, Hadoop
only needs to reduce the stuff. Maybe we even do not need
other suggestions?
-Yonik
http://www.lucidimagination.com
That only works if the docs are exactly the same - they may not be.
Ahm, what? Why? If the uniqueID is the same, the docs *should* be the same,
shouldn't they?
On Fri, Jul 23, 2010 at 2:40 PM, MitchK mitc...@web.de wrote:
That only works if the docs are exactly the same - they may not be.
Ahm, what? Why? If the uniqueID is the same, the docs *should* be the same,
shouldn't they?
Documents aren't supposed to be duplicated across shards... so the
presence
As the comments suggest, it's not a bug, but just the best we can do
for now since our priority queues don't support removal of arbitrary
elements. I guess we could rebuild the current priority queue if we
detect a duplicate, but that will have an obvious performance impact.
Any other
: As the comments suggest, it's not a bug, but just the best we can do
: for now since our priority queues don't support removal of arbitrary
FYI: I updated the DistributedSearch wiki to be more clear about this --
it previously didn't make it explicitly clear that docIds were supposed to
be
in QueryComponent.mergeIds. It removes any document whose uniqueKey
duplicates that of another document. In the current implementation, it
uses the first one encountered.
String prevShard = uniqueDoc.put(id, srsp.getShard());
if (prevShard != null) {
// duplicate detected
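The effect of that check can be shown with a small standalone sketch. It is not the Solr source: the Hit type and the mergeFirstSeen method are hypothetical, and only the uniqueDoc.put(...) idiom is taken from the fragment above. It assumes hits are processed in the order the shard responses arrive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FirstSeenMergeSketch {

    /** One hit returned by a shard (hypothetical type for illustration). */
    static final class Hit {
        final String id;
        final String shard;
        final float score;

        Hit(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    /** Merges hits in arrival order; a repeated id is dropped regardless of its score. */
    static List<Hit> mergeFirstSeen(List<Hit> hitsInArrivalOrder) {
        Map<String, String> uniqueDoc = new HashMap<>();
        List<Hit> merged = new ArrayList<>();
        for (Hit hit : hitsInArrivalOrder) {
            // put returns the shard recorded earlier for this id, or null if the id is new
            String prevShard = uniqueDoc.put(hit.id, hit.shard);
            if (prevShard != null) {
                // duplicate detected: the copy that was seen first stays in the result
                continue;
            }
            merged.add(hit);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("doc_X", "shard_A", 0.2f),
                new Hit("doc_X", "shard_B", 0.9f));
        for (Hit h : mergeFirstSeen(hits)) {
            System.out.println(h.id + " from " + h.shard + " score=" + h.score);
        }
    }
}
```

In this sketch the lower-scoring copy from shard_A survives simply because it arrived first, which is the behaviour being discussed in the thread.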
Li Li,
this is the intended behaviour, not a bug.
Otherwise you could get back the same record several times in a
response, which may not be intended by the user.
Kind regards,
- Mitch
you can't prevent this without custom coding or
making a document's occurrence unique.
Kind regards,
- Mitch
How about sorting over the score? Would that be possible?
On Jul 21, 2010, at 12:13 AM, Li Li wrote:
in QueryComponent.mergeIds. It removes any document whose uniqueKey
duplicates that of another document. In the current implementation, it
uses the first one encountered.
String prevShard =
sees
doc_X first at shard_A and ignores it at shard_B. That means the doc
might show up on page 10 of the results, although it *should* appear
on page 1 or 2.
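One way to avoid that, sketched here purely as an illustration and not as anything Solr does, is to keep the highest-scoring copy of each id and sort by score afterwards. The Hit type and mergeBestScore are hypothetical names. The sketch also assumes that scores from different shards are comparable, which, as noted elsewhere in this thread, is not strictly true without a common idf.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BestScoreMergeSketch {

    /** One hit returned by a shard (hypothetical type for illustration). */
    static final class Hit {
        final String id;
        final String shard;
        final float score;

        Hit(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    /** Keeps the highest-scoring copy of each id, then orders the result by descending score. */
    static List<Hit> mergeBestScore(List<Hit> hits) {
        Map<String, Hit> best = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit prev = best.get(hit.id);
            if (prev == null || hit.score > prev.score) {
                best.put(hit.id, hit); // first copy, or a better-scoring duplicate
            }
        }
        List<Hit> merged = new ArrayList<>(best.values());
        merged.sort((a, b) -> Float.compare(b.score, a.score));
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("doc_X", "shard_A", 0.2f),
                new Hit("doc_Y", "shard_A", 0.5f),
                new Hit("doc_X", "shard_B", 0.9f));
        for (Hit h : mergeBestScore(hits)) {
            System.out.println(h.id + " from " + h.shard + " score=" + h.score);
        }
    }
}
```

The trade-off is that all candidate hits have to be held before sorting, instead of being pushed through a bounded priority queue as they arrive.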
Kind regards,
- Mitch