Having multiple cores point to the same index is, except for
special circumstances where one of the cores is guaranteed to
be read only, a Bad Thing.

So it sounds like you've found your issue...
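
For anyone who wants to check their own setup, here is a minimal sketch (not an official Solr tool) that parses a legacy-style solr.xml and reports cores sharing the same instanceDir. The <solr>/<cores> wrapper, the sample attribute values, and the helper name find_shared are assumptions made up for illustration; only the <core> attributes come from the config quoted below. If no dataDir is set explicitly, the index normally lives under instanceDir/data, so a shared instanceDir usually means a shared index.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical solr.xml modelled on the <core> entries quoted in this thread.
SOLR_XML = """<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="1_collection" shard="shard2" instanceDir="1_collection/"
          collection="1_collection"/>
    <core name="1_collection" shard="shard4" instanceDir="1_collection/"
          collection="1_collection"/>
  </cores>
</solr>"""


def find_shared(xml_text, attr="instanceDir"):
    """Return {attribute value: [shard labels]} for values used by 2+ cores."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for core in root.iter("core"):
        value = core.get(attr)
        if value is None:
            continue
        # Treat "dir/" and "dir" as the same location.
        label = core.get("shard") or core.get("name")
        groups[value.rstrip("/")].append(label)
    return {value: labels for value, labels in groups.items() if len(labels) > 1}


if __name__ == "__main__":
    for path, shards in find_shared(SOLR_XML).items():
        print(f"instanceDir {path!r} is shared by: {', '.join(shards)}")
```

Calling find_shared(SOLR_XML, attr="name") runs the same check on core names, which as far as I know should also be unique within one solr.xml.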

Best
Erick

On Mon, May 6, 2013 at 4:44 AM, Iker Mtnz. Apellaniz
<mitxin...@gmail.com> wrote:
> Thanks Erick,
>   I think we found the problem. When defining the cores for the two shards,
> we pointed both of them at the same instanceDir, like this:
> <core schema="schema.xml" shard="shard2" instanceDir="1_collection/"
> name="1_collection" config="solrconfig.xml" collection="1_collection"/>
> <core schema="schema.xml" shard="shard4" instanceDir="1_collection/"
> name="1_collection" config="solrconfig.xml" collection="1_collection"/>
>
>   Each shard should have its own folder, so the final configuration should
> be like this:
> <core schema="schema.xml" shard="shard2" instanceDir="1_collection/shard2/"
> name="1_collection" config="solrconfig.xml" collection="1_collection"/>
> <core schema="schema.xml" shard="shard4" instanceDir="1_collection/shard4/"
> name="1_collection" config="solrconfig.xml" collection="1_collection"/>
>
> Can anyone confirm this?
>
> Thanks,
>   Iker
>
>
> 2013/5/4 Erick Erickson <erickerick...@gmail.com>
>
>> Sounds like you've explicitly routed the same document to two
>> different shards. Document replacement only happens locally to a
>> shard, so the fact that you have documents with the same ID on two
>> different shards is why you're getting duplicate documents.
>>
>> Best
>> Erick
>>
>> On Fri, May 3, 2013 at 3:44 PM, Iker Mtnz. Apellaniz
>> <mitxin...@gmail.com> wrote:
>> > We are currently using version 4.2.
>> > We have run tests with a single document, and it gives us a document
>> > count of 2. But if we force it to the shard on the first machine, the one
>> > with a single shard, the count is 1.
>> > I've tried the distrib=false parameter: it gives us no duplicate
>> > documents, but the same document appears to be in two different shards.
>> >
>> > Finally, about the separate directories: we have only one directory for
>> > the data on each physical machine and collection, and I don't see any
>> > subfolder for the different shards.
>> >
>> > Is it possible that we have something wrong with the dataDir
>> > configuration for using multiple shards on one machine?
>> >
>> > <dataDir>${solr.data.dir:}</dataDir>
>> > <directoryFactory name="DirectoryFactory"
>> > class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
>> >
>> >
>> >
>> > 2013/5/3 Erick Erickson <erickerick...@gmail.com>
>> >
>> >> What version of Solr? The custom routing stuff is quite new so
>> >> I'm guessing 4x?
>> >>
>> >> But this shouldn't be happening. The actual index data for the
>> >> shards should be in separate directories, they just happen to
>> >> be on the same physical machine.
>> >>
>> >> Try querying each one with &distrib=false to see the counts
>> >> from single shards, that may shed some light on this. It vaguely
>> >> sounds like you have indexed the same document to both shards
>> >> somehow...
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Fri, May 3, 2013 at 5:28 AM, Iker Mtnz. Apellaniz
>> >> <mitxin...@gmail.com> wrote:
>> >> > Hi,
>> >> >   We currently have a SolrCloud implementation running 5 shards on 3
>> >> > physical machines: the first machine has shard 1, the second machine
>> >> > shards 2 & 4, and the third shards 3 & 5. We noticed that while
>> >> > querying, numFoundDocs decreased when we increased the start param.
>> >> >   After some investigation we found that the documents in shards 2 to 5
>> >> > were being counted twice. Querying shard 2 gives you back the results
>> >> > for shards 2 & 4, and the same thing happens for shards 3 & 5. Our
>> >> > guess is that the physical index for shards 2 & 4 is shared, so the
>> >> > shards don't know which part of it belongs to each one.
>> >> >   The uniqueKey is correctly defined, and we have tried using a shard
>> >> > prefix (shard1!docID).
>> >> >
>> >> >   Is there any way to solve this problem when a single physical machine
>> >> > hosts several shards?
>> >> >   Is it a "real" problem, or does it just affect facets & numResults?
>> >> >
>> >> > Thanks
>> >> >    Iker
>> >> >
>> >> > --
>> >> > /** @author imartinez*/
>> >> > Person me = *new* Developer();
>> >> > me.setName(*"Iker Mtz de Apellaniz Anzuola"*);
>> >> > me.setTwit("@mitxino77 <https://twitter.com/mitxino77>");
>> >> > me.setLocations({"St Cugat, Barcelona", "Kanpezu, Euskadi", "*, World"]});
>> >> > me.setSkills({*SoftwareDeveloper, Curious, AmateurCook*});
>> >> > me.setWebs({*urbasaabentura.com, ikertxef.com*});
>> >> > *return* me;
>> >>
>>
