Using LukeRequestHandler, I see:

<int name="numDocs">7725</int>
<int name="maxDoc">28099</int>
<int name="numTerms">758826</int>
<long name="version">1266355690710</long>
<bool name="optimized">false</bool>
<bool name="current">true</bool>
<bool name="hasDeletions">true</bool>
<str name="directory">org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/opt/solr/myindex/data/index</str>
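
If I read these numbers correctly, maxDoc still counts documents that are marked as deleted but not yet merged away, while numDocs counts only live documents. A quick sketch of that arithmetic in Python, using only the two values reported above (the variable names are mine, not part of the Luke output):

```python
# Values copied from the LukeRequestHandler output above.
num_docs = 7725    # live (searchable) documents
max_doc = 28099    # live documents plus deleted ones still present in the segments

# Documents marked as deleted but not yet purged by a merge or optimize.
deleted_docs = max_doc - num_docs
print(deleted_docs)
```

So a large number of deletions did happen in the index, which is consistent with hasDeletions being true.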

I will copy the index to my local machine so I can open it with Luke. Should
I look for something specific?

Thanks!


ANKITBHATNAGAR wrote:
> 
> Try inspecting your index with luke
> 
> 
> Ankit
> 
> 
> -----Original Message-----
> From: Pascal Dimassimo [mailto:thesuper...@hotmail.com] 
> Sent: Friday, February 19, 2010 2:22 PM
> To: solr-user@lucene.apache.org
> Subject: Documents disappearing
> 
> 
> Hi,
> 
> I have encountered a situation that I can't explain. We are indexing
> documents that are often duplicates, so we activated deduplication like this:
> 
> <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
>       <bool name="enabled">true</bool>
>       <bool name="overwriteDupes">true</bool>
>       <str name="signatureField">signature</str>
>       <str name="fields">title,text</str>
>       <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
> </processor>
> 
> What I can't explain is that when I look at the document count in the log,
> I see documents disappearing.
> 
> 11:24:23 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=0 status=0 QTime=0
> 14:04:24 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=4065 status=0 QTime=10
> 14:17:07 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=6499 status=0 QTime=42
> 14:25:42 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7629 status=0 QTime=1
> 14:47:12 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=10140 status=0 QTime=12
> 15:17:22 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=10861 status=0 QTime=13
> 15:47:31 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=9852 status=0 QTime=19
> 16:17:42 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=8112 status=0 QTime=13
> 16:38:17 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=10
> 16:39:10 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=1
> 16:47:40 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=46
> 16:51:24 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=74
> 17:02:13 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=102
> 17:17:41 INFO  - [myindex] webapp=null path=null
> params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=8
> 
> 11:24 was the time at which Solr was started that day. Around 13:30, we
> started indexing.
> 
> At some point during indexing, I noticed that a batch of documents was
> resent (i.e., documents with the same id field were sent again to the index).
> And according to the log, NO delete was sent to Solr.
> 
> I understand that if I send duplicates (either documents with the same id
> or with the same signature), the document count should stay the same. But
> how can we explain that it is decreasing? What are the possible causes of
> this behavior?
> 
> Thanks! 
> -- 
> View this message in context:
> http://old.nabble.com/Documents-disappearing-tp27659047p27659047.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
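
To make the drop easier to see, here is a small Python sketch that pulls the hits value out of some of the newSearcher log lines quoted above and prints the change between consecutive entries. The log lines are copied from the message, rejoined onto single lines; the regular expression and variable names are my own assumptions about the log layout, not anything Solr provides.

```python
import re

# A few newSearcher entries from the quoted log, each rejoined onto one line.
log = """\
15:17:22 INFO  - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=10861 status=0 QTime=13
15:47:31 INFO  - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=9852 status=0 QTime=19
16:17:42 INFO  - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=8112 status=0 QTime=13
16:38:17 INFO  - [myindex] webapp=null path=null params={event=newSearcher&q=*:*&wt=dismax} hits=7725 status=0 QTime=10
"""

# Capture the timestamp at the start of each line and the hits count.
pattern = re.compile(r"^(\d{2}:\d{2}:\d{2}) .*\bhits=(\d+)", re.MULTILINE)
samples = [(time, int(hits)) for time, hits in pattern.findall(log)]

# Difference in hits between each entry and the one before it.
deltas = [(t2, h2 - h1) for (_, h1), (t2, h2) in zip(samples, samples[1:])]
for time, delta in deltas:
    print(time, delta)
```

Every one of these intervals shows a negative change, even though no delete request appears in the log.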

-- 
View this message in context: 
http://old.nabble.com/Documents-disappearing-tp27659047p27660077.html
Sent from the Solr - User mailing list archive at Nabble.com.
