On 23/11/2016 16:31, Gupta, Rajiv wrote:
> Thanks for your reply Nick.
> I wanted to delete the old documents, which is why I was trying to get the
> doc_id and use that to delete them. However, that does not help: it deleted
> other documents and keeps changing the document. I wanted to use delete by
> term, but my docs don't have any primary key.
> I add documents like this:
>
>     $indexer->add_doc({
>         title    => $mytitle,
>         content  => substr($mybodytext,0,1024),
>         url      => $onlyfilename,
>         urlpath  => $filpath,
>         position => $fileseektostart,
>         linenum  => $filelinenumtostart,
>         jobtype  => $self->{_logfile_hash}{$filetoindex}[5],
>     });

You can use any field as a primary key if the field's value is guaranteed to be
unique for all your documents. But it seems that you index the contents of
files line by line, so "urlpath" isn't unique. Your primary key is probably
the tuple (urlpath, linenum).
If you update all the lines of a file at once, this isn't a problem. You can
simply delete all documents relating to the file with
    $indexer->delete_by_term(
        field => 'urlpath',
        term  => $filepath,
    );
If you only want to update certain lines, you'll have to construct an ANDQuery
for each line and use delete_by_query. For example:
    $indexer->delete_by_query(Lucy::Search::ANDQuery->new(
        children => [
            Lucy::Search::TermQuery->new(
                field => 'urlpath',
                term  => $filepath,
            ),
            Lucy::Search::TermQuery->new(
                field => 'linenum',
                term  => $linenum,
            ),
        ],
    ));
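Alternatively, you could store the (urlpath, linenum) tuple as one extra field
and delete by that single term. A minimal sketch, assuming a hypothetical field
named 'doc_key' that you would have to add to your schema as an indexed
Lucy::Plan::StringType, and assuming '#' is acceptable as a separator.
doc_key() and replace_line() are illustrative names, not part of Lucy:

```perl
use strict;
use warnings;

# Hypothetical helper: join path and line number into one unique key.
# The line number is digits only, so the last '#' always separates the parts.
sub doc_key {
    my ($urlpath, $linenum) = @_;
    return "$urlpath#$linenum";
}

# Hypothetical helper: replace the document for one line of one file.
# $indexer is expected to be a Lucy::Index::Indexer whose schema has a
# 'doc_key' field (an assumption, see above).
sub replace_line {
    my ($indexer, $doc) = @_;
    my $key = doc_key($doc->{urlpath}, $doc->{linenum});

    # Drop the previously indexed version of this line, if any.
    $indexer->delete_by_term(field => 'doc_key', term => $key);

    # Add the new version, carrying the key along with the other fields.
    $indexer->add_doc({ %$doc, doc_key => $key });
    return $key;
}
```

You would call replace_line($indexer, { urlpath => $filepath, linenum =>
$linenum, ... }) for each changed line and then $indexer->commit as usual.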
Or maybe use a RangeQuery on 'linenum', combined with the TermQuery on
'urlpath', to delete a contiguous range of lines.
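A range delete could look roughly like the following. This is a sketch under
assumptions: pad_linenum() and delete_line_range() are hypothetical helpers,
RangeQuery needs 'linenum' to be declared sortable in the schema, and, as far
as I know, text fields compare as strings, so the line numbers would have to
be indexed zero-padded for the range to behave numerically:

```perl
use strict;
use warnings;
use Lucy::Search::ANDQuery;
use Lucy::Search::RangeQuery;
use Lucy::Search::TermQuery;

# Hypothetical helper: zero-pad a line number to 10 digits so that string
# comparison orders values the same way as numeric comparison.
sub pad_linenum {
    my ($linenum) = @_;
    return sprintf('%010d', $linenum);
}

# Hypothetical helper: delete lines $first..$last (inclusive) of one file.
# Assumes 'linenum' was indexed via pad_linenum() and is sortable.
sub delete_line_range {
    my ($indexer, $filepath, $first, $last) = @_;
    my $query = Lucy::Search::ANDQuery->new(
        children => [
            # Restrict the range to a single file.
            Lucy::Search::TermQuery->new(
                field => 'urlpath',
                term  => $filepath,
            ),
            Lucy::Search::RangeQuery->new(
                field         => 'linenum',
                lower_term    => pad_linenum($first),
                upper_term    => pad_linenum($last),
                include_lower => 1,
                include_upper => 1,
            ),
        ],
    );
    $indexer->delete_by_query($query);
    return $query;
}
```

For example, delete_line_range($indexer, $filepath, 100, 200) followed by
$indexer->commit would remove lines 100 through 200 of that file.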
Nick