On 9/30/13 11:22 AM, Kieron Taylor wrote:
%%% Indexing %%%
$lucy_indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);
#
while ($record = shift) {
    %flattened_record = %{$record};
    $flattened_record{accessions} = join ' ', @accessions;
    # Array of values turned into whitespaced list.
    $lucy_indexer->add_doc(
        \%flattened_record
    );
}
# Commit is called every ~100k records, before spinning up another indexer
$lucy_indexer->commit;
I assume you are not passing the 'create => 1' param for each $lucy_indexer.
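A minimal sketch of the flattening step in the loop above, pulled into a helper. The name `flatten_record` is hypothetical, and the sketch assumes each record is a hash ref whose multi-valued fields are array refs. Note that the joined string is only searchable one accession at a time if the field's type tokenizes it (a FullTextType with an analyzer); a StringType field is indexed as a single exact-match term, so the whole list would be one term.

```perl
use strict;
use warnings;

# Hypothetical helper: copy a record and collapse every array-ref
# field into a single space-separated string. Scalars are copied as-is.
sub flatten_record {
    my ($record) = @_;
    my %flat;
    for my $field ( keys %$record ) {
        my $value = $record->{$field};
        $flat{$field} = ref($value) eq 'ARRAY' ? join( ' ', @$value ) : $value;
    }
    return \%flat;
}

# Example record with made-up values.
my $record = {
    id         => 'rec1',
    accessions => [ 'UPI01', 'UPI02' ],
};
my $doc = flatten_record($record);
print "$doc->{accessions}\n";    # prints "UPI01 UPI02"
```

The returned hash ref can be handed straight to `$lucy_indexer->add_doc($doc)`.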
%%% Querying %%%
$query = 'accessions:UPI01';
$searcher = Lucy::Search::IndexSearcher->new(
    index => $path,
);
$parser = Search::Query->parser(
    dialect => 'Lucy',
    fields  => $lucy_indexer->get_schema()->all_fields,
);
$search = $parser->parse($query)->as_lucy_query;
I would probably insert a debugging statement here to verify that the
parser is doing what you think it is:
$parsed_query = $parser->parse($query);
printf("parsed_query:%s\n", $parsed_query);
$lucy_query = $parsed_query->as_lucy_query;
printf("lucy_query:%s\n", $lucy_query->dump);
$result = $searcher->hits(
    query      => $search,
    num_wanted => 1000,
);
while (my $hit = $result->next) {
    say $hit->{accessions};
}
I've not shown the result-paging code or some blob-data handling,
neither of which affects this issue: the blobs are for later, and I'm
not getting any hits on some strings that I can grep from the .dat
file in a segment.
Instead of grep'ing the segment files, you might try seeing what Lucy
reports via the API:
https://metacpan.org/source/KARMAN/SWISH-Prog-Lucy-0.17/bin/lucyx-dump-terms
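A minimal sketch along the same lines as that script, not a copy of it: walk each segment's lexicon for one field and collect the terms Lucy actually indexed, with their document frequencies. The function name `indexed_terms` is hypothetical, and the default field name `accessions` is taken from the code above. If `UPI01` does not show up as its own term, the field is probably not being tokenized the way the query expects.

```perl
use strict;
use warnings;
use Lucy;

# Hypothetical helper: return { term => doc_freq } for one field,
# summed across all segments of the index at $path.
sub indexed_terms {
    my ( $path, $field ) = @_;
    my $reader = Lucy::Index::IndexReader->open( index => $path );
    my %doc_freq;
    for my $seg_reader ( @{ $reader->seg_readers } ) {
        my $lex_reader = $seg_reader->obtain('Lucy::Index::LexiconReader');
        my $lexicon    = $lex_reader->lexicon( field => $field );
        next unless $lexicon;    # no terms for this field in this segment
        while ( $lexicon->next ) {
            my $term = $lexicon->get_term;
            $doc_freq{$term}
                += $lex_reader->doc_freq( field => $field, term => $term );
        }
    }
    return \%doc_freq;
}

# Usage: perl dump_terms.pl /path/to/index [field]
if (@ARGV) {
    my ( $path, $field ) = @ARGV;
    my $terms = indexed_terms( $path, $field || 'accessions' );
    printf "%s\t%d\n", $_, $terms->{$_} for sort keys %$terms;
}
```

Because the counts are summed per segment, documents that were deleted but not yet merged away may still be included.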
--
Peter Karman . http://peknet.com/