Added: lucy/site/trunk/content/docs/perl/Lucy/Index/Indexer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/Indexer.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/Indexer.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/Indexer.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,253 @@ +Title: Lucy::Index::Indexer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Indexer - Build inverted indexes.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $indexer = Lucy::Index::Indexer->new( + schema => $schema, + index => '/path/to/index', + create => 1, +); +while ( my ( $title, $content ) = each %source_docs ) { + $indexer->add_doc({ + title => $title, + content => $content, + }); +} +$indexer->commit;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>The Indexer class is Apache Lucy’s primary tool for managing the content of inverted indexes, +which may later be searched using <a href="../../Lucy/Search/IndexSearcher.html" class="podlinkpod" +>IndexSearcher</a>.</p> + +<p>In general, +only one Indexer at a time may write to an index safely. +If a write lock cannot be secured, +new() will throw an exception.</p> + +<p>If an index is located on a shared volume, +each writer application must identify itself by supplying an <a href="../../Lucy/Index/IndexManager.html" class="podlinkpod" +>IndexManager</a> with a unique <code>host</code> id to Indexer’s constructor or index corruption will occur. 
+See <a href="../../Lucy/Docs/FileLocking.html" class="podlinkpod" +>FileLocking</a> for a detailed discussion.</p> + +<p>Note: at present, +<a href="#delete_by_term" class="podlinkpod" +>delete_by_term()</a> and <a href="#delete_by_query" class="podlinkpod" +>delete_by_query()</a> only affect documents which had been previously committed to the index – and not any documents added this indexing session but not yet committed. +This may change in a future update.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $indexer = Lucy::Index::Indexer->new( + schema => $schema, # required at index creation + index => '/path/to/index', # required + create => 1, # default: 0 + truncate => 1, # default: 0 + manager => $manager # default: created internally +);</pre> + +<ul> +<li><b>schema</b> - A Schema. +Required when index is being created; if not supplied, +will be extracted from the index folder.</li> + +<li><b>index</b> - Either a filepath to an index or a Folder.</li> + +<li><b>create</b> - If true and the index directory does not exist, +attempt to create it.</li> + +<li><b>truncate</b> - If true, +proceed with the intention of discarding all previous indexing data. +The old data will remain intact and visible until commit() succeeds.</li> + +<li><b>manager</b> - An IndexManager.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="add_doc" +>add_doc</a></h3> + +<pre>$indexer->add_doc($doc); +$indexer->add_doc( { field_name => $field_value } ); +$indexer->add_doc( + doc => { field_name => $field_value }, + boost => 2.5, # default: 1.0 +);</pre> + +<p>Add a document to the index. 
+Accepts either a single argument or labeled params.</p> + +<ul> +<li><b>doc</b> - Either a Lucy::Document::Doc object, +or a hashref (which will be attached to a Lucy::Document::Doc object internally).</li> + +<li><b>boost</b> - A floating point weight which affects how this document scores.</li> +</ul> + +<h3><a class='u' +name="add_index" +>add_index</a></h3> + +<pre>$indexer->add_index($index);</pre> + +<p>Absorb an existing index into this one. +The two indexes must have matching Schemas.</p> + +<ul> +<li><b>index</b> - Either an index path name or a Folder.</li> +</ul> + +<h3><a class='u' +name="delete_by_term" +>delete_by_term</a></h3> + +<pre>$indexer->delete_by_term( + field => $field, # required + term => $term # required +);</pre> + +<p>Mark documents which contain the supplied term as deleted, +so that they will be excluded from search results and eventually removed altogether. +The change is not apparent to search apps until after <a href="#commit" class="podlinkpod" +>commit()</a> succeeds.</p> + +<ul> +<li><b>field</b> - The name of an indexed field. +(If it is not spec’d as <code>indexed</code>, +an error will occur.)</li> + +<li><b>term</b> - The term which identifies docs to be marked as deleted. 
+If <code>field</code> is associated with an Analyzer, +<code>term</code> will be processed automatically (so don’t pre-process it yourself).</li> +</ul> + +<h3><a class='u' +name="delete_by_query" +>delete_by_query</a></h3> + +<pre>$indexer->delete_by_query($query);</pre> + +<p>Mark documents which match the supplied Query as deleted.</p> + +<ul> +<li><b>query</b> - A <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Query</a>.</li> +</ul> + +<h3><a class='u' +name="delete_by_doc_id" +>delete_by_doc_id</a></h3> + +<pre>$indexer->delete_by_doc_id($doc_id);</pre> + +<p>Mark the document identified by the supplied document ID as deleted.</p> + +<ul> +<li><b>doc_id</b> - A <a href="../../Lucy/Docs/DocIDs.html" class="podlinkpod" +>document id</a>.</li> +</ul> + +<h3><a class='u' +name="optimize" +>optimize</a></h3> + +<pre>$indexer->optimize();</pre> + +<p>Optimize the index for search-time performance. +This may take a while, +as it can involve rewriting large amounts of data.</p> + +<p>Every Indexer session which changes index content and ends in a <a href="#commit" class="podlinkpod" +>commit()</a> creates a new segment. +Once written, +segments are never modified. +However, +they are periodically recycled by feeding their content into the segment currently being written.</p> + +<p>The <a href="#optimize" class="podlinkpod" +>optimize()</a> method causes all existing index content to be fed back into the Indexer. +When <a href="#commit" class="podlinkpod" +>commit()</a> completes after an <a href="#optimize" class="podlinkpod" +>optimize()</a>, +the index will consist of one segment. +So <a href="#optimize" class="podlinkpod" +>optimize()</a> must be called before <a href="#commit" class="podlinkpod" +>commit()</a>. +Also, +optimizing a fresh index created from scratch has no effect.</p> + +<p>Historically, +there was a significant search-time performance benefit to collapsing down to a single segment versus even two segments. 
+Now the effect of collapsing is much less significant, +and calling <a href="#optimize" class="podlinkpod" +>optimize()</a> is rarely justified.</p> + +<h3><a class='u' +name="commit" +>commit</a></h3> + +<pre>$indexer->commit();</pre> + +<p>Commit any changes made to the index. +Until this is called, +none of the changes made during an indexing session are permanent.</p> + +<p>Calling <a href="#commit" class="podlinkpod" +>commit()</a> invalidates the Indexer, +so if you want to make more changes you’ll need a new one.</p> + +<h3><a class='u' +name="prepare_commit" +>prepare_commit</a></h3> + +<pre>$indexer->prepare_commit();</pre> + +<p>Perform the expensive setup for <a href="#commit" class="podlinkpod" +>commit()</a> in advance, +so that <a href="#commit" class="podlinkpod" +>commit()</a> completes quickly. +(If <a href="#prepare_commit" class="podlinkpod" +>prepare_commit()</a> is not called explicitly by the user, +<a href="#commit" class="podlinkpod" +>commit()</a> will call it internally.)</p> + +<h3><a class='u' +name="get_schema" +>get_schema</a></h3> + +<pre>my $schema = $indexer->get_schema();</pre> + +<p>Accessor for schema.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Indexer isa Clownfish::Obj.</p> + +</div>
Added: lucy/site/trunk/content/docs/perl/Lucy/Index/Lexicon.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/Lexicon.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/Lexicon.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/Lexicon.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,87 @@ +Title: Lucy::Index::Lexicon – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Lexicon - Iterator for a field’s terms.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $lex_reader = $seg_reader->obtain('Lucy::Index::LexiconReader'); +my $lexicon = $lex_reader->lexicon( field => 'content' ); +while ( $lexicon->next ) { + print $lexicon->get_term . "\n"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A Lexicon is an iterator which provides access to all the unique terms for a given field in sorted order.</p> + +<p>If an index consists of two documents with a ‘content’ field holding “three blind mice” and “three musketeers” respectively, +then iterating through the ‘content’ field’s lexicon would produce this list:</p> + +<pre>blind +mice +musketeers +three</pre> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="seek" +>seek</a></h3> + +<pre>$lexicon->seek($target); +$lexicon->seek(); # default: undef</pre> + +<p>Seek the Lexicon to the first iterator state which is greater than or equal to <code>target</code>. 
+If <code>target</code> is undef, +reset the iterator.</p> + +<h3><a class='u' +name="next" +>next</a></h3> + +<pre>my $bool = $lexicon->next();</pre> + +<p>Proceed to the next term.</p> + +<p>Returns: true until the iterator is exhausted, +then false.</p> + +<h3><a class='u' +name="reset" +>reset</a></h3> + +<pre>$lexicon->reset();</pre> + +<p>Reset the iterator. +<a href="#next" class="podlinkpod" +>next()</a> must be called to proceed to the first element.</p> + +<h3><a class='u' +name="get_term" +>get_term</a></h3> + +<pre>my $obj = $lexicon->get_term();</pre> + +<p>Return the current term, +or undef if the iterator is not in a valid state.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Lexicon isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/LexiconReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/LexiconReader.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/LexiconReader.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/LexiconReader.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,87 @@ +Title: Lucy::Index::LexiconReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::LexiconReader - Read Lexicon data.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $lex_reader = $seg_reader->obtain("Lucy::Index::LexiconReader"); +my $lexicon = $lex_reader->lexicon( field => 'title' );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>LexiconReader reads term dictionary information.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="lexicon" +>lexicon</a></h3> + +<pre>my $lexicon = $lexicon_reader->lexicon( + field => $field, # required + term 
=> $term # default: undef +);</pre> + +<p>Return a new Lexicon for the given <code>field</code>. +Will return undef if either the field is not indexed, +or if no documents contain a value for the field.</p> + +<ul> +<li><b>field</b> - Field name.</li> + +<li><b>term</b> - Pre-locate the Lexicon to this term.</li> +</ul> + +<h3><a class='u' +name="doc_freq" +>doc_freq</a></h3> + +<pre>my $int = $lexicon_reader->doc_freq( + field => $field, # required + term => $term # required +);</pre> + +<p>Return the number of documents where the specified term is present.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="aggregator" +>aggregator</a></h3> + +<pre>my $result = $lexicon_reader->aggregator( + readers => $readers, # required + offsets => $offsets # required +);</pre> + +<p>Return a LexiconReader which merges the output of other LexiconReaders.</p> + +<ul> +<li><b>readers</b> - An array of LexiconReaders.</li> + +<li><b>offsets</b> - Doc id start offsets for each reader.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::LexiconReader isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/PolyReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/PolyReader.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/PolyReader.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/PolyReader.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,95 @@ +Title: Lucy::Index::PolyReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::PolyReader - Multi-segment implementation of IndexReader.</p> + +<h2><a class='u' +name="SYNOPSIS" 
+>SYNOPSIS</a></h2> + +<pre>my $polyreader = Lucy::Index::IndexReader->open( + index => '/path/to/index', +); +my $doc_reader = $polyreader->obtain("Lucy::Index::DocReader"); +for my $doc_id ( 1 .. $polyreader->doc_max ) { + my $doc = $doc_reader->fetch_doc($doc_id); + print " $doc_id: $doc->{title}\n"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>PolyReader conflates index data from multiple segments. +For instance, +if an index contains three segments with 10 documents each, +PolyReader’s <a href="../../Lucy/Index/IndexReader.html#doc_max" class="podlinkpod" +>doc_max()</a> method will return 30.</p> + +<p>Some of PolyReader’s <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>DataReader</a> components may be less efficient or complete than the single-segment implementations accessed via <a href="../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="doc_max" +>doc_max</a></h3> + +<pre>my $int = $poly_reader->doc_max();</pre> + +<p>Return the maximum number of documents available to the reader, +which is also the highest possible internal document id. 
+Documents which have been marked as deleted but not yet purged from the index are included in this count.</p> + +<h3><a class='u' +name="doc_count" +>doc_count</a></h3> + +<pre>my $int = $poly_reader->doc_count();</pre> + +<p>Return the number of documents available to the reader, +subtracting any that are marked as deleted.</p> + +<h3><a class='u' +name="del_count" +>del_count</a></h3> + +<pre>my $int = $poly_reader->del_count();</pre> + +<p>Return the number of documents which have been marked as deleted but not yet purged from the index.</p> + +<h3><a class='u' +name="offsets" +>offsets</a></h3> + +<pre>my $i32_array = $poly_reader->offsets();</pre> + +<p>Return an array with one entry for each segment, +corresponding to segment doc_id start offset.</p> + +<h3><a class='u' +name="seg_readers" +>seg_readers</a></h3> + +<pre>my $arrayref = $poly_reader->seg_readers();</pre> + +<p>Return an array of all the SegReaders represented within the IndexReader.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::PolyReader isa <a href="../../Lucy/Index/IndexReader.html" class="podlinkpod" +>Lucy::Index::IndexReader</a> isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/PostingList.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/PostingList.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/PostingList.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/PostingList.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,70 @@ +Title: Lucy::Index::PostingList – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::PostingList - Term-Document pairings.</p> + +<h2><a class='u' 
+name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $posting_list_reader + = $seg_reader->obtain("Lucy::Index::PostingListReader"); +my $posting_list = $posting_list_reader->posting_list( + field => 'content', + term => 'foo', +); +while ( my $doc_id = $posting_list->next ) { + say "Matching doc id: $doc_id"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>PostingList is an iterator which supplies a list of document ids that match a given term.</p> + +<p>See <a href="../../Lucy/Docs/IRTheory.html" class="podlinkpod" +>IRTheory</a> for definitions of “posting” and “posting list”.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="get_doc_freq" +>get_doc_freq</a></h3> + +<pre>my $int = $posting_list->get_doc_freq();</pre> + +<p>Return the number of documents that the PostingList contains. +(This number will include any documents which have been marked as deleted but not yet purged.)</p> + +<h3><a class='u' +name="seek" +>seek</a></h3> + +<pre>$posting_list->seek($target); +$posting_list->seek(); # default: undef</pre> + +<p>Prepare the PostingList object to iterate over matches for documents that match <code>target</code>.</p> + +<ul> +<li><b>target</b> - The term to match. 
+If undef, +the iterator will be empty.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::PostingList isa <a href="../../Lucy/Search/Matcher.html" class="podlinkpod" +>Lucy::Search::Matcher</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/PostingListReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/PostingListReader.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/PostingListReader.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/PostingListReader.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,76 @@ +Title: Lucy::Index::PostingListReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::PostingListReader - Read postings data.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $posting_list_reader + = $seg_reader->obtain("Lucy::Index::PostingListReader"); +my $posting_list = $posting_list_reader->posting_list( + field => 'title', + term => 'foo', +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>PostingListReaders produce <a href="../../Lucy/Index/PostingList.html" class="podlinkpod" +>PostingList</a> objects which convey document matching information.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="posting_list" +>posting_list</a></h3> + +<pre>my $posting_list = $posting_list_reader->posting_list( + field => $field, # default: undef + term => $term # default: undef +);</pre> + +<p>Returns a PostingList, +or undef if either <code>field</code> is undef or <code>field</code> is not present in any documents.</p> + +<ul> +<li><b>field</b> - A field name.</li> + +<li><b>term</b> - If supplied, +the PostingList will be 
pre-located to this term using <a href="../../Lucy/Index/PostingList.html#seek" class="podlinkpod" +>seek()</a>.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="aggregator" +>aggregator</a></h3> + +<pre>my $result = $posting_list_reader->aggregator( + readers => $readers, # required + offsets => $offsets # required +);</pre> + +<p>Returns undef since PostingLists may only be iterated at the segment level.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::PostingListReader isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/SegReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/SegReader.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/SegReader.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/SegReader.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,115 @@ +Title: Lucy::Index::SegReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::SegReader - Single-segment IndexReader.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $polyreader = Lucy::Index::IndexReader->open( + index => '/path/to/index', +); +my $seg_readers = $polyreader->seg_readers; +for my $seg_reader (@$seg_readers) { + my $seg_name = $seg_reader->get_seg_name; + my $num_docs = $seg_reader->doc_max; + print "Segment $seg_name ($num_docs documents):\n"; + my $doc_reader = $seg_reader->obtain("Lucy::Index::DocReader"); + for my $doc_id ( 1 .. 
$num_docs ) { + my $doc = $doc_reader->fetch_doc($doc_id); + print " $doc_id: $doc->{title}\n"; + } +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>SegReader interprets the data within a single segment of an index.</p> + +<p>Generally speaking, +only advanced users writing subclasses which manipulate data at the segment level need to deal with the SegReader API directly.</p> + +<p>Nearly all of SegReader’s functionality is implemented by pluggable components spawned by <a href="../../Lucy/Plan/Architecture.html" class="podlinkpod" +>Architecture</a>’s factory methods.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_seg_name" +>get_seg_name</a></h3> + +<pre>my $string = $seg_reader->get_seg_name();</pre> + +<p>Return the name of the segment.</p> + +<h3><a class='u' +name="get_seg_num" +>get_seg_num</a></h3> + +<pre>my $int = $seg_reader->get_seg_num();</pre> + +<p>Return the number of the segment.</p> + +<h3><a class='u' +name="del_count" +>del_count</a></h3> + +<pre>my $int = $seg_reader->del_count();</pre> + +<p>Return the number of documents which have been marked as deleted but not yet purged from the index.</p> + +<h3><a class='u' +name="doc_max" +>doc_max</a></h3> + +<pre>my $int = $seg_reader->doc_max();</pre> + +<p>Return the maximum number of documents available to the reader, +which is also the highest possible internal document id. 
+Documents which have been marked as deleted but not yet purged from the index are included in this count.</p> + +<h3><a class='u' +name="doc_count" +>doc_count</a></h3> + +<pre>my $int = $seg_reader->doc_count();</pre> + +<p>Return the number of documents available to the reader, +subtracting any that are marked as deleted.</p> + +<h3><a class='u' +name="_offsets" +>_offsets</a></h3> + +<pre>my $i32_array = $seg_reader->_offsets();</pre> + +<p>Return an array with one entry for each segment, +corresponding to segment doc_id start offset.</p> + +<h3><a class='u' +name="seg_readers" +>seg_readers</a></h3> + +<pre>my $arrayref = $seg_reader->seg_readers();</pre> + +<p>Return an array of all the SegReaders represented within the IndexReader.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::SegReader isa <a href="../../Lucy/Index/IndexReader.html" class="podlinkpod" +>Lucy::Index::IndexReader</a> isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/SegWriter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/SegWriter.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/SegWriter.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/SegWriter.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,126 @@ +Title: Lucy::Index::SegWriter – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::SegWriter - Write one segment of an index.</p> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>SegWriter is a conduit through which information fed to Indexer passes. 
+It manages <a href="../../Lucy/Index/Segment.html" class="podlinkpod" +>Segment</a> and Inverter, +invokes the <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Analyzer</a> chain, +and feeds low level <a href="../../Lucy/Index/DataWriter.html" class="podlinkpod" +>DataWriters</a> such as PostingListWriter and DocWriter.</p> + +<p>The sub-components of a SegWriter are determined by <a href="../../Lucy/Plan/Architecture.html" class="podlinkpod" +>Architecture</a>. +DataWriter components which are added to the stack of writers via <a href="#add_writer" class="podlinkpod" +>add_writer()</a> have Add_Inverted_Doc() invoked for each document supplied to SegWriter’s <a href="#add_doc" class="podlinkpod" +>add_doc()</a>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="register" +>register</a></h3> + +<pre>$seg_writer->register( + api => $api, # required + component => $component # required +);</pre> + +<p>Register a DataWriter component with the SegWriter. +(Note that registration simply makes the writer available via <a href="#fetch" class="podlinkpod" +>fetch()</a>, +so you may also want to call <a href="#add_writer" class="podlinkpod" +>add_writer()</a>).</p> + +<ul> +<li><b>api</b> - The name of the DataWriter api which <code>component</code> implements.</li> + +<li><b>component</b> - A DataWriter.</li> +</ul> + +<h3><a class='u' +name="fetch" +>fetch</a></h3> + +<pre>my $obj = $seg_writer->fetch($api);</pre> + +<p>Retrieve a registered component.</p> + +<ul> +<li><b>api</b> - The name of the DataWriter api which the component implements.</li> +</ul> + +<h3><a class='u' +name="add_writer" +>add_writer</a></h3> + +<pre>$seg_writer->add_writer($writer);</pre> + +<p>Add a DataWriter to the SegWriter’s stack of writers.</p> + +<h3><a class='u' +name="add_doc" +>add_doc</a></h3> + +<pre>$seg_writer->add_doc( + doc => $doc, # required + boost => $boost # default: 1.0 +);</pre> + +<p>Add a document to the segment. 
+Inverts <code>doc</code>, +increments the Segment’s internal document id, +then calls Add_Inverted_Doc(), +feeding all sub-writers.</p> + +<h3><a class='u' +name="add_segment" +>add_segment</a></h3> + +<pre>$seg_writer->add_segment( + reader => $reader, # required + doc_map => $doc_map # default: undef +);</pre> + +<p>Add content from an existing segment into the one currently being written.</p> + +<ul> +<li><b>reader</b> - The SegReader containing content to add.</li> + +<li><b>doc_map</b> - An array of integers mapping old document ids to new. +Deleted documents are mapped to 0, +indicating that they should be skipped.</li> +</ul> + +<h3><a class='u' +name="finish" +>finish</a></h3> + +<pre>$seg_writer->finish();</pre> + +<p>Complete the segment: close all streams, +store metadata, +etc.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::SegWriter isa <a href="../../Lucy/Index/DataWriter.html" class="podlinkpod" +>Lucy::Index::DataWriter</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/Segment.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/Segment.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/Segment.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/Segment.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,182 @@ +Title: Lucy::Index::Segment – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Segment - Warehouse for information about one segment of an inverted index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre># Index-time. 
+package MyDataWriter; +use base qw( Lucy::Index::DataWriter ); + +sub finish { + my $self = shift; + my $segment = $self->get_segment; + my $metadata = $self->SUPER::metadata(); + $metadata->{foo} = $self->get_foo; + $segment->store_metadata( + key => 'my_component', + metadata => $metadata + ); +} + +# Search-time. +package MyDataReader; +use base qw( Lucy::Index::DataReader ); + +sub new { + my $self = shift->SUPER::new(@_); + my $segment = $self->get_segment; + my $metadata = $segment->fetch_metadata('my_component'); + if ($metadata) { + $self->set_foo( $metadata->{foo} ); + ... + } + return $self; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Apache Lucy’s indexes are made up of individual “segments”, +each of which is an independent inverted index. +On the file system, +each segment is a directory within the main index directory whose name starts with “seg_”: “seg_2”, +“seg_5a”, +etc.</p> + +<p>Each Segment object keeps track of information about an index segment: its fields, +document count, +and so on. +The Segment object itself writes one file, +<code>segmeta.json</code>; besides storing info needed by Segment itself, +the “segmeta” file serves as a central repository for metadata generated by other index components – relieving them of the burden of storing metadata themselves.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="add_field" +>add_field</a></h3> + +<pre>my $int = $segment->add_field($field);</pre> + +<p>Register a new field and assign it a field number. 
+If the field was already known, +nothing happens.</p> + +<ul> +<li><b>field</b> - Field name.</li> +</ul> + +<p>Returns: the field’s field number, +which is a positive integer.</p> + +<h3><a class='u' +name="store_metadata" +>store_metadata</a></h3> + +<pre>$segment->store_metadata( + key => $key, # required + metadata => $metadata # required +);</pre> + +<p>Store arbitrary information in the segment’s metadata hash, +to be serialized later. +Throws an error if <code>key</code> is used twice.</p> + +<ul> +<li><b>key</b> - String identifying an index component.</li> + +<li><b>metadata</b> - JSON-izable data structure.</li> +</ul> + +<h3><a class='u' +name="fetch_metadata" +>fetch_metadata</a></h3> + +<pre>my $obj = $segment->fetch_metadata($key);</pre> + +<p>Fetch a value from the Segment’s metadata hash.</p> + +<h3><a class='u' +name="field_num" +>field_num</a></h3> + +<pre>my $int = $segment->field_num($field);</pre> + +<p>Given a field name, +return its field number for this segment (which may differ from its number in other segments). 
+Return 0 (an invalid field number) if the field name can’t be found.</p> + +<ul> +<li><b>field</b> - Field name.</li> +</ul> + +<h3><a class='u' +name="field_name" +>field_name</a></h3> + +<pre>my $string = $segment->field_name($field_num);</pre> + +<p>Given a field number, +return the name of its field, +or undef if the field name can’t be found.</p> + +<h3><a class='u' +name="get_name" +>get_name</a></h3> + +<pre>my $string = $segment->get_name();</pre> + +<p>Getter for the object’s seg name.</p> + +<h3><a class='u' +name="get_number" +>get_number</a></h3> + +<pre>my $int = $segment->get_number();</pre> + +<p>Getter for the segment number.</p> + +<h3><a class='u' +name="set_count" +>set_count</a></h3> + +<pre>$segment->set_count($count);</pre> + +<p>Setter for the object’s document count.</p> + +<h3><a class='u' +name="get_count" +>get_count</a></h3> + +<pre>my $int = $segment->get_count();</pre> + +<p>Getter for the object’s document count.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $segment->compare_to($other);</pre> + +<p>Compare by segment number.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Segment isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/Similarity.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/Similarity.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/Similarity.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/Similarity.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,87 @@ +Title: Lucy::Index::Similarity – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Similarity - Judge how well a document matches a query.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + 
+<pre>package MySimilarity; +use base qw( Lucy::Index::Similarity ); # inherit new() and the default scoring methods +sub length_norm { return 1.0 } # disable length normalization + +package MyFullTextType; +use base qw( Lucy::Plan::FullTextType ); + +sub make_similarity { MySimilarity->new }</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>After determining whether a document matches a given query, +a score must be calculated which indicates how <i>well</i> the document matches the query. +The Similarity class is used to judge how “similar” the query and the document are to each other; the closer the resemblance, +the higher the document scores.</p> + +<p>The default implementation uses Lucene’s modified cosine similarity measure. +Subclasses might tweak the existing algorithms, +or might be used in conjunction with custom Query subclasses to implement arbitrary scoring schemes.</p> + +<p>Most of the methods operate on single fields, +but some are used to combine scores from multiple fields.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $sim = Lucy::Index::Similarity->new;</pre> + +<p>Constructor. +Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="length_norm" +>length_norm</a></h3> + +<pre>my $float = $similarity->length_norm($num_tokens);</pre> + +<p>Dampen the scores of long documents.</p> + +<p>After a field is broken up into terms at index-time, +each term must be assigned a weight. +One of the factors in calculating this weight is the number of tokens that the original field was broken into.</p> + +<p>Typically, +we assume that the more tokens in a field, +the less important any one of them is – so that, +e.g. +5 mentions of “Kafka” in a short article are given more heft than 5 mentions of “Kafka” in an entire book. 
+The default implementation of length_norm expresses this using an inverted square root.</p> + +<p>However, +the inverted square root has a tendency to reward very short fields highly, +which isn’t always appropriate for fields you expect to have a lot of tokens on average.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Similarity isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Index/Snapshot.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Index/Snapshot.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Index/Snapshot.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Index/Snapshot.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,156 @@ +Title: Lucy::Index::Snapshot – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Snapshot - Point-in-time index file list.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $snapshot = Lucy::Index::Snapshot->new; +$snapshot->read_file( folder => $folder ); # load most recent snapshot +my $files = $snapshot->list; +print "$_\n" for @$files;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A Snapshot is a list of index files and folders. +Because index files, +once written, +are never modified, +a Snapshot defines a point-in-time view of the data in an index.</p> + +<p><a href="../../Lucy/Index/IndexReader.html" class="podlinkpod" +>IndexReader</a> objects interpret the data associated with a single Snapshot.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $snapshot = Lucy::Index::Snapshot->new;</pre> + +<p>Constructor. 
+Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="list" +>list</a></h3> + +<pre>my $arrayref = $snapshot->list();</pre> + +<p>Return an array of all entries.</p> + +<h3><a class='u' +name="num_entries" +>num_entries</a></h3> + +<pre>my $int = $snapshot->num_entries();</pre> + +<p>Return the number of entries (including directories).</p> + +<h3><a class='u' +name="add_entry" +>add_entry</a></h3> + +<pre>$snapshot->add_entry($entry);</pre> + +<p>Add a filepath to the snapshot.</p> + +<h3><a class='u' +name="delete_entry" +>delete_entry</a></h3> + +<pre>my $bool = $snapshot->delete_entry($entry);</pre> + +<p>Delete a filepath from the snapshot.</p> + +<p>Returns: true if the entry existed and was successfully deleted, +false otherwise.</p> + +<h3><a class='u' +name="read_file" +>read_file</a></h3> + +<pre>my $result = $snapshot->read_file( + folder => $folder, # required + path => $path, # default: undef +);</pre> + +<p>Decode a snapshot file and initialize the object to reflect its contents.</p> + +<ul> +<li><b>folder</b> - A Folder.</li> + +<li><b>path</b> - The location of the snapshot file. +If not supplied, +the most recent snapshot file in the base directory will be chosen.</li> +</ul> + +<p>Returns: the Snapshot object itself.</p> + +<h3><a class='u' +name="write_file" +>write_file</a></h3> + +<pre>$snapshot->write_file( + folder => $folder, # required + path => $path, # default: undef +);</pre> + +<p>Write a snapshot file. +The caller must lock the index while this operation takes place, +and the operation will fail if the snapshot file already exists.</p> + +<ul> +<li><b>folder</b> - A Folder.</li> + +<li><b>path</b> - The path of the file to write. 
+If undef, +a file name will be chosen which supersedes the latest snapshot file in the index folder.</li> +</ul> + +<h3><a class='u' +name="set_path" +>set_path</a></h3> + +<pre>$snapshot->set_path($path);</pre> + +<p>Set the path to the file that the Snapshot object serves as a proxy for.</p> + +<h3><a class='u' +name="get_path" +>get_path</a></h3> + +<pre>my $string = $snapshot->get_path();</pre> + +<p>Get the path to the snapshot file. +Initially undef; updated by <a href="#read_file" class="podlinkpod" +>read_file()</a>, +<a href="#write_file" class="podlinkpod" +>write_file()</a>, +and <a href="#set_path" class="podlinkpod" +>set_path()</a>.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Snapshot isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Object/BitVector.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Object/BitVector.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Object/BitVector.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Object/BitVector.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,232 @@ +Title: Lucy::Object::BitVector – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Object::BitVector - An array of bits.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $bit_vec = Lucy::Object::BitVector->new( capacity => 8 ); +my $other = Lucy::Object::BitVector->new( capacity => 8 ); +$bit_vec->set($_) for ( 0, 2, 4, 6 ); +$other->set($_) for ( 1, 3, 5, 7 ); +$bit_vec->or($other); +print "$_\n" for @{ $bit_vec->to_array }; # prints 0 through 7.</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>BitVector is a growable array of bits. 
+All bits are initially zero.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $bit_vec = Lucy::Object::BitVector->new( + capacity => $doc_max + 1, # default: 0 +);</pre> + +<p>Create a new BitVector.</p> + +<ul> +<li><b>capacity</b> - The number of bits that the initial array should be able to hold.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get" +>get</a></h3> + +<pre>my $bool = $bit_vector->get($tick);</pre> + +<p>Return true if the bit at <code>tick</code> has been set, +false if it hasn’t (regardless of whether it lies within the bounds of the object’s capacity).</p> + +<ul> +<li><b>tick</b> - The requested bit.</li> +</ul> + +<h3><a class='u' +name="set" +>set</a></h3> + +<pre>$bit_vector->set($tick);</pre> + +<p>Set the bit at <code>tick</code> to 1.</p> + +<ul> +<li><b>tick</b> - The bit to be set.</li> +</ul> + +<h3><a class='u' +name="next_hit" +>next_hit</a></h3> + +<pre>my $int = $bit_vector->next_hit($tick);</pre> + +<p>Returns the next set bit equal to or greater than <code>tick</code>, +or -1 if no such bit exists.</p> + +<h3><a class='u' +name="clear" +>clear</a></h3> + +<pre>$bit_vector->clear($tick);</pre> + +<p>Clear the indicated bit +(i.e. 
+set it to 0).</p> + +<ul> +<li><b>tick</b> - The bit to be cleared.</li> +</ul> + +<h3><a class='u' +name="clear_all" +>clear_all</a></h3> + +<pre>$bit_vector->clear_all();</pre> + +<p>Clear all bits.</p> + +<h3><a class='u' +name="grow" +>grow</a></h3> + +<pre>$bit_vector->grow($capacity);</pre> + +<p>If the BitVector does not already have enough room to hold the indicated number of bits, +allocate more memory so that it can.</p> + +<ul> +<li><b>capacity</b> - Least number of bits the BitVector should accommodate.</li> +</ul> + +<h3><a class='u' +name="and" +>and</a></h3> + +<pre>$bit_vector->and($other);</pre> + +<p>Modify the BitVector so that the only bits which remain set are those which 1) were already set in this BitVector, +and 2) were also set in the other BitVector.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="or" +>or</a></h3> + +<pre>$bit_vector->or($other);</pre> + +<p>Modify the BitVector, +setting all bits which are set in the other BitVector if they were not already set.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="xor" +>xor</a></h3> + +<pre>$bit_vector->xor($other);</pre> + +<p>Modify the BitVector, +performing an XOR operation against the other.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="and_not" +>and_not</a></h3> + +<pre>$bit_vector->and_not($other);</pre> + +<p>Modify the BitVector, +clearing all bits which are set in the other.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="flip" +>flip</a></h3> + +<pre>$bit_vector->flip($tick);</pre> + +<p>Invert the value of a bit.</p> + +<ul> +<li><b>tick</b> - The bit to invert.</li> +</ul> + +<h3><a class='u' +name="flip_block" +>flip_block</a></h3> + +<pre>$bit_vector->flip_block( + offset => $offset, # required + length => $length, # required +);</pre> + +<p>Invert each bit within a contiguous block.</p> + +<ul> +<li><b>offset</b> 
- Lower bound.</li> + +<li><b>length</b> - The number of bits to flip.</li> +</ul> + +<h3><a class='u' +name="count" +>count</a></h3> + +<pre>my $int = $bit_vector->count();</pre> + +<p>Return a count of the number of set bits.</p> + +<h3><a class='u' +name="to_array" +>to_array</a></h3> + +<pre>my $i32_array = $bit_vector->to_array();</pre> + +<p>Return an array where each element represents a set bit.</p> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $bit_vector->clone();</pre> + +<p>Return a clone of the object.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Object::BitVector isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Object/Obj.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Object/Obj.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Object/Obj.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Object/Obj.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,18 @@ +Title: Lucy::Object::Obj – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Object::Obj - Moved.</p> + +<h2><a class='u' +name="MOVED" +>MOVED</a></h2> + +<p>Lucy::Object::Obj has been moved to Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Plan/Architecture.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Plan/Architecture.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Plan/Architecture.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Plan/Architecture.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,123 @@ +Title: Lucy::Plan::Architecture – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + 
+<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::Architecture - Configure major components of an index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>package MyArchitecture; +use base qw( Lucy::Plan::Architecture ); + +use LucyX::Index::ZlibDocWriter; +use LucyX::Index::ZlibDocReader; + +sub register_doc_writer { + my ( $self, $seg_writer ) = @_; + my $doc_writer = LucyX::Index::ZlibDocWriter->new( + snapshot => $seg_writer->get_snapshot, + segment => $seg_writer->get_segment, + polyreader => $seg_writer->get_polyreader, + ); + $seg_writer->register( + api => "Lucy::Index::DocWriter", + component => $doc_writer, + ); + $seg_writer->add_writer($doc_writer); +} + +sub register_doc_reader { + my ( $self, $seg_reader ) = @_; + my $doc_reader = LucyX::Index::ZlibDocReader->new( + schema => $seg_reader->get_schema, + folder => $seg_reader->get_folder, + segments => $seg_reader->get_segments, + seg_tick => $seg_reader->get_seg_tick, + snapshot => $seg_reader->get_snapshot, + ); + $seg_reader->register( + api => 'Lucy::Index::DocReader', + component => $doc_reader, + ); +} + +package MySchema; +use base qw( Lucy::Plan::Schema ); + +sub architecture { + shift; + return MyArchitecture->new(@_); +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>By default, +a Lucy index consists of several main parts: lexicon, +postings, +stored documents, +deletions, +and highlight data. +The readers and writers for that data are spawned by Architecture. +Each component operates at the segment level; Architecture’s factory methods are used to build up <a href="../../Lucy/Index/SegWriter.html" class="podlinkpod" +>SegWriter</a> and <a href="../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a>.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $arch = Lucy::Plan::Architecture->new;</pre> + +<p>Constructor. 
+Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="register_doc_writer" +>register_doc_writer</a></h3> + +<pre>$architecture->register_doc_writer($writer);</pre> + +<p>Spawn a DataWriter and <a href="../../Lucy/Index/SegWriter.html#register" class="podlinkpod" +>register()</a> it with the supplied SegWriter, +adding it to the SegWriter’s writer stack.</p> + +<ul> +<li><b>writer</b> - A SegWriter.</li> +</ul> + +<h3><a class='u' +name="register_doc_reader" +>register_doc_reader</a></h3> + +<pre>$architecture->register_doc_reader($reader);</pre> + +<p>Spawn a DocReader and register it with the supplied SegReader.</p> + +<ul> +<li><b>reader</b> - A SegReader.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::Architecture isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Plan/BlobType.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Plan/BlobType.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Plan/BlobType.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Plan/BlobType.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,54 @@ +Title: Lucy::Plan::BlobType – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::BlobType - Default behaviors for binary fields.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $string_type = Lucy::Plan::StringType->new; +my $blob_type = Lucy::Plan::BlobType->new( stored => 1 ); +my $schema = Lucy::Plan::Schema->new; +$schema->spec_field( name => 'id', type => $string_type ); +$schema->spec_field( name => 'jpeg', type => $blob_type );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>BlobType is an implementation of FieldType tuned for use with 
fields containing binary data, +which cannot be indexed or searched – only stored.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $blob_type = Lucy::Plan::BlobType->new( + stored => 1, # default: false +);</pre> + +<p>Create a new BlobType.</p> + +<ul> +<li><b>stored</b> - boolean indicating whether the field should be stored.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::BlobType isa <a href="../../Lucy/Plan/FieldType.html" class="podlinkpod" +>Lucy::Plan::FieldType</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Plan/FieldType.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Plan/FieldType.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Plan/FieldType.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Plan/FieldType.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,138 @@ +Title: Lucy::Plan::FieldType – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::FieldType - Define a field’s behavior.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my @sortable; +for my $field ( @{ $schema->all_fields } ) { + my $type = $schema->fetch_type($field); + next unless $type->sortable; + push @sortable, $field; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>FieldType is an abstract class defining a set of traits and behaviors which may be associated with one or more field names.</p> + +<p>Properties which are common to all field types include <code>boost</code>, +<code>indexed</code>, +<code>stored</code>, +<code>sortable</code>, +<code>binary</code>, +and <code>similarity</code>.</p> + +<p>The <code>boost</code> property is a floating point 
scoring multiplier which defaults to 1.0. +Values greater than 1.0 cause the field to contribute more to a document’s score, +lower values, +less.</p> + +<p>The <code>indexed</code> property indicates whether the field should be indexed (so that it can be searched).</p> + +<p>The <code>stored</code> property indicates whether to store the raw field value, +so that it can be retrieved when a document turns up in a search.</p> + +<p>The <code>sortable</code> property indicates whether search results should be sortable based on the contents of the field.</p> + +<p>The <code>binary</code> property indicates whether the field contains binary or text data. +Unlike most other properties, +<code>binary</code> is not settable.</p> + +<p>The <code>similarity</code> property is a <a href="../../Lucy/Index/Similarity.html" class="podlinkpod" +>Similarity</a> object which defines matching and scoring behavior for the field. +It is required if the field is <code>indexed</code>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="set_boost" +>set_boost</a></h3> + +<pre>$field_type->set_boost($boost);</pre> + +<p>Setter for <code>boost</code>.</p> + +<h3><a class='u' +name="get_boost" +>get_boost</a></h3> + +<pre>my $float = $field_type->get_boost();</pre> + +<p>Accessor for <code>boost</code>.</p> + +<h3><a class='u' +name="set_indexed" +>set_indexed</a></h3> + +<pre>$field_type->set_indexed($indexed);</pre> + +<p>Setter for <code>indexed</code>.</p> + +<h3><a class='u' +name="indexed" +>indexed</a></h3> + +<pre>my $bool = $field_type->indexed();</pre> + +<p>Accessor for <code>indexed</code>.</p> + +<h3><a class='u' +name="set_stored" +>set_stored</a></h3> + +<pre>$field_type->set_stored($stored);</pre> + +<p>Setter for <code>stored</code>.</p> + +<h3><a class='u' +name="stored" +>stored</a></h3> + +<pre>my $bool = $field_type->stored();</pre> + +<p>Accessor for <code>stored</code>.</p> + +<h3><a class='u' +name="set_sortable" 
+>set_sortable</a></h3> + +<pre>$field_type->set_sortable($sortable);</pre> + +<p>Setter for <code>sortable</code>.</p> + +<h3><a class='u' +name="sortable" +>sortable</a></h3> + +<pre>my $bool = $field_type->sortable();</pre> + +<p>Accessor for <code>sortable</code>.</p> + +<h3><a class='u' +name="binary" +>binary</a></h3> + +<pre>my $bool = $field_type->binary();</pre> + +<p>Indicate whether the field contains binary data.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::FieldType isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Plan/FullTextType.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Plan/FullTextType.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Plan/FullTextType.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Plan/FullTextType.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,109 @@ +Title: Lucy::Plan::FullTextType – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::FullTextType - Full-text search field type.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $easyanalyzer = Lucy::Analysis::EasyAnalyzer->new( + language => 'en', +); +my $type = Lucy::Plan::FullTextType->new( + analyzer => $easyanalyzer, +); +my $schema = Lucy::Plan::Schema->new; +$schema->spec_field( name => 'title', type => $type ); +$schema->spec_field( name => 'content', type => $type );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Lucy::Plan::FullTextType is an implementation of <a href="../../Lucy/Plan/FieldType.html" class="podlinkpod" +>FieldType</a> tuned for “full text search”.</p> + +<p>Full text fields are associated with an <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Analyzer</a>, +which is used to 
tokenize and normalize the text so that it can be searched for individual words.</p> + +<p>For an exact-match, +single value field type using character data, +see <a href="../../Lucy/Plan/StringType.html" class="podlinkpod" +>StringType</a>.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $type = Lucy::Plan::FullTextType->new( + analyzer => $analyzer, # required + boost => 2.0, # default: 1.0 + indexed => 1, # default: true + stored => 1, # default: true + sortable => 1, # default: false + highlightable => 1, # default: false +);</pre> + +<ul> +<li><b>analyzer</b> - An Analyzer.</li> + +<li><b>boost</b> - floating point per-field boost.</li> + +<li><b>indexed</b> - boolean indicating whether the field should be indexed.</li> + +<li><b>stored</b> - boolean indicating whether the field should be stored.</li> + +<li><b>sortable</b> - boolean indicating whether the field should be sortable.</li> + +<li><b>highlightable</b> - boolean indicating whether the field should be highlightable.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="set_highlightable" +>set_highlightable</a></h3> + +<pre>$full_text_type->set_highlightable($highlightable);</pre> + +<p>Indicate whether to store data required by <a href="../../Lucy/Highlight/Highlighter.html" class="podlinkpod" +>Highlighter</a> for excerpt selection and search term highlighting.</p> + +<h3><a class='u' +name="highlightable" +>highlightable</a></h3> + +<pre>my $bool = $full_text_type->highlightable();</pre> + +<p>Accessor for “highlightable” property.</p> + +<h3><a class='u' +name="get_analyzer" +>get_analyzer</a></h3> + +<pre>my $analyzer = $full_text_type->get_analyzer();</pre> + +<p>Accessor for the type’s analyzer.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::FullTextType isa Lucy::Plan::TextType isa <a href="../../Lucy/Plan/FieldType.html" 
class="podlinkpod" +>Lucy::Plan::FieldType</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Plan/Schema.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Plan/Schema.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Plan/Schema.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Plan/Schema.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,145 @@ +Title: Lucy::Plan::Schema – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::Schema - User-created specification for an inverted index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>use Lucy::Plan::Schema; +use Lucy::Plan::FullTextType; +use Lucy::Analysis::EasyAnalyzer; + +my $schema = Lucy::Plan::Schema->new; +my $easyanalyzer = Lucy::Analysis::EasyAnalyzer->new( + language => 'en', +); +my $type = Lucy::Plan::FullTextType->new( + analyzer => $easyanalyzer, +); +$schema->spec_field( name => 'title', type => $type ); +$schema->spec_field( name => 'content', type => $type );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A Schema is a specification which indicates how other entities should interpret the raw data in an inverted index and interact with it.</p> + +<p>Once an actual index has been created using a particular Schema, +existing field definitions may not be changed. +However, +it is possible to add new fields during subsequent indexing sessions.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $schema = Lucy::Plan::Schema->new;</pre> + +<p>Constructor. 
+Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="architecture" +>architecture</a></h3> + +<pre>my $architecture = $schema->architecture();</pre> + +<p>Factory method which creates an Architecture object for this index.</p> + +<h3><a class='u' +name="spec_field" +>spec_field</a></h3> + +<pre>$schema->spec_field( + name => $name, # required + type => $type, # required +);</pre> + +<p>Define the behavior of a field by associating it with a FieldType.</p> + +<p>If this method has already been called for the supplied <code>name</code>, +it will merely test to verify that the supplied FieldType <a href="../../Clownfish/Obj.html#equals" class="podlinkpod" +>equals()</a> the existing one.</p> + +<ul> +<li><b>name</b> - The name of the field.</li> + +<li><b>type</b> - A FieldType.</li> +</ul> + +<h3><a class='u' +name="fetch_type" +>fetch_type</a></h3> + +<pre>my $field_type = $schema->fetch_type($field);</pre> + +<p>Return the FieldType for the specified field. 
+If the field can’t be found, +return undef.</p> + +<h3><a class='u' +name="fetch_sim" +>fetch_sim</a></h3> + +<pre>my $similarity = $schema->fetch_sim($field); +my $similarity = $schema->fetch_sim(); # default: undef</pre> + +<p>Return the Similarity for the specified field, +or undef if either the field can’t be found or it isn’t associated with a Similarity.</p> + +<h3><a class='u' +name="num_fields" +>num_fields</a></h3> + +<pre>my $int = $schema->num_fields();</pre> + +<p>Return the number of fields currently defined.</p> + +<h3><a class='u' +name="all_fields" +>all_fields</a></h3> + +<pre>my $arrayref = $schema->all_fields();</pre> + +<p>Return all the Schema’s field names as an array.</p> + +<h3><a class='u' +name="get_architecture" +>get_architecture</a></h3> + +<pre>my $architecture = $schema->get_architecture();</pre> + +<p>Return the Schema instance’s internal Architecture object.</p> + +<h3><a class='u' +name="get_similarity" +>get_similarity</a></h3> + +<pre>my $similarity = $schema->get_similarity();</pre> + +<p>Return the Schema instance’s internal Similarity object.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::Schema isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Plan/StringType.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Plan/StringType.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Plan/StringType.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Plan/StringType.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,58 @@ +Title: Lucy::Plan::StringType – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::StringType - Non-tokenized text type.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $type = 
Lucy::Plan::StringType->new; +my $schema = Lucy::Plan::Schema->new; +$schema->spec_field( name => 'category', type => $type );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Lucy::Plan::StringType is used for “exact-match” strings.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $type = Lucy::Plan::StringType->new( + boost => 0.1, # default: 1.0 + indexed => 1, # default: true + stored => 1, # default: true + sortable => 1, # default: false +);</pre> + +<ul> +<li><b>boost</b> - floating point per-field boost.</li> + +<li><b>indexed</b> - boolean indicating whether the field should be indexed.</li> + +<li><b>stored</b> - boolean indicating whether the field should be stored.</li> + +<li><b>sortable</b> - boolean indicating whether the field should be sortable.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::StringType isa Lucy::Plan::TextType isa <a href="../../Lucy/Plan/FieldType.html" class="podlinkpod" +>Lucy::Plan::FieldType</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Search/ANDQuery.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Search/ANDQuery.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Search/ANDQuery.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Search/ANDQuery.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,84 @@ +Title: Lucy::Search::ANDQuery – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Search::ANDQuery - Intersect multiple result sets.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $foo_and_bar_query = Lucy::Search::ANDQuery->new( + children => [ $foo_query, $bar_query ], +); +my $hits = 
$searcher->hits( query => $foo_and_bar_query ); +...</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>ANDQuery is a composite <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Query</a> which matches only when all of its children match, +so its result set is the intersection of their result sets. +Documents which match receive a summed score.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $foo_and_bar_query = Lucy::Search::ANDQuery->new( + children => [ $foo_query, $bar_query ], +);</pre> + +<p>Create a new ANDQuery.</p> + +<ul> +<li><b>children</b> - An array of child Queries.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="make_compiler" +>make_compiler</a></h3> + +<pre>my $compiler = $and_query->make_compiler( + searcher => $searcher, # required + boost => $boost, # required + subordinate => $subordinate, # default: false +);</pre> + +<p>Abstract factory method returning a Compiler derived from this Query.</p> + +<ul> +<li><b>searcher</b> - A Searcher.</li> + +<li><b>boost</b> - A scoring multiplier.</li> + +<li><b>subordinate</b> - Indicates whether the Query is a subquery (as opposed to a top-level query). 
+If false, +the implementation must invoke <a href="../../Lucy/Search/Compiler.html#normalize" class="podlinkpod" +>normalize()</a> on the newly minted Compiler object before returning it.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Search::ANDQuery isa <a href="../../Lucy/Search/PolyQuery.html" class="podlinkpod" +>Lucy::Search::PolyQuery</a> isa <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Lucy::Search::Query</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Search/Collector.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Search/Collector.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Search/Collector.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Search/Collector.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,79 @@ +Title: Lucy::Search::Collector – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Search::Collector - Process hits.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre># Abstract base class.</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A Collector decides what to do with the hits that a <a href="../../Lucy/Search/Matcher.html" class="podlinkpod" +>Matcher</a> iterates through, +based on how the abstract <a href="#collect" class="podlinkpod" +>collect()</a> method is implemented.</p> + +<p>Collectors operate on individual segments, +but must operate within the context of a larger collection. 
+Each time the collector moves to a new segment, +Set_Reader(), +Set_Base() and Set_Matcher() will be called, +and the collector must take the updated information into account.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>package MyCollector; +use base qw( Lucy::Search::Collector ); +our %foo; +sub new { + my $self = shift->SUPER::new; + my %args = @_; + $foo{$$self} = $args{foo}; + return $self; +}</pre> + +<p>Abstract constructor. +Takes no arguments.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="collect" +>collect</a></h3> + +<pre>$collector->collect($doc_id);</pre> + +<p>Do something with a doc id. +(For instance, +keep track of the docs with the ten highest scores.)</p> + +<ul> +<li><b>doc_id</b> - A segment document id.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Search::Collector isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Search/Collector/BitCollector.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Search/Collector/BitCollector.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Search/Collector/BitCollector.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Search/Collector/BitCollector.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,72 @@ +Title: Lucy::Search::Collector::BitCollector – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Search::Collector::BitCollector - Collector which records doc nums in a BitVector.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $bit_vec = Lucy::Object::BitVector->new( + capacity => $searcher->doc_max + 1, +); +my $bit_collector = 
Lucy::Search::Collector::BitCollector->new( + bit_vector => $bit_vec, +); +$searcher->collect( + collector => $bit_collector, + query => $query, +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>BitCollector is a Collector which saves matching document ids in a <a href="../../../Lucy/Object/BitVector.html" class="podlinkpod" +>BitVector</a>. +It is useful for recording the entire set of documents which matches a query.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $bit_collector = Lucy::Search::Collector::BitCollector->new( + bit_vector => $bit_vec, # required +);</pre> + +<p>Create a new BitCollector.</p> + +<ul> +<li><b>bit_vector</b> - A Lucy::Object::BitVector.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="collect" +>collect</a></h3> + +<pre>$bit_collector->collect($doc_id);</pre> + +<p>Set bit in the object’s BitVector for the supplied doc id.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Search::Collector::BitCollector isa <a href="../../../Lucy/Search/Collector.html" class="podlinkpod" +>Lucy::Search::Collector</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Search/Compiler.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Search/Compiler.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Search/Compiler.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Search/Compiler.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,205 @@ +Title: Lucy::Search::Compiler – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Search::Compiler - Query-to-Matcher compiler.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + 
+<pre># (Compiler is an abstract base class.) +package MyCompiler; +use base qw( Lucy::Search::Compiler ); + +sub make_matcher { + my $self = shift; + return MyMatcher->new( @_, compiler => $self ); +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>The purpose of the Compiler class is to take a specification in the form of a <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Query</a> object and compile a <a href="../../Lucy/Search/Matcher.html" class="podlinkpod" +>Matcher</a> object that can do real work.</p> + +<p>The simplest Compiler subclasses – such as those associated with constant-scoring Query types – might simply implement a <a href="#make_matcher" class="podlinkpod" +>make_matcher()</a> method which passes along information verbatim from the Query to the Matcher’s constructor.</p> + +<p>However, it is common for the Compiler to perform some calculations which affect its “weight” – a floating point multiplier that the Matcher will factor into each document’s score. +If that is the case, +then the Compiler subclass may wish to override <a href="#get_weight" class="podlinkpod" +>get_weight()</a>, +<a href="#sum_of_squared_weights" class="podlinkpod" +>sum_of_squared_weights()</a>, +and <a href="#apply_norm_factor" class="podlinkpod" +>apply_norm_factor()</a>.</p> + +<p>Compiling a Matcher is a two-stage process.</p> + +<p>The first stage takes place during the Compiler’s construction, +which is where the Query object meets a <a href="../../Lucy/Search/Searcher.html" class="podlinkpod" +>Searcher</a> object for the first time. +Searchers operate on a specific document collection and they can tell you certain statistical information about the collection – such as how many total documents are in the collection, +or how many documents in the collection a particular term is present in. 
+Lucy’s core Compiler classes plug this information into the classic TF/IDF weighting algorithm to adjust the Compiler’s weight; custom subclasses might do something similar.</p> + +<p>The second stage of compilation is the <a href="#make_matcher" class="podlinkpod" +>make_matcher()</a> +method, +which is where the Compiler meets a <a href="../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a> object. +SegReaders are associated with a single segment within a single index on a single machine, +and are thus lower-level than Searchers, +which may represent a document collection spread out over a search cluster (comprising several indexes and many segments). +The Compiler object can use new information supplied by the SegReader – such as whether a term is missing from the local index even though it is present within the larger collection represented by the Searcher – when figuring out what to feed to the Matcher’s constructor, +or whether <a href="#make_matcher" class="podlinkpod" +>make_matcher()</a> should return a Matcher at all.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $compiler = MyCompiler->SUPER::new( + parent => $my_query, + searcher => $searcher, + similarity => $sim, # default: undef + boost => undef, # default: see below +);</pre> + +<p>Abstract constructor.</p> + +<ul> +<li><b>parent</b> - The parent Query.</li> + +<li><b>searcher</b> - A Lucy::Search::Searcher, +such as an IndexSearcher.</li> + +<li><b>similarity</b> - A Similarity.</li> + +<li><b>boost</b> - An arbitrary scoring multiplier. 
+Defaults to the boost of the parent Query.</li> +</ul> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="make_matcher" +>make_matcher</a></h3> + +<pre>my $matcher = $compiler->make_matcher( + reader => $reader, # required + need_score => $need_score, # required +);</pre> + +<p>Factory method returning a Matcher.</p> + +<ul> +<li><b>reader</b> - A SegReader.</li> + +<li><b>need_score</b> - Indicate whether the Matcher must implement <a href="../../Lucy/Search/Matcher.html#score" class="podlinkpod" +>score()</a>.</li> +</ul> + +<p>Returns: a Matcher, +or undef if the Matcher would have matched no documents.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_weight" +>get_weight</a></h3> + +<pre>my $float = $compiler->get_weight();</pre> + +<p>Return the Compiler’s numerical weight, +a scoring multiplier. +By default, +returns the object’s boost.</p> + +<h3><a class='u' +name="get_similarity" +>get_similarity</a></h3> + +<pre>my $similarity = $compiler->get_similarity();</pre> + +<p>Accessor for the Compiler’s Similarity object.</p> + +<h3><a class='u' +name="get_parent" +>get_parent</a></h3> + +<pre>my $query = $compiler->get_parent();</pre> + +<p>Accessor for the Compiler’s parent Query object.</p> + +<h3><a class='u' +name="sum_of_squared_weights" +>sum_of_squared_weights</a></h3> + +<pre>my $float = $compiler->sum_of_squared_weights();</pre> + +<p>Compute and return a raw weighting factor. +(This quantity is used by <a href="#normalize" class="podlinkpod" +>normalize()</a>). +By default, +simply returns 1.0.</p> + +<h3><a class='u' +name="apply_norm_factor" +>apply_norm_factor</a></h3> + +<pre>$compiler->apply_norm_factor($factor);</pre> + +<p>Apply a floating point normalization multiplier. 
+For a TermCompiler, +this involves multiplying its own weight by the supplied factor; combining classes such as ORCompiler would apply the factor recursively to their children.</p> + +<p>The default implementation is a no-op; subclasses may wish to multiply their internal weight by the supplied factor.</p> + +<ul> +<li><b>factor</b> - The multiplier.</li> +</ul> + +<h3><a class='u' +name="normalize" +>normalize</a></h3> + +<pre>$compiler->normalize();</pre> + +<p>Take a newly minted Compiler object and apply query-specific normalization factors. +Should be invoked by Query subclasses during <a href="../../Lucy/Search/Query.html#make_compiler" class="podlinkpod" +>make_compiler()</a> for top-level nodes.</p> + +<p>For a TermQuery, +the scoring formula is approximately:</p> + +<pre>(tf_d * idf_t / norm_d) * (tf_q * idf_t / norm_q)</pre> + +<p><a href="#normalize" class="podlinkpod" +>normalize()</a> is theoretically concerned with applying the second half of that formula to the Compiler’s weight. 
+What actually happens depends on how the Compiler and Similarity methods called internally are implemented.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Search::Compiler isa <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Lucy::Search::Query</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Search/Hits.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Search/Hits.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Search/Hits.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Search/Hits.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,60 @@ +Title: Lucy::Search::Hits – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Search::Hits - Access search results.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $hits = $searcher->hits( + query => $query, + offset => 0, + num_wanted => 10, +); +while ( my $hit = $hits->next ) { + print "<p>$hit->{title} <em>" . $hit->get_score . "</em></p>\n"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Hits objects are iterators used to access the results of a search.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="next" +>next</a></h3> + +<pre>my $hit_doc = $hits->next();</pre> + +<p>Return the next hit, +or undef when the iterator is exhausted.</p> + +<h3><a class='u' +name="total_hits" +>total_hits</a></h3> + +<pre>my $int = $hits->total_hits();</pre> + +<p>Return the total number of documents which matched the Query used to produce the Hits object. 
+Note that this is the total number of matches, +not just the number of matches represented by the Hits iterator.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Search::Hits isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/perl/Lucy/Search/IndexSearcher.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/perl/Lucy/Search/IndexSearcher.mdtext?rev=1737642&view=auto ============================================================================== --- lucy/site/trunk/content/docs/perl/Lucy/Search/IndexSearcher.mdtext (added) +++ lucy/site/trunk/content/docs/perl/Lucy/Search/IndexSearcher.mdtext Mon Apr 4 09:22:30 2016 @@ -0,0 +1,137 @@ +Title: Lucy::Search::IndexSearcher – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Search::IndexSearcher - Execute searches against a single index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $searcher = Lucy::Search::IndexSearcher->new( + index => '/path/to/index' +); +my $hits = $searcher->hits( + query => 'foo bar', + offset => 0, + num_wanted => 100, +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Use the IndexSearcher class to perform search queries against an index. +(For searching multiple indexes at once, +see <a href="../../Lucy/Search/PolySearcher.html" class="podlinkpod" +>PolySearcher</a>).</p> + +<p>IndexSearchers operate against a single point-in-time view or <a href="../../Lucy/Index/Snapshot.html" class="podlinkpod" +>Snapshot</a> of the index. 
+If an index is modified, +a new IndexSearcher must be opened to access the changes.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $searcher = Lucy::Search::IndexSearcher->new( + index => '/path/to/index' +);</pre> + +<p>Create a new IndexSearcher.</p> + +<ul> +<li><b>index</b> - Either a string filepath, +a Folder, +or an IndexReader.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="doc_max" +>doc_max</a></h3> + +<pre>my $int = $index_searcher->doc_max();</pre> + +<p>Return the maximum number of docs in the collection represented by the Searcher, +which is also the highest possible internal doc id. +Documents which have been marked as deleted but not yet purged are included in this count.</p> + +<h3><a class='u' +name="doc_freq" +>doc_freq</a></h3> + +<pre>my $int = $index_searcher->doc_freq( + field => $field, # required + term => $term, # required +);</pre> + +<p>Return the number of documents which contain the term in the given field.</p> + +<ul> +<li><b>field</b> - Field name.</li> + +<li><b>term</b> - The term to look up.</li> +</ul> + +<h3><a class='u' +name="collect" +>collect</a></h3> + +<pre>$index_searcher->collect( + query => $query, # required + collector => $collector, # required +);</pre> + +<p>Iterate over hits, +feeding them into a <a href="../../Lucy/Search/Collector.html" class="podlinkpod" +>Collector</a>.</p> + +<ul> +<li><b>query</b> - A Query.</li> + +<li><b>collector</b> - A Collector.</li> +</ul> + +<h3><a class='u' +name="fetch_doc" +>fetch_doc</a></h3> + +<pre>my $hit_doc = $index_searcher->fetch_doc($doc_id);</pre> + +<p>Retrieve a document. 
+Throws an error if the doc id is out of range.</p> + +<ul> +<li><b>doc_id</b> - A document id.</li> +</ul> + +<h3><a class='u' +name="get_reader" +>get_reader</a></h3> + +<pre>my $index_reader = $index_searcher->get_reader();</pre> + +<p>Accessor for the object’s <code>reader</code> member.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Search::IndexSearcher isa <a href="../../Lucy/Search/Searcher.html" class="podlinkpod" +>Lucy::Search::Searcher</a> isa Clownfish::Obj.</p> + +</div>