Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,208 @@ +Title: Lucy::Docs::Tutorial::BeyondSimpleTutorial + +<div class="c-api"> +<h2>A more flexible app structure.</h2> +<h3>Goal</h3> +<p>In this tutorial chapter, we'll refactor the apps we built in +<a href="../../../Lucy/Docs/Tutorial/SimpleTutorial.html">SimpleTutorial</a> so that they look exactly the same from +the end user's point of view, but offer the developer greater possibilities for +expansion.</p> +<p>To achieve this, we'll ditch Lucy::Simple and replace it with the +classes that it uses internally:</p> +<ul> +<li><a href="../../../Lucy/Plan/Schema.html">Schema</a> - Plan out your index.</li> +<li><a href="../../../Lucy/Plan/FullTextType.html">FullTextType</a> - Field type for full text search.</li> +<li><a href="../../../Lucy/Analysis/EasyAnalyzer.html">EasyAnalyzer</a> - A one-size-fits-all parser/tokenizer.</li> +<li><a href="../../../Lucy/Index/Indexer.html">Indexer</a> - Manipulate index content.</li> +<li><a href="../../../Lucy/Search/IndexSearcher.html">IndexSearcher</a> - Search an index.</li> +<li><a href="../../../Lucy/Search/Hits.html">Hits</a> - Iterate over hits returned by a Searcher.</li> +</ul> +<h3>Adaptations to indexer.pl</h3> +<p>After we load our modules...</p> +<pre><code class="language-c">#include <dirent.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#define CFISH_USE_SHORT_NAMES +#define LUCY_USE_SHORT_NAMES +#include "Clownfish/String.h" +#include "Lucy/Analysis/EasyAnalyzer.h" 
+#include "Lucy/Document/Doc.h" +#include "Lucy/Index/Indexer.h" +#include "Lucy/Plan/FullTextType.h" +#include "Lucy/Plan/StringType.h" +#include "Lucy/Plan/Schema.h" + +const char path_to_index[] = "/path/to/index"; +const char uscon_source[] = "/usr/local/apache2/htdocs/us_constitution"; +</code></pre> +<p>... the first item we're going to need is a <a href="../../../Lucy/Plan/Schema.html">Schema</a>.</p> +<p>The primary job of a Schema is to specify what fields are available and how +they're defined. We'll start off with three fields: title, content and url.</p> +<pre><code class="language-c">static Schema* +S_create_schema() { + // Create a new schema. + Schema *schema = Schema_new(); + + // Create an analyzer. + String *language = Str_newf("en"); + EasyAnalyzer *analyzer = EasyAnalyzer_new(language); + + // Specify fields. + + FullTextType *type = FullTextType_new((Analyzer*)analyzer); + + { + String *field_str = Str_newf("title"); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(field_str); + } + + { + String *field_str = Str_newf("content"); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(field_str); + } + + { + String *field_str = Str_newf("url"); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(field_str); + } + + DECREF(type); + DECREF(analyzer); + DECREF(language); + return schema; +} +</code></pre> +<p>All of the fields are spec'd out using the <a href="../../../Lucy/Plan/FullTextType.html">FullTextType</a> FieldType, +indicating that they will be searchable as "full text", which means that +they can be searched for individual words. The "analyzer", which is unique to +FullTextType fields, is what breaks up the text into searchable tokens.</p> +<p>Next, we'll swap our Lucy::Simple object out for an <a href="../../../Lucy/Index/Indexer.html">Indexer</a>. 
+The substitution will be straightforward because Simple has merely been +serving as a thin wrapper around an inner Indexer, and we'll just be peeling +away the wrapper.</p> +<p>First, replace the constructor:</p> +<pre><code class="language-c">int +main() { + // Initialize the library. + lucy_bootstrap_parcel(); + + Schema *schema = S_create_schema(); + String *folder = Str_newf("%s", path_to_index); + + Indexer *indexer = Indexer_new(schema, (Obj*)folder, NULL, + Indexer_CREATE | Indexer_TRUNCATE); + +</code></pre> +<p>Next, call <a href="../../../Lucy/Index/Indexer.html#func_Add_Doc">Add_Doc()</a> on the <code>indexer</code> object wherever +the <code>lucy</code> object was adding the document before:</p> +<pre><code class="language-c"> DIR *dir = opendir(uscon_source); + if (dir == NULL) { + perror(uscon_source); + return 1; + } + + for (struct dirent *entry = readdir(dir); + entry; + entry = readdir(dir)) { + + if (S_ends_with(entry->d_name, ".txt")) { + Doc *doc = S_parse_file(entry->d_name); + Indexer_Add_Doc(indexer, doc, 1.0); + DECREF(doc); + } + } + + closedir(dir); +</code></pre> +<p>There's only one extra step required: at the end of the app, you must call +Commit() explicitly to close the indexing session and commit your changes. +(Lucy::Simple hides this detail, calling Commit() implicitly when it needs to).</p> +<pre><code class="language-c"> Indexer_Commit(indexer); + + DECREF(indexer); + DECREF(folder); + DECREF(schema); + return 0; +} +</code></pre> +<h3>Adaptations to search.cgi</h3> +<p>In our search app as in our indexing app, Lucy::Simple has served as a +thin wrapper, this time around <a href="../../../Lucy/Search/IndexSearcher.html">IndexSearcher</a> and +<a href="../../../Lucy/Search/Hits.html">Hits</a>. 
Swapping out Simple for these two classes is +also straightforward:</p> +<pre><code class="language-c">#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#define CFISH_USE_SHORT_NAMES +#define LUCY_USE_SHORT_NAMES +#include "Clownfish/String.h" +#include "Lucy/Document/HitDoc.h" +#include "Lucy/Search/Hits.h" +#include "Lucy/Search/IndexSearcher.h" + +const char path_to_index[] = "/path/to/index"; + +int +main(int argc, char *argv[]) { + // Initialize the library. + lucy_bootstrap_parcel(); + + if (argc < 2) { + printf("Usage: %s <querystring>\n", argv[0]); + return 0; + } + + const char *query_c = argv[1]; + + printf("Searching for: %s\n\n", query_c); + + String *folder = Str_newf("%s", path_to_index); + IndexSearcher *searcher = IxSearcher_new((Obj*)folder); + + String *query_str = Str_newf("%s", query_c); + Hits *hits = IxSearcher_Hits(searcher, (Obj*)query_str, 0, 10, NULL); + + String *title_str = Str_newf("title"); + String *url_str = Str_newf("url"); + HitDoc *hit; + int i = 1; + + // Loop over search results. + while (NULL != (hit = Hits_Next(hits))) { + String *title = (String*)HitDoc_Extract(hit, title_str); + char *title_c = Str_To_Utf8(title); + + String *url = (String*)HitDoc_Extract(hit, url_str); + char *url_c = Str_To_Utf8(url); + + printf("Result %d: %s (%s)\n", i, title_c, url_c); + + free(url_c); + free(title_c); + DECREF(url); + DECREF(title); + DECREF(hit); + i++; + } + + DECREF(url_str); + DECREF(title_str); + DECREF(hits); + DECREF(query_str); + DECREF(searcher); + DECREF(folder); + return 0; +} +</code></pre> +<h3>Hooray!</h3> +<p>Congratulations! Your apps do the same thing as before... but now they'll be +easier to customize.</p> +<p>In our next chapter, <a href="../../../Lucy/Docs/Tutorial/FieldTypeTutorial.html">FieldTypeTutorial</a>, we'll explore +how to assign different behaviors to different fields.</p> +</div>
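The indexer listing in this chapter calls `S_ends_with()` on each directory entry but never shows its definition. A minimal sketch of such a helper follows, assuming it is a plain byte-wise suffix test on C strings; the companion `S_starts_with()` is included on the same assumption, since a later chapter uses it to derive a category from the file name. Neither body is taken from the Lucy sources.

```c
#include <stdbool.h>
#include <string.h>

// Assumed semantics: true if `str` ends with `suffix` (byte-wise compare).
static bool
S_ends_with(const char *str, const char *suffix) {
    size_t len        = strlen(str);
    size_t suffix_len = strlen(suffix);
    if (suffix_len > len) {
        return false;
    }
    return memcmp(str + len - suffix_len, suffix, suffix_len) == 0;
}

// Assumed semantics: true if `str` begins with `prefix` (byte-wise compare).
static bool
S_starts_with(const char *str, const char *prefix) {
    return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

With these definitions, `S_ends_with(entry->d_name, ".txt")` accepts `amend1.txt` and rejects names shorter than the suffix or names with a different extension.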
Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,63 @@ +Title: Lucy::Docs::Tutorial::FieldTypeTutorial + +<div class="c-api"> +<h2>Specify per-field properties and behaviors.</h2> +<p>The Schema we used in the last chapter specifies three fields:</p> +<pre><code class="language-c"> FullTextType *type = FullTextType_new((Analyzer*)analyzer); + + { + String *field_str = Str_newf("title"); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(field_str); + } + + { + String *field_str = Str_newf("content"); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(field_str); + } + + { + String *field_str = Str_newf("url"); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(field_str); + } + +</code></pre> +<p>Since they are all defined as "full text" fields, they are all searchable, +including the <code>url</code> field, a dubious choice. Some URLs contain meaningful +information, but these don't, really:</p> +<pre><code>http://example.com/us_constitution/amend1.txt +</code></pre> +<p>We may as well not bother indexing the URL content. To achieve that we need +to assign the <code>url</code> field to a different FieldType.</p> +<h3>StringType</h3> +<p>Instead of FullTextType, we'll use a +<a href="../../../Lucy/Plan/StringType.html">StringType</a>, which doesn't use an +Analyzer to break up text into individual tokens. 
Furthermore, we'll mark +this StringType as unindexed, so that its content won't be searchable at all.</p> +<pre><code class="language-c"> { + String *field_str = Str_newf("url"); + StringType *type = StringType_new(); + StringType_Set_Indexed(type, false); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(type); + DECREF(field_str); + } +</code></pre> +<p>To observe the change in behavior, try searching for <code>us_constitution</code> both +before and after changing the Schema and re-indexing.</p> +<h3>Toggling "stored"</h3> +<p>For a taste of other FieldType possibilities, try turning off <code>stored</code> for +one or more fields.</p> +<pre><code class="language-c"> FullTextType *content_type = FullTextType_new((Analyzer*)analyzer); + FullTextType_Set_Stored(content_type, false); +</code></pre> +<p>Turning off <code>stored</code> for either <code>title</code> or <code>url</code> mangles our results page, +but since we're not displaying <code>content</code>, turning it off for <code>content</code> has +no effect, except on index size.</p> +<h3>Analyzers up next</h3> +<p>Analyzers play a crucial role in the behavior of FullTextType fields. 
In our +next tutorial chapter, <a href="../../../Lucy/Docs/Tutorial/AnalysisTutorial.html">AnalysisTutorial</a>, we'll see how +changing up the Analyzer changes search results.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,72 @@ +Title: Lucy::Docs::Tutorial::HighlighterTutorial + +<div class="c-api"> +<h2>Augment search results with highlighted excerpts.</h2> +<p>Adding relevant excerpts with highlighted search terms to your search results +display makes it much easier for end users to scan the page and assess which +hits look promising, dramatically improving their search experience.</p> +<h3>Adaptations to indexer.pl</h3> +<p><a href="../../../Lucy/Highlight/Highlighter.html">Highlighter</a> uses information generated at index +time. 
To save resources, highlighting is disabled by default and must be +turned on for individual fields.</p> +<pre><code class="language-c"> { + String *field_str = Str_newf("content"); + FullTextType *type = FullTextType_new((Analyzer*)analyzer); + FullTextType_Set_Highlightable(type, true); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(type); + DECREF(field_str); + } +</code></pre> +<h3>Adaptations to search.cgi</h3> +<p>To add highlighting and excerpting to the search.cgi sample app, create a +<code>highlighter</code> object outside the hits iterating loop...</p> +<pre><code class="language-c"> String *content_str = Str_newf("content"); + Highlighter *highlighter + = Highlighter_new((Searcher*)searcher, (Obj*)query_str, + content_str, 200); +</code></pre> +<p>... then modify the loop and the per-hit display to generate and include the +excerpt.</p> +<pre><code class="language-c"> String *title_str = Str_newf("title"); + String *url_str = Str_newf("url"); + HitDoc *hit; + i = 1; + + // Loop over search results. 
+ while (NULL != (hit = Hits_Next(hits))) { + String *title = (String*)HitDoc_Extract(hit, title_str); + char *title_c = Str_To_Utf8(title); + + String *url = (String*)HitDoc_Extract(hit, url_str); + char *url_c = Str_To_Utf8(url); + + String *excerpt = Highlighter_Create_Excerpt(highlighter, hit); + char *excerpt_c = Str_To_Utf8(excerpt); + + printf("Result %d: %s (%s)\n%s\n\n", i, title_c, url_c, excerpt_c); + + free(excerpt_c); + free(url_c); + free(title_c); + DECREF(excerpt); + DECREF(url); + DECREF(title); + DECREF(hit); + i++; + } + + DECREF(url_str); + DECREF(title_str); + DECREF(hits); + DECREF(query_str); + DECREF(highlighter); + DECREF(content_str); + DECREF(searcher); + DECREF(folder); +</code></pre> +<h3>Next chapter: Query objects</h3> +<p>Our next tutorial chapter, <a href="../../../Lucy/Docs/Tutorial/QueryObjectsTutorial.html">QueryObjectsTutorial</a>, +illustrates how to build an "advanced search" interface using +<a href="../../../Lucy/Search/Query.html">Query</a> objects instead of query strings.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,181 @@ +Title: Lucy::Docs::Tutorial::QueryObjectsTutorial + +<div class="c-api"> +<h2>Use Query objects instead of query strings.</h2> +<p>Until now, our search app has had only a single search box. In this tutorial +chapter, we'll move towards an "advanced search" interface, by adding a +"category" drop-down menu. 
Three new classes will be required:</p> +<ul> +<li> +<p><a href="../../../Lucy/Search/QueryParser.html">QueryParser</a> - Turn a query string into a +<a href="../../../Lucy/Search/Query.html">Query</a> object.</p> +</li> +<li> +<p><a href="../../../Lucy/Search/TermQuery.html">TermQuery</a> - Query for a specific term within +a specific field.</p> +</li> +<li> +<p><a href="../../../Lucy/Search/ANDQuery.html">ANDQuery</a> - "AND" together multiple Query +objects to produce an intersected result set.</p> +</li> +</ul> +<h3>Adaptations to indexer.pl</h3> +<p>Our new "category" field will be a StringType field rather than a FullTextType +field, because we will only be looking for exact matches. It needs to be +indexed, but since we won't display its value, it doesn't need to be stored.</p> +<pre><code class="language-c"> { + String *field_str = Str_newf("category"); + StringType *type = StringType_new(); + StringType_Set_Stored(type, false); + Schema_Spec_Field(schema, field_str, (FieldType*)type); + DECREF(type); + DECREF(field_str); + } +</code></pre> +<p>There will be three possible values: "article", "amendment", and "preamble", +which we'll hack out of the source file's name during our <code>parse_file</code> +subroutine:</p> +<pre><code class="language-c"> const char *category = NULL; + if (S_starts_with(filename, "art")) { + category = "article"; + } + else if (S_starts_with(filename, "amend")) { + category = "amendment"; + } + else if (S_starts_with(filename, "preamble")) { + category = "preamble"; + } + else { + fprintf(stderr, "Can't derive category for %s", filename); + exit(1); + } + + ... 
+ + { + // Store 'category' field + String *field = Str_newf("category"); + String *value = Str_new_from_utf8(category, strlen(category)); + Doc_Store(doc, field, (Obj*)value); + DECREF(field); + DECREF(value); + } +</code></pre> +<h3>Adaptations to search.cgi</h3> +<p>The "category" constraint will be added to our search interface using an HTML +"select" element (this routine will need to be integrated into the HTML +generation section of search.cgi):</p> +<pre><code class="language-c">static void +S_usage_and_exit(const char *arg0) { + printf("Usage: %s [-c <category>] <querystring>\n", arg0); + exit(1); +} +</code></pre> +<p>We'll start off by loading our new modules and extracting our new CGI +parameter.</p> +<pre><code class="language-c"> const char *category = NULL; + int i = 1; + + while (i < argc - 1) { + if (strcmp(argv[i], "-c") == 0) { + if (i + 1 >= argc) { + S_usage_and_exit(argv[0]); + } + i += 1; + category = argv[i]; + } + else { + S_usage_and_exit(argv[0]); + } + + i += 1; + } + + if (i + 1 != argc) { + S_usage_and_exit(argv[0]); + } + + const char *query_c = argv[i]; +</code></pre> +<p>QueryParser's constructor requires a "schema" argument. We can get that from +our IndexSearcher:</p> +<pre><code class="language-c"> IndexSearcher *searcher = IxSearcher_new((Obj*)folder); + Schema *schema = IxSearcher_Get_Schema(searcher); + QueryParser *qparser = QParser_new(schema, NULL, NULL, NULL); +</code></pre> +<p>Previously, we have been handing raw query strings to IndexSearcher. Behind +the scenes, IndexSearcher has been using a QueryParser to turn those query +strings into Query objects. 
Now, we will bring QueryParser into the +foreground and parse the strings explicitly.</p> +<pre><code class="language-c"> Query *query = QParser_Parse(qparser, query_str); +</code></pre> +<p>If the user has specified a category, we'll use an ANDQuery to join our parsed +query together with a TermQuery representing the category.</p> +<pre><code class="language-c"> if (category) { + String *category_name = Str_newf("category"); + String *category_str = Str_newf("%s", category); + TermQuery *category_query + = TermQuery_new(category_name, (Obj*)category_str); + + Vector *children = Vec_new(2); + Vec_Push(children, (Obj*)query); + Vec_Push(children, (Obj*)category_query); + query = (Query*)ANDQuery_new(children); + + DECREF(children); + DECREF(category_str); + DECREF(category_name); + } +</code></pre> +<p>Now when we execute the query...</p> +<pre><code class="language-c"> Hits *hits = IxSearcher_Hits(searcher, (Obj*)query, 0, 10, NULL); +</code></pre> +<p>... we'll get a result set which is the intersection of the parsed query and +the category query.</p> +<h3>Using TermQuery with full text fields</h3> +<p>When querying full text fields, the easiest way is to create query objects +using QueryParser. But sometimes you want to create a TermQuery for a single +term in a FullTextType field directly. In this case, we have to run the +search term through the field's analyzer to make sure it gets normalized in +the same way as the field's content.</p> +<pre><code class="language-c">Query* +make_term_query(Schema *schema, String *field, String *term) { + FieldType *type = Schema_Fetch_Type(schema, field); + String *token = NULL; + + if (FieldType_is_a(type, FULLTEXTTYPE)) { + // Run the term through the full text analysis chain. 
+ Analyzer *analyzer = FullTextType_Get_Analyzer((FullTextType*)type); + Vector *tokens = Analyzer_Split(analyzer, term); + + if (Vec_Get_Size(tokens) != 1) { + // If the term expands to more than one token, or no + // tokens at all, it will never match a single token in + // the full text field. + DECREF(tokens); + return (Query*)NoMatchQuery_new(); + } + + token = (String*)Vec_Delete(tokens, 0); + DECREF(tokens); + } + else { + // Exact match for other types. + token = (String*)INCREF(term); + } + + TermQuery *term_query = TermQuery_new(field, (Obj*)token); + + DECREF(token); + return (Query*)term_query; +} +</code></pre> +<h3>Congratulations!</h3> +<p>You've made it to the end of the tutorial.</p> +<h3>See Also</h3> +<p>For additional thematic documentation, see the Apache Lucy +<a href="../../../Lucy/Docs/Cookbook.html">Cookbook</a>.</p> +<p>ANDQuery has a companion class, <a href="../../../Lucy/Search/ORQuery.html">ORQuery</a>, and a +close relative, <a href="../../../Lucy/Search/RequiredOptionalQuery.html">RequiredOptionalQuery</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/SimpleTutorial.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/SimpleTutorial.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/SimpleTutorial.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Docs/Tutorial/SimpleTutorial.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,223 @@ +Title: Lucy::Docs::Tutorial::SimpleTutorial + +<div class="c-api"> +<h2>Bare-bones search app.</h2> +<h3>Setup</h3> +<p>Copy the text presentation of the US Constitution from the <code>sample</code> directory +of the Apache Lucy distribution to the base level of your web server's +<code>htdocs</code> directory.</p> +<pre><code>$ cp -R sample/us_constitution /usr/local/apache2/htdocs/ +</code></pre> +<h3>Indexing: 
indexer.pl</h3> +<p>Our first task will be to create an application called <code>indexer.pl</code> which +builds a searchable "inverted index" from a collection of documents.</p> +<p>After we specify some configuration variables and load all necessary +modules...</p> +<pre><code class="language-c">#include <dirent.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#define CFISH_USE_SHORT_NAMES +#define LUCY_USE_SHORT_NAMES +#include "Clownfish/String.h" +#include "Lucy/Simple.h" +#include "Lucy/Document/Doc.h" + +const char path_to_index[] = "lucy_index"; +const char uscon_source[] = "../../common/sample/us_constitution"; +</code></pre> +<p>... we'll start by creating a <a href="../../../Lucy/Simple.html">Lucy::Simple</a> object, telling it +where we'd like the index to be located and the language of the source +material.</p> +<pre><code class="language-c">int +main() { + // Initialize the library. + lucy_bootstrap_parcel(); + + String *folder = Str_newf("%s", path_to_index); + String *language = Str_newf("en"); + Simple *lucy = Simple_new((Obj*)folder, language); +</code></pre> +<p>Next, we'll add a subroutine which parses our sample documents.</p> +<pre><code class="language-c">Doc* +S_parse_file(const char *filename) { + size_t bytes = strlen(uscon_source) + 1 + strlen(filename) + 1; + char *path = (char*)malloc(bytes); + path[0] = '\0'; + strcat(path, uscon_source); + strcat(path, "/"); + strcat(path, filename); + + FILE *stream = fopen(path, "r"); + if (stream == NULL) { + perror(path); + exit(1); + } + + char *title = NULL; + char *bodytext = NULL; + if (fscanf(stream, "%m[^\r\n] %m[\x01-\x7F]", &title, &bodytext) != 2) { + fprintf(stderr, "Can't extract title/bodytext from '%s'", path); + exit(1); + } + + Doc *doc = Doc_new(NULL, 0); + + { + // Store 'title' field + String *field = Str_newf("title"); + String *value = Str_new_from_utf8(title, strlen(title)); + Doc_Store(doc, field, (Obj*)value); + DECREF(field); + DECREF(value); + } + + { + // 
Store 'content' field + String *field = Str_newf("content"); + String *value = Str_new_from_utf8(bodytext, strlen(bodytext)); + Doc_Store(doc, field, (Obj*)value); + DECREF(field); + DECREF(value); + } + + { + // Store 'url' field + String *field = Str_newf("url"); + String *value = Str_new_from_utf8(filename, strlen(filename)); + Doc_Store(doc, field, (Obj*)value); + DECREF(field); + DECREF(value); + } + + fclose(stream); + free(bodytext); + free(title); + free(path); + return doc; +} +</code></pre> +<p>Add some elementary directory reading code...</p> +<pre><code class="language-c"> DIR *dir = opendir(uscon_source); + if (dir == NULL) { + perror(uscon_source); + return 1; + } +</code></pre> +<p>... and now we're ready for the meat of indexer.pl, which occupies exactly +one line of code.</p> +<pre><code class="language-c"> for (struct dirent *entry = readdir(dir); + entry; + entry = readdir(dir)) { + + if (S_ends_with(entry->d_name, ".txt")) { + Doc *doc = S_parse_file(entry->d_name); + Simple_Add_Doc(lucy, doc); // ta-da! + DECREF(doc); + } + } + + closedir(dir); + + DECREF(lucy); + DECREF(language); + DECREF(folder); + return 0; +} +</code></pre> +<h3>Search: search.cgi</h3> +<p>As with our indexing app, the bulk of the code in our search script won't be +Lucy-specific.</p> +<p>The beginning is dedicated to CGI processing and configuration.</p> +<pre><code class="language-c">#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#define CFISH_USE_SHORT_NAMES +#define LUCY_USE_SHORT_NAMES +#include "Clownfish/String.h" +#include "Lucy/Document/HitDoc.h" +#include "Lucy/Simple.h" + +const char path_to_index[] = "lucy_index"; + +static void +S_usage_and_exit(const char *arg0) { + printf("Usage: %s <querystring>\n", arg0); + exit(1); +} + +int +main(int argc, char *argv[]) { + // Initialize the library. 
+ lucy_bootstrap_parcel(); + + if (argc != 2) { + S_usage_and_exit(argv[0]); + } + + const char *query_c = argv[1]; + + printf("Searching for: %s\n\n", query_c); +</code></pre> +<p>Once that's out of the way, we create our Lucy::Simple object and feed +it a query string.</p> +<pre><code class="language-c"> String *folder = Str_newf("%s", path_to_index); + String *language = Str_newf("en"); + Simple *lucy = Simple_new((Obj*)folder, language); + + String *query_str = Str_newf("%s", query_c); + Simple_Search(lucy, query_str, 0, 10); +</code></pre> +<p>The value returned by <a href="../../../Lucy/Simple.html#func_Search">Search()</a> is the total number of documents +in the collection which matched the query. We'll show this hit count to the +user, and also use it in conjunction with the parameters <code>offset</code> and +<code>num_wanted</code> to break up results into "pages" of manageable size.</p> +<p>Calling <a href="../../../Lucy/Simple.html#func_Search">Search()</a> on our Simple object turns it into an iterator. +Invoking <a href="../../../Lucy/Simple.html#func_Next">Next()</a> now returns hits one at a time as <a href="../../../Lucy/Document/HitDoc.html">HitDoc</a> +objects, starting with the most relevant.</p> +<pre><code class="language-c"> String *title_str = Str_newf("title"); + String *url_str = Str_newf("url"); + HitDoc *hit; + int i = 1; + + // Loop over search results. 
+ while (NULL != (hit = Simple_Next(lucy))) { + String *title = (String*)HitDoc_Extract(hit, title_str); + char *title_c = Str_To_Utf8(title); + + String *url = (String*)HitDoc_Extract(hit, url_str); + char *url_c = Str_To_Utf8(url); + + printf("Result %d: %s (%s)\n", i, title_c, url_c); + + free(url_c); + free(title_c); + DECREF(url); + DECREF(title); + DECREF(hit); + i++; + } + + DECREF(url_str); + DECREF(title_str); + DECREF(query_str); + DECREF(lucy); + DECREF(language); + DECREF(folder); + return 0; +} +</code></pre> +<p>The rest of the script is just text wrangling.</p> +<pre><code>Code example for C is missing</code></pre> +<h3>OK... now what?</h3> +<p>Lucy::Simple is perfectly adequate for some tasks, but it's not very flexible. +Many people find that it doesn't do at least one or two things they can't live +without.</p> +<p>In our next tutorial chapter, +<a href="../../../Lucy/Docs/Tutorial/BeyondSimpleTutorial.html">BeyondSimpleTutorial</a>, we'll rewrite our +indexing and search scripts using the classes that Lucy::Simple hides +from view, opening up the possibilities for expansion; then, we'll spend the +rest of the tutorial chapters exploring these possibilities.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/Doc.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/Doc.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/Doc.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/Doc.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,170 @@ +Title: Lucy::Document::Doc - C API Documentation + +<div class="c-api"> +<h2>Lucy::Document::Doc</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>DOC</code></td> +</tr> +<tr> +<td class="label">struct 
symbol</td> +<td><code><span class="prefix">lucy_</span>Doc</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>Doc</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Document/Doc.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Document::Doc - A document.</p> +<h3>Description</h3> +<p>A Doc object is akin to a row in a database, in that it is made up of one +or more fields, each of which has a value.</p> +<h3>Functions</h3> +<dl> +<dt id="func_new">new</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>Doc* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Doc_new</strong>( + void *<strong>fields</strong>, + int32_t <strong>doc_id</strong> +); +</code></pre> +<p>Create a new Document.</p> +<dl> +<dt>fields</dt> +<dd><p>Field-value pairs.</p> +</dd> +<dt>doc_id</dt> +<dd><p>Internal Lucy document id. Default of 0 (an +invalid doc id).</p> +</dd> +</dl> +</dd> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>Doc* +<span class="prefix">lucy_</span><strong>Doc_init</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong>, + void *<strong>fields</strong>, + int32_t <strong>doc_id</strong> +); +</code></pre> +<p>Initialize a Document.</p> +<dl> +<dt>fields</dt> +<dd><p>Field-value pairs.</p> +</dd> +<dt>doc_id</dt> +<dd><p>Internal Lucy document id. 
Default of 0 (an +invalid doc id).</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Set_Doc_ID">Set_Doc_ID</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>Doc_Set_Doc_ID</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong>, + int32_t <strong>doc_id</strong> +); +</code></pre> +<p>Set internal Lucy document id.</p> +</dd> +<dt id="func_Get_Doc_ID">Get_Doc_ID</dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>Doc_Get_Doc_ID</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong> +); +</code></pre> +<p>Retrieve internal Lucy document id.</p> +</dd> +<dt id="func_Store">Store</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>Doc_Store</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>value</strong> +); +</code></pre> +<p>Store a field value in the Doc.</p> +<dl> +<dt>field</dt> +<dd><p>The field name</p> +</dd> +<dt>value</dt> +<dd><p>The value</p> +</dd> +</dl> +</dd> +<dt id="func_Get_Fields">Get_Fields</dt> +<dd> +<pre><code>void* +<span class="prefix">lucy_</span><strong>Doc_Get_Fields</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong> +); +</code></pre> +<p>Return the Doc's backing fields hash.</p> +</dd> +<dt id="func_Get_Size">Get_Size</dt> +<dd> +<pre><code>uint32_t +<span class="prefix">lucy_</span><strong>Doc_Get_Size</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong> +); +</code></pre> +<p>Return the number of fields in the Doc.</p> +</dd> +<dt id="func_Extract">Extract</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Doc_Extract</strong>( + <span 
class="prefix">lucy_</span>Doc *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong> +); +</code></pre> +<p>Retrieve the field’s value, or NULL if the field is not present.</p> +</dd> +<dt id="func_Field_Names">Field_Names</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Doc_Field_Names</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong> +); +</code></pre> +<p>Return a list of names of all fields present.</p> +</dd> +<dt id="func_Equals">Equals</dt> +<dd> +<pre><code>bool +<span class="prefix">lucy_</span><strong>Doc_Equals</strong>( + <span class="prefix">lucy_</span>Doc *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>other</strong> +); +</code></pre> +<p>Indicate whether two objects are the same. 
By default, compares the +memory address.</p> +<dl> +<dt>other</dt> +<dd><p>Another Obj.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Document::Doc is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/HitDoc.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/HitDoc.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/HitDoc.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Document/HitDoc.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,150 @@ +Title: Lucy::Document::HitDoc – C API Documentation + +<div class="c-api"> +<h2>Lucy::Document::HitDoc</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>HITDOC</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>HitDoc</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>HitDoc</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Document/HitDoc.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Document::HitDoc – A document read from an index.</p> +<h3>Description</h3> +<p>HitDoc is the search-time relative of the index-time class Doc; it is +augmented by a numeric score attribute that Doc doesn’t have.</p> +<h3>Methods</h3> +<dl> +<dt id="func_Set_Score">Set_Score</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>HitDoc_Set_Score</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong>, + float <strong>score</strong> +); +</code></pre> +<p>Set score attribute.</p> +</dd> +<dt id="func_Get_Score">Get_Score</dt> +<dd> +<pre><code>float +<span 
class="prefix">lucy_</span><strong>HitDoc_Get_Score</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong> +); +</code></pre> +<p>Get score attribute.</p> +</dd> +<dt id="func_Equals">Equals</dt> +<dd> +<pre><code>bool +<span class="prefix">lucy_</span><strong>HitDoc_Equals</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>other</strong> +); +</code></pre> +<p>Indicate whether two objects are the same. By default, compares the +memory address.</p> +<dl> +<dt>other</dt> +<dd><p>Another Obj.</p> +</dd> +</dl> +</dd> +</dl> +<h4>Methods inherited from Lucy::Document::Doc</h4> +<dl> +<dt id="func_Set_Doc_ID">Set_Doc_ID</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>HitDoc_Set_Doc_ID</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong>, + int32_t <strong>doc_id</strong> +); +</code></pre> +<p>Set internal Lucy document id.</p> +</dd> +<dt id="func_Get_Doc_ID">Get_Doc_ID</dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>HitDoc_Get_Doc_ID</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong> +); +</code></pre> +<p>Retrieve internal Lucy document id.</p> +</dd> +<dt id="func_Store">Store</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>HitDoc_Store</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>value</strong> +); +</code></pre> +<p>Store a field value in the Doc.</p> +<dl> +<dt>field</dt> +<dd><p>The field name</p> +</dd> +<dt>value</dt> +<dd><p>The value</p> +</dd> +</dl> +</dd> +<dt id="func_Get_Fields">Get_Fields</dt> +<dd> +<pre><code>void* +<span 
class="prefix">lucy_</span><strong>HitDoc_Get_Fields</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong> +); +</code></pre> +<p>Return the Doc’s backing fields hash.</p> +</dd> +<dt id="func_Get_Size">Get_Size</dt> +<dd> +<pre><code>uint32_t +<span class="prefix">lucy_</span><strong>HitDoc_Get_Size</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong> +); +</code></pre> +<p>Return the number of fields in the Doc.</p> +</dd> +<dt id="func_Extract">Extract</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>HitDoc_Extract</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong> +); +</code></pre> +<p>Retrieve the field’s value, or NULL if the field is not present.</p> +</dd> +<dt id="func_Field_Names">Field_Names</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>HitDoc_Field_Names</strong>( + <span class="prefix">lucy_</span>HitDoc *<strong>self</strong> +); +</code></pre> +<p>Return a list of names of all fields present.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Document::HitDoc is a <a href="../../Lucy/Document/Doc.html">Lucy::Document::Doc</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Highlight/Highlighter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Highlight/Highlighter.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Highlight/Highlighter.mdtext (added) +++ 
lucy/site/trunk/content/docs/0.5.0/c/Lucy/Highlight/Highlighter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,223 @@ +Title: Lucy::Highlight::Highlighter – C API Documentation + +<div class="c-api"> +<h2>Lucy::Highlight::Highlighter</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>HIGHLIGHTER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>Highlighter</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>Highlighter</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Highlight/Highlighter.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Highlight::Highlighter – Create and highlight excerpts.</p> +<h3>Description</h3> +<p>The Highlighter can be used to select relevant snippets from a document, +and to surround search terms with highlighting tags. 
It handles both stems +and phrases correctly and efficiently, using special-purpose data generated +at index-time.</p> +<h3>Functions</h3> +<dl> +<dt id="func_new">new</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>Highlighter* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Highlighter_new</strong>( + <span class="prefix">lucy_</span><a href="../../Lucy/Search/Searcher.html">Searcher</a> *<strong>searcher</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>query</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong>, + uint32_t <strong>excerpt_length</strong> +); +</code></pre> +<p>Create a new Highlighter.</p> +<dl> +<dt>searcher</dt> +<dd><p>An object which inherits from +<a href="../../Lucy/Search/Searcher.html">Searcher</a>, such as an +<a href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a>.</p> +</dd> +<dt>query</dt> +<dd><p>Query object or a query string.</p> +</dd> +<dt>field</dt> +<dd><p>The name of the field from which to draw the excerpt. 
The +field must be marked as <code>highlightable</code> (see +<a href="../../Lucy/Plan/FieldType.html">FieldType</a>).</p> +</dd> +<dt>excerpt_length</dt> +<dd><p>Maximum length of the excerpt, in characters.</p> +</dd> +</dl> +</dd> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>Highlighter* +<span class="prefix">lucy_</span><strong>Highlighter_init</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Search/Searcher.html">Searcher</a> *<strong>searcher</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>query</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong>, + uint32_t <strong>excerpt_length</strong> +); +</code></pre> +<p>Initialize a Highlighter.</p> +<dl> +<dt>searcher</dt> +<dd><p>An object which inherits from +<a href="../../Lucy/Search/Searcher.html">Searcher</a>, such as an +<a href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a>.</p> +</dd> +<dt>query</dt> +<dd><p>Query object or a query string.</p> +</dd> +<dt>field</dt> +<dd><p>The name of the field from which to draw the excerpt. 
The +field must be marked as <code>highlightable</code> (see +<a href="../../Lucy/Plan/FieldType.html">FieldType</a>).</p> +</dd> +<dt>excerpt_length</dt> +<dd><p>Maximum length of the excerpt, in characters.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Create_Excerpt">Create_Excerpt</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Highlighter_Create_Excerpt</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Document/HitDoc.html">HitDoc</a> *<strong>hit_doc</strong> +); +</code></pre> +<p>Take a HitDoc object and return a highlighted excerpt as a string if +the HitDoc has a value for the specified <code>field</code>.</p> +</dd> +<dt id="func_Encode">Encode</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Highlighter_Encode</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong> +); +</code></pre> +<p>Encode text with HTML entities. This method is called internally by +<a href="../../Lucy/Highlight/Highlighter.html#func_Create_Excerpt">Create_Excerpt()</a> for each text fragment when assembling an excerpt. 
A +subclass can override this if the text should be encoded differently or +not at all.</p> +</dd> +<dt id="func_Highlight">Highlight</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Highlighter_Highlight</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong> +); +</code></pre> +<p>Highlight a small section of text. By default, prepends pre-tag and +appends post-tag. This method is called internally by <a href="../../Lucy/Highlight/Highlighter.html#func_Create_Excerpt">Create_Excerpt()</a> +when assembling an excerpt.</p> +</dd> +<dt id="func_Set_Pre_Tag">Set_Pre_Tag</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>Highlighter_Set_Pre_Tag</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>pre_tag</strong> +); +</code></pre> +<p>Setter. The default value is “&lt;strong&gt;”.</p> +</dd> +<dt id="func_Set_Post_Tag">Set_Post_Tag</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>Highlighter_Set_Post_Tag</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>post_tag</strong> +); +</code></pre> +<p>Setter. 
The default value is “&lt;/strong&gt;”.</p> +</dd> +<dt id="func_Get_Pre_Tag">Get_Pre_Tag</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a>* +<span class="prefix">lucy_</span><strong>Highlighter_Get_Pre_Tag</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor.</p> +</dd> +<dt id="func_Get_Post_Tag">Get_Post_Tag</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a>* +<span class="prefix">lucy_</span><strong>Highlighter_Get_Post_Tag</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor.</p> +</dd> +<dt id="func_Get_Field">Get_Field</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a>* +<span class="prefix">lucy_</span><strong>Highlighter_Get_Field</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor.</p> +</dd> +<dt id="func_Get_Excerpt_Length">Get_Excerpt_Length</dt> +<dd> +<pre><code>uint32_t +<span class="prefix">lucy_</span><strong>Highlighter_Get_Excerpt_Length</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor.</p> +</dd> +<dt id="func_Get_Searcher">Get_Searcher</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Search/Searcher.html">Searcher</a>* +<span class="prefix">lucy_</span><strong>Highlighter_Get_Searcher</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor.</p> +</dd> +<dt id="func_Get_Query">Get_Query</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Search/Query.html">Query</a>* +<span class="prefix">lucy_</span><strong>Highlighter_Get_Query</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor.</p> 
+</dd> +<dt id="func_Get_Compiler">Get_Compiler</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Search/Compiler.html">Compiler</a>* +<span class="prefix">lucy_</span><strong>Highlighter_Get_Compiler</strong>( + <span class="prefix">lucy_</span>Highlighter *<strong>self</strong> +); +</code></pre> +<p>Accessor for the Lucy::Search::Compiler object derived from +<code>query</code> and <code>searcher</code>.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Highlight::Highlighter is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/BackgroundMerger.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/BackgroundMerger.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/BackgroundMerger.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/BackgroundMerger.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,117 @@ +Title: Lucy::Index::BackgroundMerger – C API Documentation + +<div class="c-api"> +<h2>Lucy::Index::BackgroundMerger</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>BACKGROUNDMERGER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>BackgroundMerger</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>BGMerger</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Index/BackgroundMerger.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Index::BackgroundMerger – Consolidate index segments in the background.</p> +<h3>Description</h3> +<p>Adding documents to an index is usually fast, but every once in a while the +index must be compacted 
and an update takes substantially longer to +complete. See <a href="../../Lucy/Docs/Cookbook/FastUpdates.html">FastUpdates</a> for how to use this class to control +worst-case index update performance.</p> +<p>As with <a href="../../Lucy/Index/Indexer.html">Indexer</a>, see <a href="../../Lucy/Docs/FileLocking.html">FileLocking</a> if your index is on a +shared volume.</p> +<h3>Functions</h3> +<dl> +<dt id="func_new">new</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>BackgroundMerger* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>BGMerger_new</strong>( + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>index</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/IndexManager.html">IndexManager</a> *<strong>manager</strong> +); +</code></pre> +<p>Open a new BackgroundMerger.</p> +<dl> +<dt>index</dt> +<dd><p>Either a string filepath or a Folder.</p> +</dd> +<dt>manager</dt> +<dd><p>An IndexManager. If not supplied, an IndexManager with +a 10-second write lock timeout will be created.</p> +</dd> +</dl> +</dd> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>BackgroundMerger* +<span class="prefix">lucy_</span><strong>BGMerger_init</strong>( + <span class="prefix">lucy_</span>BackgroundMerger *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>index</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/IndexManager.html">IndexManager</a> *<strong>manager</strong> +); +</code></pre> +<p>Initialize a BackgroundMerger.</p> +<dl> +<dt>index</dt> +<dd><p>Either a string filepath or a Folder.</p> +</dd> +<dt>manager</dt> +<dd><p>An IndexManager. 
If not supplied, an IndexManager with +a 10-second write lock timeout will be created.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Optimize">Optimize</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>BGMerger_Optimize</strong>( + <span class="prefix">lucy_</span>BackgroundMerger *<strong>self</strong> +); +</code></pre> +<p>Optimize the index for search-time performance. This may take a +while, as it can involve rewriting large amounts of data.</p> +</dd> +<dt id="func_Commit">Commit</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>BGMerger_Commit</strong>( + <span class="prefix">lucy_</span>BackgroundMerger *<strong>self</strong> +); +</code></pre> +<p>Commit any changes made to the index. Until this is called, none of +the changes made during an indexing session are permanent.</p> +<p>Calls <a href="../../Lucy/Index/BackgroundMerger.html#func_Prepare_Commit">Prepare_Commit()</a> implicitly if it has not already been called.</p> +</dd> +<dt id="func_Prepare_Commit">Prepare_Commit</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>BGMerger_Prepare_Commit</strong>( + <span class="prefix">lucy_</span>BackgroundMerger *<strong>self</strong> +); +</code></pre> +<p>Perform the expensive setup for <a href="../../Lucy/Index/BackgroundMerger.html#func_Commit">Commit()</a> in advance, so that <a href="../../Lucy/Index/BackgroundMerger.html#func_Commit">Commit()</a> +completes quickly.</p> +<p>Towards the end of <a href="../../Lucy/Index/BackgroundMerger.html#func_Prepare_Commit">Prepare_Commit()</a>, the BackgroundMerger attempts to +re-acquire the write lock, which is then held until <a href="../../Lucy/Index/BackgroundMerger.html#func_Commit">Commit()</a> finishes +and releases it.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Index::BackgroundMerger is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: 
lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,149 @@ +Title: Lucy::Index::DataReader – C API Documentation + +<div class="c-api"> +<h2>Lucy::Index::DataReader</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>DATAREADER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>DataReader</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>DataReader</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Index/DataReader.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Index::DataReader – Abstract base class for reading index data.</p> +<h3>Description</h3> +<p>DataReader is the companion class to +<a href="../../Lucy/Index/DataWriter.html">DataWriter</a>. 
Every index component will +implement one of each.</p> +<h3>Functions</h3> +<dl> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>DataReader* +<span class="prefix">lucy_</span><strong>DataReader_init</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Plan/Schema.html">Schema</a> *<strong>schema</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Store/Folder.html">Folder</a> *<strong>folder</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/Snapshot.html">Snapshot</a> *<strong>snapshot</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a> *<strong>segments</strong>, + int32_t <strong>seg_tick</strong> +); +</code></pre> +<p>Abstract initializer.</p> +<dl> +<dt>schema</dt> +<dd><p>A Schema.</p> +</dd> +<dt>folder</dt> +<dd><p>A Folder.</p> +</dd> +<dt>snapshot</dt> +<dd><p>A Snapshot.</p> +</dd> +<dt>segments</dt> +<dd><p>An array of Segments.</p> +</dd> +<dt>seg_tick</dt> +<dd><p>The array index of the Segment object within the +<code>segments</code> array that this particular DataReader is assigned +to, if any. A value of -1 indicates that no Segment should be +assigned.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Aggregator">Aggregator <span class="comment">(abstract)</span></dt> +<dd> +<pre><code><span class="prefix">lucy_</span>DataReader* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>DataReader_Aggregator</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a> *<strong>readers</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Object/I32Array.html">I32Array</a> *<strong>offsets</strong> +); +</code></pre> +<p>Create a reader which aggregates the output of several lower level +readers. 
Return NULL if such a reader is not valid.</p> +<dl> +<dt>readers</dt> +<dd><p>An array of DataReaders.</p> +</dd> +<dt>offsets</dt> +<dd><p>Doc id start offsets for each reader.</p> +</dd> +</dl> +</dd> +<dt id="func_Get_Schema">Get_Schema</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Plan/Schema.html">Schema</a>* +<span class="prefix">lucy_</span><strong>DataReader_Get_Schema</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “schema” member var.</p> +</dd> +<dt id="func_Get_Folder">Get_Folder</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Store/Folder.html">Folder</a>* +<span class="prefix">lucy_</span><strong>DataReader_Get_Folder</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “folder” member var.</p> +</dd> +<dt id="func_Get_Snapshot">Get_Snapshot</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Snapshot.html">Snapshot</a>* +<span class="prefix">lucy_</span><strong>DataReader_Get_Snapshot</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “snapshot” member var.</p> +</dd> +<dt id="func_Get_Segments">Get_Segments</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* +<span class="prefix">lucy_</span><strong>DataReader_Get_Segments</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “segments” member var.</p> +</dd> +<dt id="func_Get_Segment">Get_Segment</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Segment.html">Segment</a>* +<span class="prefix">lucy_</span><strong>DataReader_Get_Segment</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “segment” member var.</p> 
+</dd> +<dt id="func_Get_Seg_Tick">Get_Seg_Tick</dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>DataReader_Get_Seg_Tick</strong>( + <span class="prefix">lucy_</span>DataReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “seg_tick” member var.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Index::DataReader is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataWriter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataWriter.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataWriter.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DataWriter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,213 @@ +Title: Lucy::Index::DataWriter – C API Documentation + +<div class="c-api"> +<h2>Lucy::Index::DataWriter</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>DATAWRITER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>DataWriter</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>DataWriter</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Index/DataWriter.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Index::DataWriter – Write data to an index.</p> +<h3>Description</h3> +<p>DataWriter is an abstract base class for writing index data, generally in +segment-sized chunks. Each component of an index – e.g. 
stored fields, +lexicon, postings, deletions – is represented by a +DataWriter/<a href="../../Lucy/Index/DataReader.html">DataReader</a> pair.</p> +<p>Components may be specified per index by subclassing +<a href="../../Lucy/Plan/Architecture.html">Architecture</a>.</p> +<h3>Functions</h3> +<dl> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>DataWriter* +<span class="prefix">lucy_</span><strong>DataWriter_init</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Plan/Schema.html">Schema</a> *<strong>schema</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/Snapshot.html">Snapshot</a> *<strong>snapshot</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/Segment.html">Segment</a> *<strong>segment</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/PolyReader.html">PolyReader</a> *<strong>polyreader</strong> +); +</code></pre> +<p>Abstract initializer.</p> +<dl> +<dt>snapshot</dt> +<dd><p>The Snapshot that will be committed at the end of the +indexing session.</p> +</dd> +<dt>segment</dt> +<dd><p>The Segment in progress.</p> +</dd> +<dt>polyreader</dt> +<dd><p>A PolyReader representing all existing data in the +index. 
(If the index is brand new, the PolyReader will have no +sub-readers).</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Add_Segment">Add_Segment <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DataWriter_Add_Segment</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/SegReader.html">SegReader</a> *<strong>reader</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Object/I32Array.html">I32Array</a> *<strong>doc_map</strong> +); +</code></pre> +<p>Add content from an existing segment into the one currently being +written.</p> +<dl> +<dt>reader</dt> +<dd><p>The SegReader containing content to add.</p> +</dd> +<dt>doc_map</dt> +<dd><p>An array of integers mapping old document ids to +new. Deleted documents are mapped to 0, indicating that they should be +skipped.</p> +</dd> +</dl> +</dd> +<dt id="func_Delete_Segment">Delete_Segment</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DataWriter_Delete_Segment</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/SegReader.html">SegReader</a> *<strong>reader</strong> +); +</code></pre> +<p>Remove a segment’s data. The default implementation is a no-op, as +all files within the segment directory will be automatically deleted. 
+Subclasses which manage their own files outside of the segment system +should override this method and use it as a trigger for cleaning up +obsolete data.</p> +<dl> +<dt>reader</dt> +<dd><p>The SegReader containing content to merge, which must +represent a segment which is part of the current snapshot.</p> +</dd> +</dl> +</dd> +<dt id="func_Merge_Segment">Merge_Segment</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DataWriter_Merge_Segment</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/SegReader.html">SegReader</a> *<strong>reader</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Object/I32Array.html">I32Array</a> *<strong>doc_map</strong> +); +</code></pre> +<p>Move content from an existing segment into the one currently being +written.</p> +<p>The default implementation calls <a href="../../Lucy/Index/DataWriter.html#func_Add_Segment">Add_Segment()</a> then <a href="../../Lucy/Index/DataWriter.html#func_Delete_Segment">Delete_Segment()</a>.</p> +<dl> +<dt>reader</dt> +<dd><p>The SegReader containing content to merge, which must +represent a segment which is part of the current snapshot.</p> +</dd> +<dt>doc_map</dt> +<dd><p>An array of integers mapping old document ids to +new. 
Deleted documents are mapped to 0, indicating that they should be +skipped.</p> +</dd> +</dl> +</dd> +<dt id="func_Finish">Finish <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DataWriter_Finish</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Complete the segment: close all streams, store metadata, etc.</p> +</dd> +<dt id="func_Metadata">Metadata</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Hash.html">Hash</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>DataWriter_Metadata</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Arbitrary metadata to be serialized and stored by the Segment. The +default implementation supplies a hash with a single key-value pair for +“format”.</p> +</dd> +<dt id="func_Format">Format <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>DataWriter_Format</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Every writer must specify a file format revision number, which should +increment each time the format changes. 
Responsibility for revision +checking is left to the companion DataReader.</p> +</dd> +<dt id="func_Get_Snapshot">Get_Snapshot</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Snapshot.html">Snapshot</a>* +<span class="prefix">lucy_</span><strong>DataWriter_Get_Snapshot</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “snapshot” member var.</p> +</dd> +<dt id="func_Get_Segment">Get_Segment</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Segment.html">Segment</a>* +<span class="prefix">lucy_</span><strong>DataWriter_Get_Segment</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “segment” member var.</p> +</dd> +<dt id="func_Get_PolyReader">Get_PolyReader</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/PolyReader.html">PolyReader</a>* +<span class="prefix">lucy_</span><strong>DataWriter_Get_PolyReader</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “polyreader” member var.</p> +</dd> +<dt id="func_Get_Schema">Get_Schema</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Plan/Schema.html">Schema</a>* +<span class="prefix">lucy_</span><strong>DataWriter_Get_Schema</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “schema” member var.</p> +</dd> +<dt id="func_Get_Folder">Get_Folder</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Store/Folder.html">Folder</a>* +<span class="prefix">lucy_</span><strong>DataWriter_Get_Folder</strong>( + <span class="prefix">lucy_</span>DataWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “folder” member var.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Index::DataWriter is a <a 
href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DeletionsWriter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DeletionsWriter.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DeletionsWriter.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DeletionsWriter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,250 @@ +Title: Lucy::Index::DeletionsWriter – C API Documentation + +<div class="c-api"> +<h2>Lucy::Index::DeletionsWriter</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>DELETIONSWRITER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>DeletionsWriter</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>DelWriter</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Index/DeletionsWriter.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Index::DeletionsWriter – Abstract base class for marking documents as deleted.</p> +<h3>Description</h3> +<p>Subclasses of DeletionsWriter provide a low-level mechanism for declaring a +document deleted from an index.</p> +<p>Because files in an index are never modified, and because it is not +practical to delete entire segments, a DeletionsWriter does not actually +remove documents from the index. 
Instead, it communicates to a search-time +companion DeletionsReader which documents are deleted in such a way that it +can create a Matcher iterator.</p> +<p>Documents are truly deleted only when the segments which contain them are +merged into new ones.</p> +<h3>Methods</h3> +<dl> +<dt id="func_Delete_By_Term">Delete_By_Term <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DelWriter_Delete_By_Term</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>field</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>term</strong> +); +</code></pre> +<p>Delete all documents in the index that index the supplied term.</p> +<dl> +<dt>field</dt> +<dd><p>The name of an indexed field. (If it is not spec’d as +<code>indexed</code>, an error will occur.)</p> +</dd> +<dt>term</dt> +<dd><p>The term which identifies docs to be marked as deleted. 
If +<code>field</code> is associated with an Analyzer, <code>term</code> +will be processed automatically (so don’t pre-process it yourself).</p> +</dd> +</dl> +</dd> +<dt id="func_Delete_By_Query">Delete_By_Query <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DelWriter_Delete_By_Query</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Search/Query.html">Query</a> *<strong>query</strong> +); +</code></pre> +<p>Delete all documents in the index that match <code>query</code>.</p> +<dl> +<dt>query</dt> +<dd><p>A <a href="../../Lucy/Search/Query.html">Query</a>.</p> +</dd> +</dl> +</dd> +<dt id="func_Updated">Updated <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>bool +<span class="prefix">lucy_</span><strong>DelWriter_Updated</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Returns true if there are updates that need to be written.</p> +</dd> +<dt id="func_Seg_Del_Count">Seg_Del_Count <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>DelWriter_Seg_Del_Count</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>seg_name</strong> +); +</code></pre> +<p>Return the number of deletions for a given segment.</p> +<dl> +<dt>seg_name</dt> +<dd><p>The name of the segment.</p> +</dd> +</dl> +</dd> +</dl> +<h4>Methods inherited from Lucy::Index::DataWriter</h4> +<dl> +<dt id="func_Add_Segment">Add_Segment <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DelWriter_Add_Segment</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a 
href="../../Lucy/Index/SegReader.html">SegReader</a> *<strong>reader</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Object/I32Array.html">I32Array</a> *<strong>doc_map</strong> +); +</code></pre> +<p>Add content from an existing segment into the one currently being +written.</p> +<dl> +<dt>reader</dt> +<dd><p>The SegReader containing content to add.</p> +</dd> +<dt>doc_map</dt> +<dd><p>An array of integers mapping old document ids to +new. Deleted documents are mapped to 0, indicating that they should be +skipped.</p> +</dd> +</dl> +</dd> +<dt id="func_Delete_Segment">Delete_Segment</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DelWriter_Delete_Segment</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/SegReader.html">SegReader</a> *<strong>reader</strong> +); +</code></pre> +<p>Remove a segment’s data. The default implementation is a no-op, as +all files within the segment directory will be automatically deleted. 
+Subclasses which manage their own files outside of the segment system +should override this method and use it as a trigger for cleaning up +obsolete data.</p> +<dl> +<dt>reader</dt> +<dd><p>The SegReader containing content to merge, which must +represent a segment which is part of the current snapshot.</p> +</dd> +</dl> +</dd> +<dt id="func_Merge_Segment">Merge_Segment</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DelWriter_Merge_Segment</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Index/SegReader.html">SegReader</a> *<strong>reader</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Object/I32Array.html">I32Array</a> *<strong>doc_map</strong> +); +</code></pre> +<p>Move content from an existing segment into the one currently being +written.</p> +<p>The default implementation calls <a href="../../Lucy/Index/DeletionsWriter.html#func_Add_Segment">Add_Segment()</a> then <a href="../../Lucy/Index/DeletionsWriter.html#func_Delete_Segment">Delete_Segment()</a>.</p> +<dl> +<dt>reader</dt> +<dd><p>The SegReader containing content to merge, which must +represent a segment which is part of the current snapshot.</p> +</dd> +<dt>doc_map</dt> +<dd><p>An array of integers mapping old document ids to +new. 
Deleted documents are mapped to 0, indicating that they should be +skipped.</p> +</dd> +</dl> +</dd> +<dt id="func_Finish">Finish <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>DelWriter_Finish</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Complete the segment: close all streams, store metadata, etc.</p> +</dd> +<dt id="func_Metadata">Metadata</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Hash.html">Hash</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>DelWriter_Metadata</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Arbitrary metadata to be serialized and stored by the Segment. The +default implementation supplies a hash with a single key-value pair for +“format”.</p> +</dd> +<dt id="func_Format">Format <span class="comment">(abstract)</span></dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>DelWriter_Format</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Every writer must specify a file format revision number, which should +increment each time the format changes. 
Responsibility for revision +checking is left to the companion DataReader.</p> +</dd> +<dt id="func_Get_Snapshot">Get_Snapshot</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Snapshot.html">Snapshot</a>* +<span class="prefix">lucy_</span><strong>DelWriter_Get_Snapshot</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “snapshot” member var.</p> +</dd> +<dt id="func_Get_Segment">Get_Segment</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Segment.html">Segment</a>* +<span class="prefix">lucy_</span><strong>DelWriter_Get_Segment</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “segment” member var.</p> +</dd> +<dt id="func_Get_PolyReader">Get_PolyReader</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/PolyReader.html">PolyReader</a>* +<span class="prefix">lucy_</span><strong>DelWriter_Get_PolyReader</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “polyreader” member var.</p> +</dd> +<dt id="func_Get_Schema">Get_Schema</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Plan/Schema.html">Schema</a>* +<span class="prefix">lucy_</span><strong>DelWriter_Get_Schema</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “schema” member var.</p> +</dd> +<dt id="func_Get_Folder">Get_Folder</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Store/Folder.html">Folder</a>* +<span class="prefix">lucy_</span><strong>DelWriter_Get_Folder</strong>( + <span class="prefix">lucy_</span>DeletionsWriter *<strong>self</strong> +); +</code></pre> +<p>Accessor for “folder” member var.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Index::DeletionsWriter is a <a 
href="../../Lucy/Index/DataWriter.html">Lucy::Index::DataWriter</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> Added: lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DocReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DocReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DocReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/c/Lucy/Index/DocReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,126 @@ +Title: Lucy::Index::DocReader – C API Documentation + +<div class="c-api"> +<h2>Lucy::Index::DocReader</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>DOCREADER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>DocReader</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>DocReader</code></td> +</tr> +<tr> +<td class="label">header file</td> +<td><code>Lucy/Index/DocReader.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Index::DocReader – Retrieve stored documents.</p> +<h3>Description</h3> +<p>DocReader defines the interface by which documents (with all stored fields) +are retrieved from the index. 
The default implementation returns +<a href="../../Lucy/Document/HitDoc.html">HitDoc</a> objects.</p> +<h3>Methods</h3> +<dl> +<dt id="func_Fetch_Doc">Fetch_Doc <span class="comment">(abstract)</span></dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Document/HitDoc.html">HitDoc</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>DocReader_Fetch_Doc</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong>, + int32_t <strong>doc_id</strong> +); +</code></pre> +<p>Retrieve the document identified by <code>doc_id</code>.</p> +<p><strong>Returns:</strong> a HitDoc.</p> +</dd> +<dt id="func_Aggregator">Aggregator</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>DocReader* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>DocReader_Aggregator</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a> *<strong>readers</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Object/I32Array.html">I32Array</a> *<strong>offsets</strong> +); +</code></pre> +<p>Returns a DocReader which divvies up requests to its sub-readers +according to the offset range.</p> +<dl> +<dt>readers</dt> +<dd><p>An array of DocReaders.</p> +</dd> +<dt>offsets</dt> +<dd><p>Doc id start offsets for each reader.</p> +</dd> +</dl> +</dd> +</dl> +<h4>Methods inherited from Lucy::Index::DataReader</h4> +<dl> +<dt id="func_Get_Schema">Get_Schema</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Plan/Schema.html">Schema</a>* +<span class="prefix">lucy_</span><strong>DocReader_Get_Schema</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “schema” member var.</p> +</dd> +<dt id="func_Get_Folder">Get_Folder</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a 
href="../../Lucy/Store/Folder.html">Folder</a>* +<span class="prefix">lucy_</span><strong>DocReader_Get_Folder</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “folder” member var.</p> +</dd> +<dt id="func_Get_Snapshot">Get_Snapshot</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Snapshot.html">Snapshot</a>* +<span class="prefix">lucy_</span><strong>DocReader_Get_Snapshot</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “snapshot” member var.</p> +</dd> +<dt id="func_Get_Segments">Get_Segments</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* +<span class="prefix">lucy_</span><strong>DocReader_Get_Segments</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “segments” member var.</p> +</dd> +<dt id="func_Get_Segment">Get_Segment</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Index/Segment.html">Segment</a>* +<span class="prefix">lucy_</span><strong>DocReader_Get_Segment</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “segment” member var.</p> +</dd> +<dt id="func_Get_Seg_Tick">Get_Seg_Tick</dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>DocReader_Get_Seg_Tick</strong>( + <span class="prefix">lucy_</span>DocReader *<strong>self</strong> +); +</code></pre> +<p>Accessor for “seg_tick” member var.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Index::DocReader is a <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div>