On May 5, 2006, at 9:13 AM, Terry Jones wrote:

This page

  http://www.zope.org/Documentation/How-To/ZCatalogTutorial

[...]

I am happy to update the how-to page if someone tells me how to access the
source or sends it to me.

Attached.  :^)  Thanks!

Regards,
Rob

--
Rob Page               V: 540 361 1710
Zope Corporation       F: 703 995 0412




{\rtf1\mac\ansicpg10000\cocoartf824\cocoasubrtf330
{\fonttbl\f0\fswiss\fcharset77 Helvetica;}
{\colortbl;\red255\green255\blue255;}
\margl1440\margr1440\vieww17640\viewh14640\viewkind0
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural

\f0\fs24 \cf0 <h3>ZCatalog Tutorial</h3>\
<p>  This document provides a tutorial for <code>ZCatalog</code>, the new search\
  engine machinery in Zope.  The audience for the document is content\
  managers.</p>\
\
<h4>Contents</h4>\
<ul><li><p>What is it?  What's it for?  Why's it so cool?</p>\
\
\
<li><p>Installing ZCatalog</p>\
\
\
<li><p>ZCatalog Objects</p>\
\
\
<li><p>Example using ZCatalog</p>\
\
\
<li><p>Creating Search Forms And Result Reports</p>\
\
\
<li><p>Using ZCatalog In A Zope Site</p>\
\
\
<li><p>ZCatalog vs. Catalog</p>\
\
</ul>\
\
<h4>What is it?  What's it for?  Why's it so cool?</h4>\
<p>    The <code>ZCatalog</code> provides powerful indexing and searching on a Zope\
    database using a Zope management interface.  A <code>ZCatalog</code> is a\
    Zope object that can be added to a Folder, managed through the\
    web, and extended in many ways.</p>\
\
<p>    The <code>ZCatalog</code> is a very significant project, providing a number\
    of compelling features:</p>\
<ul><li><p><strong>Searches are fast</strong>.  The data structures used by the index\
      provide extremely quick searches without consuming much memory.</p>\
\
\
<li><p><strong>Searches are robust</strong>.  The <code>ZCatalog</code> supports boolean\
      search terms, proximity searches, synonyms and stopwords.</p>\
\
\
<li><p><strong>Indexing is wildly flexible</strong>.  A <code>ZCatalog</code> can catalog\
      custom properties and track unique values.  Since <code>ZCatalog</code>\
      catalogs objects instead of file handles, you can index any\
      content that can have a Python object wrapped around it.  This\
      also lets objects participate in how they are cataloged,\
      e.g. de-HTML-ifying contents or extracting PDF properties.</p>\
\
\
<li><p><strong>Usable outside of Zope</strong>.  The software is broken into a\
      Python <code>Catalog</code> which is wrapped by a <code>ZCatalog</code>.  The Python\
      <code>Catalog</code> can be used in any Python program; all it requires is\
      the Z object database and the indexing machinery from Zope.</p>\
\
\
<li><p><strong>Transactional</strong>.  An indexing operation is part of a Zope\
      transaction.  If something goes wrong after content is indexed,\
      the index is restored to its previous condition.  This also means\
      that Undo will restore an index to its previous condition.\
      Finally, a <code>ZCatalog</code> can be altered privately in a Version,\
      meaning no one else can see the changes to the index.</p>\
\
\
<li><p><strong>Cache-friendly</strong>.  The index is internally broken into\
      different "buckets", with each bucket being a separate Zope\
      database object.  Thus, only the part of the index that is needed\
      is loaded into memory.  Conversely, a part of the index that is\
      no longer needed can be removed from memory.</p>\
\
\
<li><p><strong>Results are lazy</strong>.  A search that matches a tremendous\
      number of objects doesn't build a large result set up front.  Only\
      the part of the results that is requested, such as the second\
      batch of twenty, is returned.</p>\
\
</ul>\
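\
<p>    The lazy-results idea can be pictured with a plain Python generator.\
    This is only a toy sketch, not the <code>ZCatalog</code> implementation:\
    <code>lazy_matches</code>, <code>batch</code> and the sample corpus are\
    hypothetical names invented for illustration.</p>\

```python
from itertools import islice


def lazy_matches(documents, word):
    """Yield ids of matching documents one at a time instead of building a list."""
    for doc_id, text in documents:
        if word in text.lower().split():
            yield doc_id


def batch(results, start, size):
    """Return one batch, consuming only as much of `results` as is needed."""
    return list(islice(results, start, start + size))


# Hypothetical corpus: every third document mentions "zope".
documents = (
    (i, "zope catalog" if i % 3 == 0 else "plain page")
    for i in range(1_000_000)
)

# Ask for the second batch of twenty matches.
second_batch = batch(lazy_matches(documents, "zope"), start=20, size=20)
print(second_batch[:3])  # [60, 63, 66]
```

<p>    Because both the corpus and the matches are generators, iteration\
    stops as soon as the requested batch is full; the remaining documents\
    are never examined.</p>\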
\
<p>    The <code>ZCatalog</code> is a free, Open Source part of the Zope software\
    repository and thus is covered under the same license as Zope.  It\
    is being developed in conjunction with the Zope Portal Toolkit\
    effort.  However, the <code>ZCatalog</code> product is managed as its own\
    module in CVS.</p>\
\
\
<h4>Installing ZCatalog</h4>\
<p>    <code>ZCatalog</code> can be downloaded from the Zope download area and is\
    also a module in the public CVS for Zope.  Untar it while in the\
    root directory of your Zope installation:</p>\
<PRE>\
      $ cd Zope-2.0.0a3-src/\
      $ tar xzf ../ZCatalog-x.x.tgz\
\
</PRE>\
\
<p>    Windows users can use WinZip or a similar utility to accomplish\
    the same thing.</p>\
\
<p>    Also, Zope 2.0.0a3 does not ship the latest versions of UnIndex and\
    UnTextIndex, which fix a couple of bugs present in the alpha 3 versions.\
    The latest CVS of the SearchIndex packages <em>must</em> be used.</p>\
\
<p>    Remember, you have to restart your Zope server before you will see\
    <code>ZCatalog</code>.</p>\
\
\
<h4>ZCatalog Objects</h4>\
<p>    A <code>ZCatalog</code> performs two activities: indexing information and\
    performing searches.</p>\
\
<p>    Most of the work is done in the first step, which is getting objects\
    into the index.  This is done in two ways.  First, if your objects\
    are ZCatalog-aware they automatically update the index when they\
    are added, edited or directly deleted.  A ZCatalog-aware object is\
    one that is an instance of a <code>Z Class</code> that informs the <code>ZCatalog</code>\
    of changes.  <em>Directly deleted</em> means the object itself was deleted\
    from a Folder, not removed as a side effect of deleting a containing\
    Folder.</p>\
\
<p>    The second way that site contents get updated is by "finding"\
    information "into" the <code>ZCatalog</code>.  An operation based on Zope's\
    Find view traverses Folders looking for objects matching the\
    criterion.  The objects are then registered with the Catalog.\
    Objects in the index but no longer in the site are removed from\
    the Catalog.</p>\
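\
<p>    The "find into the catalog" pass described above can be modelled\
    roughly as follows.  This is a simplified sketch under stated\
    assumptions: the site is represented as nested dictionaries, the\
    catalog as a dictionary keyed by path, and <code>walk</code> and\
    <code>find_into_catalog</code> are hypothetical helpers rather than Zope\
    functions.</p>\

```python
def walk(folder, path=""):
    """Yield (path, object) for every non-folder object below `folder`."""
    for name, obj in folder["children"].items():
        child_path = f"{path}/{name}"
        if obj["meta_type"] == "Folder":
            yield from walk(obj, child_path)
        else:
            yield child_path, obj


def find_into_catalog(catalog, root, meta_types):
    """Register matching objects and drop entries that left the site."""
    site_objects = dict(walk(root))
    # Entries whose object no longer exists in the site are removed.
    for stale_path in set(catalog) - set(site_objects):
        del catalog[stale_path]
    # Objects matching the criterion are (re)registered.
    for path, obj in site_objects.items():
        if obj["meta_type"] in meta_types:
            catalog[path] = obj
    return catalog


# Hypothetical site layout used only for this example.
site = {"meta_type": "Folder", "children": {
    "index_html": {"meta_type": "DTML Document"},
    "logo": {"meta_type": "Image"},
    "docs": {"meta_type": "Folder", "children": {
        "howto": {"meta_type": "DTML Method"},
    }},
}}

catalog = {"/gone": {"meta_type": "DTML Document"}}  # stale entry
find_into_catalog(catalog, site, {"DTML Document", "DTML Method"})
print(sorted(catalog))  # ['/docs/howto', '/index_html']
```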
\
<p>    Either way, automatically updating or walking the Folders,\
    <code>ZCatalog</code> indexes the objects it finds.  The <code>ZCatalog</code> is set up\
    to look for properties, each of which is added to the index.</p>\
\
<p>    There are two kinds of indexes, called FieldIndex and TextIndex.\
    A FieldIndex treats data atomically: the entire contents of a\
    FieldIndex-indexed property are treated as a unit.  With a\
    TextIndex, the contents are broken into words which are indexed\
    individually.  A TextIndex is also known as a <em>full-text index</em>.</p>\
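\
<p>    The difference between the two kinds of index can be illustrated\
    with two toy classes.  These are deliberately minimal stand-ins, not\
    the real index classes; <code>ToyFieldIndex</code> and\
    <code>ToyTextIndex</code> are hypothetical names.</p>\

```python
from collections import defaultdict


class ToyFieldIndex:
    """Treats each value as one atomic key."""

    def __init__(self):
        self._by_value = defaultdict(set)

    def index(self, rid, value):
        self._by_value[value].add(rid)

    def search(self, value):
        return set(self._by_value.get(value, ()))


class ToyTextIndex:
    """Splits each value into words and indexes every word separately."""

    def __init__(self):
        self._by_word = defaultdict(set)

    def index(self, rid, text):
        for word in text.lower().split():
            self._by_word[word].add(rid)

    def search(self, word):
        return set(self._by_word.get(word.lower(), ()))


field_index = ToyFieldIndex()
text_index = ToyTextIndex()
for rid, artist in enumerate(["Miles Davis", "John Coltrane"]):
    field_index.index(rid, artist)
    text_index.index(rid, artist)

print(field_index.search("Miles"))        # set(): only the whole value matches
print(field_index.search("Miles Davis"))  # {0}
print(text_index.search("miles"))         # {0}: a single word is enough
```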
\
<p>    Note that the <code>ZCatalog</code> doesn't track ZCatalog-unaware objects\
    after it has indexed them.  This means that the <code>ZCatalog</code> must\
    reindex its objects occasionally when the objects have been\
    changed.  Out of date indexes can be prevented by inheriting from\
    a ZCatalog-aware class which can tell the <code>ZCatalog</code> to reindex it\
    whenever a change is made.  Just such a class will be included\
    with the Portal toolkit.</p>\
\
<p>    ZCatalogs are "searchable objects", meaning they cooperate with Z\
    Search Interfaces documented in Z SQL Methods.  Creating a search\
    form for a <code>ZCatalog</code> is a simple matter of adding a Z Search\
    Interface from the management screen and filling in a form.\
    ZCatalogs can also be queried directly from DTML, as shown in the\
    example below.</p>\
\
\
<h4>Example using ZCatalog</h4>\
<p>    The first example shows how to give your Zope site a long-desired\
    feature: full text-searches of your content.  The example assumes\
    you already have a number of DTML Methods/Documents to catalog.</p>\
<ul><li><p>Install <code>ZCatalog</code> as instructed above</p>\
\
\
<li><p>In the root folder of your Zope server, add a <code>ZCatalog</code>.</p>\
\
\
<li><p>Type in the id <code>catalog</code> and hit <code>Add</code>.</p>\
\
</ul>\
\
<p>    You now have a brand new <code>ZCatalog</code> named <code>catalog</code> in your root\
    folder.</p>\
<ul><li><p>Click on it.</p>\
\
</ul>\
\
<p>    Now you are looking at the <code>ZCatalog</code> 'Contents' view.  It says\
    the catalog is empty.  We'll catalog some objects in a moment, but\
    first we have to tell it what portions of objects we are\
    specifically interested in.</p>\
<ul><li><p>Click on <code>Indexes</code>.</p>\
\
</ul>\
\
<p>    This management view is where the attributes to be indexed are\
    defined.</p>\
<ul><li><p>In the <code>Add index</code> field, type <code>raw</code>.</p>\
\
\
<li><p>Click <code>Add</code>.</p>\
\
</ul>\
\
<p>    Now that the indexes are defined, a set of objects can be selected\
    for cataloging.</p>\
<ul><li><p>Click on <code>Find items to ZCatalog</code>.</p>\
\
</ul>\
\
<p>    For this example, we are only interested in DTML Documents and\
    Methods.</p>\
<ul><li><p>Deselect <code>All types</code>.</p>\
\
\
<li><p>Select <code>DTML Method</code> and <code>DTML Document</code>.</p>\
\
\
<li><p>Click <code>Find</code>.</p>\
\
</ul>\
\
<p>    ZCatalog will report how many items it found, and then present an\
    interface for excluding specific objects.</p>\
<ul><li><p>Click <code>Catalog Items</code>.</p>\
\
</ul>\
\
<p>    Great, now that the catalog is stocked, we can create a user\
    interface to it.</p>\
<ul><li><p>Return to the root folder's management view.</p>\
\
\
<li><p>Add a <code>Z Search Interface</code>.</p>\
\
</ul>\
\
<p>    <code>ZCatalog</code> participates in the Zope Search architecture.  You\
    simply have to fill in this form, and a basic user interface will\
    be created.</p>\
<ul><li><p>Select <code>catalog</code> in the list beside <code>Select one or more searchable\
          objects</code>.</p>\
\
\
<li><p>Beside <code>Report Id</code>, type <code>report</code>.</p>\
\
\
<li><p>Beside <code>Search Input Id</code>, type <code>search</code>.</p>\
\
</ul>\
\
<p>    <code>report</code> and <code>search</code> are the Ids of two DTML Methods which will\
    be created in your root folder.</p>\
<ul><li><p>Click <code>Add</code>.</p>\
\
</ul>\
\
<p>    Congratulations, if all has gone well, you can now find references\
    to any word in your DTML pages.  Try it by viewing <code>search</code>.  Type\
    a common word in the <code>Raw</code> field, and you should be presented with\
    a list of hits.  However, none of the results returned can be\
    clicked on.  To fix this, go to the management view of <code>report</code>.\
    <code>report</code> is called by <code>search</code> to display the results from\
    <code>catalog</code>.  <code>report</code> is just a simple <code>&lt;!--#in catalog--&gt;</code> loop\
    with a few refinements.  <code>catalog</code> knows which results to return\
    by looking at the REQUEST variable, which contains the input from\
    the <code>search</code> form.</p>\
<ul><li><p>In the source of <code>report</code>, find the following line:</p>\
<PRE>\
            &lt;tr&gt;&lt;!--#var title--&gt;&lt;/tr&gt;\
\
</PRE>\
\
\
<li><p>Replace it with this:</p>\
<PRE>\
            &lt;tr&gt;\
             &lt;a href=&quot;&lt;!--#var &quot;catalog.getpath(data_record_id_)&quot;--&gt;&quot;&gt;\
              &lt;!--#var title--&gt;\
             &lt;/a&gt;\
            &lt;/tr&gt;\
\
</PRE>\
\
</ul>\
\
<p>    This is a little confusing at first.  Keep in mind that ZCatalog\
    does not return a list of your database objects.  What it returns\
    are actually fairly unintelligent instances of a Record subclass.\
    These record objects contain copies of data from attributes of\
    catalog objects.  The <code>ZCatalog</code> 'MetaData Table' view defines\
    which attributes are copied.</p>\
\
<p>    (By default, these record objects are just SLIGHTLY more\
    intelligent than a raw tuple.  <code>Catalog</code> can be told to use a\
    custom, intelligent class for results.  Please see the <code>Catalog</code>\
    __init__ method in <code>lib/python/Products/ZCatalog/Catalog.py</code> for\
    more information.)</p>\
\
<p>    Fortunately, ZCatalog provides a utility function for going from\
    result objects to the object's path.  It is called, aptly enough,\
    <code>getpath</code>.  <code>getpath</code> expects to be passed the unique integer\
    identifier of the cataloged object.  Results store that id as\
    <code>data_record_id_</code>.</p>\
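\
<p>    The relationship between result records, the metadata they copy,\
    and <code>getpath</code> can be sketched with a toy catalog.  The class\
    below only borrows the names <code>getpath</code> and\
    <code>data_record_id_</code> from the text; <code>ToyCatalog</code>, its\
    storage layout and the sample object are assumptions made for\
    illustration.</p>\

```python
class ToyCatalog:
    """Stores copies of selected attributes plus a path per record id."""

    def __init__(self, metadata_columns):
        self.metadata_columns = metadata_columns
        self._paths = {}    # record id -> path of the cataloged object
        self._records = {}  # record id -> copied metadata
        self._next_rid = 0

    def catalog_object(self, obj, path):
        rid = self._next_rid
        self._next_rid += 1
        self._paths[rid] = path
        # Only the configured columns are copied into the record.
        record = {name: obj.get(name) for name in self.metadata_columns}
        record["data_record_id_"] = rid
        self._records[rid] = record
        return rid

    def getpath(self, rid):
        return self._paths[rid]

    def results(self):
        return list(self._records.values())


cat = ToyCatalog(metadata_columns=["title"])
cat.catalog_object({"title": "Home", "body": "long text"}, "/index_html")

for record in cat.results():
    # The record carries the title but not the body or the object itself.
    print(record["title"], "->", cat.getpath(record["data_record_id_"]))
```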
\
<p>    Commit this change, and perform another search.  Now the title can\
    be clicked on to take you to the full page.</p>\
\
\
<h4>Example cataloging custom objects</h4>\
<p>    As if full-text searches of your entire site weren't good enough,\
    ZCatalog can also catalog Z Classes, Products, and in fact any\
    Python object you can put in a ZODB.  Here is an example using a Z\
    Class, but the principles apply to any kind of object.</p>\
\
<p>    First, we're going to need something to catalog.  Follow the <code>Z\
    Classes</code> tutorial to create the CD <code>Z Class</code>.  Back?  Good.</p>\
<ul><li><p>Create a folder, <code>CDs</code>, and create a number of instances of\
        the CD Z Class in it.</p>\
\
</ul>\
\
<p>    <code>cd1</code> through <code>cd5</code> should be plenty.  Remember to fill each of\
    them in from their Properties view.</p>\
\
<p>Now we want to create a searchable catalog of CDs.</p>\
<ul><li><p>Go to the <code>CDs</code> folder and create a <code>ZCatalog</code> with an ID <code>cd_cat</code>.</p>\
\
\
<li><p>Click on the object's <code>Indexes</code> view.</p>\
\
</ul>\
\
<p>    This screen shows that, by default, <code>ZCatalog</code> is interested in an\
    object's <code>id</code>, <code>title</code>, <code>meta_type</code>, and\
    <code>bobobase_modification_time</code>.  You will almost always want to\
    index additional information.  In this case, we would also like to\
    index the artist and description of CDs.</p>\
<ul><li><p>Type <code>artist</code> into the <code>Add Index</code> field.</p>\
\
</ul>\
\
<p>    For the sake of example, we're going to use a FieldIndex index for\
    artist.  This will give us the option of putting an HTML SELECT\
    box for artists on the search form.</p>\
<ul><li><p>Select FieldIndex from the Index type drop down, and click\
      <code>Add</code>.</p>\
\
\
<li><p>Also add an index for <code>description</code>, but leave TextIndex\
      selected.</p>\
\
</ul>\
\
<p>    This will allow us to search for individual words within the\
    description.</p>\
<ul><li><p>Click on <code>MetaData Table</code>.</p>\
\
</ul>\
\
<p>    This is where we tell the <code>ZCatalog</code> what attributes of cataloged\
    objects to cache.  These cached values are available from search\
    results without having to look up the actual indexed object.  The\
    tradeoff for the speed is extra memory, as information from the\
    content is duplicated in the <code>ZCatalog</code>.</p>\
\
<p>    You will probably want to keep the schema light-weight, so we're\
    not going to add <code>description</code> to it.  Type <code>artist</code> in the <code>Add\
    column</code> field and click <code>Add</code>.</p>\
<ul><li><p>Click on the <code>Find Items to Catalog</code> view.</p>\
\
</ul>\
\
<p>    This is the interface you use to tell the <code>ZCatalog</code> which items\
    to index.  Right now, beside <code>Find objects of type:</code>, <code>All types</code>\
    is selected.</p>\
<ul><li><p>Deselect <code>All types</code>.</p>\
\
</ul>\
<ul><li><p>Scroll down and select <code>CD</code>.</p>\
\
</ul>\
\
\
<p>    You could use the rest of the form to be more specific, but since\
    we want to catalog all the CDs,</p>\
<ul><li><p>Click <code>Find</code>.</p>\
\
</ul>\
\
<p>    <code>ZCatalog</code> will report <code>Found 5 items.</code>  It is now giving you an\
    opportunity to exclude some of the matched items from the index.\
    Again, we want all of them, so,</p>\
<ul><li><p>Click <code>Update Catalog</code>.</p>\
\
</ul>\
\
<p>    You should at this point see a list of the indexed objects.  Also\
    of note is the <code>Update Catalog</code> button.  You have to use it\
    whenever you want your <code>ZCatalog</code> to notice changes you've made to\
    the objects it's indexed.</p>\
\
\
<h4>Creating Search Forms And Result Reports</h4>\
<p>This catalog isn't much good without some way of querying it.</p>\
<ul><li><p>Go back to your <code>CDs</code> folder's management screen and add a Z\
        Search Interface.</p>\
\
</ul>\
\
<p>    The search add form will automatically detect your cd_cat\
    <code>ZCatalog</code> and offer it as a searchable document.  Make sure it is\
    selected.</p>\
<ul><li><p>Fill in <code>cd_report</code> for <code>Report ID</code> and <code>cd_search</code> for\
        <code>Search Input ID</code>.</p>\
\
</ul>\
\
<p>    Those are the ids of two DTML methods that will be generated in\
    the <code>CDs</code> folder.</p>\
<ul><li><p>Click <code>Add</code>.</p>\
\
\
<li><p>View the <code>cd_search</code> form (at, for example,\
        http://localhost:9673/CDs/cd_search).</p>\
\
</ul>\
\
<p>    You will see a basic search interface, with fields for searching\
    on <code>title</code>, modification date, <code>id</code>, <code>artist</code>, <code>meta type</code> and\
    <code>description</code>.  If you fill in one or more of the fields and\
    click <code>Submit Query</code>, cd_report will be displayed.  It is passed\
    the search criteria and uses them to get a list from cd_cat to\
    iterate over.  It is merely displaying the information from the\
    ZCatalog's MetaData table, but of course it can be enriched.</p>\
\
<p>    Try a few more searches.  You'll find that you can type any single\
    word from the title or description and get a match, but for artist\
    you must type the exact string.  That's because artist was indexed\
    as a FieldIndex, which gives us an opportunity to present a more\
    convenient interface.</p>\
\
<p>    Go back to the <code>cd_search</code> management interface, and change\
    its source to look like this:</p>\
\
\
<p>  <xmp>\
  <!--#var standard_html_header-->\
  <form action="cd_report" method="get">\
  <h2><!--#var document_title--></h2>\
  Enter query parameters:<br><table>\
  <tr><th>Title</th>\
      <td><input name="title"\
                 width=30 value=""></td></tr>\
  <tr><th>Artist</th>\
      <td>\
       <select name="artist">\
        <option value="">All</option>\
        <!--#in expr="cd_cat.uniqueValuesFor('artist')"-->\
         <option value="<!--#var sequence-item-->">\
          <!--#var sequence-item-->\
         </option>\
        <!--#/in-->\
       </select>\
      </td>\
  </tr>\
  <tr><th>Description</th>\
      <td><input name="description"\
                 width=30 value=""></td></tr>\
  <tr><td colspan=2 align=center>\
  <input type="SUBMIT" name="SUBMIT" value="Submit Query">\
  </td></tr>\
  </table>\
  </form>\
  <!--#var standard_html_footer-->\
  </xmp></p>\
<p>    This is a search form somewhat more appropriate for the CD <code>Z\
    Class</code>.  Unrelated fields have been removed, and the <code>artist</code>\
    field has been changed to a drop-down menu.  Let's augment the\
    output of <code>cd_report</code> to make the title a link to the actual CD\
    object.</p>\
\
<p>    Taking a look at <code>cd_report</code>, note that the search results are\
    obtained with a simple <code>&lt;!--#in cd_cat ...--&gt;</code> tag.  The search\
    criteria are automatically obtained by the <code>ZCatalog</code> from the form\
    input.  The line we're interested in is this one:</p>\
<PRE>\
            &lt;td&gt;&lt;!--#var title--&gt;&lt;/td&gt;\
\
</PRE>\
\
<p>    Change it to read:</p>\
<PRE>\
            &lt;td&gt;\
             &lt;a href=&quot;&lt;!--#var &quot;cd_cat.getpath(data_record_id_)&quot;--&gt;&quot;&gt;\
              &lt;!--#var title--&gt;\
             &lt;/a&gt;\
            &lt;/td&gt;\
\
</PRE>\
\
<p>    Now, assuming you have added the index_html document template to\
    your CD <code>Z Class</code>, clicking on a search result will take you to\
    the CD's detailed display.</p>\
\
\
<h4>Using <code>ZCatalog</code> In A Zope Site</h4>\
<p>    The <code>ZCatalog</code> provides high-speed access to what is on your site.\
    Thus, the <code>ZCatalog</code> can be used to re-engineer the way your site\
    is laid out.</p>\
\
<p>    For instance, a Slashdot-style presentation is simple.  Just\
    insert some DTML that asks the <code>ZCatalog</code> for recent items.\
    Alternatively, a Site Map is nothing more than presenting the\
    contents of the catalog.  A page with tree-based browsing of\
    software packages by category is also easy.  Perhaps you'd like to\
    provide a link that lists all the packages the current user has\
    authored.</p>\
\
<p>    Thus, the <code>ZCatalog</code> isn't just about searching.  It can be used\
    as the DTML-scriptable engine for browsing a site as well.</p>\
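\
<p>    The "recent items" listing mentioned above boils down to sorting\
    catalog records by modification time and keeping the first few.  The\
    sketch below shows that logic in plain Python rather than DTML; the\
    record layout, the <code>modified</code> key and\
    <code>recent_items</code> are hypothetical and stand in for whatever\
    metadata the catalog actually stores.</p>\

```python
from datetime import datetime
from operator import itemgetter


def recent_items(records, count):
    """Return the `count` most recently modified records, newest first."""
    return sorted(records, key=itemgetter("modified"), reverse=True)[:count]


# Hypothetical metadata records, as a catalog might cache them.
records = [
    {"title": "Old news", "modified": datetime(1999, 5, 1)},
    {"title": "Release notes", "modified": datetime(1999, 7, 12)},
    {"title": "Tutorial", "modified": datetime(1999, 6, 3)},
]

front_page = recent_items(records, 2)
print([item["title"] for item in front_page])  # ['Release notes', 'Tutorial']
```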
\
<p>    Since the <code>ZCatalog</code> is a normal Zope folderish object, you can\
    also create DTML Methods inside it to present the catalog\
    contents.  For instance, perhaps you'd like to dump the contents\
    of the site as an RDF stream, or do content syndication with RSS.\
    These are just DTML Methods that change the <code>Content-Type:</code> and\
    send back XML.  All without actually waking up any of the content\
    objects in the site.</p>\
\
\
<h4>ZCatalog vs. Catalog</h4>\
<p>    The real star of this package is the <code>Catalog</code> module.  All the\
    heavy lifting is done by <code>Catalog</code>.  <code>ZCatalog</code> is basically a\
    Zope-aware wrapper around Catalog, which can be used on its own\
    outside the Zope framework.  The only requirement is that you are\
    using ZODB as your object store.</p>}
_______________________________________________
Zope-web maillist  -  Zope-web@zope.org
http://mail.zope.org/mailman/listinfo/zope-web
