Re: [Ferret-talk] Need clarification of documentation

David Balmain Mon, 05 Mar 2007 19:11:29 -0800

On 3/5/07, Chad Thatcher <[EMAIL PROTECTED]> wrote:
>
> Hi, I have a question about the delete() method docs.
>
> I am re-indexing data on the fly so I would like to delete any existing
> indexed data for a particular resource before re-indexing it using
> index.delete(id).
>
> The delete() method api doc says:
>
> "Delete the document referenced by the document number id if id is an
> integer or all of the documents which have the term id if id is a term.
>
> id:  The number of the document to delete"
>
> I am a little confused by what this means.


Is this any clearer?

    # Deletes a document/documents from the index. The method for determining
    # the document to delete depends on the type of the argument passed.
    #
    # If +arg+ is an Integer then delete the document based on the internal
    # document number.
    #
    # If +arg+ is a String then search for the documents with +arg+ in the
    # +id+ field. The +id+ field is either :id or whatever you set the :id_field
    # parameter to when you create the Index object.

> At the time of deletion all
> I have is my own ID of the resource which was previously indexed in
> Ferret with my own field :id.  If I supply my own ID, will the correct
> indexed data be deleted?  Or does this ID refer to Ferret's own internal
> ID for the resource?

In this case, since your id is probably an integer, you will need to
convert it to a string. Otherwise Ferret will delete by internal
document number rather than by your own ID for the resource.
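
A minimal sketch of that conversion, in case it helps. The helper name
delete_by_resource_id is made up for illustration and is not part of
Ferret; it only assumes that the index object responds to delete() as
described above.

```ruby
# Hypothetical helper (not part of Ferret): always delete by the value
# stored in the id field, never by internal document number.
def delete_by_resource_id(index, resource_id)
  # An Integer argument would be treated as an internal document number,
  # so force a String to make delete() look the value up in the id field.
  index.delete(resource_id.to_s)
end

# Usage, assuming `index` is a Ferret index and 42 is your own resource ID:
#   delete_by_resource_id(index, 42)   # calls index.delete("42")
```

Routing every delete through one method like this means an integer ID
from your database cannot be mistaken for a document number.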

> One other question while I am on the subject - will deleting a resource
> that does not exist raise an error?  I ask this because I would like to
> index new data structures that haven't been indexed before and would
> like to avoid checking in the index first whether or not it exists
> before attempting to delete.

Yes, it raises an error if you delete by internal document number. No,
it does not if you delete by term, i.e. by passing your own document id,
which is stored in the *id* field. So in your case you should be fine.
I should also mention that you can set the :key parameter to :id:

    index = Ferret::Index::Index.new(:key => :id)

This way, whenever you add a document with an id that already exists
in the index, it will replace the existing document.

For example:

    require 'rubygems'
    require 'ferret'

    index = Ferret::I.new(:key => :id)

    [
      {:id => '1', :text => 'one'},
      {:id => '2', :text => 'Two'},
      {:id => '3', :text => 'Three'},
      {:id => '1', :text => 'One'}
    ].each {|doc| index << doc}

    puts index.size                       # => 3
    puts index['1'].load.inspect          # => {:text=>"One", :id=>"1"}
    puts index.search('id:1').to_s(:text)
        # => TopDocs: total_hits = 1, max_score = 1.287682 [
        #            3 "One": 1.287682
        #    ]

Hope that helps,
Dave

-- 
Dave Balmain
http://www.davebalmain.com/
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
