I implemented something similar recently, except that I wanted to find any
word that started with a given substring. What I ended up doing was emitting
the first 3 characters of each word in the field as the key. I still had to do
quite a bit of processing on the client side once I got back the list of
potential matches from CouchDB.
The following map function outputs the first three characters of each word in
the Name field of all Person objects:
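    function(doc) {
        // Emit one row per word, keyed by its lowercased first three characters.
        function emitParts(parts, doc)
        {
            for (var i = 0; i < parts.length; ++i)
            {
                emit(parts[i].substr(0, 3).toLowerCase(), doc);
            }
        }

        if (doc.type == "Person")
        {
            var parts;
            // Only index documents whose Name is a non-empty string.
            if (doc.Name && typeof(doc.Name) == "string")
            {
                // Split the name into words on single spaces.
                parts = doc.Name.split(new RegExp("[ ]"));
                emitParts(parts, doc);
            }
        }
    }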
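
The client-side processing mentioned above could look roughly like the
following. This is only a minimal sketch: filterByWordPrefix is a hypothetical
helper name, and it assumes the view was queried with the key set to the first
three lowercased characters of the search term, and that `rows` has the shape
of the "rows" array in the view response (each row carrying the whole document
as its value, as the map function emits it).

```javascript
// Hypothetical helper: keep the documents that have at least one word in
// Name starting with the full search prefix. The view only guarantees a
// match on the first three characters, so the rest is checked here.
function filterByWordPrefix(rows, prefix) {
    var wanted = prefix.toLowerCase();
    var seen = Object.create(null);   // ids already added to the result
    var matches = [];
    for (var i = 0; i < rows.length; ++i) {
        var row = rows[i];
        // A document appears once per word that shares the 3-character key.
        if (seen[row.id]) continue;
        var words = row.value.Name.toLowerCase().split(" ");
        for (var j = 0; j < words.length; ++j) {
            if (words[j].indexOf(wanted) === 0) {
                seen[row.id] = true;
                matches.push(row.value);
                break;
            }
        }
    }
    return matches;
}
```

A search term shorter than three characters would not match any key exactly,
so in that case a startkey/endkey range query would be needed instead.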
-----Original Message-----
From: Paul Davis [mailto:[email protected]]
Sent: Thursday, January 22, 2009 9:09 AM
To: [email protected]
Subject: Re: Newbie question: substring matching
You'll never be able to have a wildcard on the front of your pattern
with CouchDB directly, and you'll only be able to have a wildcard on
one end of the pattern.
Something you could try:
emit(doc.field_to_search, value);
emit(string_reverse_function(doc.field_to_search), value);
Then you could do something like:
http://127.0.0.1:5984/db_name/_view/ddoc/using_like?startkey="foo"&endkey="foo\u9999"
http://127.0.0.1:5984/db_name/_view/ddoc/using_like?startkey="oof"&endkey="oof\u9999"
And then intersect the two sets client side. Other than that, I'd look
at integrating full text search.
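
A rough sketch of the two pieces this needs, under some assumptions:
reverseString stands in for string_reverse_function, intersectById is a
hypothetical client-side helper, and the two result sets are taken to be the
"rows" arrays from the two queries. It also assumes the forward and reversed
keys can be told apart (for example by living in separate views), since
otherwise a forward key that happens to start with "oof" would show up in the
second query as well.

```javascript
// Reverse a string so that a suffix match becomes a prefix match.
// Note: this splits on UTF-16 code units, so characters outside the
// Basic Multilingual Plane would be mangled.
function reverseString(s) {
    return s.split("").reverse().join("");
}

// Keep the rows from rowsA whose document id also appears in rowsB,
// returning each document at most once.
function intersectById(rowsA, rowsB) {
    var idsInB = Object.create(null);
    for (var i = 0; i < rowsB.length; ++i) {
        idsInB[rowsB[i].id] = true;
    }
    var emitted = Object.create(null);
    var out = [];
    for (var j = 0; j < rowsA.length; ++j) {
        var id = rowsA[j].id;
        if (idsInB[id] && !emitted[id]) {
            emitted[id] = true;
            out.push(rowsA[j]);
        }
    }
    return out;
}
```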
HTH,
Paul Davis
On Thu, Jan 22, 2009 at 11:56 AM, Brian Candler <[email protected]> wrote:
> Suppose I have a view which indexes a single field. Using startkey and
> endkey, it's easy to find matches which start with a particular pattern.
>
> But I'm wondering how best to do substring matches (in SQL: LIKE '%foo%')
>
> I could:
>
> 1. Read the entire view, and filter it client-side (problem: large
> data transfer)
>
> 2. Create another view which enumerates all possible suffixes (problem:
> large index, O(N^2))
>
> somedata
> omedata
> medata
> edata
> data
> ata
> ta
> a
>
> 3. Create a temporary view for the exact search being done (problem: forces
> a read through all documents in the database)
>
> Is there some other option I have overlooked, such as filtering the view
> server-side somehow?
>
> Thanks,
>
> Brian.
>