On Jun 28, 2008, at 08:03, David King wrote:

I'm trying to gain a fundamental understanding of views and indexed data. If this is documented in a FAQ, please direct me there instead :)

In trying to map my understanding from SQL,

Here we have to tackle the first issue: do not try to map what you know
from SQL to CouchDB. Try to understand how CouchDB works on its own
terms, and then apply your problems to it. A translation will not
work and may leave you thinking CouchDB is crap (which it surely is not)
just because it is not an RDBMS. On the other hand, it is
perfectly possible that CouchDB is not the right tool for your job,
but it is certainly cool that you are checking it out :)


it appears that the answer to quickly querying data is by pre-calculating query result-sets and storing them in tables, called views. A view is a table populated by a function that runs against every object that is written or modified in the database.

1. How would you implement a query against a value that changes after the view is populated, like the current time? That is, if I wanted things younger than a week, a permanent view like this:

function(doc) {
        if(doc.date > now() - timeinterval('1 week')) {
                emit(null,doc);
        }
}
(date-syntax liberally made up) the results of that query, if populated when the data is changed, would quickly be invalid, because now() has changed. Is this accurate? How would you performantly run a query like this?

Your map functions must return the same result for the same input, so
things like now() cannot be used. And you usually don't need them. The most
interesting feature of the result set (or table, as you call it) of the map function is that the 'first column', the 'key', is kept sorted and can be used for fast lookups.
So what you would do here instead is:

function(doc) {
  emit(doc.date, null);
}

and query with /db/_view/date/name?startkey=timestamp_from_interval('1 week')&endkey=now(), where the client computes both values at query time (function names made up, as in your example).

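Looking this up is fast: the view index is a B-tree sorted by key, so finding the start of the range takes logarithmic time, not a scan over every document.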
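
As a rough sketch of the client side of that query: the thread does not say how doc.date is stored, so this assumes it holds milliseconds since the epoch. The helper name recentDocsUrl is made up for illustration; only the view path comes from the example above. The keys are JSON-encoded because view keys are JSON values.

```javascript
// Assumption: doc.date is a number of milliseconds since the epoch.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical helper: builds the range query for "younger than a week".
// The caller passes in the current time, so the map function never needs now().
function recentDocsUrl(nowMs) {
  const startkey = encodeURIComponent(JSON.stringify(nowMs - WEEK_MS));
  const endkey = encodeURIComponent(JSON.stringify(nowMs));
  return "/db/_view/date/name?startkey=" + startkey + "&endkey=" + endkey;
}

// Typical use: compute the URL at request time.
const url = recentDocsUrl(Date.now());
```

The point of the design is that the index itself never changes when the clock does; only the range the client asks for moves.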

2. Same question for a permanent view containing the youngest 10 items (this one might be easier)?

Same thing: query the same date view in descending order and ask for
only the first 10 rows. I note that you explicitly mention permanent
views. Do not use temporary views in production, only during development.
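
To make the answer to question 2 concrete, here is a toy in-memory stand-in for the date view. It is not how CouchDB is implemented; buildIndex and youngest are made-up names, and emit is passed in as an argument so the map function can run outside CouchDB. Recent CouchDB releases expose this kind of query through the descending and limit parameters; the release discussed here may name them differently.

```javascript
// Same map function as for question 1, with emit passed in explicitly.
function mapByDate(doc, emit) {
  emit(doc.date, null);
}

// Toy stand-in for the view index: run the map over every doc, then sort rows by key.
function buildIndex(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key: key, value: value }));
  }
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Youngest n items: read the sorted rows from the high end and stop after n.
function youngest(rows, n) {
  return rows.slice().reverse().slice(0, n);
}

// Made-up sample documents.
const docs = [
  { _id: "a", date: 300 },
  { _id: "b", date: 100 },
  { _id: "c", date: 200 },
];
const index = buildIndex(docs, mapByDate);
const newestTwo = youngest(index, 2).map((row) => row.id);
```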


3. The wiki doesn't mention parameterised views. So if I have a document with an 'author' field, and I want a view such that I can see everything that a given author wrote, do I need a view per author? Given thousands of authors, what is the performance cost for running a document through a few thousand author-functions?

Same as above:

function(doc) {
  emit(doc.author, null);
}

GET /db/_view/authors/name?key="authorname"

One view, extremely fast lookups.
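
A small sketch of why one view is enough, assuming author names are plain strings. authorUrl and idsByAuthor are made-up helpers, and the linear scan in idsByAuthor only imitates what the sorted view index answers without visiting every document.

```javascript
// Hypothetical helper: the key is sent as a JSON value, so a string author gets quoted.
function authorUrl(author) {
  return "/db/_view/authors/name?key=" + encodeURIComponent(JSON.stringify(author));
}

// The map function from above, with emit passed in so it runs outside CouchDB.
function mapByAuthor(doc, emit) {
  emit(doc.author, null);
}

// Toy lookup: ids of all docs whose emitted key equals the requested author.
function idsByAuthor(docs, author) {
  const ids = [];
  for (const doc of docs) {
    mapByAuthor(doc, (key) => {
      if (key === author) ids.push(doc._id);
    });
  }
  return ids;
}

// Made-up sample documents: many authors, still a single map function.
const posts = [
  { _id: "p1", author: "alice" },
  { _id: "p2", author: "bob" },
  { _id: "p3", author: "alice" },
];
const alicePosts = idsByAuthor(posts, "alice");
```

The author is a parameter of the query, not of the view, so adding a thousand more authors adds rows to the index but no new functions.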


4. I know that the distribution bits are still being fleshed out, but is it the intention that eventually views can be stored or calculated on a separate server from the data (since they are implemented as tables)?

Not sure what you mean by 'since they are implemented as tables', but
maybe it is just the SQL lingo that is confusing me. We don't have
tables (things might look like them, though). But yes, eventually you will be able to distribute view creation. We haven't gotten around to that yet.

Feel free to send in more questions as they come :-)

Cheers
Jan