[ https://issues.apache.org/jira/browse/COUCHDB-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849833#action_12849833 ]
Luke Burton commented on COUCHDB-707:
-------------------------------------

Not far, because the objective is to perform the filter on data available in the view, and only do the expensive fetch of the entire document when the filter criteria are met.

Say I have a Couch database full of images with metadata. My goal is to fetch a bunch of images that contain a particular tag, from a particular author, of less than a particular focal length.

To do this, I could build a filter view that emits [tags, author, focalLength]. I pass in my match criteria as HTTP parameters. The filter view would enumerate each row and see if req.tag is in row.tags, whether req.author == row.author, and whether req.focalLength > row.focalLength. I could then emit only the complete documents that match.

To do this with a list view, I would need to supply include_docs=true, to get access to the entire image document so I could actually return it upon a match. This means Couch is retrieving into memory a potentially multi-gigabyte view, documents included, then handing it off to the list view JavaScript for transformation. Expensive! What if only five images actually match? :)

As I mentioned above, you can do all this on the client side (fetch the view, process it, get a list of IDs, then fetch them), but it requires multiple calls over the wire. And it's putting what I consider to be "database oriented" stuff into the front end, rather than in the database itself ...
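The per-row test described above could look roughly like the sketch below. This is plain JavaScript, not an existing CouchDB API: the names matchesRow and filterRows, the row shape (key = [tags, author, focalLength]), and the sample data are all assumptions made for illustration. Query parameters are treated as strings, as they would arrive over HTTP.

```javascript
// Hypothetical per-row predicate for the proposed "filter view".
// CouchDB does not provide this; it only illustrates the matching logic.
// Assumed row shape: { id: ..., key: [tags, author, focalLength] }.
function matchesRow(row, query) {
  var tags = row.key[0];
  var author = row.key[1];
  var focalLength = row.key[2];
  return tags.indexOf(query.tag) !== -1 &&        // req.tag is in row.tags
    author === query.author &&                    // same author
    Number(query.focalLength) > focalLength;      // shorter than the limit
}

// Collects the IDs of matching rows. Only these documents would then
// need the expensive full fetch.
function filterRows(rows, query) {
  return rows
    .filter(function (row) { return matchesRow(row, query); })
    .map(function (row) { return row.id; });
}

// Made-up sample rows and request parameters.
var sampleRows = [
  { id: "img-1", key: [["sunset", "beach"], "luke", 50] },
  { id: "img-2", key: [["sunset"], "anna", 35] },
  { id: "img-3", key: [["sunset"], "luke", 200] },
  { id: "img-4", key: [["portrait"], "luke", 50] }
];
var sampleQuery = { tag: "sunset", author: "luke", focalLength: "85" };

var matchingIds = filterRows(sampleRows, sampleQuery);
console.log(matchingIds);
```

With the sample data only img-1 passes all three checks, so a single document would have to be fetched instead of the whole view with include_docs=true.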

> Proposal for "Filter Views"
> ---------------------------
>
> Key: COUCHDB-707
> URL: https://issues.apache.org/jira/browse/COUCHDB-707
> Project: CouchDB
> Issue Type: New Feature
> Components: JavaScript View Server
> Affects Versions: 0.11
> Reporter: Luke Burton
>
> A common operation I find myself performing repeatedly is:
> * request a view (maybe with some basic filter like "keys" or a range of keys)
> * in my client, filter this view based on some complex criteria, leaving me
>   with a small set of document IDs (complex as in array intersections, compound
>   boolean operations, & other stuff not possible in the HTTP view API)
> * go back to Couch and fetch the complete documents for these IDs.
>
> List Views almost get me to the point of doing this purely in Couch. I can
> enumerate over a view and do some complex things with it. But I can't output
> entire documents, unless I use the include_docs=true flag which murders the
> performance of the list view. Apparently because the entire view is fetched
> with including docs, THEN passed on to the list view JS. Typically my complex
> filter criteria is contained in the view itself, so there is no need to fetch
> the entire document until I know I have a match.
>
> In summary, a Filter View would execute some arbitrary JavaScript on each
> view row, with access to HTTP request parameters, and return "true" for rows
> that match. The output would be a list of IDs for whom the function returned
> true. include_docs=true would include the matching documents.
>
> Performance would certainly not be as good as fetching a raw view, but it
> would indisputably be better than fetching the entire view over HTTP to a
> client, deserializing the JSON, doing some stuff, then making another HTTP
> request, and deserializing more JSON.
>
> I looked at the various entry points for list views in the Couch source.
> Unfortunately it will take me some time to come up to speed with the source
> (if I ever have the time ...), and I hope that what I'm asking for could be a
> simple extension to the List Views for someone very familiar with this area.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.