[Couchdb Wiki] Update of "View_Snippets" by Sebastian Cohnen

Apache Wiki Sun, 02 May 2010 11:08:18 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for 
change notification.


The "View_Snippets" page has been changed by SebastianCohnen.
The comment on this change is: added first level heading; added TOC; removed 
unnecessary anchors.
http://wiki.apache.org/couchdb/View_Snippets?action=diff&rev1=35&rev2=36

--------------------------------------------------

+ = View Snippets =
+ <<TableOfContents()>>
+ 
  This page collects code snippets to be used in your [[Views]]. They are 
mainly meant to help get your head around the map/reduce approach to accessing 
database content. Keep in mind that the Futon web client silently adds 
group=true to your views.
  
-   * [[#common_mistakes|Common mistakes]]
-   * [[#get_doc_id|Get docs with a particular user id ]]
-   * [[#get_doc_with_attachment|Get all documents which have an attachment ]]
-   * [[#count_doc_with_attachment|Count documents with and without an 
attachment]]
-   * [[#list_unique_values|Generating a list of unique values]]
-   * [[#top_n_tags|Retrieve the top N tags]]
-   * [[#aggregate_sum|Joining an aggregate sum along with related data ]]
-   * [[#standard_deviation|Computing the standard deviation]]
-   * [[#summary_stats|Computing simple summary statistics 
(min,max,mean,standard deviation) ]]
-   * [[#interactive_couchdb|Interactive CouchDB Tutorial]]
-   * [[#documents_without_a_field|Retrieving documents without a certain 
field]]
-   * [[#geospatial_indexes|Using views to search for sort documents 
geographically]]
  
- <<Anchor(common_mistakes)>>
  == Common mistakes ==
  
  When creating a reduce function, a re-reduce should behave in the same way as 
the regular reduce. The reason is that CouchDB doesn't necessarily call 
re-reduce on your map results.
  
  Think about it this way: If you have a bunch of values V1 V2 V3 for key K, 
then you can get the combined result either by calling 
reduce([K,K,K],[V1,V2,V3],0) or by re-reducing the individual results: 
reduce(null,[R1,R2,R3],1). This depends on what your view results look like 
internally.
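
As a minimal sketch of this property (not taken from the wiki page), the following plain JavaScript sums values the way a reduce function would, and checks that one reduce call over all rows gives the same answer as re-reducing partial results. The sample values and the way they are split into chunks are assumptions for illustration, and the keys use the simplified [K,K,K] notation from the paragraph above.

```javascript
// A CouchDB-style reduce function: (keys, values, rereduce).
// For a sum, the rereduce step is the same operation as the first pass,
// because summing partial sums gives the total sum.
function sumReduce(keys, values, rereduce) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical map output for a single key K.
var keys = ["K", "K", "K", "K"];
var values = [1, 2, 3, 4];

// Path 1: one reduce call over all map rows.
var direct = sumReduce(keys, values, false);

// Path 2: reduce two chunks separately, then re-reduce the partial results.
var r1 = sumReduce(keys.slice(0, 2), values.slice(0, 2), false);
var r2 = sumReduce(keys.slice(2), values.slice(2), false);
var combined = sumReduce(null, [r1, r2], true);

console.log(direct, combined);
```

A reduce that simply returned `values.length` would fail this check: on re-reduce it would count the partial results rather than the original rows.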
  
- <<Anchor(get_doc_id)>>
+ 
  == Get docs with a particular user id ==
  
  {{{
@@ -35, +25 @@

  
  Then query with key=USER_ID to get all the rows that match that user.
  
- <<Anchor(get_doc_with_attachment)>>
+ 
  == Get all documents which have an attachment ==
  
  This lists only the documents which have an attachment.
@@ -50, +40 @@

  
  In SQL this would be something like {{{SELECT id FROM table WHERE attachment 
IS NOT NULL}}}.
  
- <<Anchor(count_doc_with_attachment)>>
+ 
  == Count documents with and without an attachment ==
  
  Call this with ''group=true'' or you only get the combined number of 
documents with and without attachments.
@@ -80, +70 @@

  
  In SQL this would be something along the lines of {{{SELECT num_attachments 
FROM table GROUP BY num_attachments}}} (but this would give extra output for 
rows containing more than one attachment).
  
- <<Anchor(list_unique_values)>>
+ 
  == Generating a list of unique values ==
  
  Here we use the fact that the key for a view result can be an array. Suppose 
you have a map that generates (key, value) pairs with many duplicates and you 
want to remove the duplicates. To do so, use ([key, value], null) as the map 
output.
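
The idea can be sketched outside CouchDB in plain JavaScript. The sample pairs and the `groupByKey` helper are hypothetical; the helper only stands in for what a grouped reduce query does to rows with equal keys and does not reproduce CouchDB's collation.

```javascript
// Hypothetical map output with duplicates: (key, value) pairs.
var pairs = [
  ["fruit", "apple"],
  ["fruit", "apple"],
  ["fruit", "pear"],
  ["veg", "leek"],
  ["veg", "leek"]
];

// Emit ([key, value], null) instead of (key, value).
var rows = pairs.map(function (p) {
  return { key: [p[0], p[1]], value: null };
});

// Stand-in for a grouped query: rows with equal keys collapse into one row.
// Keys are compared via their JSON text, which is enough for this sketch.
function groupByKey(rows) {
  var seen = {};
  var out = [];
  rows.forEach(function (row) {
    var id = JSON.stringify(row.key);
    if (!seen[id]) {
      seen[id] = true;
      out.push(row.key);
    }
  });
  return out;
}

var unique = groupByKey(rows);
console.log(JSON.stringify(unique));
```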
@@ -124, +114 @@

  If you then want to know the total count for each parent, you can use the 
''group_level'' view parameter:
  
''startkey=[''''''"thisparent"]&endkey=["thisparent",{}]&inclusive_end=false&group_level=1''
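
Because the key values in that query are JSON, they need URL-encoding when sent over HTTP. The sketch below builds such a query string; the database, design document and view names in the path are placeholders, not names from the wiki page.

```javascript
// Hypothetical parent key.
var parent = "thisparent";

// startkey and endkey are JSON values. An empty object collates after
// strings and arrays, so [parent, {}] closes the range for this parent.
var params = [
  ["startkey", JSON.stringify([parent])],
  ["endkey", JSON.stringify([parent, {}])],
  ["inclusive_end", "false"],
  ["group_level", "1"]
];

var query = params.map(function (p) {
  return p[0] + "=" + encodeURIComponent(p[1]);
}).join("&");

// Placeholder path; substitute your own database and view names.
var url = "/db/_design/ddoc/_view/viewname?" + query;
console.log(url);
```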
  
- <<Anchor(top_n_tags)>>
+ 
  == Retrieve the top N tags. ==
  
  This snippet assumes your docs have a top level tags element that is an array 
of strings. In theory it would work with an array of anything, but it hasn't 
been tested as such.
@@ -223, +213 @@

  
  When querying this reduce you should not use the `group` or `group_level` 
query string parameters. The returned reduce value will be an object with the 
top `MAX` tag: count pairs.
  
- <<Anchor(aggregate_sum)>>
+ 
  == Joining an aggregate sum along with related data ==
  
  Here is a modified example from the [[View_collation|View collation]] page.  
Note that `group_level` needs to be set to `1` for it to return a meaningful 
`customer_details`.
@@ -261, +251 @@

  }}}
  
  
- <<Anchor(standard_deviation)>>
  == Computing the standard deviation ==
  This example is from the couchdb test-suite. It is '''much''' easier and less 
complex than the following example ([[#summary_stats|Computing simple summary 
statistics (min,max,mean,standard deviation)]]), although it does not calculate 
min, max and mean (but this should be an easy exercise).
  
@@ -311, +300 @@

  }}}
  
  
- <<Anchor(summary_stats)>>
+ 
  == Computing simple summary statistics (min,max,mean,standard deviation)  ==
  
  This implementation of standard deviation is more complex than the above 
algorithm, which Chan, Golub, and Le``Veque call the "textbook one-pass 
algorithm". While the textbook algorithm is mathematically equivalent to the 
standard two-pass computation of standard deviation, it can be numerically 
unstable under certain conditions. Specifically, if the square of the sum and 
the sum of the squares are both large, each is computed with some rounding 
error. If the variance of the data set is small, then subtracting those two 
large numbers (which have been rounded off slightly) might wipe out the 
computation of the variance. See http://www.jstor.org/stable/2683386, 
http://people.xiph.org/~tterribe/notes/homs.html, and the wikipedia description 
of Knuth's algorithm 
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance.
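
To make the cancellation problem concrete, here is a small sketch in plain JavaScript that contrasts the sum and sum-of-squares form with the running-mean update described on the linked Wikipedia page. It is not the code from this wiki page; the function names and the sample data are assumptions, and both functions compute the sample variance (dividing by n - 1).

```javascript
// Textbook one-pass form: keep the sum and the sum of squares.
function varianceTextbook(xs) {
  var n = xs.length, sum = 0, sumSq = 0;
  for (var i = 0; i < n; i++) {
    sum += xs[i];
    sumSq += xs[i] * xs[i];
  }
  // Subtracts two large, rounded numbers; this is where precision is lost.
  return (sumSq - (sum * sum) / n) / (n - 1);
}

// Online update: keep a running mean and a running sum of squared deviations.
function varianceOnline(xs) {
  var n = 0, mean = 0, m2 = 0;
  for (var i = 0; i < xs.length; i++) {
    n++;
    var delta = xs[i] - mean;
    mean += delta / n;
    m2 += delta * (xs[i] - mean);
  }
  return m2 / (n - 1);
}

// Hypothetical data: large values with a small spread. Sample variance is 30.
var data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16];
console.log("textbook:", varianceTextbook(data));
console.log("online:  ", varianceOnline(data));
```

With the offset of 1e9 the squares exceed what a double can represent exactly, so the textbook result can drift away from 30 while the online result does not. A reduce function would additionally need a rule for merging partial results on re-reduce, which this sketch does not show.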
@@ -706, +695 @@

  
  For example: you can now query your view and retrieve all documents that do 
not contain the field `role` (view/NAME/?key="role").
  
- <<Anchor(geospatial_indexes)>>
+ 
  == Using views to search for sort documents geographically ==
  
  If you use latitude/longitude information in your documents, it's not very 
easy to sort on proximity from a given point using the normal approach (of 
using a key of [<latitude>, <longitude>]). This happens because they're on 
different axes, which doesn't map well onto CouchDB's treatment of the index 
sorting -- which is a linear sort. However, using a 
[[http://en.wikipedia.org/wiki/Geohash|geohash]] may solve this, by letting you 
convert the coordinates of a location into a string that sorts well (e.g., 
locations that are close share a common prefix).
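
For illustration, here is a minimal geohash encoder in plain JavaScript, following the standard scheme of interleaving longitude and latitude bits and writing them in base 32. The function name `geohashEncode` and the sample coordinates are assumptions; the wiki page may use a different implementation.

```javascript
// Standard geohash base-32 alphabet (no a, i, l, o).
var BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

function geohashEncode(lat, lon, precision) {
  var latRange = [-90, 90];
  var lonRange = [-180, 180];
  var hash = "";
  var bits = 0;      // bits collected for the current character
  var value = 0;     // 5-bit value of the current character
  var useLon = true; // bits alternate, starting with longitude

  while (hash.length < precision) {
    var range = useLon ? lonRange : latRange;
    var coord = useLon ? lon : lat;
    var mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = value * 2 + 1; // upper half: append a 1 bit
      range[0] = mid;
    } else {
      value = value * 2;     // lower half: append a 0 bit
      range[1] = mid;
    }
    useLon = !useLon;
    bits++;
    if (bits === 5) {
      hash += BASE32.charAt(value);
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// Two nearby points (hypothetical sample coordinates).
var a = geohashEncode(57.64911, 10.40744, 6);
var b = geohashEncode(57.64950, 10.40800, 6);
console.log(a, b);
```

A map function could emit such a string as the view key, so that a key range over a common prefix returns locations in the same cell. Note that two points on opposite sides of a cell boundary can be close together and still not share a prefix.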
