On Sun, Aug 30, 2009 at 5:21 PM, Chris Anderson<[email protected]> wrote:
> On Sun, Aug 30, 2009 at 5:10 PM, Tom Sante<[email protected]> wrote:
>> On Sun, Aug 30, 19:11, Dale Ragan wrote:
>>> >
>>> >>Basically I have a document with id, rev, type, and Content
>>> >>keys. The Content key holds the serialized object to be stored
>>> >>as its value. Are there any pitfalls with this design? I have
>>> >>attached a sample below:
>>> >I should say I'm in no way an expert; I'm starting to wrap my head
>>> >around document modelling myself. I've been reading up on CouchDB
>>> >for a couple of days now and find it really interesting.
>>> >
>>> >Anyway, on to your document. First, why duplicate the manager id?
>>> >Isn't there a risk of them getting out of sync?
>>> There is no chance that the Ids will get out of sync. I generate
>>> the Ids when the object is persisted for the first time.
>>> >
>>> >I think you will run into many conflicts if subordinates are
>>> >updated independently. Each subordinate has an id, is there
>>> >another document with more information about subordinates? In that
>>> >case, why not have all information in there and connect them with
>>> >a managerId attribute instead?
>>> This is just an example object that I modeled up for the post.
>>> Subordinates in this case are updated another way. They are just
>>> referenced by the Manager object. Basically, a one-to-many
>>> relationship. If you wanted to update one, you would use a document
>>> that wrapped the Worker object. Is it better to normalize the data
>>> even in CouchDB?
>>>
>>> I am new to CouchDB also. I am trying to keep the domain model from
>>> needing to know about CouchDB's terms, like Rev. I am writing an
>>> API in a statically typed language and I am experimenting with the
>>> best way to store the object that is given to my API. This design
>>> helps, and it is one of the few I have come up with.
Putting serialized data inside a 'Content' attribute is a good way to
go. I have seen the same pattern recommended elsewhere. It lets you
serialize arbitrary data without collisions with metadata,
specifically the '_id', '_rev', and 'type' attributes. And map
functions can pull any indexable data out of nested attributes, so I
don't think this approach has any particular performance implications.
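
Here is a rough sketch of what I mean, assuming the document shape from
the sample below. The names `wrap`, `byLogin`, and `runMap` are made up
for illustration. Only the `function (doc) { emit(key, value); }` shape
is what CouchDB expects of a map function; `runMap` is a stand-in for
the view server so the example can be run under Node.

```javascript
// Hypothetical envelope helper: keeps CouchDB metadata (_id, type) apart
// from the serialized domain object, which lives under Content.
// _rev is left out because CouchDB assigns it on save.
function wrap(type, obj) {
  return { _id: obj.Id, type: type, Content: obj };
}

// Map function in the shape CouchDB expects: index managers by a
// field nested inside Content.
function byLogin(doc) {
  if (doc.type === "Manager" && doc.Content && doc.Content.Login) {
    emit(doc.Content.Login, doc.Content.Name);
  }
}

// Stand-in for the view server (an assumption, not a CouchDB API):
// runs a map function over docs and collects the emitted rows.
function runMap(mapFn, docs) {
  const rows = [];
  globalThis.emit = (key, value) => rows.push({ key, value });
  docs.forEach((doc) => mapFn(doc));
  delete globalThis.emit;
  return rows;
}

const manager = wrap("Manager", {
  Id: "000144df-6f11-49f1-a502-e0dab3592326",
  Name: "6",
  Login: "6-login",
  Subordinates: [],
});

const loginRows = runMap(byLogin, [manager]);
console.log(loginRows);
```

The domain object never sees '_id' or '_rev', and the map function only
has to reach one level deeper (`doc.Content.Login`) to index it.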
>>> >>{
>>> >>  "_id": "000144df-6f11-49f1-a502-e0dab3592326",
>>> >>  "_rev": "1-308931e16105b566e1fb48106c85116e",
>>> >>  "type": "Manager",
>>> >>  "Content": {
>>> >>    "Subordinates": [
>>> >>      {
>>> >>        "Address": {
>>> >>          "Street": "123 Somewhere St.",
>>> >>          "City": "Kalamazoo",
>>> >>          "State": "MI",
>>> >>          "Zip": "12345"
>>> >>        },
>>> >>        "Hours": 40,
>>> >>        "Id": "6bcdea2f-2439-4785-ab59-2ee612435705",
>>> >>        "Name": "Bob",
>>> >>        "Login": "bbob"
>>> >>      },
>>> >>      {
>>> >>        "Address": {
>>> >>          "Street": "123 Somewhere St.",
>>> >>          "City": "Kalamazoo",
>>> >>          "State": "MI",
>>> >>          "Zip": "12345"
>>> >>        },
>>> >>        "Hours": 40,
>>> >>        "Id": "b0d156c9-ea3f-4c4f-b49d-ab19bff64dd8",
>>> >>        "Name": "Alice",
>>> >>        "Login": "aalice"
>>> >>      },
>>> >>      {
>>> >>        "Address": {
>>> >>          "Street": "123 Somewhere St.",
>>> >>          "City": "Kalamazoo",
>>> >>          "State": "MI",
>>> >>          "Zip": "12345"
>>> >>        },
>>> >>        "Hours": 20,
>>> >>        "Id": "12b6dbbc-44e8-43c2-8142-11fc6c1d23df",
>>> >>        "Name": "Eve",
>>> >>        "Login": "eeve"
>>> >>      }
>>> >>    ],
>>> >>    "Id": "000144df-6f11-49f1-a502-e0dab3592326",
>>> >>    "Name": "6",
>>> >>    "Login": "6-login"
>>> >>  }
>>> >>}
>>> >>
>>> >>Basically the content is a Manager type object with an Id, Name,
>>> >>Login, and Subordinates. Subordinates are Workers with an Id,
>>> >>Name, Login, Hours, and an Address. The _id and the Id of the
>>> >>Manager object are the same. Basically the Document object is
>>> >>just a wrapper around what is given to be persisted.
>>> >>
>>> >>Thanks,
>>> >>
>>> >>Dale
>>
>> Like Martin said, why all this duplication?
>> Give each worker its own document and only add the ids of the
>> workers as subordinates. That way you can change worker details
>> without having to change the manager document.
>
> if you put the manager_id on the worker, then you can pull out a
> manager and all its workers in a single query if you like, using just
> a map view.
>
> here's the canonical write-up of the technique:
>
> http://www.cmlenz.net/archives/2007/10/couchdb-joins
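
A rough sketch of that technique, under some assumptions of mine: worker
documents carry a top-level `manager_id`, the ids "m-6", "w-bob" and so
on are invented, and `runView` is a hypothetical in-memory stand-in for
querying a view, not a CouchDB API. It sorts with plain string and
number comparison, which is a simplification of CouchDB's collation but
enough to show the ordering.

```javascript
// Map function in the style of the linked write-up: a manager sorts
// first under its own id, its workers directly after it.
function managerWithWorkers(doc) {
  if (doc.type === "Manager") {
    emit([doc._id, 0], doc.Content.Name);
  } else if (doc.type === "Worker") {
    emit([doc.manager_id, 1], doc.Content.Name);
  }
}

// Stand-in for a view query (an assumption, not a CouchDB API): collects
// emitted rows, sorts them by [id, marker], and keeps one manager's group.
function runView(mapFn, docs, managerId) {
  const rows = [];
  globalThis.emit = (key, value) => rows.push({ key, value });
  docs.forEach((doc) => mapFn(doc));
  delete globalThis.emit;
  rows.sort((a, b) =>
    a.key[0] < b.key[0] ? -1 : a.key[0] > b.key[0] ? 1 : a.key[1] - b.key[1]
  );
  return rows.filter((row) => row.key[0] === managerId);
}

const docs = [
  { _id: "w-bob", type: "Worker", manager_id: "m-6", Content: { Name: "Bob" } },
  { _id: "m-6", type: "Manager", Content: { Name: "6" } },
  { _id: "w-eve", type: "Worker", manager_id: "m-6", Content: { Name: "Eve" } },
  { _id: "w-zed", type: "Worker", manager_id: "m-9", Content: { Name: "Zed" } },
];

const group = runView(managerWithWorkers, docs, "m-6");
console.log(group.map((row) => row.value));
```

The manager row comes back first, followed by that manager's workers.
Against a real view the same group would normally be requested as a key
range, with startkey=["m-6"] and endkey=["m-6", {}].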
>
>>
>> It might even be better to store only the manager's own info in the
>> manager doc and keep any worker-manager relations in the respective
>> worker document, by referencing the manager id in the worker doc
>> along with how many hours he worked for that manager.
>> This makes it easier when a worker moves to another manager: you
>> just reference the new manager id in the worker doc, while still
>> keeping the history of the managers that worker had in the past.
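
One way that worker document could look, purely as a sketch: the
`Assignments` array, its `manager_id`, `Hours`, and `current` fields,
and the manager ids "m-1" and "m-6" are all my own invention, and
`collectRows` is a hypothetical stand-in for the view server.

```javascript
// Hypothetical worker document: the relation to each manager, with the
// hours worked for that manager, lives on the worker itself.
const worker = {
  _id: "6bcdea2f-2439-4785-ab59-2ee612435705",
  type: "Worker",
  Content: {
    Name: "Bob",
    Login: "bbob",
    Assignments: [
      { manager_id: "m-1", Hours: 20, current: false }, // past manager
      { manager_id: "m-6", Hours: 40, current: true }, // present manager
    ],
  },
};

// Map function: one row per assignment, keyed by manager id, so both
// present and past workers of a manager can be looked up by key.
function hoursByManager(doc) {
  if (doc.type !== "Worker" || !doc.Content) return;
  (doc.Content.Assignments || []).forEach((a) => {
    emit([a.manager_id, a.current ? 0 : 1], { worker: doc._id, hours: a.Hours });
  });
}

// Stand-in for the view server (an assumption, not a CouchDB API).
function collectRows(mapFn, doc) {
  const rows = [];
  globalThis.emit = (key, value) => rows.push({ key, value });
  mapFn(doc);
  delete globalThis.emit;
  return rows;
}

const assignmentRows = collectRows(hoursByManager, worker);
console.log(assignmentRows);
```

Moving Bob to another manager then only touches Bob's document: flip
`current` on the old entry and append a new one.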