Chris Anderson wrote:
> The alternative approach is to forgo the MD5 hash calculation, and
> POST the parsed data into CouchDB, creating a new record with an
> arbitrary id. I imagine that I would end up with a lot of identical
> data in this case, and it would become the job of the
> Map/Combine/Reduce process to filter duplicates while creating the
> lookup indexes.
PUT is almost always better than POST: PUT is idempotent, so if you experience a network failure, you can retry it without any ill effects. If you use POST and you experience a network partition before you get the response, how will you determine whether or not the POST succeeded? See "Post Once Exactly" (http://www.mnot.net/blog/2005/03/21/poe), HTTPLR (http://dehora.net/doc/httplr/draft-httplr-01.html), and their related discussions.

- Brian
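
To make the retry argument concrete, here is a rough sketch of the PUT approach with a content-derived id. It is only an illustration, not anyone's actual code: `doc_id_for`, `put_with_retry`, and the `send(method, url, body)` transport are hypothetical names, and the sketch assumes the server answers 201 when the document is created and 409 when a document with that id already exists (which is how CouchDB responds to a PUT without a matching revision).

```python
import hashlib
import json


def doc_id_for(record):
    """Derive a stable document id from the record's content.

    Keys are sorted so that equal records always hash to the same id.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def put_with_retry(send, base_url, record, attempts=3):
    """PUT `record` at an id derived from its content, retrying on network errors.

    `send` is a hypothetical transport: it returns an HTTP status code and
    raises ConnectionError when the network fails (possibly after the server
    has already applied the write).
    """
    url = base_url.rstrip("/") + "/" + doc_id_for(record)
    body = json.dumps(record)
    last_error = None
    for _ in range(attempts):
        try:
            status = send("PUT", url, body)
        except ConnectionError as exc:
            # We cannot tell whether the write landed; repeating the same
            # PUT to the same URL is safe, so just try again.
            last_error = exc
            continue
        # 201: created now. 409: a document with this id already exists,
        # for example because an earlier attempt landed but its response
        # was lost. The id is a content hash, so either way the data is stored.
        if status in (201, 409):
            return url
        raise RuntimeError(f"unexpected status {status}")
    raise last_error
```

With POST and a server-assigned id, the same lost response would leave the client guessing, and a blind retry could create a second copy of the record. Here the retry can at worst hit the document that the first attempt already created.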
