Re: Enrolment open: Introduction to CouchDB Development
Hi Joan, I am interested. Best regards, CGS On 12/14/2011 11:13 PM, Joan Touzet wrote: Enrolment for the "Introduction to CouchDB Development" course is now open at http://moodle.wohmart.com/ ! As previously described, this course will get you up to speed in Erlang (40%), the fundamentals of how CouchDB is implemented (40%), and will culminate in a small group project (20%). There will be guest presentations from these illustrious contributors: * Randall Leeds * Bob Dionne * Adam Kocoloski * Dale Harvey * Paul Davis * Benoit Chesneau * Robert Newson * Jan Lehnardt * ...with more to come! The course runs from 2012.1.9 - 2012.3.20 or so, with a commitment of about 4-8 hours per week recommended. Also, be aware that this is an online studio course, meaning it's run entirely online and in the open. It's 100% free (as in beer and freedom). Prerequisites: Strong knowledge of at least one programming language, preferably non-scripted [*] Know how to use CouchDB. This is not a CouchDB course. To enrol: 1. Reply to me privately by email, or via IRC (freenode's #couchdb, wohali), or via Twitter (@wohali). 2. I will give you the enrolment key. 3. Visit http://moodle.wohmart.com/ , select the course and register a new user using the provided enrolment key. These people get priority enrolment until 12-16 since they expressed early interest: Timothy Chen Roman Geber Matt Adams Bryan Green Sean Copenhaver Pete Vander Giessen Clifford Hung Dave Cottlehuber See you there, Joan [*] If your only programming language is JavaScript, prepare to devote more time to the course for the first 4 weeks. Moving to Erlang from JavaScript will take more effort than if you have a background in at least one non-scripting language.
Re: Enrolment open: Introduction to CouchDB Development
Please reply off list, folks. I appreciate the very positive response but I don't want to "pollute" the dev list with this. Cheers, Joan On Wed, Dec 14, 2011 at 08:16:08PM -0600, Nathan Stott wrote: > I'd like to get involved. I use CouchDB all the time and even own the npm > 'couchdb' repository but I have never done any actual coding on the CouchDB > core itself. > > On Wed, Dec 14, 2011 at 4:13 PM, Joan Touzet wrote: > > > Enrolment for the "Introduction to CouchDB Development" course is now > > open at http://moodle.wohmart.com/ ! > > > > As previously described, this course will get you up to speed in Erlang > > (40%), the fundamentals of how CouchDB is implemented (40%), and will > > culminate in a small group project (20%). > > > > There will be guest presentations from these illustrious contributors: > > * Randall Leeds > > * Bob Dionne > > * Adam Kocoloski > > * Dale Harvey > > * Paul Davis > > * Benoit Chesneau > > * Robert Newson > > * Jan Lehnardt > > * ...with more to come! > > > > The course runs from 2012.1.9 - 2012.3.20 or so, with a commitment of > > about 4-8 hours per week recommended. Also, be aware that this is an > > online studio course, meaning it's run entirely online and in the open. > > It's 100% free (as in beer and freedom). > > > > Prerequisites: > > Strong knowledge of at least one programming language, preferably > >non-scripted [*] > > Know how to use CouchDB. This is not a CouchDB course. > > > > To enrol: > > 1. Reply to me privately by email, or via IRC (freenode's > > #couchdb, wohali), or via Twitter (@wohali). > > 2. I will give you the enrolment key. > > 3. Visit http://moodle.wohmart.com/ , select the course and register > > a new user using the provided enrolment key. 
> > > > These people get priority enrolment until 12-16 since they expressed > > early interest: > > Timothy Chen > > Roman Geber > > Matt Adams > > Bryan Green > > Sean Copenhaver > > Pete Vander Giessen > > Clifford Hung > > Dave Cottlehuber > > > > See you there, > > Joan > > > > [*] If your only programming language is JavaScript, prepare to devote > > more time to the course for the first 4 weeks. Moving to Erlang from > > JavaScript will take more effort than if you have a background in at > > least one non-scripting language. > >
Re: Enrolment open: Introduction to CouchDB Development
I would definitely like to participate in this. On Wed, Dec 14, 2011 at 8:16 PM, Nathan Stott wrote: > I'd like to get involved. I use CouchDB all the time and even own the npm > 'couchdb' repository but I have never done any actual coding on the CouchDB > core itself. > > On Wed, Dec 14, 2011 at 4:13 PM, Joan Touzet wrote: > > > Enrolment for the "Introduction to CouchDB Development" course is now > > open at http://moodle.wohmart.com/ ! > > > > As previously described, this course will get you up to speed in Erlang > > (40%), the fundamentals of how CouchDB is implemented (40%), and will > > culminate in a small group project (20%). > > > > There will be guest presentations from these illustrious contributors: > > * Randall Leeds > > * Bob Dionne > > * Adam Kocoloski > > * Dale Harvey > > * Paul Davis > > * Benoit Chesneau > > * Robert Newson > > * Jan Lehnardt > > * ...with more to come! > > > > The course runs from 2012.1.9 - 2012.3.20 or so, with a commitment of > > about 4-8 hours per week recommended. Also, be aware that this is an > > online studio course, meaning it's run entirely online and in the open. > > It's 100% free (as in beer and freedom). > > > > Prerequisites: > > Strong knowledge of at least one programming language, preferably > >non-scripted [*] > > Know how to use CouchDB. This is not a CouchDB course. > > > > To enrol: > > 1. Reply to me privately by email, or via IRC (freenode's > > #couchdb, wohali), or via Twitter (@wohali). > > 2. I will give you the enrolment key. > > 3. Visit http://moodle.wohmart.com/ , select the course and register > > a new user using the provided enrolment key. 
> > > > These people get priority enrolment until 12-16 since they expressed > > early interest: > > Timothy Chen > > Roman Geber > > Matt Adams > > Bryan Green > > Sean Copenhaver > > Pete Vander Giessen > > Clifford Hung > > Dave Cottlehuber > > > > See you there, > > Joan > > > > [*] If your only programming language is JavaScript, prepare to devote > > more time to the course for the first 4 weeks. Moving to Erlang from > > JavaScript will take more effort than if you have a background in at > > least one non-scripting language. > > > -- Robert French Departments of Mathematics and Computer Science Austin Peay State University roberto.fran...@gmail.com (615) 829-6647
[jira] [Updated] (COUCHDB-1363) Race condition edge case when pulling local changes
[ https://issues.apache.org/jira/browse/COUCHDB-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Leeds updated COUCHDB-1363: --- Attachment: 0001-Fix-a-race-condition-starting-replications.patch > Race condition edge case when pulling local changes > --- > > Key: COUCHDB-1363 > URL: https://issues.apache.org/jira/browse/COUCHDB-1363 > Project: CouchDB > Issue Type: Bug > Components: Database Core >Affects Versions: 1.0.3, 1.1.1 >Reporter: Randall Leeds >Priority: Minor > Fix For: 1.2, 1.3 > > Attachments: 0001-Fix-a-race-condition-starting-replications.patch > > > It's necessary to re-open the #db after subscribing to notifications so that > updates are not lost. In practice, this is rarely problematic because the > next change will cause everything to catch up, but if a quick burst of > changes happens while replication is starting the replication can go stale. > Detected by intermittent replicator_db js test failures. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (COUCHDB-1363) Race condition edge case when pulling local changes
[ https://issues.apache.org/jira/browse/COUCHDB-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Leeds reassigned COUCHDB-1363: -- Assignee: Filipe Manana > Race condition edge case when pulling local changes > --- > > Key: COUCHDB-1363 > URL: https://issues.apache.org/jira/browse/COUCHDB-1363 > Project: CouchDB > Issue Type: Bug > Components: Database Core >Affects Versions: 1.0.3, 1.1.1 >Reporter: Randall Leeds >Assignee: Filipe Manana >Priority: Minor > Fix For: 1.2, 1.3 > > Attachments: 0001-Fix-a-race-condition-starting-replications.patch > > > It's necessary to re-open the #db after subscribing to notifications so that > updates are not lost. In practice, this is rarely problematic because the > next change will cause everything to catch up, but if a quick burst of > changes happens while replication is starting the replication can go stale. > Detected by intermittent replicator_db js test failures.
[jira] [Created] (COUCHDB-1363) Race condition edge case when pulling local changes
Race condition edge case when pulling local changes --- Key: COUCHDB-1363 URL: https://issues.apache.org/jira/browse/COUCHDB-1363 Project: CouchDB Issue Type: Bug Components: Database Core Affects Versions: 1.1.1, 1.0.3 Reporter: Randall Leeds Priority: Minor Fix For: 1.2, 1.3 It's necessary to re-open the #db after subscribing to notifications so that updates are not lost. In practice, this is rarely problematic because the next change will cause everything to catch up, but if a quick burst of changes happens while replication is starting the replication can go stale. Detected by intermittent replicator_db js test failures.
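The ordering problem described in this issue can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyDb`, `start_without_reopen` and `start_with_reopen` are hypothetical names invented here, the model is single-threaded Python rather than CouchDB's Erlang code, and it is not the attached patch. It shows why a handle opened before subscribing can miss a burst of writes, and why re-opening after subscribing closes that window.

```python
class ToyDb:
    """Hypothetical stand-in for a database; not CouchDB's real #db record."""

    def __init__(self):
        self.update_seq = 0
        self.listeners = []

    def open(self):
        # A handle is a snapshot: it only reflects updates made before it was opened.
        return self.update_seq

    def write(self):
        self.update_seq += 1
        for notify in self.listeners:
            notify(self.update_seq)


def start_without_reopen(db, burst):
    handle_seq = db.open()                      # 1. open the database
    burst()                                     # writes land before we are subscribed
    notifications = []
    db.listeners.append(notifications.append)   # 2. subscribe to update notifications
    # No notification was delivered for the burst, so handle_seq stays stale
    # until some later write happens to wake the replication up.
    return handle_seq, notifications


def start_with_reopen(db, burst):
    db.open()                                   # 1. open the database
    burst()                                     # same burst in the same window
    notifications = []
    db.listeners.append(notifications.append)   # 2. subscribe to update notifications
    handle_seq = db.open()                      # 3. re-open after subscribing
    return handle_seq, notifications


def three_writes(db):
    return lambda: [db.write() for _ in range(3)]


stale_db, fresh_db = ToyDb(), ToyDb()
stale_seq, _ = start_without_reopen(stale_db, three_writes(stale_db))
fresh_seq, _ = start_with_reopen(fresh_db, three_writes(fresh_db))
print("without re-open:", stale_seq, "of", stale_db.update_seq)
print("with re-open:   ", fresh_seq, "of", fresh_db.update_seq)
```

In the first variant the handle lags the database and nothing prompts it to catch up; in the second, any write is either visible in the re-opened handle or delivered as a notification.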
Re: Enrolment open: Introduction to CouchDB Development
I'd like to get involved. I use CouchDB all the time and even own the npm 'couchdb' repository but I have never done any actual coding on the CouchDB core itself. On Wed, Dec 14, 2011 at 4:13 PM, Joan Touzet wrote: > Enrolment for the "Introduction to CouchDB Development" course is now > open at http://moodle.wohmart.com/ ! > > As previously described, this course will get you up to speed in Erlang > (40%), the fundamentals of how CouchDB is implemented (40%), and will > culminate in a small group project (20%). > > There will be guest presentations from these illustrious contributors: > * Randall Leeds > * Bob Dionne > * Adam Kocoloski > * Dale Harvey > * Paul Davis > * Benoit Chesneau > * Robert Newson > * Jan Lehnardt > * ...with more to come! > > The course runs from 2012.1.9 - 2012.3.20 or so, with a commitment of > about 4-8 hours per week recommended. Also, be aware that this is an > online studio course, meaning it's run entirely online and in the open. > It's 100% free (as in beer and freedom). > > Prerequisites: > Strong knowledge of at least one programming language, preferably >non-scripted [*] > Know how to use CouchDB. This is not a CouchDB course. > > To enrol: > 1. Reply to me privately by email, or via IRC (freenode's > #couchdb, wohali), or via Twitter (@wohali). > 2. I will give you the enrolment key. > 3. Visit http://moodle.wohmart.com/ , select the course and register > a new user using the provided enrolment key. > > These people get priority enrolment until 12-16 since they expressed > early interest: > Timothy Chen > Roman Geber > Matt Adams > Bryan Green > Sean Copenhaver > Pete Vander Giessen > Clifford Hung > Dave Cottlehuber > > See you there, > Joan > > [*] If your only programming language is JavaScript, prepare to devote > more time to the course for the first 4 weeks. Moving to Erlang from > JavaScript will take more effort than if you have a background in at > least one non-scripting language. >
Re: Unique instance IDs?
On Wed, Dec 14, 2011 at 04:13:41PM -0800, Randall Leeds wrote:
> I might argue that these bits at the end are link and network layer
> issues that we don't care about.

On the contrary - until there is a solution in the mainline to deal with NATs and firewalls, you cannot assume that a CouchDB instance can be seen publicly.

A very common use case (for the application I've been working on) is for a desktop machine, behind a NAT, to have continuous 2-way replication with a public server. Imagine 30 or so machines connected to such a central server. 31 databases, 62 ongoing replications, but only one of them has a valid URL. Each desktop will have to pull from and push to the central server; the central server on its *own* cannot access the machines behind the firewalls.

This is not so unusual a case, is it?

> pulling from or pushing to the device. In this case, the db on the
> mobile device can be identified by a bare database name without any
> URL at all. Examples: Pull http://remotecouch/mydb -> mydb; Push mydb
> -> http://remotercouch/mydb. The replicator works like this today.

See above - the same approach (the exact opposite of what you've suggested) is the most workable solution *today*.

> The CouchDB community is being very radical by suggesting that we
> might _serve_ content or address content stored on a mobile device.

Yes! And furthermore, solving how to serve data from behind a NAT or firewall is equally challenging - but doable.

> Given the commitment CouchDB has made to HTTP so far, I hesitate to
> say that the solution to this problem is to subvert URLs.

I'm not saying that's the solution. I'm saying that URLs cannot necessarily identify all participants in a replication scenario reliably, especially given RFC 1918 space, variable host names, mobile platforms and NAT.

> Again, this is getting away from the transitive checkpoint problem,
> which may turn out to obviate the need for identification of databases
> in the first place. Or, as I put it earlier, to focus the problem on
> "what is in this database" rather than "what database this is".

+100 on solving things this way. :)

-Joan
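The hub-and-spoke topology described above can be sketched as a list of replication jobs. This is an illustration only: `desktop_replications`, the `desktop-N` labels and the `central.example` URL are hypothetical names made up here. The point it shows is that every job is initiated by a desktop, and the desktop's own database appears only as a bare name, because the central server cannot reach machines behind a NAT.

```python
def desktop_replications(central_url, db_name, desktop_count):
    """List the replications each NAT'd desktop has to start itself."""
    jobs = []
    for i in range(desktop_count):
        initiator = f"desktop-{i}"  # hypothetical label, not a resolvable host name
        # Pull: remote URL -> bare local database name.
        jobs.append({"initiator": initiator, "source": central_url, "target": db_name})
        # Push: bare local database name -> remote URL.
        jobs.append({"initiator": initiator, "source": db_name, "target": central_url})
    return jobs


jobs = desktop_replications("http://central.example/mydb", "mydb", 3)
for job in jobs:
    print(job["initiator"], job["source"], "->", job["target"])
```

Two jobs per desktop come out of this, and in each job exactly one endpoint is a URL: the central server's.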
Re: Enrolment open: Introduction to CouchDB Development
I am very interested in this course. Please let me know what I can do to enroll. David Pratt w> 801-422-4823 e> david.pr...@byu.edu On Dec 14, 2011, at 3:14 PM, "Joan Touzet" wrote: > Enrolment for the "Introduction to CouchDB Development" course is now > open at http://moodle.wohmart.com/ ! > > As previously described, this course will get you up to speed in Erlang > (40%), the fundamentals of how CouchDB is implemented (40%), and will > culminate in a small group project (20%). > > There will be guest presentations from these illustrious contributors: > * Randall Leeds > * Bob Dionne > * Adam Kocoloski > * Dale Harvey > * Paul Davis > * Benoit Chesneau > * Robert Newson > * Jan Lehnardt > * ...with more to come! > > The course runs from 2012.1.9 - 2012.3.20 or so, with a commitment of > about 4-8 hours per week recommended. Also, be aware that this is an > online studio course, meaning it's run entirely online and in the open. > It's 100% free (as in beer and freedom). > > Prerequisites: > Strong knowledge of at least one programming language, preferably > non-scripted [*] > Know how to use CouchDB. This is not a CouchDB course. > > To enrol: > 1. Reply to me privately by email, or via IRC (freenode's >#couchdb, wohali), or via Twitter (@wohali). > 2. I will give you the enrolment key. > 3. Visit http://moodle.wohmart.com/ , select the course and register >a new user using the provided enrolment key. > > These people get priority enrolment until 12-16 since they expressed > early interest: > Timothy Chen > Roman Geber > Matt Adams > Bryan Green > Sean Copenhaver > Pete Vander Giessen > Clifford Hung > Dave Cottlehuber > > See you there, > Joan > > [*] If your only programming language is JavaScript, prepare to devote > more time to the course for the first 4 weeks. Moving to Erlang from > JavaScript will take more effort than if you have a background in at > least one non-scripting language.
Re: Unique instance IDs?
On Wed, Dec 14, 2011 at 10:52, Alex Besogonov wrote:
> On Wed, Dec 14, 2011 at 3:55 AM, Randall Leeds wrote:
>> I think you miss the point that was made above about mirrors, still,
>> unless I misunderstand. B may have other changes interleaved with
>> those received from A, whether from interactive updates or other
>> replications, making its hashes different.
> Of course. But that's not a problem, because we save all of A's
> changeset hashes that we've seen during the replication. B's resulting
> hash would be different, but we don't care about it.
>
> Also, since merging is commutative and associative we can reorder changesets
> in any way, so interleaving changes in itself should be OK.

I might need you to restate your solution for me to understand. If the hash tree isn't of the sequence or id index, then I'm not seeing what this applies to except the rev tree of a single document.

Documents do already have a sort of hash, as you identified, in their revision id. Comparing the presence of these on the client and server is already part of the replication protocol. However, since CouchDB is _not_ a versioned document store ("_rev is only for MVCC"), there's no need to optimize the problem of diffing the revs present. Only the newest revs need ever be replicated.

The sequence number checkpointing is an optimization to avoid comparing the revs for all documents. I think progress looks like finding a way to skip large chunks of the seq index because they contain changes already received, possibly from elsewhere.

So I'm not sure what your solution proposes. Can you go further?

-Randall
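The role of the sequence checkpoint described above can be sketched as follows. This is a simplified model, not CouchDB's replicator: `missing_since`, the tuple layout and the sample revision ids are assumptions made for illustration. It shows that the checkpoint only reduces how many revisions get compared; the set of revisions that actually need to be sent is the same either way.

```python
def missing_since(changes, checkpoint, target_revs):
    """Return (revisions the target lacks, number of revisions compared).

    changes:     list of (seq, doc_id, rev) tuples from the source, in seq order
    checkpoint:  last source seq already known to be replicated
    target_revs: dict mapping doc_id to the set of revs present on the target
    """
    missing = []
    compared = 0
    for seq, doc_id, rev in changes:
        if seq <= checkpoint:
            continue  # skipped without any comparison, thanks to the checkpoint
        compared += 1
        if rev not in target_revs.get(doc_id, set()):
            missing.append((doc_id, rev))
    return missing, compared


changes = [(2, "b", "1-y"), (3, "a", "2-z"), (4, "c", "1-w")]
# "c" already reached the target by some other route.
target_revs = {"b": {"1-y"}, "c": {"1-w"}}

with_checkpoint = missing_since(changes, 2, target_revs)
without_checkpoint = missing_since(changes, 0, target_revs)
print(with_checkpoint)
print(without_checkpoint)
```

Document "c" is the case raised at the end of the message: it sits after the checkpoint, so it is still compared even though the target already has it from elsewhere.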
Re: Unique instance IDs?
On Wed, Dec 14, 2011 at 14:52, Joan Touzet wrote:
> -1 on using URI/URLs, for the simple fact that mobile and desktop
> devices often don't have a stable hostname and/or IP address. This is a
> huge area where CouchDB is used, increasingly so, and attempting to tie
> a DB UUID to something inherently variable on the platform is doomed to
> fail.
>
> Renaming my PC or phone, getting a new DHCP address, connecting to a
> different network or changing the MAC address of my NIC should not
> invalidate my DBs, their "UUIDs", or cause unreasonable problems for
> replication.
>
> -Joan

I might argue that these bits at the end are link and network layer issues that we don't care about. As far as the Web is concerned, the URL is the address and it's more than just convenience and readability that separates that from an IP address. URLs are foundational to resource identification on the Web, and I'm really hesitant to "work around" that (nevertheless I've been dreaming up and reading all kinds of ways to do just this these days, and it's pretty hard). I definitely don't mean to condescendingly suggest you don't know this already; I'm just restating the basic facts.

Take, for example, the mobile use case. Most people, I'd submit, want to push from and pull data to a mobile device. Given that the device doesn't have a stable address (neither in IP nor URL space), most would punt on the problem of serving from the mobile device, i.e. pulling from or pushing to the device. In this case, the db on the mobile device can be identified by a bare database name without any URL at all. Examples: Pull http://remotecouch/mydb -> mydb; Push mydb -> http://remotercouch/mydb. The replicator works like this today.

I think it's generally accepted that URLs don't point at the same device all the time. In practice, obviously, they very frequently "point at many devices" in that reverse proxies are used all over the Web for load balancing.

I might say it's out of scope for CouchDB to worry about tying a stable URL to a mobile device. For the ops person in the datacenter the story right now is clear: if you want to copy your database, you should probably also copy the hostname over to the new box or replication starts over.

The CouchDB community is being very radical by suggesting that we might _serve_ content or address content stored on a mobile device. Given the commitment CouchDB has made to HTTP so far, I hesitate to say that the solution to this problem is to subvert URLs.

Again, this is getting away from the transitive checkpoint problem, which may turn out to obviate the need for identification of databases in the first place. Or, as I put it earlier, to focus the problem on "what is in this database" rather than "what database this is".

-Randall
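The pull and push examples in this message correspond to the JSON body that is POSTed to the local instance's `/_replicate` endpoint, with `source` and `target` fields. The sketch below only builds that body; `replication_request` is a hypothetical helper name, and no HTTP request is made.

```python
import json


def replication_request(source, target, continuous=False):
    """Build the JSON body for a replication request (sketch only, nothing is sent)."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True
    return json.dumps(body, sort_keys=True)


# Pull: remote URL as source, bare local database name as target.
pull = replication_request("http://remotecouch/mydb", "mydb")
# Push: bare local database name as source, remote URL as target.
push = replication_request("mydb", "http://remotecouch/mydb")

print(pull)
print(push)
```

In both directions the device holding `mydb` initiates the request, so it never needs an address of its own.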
Re: Unique instance IDs?
I think my point is, if URLs don't work, nothing will. There's no free lunch. But if an optimization surfaces, I will happily stand corrected. On Thu, Dec 15, 2011 at 5:52 AM, Joan Touzet wrote: > -1 on using URI/URLs, for the simple fact that mobile and desktop > devices often don't have a stable hostname and/or IP address. This is a > huge area where CouchDB is used, increasingly so, and attempting to tie > a DB UUID to something inherently variable on the platform is doomed to > fail. > > Renaming my PC or phone, getting a new DHCP address, connecting to a > different network or changing the MAC address of my NIC should not > invalidate my DBs, their "UUIDs", or cause unreasonable problems for > replication. > > -Joan -- Iris Couch
Re: Unique instance IDs?
-1 on using URI/URLs, for the simple fact that mobile and desktop devices often don't have a stable hostname and/or IP address. This is a huge area where CouchDB is used, increasingly so, and attempting to tie a DB UUID to something inherently variable on the platform is doomed to fail. Renaming my PC or phone, getting a new DHCP address, connecting to a different network or changing the MAC address of my NIC should not invalidate my DBs, their "UUIDs", or cause unreasonable problems for replication. -Joan
Enrolment open: Introduction to CouchDB Development
Enrolment for the "Introduction to CouchDB Development" course is now open at http://moodle.wohmart.com/ ! As previously described, this course will get you up to speed in Erlang (40%), the fundamentals of how CouchDB is implemented (40%), and will culminate in a small group project (20%). There will be guest presentations from these illustrious contributors: * Randall Leeds * Bob Dionne * Adam Kocoloski * Dale Harvey * Paul Davis * Benoit Chesneau * Robert Newson * Jan Lehnardt * ...with more to come! The course runs from 2012.1.9 - 2012.3.20 or so, with a commitment of about 4-8 hours per week recommended. Also, be aware that this is an online studio course, meaning it's run entirely online and in the open. It's 100% free (as in beer and freedom). Prerequisites: Strong knowledge of at least one programming language, preferably non-scripted [*] Know how to use CouchDB. This is not a CouchDB course. To enrol: 1. Reply to me privately by email, or via IRC (freenode's #couchdb, wohali), or via Twitter (@wohali). 2. I will give you the enrolment key. 3. Visit http://moodle.wohmart.com/ , select the course and register a new user using the provided enrolment key. These people get priority enrolment until 12-16 since they expressed early interest: Timothy Chen Roman Geber Matt Adams Bryan Green Sean Copenhaver Pete Vander Giessen Clifford Hung Dave Cottlehuber See you there, Joan [*] If your only programming language is JavaScript, prepare to devote more time to the course for the first 4 weeks. Moving to Erlang from JavaScript will take more effort than if you have a background in at least one non-scripting language.
Re: Unique instance IDs?
On Wed, Dec 14, 2011 at 3:55 AM, Randall Leeds wrote:
> I think you miss the point that was made above about mirrors, still,
> unless I misunderstand. B may have other changes interleaved with
> those received from A, whether from interactive updates or other
> replications, making its hashes different.

Of course. But that's not a problem, because we save all of A's changeset hashes that we've seen during the replication. B's resulting hash would be different, but we don't care about it.

Also, since merging is commutative and associative we can reorder changesets in any way, so interleaving changes in itself should be OK.
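The claim that changesets can be reordered can be checked on a toy model. The sketch assumes, purely for illustration, that a database state maps document ids to sets of revision ids and that merging a changeset is a set union; real revision-tree merging in CouchDB is more involved, and `merge` and `apply_all` are hypothetical names. Set union is commutative and associative, so every application order yields the same state.

```python
from itertools import permutations


def merge(state, changeset):
    """Merge a changeset into a state; both map doc_id -> set of revision ids."""
    merged = {doc: set(revs) for doc, revs in state.items()}
    for doc, revs in changeset.items():
        merged.setdefault(doc, set()).update(revs)
    return merged


def apply_all(changesets):
    state = {}
    for changeset in changesets:
        state = merge(state, changeset)
    return state


from_a_1 = {"doc1": {"1-aaa"}}
from_a_2 = {"doc1": {"2-bbb"}, "doc2": {"1-ccc"}}
local_b = {"doc3": {"1-ddd"}}  # an interactive update made directly on B

results = [apply_all(order) for order in permutations([from_a_1, from_a_2, local_b])]
print(all(result == results[0] for result in results))
```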
Re: Unique instance IDs?
On Tue, Dec 13, 2011 at 20:08, Alex Besogonov wrote:
> On Mon, Dec 12, 2011 at 10:26 PM, Paul Davis wrote:
>>> * Merkle trees are great for two-way synchronization, but it's not
>>> immediately clear to me how you'd use them to bootstrap a single source ->
>>> target replication. I might just be missing a straightforward extension of
>>> the tech here.
>> This is the point that's important with checksums and so on. Merkle
>> trees are great when you want to mirror structured data but CouchDB
>> replication is not a mirror operation. Think, N db's replicating to a
>> central DB. You have a mixture of things which breaks checksums (or at
>> least any obvious application I can think of given our internal
>> structures)
> Uhm. What are the things that break checksums? Right now revision IDs
> are _almost_ deterministic and it's not that hard to make them completely
> deterministic. And for replication purposes nothing else matters.
>
> To be exact: the only entity used for ID generation is the
> '[Deleted, OldStart, OldRev, Body, Atts2]' tuple and only the 'Atts2'
> field can be non-deterministic. And that can be fixed (with other minor
> forward-looking features like explicit versioning).
>
> Then it's easy to devise a protocol to replicate based on hash trees.
> I'm thinking about this protocol:
>
> 1) The current state of the replicated database is identified by a hash.
> Suppose that we have unidirectional replication A->B.
>
> Let's denote the state of the initial database A as A1 and B's as B1.
>
> We store the ancestry as a list of hashes outside the database (so it
> doesn't influence the hash of the database).
>
> 2) As the first step B sends its list of replication ancestry.
>
> It's actually not even required to send the whole hashes each time,
> just send the first 4 bytes of each hash. That way even 1 million
> records of replication history would take only 4 MB. The 'A' server
> then replies with its own set of hashes with the matching initial
> bytes. If there are none, then the client falls back to the usual
> replication.
>
> So at this step 'B' knows the most recent common ancestor and requests
> the changes that have happened since that point of time. Each
> changeset, naturally, has its own hash.
>
> 3) After these changes are applied and merged, B's state is the A1
> state plus all the B's changes that might have happened ever since.
> Then B stores the hashes of the changesets that have been applied.
>
> That's it. Should work, as far as I see (it's 3am local time, so I
> might miss something).
>
> Overhead: 16 bytes for the hash information for each changeset.

I think you miss the point that was made above about mirrors, still, unless I misunderstand. B may have other changes interleaved with those received from A, whether from interactive updates or other replications, making its hashes different.
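Step 2 of the quoted proposal, the exchange of 4-byte hash prefixes, can be sketched as below. Everything here is an assumption made for illustration: the function names are hypothetical, a 16-byte BLAKE2b digest stands in for whatever hash the proposal would use (chosen only to match the quoted 16 bytes per changeset), and ancestry lists are taken to be ordered oldest first. It is a toy model of the idea, not an existing CouchDB protocol.

```python
import hashlib

PREFIX_LEN = 4  # bytes of each hash that B sends, as in the proposal


def changeset_hash(payload: bytes) -> bytes:
    # 16-byte digest; the hash function itself is an assumption.
    return hashlib.blake2b(payload, digest_size=16).digest()


def prefixes(ancestry):
    """What B sends: only the first PREFIX_LEN bytes of each hash."""
    return [h[:PREFIX_LEN] for h in ancestry]


def matching_hashes(source_ancestry, target_prefixes):
    """What A replies with: its full hashes whose prefix B mentioned."""
    wanted = set(target_prefixes)
    return [h for h in source_ancestry if h[:PREFIX_LEN] in wanted]


def latest_common_ancestor(source_ancestry, target_ancestry):
    candidates = matching_hashes(source_ancestry, prefixes(target_ancestry))
    known = set(target_ancestry)
    for h in reversed(candidates):  # newest candidate first
        if h in known:              # confirm on the full hash, not just the prefix
            return h
    return None                     # no match: fall back to the usual replication


def changes_since(source_ancestry, ancestor):
    if ancestor is None:
        return list(source_ancestry)
    return source_ancestry[source_ancestry.index(ancestor) + 1:]


a_history = [changeset_hash(b"changeset-%d" % i) for i in range(5)]
b_history = a_history[:3] + [changeset_hash(b"local-edit-on-b")]

ancestor = latest_common_ancestor(a_history, b_history)
to_send = changes_since(a_history, ancestor)
prefix_bytes_for_million = 1_000_000 * PREFIX_LEN  # 4,000,000 bytes, about 4 MB
print(len(to_send), prefix_bytes_for_million)
```

B's local changeset sits in its history without affecting the match, since only hashes that both sides have recorded are compared.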