Hi guys!
Having schema-less documents as in CouchDB opens up a lot of
cool things, as we all know. You can, for example, store all sorts of related
data in one document, and different documents can store different
amounts and types of data.
In theory this is all great, but in reality I have had a lot of
problems when:
1. I want to make a small change to a document. Then I have to load ALL
its data (which for big documents makes for a huge overhead) so I can
store back the complete document with its change.
2. When several processes want to perform small updates on the same
document I get a lot of conflict errors.
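To illustrate point 2, here is a toy in-memory model of whole-document revision checking. It is only a sketch: the `WholeDocStore` class and its integer revision counter are my own assumptions, not CouchDB code (the real server answers a write based on a stale revision with a 409 Conflict).

```python
class Conflict(Exception):
    """Raised when a write is based on a stale revision."""


class WholeDocStore:
    """Toy model: one revision counter per document (an assumption)."""

    def __init__(self):
        self.docs = {}  # doc id -> (revision counter, document body)

    def get(self, doc_id):
        rev, body = self.docs[doc_id]
        return rev, dict(body)  # shallow copy so callers edit their own dict

    def put(self, doc_id, base_rev, body):
        current_rev = self.docs.get(doc_id, (0, None))[0]
        if base_rev != current_rev:
            raise Conflict(f"{doc_id}: at revision {current_rev}, got {base_rev}")
        self.docs[doc_id] = (current_rev + 1, body)
        return current_rev + 1


store = WholeDocStore()
store.put("foo", 0, {"pizzas_eaten": 15, "data": {"days": [1, 2, 3, 4, 5]}})

# Both processes load the complete document at the same revision.
rev_a, doc_a = store.get("foo")
rev_b, doc_b = store.get("foo")

doc_a["pizzas_eaten"] = 20
new_rev = store.put("foo", rev_a, doc_a)  # the first writer succeeds

doc_b["data"] = {"days": [1, 2, 3, 4, 5, 6]}
try:
    store.put("foo", rev_b, doc_b)
    second_write = "ok"
except Conflict:
    second_write = "conflict"  # a different field, but still rejected
```

The second process never touched pizzas_eaten, yet its write is refused because the check covers the whole document.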
In practice this has led me to store my data in numerous smaller
documents and store their relationships as parameters holding the ID
of the parent object.
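Roughly, the workaround looks like the sketch below. The document IDs and the `type` and `parent_id` field names are made up for illustration, and `children_of` is a hypothetical helper standing in for whatever view or query you would really use.

```python
# Hypothetical layout: every ID and field name here is made up for illustration.
docs = [
    {"_id": "stable-1", "type": "stable", "pizzas_eaten": 15},
    {"_id": "horse-1", "type": "horse", "parent_id": "stable-1", "name": "kaspar"},
    {"_id": "horse-2", "type": "horse", "parent_id": "stable-1", "name": "greg"},
]


def children_of(parent_id, all_docs):
    """Return the documents that point at parent_id."""
    return [d for d in all_docs if d.get("parent_id") == parent_id]


# Updating one horse now only touches its own small document,
# but reading the whole object means collecting the children again.
horse_names = [d["name"] for d in children_of("stable-1", docs)]
```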
If partial updating could be implemented it would solve all this! I
have no idea how hard this would be to implement for you guys, but
from my side I would like it to work something like this:
We have the following document stored on the server:
{
  _id: "foo",
  revision: "123",
  data: {
    days: [1,2,3,4,5],
    horses: [{
      name: "kaspar",
      races_won: 10
    },
    {
      name: "greg",
      races_won: 0
    }]
  },
  pizzas_eaten: 15
};
We could have two processes working on the document:
Process 1 changes the number of pizzas eaten by sending back the id of
the document it wants to change and the current revision it is at
along with the changed data like this:
PUT {
  _id: "foo",
  revision: "123",
  _update: {
    pizzas_eaten: 20
  }
}
and gets back the new revision number 234
Process 2, which is still at revision 123, can change the values of
data.days without getting any conflicts by PUTing the following data:
PUT {
  _id: "foo",
  revision: "123",
  _update: {
    "data.days": [1,2,3,4,5,6]
  }
}
and gets back the new revision number 345
Now if Process 1 tries to update the data.days parameter like this:
PUT {
  _id: "foo",
  revision: "234",
  _update: {
    "data.days": [1,2,3,4,5,6,7,8,9,0]
  }
}
it will get a conflict error because the data.days value has been
changed since revision 234 by the other process (the value of
data.days is now at the newer revision 345).
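To make the idea concrete, here is a rough sketch of how the per-value conflict check could work. Everything in it is an assumption on my part: the `PartialStore` class, the dotted-path keys, and the use of a plain integer counter for revisions (so the numbers come out as 124 and 125 instead of 234 and 345).

```python
class Conflict(Exception):
    """Raised when a value changed after the caller's base revision."""


def set_path(doc, path, value):
    """Set a dotted path such as "data.days" inside nested dicts."""
    *parents, leaf = path.split(".")
    for key in parents:
        doc = doc.setdefault(key, {})
    doc[leaf] = value


def overlaps(a, b):
    """True if two dotted paths are equal or one contains the other."""
    a_parts, b_parts = a.split("."), b.split(".")
    n = min(len(a_parts), len(b_parts))
    return a_parts[:n] == b_parts[:n]


class PartialStore:
    """Toy model of one document with per-value revision tracking."""

    def __init__(self, doc, revision):
        self.doc = doc
        self.revision = revision  # assumed to be an increasing integer
        self.changed_at = {}      # dotted path -> revision that last wrote it

    def update(self, base_rev, changes):
        # Check every path first so a rejected update changes nothing.
        for path in changes:
            for seen, rev in self.changed_at.items():
                if overlaps(path, seen) and rev > base_rev:
                    raise Conflict(f"{path} changed in revision {rev}")
        self.revision += 1
        for path, value in changes.items():
            set_path(self.doc, path, value)
            self.changed_at[path] = self.revision
        return self.revision


store = PartialStore({"data": {"days": [1, 2, 3, 4, 5]}, "pizzas_eaten": 15}, 123)
rev_p1 = store.update(123, {"pizzas_eaten": 20})              # Process 1
rev_p2 = store.update(123, {"data.days": [1, 2, 3, 4, 5, 6]})  # Process 2, no conflict
try:
    store.update(rev_p1, {"data.days": [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]})
    outcome = "ok"
except Conflict:
    outcome = "conflict"  # data.days was written after Process 1's revision
```

The `overlaps` check also treats an update to `data` as touching `data.days`, which seemed the safest choice for nested values.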
You could add new parameters as well:
PUT {
  _id: "foo",
  revision: "234",
  _update: {
    pizzas_eaten_on_avarage_a_day: 0.01
  }
}
Updating a value that doesn't exist could add it.
You could also remove/delete values and rearrange documents:
PUT {
  _id: "foo",
  revision: "456",
  _update: {
    pizzas: {
      eaten: 20,
      daily_avarage: 0.01
    }
  },
  _remove: {
    pizzas_eaten_on_avarage_a_day,
    pizzas_eaten
  }
}
The document would now look like this:
{
  _id: "foo",
  revision: "567",
  data: {
    days: [1,2,3,4,5,6],
    horses: [{
      name: "kaspar",
      races_won: 10
    },
    {
      name: "greg",
      races_won: 0
    }]
  },
  pizzas: {
    eaten: 20,
    daily_avarage: 0.01
  }
};
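Applying such a request to the stored document could be as simple as the sketch below. The `apply_patch` function is hypothetical, and I am assuming `_update` is a mapping of (possibly dotted) paths to new values and `_remove` is a list of paths. Revision handling is left out.

```python
import copy


def apply_patch(doc, update=None, remove=()):
    """Return a copy of doc with the updates applied and the removals dropped."""
    result = copy.deepcopy(doc)
    for path, value in (update or {}).items():
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            target = target.setdefault(key, {})  # create missing parents
        target[leaf] = value
    for path in remove:
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            target = target.get(key, {})
        target.pop(leaf, None)  # removing a missing value is not an error
    return result


before = {
    "_id": "foo",
    "data": {
        "days": [1, 2, 3, 4, 5, 6],
        "horses": [
            {"name": "kaspar", "races_won": 10},
            {"name": "greg", "races_won": 0},
        ],
    },
    "pizzas_eaten": 20,
    "pizzas_eaten_on_avarage_a_day": 0.01,
}

after = apply_patch(
    before,
    update={"pizzas": {"eaten": 20, "daily_avarage": 0.01}},
    remove=["pizzas_eaten_on_avarage_a_day", "pizzas_eaten"],
)
```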
The database server would have to keep track of the revision at which
each value was last changed, though... that might be cumbersome...
It would greatly improve CouchDB's usability in my case though!
Let me know what you think!
Best regards
Sebastian