Hi all, I've been thinking a little about defining properties in DataMapper, particularly relating to how they work within repository blocks. I've posted some ideas to my site: http://antw.me/thoughts/datamapper-property-api.html
If you prefer to read them here instead (sans links and syntax highlighting), I've included the full message below. I'd love to get some feedback; let me know what you think...

Anthony.

TRANSCRIPT:

In my own dm-core fork on Github I've recently been experimenting with ways to trim down both Resource and Model, extracting specific functionality out to separate classes and modules. I've started by relieving Resource of the need to take care of attributes, creating an AttributeSet class which holds all of a Resource's attributes, tracks when they've been updated (marking them as dirty), and lazy-loads attributes when needed.

Although not yet pushed to Github, my latest commits pass the full dm-core spec suite against SQLite3 and PostgreSQL, but have 3-4 failures with the InMemory and Yaml adapters. I eventually tracked this down to an example where a Property is defined on a Model within the context of a specific repository. For example:

    require 'dm-core'

    DataMapper.setup(:default, 'in_memory://localhost/one')
    DataMapper.setup(:second, 'in_memory://localhost/two')

    class Person
      include DataMapper::Resource

      property :id, Serial
      property :name, String

      repository(:second) do
        property :external_id, Integer
      end
    end

This creates a Person model with two attributes, `id` and `name`, and a third `external_id` attribute which applies only when using the model within the `:second` repository context:

    DataMapper.repository(:second) do
      Person.create(:name => 'Michael Scarn', :external_id => 1)
    end

I then realised my AttributeSet implementation didn't account for the properties of a Model changing depending on the current repository context. That led me to think a little more about the purpose, and usefulness, of being able to define models in this way.

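To make the mismatch concrete, here is a rough sketch in plain Ruby with no dm-core dependency. Only the name AttributeSet comes from the description above; the methods (`set`, `get`, `known?`, `dirty_names`, `mark_clean!`) and the `PROPERTIES` table are assumptions made for illustration and are not the fork's actual code. It shows how a set built from one fixed property list cannot see a property that exists only in another repository context.

```ruby
# Hypothetical sketch: not dm-core code. Holds a resource's attribute
# values and remembers which ones changed since the last save.
class AttributeSet
  def initialize(property_names)
    @property_names = property_names # fixed when the set is built
    @values = {}
    @dirty  = {}
  end

  def known?(name)
    @property_names.include?(name)
  end

  def set(name, value)
    raise ArgumentError, "unknown attribute: #{name}" unless known?(name)
    # Re-assigning the same value does not mark the attribute dirty.
    return value if @values.key?(name) && @values[name] == value

    @dirty[name] = true
    @values[name] = value
  end

  def get(name)
    @values[name]
  end

  def dirty?
    !@dirty.empty?
  end

  def dirty_names
    @dirty.keys
  end

  def mark_clean!
    @dirty.clear
  end
end

# Assumed property lists per repository, mirroring the Person example.
PROPERTIES = {
  :default => [:id, :name],
  :second  => [:id, :name, :external_id]
}

attrs = AttributeSet.new(PROPERTIES[:default])
attrs.set(:name, 'Michael Scarn')
attrs.dirty_names          # => [:name]
attrs.known?(:external_id) # => false, although :second defines it
```

A set built this way would have to be rebuilt, or consult the current context on every access, to follow a model whose properties change between repositories.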
I'd like to suggest, perhaps a little presumptuously, that this functionality isn't as nice as it first seems, provide an alternative means for achieving the same result, and elaborate on how I think such repository blocks should work.

### Inconsistent instance API

Allowing a user to wrap properties in a repository block results in a model changing its behaviour depending on external state (the current repository context). At one moment the resource has an `external_id` attribute, and in the next the attribute seems to disappear.

    DataMapper.repository(:second) do
      Person.new(:external_id => 1)
    end
    # => #<Person @id=nil @name=nil>

Wait... where did the `external_id` attribute go? In fact the attribute was set; it just doesn't appear since `Person#inspect` was called outside of the repository block...

    DataMapper.repository(:second) do
      puts Person.new(:external_id => 1).inspect
    end
    # => #<Person @id=nil @name=nil @external_id=1>

Aha! There it is. Trying to set the `external_id` attribute outside of the `:second` repository context will also (rightly) fail.

### Ambiguity as to where a resource is saved

DataMapper's repository context allows you to save any Resource to any defined repository (provided they support the same features).

    person = Person.new(:name => 'Samuel L. Chang')

    # Now that I have my resource, I can save it to wherever I
    # want... By default, the resource will be saved in the
    # :default repository
    person.save

    # Alternatively, I can specify a different repository...
    DataMapper.repository(:second) do
      person.save
    end

While this is an interesting feature, I'm struggling to come up with a reason why you'd _want_ to do this. To me it just introduces ambiguity:

> Erm, where did I save that person instance? I'm sure it's around here
> somewhere... Where are you, little person instance? Peekaboo!
>
> <cite>Me, a year later.</cite>

In reality, so long as you're explicit about wrapping parts of your application in the correct repository blocks, this is not a problem. But wherever you have to be explicit there is the possibility that someone will forget; forgetting _just once_ might be enough to cause obscure bugs.

If the `external_id` attribute were set to disallow nil, the second call to `person.save` in the above example would fail, since no value was set. (In fact, the above example would fail anyway, since the first call to `person.save` would mark the resource as clean, thus the second call would do nothing.)

## A better way?

I'm of the belief that each model should be associated with one, and only one, repository. This would be the `:default` repository, except where a user explicitly declares otherwise when setting up their model.

A `Person` would be associated with the default repository _always_, regardless of the current repository context. In the example below, the person would be persisted to the default repository even though the call to `save` is wrapped in another repository block.

    person = Person.new(:name => 'Michael Scarn')

    DataMapper.repository(:second) do
      person.save
    end

DataMapper could provide a method for changing the default repository:

    class Person
      include DataMapper::Resource

      # Tells DM that the Person model should be persisted
      # to the :second repository.
      set_repository :second

      property :id, Serial
      property :name, String
    end

By doing this, users would never need to worry about repository context outside of their models, making their domain objects much more straightforward.

As far as I'm concerned, `Person` and `repo(:second) { Person }` are two different models, with different interfaces and different properties, stored in different repositories. The second Person should probably be represented as another model, distinct from the first.

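The one-repository-per-model rule can be sketched in plain Ruby without dm-core. Everything here is hypothetical: `set_repository` is only the macro proposed above, while `STORES`, `Context`, `PinnedModel`, and the `Contractor` class are stand-ins invented for the illustration. The point is that `save` consults the model's own setting and never the surrounding context.

```ruby
# Hypothetical sketch in plain Ruby; none of this is dm-core code.
# STORES stands in for the configured repositories.
STORES = Hash.new { |stores, name| stores[name] = [] }

# Stand-in for DataMapper.repository(:name) { ... }.
module Context
  @stack = [:default]

  def self.current
    @stack.last
  end

  def self.with(name)
    @stack.push(name)
    yield
  ensure
    @stack.pop
  end
end

module PinnedModel
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Proposed macro: pin the model to a single repository.
    def set_repository(name)
      @repository_name = name
    end

    def repository_name
      @repository_name || :default
    end
  end

  # Deliberately never consults Context.current.
  def save
    STORES[self.class.repository_name] << self
    true
  end
end

class Person
  include PinnedModel
end

class Contractor
  include PinnedModel
  set_repository :second
end

Context.with(:second) { Person.new.save } # still lands in :default
Contractor.new.save

STORES[:default].size # => 1
STORES[:second].size  # => 1
```

Keeping the repository name on the class means a caller cannot redirect a save by accident, which is the ambiguity the proposal is trying to remove.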
Since DataMapper doesn't conflate class inheritance with Single Table Inheritance, we could use inheritance to achieve the same effect as the current API:

    class Person
      include DataMapper::Resource

      property :id, Serial
      property :name, String
    end

    # Inherits properties from Person, but adds its own
    # custom properties, and persists to another repo.
    class HRPerson < Person
      set_repository :second
      property :external_id, Integer
    end

### Problems with this approach...

`Model#copy` would break. Well... it wouldn't just break. The entire concept of copying resources across repositories would become redundant.

## An alternative meaning for repository blocks

By doing away with the current meaning of repository blocks within model instances, we free up the API to do something I think is much more interesting: models which persist _across_ multiple repositories.

Let's take a (slightly contrived) example...

    DataMapper.setup(:default, 'yaml://localhost/main')
    DataMapper.setup(:human_resources, 'yaml://localhost/hr')

    class Employee
      include DataMapper::Resource

      property :id, Serial
      property :name, String
      property :username, String
      property :password, String

      repository(:human_resources) do
        property :salary, Integer
        property :pay_on, Date
      end
    end

Our employee model has six properties: `name`, `username`, and `password` will be persisted to the default repository, while `salary` and `pay_on` will be persisted to the human resources repository. `id`, since it is a key, is used in _both_.

Let's create an employee...

    Employee.create(
      :name => 'Michael Scarn',
      :username => 'mscarn',
      :password => '12345',
      :salary => 2000,
      :pay_on => Date.today
    )

Here's what would happen "under the hood":

1. We assume that the key is generated by the model's default repository. In the absence of a `set_repository` statement, DataMapper assumes `:default`.
2. DataMapper then saves the resource to the default repository.
   In this example it persists the name, username, and password, and returns the ID which was generated.
3. It then proceeds to persist the `salary` and `pay_on` attributes to the human resources repository with the ID returned by the default repo.

Our storage ends up looking a little like this:

    # default/employees.yaml
    - id: 95143
      name: "Michael Scarn"
      username: "mscarn"
      password: "12345"

    # hr/employees.yaml
    - id: 95143
      salary: 2000
      pay_on: 2010-02-01

### Lazy loading from multiple repositories

Loading a resource without specifying which fields you want to load would work in a way similar to lazy loading.

    employee = Employee.get(95143)

This loads the Employee with `id`, `name`, `username`, and `password` from the default repository. Calling `employee.salary` would load all of the attributes which belong to the human resources repository.

    employee = DataMapper.repository(:human_resources) do
      Employee.get(95143)
    end

    # ... or ...

    employee = Employee.get(95143, :repository => :human_resources)

This loads the Employee with `id`, `salary`, and `pay_on` from the human resources repository. Calling `employee.name` would load all of the attributes which belong to the default repository.

### Finishing up

I think this behaviour has a lot of potential. In many web applications developers have made the compromise of denormalising data in order to improve performance. DataMapper could instead provide an API to store these denormalised "cache" attributes in a fast key/value store.

    class Journey
      include DataMapper::Resource

      property :id, Serial
      property :start_at, String
      property :end_at, String

      repository(:redis) do
        property :really_expensive_computation, String
      end
    end
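The three-step save and the lazy loading described for the Employee example can be sketched in plain Ruby. This is an illustration under stated assumptions rather than a proposed implementation: `REPOS`, `OWNERS`, `create_employee`, and `LazyEmployee` are all hypothetical names, each repository is just an in-memory hash, and the key is a simple counter.

```ruby
# Hypothetical sketch in plain Ruby; none of this is dm-core code.
# Each "repository" is a hash of id => stored attributes.
REPOS = { :default => {}, :human_resources => {} }

# Which repository owns which property (the key is shared by both).
OWNERS = {
  :name => :default, :username => :default, :password => :default,
  :salary => :human_resources, :pay_on => :human_resources
}

def create_employee(attributes)
  # Split the attributes by owning repository.
  parts = Hash.new { |hash, repo| hash[repo] = {} }
  attributes.each { |name, value| parts[OWNERS.fetch(name)][name] = value }

  # Steps 1 and 2: the default repository issues the key and stores its share.
  id = REPOS[:default].size + 1
  REPOS[:default][id] = parts[:default]

  # Step 3: the remaining repositories reuse the same key.
  (parts.keys - [:default]).each { |repo| REPOS[repo][id] = parts[repo] }
  id
end

class LazyEmployee
  attr_reader :id, :loaded_from

  def initialize(id, repo = :default)
    @id = id
    @attributes = {}
    @loaded_from = []
    load_from(repo)
  end

  # Reading an attribute pulls in its whole repository on first use.
  def [](name)
    repo = OWNERS.fetch(name)
    load_from(repo) unless @loaded_from.include?(repo)
    @attributes[name]
  end

  private

  def load_from(repo)
    @attributes.merge!(REPOS[repo].fetch(@id, {}))
    @loaded_from << repo
  end
end

id = create_employee(
  :name => 'Michael Scarn', :username => 'mscarn',
  :password => '12345', :salary => 2000
)

employee = LazyEmployee.new(id)
employee.loaded_from # => [:default]
employee[:salary]    # => 2000
employee.loaded_from # => [:default, :human_resources]
```

A real implementation would also have to decide what happens when the second write fails after the first has succeeded, which this sketch ignores.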