Anthony,

I read your post, but I need time to process it. In general, DM's notion of
repositories has always seemed a little "off" to me, and your proposal
strikes me as a huge step in the right direction.

In our code, we use multiple repositories, and have never been able to get
the identity map working correctly because of this. The fact that the
identity map is not "automatic" is a clear sign that something needs to be
cleaned up design-wise.

..tony..

On Mon, Feb 1, 2010 at 8:45 AM, Anthony Williams <h...@antw.me> wrote:

> Hi all,
>
> I've been thinking a little about defining properties in DataMapper,
> particularly relating to how they work within repository blocks. I've
> posted some ideas to my site:
> http://antw.me/thoughts/datamapper-property-api.html
>
> If you prefer to read them here instead (sans links and syntax
> highlighting), I've included the full message below.
>
> I'd love to get some feedback; let me know what you think...
>
> Anthony.
>
> TRANSCRIPT:
>
> In my own dm-core fork on Github I've recently been experimenting with
> ways to trim down both Resource and Model, extracting specific
> functionality out to separate classes and modules. I've started by
> relieving Resource of the need to take care of attributes, creating an
> AttributeSet class which holds all of a Resource's attributes, tracks
> when they've been updated (marking them as dirty), and lazy-loading
> attributes when needed.
>
> Although not yet pushed to Github, my latest commits pass the full
> dm-core spec suite against SQLite3 and PostgreSQL, but have 3-4 failures
> with the InMemory and Yaml adapters. I eventually tracked this down to
> an example where a Property is defined on a Model within the context
> of a specific repository. For example:
>
>    require 'dm-core'
>
>    DataMapper.setup(:default, 'in_memory://localhost/one')
>    DataMapper.setup(:second,  'in_memory://localhost/two')
>
>    class Person
>      include DataMapper::Resource
>
>      property :id,   Serial
>      property :name, String
>
>      repository(:second) do
>        property :external_id, Integer
>      end
>    end
>
> This creates a Person model with two attributes: `id` and `name`, and
> a third `external_id` attribute which applies only when using the
> model within the `:second` repository context:
>
>    DataMapper.repository(:second) do
>      Person.create(:name => 'Michael Scarn', :external_id => 1)
>    end
>
> I then realised my AttributeSet implementation didn't account for the
> properties of a Model changing depending on the current repository
> context. It then led to me thinking a little more about the purpose--
> and usefulness--of being able to define models in this way.
>
> I'd like to--perhaps a little presumptuously--suggest that this
> functionality isn't as nice as it first seems, provide an alternative
> means for achieving the same result, and elaborate on how I think such
> repository blocks should work.
>
> ### Inconsistent instance API
>
> Allowing a user to wrap properties in a repository block results in a
> model changing its behaviour depending on external state (the current
> repository context). At one moment the resource has an `external_id`
> attribute, and in the next the attribute seems to disappear.
>
>    DataMapper.repository(:second) do
>      Person.new(:external_id => 1)
>    end
>    # => #<Person @id=nil @name=nil>
>
> Wait... where did the `external_id` attribute go? In fact the
> attribute was set, it just doesn't appear since `Person#inspect` was
> called outside of the repository block...
>
>    DataMapper.repository(:second) do
>      puts Person.new(:external_id => 1).inspect
>    end
>    # => #<Person @id=nil @name=nil @external_id=1>
>
> Aha! There it is. Trying to set the `external_id` attribute outside of
> the `:second` repository context will also (rightly) fail.
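
The inconsistency described above can be reproduced without DataMapper at all. What follows is a minimal plain-Ruby sketch, not dm-core's implementation: `ContextualModel`, its `PROPERTIES` table, and its `repository` method are hypothetical names chosen for illustration. It only shows how an object's visible attributes can change with external state.

```ruby
# Plain-Ruby illustration only; this is not how dm-core is implemented.
class ContextualModel
  # Hypothetical table: the properties visible in each repository context.
  PROPERTIES = {
    :default => [:id, :name],
    :second  => [:id, :name, :external_id]
  }

  @current_repository = :default

  class << self
    attr_reader :current_repository

    # Switch the "current repository" for the duration of the block.
    def repository(name)
      previous, @current_repository = @current_repository, name
      yield
    ensure
      @current_repository = previous
    end

    def properties
      PROPERTIES.fetch(current_repository)
    end
  end

  def initialize(attributes = {})
    @attributes = {}
    attributes.each do |name, value|
      unless self.class.properties.include?(name)
        raise ArgumentError, "unknown property #{name}"
      end
      @attributes[name] = value
    end
  end

  # Only the properties of the *current* context are listed.
  def inspect
    pairs = self.class.properties.map do |name|
      "@#{name}=#{@attributes[name].inspect}"
    end
    "#<ContextualModel #{pairs.join(' ')}>"
  end
end

person  = ContextualModel.repository(:second) { ContextualModel.new(:external_id => 1) }
outside = person.inspect                                        # external_id hidden
inside  = ContextualModel.repository(:second) { person.inspect } # external_id shown
```

The same object reports a different set of attributes depending on where `inspect` is called, which is the behaviour the section objects to.
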
>
> ### Ambiguity as to where a resource is saved
>
> DataMapper's repository context allows you to save any Resource to any
> defined repository (providing they support the same features).
>
>    person = Person.new(:name => 'Samuel L. Chang')
>
>    # Now that I have my resource, I can save it to wherever I
>    # want... By default, the resource will be saved in the
>    # :default repository
>    person.save
>
>    # Alternatively, I can specify a different repository...
>    DataMapper.repository(:second) do
>      person.save
>    end
>
> While this is an interesting feature, I'm struggling to come up with a
> reason why you'd _want_ to do this. To me it just introduces
> ambiguity:
>
> > Erm, where did I save that person instance? I'm sure it's around here
> > somewhere... Where are you little person instance? Peekaboo!
> >
> > <cite>Me, a year later.</cite>
>
> In reality, so long as you're explicit about wrapping parts of your
> application in the correct repository blocks, this is not a problem.
> But wherever you have to be explicit there is the possibility that
> someone will forget; forgetting _just once_ might be enough to cause
> obscure bugs.
>
> If the `external_id` attribute was set to disallow nil, the second
> call to `person.save` in the above example would fail, since no value
> was set. (In fact, the above example would fail anyway, since the
> first call to `person.save` would mark the resource as clean, thus the
> second call would do nothing.)
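
The "second call would do nothing" point follows from dirty tracking. Here is a minimal plain-Ruby sketch of that mechanism, assuming a resource that only writes when it has unsaved changes. `FakeRepository` and `TrackedRecord` are hypothetical names, and the `:written` / `:skipped` return values are an illustration device rather than what DataMapper returns.

```ruby
# Plain-Ruby illustration only; names and return values are hypothetical.
class FakeRepository
  attr_reader :name, :records

  def initialize(name)
    @name    = name
    @records = []
  end
end

class TrackedRecord
  def initialize(attributes)
    @attributes = attributes
    @dirty      = true # a new record has unsaved changes
  end

  def dirty?
    @dirty
  end

  # Writes to the given repository only when there are unsaved changes,
  # then marks the record as clean.
  def save(repository)
    return :skipped unless dirty?

    repository.records << @attributes.dup
    @dirty = false
    :written
  end
end

default = FakeRepository.new(:default)
second  = FakeRepository.new(:second)
person  = TrackedRecord.new(:name => 'Samuel L. Chang')

first_save  = person.save(default) # written to :default
second_save = person.save(second)  # skipped: the record is already clean
```

After both calls only the first repository holds the record, even though the second `save` looked like it targeted `:second`.
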
>
>
> ## A better way?
>
> I'm of the belief that each model should be associated with one--and
> only one--repository. This would be the `:default` repository, except
> where a user explicitly declares otherwise when setting up their
> model. A `Person` would be associated with the default repository
> _always_, regardless of the current repository context. In the example
> below, the person would be persisted to the default repository even
> though it's wrapped in another repo.
>
>    person = Person.new(:name => 'Michael Scarn')
>
>    DataMapper.repository(:second) do
>      person.save
>    end
>
> DataMapper could provide a method for changing the default repository:
>
>    class Person
>      include DataMapper::Resource
>
>      # Tells DM that the Person model should be persisted
>      # to the :second repository.
>      set_repository :second
>
>      property :id,   Serial
>      property :name, String
>    end
>
> By doing this, users would never need to worry about repository
> context outside of their models, making their domain objects much more
> straight-forward.
>
> As far as I'm concerned, `Person` and `repo(:second) { Person }` are
> two different models, with different interfaces, different properties,
> and are stored in different repositories. The second Person should
> probably be represented as another model, distinct from the first.
>
> Since DataMapper doesn't conflate class inheritance with Single Table
> Inheritance, we could use inheritance to achieve the same effect as
> the current API:
>
>    class Person
>      include DataMapper::Resource
>
>      property :id,   Serial
>      property :name, String
>    end
>
>    # Inherits properties from Person, but adds its own
>    # custom properties, and persists to another repo.
>    class HRPerson < Person
>      set_repository :second
>      property :external_id, Integer
>    end
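
To make the inheritance idea concrete, here is a small plain-Ruby sketch of a class-level DSL in which a subclass copies its parent's properties and repository and can then override them. It is not dm-core code: `SketchModel` and its methods are hypothetical, and `set_repository` is the proposed method from this message, not an existing DataMapper API.

```ruby
# Plain-Ruby illustration only; SketchModel is a hypothetical stand-in
# for DataMapper::Resource, and set_repository is the *proposed* API.
class SketchModel
  class << self
    # Subclasses start with a copy of the parent's properties and repository.
    def inherited(subclass)
      super
      subclass.instance_variable_set(:@properties, properties.dup)
      subclass.instance_variable_set(:@repository_name, repository_name)
    end

    def properties
      @properties ||= {}
    end

    def repository_name
      @repository_name ||= :default
    end

    def property(name, type)
      properties[name] = type
    end

    def set_repository(name)
      @repository_name = name
    end
  end
end

class SketchPerson < SketchModel
  property :id,   Integer
  property :name, String
end

# Inherits :id and :name, adds its own property, and uses another repository.
class SketchHRPerson < SketchPerson
  set_repository :second
  property :external_id, Integer
end
```

Because each subclass works on its own copy, adding `external_id` to `SketchHRPerson` leaves `SketchPerson` untouched, so each class keeps a single fixed interface.
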
>
> ### Problems with this approach...
>
> `Model#copy` would break. Well... it wouldn't just break. The entire
> concept of copying resources across repositories would become
> redundant.
>
> ## An alternative meaning for repository blocks
>
> By doing away with the current meaning of repository blocks within
> model instances, we free up the API to do something I think is much
> more interesting: models which persist _across_ multiple repositories.
>
> Let's take a (slightly contrived) example...
>
>    DataMapper.setup(:default,         'yaml://localhost/main')
>    DataMapper.setup(:human_resources, 'yaml://localhost/hr')
>
>    class Employee
>      include DataMapper::Resource
>
>      property :id,       Serial
>      property :name,     String
>      property :username, String
>      property :password, String
>
>      repository(:human_resources) do
>        property :salary, Integer
>        property :pay_on, Date
>      end
>    end
>
> Our employee model has six properties: `name`, `username`, and
> `password` will be persisted to the default repository, while `salary`
> and `pay_on` will be persisted to the human resources repository.
> `id`, since it is a key, is used in _both_.
>
>
> Let's create an employee...
>
>    Employee.create(
>      :name     => 'Michael Scarn',
>      :username => 'mscarn',
>      :password => '12345',
>      :salary   => 2000,
>      :pay_on   => Date.today
>    )
>
> Here's what would happen "under the hood":
>
> 1. We assume that the key is generated by the model's default
> repository. In the absence of a `set_repository` statement, DataMapper
> assumes `:default`.
>
> 2. DataMapper then saves the resource to the default repository. In
> this example it persists the name, username, and password, and returns
> the ID which was generated.
>
> 3. It then proceeds to persist the salary and pay_on attributes to the
> human resources repository with the ID returned by the default repo.
>
> Our storage ends up looking a little like this:
>
>    # default/employees.yaml
>    - id: 95143
>      name: "Michael Scarn"
>      username: "mscarn"
>      password: "12345"
>
>    # hr/employees.yaml
>    - id: 95143
>      salary: 2000
>      pay_on: 2010-02-01
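
The three "under the hood" steps can be sketched in plain Ruby. This is only an illustration under stated assumptions: `SplitStore` is a hypothetical class, the repositories are in-memory arrays, and keys are generated by a simple counter rather than by a real adapter.

```ruby
require 'date'

# Plain-Ruby illustration only; SplitStore is hypothetical and keeps
# everything in memory instead of talking to real adapters.
class SplitStore
  attr_reader :storage

  # primary:               the repository that generates keys
  # property_repositories: property name => repository name
  def initialize(primary, property_repositories)
    @primary               = primary
    @property_repositories = property_repositories
    @storage               = Hash.new { |hash, repo| hash[repo] = [] }
    @next_id               = 1
  end

  def create(attributes)
    # Step 1: the primary repository generates the key.
    id = @next_id
    @next_id += 1

    # Group the attributes by the repository that owns them.
    by_repo = Hash.new { |hash, repo| hash[repo] = {} }
    attributes.each do |name, value|
      by_repo[@property_repositories.fetch(name)][name] = value
    end

    # Step 2: persist the primary attributes first.
    # Step 3: persist the remaining attributes under the same key.
    ([@primary] + (by_repo.keys - [@primary])).each do |repo|
      @storage[repo] << { :id => id }.merge(by_repo[repo])
    end

    id
  end
end

store = SplitStore.new(:default, {
  :name     => :default,
  :username => :default,
  :password => :default,
  :salary   => :human_resources,
  :pay_on   => :human_resources
})

id = store.create({
  :name     => 'Michael Scarn',
  :username => 'mscarn',
  :password => '12345',
  :salary   => 2000,
  :pay_on   => Date.new(2010, 2, 1)
})
```

A real implementation would also have to decide what happens when the second write fails after the first succeeds; the sketch ignores that.
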
>
> ### Lazy loading from multiple repositories
>
> Loading a resource without specifying which fields you want to load
> would work in a way similar to lazy loading.
>
>    employee = Employee.get(95143)
>
> This loads the Employee with `id`, `name`, `username`, and `password`
> from the default repository. Calling `employee.salary` would load all
> of the attributes which belong to the human resources repository.
>
>    employee = DataMapper.repository(:human_resources) do
>      Employee.get(95143)
>    end
>
>    # ... or ...
>    employee = Employee.get(95143, :repository => :human_resources)
>
> This loads the Employee with `id`, `salary`, and `pay_on` from the
> human resources repository. Calling `employee.name` would load all of
> the attributes which belong to the default repository.
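
The lazy-loading behaviour described above might look roughly like this in plain Ruby. `LazyRecord` is a hypothetical class, and the backends are plain hashes standing in for repositories. The record reads one repository up front and fetches another only when one of its attributes is first requested.

```ruby
# Plain-Ruby illustration only; LazyRecord is hypothetical and the
# "repositories" are plain hashes of id => attributes.
class LazyRecord
  attr_reader :loaded # names of the repositories read so far

  def initialize(id, loaded_from, backends, property_repositories)
    @id                    = id
    @backends              = backends
    @property_repositories = property_repositories
    @attributes            = {}
    @loaded                = []
    load_from(loaded_from)
  end

  # Returns an attribute, loading its repository on first access.
  def [](name)
    return @id if name == :id

    repo = @property_repositories.fetch(name)
    load_from(repo) unless @loaded.include?(repo)
    @attributes[name]
  end

  private

  # Loads *all* attributes owned by the given repository in one go.
  def load_from(repo)
    @attributes.merge!(@backends.fetch(repo).fetch(@id))
    @loaded << repo
  end
end

backends = {
  :default         => { 95143 => { :name => 'Michael Scarn', :username => 'mscarn' } },
  :human_resources => { 95143 => { :salary => 2000 } }
}
owners = { :name => :default, :username => :default, :salary => :human_resources }

record       = LazyRecord.new(95143, :default, backends, owners)
before       = record.loaded.dup # only :default has been read
salary       = record[:salary]   # triggers a read from :human_resources
after_salary = record.loaded.dup
```

Starting from `:human_resources` instead works the same way in reverse, with the default repository's attributes loaded on first access.
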
>
> ### Finishing up
>
> I think this behaviour has a lot of potential: In many web
> applications developers have made the compromise of denormalising data
> in order to improve performance. DataMapper could instead provide an
> API to store these denormalised "cache" attributes in a fast key/value
> store.
>
>    class Journey
>      include DataMapper::Resource
>
>      property :id,       Serial
>      property :start_at, String
>      property :end_at,   String
>
>      repository(:redis) do
>        property :really_expensive_computation, String
>      end
>    end
>
> --
> You received this message because you are subscribed to the Google Groups
> "DataMapper" group.
> To post to this group, send email to datamap...@googlegroups.com.
> To unsubscribe from this group, send email to
> datamapper+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/datamapper?hl=en.
>
>
