On May 30, 2010, at 2:34 AM, Darren Duncan wrote:
Stevan Little wrote:
On May 29, 2010, at 11:20 PM, Darren Duncan wrote:
2. Besides the ability to introspect or perform powerful searches
on your objects using SQL/etc, I see another big advantage of
using database storage without serialization as portability. You
can have applications written in different programming languages
sharing the same database and the same objects, because they don't
contain Perl-specific data formats.
KiokuDB mostly uses JSON and JSPON as the storage format, which is
not Perl specific. The serialization format we store in is
dependent on the Moose class definition, so in that way it is not
terribly portable.
An advantage of not using serialization like JSON, but rather
storing each object attribute as a database member attribute, is
that the DBMS itself can then most easily be defined to enforce the
consistency of your objects, so someone accessing the database by
some way other than KiokuDB, or using a buggy version of KiokuDB, is
less likely to be able to corrupt the data. As for how to get the
database to do that, one general answer is CHECK constraints, though
that is a fallback for where terser/simpler kinds of constraints
don't do the job.
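
To make the CHECK constraint point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The person table, its columns, and the try_insert helper are hypothetical names chosen for illustration; they are not part of KiokuDB or of any schema discussed in this thread.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT    NOT NULL CHECK (length(name) > 0),
        age  INTEGER NOT NULL CHECK (age BETWEEN 0 AND 150)
    )
    """
)
conn.execute("INSERT INTO person (name, age) VALUES (?, ?)", ("Alice", 30))


def try_insert(name, age):
    """Return True if the DBMS accepted the row, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO person (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when a CHECK or NOT NULL constraint fails.
        return False


accepted_valid = try_insert("Bob", 41)
accepted_invalid = try_insert("Mallory", -5)  # violates the age CHECK
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

The rejection happens inside the DBMS, so it applies to any client that writes to the table, whichever library or language it uses.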
I think you misunderstand. KiokuDB is *not* just a JSON serialization
service; it breaks up the object graph on a per-instance basis and
stores each instance separately. It uses JSPON as a way to handle
references from one object to another.
It is not like MongoDB which stores the entire "document" at once and
has no (built in) way to refer from one "document" to another. In
fact, MongoDB is more like a traditional relational DB in that all
its relationships are stored implicitly and laid out by the user.
KiokuDB on the other hand stores all the relations as explicit and
resolves them for you when they are extracted.
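
As a rough illustration of per-instance storage with explicit references, here is a minimal Python sketch. It assumes a JSPON-style convention in which a reference to another instance is written as {"$ref": id}; the store layout, the "class" and "data" keys, and the load function are hypothetical and are not KiokuDB's actual on-disk format or API.

```python
# Hypothetical flat store: one entry per object instance, keyed by ID.
# A reference to another instance is written as {"$ref": <id>} (assumed convention).
store = {
    "1": {"class": "Person", "data": {"name": "Foo", "friend": {"$ref": "2"}}},
    "2": {"class": "Person", "data": {"name": "Bar", "friend": {"$ref": "1"}}},
}


def load(store, root_id):
    """Rebuild the object graph reachable from root_id, resolving references."""
    live = {}  # id -> already-inflated object, so each instance is built once

    def inflate(obj_id):
        if obj_id in live:
            return live[obj_id]
        entry = store[obj_id]
        obj = {"class": entry["class"]}
        live[obj_id] = obj  # register before following refs so cycles terminate
        for key, value in entry["data"].items():
            if isinstance(value, dict) and "$ref" in value:
                obj[key] = inflate(value["$ref"])
            else:
                obj[key] = value
        return obj

    return inflate(root_id)


foo = load(store, "1")
```

Because the references are part of the stored entries, the loader can rebuild the cycle between the two instances without any application-specific knowledge of how they relate.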
I think perhaps you need to take a much closer look at KiokuDB, because
I suspect you have not done so, and so are pointing out issues that you
perceive it to have but that, in fact, it does not.
A relational database can map to an object structure of any
language fairly easily. Add attributes/columns for mutually
heterogeneous data, like when you would add object attributes, and
add tuples/rows for mutually homogeneous data, like when you would
use arrays or sets.
And then you get the impedance mismatch. You are ignoring
inheritance, which is not really possible in a relational model.
I wasn't ignoring inheritance, but rather was just being terse by
giving examples rather than every relevant detail.
Fair enough.
As for inheritance, a relational model can handle that just fine.
You also have several options for how to lay it out, depending on
what you're going for.
One option in the general case is to have a distinct database relvar/
table per each instantiatable class, which has one attribute/column
per class attribute, plus an extra attribute/column to hold an ID
value for the object. When a class composes a role or inherits from
another class, each attribute defined in those, plus each one defined
directly in the class itself, would have a corresponding
attribute/column in the relvar/table, so that each attribute of an
object of that class has a place to be stored. And so, when multiple
classes compose the same attributes, their corresponding
relvars/tables all have commonly named and typed attributes/columns
for them.
Another option in the general case is to also have a database
relvar/table for each role or non-instantiatable class, which is
then the only one having the attributes/columns that the
corresponding role or class declares. The relvars/tables mentioned in
the previous paragraph then wouldn't have these, but instead would have
matching ID values to relate records in them to ones in the others.
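
The two layouts described above can be sketched with Python's built-in sqlite3 module. The names here are assumptions made for illustration only: a hypothetical role Named (one attribute, name) composed by a hypothetical class Employee (one attribute, salary).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Option 1: one table per instantiable class; attributes that come
    -- from the composed role are repeated as columns of that table.
    CREATE TABLE employee_flat (
        id     INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL,   -- declared by the role
        salary INTEGER NOT NULL    -- declared by the class itself
    );

    -- Option 2: the role gets its own table and is the only place its
    -- attributes live; the class table shares the ID and adds its own.
    CREATE TABLE named (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY REFERENCES named(id),
        salary INTEGER NOT NULL
    );

    INSERT INTO employee_flat VALUES (1, 'Alice', 50000);
    INSERT INTO named    VALUES (1, 'Alice');
    INSERT INTO employee VALUES (1, 50000);
    """
)

# Option 1 reads the whole object from a single table.
flat = conn.execute("SELECT id, name, salary FROM employee_flat").fetchall()

# Option 2 needs a join on the shared ID to reassemble the same object.
joined = conn.execute(
    "SELECT n.id, n.name, e.salary FROM employee e JOIN named n ON n.id = e.id"
).fetchall()
```

Both queries return the same row. The difference is where the role's attributes are declared, and whether reading one object needs a join.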
Generally speaking, with the exception perhaps of Moose classes
where every single object can have different names or kinds etc of
attributes, rather than those being class-defined, I would think the
best design is for the database to have exactly the same granularity
of component data as the Moose objects do. Only where each object
can have different attributes could the database probably be
designed like a key-value store, but that's less ideal.
Yes, sorry, allow me to clarify: the relational model does not do
inheritance easily or cleanly. What you describe above (a key-value
store) no longer has as much value as a relational DB. The other
common inheritance mappings all have equally bad tradeoffs involved.
The result is you either have a sub-standard DB schema or a non-ideal
object graph.
It also does not deal well with polymorphism since the ID (the
object's identity) is essentially fixed to a table (usually mapped to
a class). But this is a well worn topic and there is no need to beat
this dead horse one more time.
One should think about the database schema like they think about
their code. It is just as reasonable to change the schema as it is
to change what classes you have or what attributes they have. The
schema *is* code, and the data it holds is like objects of classes.
No more, and no less.
That is a very DB-centric viewpoint and I disagree with you
completely. Changing a schema during development is one thing;
changing it after deployment, after you have started to collect data,
etc., is another thing entirely and very much a non-trivial task.
Remember, objects are graphs not sets of tuples.
And graphs can be represented as sets of tuples, such as where
tuples have 2 attributes that name connected graph nodes. For that
matter, objects only *represent* graphs themselves.
Sure, but your relationships are implicit and require outside-the-DB
programming to make them real.
( 1, 'Foo', 2 )
( 2, 'Bar', 1 )
These two tuples create a cyclical graph, but the RDBMS doesn't help
me reconstruct that graph; that I must do in my code. KiokuDB stores
things as graphs and when you extract them, you get the graphs back.
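
For illustration, the reconstruction that application code has to do for those two tuples might look like the following Python sketch. It assumes each tuple is (id, name, id of the linked row); the Node class and its link field are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity-based equality, so cyclic links are never compared recursively
class Node:
    id: int
    name: str
    link: Optional["Node"] = None


# The two tuples from above, read as (id, name, id of the linked row).
rows = [(1, "Foo", 2), (2, "Bar", 1)]

# Pass 1: create one node per tuple, with links still unresolved.
nodes = {row_id: Node(row_id, name) for row_id, name, _ in rows}

# Pass 2: turn the stored IDs into real object references.
for row_id, _, link_id in rows:
    nodes[row_id].link = nodes[link_id]

foo = nodes[1]
```

The two-pass approach is needed because of the cycle: neither node can be fully built before the other one exists.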
I mean, in reality we really don't need anything more than functions
with a single argument to write code. Things like numeric literals,
conditionals, loop constructs and local variables are not really
necessary. But why would you want to use Church Numerals to do math
and limit yourself to pure lambda calculus?
Now, all that I've had to say here isn't meant to diminish that the
JSON serialization approach is useful and probably a best fit for
many usage scenarios.
Again, look more closely at KiokuDB and take a look at JSPON. KiokuDB
is not just a dumb JSON store; instead it *uses* JSON/JSPON as a
storage format, that is all.
But at the same time, relational databases are very powerful and
their strengths, of ensuring that data is consistent and making it
easier to search, should be utilized, where it makes sense to do
so. Using a relational database, without exploiting the features
that make them uniquely powerful, is like wasting the tools you have.
Well, yes, I agree, but while we might not (by default) be utilizing
the full power of the RDBMS, we are not throwing the whole thing out.
Again, take a look at what is actually done, then we can have a real
conversation about it.
Perhaps a reasonable analogy is people who use Perl 5 but write
their Perl code as if they were using Perl 4, and were faking
references rather than using real references, structures, and
objects. Sometimes I think of that when I hear of people just
dumping objects as a serialized string in a relational database.
No, that is not a reasonable analogy.
KiokuDB does not fake references at all; it stores real, true, explicit
references, not implicit tuple relationships. I think you have it
backwards.
Sometimes a relational DB is not the right tool for the job. Sometimes
the data structures being used wouldn't make sense if they were
flattened into a relational model. Sometimes there is no real business
value in having the data in a relational model and there is only value
having said data in object form.
Using an RDBMS for storing serialized objects using KiokuDB allows us
to take full advantage of the reliability, flexibility and scalability
of an RDBMS platform without being stunted by the impedance mismatch
that would typically come with a more traditional ORM tool.
- Stevan