Viktor Klang wrote:
My not so extensive experience has told me that it depends on the kind
of schema you're building.
For something like a Twitter-clone you probably won't run into this
unless you've done some bad planning,
but I definitely would agree with you that it could be(come) a big issue.
I'd love to hear if someone's had problems with this and what their
domain/use was.
Regarding schema evolution...
The good old (err, ancient!) Z39.50 protocol used by libraries for
distributed search is actually quite good for this. Lots of silly
things in the protocol specification itself, but the semantic model is
quite good. There are abstraction layers between the physical
representation, what you query on, and what you retrieve (and more).
The abstraction of the query model, for example, allows you to send the
same query to different collections even if their schemas are not
identical. They just have to support the query fields that the query
uses. Actually, it's even more general than that: you can set it up to
return zero matches for unknown query fields rather than aborting with
an error. This allows newer versions of a database schema to be
introduced while staying backwards compatible with old applications.
(I am simplifying a bit here!)
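To make the "zero matches for unknown query fields" idea concrete, here is a rough Java sketch. It is not the Z39.50 API or anything from our product; the class `FieldTolerantSearch`, the nested `Database` type, and the `lenient` flag are all names I made up for illustration, and records are just in-memory maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names come from Z39.50.
public class FieldTolerantSearch {

    /** One database with its own schema, i.e. the set of query fields it supports. */
    static final class Database {
        final String name;
        final Set<String> queryFields;
        final List<Map<String, String>> records;

        Database(String name, Set<String> queryFields, List<Map<String, String>> records) {
            this.name = name;
            this.queryFields = queryFields;
            this.records = records;
        }

        /**
         * Returns the records whose value for the given field equals the given value.
         * For a field this database does not support, a lenient search returns
         * zero matches; a strict search aborts with an exception.
         */
        List<Map<String, String>> search(String field, String value, boolean lenient) {
            if (!queryFields.contains(field)) {
                if (lenient) {
                    return List.of();
                }
                throw new IllegalArgumentException("Unsupported query field: " + field);
            }
            List<Map<String, String>> hits = new ArrayList<>();
            for (Map<String, String> record : records) {
                if (value.equals(record.get(field))) {
                    hits.add(record);
                }
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        // The old schema only knows "title"; the new schema adds "isbn".
        Database oldDb = new Database("v1", Set.of("title"),
                List.of(Map.of("title", "Dune")));
        Database newDb = new Database("v2", Set.of("title", "isbn"),
                List.of(Map.of("title", "Dune", "isbn", "123")));

        // The same query can be sent to both databases without either one failing.
        System.out.println(oldDb.search("isbn", "123", true).size());
        System.out.println(newDb.search("isbn", "123", true).size());
    }
}
```

With the lenient setting, an old application that only queries on "title" keeps working against both databases, and a new application querying on "isbn" simply finds nothing in the old one.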
We still use Z39.50 today in the non-SQL database system we develop at
work (TeraText.com). We have customers who want to log a continuous
flow of arriving information (e.g. syslog messages) and retire old
content. For example, they create a new database each week and keep the
last 26 databases around for 6 months of historical data, then query
across the appropriate subset of databases to find results. Z39.50 makes it easy
to introduce schema changes into next week's database while still being
able to search across all the older databases as well. (Obviously only
the new database would find matches on searches specifying newer query
fields.)
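The weekly rollover described above can be sketched roughly as follows. Again, this is only an in-memory Java illustration under my own assumptions: `RollingLogStore`, `WeeklyDb`, `rollOver`, and the integer week numbers are hypothetical names, not how our system or Z39.50 actually exposes this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: a rolling set of weekly databases.
public class RollingLogStore {

    /** One week's database; its schema is the set of fields it indexes. */
    static final class WeeklyDb {
        final int week;
        final Set<String> indexedFields;
        final List<Map<String, String>> messages = new ArrayList<>();

        WeeklyDb(int week, Set<String> indexedFields) {
            this.week = week;
            this.indexedFields = indexedFields;
        }

        /** Fields this database does not index yield zero matches, not an error. */
        List<Map<String, String>> search(String field, String value) {
            if (!indexedFields.contains(field)) {
                return List.of();
            }
            List<Map<String, String>> hits = new ArrayList<>();
            for (Map<String, String> message : messages) {
                if (value.equals(message.get(field))) {
                    hits.add(message);
                }
            }
            return hits;
        }
    }

    private final int retention;
    private final Deque<WeeklyDb> databases = new ArrayDeque<>();

    RollingLogStore(int retention) {
        this.retention = retention;
    }

    /** Starts a new week's database, possibly with a new schema, and retires the oldest ones. */
    WeeklyDb rollOver(int week, Set<String> indexedFields) {
        WeeklyDb db = new WeeklyDb(week, indexedFields);
        databases.addLast(db);
        while (databases.size() > retention) {
            databases.removeFirst();
        }
        return db;
    }

    /** Runs the same query against every retained database in the given week range. */
    List<Map<String, String>> search(int fromWeek, int toWeek, String field, String value) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (WeeklyDb db : databases) {
            if (db.week >= fromWeek && db.week <= toWeek) {
                hits.addAll(db.search(field, value));
            }
        }
        return hits;
    }

    int size() {
        return databases.size();
    }

    public static void main(String[] args) {
        RollingLogStore store = new RollingLogStore(26);

        // Week 1 only indexes "host"; week 2 introduces a new "severity" field.
        WeeklyDb week1 = store.rollOver(1, Set.of("host"));
        week1.messages.add(Map.of("host", "web1", "severity", "err"));
        WeeklyDb week2 = store.rollOver(2, Set.of("host", "severity"));
        week2.messages.add(Map.of("host", "web1", "severity", "err"));

        System.out.println(store.search(1, 2, "host", "web1").size());    // prints 2
        System.out.println(store.search(1, 2, "severity", "err").size()); // prints 1
    }
}
```

The second query only matches in week 2, because week 1 was built before the "severity" field was indexed, and nothing had to be rebuilt to make the query legal across both weeks.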
Schema changes are typically not frequent, but when the customer wants
to run some new kind of query, the ability to introduce new fields is
very useful, especially if it's a high volume of content.
Rebuilding all the old databases to retrospectively add new indexes can
take a long time and would potentially take the service offline, which
makes it undesirable.
Alan