Ovid wrote:
So if we have a hotel whose "short description" is missing or longer than 200
characters, we can live with that, but we can't live with an invalid price.
We're happy to clean up bad information but we don't want type constraints to
throw out *everything* just because someone goofed (often an external data
source we have little control over). Part of the reason management is very
unhappy with our use of Mouse is because of these type constraints causing
application failure we've never had in the past. We much prefer to bend
rather than break.
I recommend that you have multiple related fields, such as raw_price and
clean_price, where the latter has more specific constraints than the former;
e.g., raw_price could be a string while clean_price is a number.
Then it would more accurately reflect your real workflow and stored assertions.
Set raw_price to whatever your external data source says it is, and set
clean_price only once the raw_price is valid for it or you have cleaned it up;
leave clean_price blank in the meantime.
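
To make that concrete, here is a minimal sketch in Python (not Mouse; I'm only
illustrating the shape of the idea). The names HotelRecord and parse_price are
made up for this example, and "valid" here is assumed to mean "parses as a
non-negative decimal number"; your real rule may well differ.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_price(raw: Optional[str]) -> Optional[Decimal]:
    """Return a non-negative Decimal if raw looks like a price, else None.

    Hypothetical validity rule, assumed for this sketch only.
    """
    if raw is None:
        return None
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        return None
    # Reject NaN, infinities and negative amounts.
    if not value.is_finite() or value < 0:
        return None
    return value


@dataclass
class HotelRecord:
    # Loose constraint: whatever the external source sent, kept verbatim.
    raw_price: Optional[str]
    # Strict constraint: a real number, or None until it has been cleaned.
    clean_price: Optional[Decimal] = None

    def __post_init__(self) -> None:
        # Only derive clean_price if nobody supplied a cleaned value already.
        if self.clean_price is None:
            self.clean_price = parse_price(self.raw_price)


good = HotelRecord("129.99")
bad = HotelRecord("call us")
fixed = HotelRecord("call us", clean_price=Decimal("100"))
```

Constructing the record never fails on a bad price: bad.clean_price is simply
None, and bad.raw_price still holds what the source said, so you can clean it
up later.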
On a tangent, I often design systems to have separate objects or database tables
for "X source says Y is" and "so we say Y is". They overlap a lot, but keeping
them apart gives a clear separation of concerns and a lot of flexibility, and
can also make some things simpler.
For example, the core logic of resyncing with your external data source only has
to make your copy of its lists match exactly, without worrying about
corrections; corrections are then a separate, internal-only affair.
This design is especially useful when you have multiple sources for overlapping
data, or if you have both external sources and internal sources, so you can
always properly audit where data came from and how reliable it is to you.
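
A rough sketch of that separation, again in Python and using plain dicts in
place of database tables. Everything here (resync, effective_value, the
"feed_a" source name, the "internal" origin label) is invented for
illustration, and letting an internal correction always win over any source is
just one possible policy.

```python
from typing import Dict, List, Optional, Tuple

# "X source says Y is": (source name, hotel id) -> value exactly as reported.
SourceValues = Dict[Tuple[str, str], str]
# "So we say Y is": hotel id -> our internal correction.
Corrections = Dict[str, str]


def resync(stored: SourceValues, source: str, feed: Dict[str, str]) -> None:
    """Make the stored rows for one source exactly match its latest feed.

    Corrections live elsewhere, so this never has to reason about them.
    """
    for key in [k for k in stored if k[0] == source]:
        del stored[key]
    for hotel_id, value in feed.items():
        stored[(source, hotel_id)] = value


def effective_value(
    stored: SourceValues,
    corrections: Corrections,
    source_priority: List[str],
    hotel_id: str,
) -> Optional[Tuple[str, str]]:
    """Return (value, origin) so callers can audit where the value came from."""
    if hotel_id in corrections:
        return corrections[hotel_id], "internal"
    for source in source_priority:
        if (source, hotel_id) in stored:
            return stored[(source, hotel_id)], source
    return None


stored: SourceValues = {}
corrections: Corrections = {"h1": "125.00"}
resync(stored, "feed_a", {"h1": "12O.00", "h2": "80.00"})  # "12O" is a source typo
h1 = effective_value(stored, corrections, ["feed_a"], "h1")
h2 = effective_value(stored, corrections, ["feed_a"], "h2")
```

The resync is a dumb replace, the correction for h1 survives it untouched, and
every answer carries its origin along with it.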
-- Darren Duncan