Serdar Tumgoren wrote:
That's a good start. You're missing one requirement that I think needs to
be explicit. Presumably you're requiring that the XML be well-formed. This
refers to things like matching <xxx> and </xxx> nodes, and proper use of
quotes and escaping within strings. Most DOM parsers won't even give you a
tree if the file isn't well-formed.
I actually hadn't been checking for well-formedness on the assumption
that ElementTree's parse method did that behind the scenes. Is that
not correct?
(I didn't see any specifics on that subject in the docs:
http://docs.python.org/library/xml.etree.elementtree.html)
I also would assume that ElementTree would do the check. But the point
is: it's part of the spec, and needs to be explicitly handled in your
list of errors:
file xxxxyyy.xml was rejected because .....
I am not saying you need to separately test for it in your validator,
but effectively it's the second test you'll be doing. (The first is:
the file exists and is readable)
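To make those first two tests concrete, here is a minimal sketch. It assumes a Python version where ElementTree exposes ParseError (older releases raised an expat error instead). The function name load_or_reject and the exact wording of the messages are made up for illustration; they are not part of anybody's spec.

```python
import os
import xml.etree.ElementTree as ET


def load_or_reject(path):
    """Return (tree, None) on success, or (None, reason) if the file is rejected.

    Hypothetical helper: the name and the message format are illustrative only.
    """
    # First test: the file exists.
    if not os.path.isfile(path):
        return None, "file %s was rejected because it does not exist" % path
    try:
        # Second test: parse() raises ParseError when the XML is not well-formed,
        # so no separate well-formedness pass is needed.
        tree = ET.parse(path)
    except ET.ParseError as err:
        return None, "file %s was rejected because it is not well-formed: %s" % (path, err)
    except (IOError, OSError) as err:
        # The file exists but could not be opened or read.
        return None, "file %s was rejected because it could not be read: %s" % (path, err)
    return tree, None
```

The parser does the checking, but the rejection still ends up as an explicit entry in the list of errors, which is the point above.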
But most importantly, you can divide the rules into two kinds. In the first
you say "if the data looks like XXXX, the file is rejected." In the second
you say "if the data looks like YYYY, we'll pretend it's actually ZZZZ, and
keep going." An example of the second might be what to do if somebody
specifies March 35. You might just pretend it's March 31, and keep going.
Ok, so if I'm understanding -- I should convert invalid data to
sensible defaults where possible (like setting blank fields to 0);
otherwise if the data is clearly invalid and the default is
unknowable, I should flag the field for editing, deletion or some
other type of handling.
Exactly. As you said in one of your other messages, human intervention
required. Then the humans may decide to modify the spec to reduce the
number of cases needing human intervention. So I see the spec and the
validator as a matched pair that will evolve.
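A rough sketch of how those outcomes might look in code, using the March 35 and blank-field examples from this thread. The function names, the status strings, and the choice to treat a day below 1 as needing a human are all assumptions for illustration; the real rules belong in the spec.

```python
import calendar

# Hypothetical status labels for the outcome of checking one field.
OK, FIXED, NEEDS_HUMAN = "ok", "fixed", "needs human"


def check_day(year, month, day_text):
    """Return (status, value) for a day-of-month field."""
    try:
        day = int(day_text)
    except ValueError:
        # Not a number at all: no sensible default is knowable.
        return NEEDS_HUMAN, None
    last_day = calendar.monthrange(year, month)[1]
    if day > last_day:
        # "March 35" is treated as March 31, and processing keeps going.
        return FIXED, last_day
    if day < 1:
        # Assumption: zero or negative days are left for a person to sort out.
        return NEEDS_HUMAN, None
    return OK, day


def check_count(text):
    """Blank fields default to 0; other non-numeric values go to a human."""
    if text is None or not text.strip():
        return FIXED, 0
    try:
        return OK, int(text)
    except ValueError:
        return NEEDS_HUMAN, None
```

With these definitions, check_day(2009, 3, "35") returns ("fixed", 31) and check_count("") returns ("fixed", 0). Counting how often "needs human" comes back is one way to see which rules in the spec might be worth revisiting.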
Note that none of this says anything about testing your code. You'll
need a controlled suite of test data to help with that. The word "test"
is heavily overloaded (and heavily underdone) in our industry.
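As one possible starting point for such a suite, the sketch below keeps a small table of XML snippets alongside the result each one should produce, and reports any that disagree. The names CASES, is_well_formed and failing_cases are invented for this example, and it only covers well-formedness; a real suite would also need cases for each rule in the spec.

```python
import xml.etree.ElementTree as ET

# Each case pairs a small XML document with whether it should parse.
CASES = [
    ("<root><item>1</item></root>", True),
    ("<root><item>1</root>", False),       # mismatched tags
    ('<root attr="x></root>', False),      # unterminated quote
    ("<root>a &amp; b</root>", True),      # properly escaped ampersand
    ("<root>a & b</root>", False),         # unescaped ampersand
]


def is_well_formed(text):
    """Return True if the parser accepts the text, False on ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def failing_cases():
    """Return the snippets whose actual result differs from the expected one."""
    return [text for text, expected in CASES if is_well_formed(text) != expected]
```

An empty list from failing_cases() means the code and the test data agree. Each new rule or bug report then becomes one more row in the table.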
DaveA
_______________________________________________
Tutor maillist - Tutor@python.org