Hey guys. Just did some testing. If you have a look at
sword/tests/xmltest and try the problem case:
./xmltest "<title type='nested \"quotation\" '/>"
(xmltest already tries to add an attribute to your input which tests for
embedded quotes, so you'll see an addedAttribute in your output)
You get:
[scribe@charis tests]$ ./xmltest "<title type='nested \"quotation\" '/>"
<title type='nested "quotation" '/>
<title type='nested "quotation" '/>
<title addedAttribute='with a " quote' type='nested "quotation" '/>
Tag name: [title]
- attribute: [addedAttribute] = [with a " quote]
4 parts:
with
a
"
quote
- attribute: [type] = [nested "quotation" ]
3 parts:
nested
"quotation"
isEmpty: 1
isEndTag: 0
It is a little odd that the second attribute has "3 parts", but looking
at the example given, it has a space at the end, so I suppose this
might be correct.
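For anyone who wants to see why the trailing space gives a third part, here is a minimal Python sketch. It is not the SWORD XMLTag code; parse_attributes and split_parts are hypothetical helpers, and splitting on single spaces is my assumption about how the "parts" are counted.

```python
import re

# An attribute value is delimited by whichever quote character opens it;
# the other quote character is ordinary data inside the value.
_ATTR = re.compile(r"""([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_attributes(tag):
    """Return {name: value} for a single start tag (hypothetical helper)."""
    attrs = {}
    for name, double_quoted, single_quoted in _ATTR.findall(tag):
        # Only one of the two groups participates in any given match.
        attrs[name] = double_quoted or single_quoted
    return attrs


def split_parts(value):
    """Split on single spaces (assumed), so a trailing space yields an empty part."""
    return value.split(" ")


tag = "<title addedAttribute='with a \" quote' type='nested \"quotation\" '/>"
for name, value in parse_attributes(tag).items():
    print(name, split_parts(value))
```

With this splitting rule the type value ends in an empty part, which would account for the count of 3.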
Hope this is helpful in tracking this down,
Troy
On 10/26/2011 06:38 PM, DM Smith wrote:
On 10/26/2011 09:47 AM, Peter von Kaehne wrote:
Is there any actual credible reason for having quotation marks in
attributes? I agree that it may be grammatically correct for XML as
such, but OSIS's attributes are defined and do not contain quotation
marks. And x-marked attributes are largely thrown out during the
osis2mod run, no? Or at least ignored - apart from our own - like
x-preverse.
Peter
I had never spent the time to look at the allowable attribute values
in an OSIS document. Now, having looked at the schema, it is allowed
to nest quotes. See below for details.
I think there are many good reasons that a single quote will be found
in an attribute value. Many languages use it for other things than
quoting.
I can only think of a few, probably obscure, reasons for a double
quote to be there, e.g. chapterTitle='xxx aka "yyy"', who='James
"Jimmy" Smith', ...
Osis2mod *should* allow for all well-formed, valid (both syntactically
and semantically) OSIS documents. Regarding quoting attribute values,
the recommendation still stands: use double quotes if at all possible,
but also avoid &quot; and &apos; too. (Note that these entities are
only needed within attribute values and never elsewhere in the text.)
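As a rough illustration of that recommendation, here is a Python sketch using the standard library's xml.sax.saxutils.quoteattr, which prefers double quotes and only falls back to single quotes or an entity when the value forces it. format_attribute is a hypothetical helper of mine, not anything in osis2mod.

```python
from xml.sax.saxutils import quoteattr


def format_attribute(name, value):
    """Serialize one attribute, letting quoteattr pick the delimiter."""
    # quoteattr wraps the value in double quotes when it can, switches to
    # single quotes if the value contains a double quote, and escapes the
    # double quotes as an entity only if the value contains both kinds.
    return f"{name}={quoteattr(value)}"


print(format_attribute("who", "Jimmy"))
print(format_attribute("who", 'James "Jimmy" Smith'))
print(format_attribute("n", "it's \"x\""))
```

The point is only that a writer can usually avoid the entities by choosing the delimiter per value.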
(Below I'm using x@y to mean element x with attribute y.)
In looking at this, I think there are some bugs in the definition of
l@type, lg@type, and rdg@type.
In Him,
DM
Here are the attributes that allow for arbitrary text:
actor@who
<xs:attribute name="who" type="xs:string" use="optional"/>
contributor@file-as
<xs:attribute name="file-as" type="xs:string" use="optional"/>
a@href
<xs:attribute name="href" type="xs:string" use="required"/>
abbr@expansion
<xs:attribute name="expansion" type="xs:string" use="optional"/>
chapter@chapterTitle
<xs:attribute name="chapterTitle" type="xs:string" use="optional"/>
figure@alt, @catalog, @location, @rights, @size, @src
<xs:attribute name="alt" type="xs:string" use="optional"/>
<xs:attribute name="catalog" type="xs:string" use="optional"/>
<xs:attribute name="location" type="xs:string" use="optional"/>
<xs:attribute name="rights" type="xs:string" use="optional"/>
<xs:attribute name="size" type="xs:string" use="optional"/>
<xs:attribute name="src" type="xs:string"/>
index@index, @level1, @level2, @level3, @level4, @see
<xs:attribute name="index" type="xs:string" use="required"/>
<xs:attribute name="level1" type="xs:string" use="required"/>
<xs:attribute name="level2" type="xs:string" use="optional"/>
<xs:attribute name="level3" type="xs:string" use="optional"/>
<xs:attribute name="level4" type="xs:string" use="optional"/>
<xs:attribute name="see" type="xs:string" use="optional"/>
item@role
<xs:attribute name="role" type="xs:string" use="optional"/>
label@role
<xs:attribute name="role" type="xs:string" use="optional"/>
milestone@marker
<xs:attribute name="marker" type="xs:string" default="DEFAULT"
use="optional"/>
milestoneEnd@start
<xs:attribute name="start" type="xs:string" use="required"/>
milestoneStart@end
<xs:attribute name="end" type="xs:string" use="required"/>
name@regular
<xs:attribute name="regular" type="xs:string" use="optional"/>
q@level, @marker, @who
<xs:attribute name="level" type="xs:string" use="optional"/>
<xs:attribute name="marker" type="xs:string" default="DEFAULT"
use="optional"/>
<xs:attribute name="who" type="xs:string" use="optional"/>
speaker@who
<xs:attribute name="who" type="xs:string" use="optional"/>
speech@marker
<xs:attribute name="marker" type="xs:string" default="DEFAULT"
use="optional"/>
title@short
<xs:attribute name="short" type="xs:string" use="optional"/>
w@gloss, @src, @xlit
<xs:attribute name="gloss" type="xs:string" use="optional"/>
<xs:attribute name="src" type="xs:string" use="optional"/>
<xs:attribute name="xlit" type="xs:string" use="optional"/>
Globally (globalWithType, globalWithoutType)
@annotateWork, @resp, @n
<xs:attribute name="annotateWork" type="xs:string" use="optional"/>
<xs:attribute name="resp" type="xs:string" use="optional"/>
<xs:attribute name="n" type="xs:string" use="optional"/>
Milestone attributes
@sID, @eID
<xs:attribute name="sID" type="xs:string" use="optional"/>
<xs:attribute name="eID" type="xs:string" use="optional"/>
osisID, osisRef, osisAnnotateType regexes allowing quotation marks:
(look for [^...] constructs)
<xs:pattern
value="((((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))*:)?([^:\s])+)"/>
<xs:pattern
value="(((\p{L}|\p{N}|_)+)((\.(\p{L}|\p{N}|_)+)*)?:)?((\p{L}|\p{N}|_|(\\[^\s]))+)((\.(\p{L}|\p{N}|_|(\\[^\s]))+)*)?(!((\p{L}|\p{N}|_|(\\[^\s]))+)((\.(\p{L}|\p{N}|_|(\\[^\s]))+)*)?)?"/>
<xs:pattern
value="(((\p{L}|\p{N}|_)+)((\.(\p{L}|\p{N}|_)+)*)?:)?((\p{L}|\p{N}|_|(\\[^\s]))+)(\.(\p{L}|\p{N}|_|(\\[^\s]))*)*(!((\p{L}|\p{N}|_|(\\[^\s]))+)((\.(\p{L}|\p{N}|_|(\\[^\s]))+)*)?)?(@(cp\[(\p{Nd})*\]|s\[(\p{L}|\p{N})+\](\[(\p{N})+\])?))?(\-((((\p{L}|\p{N}|_|(\\[^\s]))+)(\.(\p{L}|\p{N}|_|(\\[^\s]))*)*)+)(!((\p{L}|\p{N}|_|(\\[^\s]))+)((\.(\p{L}|\p{N}|_|(\\[^\s]))+)*)?)?(@(cp\[(\p{Nd})*\]|s\[(\p{L}|\p{N})+\](\[(\p{N})+\])?))?)?"/>
Attribute extension regex:
<xs:pattern value="x-([^\s])+"/>
l@type
<xs:union memberTypes="osisLine attributeExtension xs:string"/>
lg@type
<xs:union memberTypes="osisLineGroup attributeExtension xs:string"/>
<xs:simpleType name="osisLineGroup">
<xs:restriction base="xs:string">
<!-- <xs:enumeration value="doxology"/> -->
</xs:restriction>
</xs:simpleType>
rdg@type
<xs:union memberTypes="osisRdg attributeExtension xs:string"/>
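To check the claim that the attribute extension pattern x-([^\s])+ admits quotation marks, here is a small Python sketch. XSD patterns must match the whole value, so I use re.fullmatch to approximate that. I have only tried the extension pattern; the osisID/osisRef patterns use \p{L} and \p{N}, which Python's re module does not support. is_attribute_extension is a hypothetical name.

```python
import re

# The attribute extension pattern from the schema, copied verbatim.
ATTRIBUTE_EXTENSION = re.compile(r"x-([^\s])+")


def is_attribute_extension(value):
    """True if the whole value matches the pattern (XSD patterns are anchored)."""
    return ATTRIBUTE_EXTENSION.fullmatch(value) is not None


for candidate in ["x-preverse", 'x-"odd"', "x-it's", "x-two words", "preverse"]:
    print(repr(candidate), is_attribute_extension(candidate))
```

Anything that is not whitespace is accepted after the x- prefix, so both kinds of quotation mark pass.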
-------- Original Message --------
Date: Wed, 26 Oct 2011 08:59:14 -0400
From: DM Smith<dmsm...@crosswire.org>
To: SWORD Developers' Collaboration Forum<sword-devel@crosswire.org>
Subject: Re: [sword-devel] XML attribute delimiters in OSIS files?
Ah, now I understand. This is a bug and should be fixed. (BTW, not having
the entire thread reproduced in each email makes it harder to understand
the context of the email. I don't like having to go digging for the
context. Having looked, I see that the first email in the thread defines
delimiters.)
But I'm not sure where it should be fixed. I haven't looked at the code,
but as I recall, we use the SWORD parser to obtain the attribute value. My
guess is that it is returning it with the quotes. If the problem is there
and we fix it there, it may break a whole host of other things. (This
parser is not a true XML parser, but one that is highly optimized for
speed, and thus we work with its definition.)
It should be easy to change osis2mod to work. I'll look into doing this
soon.
That said, it is and has been the recommendation that double quotes be
used to wrap attribute values. It is valid to use single quotes, but it
may (does) expose bugs. Fixing this bug does not change this
recommendation.
Until osis2mod has been changed and it is available, it is advisable to
change the input so that the quoting of sID/eID pairs is identical.
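As a stopgap along those lines, something like the following Python sketch could rewrite single-quoted sID/eID values to double quotes before running osis2mod. It is a plain text substitution, not an XML parser, so it would also touch any text content that happens to look like sID='...'; normalize_milestone_quotes is a hypothetical helper, and values that themselves contain a quotation mark are deliberately left alone.

```python
import re

# sID='value' or eID='value', where the value contains no quote characters.
# The \b keeps this from matching the tail of a longer name such as osisID.
_SINGLE_QUOTED_ID = re.compile(r"\b([se]ID)='([^'\"]*)'")


def normalize_milestone_quotes(osis_text):
    """Rewrite single-quoted sID/eID values so both ends use double quotes."""
    return _SINGLE_QUOTED_ID.sub(r'\1="\2"', osis_text)


sample = "<verse sID=\"Gen.1.1\"/>In the beginning<verse eID='Gen.1.1'/>"
print(normalize_milestone_quotes(sample))
```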
In Him,
DM
On Oct 26, 2011, at 6:38 AM, David Haslam wrote:
Mixing double and single quotes, as per earlier messages in this thread.
Example (minus the chaff):
sID="reference"
.....
eID='reference'
But this time for the same verse, just as Chris replied, rather than in
completely separate OSIS elements.
As this is just an observation, I see no immediate need to give a detailed
example of what happens to the module.
To locate the places where I spotted it yesterday would take some time.
Perhaps the most interesting thing is that there was no error message from
osis2mod.
And I agree with Chris, the OSIS needs fixing first, before using as input
for osis2mod.
David
--
View this message in context:
http://sword-dev.350566.n4.nabble.com/XML-attribute-delimiters-in-OSIS-files-tp3907261p3940110.html
Sent from the SWORD Dev mailing list archive at Nabble.com.
_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page