Hello,
One part of the problem seems to be the encoding of special characters
in XML. There is a difference between content and the XML serialization
of content. If I put the string "a<b" in an editor's field, I would expect
"a&lt;b" inside the XML document. When the content from that XML
document is shown to a user, the entity reference must be resolved to the
character it stands for, resulting in "a<b". That should be done for all
special characters, and so there is no problem with entity references as
content. The string "a © b" is serialized in XML as "a &copy;
b". When that content is used in an HTML document, the entity reference
must be resolved to the appropriate character.
Maybe I did not get the problem ...
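To illustrate the distinction between content and its XML serialization, here is a minimal Python sketch. It is only an illustration, not how the admin client is implemented; the element name "description" is made up for the example. It shows the standard library escaping "<" and "&" on the way into XML and resolving them again on the way out.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Content as a user would type it into an editor field.
content = "a<b & c"

# Serialization: special characters become entity references.
serialized = escape(content)

# Display: entity references are resolved back to characters.
restored = unescape(serialized)

# An XML library performs both steps implicitly.
elem = ET.Element("description")  # hypothetical element name
elem.text = "a<b"
xml_text = ET.tostring(elem, encoding="unicode")
parsed_text = ET.fromstring(xml_text).text

print(serialized)
print(xml_text)
print(parsed_text)
```

The user-facing string never contains an entity reference; only the serialized form does.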
Cheers, Frank
From: Richard Green [mailto:[email protected]]
Sent: Wednesday, 19 August 2009 11:25
To: Steve Bayliss; Fedora Commons Developers
Subject: Re: [Fedora-commons-developers] New admin client
Potentially there could be a problem with re-editing?
Personally I can live with a copyright symbol being &#169; but see
below. There are other things that give problems too. The real issue,
I suggest, is with (eg) dc:description. Like it or not there will be
people using the editor to edit their descriptive metadata and so
dc:description is going to get things like abstracts thrown at it. From
bitter experience I know that chemical (etc) abstracts regularly contain
'<' (concentrations < 5ppm) and all abstracts are capable of producing
a '&' when you're not looking. Both these, predictably, cause the
editor to panic.
So, you put in the numeric codes and save, and when you re-open, what
have you got? &lt; and &amp;. So why can't I put these in in the first
place (I can), and also &copy; (I can't)? It's inconsistent. Putting
in < or & with Alt+xxx isn't going to help - you'll just get an illegal
character.
If we go for numeric codes there are too many to remember, so a
drop-down?
R
___________________________________________________________________
Richard Green
Consultant to the University of Hull IT Systems Group
managing the CLIF and Hydra (Hull) Projects
http://edocs.hull.ac.uk
http://www.hull.ac.uk/clif
https://fedora-commons.org/confluence/display/hydra
From: Steve Bayliss [mailto:[email protected]]
Sent: 18 August 2009 17:09
To: 'Fedora Commons Developers'
Cc: Richard Green
Subject: Re: [Fedora-commons-developers] New admin client
We had a discussion on this on the Committer Meeting call today.
Taking a look at http://dublincore.org/documents/dcmes-xml/, 2.5.
Language and character encoding - this says that HTML entities should
not be used; but for instance &#169; for the copyright symbol is ok. And
the way that the DC datastream is wrapped in FOXML would cause problems
in declaring these HTML entities. So in the FOXML the HTML entities (if
allowed in the admin client) would need converting to the character code
representations.
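As a rough sketch of what "converting to the character code representations" could look like, the Python below rewrites named HTML entities as numeric character references. It is an assumption about one possible approach, not what the admin client does; the function name named_to_numeric is invented for the example. The five entities predefined by XML are left alone, as are names the HTML 4 table does not know.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser understands without a DTD.
XML_PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace named HTML entities with numeric character references."""

    def repl(match):
        name = match.group(1)
        # Keep XML's own entities and anything we cannot look up.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY.sub(repl, text)


print(named_to_numeric("a &copy; b &amp; c &lt; d"))
```

Leaving unknown names untouched means a typo such as a misspelled entity would still reach the parser and fail there, so a real client would probably want to report it to the user instead.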
It would seem that this is really a usability issue for the new admin
client - ie how to make it easy for users to enter symbols such as the
copyright symbol?
Should the admin client handle this at all, or leave it to the platform
to deal with (eg, in Windows you could enter (c) by typing Alt+0169, or
by using Character Map)?
What do people think? Provide buttons/dropdowns etc for entering
special symbols; allow typing HTML entities but convert straight to the
character code equivalent? Other suggestions?
Steve
-----Original Message-----
From: Bill Branan [mailto:[email protected]]
Sent: 05 August 2009 15:12
To: Peter Cliff
Cc: Richard Green; Fedora Commons Developers
Subject: Re: [Fedora-commons-developers] New admin client
Hi Pete,
I believe that you're correct in that the entity definitions for
these characters are just not included, so when the XML is processed
during the add/modify datastream calls the parsing fails. I've added an
issue in JIRA for this:
http://fedora-commons.org/jira/browse/FCREPO-520.
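The failure mode can be reproduced outside Fedora with any non-validating XML parser. The Python sketch below is only an illustration under that assumption (the element name and the helper parses are invented, and this is not the Fedora code path): a named HTML entity with no declaration is rejected, while the numeric reference for the same character parses.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is well-formed for the standard parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# &copy; is an HTML entity; XML only predefines lt, gt, amp, apos, quot.
named = "<description>&copy; 2009</description>"
numeric = "<description>&#169; 2009</description>"

print(parses(named))
print(parses(numeric))
print(ET.fromstring(numeric).text)
```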
Bill
On Wed, Aug 5, 2009 at 6:04 AM, Peter Cliff <[email protected]> wrote:
Possibly not relevant at all - having not tried to enter & anything into
the new admin client! ;-) - but (I expect you know) you need to define
entities with names (&copy;) etc.
See:
http://www.xml.com/pub/a/98/08/xmlqna2.html
http://www.tizag.com/xmlTutorial/xmlentity.php
So my guess is that somewhere some XML parsing/creating is happening
behind the scenes of that client and it is throwing the whole thing off
when the XML processing fails on account of an undefined entity?
I couldn't find any entity definitions for the HTML named ones in the
src/xsd/ (aside from the reference in xhtml1-strict.xsd). Do there need
to be some?
Hope that is useful and not teaching either of you to suck eggs! ;-)
Pete Cliff
OULS