I'm going to chime in here as I've also recently been working with an
issue related to (read: learning about) html content within dc tags
and the rendering thereof within Manakin (cocoon).
Antonio, you and I are in the same situation. We both have "html" data
(actually stored with entity references) within our databases that we
need our end users' browsers to render as html. The trouble is that
there is only one "parsing" step between our databases and what the
user sees on his/her browser (that step would be the browser engine's
actual parsing of the content, if I haven't missed my mark).
Parsing: transforms "&amp;lt;" to "&lt;" and "&lt;" to interpretation as an
actual node (<node/>).
Serializing: transforms nodes into "&lt;" and "&lt;" into "&amp;lt;".
Two steps each way.
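To make the two levels concrete, here is a rough sketch in Python (not DSpace or Cocoon code). It assumes the page source contains the markup as entity references, and it uses the standard library's xml.sax.saxutils and ElementTree as stand-ins for the real parser and serializer:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Assumed page source: the markup arrives as entity references.
received = "&lt;h3&gt;Ciao&lt;/h3&gt;"

# Parsing, step 1: entity references become literal characters.
# The result is still plain text, which is what the user ends up seeing.
as_text = unescape(received)

# Parsing, step 2: the text is interpreted as an actual node.
node = ET.fromstring(as_text)

# Serializing runs the same two steps in reverse.
back_to_text = ET.tostring(node, encoding="unicode")
back_to_entities = escape(back_to_text)

print(as_text)
print(node.tag, node.text)
print(back_to_entities == received)
```

With only step 1 available, the value never gets past the plain-text stage, which is the situation described in this thread.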
Since our source consists of entity references and there is only one parsing
step, it means that in order to have the browser understand our
intention of outputting "nodes", we need to add another parsing step.
Since DSpace uses Cocoon and Cocoon uses Xalan as its XSLT processor (Xerces
is only the XML parser), that means EXSLT (at least, that's the only
extensions package I'm aware of for it).
I think there are two potential ways to go about this.
1) dyn:evaluate()
This would function like Saxon's saxon:parse() (I think). This is
probably not the way to go unless it's the only option, as it can get
to be a very expensive operation fairly quickly.
2) str:replace()
This is probably the way to go, but it might require adding a
transformer (step 2.5) to the theme's sitemap.xmap to replace the
entity references with actual symbols before it goes any further down
the chain. It's possible that a transformer wouldn't be necessary and
you could just add it in the xslt stylesheet, but I think the
transformer might keep things simpler.
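The replace-then-parse idea can be sketched outside XSLT as follows. This is only an illustration in Python under stated assumptions: decode_entities and parse_fragment are hypothetical names, the entity list is deliberately minimal, and wrapping the fragment in a <div> follows Mark's suggestion further down the thread. It is not what an EXSLT str:replace() call or a Cocoon transformer would literally look like.

```python
import xml.etree.ElementTree as ET

# "&amp;" is handled last. Handling it first would turn "&amp;lt;" into
# "&lt;" and then into "<", decoding two levels in a single pass.
REPLACEMENTS = [("&lt;", "<"), ("&gt;", ">"), ("&quot;", '"'), ("&amp;", "&")]


def decode_entities(text):
    """Replace entity references with the symbols they stand for (one level)."""
    for entity, symbol in REPLACEMENTS:
        text = text.replace(entity, symbol)
    return text


def parse_fragment(decoded):
    """Return a <div> element wrapping the fragment, or None if the
    fragment is not well-formed, so the caller can fall back to plain text."""
    try:
        return ET.fromstring("<div>" + decoded + "</div>")
    except ET.ParseError:
        return None


good = decode_entities("&lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever&lt;/p&gt;")
bad = decode_entities("&lt;p&gt;for ever &lt;p&gt;")  # unclosed <p>, as in the sample data

print([child.tag for child in parse_fragment(good)])
print(parse_fragment(bad))
```

The fallback matters in an XML pipeline: a fragment such as the unclosed <p> above cannot be turned into nodes, so it has to stay escaped text or be repaired first.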
Of course the third option would be to not have html data using entity
references within the database, but for you that presents a security
risk and for me it's just content I have very little control over.
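On the security side, one possible mitigation is to decode only fragments that pass an allow-list check. The sketch below is a Python illustration, not a DSpace feature: is_safe_fragment and ALLOWED_TAGS are hypothetical, and the tag list is an assumption chosen to match the examples in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list; adjust to whatever markup the descriptions need.
ALLOWED_TAGS = {"div", "p", "h3", "br"}


def is_safe_fragment(markup):
    """Accept a fragment only if it is well-formed, uses allowed tags,
    and carries no attributes (which rules out event handlers and URLs)."""
    try:
        root = ET.fromstring("<div>" + markup + "</div>")
    except ET.ParseError:
        return False
    return all(el.tag in ALLOWED_TAGS and not el.attrib for el in root.iter())


print(is_safe_fragment("<h3>Ciao</h3><br/><p>for ever</p>"))
print(is_safe_fragment("<script>alert(1)</script>"))
print(is_safe_fragment('<p onclick="x()">hi</p>'))
```

Anything that fails the check would simply keep being displayed as escaped text, which is the current behavior.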
Other than that, if anyone has any further comments on this issue or
parsing/serializing as it relates to cocoon/dspace, the feedback would
be appreciated!
- Patrick
P.S. - Antonio, there's a good chance that I'll be exploring the
options I listed above over the next couple of weeks, but I'll be out
most of this week. If you like, I can keep you informed.
On Jul 13, 2009, at 10:07 AM, Antonio Cuomo wrote:
Dear Mark, this is the common behavior in all the DSpace installations
I have seen (MIT included).
The problem is that all the data in the dc.description field is
saved as plain text for security reasons,
so the data must be converted back to HTML before being pushed to
the UI.
So, do you know which Java class retrieves the information
from the database and passes it to the UI?
Thank you very much
Antonio
On Mon, Jul 13, 2009 at 9:16 AM, Mark Diggory <mdigg...@atmire.com> wrote:
Antonio,
It is unclear why your case is not working. I can assure you that a
default installation of DSpace Manakin XMLUI will allow you to place
html in the form fields for any Community/Collection and that will
render as HTML in the Community/collection views without being
escaped; this is expected behavior. It shouldn't require altering the
xslt templates to correct for your problem; there is apparently
something else going wrong with your installation. HTML escaping is
not used when the content is stored in the database, it is stored as
plain old unescaped html text. I suspect that there is something
concerning your environment different from a typical default
installation running on tomcat/linux that is giving rise to this
problem.
> the "problem" that the data inside the database are saved as text and
> special character are added to avoid SQL-injection
Are you running some sort of SQL-injection filtering in front of
DSpace?
Mark
--
Mark R. Diggory
@mire - http://www.atmire.com
On Sat, Jul 11, 2009 at 3:33 AM, Antonio Cuomo <anto...@parliaments.info> wrote:
> Dear Mark, thank you for your reply,
>
> unfortunately it didn't work.
>
> The problem is:
> Cocoon, through a Java class, takes the data from the database and passes
> it directly to Manakin without any change.
>
> Manakin builds the layout, then sends it to the browser with the data
> Cocoon passed.
>
>
> the "problem" that the data inside the database are saved as text and
> special character are added to avoid SQL-injection
> so if i write:
> <h3> Ciao <h3> <br/> <p> forever</p>
>
> the data appears in the database in this way
>
> &lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever &lt;p&gt;
>
> and so what the browser receives is
>
> &lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever &lt;p&gt;
>
> which is displayed as
> <h3> Ciao <h3> <br/> <p> forever</p>
>
> what I need is to decode these special characters &lt; and &gt;
> into < and >
>
> to do this I can try to modify the XSL DIM-Handler (do you know how?)
>
> or the Java Cocoon class (do you know which one and how?)
>
> Thank you very much
> Antonio
>
>
> On Sat, Jul 11, 2009 at 3:02 AM, Mark Diggory <mdigg...@atmire.com> wrote:
>>
>> Use well-formed XML here and try to wrap content with a <div> or <p>
>> tag and it should work better for you. You shouldn't require
>> alteration of the xslt for this.
>>
>> <div>
>> <h3> hello </h3>
>> <p>it is a description </p>
>> </div>
>>
>> Mark
>>
>> --
>> Mark R. Diggory
>> @mire - http://www.atmire.com
>>
>> 2009/7/10 Antonio Cuomo <anto...@parliaments.info>:
>> > Dear DSpace developers/users,
>> >
>> > I have a question:
>> >
>> > I have some HTML code in my database in the description field; of course
>> > the HTML has been transformed into plain text,
>> > so the database entry is:
>> > &lt;h3&gt; hello &lt;/h3&gt; &lt;/br&gt;&lt;p&gt;it is a description &lt;p&gt;
>> >
>> > when DSpace shows the database content it actually shows the text:
>> > <h3> hello </h3> </br><p>it is a description <p>
>> >
>> > while I would like to see the HTML result instead:
>> >
>> > hello
>> > it is a description
>> >
>> >
>> >
>> > How can I do it?
>> >
>> > I see two possibilities:
>> >
>> > - Overwrite the Java class that takes data from the database and sends
>> > it to Manakin, in order to decode the HTML
>> >
>> >
>> > - Work at the Manakin level (but that seems much more complicated to
>> > me): in the file DIM-Handler.xsl
>> >
>> > <xsl:if test="dim:fie...@element='description' and not(@qualifier)]">
>> > ...
>> > <xsl:copy-of select="./node()"/> <-- call some html decoder here
>> > ...
>> > </xsl:if>
>> >
>> >
>> > I'm sure I'm not the first one who has had this need... and I can see some
>> > security issues with the solution.
>> > Can somebody give me some pointers or "a solution"?
>> >
>> > Thank you very much
>> > Antonio
>> >
>> >
>> > _______________________________________________
>> > Dspace-general mailing list
>> > dspace-gene...@mit.edu
>> > http://mailman.mit.edu/mailman/listinfo/dspace-general
>> >
>> >
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Enter the BlackBerry Developer Challenge
>> This is your chance to win up to $100,000 in prizes! For a limited time,
>> vendors submitting new applications to BlackBerry App World(TM) will have
>> the opportunity to enter the BlackBerry Developer Challenge. See full prize
>> details at: http://p.sf.net/sfu/Challenge
>> _______________________________________________
>> Dspace-devel mailing list
>> dspace-de...@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/dspace-devel
>
>
---
Patrick K. Étienne
Systems Analyst
Digital Library Development
Library and Information Center
Georgia Institute of Technology
email: patrick.etie...@library.gatech.edu
phone: 404.385.8121
"Mediocre Writers Borrow; Great Writers Steal" - T.S. Eliot
_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech