Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-11 Thread Antonio Cuomo
Dear Mark, thank you for your reply.

Unfortunately it didn't work.

The problem is:
Cocoon, through a Java class, takes the data from the database and passes
them directly to Manakin without any change.

Manakin builds the layout and then sends it to the browser with the data
Cocoon passed.

The "problem" is that the data inside the database are saved as text, and
special characters are added to avoid SQL injection.
So if I write:
 <h3>Ciao</h3> <br/> <p>for ever <p>

the data appear in the database this way:

 &lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever &lt;p&gt;

and so what the browser receives is

 &lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever &lt;p&gt;

which is displayed as
 <h3>Ciao</h3> <br/> <p>for ever <p>

What I need is to decode the special characters &lt; and &gt; into < and >.

To do this I can try to modify the XSL DIM-Handler (do you know how?)

or the Java Cocoon class (do you know which one and how?)

Thank you very much
Antonio

On Sat, Jul 11, 2009 at 3:02 AM, Mark Diggory wrote:
> Use well formed xml here and try to wrap content with a  or 
> tag and it should work better for you. You shouldn't require
> alteration of the xslt for this.
>
> 
>  hello 
> it is a description 
> 
>
> Mark
>
> --
> Mark R. Diggory
> @mire - http://www.atmire.com
>
> 2009/7/10 Antonio Cuomo :
> > Dear DSpace developers/users,
> >
> > I have a question:
> >
> > I have some HTML code in my database in the description field; of
> > course the HTML has been transformed into plain text.
> > So the database entry is:
> >  hello  it is a description 
> >
> > When DSpace shows the database content it actually shows the text:
> >  hello  it is a description 
> >
> > while I would like to see the HTML result instead:
> >
> > hello
> > it is a description
> >
> > How can I do it?
> >
> > I see two possibilities:
> >
> > - Override the Java class that takes data from the database and sends
> > them to Manakin, in order to decode the HTML
> >
> > - Work at the Manakin level (but it seems much more complicated to
> > me): in the file DIM-Handler.xsl
> >
> >   
> >     ...
> >             <-- call some html decoder here
> >         ...
> >     
> >
> > I'm sure I'm not the first one who has had this need... and I can see
> > some security issues with the solution.
> > Can somebody give me some indication or "a solution"?
> >
> > Thank you very much
> > Antonio
> >
> > ___
> > Dspace-general mailing list
> > dspace-gene...@mit.edu
> > http://mailman.mit.edu/mailman/listinfo/dspace-general
>
> ___
> Dspace-devel mailing list
> dspace-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-devel

--
Enter the BlackBerry Developer Challenge
This is your chance to win up to $100,000 in prizes! For a limited time,
vendors submitting new applications to BlackBerry App World(TM) will have
the opportunity to enter the BlackBerry Developer Challenge. See full prize
details at: http://p.sf.net/sfu/Challenge
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-12 Thread Mark Diggory
Antonio,

It is unclear why your case is not working.  I can assure you that a
default installation of DSpace Manakin XMLUI will allow you to place
HTML in the form fields for any Community/Collection, and that it will
render as HTML in the Community/Collection views without being
escaped; this is expected behavior.  It shouldn't require altering the
XSLT templates to correct your problem; there is apparently
something else going wrong with your installation.  HTML escaping is
not used when the content is stored in the database; it is stored as
plain old unescaped HTML text.  I suspect that something in your
environment differs from a typical default installation running on
Tomcat/Linux and is giving rise to this problem.

> the "problem" that the data inside the database are saved as text and
> special character are added to avoid SQL-injection

Are you running some sort of SQL-injection filtering in front of DSpace?

Mark

-- 
Mark R. Diggory
@mire - http://www.atmire.com



Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-13 Thread Antonio Cuomo
Dear Mark, it's the common behavior in all the DSpace installations I have
seen (MIT included).

The problem is that all the data in the dc.description field are saved as
plain text for security reasons.

So the data must be converted back to HTML before being pushed to the UI.

So, do you know which Java class retrieves the information from
the database and passes it to the UI?

thank you very much
Antonio





Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-13 Thread Lewatle Phaladi
Hi guys, I am very new to DSpace. I am using Ubuntu 9.04, and my aim is to
install DSpace 1.5.2 on an Ubuntu desktop box. Please help me; I tried the
online manual without success.



Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-13 Thread Patrick K. Etienne
I'm going to chime in here as I've also recently been working with an  
issue related to (read: learning about) html content within dc tags  
and the rendering thereof within Manakin (cocoon).


Antonio, you and I are in the same situation. We both have "html" data  
(actually stored with entity references) within our databases that we  
need our end users' browsers to render as html. The trouble is that  
there is only one "parsing" step between our databases and what the  
user sees on his/her browser (that step would be the browser engine's  
actual parsing of the content, if I haven't missed my mark).


Parsing: transforms "&lt;" to "<" and "<" to interpretation as an
actual node ().

Serializing: transforms nodes into "<" and "<" into "&lt;".
Two steps each way.

Since our source consists of entity references and there is only one
parsing step, it means that in order to have the browser understand our
intention of outputting "nodes", we need to add another parsing step.
Since DSpace uses Cocoon and Cocoon uses Xalan for XSLT, that means EXSLT
(at least, that's the only extensions package I'm aware of for Xalan).
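
To make the two steps concrete, here is a minimal Python sketch (not DSpace or Cocoon code) of one stored value going through both parsing steps and back again. The sample value is the one from earlier in this thread; the variable names and the use of Python's standard `html` and `xml.etree` modules are assumptions made purely for illustration.

```python
from html import escape, unescape
import xml.etree.ElementTree as ET

# Value as held in the database: markup kept as entity references.
stored = "&lt;h3&gt;Ciao&lt;/h3&gt;"

# First parsing step: entity references become literal characters.
# This is all the browser does today, so the tags show up as text.
as_text = unescape(stored)

# Second parsing step: the literal characters are read as a real node.
node = ET.fromstring(as_text)

# Serializing reverses both steps: node to characters, characters to entities.
serialized = ET.tostring(node, encoding="unicode")
re_escaped = escape(serialized, quote=False)
```

The extra step discussed here corresponds to the `ET.fromstring` call: without it the value never becomes more than text.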


I think there are two potential ways to go about this.

1)  dyn:evaluate()
This would function like Saxon's saxon:parse() (I think). This is  
probably not the way to go unless it's the only option, as it can get  
to be a very expensive operation fairly quickly.


2) str:replace()
This is probably the way to go, but it might require adding a  
transformer (step 2.5) to the theme's sitemap.xmap to replace the  
entity references with actual symbols before it goes any further down  
the chain. It's possible that a transformer wouldn't be necessary and  
you could just add it in the xslt stylesheet, but I think the  
transformer might keep things more simplified.


Of course the third option would be to not have html data using entity  
references within the database, but for you that presents a security  
risk and for me it's just content I have very little control over.
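
For what it's worth, the replace idea in option 2 can be sketched outside XSLT as well. The Python below is only an illustration of the technique, not an EXSLT call and not anything shipped with DSpace; the function name `decode_basic_entities` is hypothetical. The one detail worth noting is the order: `&amp;` is replaced last, so a value that was escaped twice is decoded by exactly one level.

```python
def decode_basic_entities(value: str) -> str:
    """Replace the basic entity references with the characters they stand for.

    "&amp;" goes last on purpose: replacing it first would turn "&amp;lt;"
    into "&lt;" and then into "<", decoding two levels instead of one.
    """
    replacements = (
        ("&lt;", "<"),
        ("&gt;", ">"),
        ("&quot;", '"'),
        ("&amp;", "&"),
    )
    for entity, char in replacements:
        value = value.replace(entity, char)
    return value


decoded = decode_basic_entities("&lt;p&gt;for ever&lt;/p&gt;")
one_level_only = decode_basic_entities("AT&amp;T &amp;lt;")
```

Whatever language does the replacing, the result is markup that still has to be well formed before anything downstream parses it.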


Other than that, if anyone has any further comments on this issue or  
parsing/serializing as it relates to cocoon/dspace, the feedback would  
be appreciated!


 - Patrick

P.S. - Antonio, there's a good chance that I'll be exploring the
options I listed above over the next couple of weeks, but I'll be out
most of this week. If you like, I can keep you informed.





Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-13 Thread Mark Diggory
On Mon, Jul 13, 2009 at 7:07 AM, Antonio Cuomo wrote:
> Dear Mark, it's the common behavior with all the DSpace installation i have
> seen (MIT included).
>
> The problem is that all the data in the field dc.description are saved as
> plain text for security issues.

I understand now, this is a discussion about Item metadata fields, not
Community Collection descriptions where html content is allowed. I
agree with Alan's assessment here:

1.) I really do not advise placing HTML content into metadata fields.
This will cause much difficulty downstream in the application when
those fields are rendered into things like  fields, OAI records and
other XML-centric serializations.

2.) Placing html into metadata fields suggests that they are more than
content, but also presentation. Overall this is a very bad practice
and I do not recommend doing it.

If you do feel it necessary to do this you might try some of
Patrick's suggestions, but I will heavily caution that if the user
inputs ill-formed xml, it will break the rendering pipeline and result
in a 500 error page being rendered. The concern here is that the field
value is parsed into the sax stream before the i18n and serialization
transformations occur and thus needs to be well formed for those
stages to occur.
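
A minimal sketch of that caution, in Python rather than in the Cocoon pipeline: check that a field value parses as a well-formed fragment before treating it as markup, and fall back to escaped text if it does not. The function name `safe_fragment` and the `wrapper` element are hypothetical, and falling back to escaping is an assumption of this sketch; it does not repair the markup.

```python
import xml.etree.ElementTree as ET
from html import escape


def safe_fragment(value: str) -> str:
    """Return value unchanged if it is a well-formed XML fragment,
    otherwise return it escaped so it can only ever be shown as text."""
    try:
        # A fragment may have several top-level nodes, so wrap it in a
        # single root element before handing it to the parser.
        ET.fromstring(f"<wrapper>{value}</wrapper>")
    except ET.ParseError:
        return escape(value, quote=False)
    return value


well_formed = safe_fragment("<p>for ever</p>")
ill_formed = safe_fragment("<h3>Ciao</h3> <br/> <p>for ever <p>")
```

The second value is the example from earlier in the thread; its paragraph tags are never closed, so it comes back escaped instead of breaking the parse.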

Another alternative might be to look at using JTidy to clean up the
value prior to having Saxon or Xalan parse it. See for instance

http://scm.dspace.org/svn/repo/modules/dspace-rdf/trunk/src/main/java/org/dspace/adapters/rdf/DSpaceObjectAdapter.java

Mark

-- 
Mark R. Diggory
@mire - http://www.atmire.com



Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-13 Thread Patrick K. Etienne

Mark,

Many thanks for your comments!

And to the audience at large, for what it's worth, I'm certainly in  
agreement with Mark and Alan concerning the use of html within  
metadata fields. I believe the only reason we have this going on in  
our repository is due to a messy recovery we had to manage after one  
of our servers was compromised. I'm aiming to have our data cleaned up  
so we no longer have html content in metadata fields. If all goes  
well, we'll be back to best practices rather than messy hacks.


 - Patrick



---
Patrick K. Étienne
Systems Analyst
Digital Library Development
Library and Information Center
Georgia Institute of Technology
email: patrick.etie...@library.gatech.edu
phone: 404.385.8121

"Mediocre Writers Borrow; Great Writers Steal" - T.S. Eliot



Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-14 Thread Mark H. Wood
On the one hand I tend to agree that markup shouldn't be part of
metadata.

On the other, if there are places where DSpace needs to write out,
say, XML with metadata as attribute values or element content then it
should be encoding it properly no matter what those values are.

And, strangely, I seem to have a third hand: dc.description.abstract
and its brethren in other schemae.  Now, *I* think that an abstract
should be a single brief block of text -- three, perhaps four
sentences.  I've been shown many times that others disagree:  they
write three or four *paragraphs*, often with various kinds of emphasis
or even complex layout expectations.  We've compromised by accepting
paragraph breaks (since we have one third-party customized DSpace that
already has code to turn "\n\n" into "" and ensure that it's
wrapped properly).
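
As a rough illustration of that compromise (this is not the code from the customized DSpace mentioned above), a blank-line-to-paragraph conversion might look like the Python below. It assumes the breaks are rendered as `<p>` elements, which the message does not actually state, and the function name `abstract_to_html` is hypothetical. Each paragraph is escaped first, so the only markup in the output is what the function itself adds.

```python
import re
from html import escape


def abstract_to_html(text: str) -> str:
    """Turn blank-line-separated plain text into escaped <p> blocks."""
    # Split on one or more blank lines and drop empty pieces.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Escape each paragraph so stray "<" or "&" cannot break the output.
    return "\n".join(f"<p>{escape(p, quote=False)}</p>" for p in paragraphs)


rendered = abstract_to_html("First a < b.\n\nSecond.")
```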

Abstracts seem to be the one exception that tests the rule.

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Friends don't let friends publish revisable-form documents.




Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-14 Thread Sarah L. Shreeves
As someone who has spent a lot of time talking with folks about 
'shareable' metadata, I just want to second Mark's comment that we 
really should not be putting html content into metadata fields. If you 
want a feel for what this looks like downstream, do a search on  (or 
other html tags of your choice) in something like OAIster 
(http://www.oaister.org/). Downstream applications that reuse your 
metadata generally ignore the markup but still keep it in the metadata. 
We've also seen examples like:

GOPHERUS POLYPHEMUS (Gopher Tortoise) 
COYOTE PREDATION

where creators have tried to not include actual html markup, but in the 
end cause even more confusion for applications using the metadata (not 
to mention users who are accessing these).

Sarah Shreeves

-- 
Sarah L. Shreeves
Coordinator, IDEALS
http://www.ideals.uiuc.edu/
University of Illinois at Urbana-Champaign
sshre...@illinois.edu
217-244-3877 or 217-333-4648






Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-15 Thread Andrew Marlow
FWIW, I have seen lengthy and complex abstracts with a lot of markup
including math and/or chemistry markup, including embedded LaTeX.

On Tue, Jul 14, 2009 at 1:39 PM, Mark H. Wood  wrote:

>   Now, *I* think that an abstract
> should be a single brief block of text -- three, perhaps four
> sentences.  I've been shown many times that others disagree:  they
> write three or four *paragraphs*, often with various kinds of emphasis
> or even complex layout expectations.

-- 
Regards,

Andrew M.
http://www.andrewpetermarlow.co.uk


Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data

2009-07-16 Thread Antonio Cuomo
Yes guys, you are right, but not being able to use bold characters,
bulleted lists, etcetera sometimes makes it hard to write a good description.
