Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
Dear Mark, thank you for your reply. Unfortunately it didn't work.

The problem is this: Cocoon, through a Java class, takes the data from the database and passes it directly to Manakin without any change. Manakin builds the layout and sends it to the browser with the data Cocoon passed.

The "problem" is that the data inside the database is saved as text, and special characters are added to avoid SQL injection. So if I write:

<h3>Ciao</h3> <br/> <p>for ever <p>

the data appears in the database this way:

&lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever &lt;p&gt;

and so what the browser receives is:

&lt;h3&gt;Ciao&lt;/h3&gt; &lt;br/&gt; &lt;p&gt;for ever &lt;p&gt;

which is displayed as the literal text:

<h3>Ciao</h3> <br/> <p>for ever <p>

What I need is to decode the special characters &lt; and &gt; into < and >. To do this I can try to modify the XSL DIM-Handler (do you know how?) or the Java Cocoon class (do you know which one, and how?).

Thank you very much
Antonio

On Sat, Jul 11, 2009 at 3:02 AM, Mark Diggory wrote:
> Use well formed xml here and try to wrap content with a or
> tag and it should work better for you. You shouldn't require
> alteration of the xslt for this.
>
> hello
> it is a description
>
> Mark
>
> --
> Mark R. Diggory
> @mire - http://www.atmire.com
>
> 2009/7/10 Antonio Cuomo :
> > Dear DSpace developers/users,
> >
> > I have a question:
> >
> > I have some HTML code in my database in the description field; of course
> > the HTML has been transformed into plain text.
> > So the database entry is:
> > hello it is a description
> >
> > When DSpace shows the database content it actually shows the text:
> > hello it is a description
> >
> > while I would like to see the HTML result instead:
> >
> > hello
> > it is a description
> >
> > How can I do it?
> >
> > I see two possibilities:
> >
> > - Overwrite the Java class that takes data from the database and sends
> > it to Manakin, in order to decode the HTML
> >
> > - Work at the Manakin level (but it seems to me much more complicated):
> > in the file DIM-Handler.xsl
> >
> > ...
> > <-- call some html decoder here
> > ...
> >
> > I'm sure I'm not the first one who has had this need... and I can see
> > some security issues with the solution.
> > Can somebody give me some indication or "a solution"?
> >
> > Thank you very much
> > Antonio

--
Enter the BlackBerry Developer Challenge
This is your chance to win up to $100,000 in prizes! For a limited time, vendors submitting new applications to BlackBerry App World(TM) will have the opportunity to enter the BlackBerry Developer Challenge. See full prize details at: http://p.sf.net/sfu/Challenge
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
Antonio,

It is unclear why your case is not working. I can assure you that a default installation of DSpace Manakin XMLUI will allow you to place HTML in the form fields for any Community/Collection, and that it will render as HTML in the Community/Collection views without being escaped. This is expected behavior. It shouldn't require altering the XSLT templates to correct your problem; there is apparently something else going wrong with your installation.

HTML escaping is not used when the content is stored in the database; it is stored as plain old unescaped HTML text. I suspect that something in your environment differs from a typical default installation running on Tomcat/Linux and is giving rise to this problem.

> the "problem" that the data inside the database are saved as text and
> special character are added to avoid SQL-injection

Are you running some sort of SQL-injection filtering in front of DSpace?

Mark

--
Mark R. Diggory
@mire - http://www.atmire.com
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
Dear Mark,

it's the common behavior of all the DSpace installations I have seen (MIT included).

The problem is that all the data in the field dc.description is saved as plain text for security reasons, so the data must be converted back to HTML before being pushed to the UI.

So, do you know which Java class retrieves the information from the database and passes it to the UI?

Thank you very much
Antonio
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
Hi guys,

I am very new to DSpace. I am using Ubuntu 9.4, and my aim is to install DSpace 1.5.2 on an Ubuntu desktop box. Please help me; I tried the online manual without success.
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
I'm going to chime in here, as I've also recently been working with an issue related to (read: learning about) HTML content within dc tags and the rendering thereof within Manakin (Cocoon).

Antonio, you and I are in the same situation. We both have "html" data (actually stored with entity references) within our databases that we need our end users' browsers to render as HTML. The trouble is that there is only one "parsing" step between our databases and what the user sees in his/her browser (that step would be the browser engine's actual parsing of the content, if I haven't missed my mark).

Parsing: transforms "&lt;" to "<" and "<" to interpretation as an actual node.
Serializing: transforms nodes into "<" and "<" into "&lt;".

Two steps each way. Since our source consists of entity references and there is only one parsing step, in order to have the browser understand our intention of outputting "nodes" we need to add another parsing step. Since DSpace uses Cocoon and Cocoon uses Xerces, that means EXSLT (at least, that's the only extensions package I'm aware of for Xerces). I think there are two potential ways to go about this.

1) dyn:evaluate()
This would function like Saxon's saxon:parse() (I think). This is probably not the way to go unless it's the only option, as it can get to be a very expensive operation fairly quickly.

2) str:replace()
This is probably the way to go, but it might require adding a transformer (step 2.5) to the theme's sitemap.xmap to replace the entity references with actual symbols before the data goes any further down the chain. It's possible that a transformer wouldn't be necessary and you could just add it in the XSLT stylesheet, but I think the transformer might keep things simpler.

Of course the third option would be to not have HTML data using entity references within the database, but for you that presents a security risk and for me it's just content I have very little control over.
Other than that, if anyone has any further comments on this issue or on parsing/serializing as it relates to Cocoon/DSpace, the feedback would be appreciated!

- Patrick

P.S. - Antonio, there's a good chance that I'll be exploring the options I listed above over the next couple of weeks, but I'll be out most of this week. If you like, I can keep you informed.
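
The extra decoding step Patrick describes can be illustrated outside of Cocoon. The following is only a minimal sketch in Python, not DSpace code: the function name `decode_entities` and the sample value are assumptions made for illustration, and in a real Manakin theme the equivalent step would live in XSLT or a Java transformer. It shows that one decoding pass turns stored entity references back into markup characters, and that serializing reverses it.

```python
import html


def decode_entities(value: str) -> str:
    """Replace entity references such as &lt; and &gt; with the characters they name."""
    return html.unescape(value)


# Hypothetical field value, stored with entity references as described in the thread.
stored = "&lt;h3&gt;Ciao&lt;/h3&gt; &lt;p&gt;for ever&lt;/p&gt;"

decoded = decode_entities(stored)
print(decoded)  # <h3>Ciao</h3> <p>for ever</p>

# Serializing goes the other way: markup characters become entity references again.
print(html.escape(decoded) == stored)  # True
```

Decoding untrusted metadata this way hands real markup to the browser, which is exactly the security concern raised in the thread, so the result would still need sanitizing before output.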
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
On Mon, Jul 13, 2009 at 7:07 AM, Antonio Cuomo wrote:
> Dear Mark, it's the common behavior with all the DSpace installation i have
> seen (MIT included).
>
> The problem is that all the data in the field dc.description are saved as
> plain text for security issues.

I understand now: this is a discussion about Item metadata fields, not Community/Collection descriptions, where HTML content is allowed. I agree with Alan's assessment here:

1.) I really do not advise placing HTML content into metadata fields. This will cause much difficulty downstream in the application when those fields are rendered into things like fields, OAI records and other XML-centric serializations.

2.) Placing HTML into metadata fields suggests that they are more than content, but also presentation. Overall this is a very bad practice and I do not recommend doing it.

If you do feel it necessary to do this, you might try some of Patrick's suggestions, but I will heavily caution that if the user inputs ill-formed XML, it will break the rendering pipeline and result in a 500 error page being rendered. The concern here is that the field value is parsed into the SAX stream before the i18n and serialization transformations occur, and thus needs to be well formed for those stages to occur.

Another alternative might be to look at using JTidy to clean up the value prior to having Saxon or Xalan parse it. See for instance

http://scm.dspace.org/svn/repo/modules/dspace-rdf/trunk/src/main/java/org/dspace/adapters/rdf/DSpaceObjectAdapter.java

Mark

--
Mark R. Diggory
@mire - http://www.atmire.com
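
The well-formedness concern can be checked before a value is ever handed to the rendering pipeline. The sketch below is an illustration in Python rather than DSpace or JTidy code; the helper name `is_well_formed_fragment` and the `<wrapper>` root element are assumptions. It only detects ill-formed input so that a caller could fall back to plain-text display; it does not repair the markup the way JTidy would.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a single root element."""
    try:
        # A metadata value may hold several sibling elements, so wrap it in one root.
        ET.fromstring("<wrapper>" + fragment + "</wrapper>")
    except ET.ParseError:
        return False
    return True


print(is_well_formed_fragment("<h3>Ciao</h3> <br/> <p>for ever</p>"))  # True
# The second <p> is never closed, so this value would break an XML pipeline.
print(is_well_formed_fragment("<h3>Ciao</h3> <br/> <p>for ever <p>"))  # False
```

Note that the sample value quoted earlier in the thread ends with an unclosed `<p>`, which is the kind of input that would trigger the error page described above.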
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
Mark,

Many thanks for your comments!

And to the audience at large, for what it's worth, I'm certainly in agreement with Mark and Alan concerning the use of HTML within metadata fields. I believe the only reason we have this going on in our repository is a messy recovery we had to manage after one of our servers was compromised. I'm aiming to have our data cleaned up so we no longer have HTML content in metadata fields. If all goes well, we'll be back to best practices rather than messy hacks.

- Patrick

---
Patrick K. Étienne
Systems Analyst
Digital Library Development
Library and Information Center
Georgia Institute of Technology
email: patrick.etie...@library.gatech.edu
phone: 404.385.8121

"Mediocre Writers Borrow; Great Writers Steal" - T.S. Eliot
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
On the one hand I tend to agree that markup shouldn't be part of metadata. On the other, if there are places where DSpace needs to write out, say, XML with metadata as attribute values or element content, then it should be encoding it properly no matter what those values are.

And, strangely, I seem to have a third hand: dc.description.abstract and its brethren in other schemae. Now, *I* think that an abstract should be a single brief block of text -- three, perhaps four sentences. I've been shown many times that others disagree: they write three or four *paragraphs*, often with various kinds of emphasis or even complex layout expectations. We've compromised by accepting paragraph breaks (since we have one third-party customized DSpace that already has code to turn "\n\n" into "" and ensure that it's wrapped properly).

Abstracts seem to be the one exception that tests the rule.

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Friends don't let friends publish revisable-form documents.
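
The paragraph-break compromise can be sketched as follows. This is a hypothetical Python illustration, not the customized DSpace code mentioned above: the function name `paragraphs_to_html` is invented, and the use of `<p>` elements is an assumption, since the replacement markup did not survive in the archived message. The text is escaped first, so the only markup in the output is what the function itself adds.

```python
import html
import re


def paragraphs_to_html(text: str) -> str:
    """Escape the text, then wrap each blank-line-separated block in <p>...</p>."""
    # Split on blank lines (two newlines, possibly with whitespace in between).
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    return "".join("<p>" + html.escape(b) + "</p>" for b in blocks)


abstract = "First paragraph with a < sign.\n\nSecond paragraph."
print(paragraphs_to_html(abstract))
# <p>First paragraph with a &lt; sign.</p><p>Second paragraph.</p>
```

Because the stored value stays plain text, downstream consumers such as OAI harvesters still receive metadata without markup; only the display layer adds the paragraph elements.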
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
As someone who has spent a lot of time talking with folks about 'shareable' metadata, I just want to second Mark's comment that we really should not be putting HTML content into metadata fields. If you want a feel for what this looks like downstream, do a search on (or other html tags of your choice) in something like OAIster (http://www.oaister.org/). Downstream applications that reuse your metadata generally ignore the markup but still keep it in the metadata.

We've also seen examples like:

GOPHERUS POLYPHEMUS (Gopher Tortoise) COYOTE PREDATION

where creators have tried not to include actual HTML markup, but in the end cause even more confusion for applications using the metadata (not to mention users who are accessing it).

Sarah Shreeves

--
Sarah L. Shreeves
Coordinator, IDEALS
http://www.ideals.uiuc.edu/
University of Illinois at Urbana-Champaign
sshre...@illinois.edu
217-244-3877 or 217-333-4648
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
FWIW, I have seen lengthy and complex abstracts with a lot of markup, including math and/or chemistry markup and embedded LaTeX.

On Tue, Jul 14, 2009 at 1:39 PM, Mark H. Wood wrote:
> Now, *I* think that an abstract
> should be a single brief block of text -- three, perhaps four
> sentences. I've been shown many times that others disagree: they
> write three or four *paragraphs*, often with various kinds of emphasis
> or even complex layout expectations.

--
Regards,

Andrew M.
http://www.andrewpetermarlow.co.uk
Re: [Dspace-tech] [Dspace-devel] [Dspace-general] show HTML data
Yes guys, you are right, but not being able to use bold characters, bulleted lists, etcetera sometimes makes it hard to write a good description.