Re: [Dspace-tech] Question about replacement of special characters in XML
Hi Laurie,

We use SunWebserver7, but yes, our server.xml file is set to UTF-8:

Thanks,
Sue

Sue Walker-Thornton
Software Developer/Database Administrator
NASA Langley Research Center|LITES Contract
(757) 224-4074

-----Original Message-----
From: Laurie Nelson [mailto:laurie_nel...@sil.org]
Sent: Monday, January 03, 2011 12:28 PM
To: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Question about replacement of special characters in XML

Sue,

Is your tomcat server.xml set to utf-8?

Laurie Nelson
REAP Administrator
SIL International
Re: [Dspace-tech] Question about replacement of special characters in XML
Sue,

Is your tomcat server.xml set to utf-8?

Laurie Nelson
REAP Administrator
SIL International

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
Re: [Dspace-tech] Question about replacement of special characters in XML
This is interesting too. When I edit the item in DSpace 1.5.1 and copy the correct title into the input text box, it ends up looking like this on both the “Edit item” screen and the DSpace long listing page:

First results for 13CH4 at 7 μm  <== Note the Greek letter after the 7 is displaying correctly, but the 13 is no longer a superscript and the 4 is no longer a subscript.

But when I look at the title on the Item page (Short listing), this is how it looks:

First Results for 13CH4 at 7 {mu}m  <== Note the Greek letter after the 7 is now displaying as {mu}.

Is there anything I can do to correct this in DSpace?

Thanks,
Sue

Sue Walker-Thornton
Software Developer/Database Administrator
NASA Langley Research Center|LITES Contract
(757) 224-4074
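One possible workaround for the lost superscript and subscript described in the message above, offered as a sketch rather than a confirmed DSpace fix: a plain text input box presumably cannot carry rich-text formatting, but Unicode has dedicated superscript and subscript digit characters that survive as ordinary text. The class and method names below (TitleScripts, toSuperscript, toSubscript) are made up for illustration, and whether the result displays properly depends on the fonts in use and on every layer between the file and the browser handling UTF-8.

```java
public class TitleScripts {
    // Unicode superscript digits 0-9; the characters for 1, 2 and 3 sit in
    // the Latin-1 block, the rest in the Superscripts and Subscripts block.
    private static final String SUPERSCRIPTS =
            "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079";

    // Unicode subscript digits 0-9 are contiguous, starting at U+2080.
    private static final String SUBSCRIPTS =
            "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089";

    /** Replaces ASCII digits with superscript digits; other characters pass through. */
    public static String toSuperscript(String text) {
        return map(text, SUPERSCRIPTS);
    }

    /** Replaces ASCII digits with subscript digits; other characters pass through. */
    public static String toSubscript(String text) {
        return map(text, SUBSCRIPTS);
    }

    private static String map(String text, String table) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            out.append(c >= '0' && c <= '9' ? table.charAt(c - '0') : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Builds the title from this thread with the mass number raised
        // and the atom count lowered, using plain Unicode characters only.
        String title = "First results for " + toSuperscript("13")
                + "CH" + toSubscript("4") + " at 7 \u03BCm";
        System.out.println(title);
    }
}
```

This only covers digits. Superscript and subscript letters are far less complete in Unicode, so titles with more elaborate formatting would need a different approach.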
Re: [Dspace-tech] Question about replacement of special characters in XML
Here’s an example of a portion of a title I’m having problems with. This is how the title is supposed to look:

…results for 13CH4 at 7 μm…

Even if I edit the item and copy and paste the text into the title, this is how it ends up looking in DSpace:

…Results for 13CH4 at 7 {mu}m…

Sue Walker-Thornton
Software Developer/Database Administrator
NASA Langley Research Center|LITES Contract
(757) 224-4074
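If numeric character references are still wanted, as in the original request in this thread, a general-purpose method does not need a table of known symbols: every character has a code point, and the decimal reference is simply that number between "&#" and ";". The following is a minimal sketch under that assumption. The class and method names (NumericCharRefs, toNumericReferences) are hypothetical, and the method deliberately leaves ASCII untouched, so it does not escape markup characters such as & or <.

```java
public class NumericCharRefs {
    /**
     * Replaces every code point above U+007F with a decimal character
     * reference (for example, the Greek letter mu becomes &#956;).
     * ASCII characters are copied unchanged.
     */
    public static String toNumericReferences(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // codePointAt handles characters outside the Basic Multilingual
            // Plane, which occupy two Java chars (a surrogate pair).
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: 7 &#956;m at 20&#176;
        System.out.println(toNumericReferences("7 \u03BCm at 20\u00B0"));
    }
}
```

Because the method works on code points rather than on a list of known symbols, a new symbol in a document does not require a code change.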
Re: [Dspace-tech] Question about replacement of special characters in XML
Hi Sue,

It is not necessary to convert all characters into HTML character references, except for the few special characters that cannot be used literally in XML. All you need to do is declare and save the XML file as UTF-8.

I have attached an example file with some special characters in their original form. Open it in a browser. (I don't know whether this mailing list will retain the attachment.)

Best,
Allen Lam.
HKU Scholars Hub Administrator, http://hub.hku.hk

On 2010-12-30 10:22 AM, Thornton, Susan M. (LARC-B702)[LITES] wrote:
> Hi,
>
> We are working on an interface between a legacy system and DSpace 1.5.1 and I keep running into problems with special characters in the text. NASA research documents have lots of different special characters in them – some of them are common ones such as the degree symbol - ° and some of them are more uncommon ones such as “right ceiling” - ⌉ (see http://myhandbook.info/codes_htmlchr.html for a pretty good list of symbols and their equivalent “character references”). The interface is fairly new and so far we’ve just been adding code to the extract program that outputs an xml file, to replace the special character or symbol with the equivalent “character reference” as we identify them. Inevitably though, the program is going to abend when it finds a symbol we haven’t coded for and we’re going to have to keep changing it to replace new symbols.
>
> I did some Googling today, trying to find an already-existing Java method or class that replaces symbols with the equivalent character reference, hoping that I don’t have to write one myself, but so far have not found one. Does anyone know of one?
>
> Thanks in advance,
> Sue
>
> Sue Walker-Thornton
> Software Developer/Database Administrator
> NASA Langley Research Center|LITES Contract
> SGT, Inc.|130 Research Drive
> Hampton, Va. 23666
> Office: (757) 224-4074
> Mobile: (757) 506-9903
> Fax: (757) 224-4001
> susan.m.thorn...@nasa.gov
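Allen's suggestion can be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the extract program discussed in the thread: the class name Utf8XmlExport, the methods escapeXml and writeTitle, and the bare <title> element are all hypothetical, and the real export format may differ. Only the five characters that XML predefines entities for are escaped; everything else, including μ and °, is written as-is and encoded as UTF-8 by the writer. The sketch assumes Java 7 or later for java.nio.file.Files and StandardCharsets.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8XmlExport {
    /** Escapes only the characters that XML has predefined entities for. */
    public static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);        // symbols stay literal
            }
        }
        return out.toString();
    }

    /** Writes a one-element XML file, declaring and using UTF-8. */
    public static void writeTitle(Path file, String title) throws IOException {
        // The writer's charset must match the encoding named in the
        // XML declaration, otherwise parsers will misread the symbols.
        try (Writer w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            w.write("<title>" + escapeXml(title) + "</title>\n"); // hypothetical element
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("title-example", ".xml");
        writeTitle(file, "Results for CH4 at 7 \u03BCm & 20\u00B0 < 25\u00B0");
        System.out.println("Wrote " + file);
    }
}
```

The sketch does not filter control characters that XML 1.0 forbids outright. If the legacy data can contain those, they would need to be removed or replaced before writing.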