Just wrap the content in a CDATA section so the XML parser ignores the HTML characters.

http://www.w3schools.com/xml/xml_cdata.asp
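
To make the CDATA suggestion concrete, here is a minimal sketch in Python (the thread itself mentions Perl, so the language is my choice). The helper names `cdata` and `solr_add_doc` are made up for illustration. The field name `storyFullText` comes from the schema quoted below, and the `<add><doc>` wrapper assumes Solr's usual XML update format. The one catch with CDATA is a literal `]]>` inside the content, which has to be split across two sections.

```python
from xml.etree import ElementTree as ET
from xml.sax.saxutils import quoteattr


def cdata(text: str) -> str:
    """Wrap text in a CDATA section.

    A literal ']]>' would end the section early, so it is split
    across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def solr_add_doc(field_name: str, html: str) -> str:
    """Build a one-field Solr <add> document (hypothetical helper)."""
    return f"<add><doc><field name={quoteattr(field_name)}>{cdata(html)}</field></doc></add>"


# Raw HTML, including an HTML-only entity that would otherwise break the parser.
html = '<div class="paragraph">my para &para; text<br/></div>'
doc = solr_add_doc("storyFullText", html)

# An XML parser hands the CDATA content back untouched.
round_trip = ET.fromstring(doc).find("doc/field").text
```

If this works as intended, the stored value is the original markup, character for character, with no entity resolution attempted inside the section.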

-Reece
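
On the 'para' error quoted below: XML predefines only five entities (`&lt;`, `&gt;`, `&amp;`, `&quot;`, `&apos;`). `&para;` is an HTML named entity, so an XML parser without a DTD cannot resolve it. My guess is that the encoder was run with its defaults and turned a non-ASCII character into a named HTML entity. If I remember the HTML::Entities docs correctly, `encode_entities` takes a second argument limiting the set of characters to encode, e.g. `encode_entities($text, '<>&"')`. As an alternative to CDATA, here is a small Python sketch that escapes only the XML-special characters. `xml_escape` is a hypothetical helper name.

```python
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape


def xml_escape(text: str) -> str:
    """Escape only what XML requires; leave every other character alone."""
    # escape() handles &, < and > by default; the extra map adds the double quote.
    return escape(text, {'"': "&quot;"})


html = '<div class="paragraphText">§ 1 ¶ 2<br/></div>'
field = f"<field>{xml_escape(html)}</field>"

# The parser decodes the escaped text back to the original markup.
decoded = ET.fromstring(field).text

# An HTML-only named entity is rejected when no DTD declares it.
try:
    ET.fromstring("<field>&para;</field>")
    para_rejected = False
except ET.ParseError:
    para_rejected = True
```

Non-ASCII characters such as the pilcrow can stay as literal characters, provided the file is posted in the encoding it declares (UTF-8 is the usual choice).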



On Fri, Mar 7, 2008 at 5:11 PM, Latj <[EMAIL PROTECTED]> wrote:
>
>
>  When I use HTML::Entities to encode my text, I get this error:
>
>  SEVERE: org.xmlpull.v1.XmlPullParserException: could not resolve entity
>  named 'para'
>
>  It's complaining about finding:   &para;   in my text. Anyone know why this
>  is a problem?
>
>
>
>
>
>  Jérôme Etévé-2 wrote:
>  >
>  > If I understand correctly, you want to keep the raw HTML code in Solr
>  > like this (in the XML file you post):
>  >
>  > <field name="storyFullText">
>  >   <html></html>
>  > </field>
>  >
>  > I think you should encode your content to protect these xml entities:
>  > <  ->  &lt;
>  > >  ->  &gt;
>  > " -> &quot;
>  > & -> &amp;
>  >
>  > If you use perl, have a look at HTML::Entities.
>  >
>  >
>  > On 9/25/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>  >> Hello,
>  >>
>  >> I've got a problem with HTML code that is embedded in an XML file:
>  >>
>  >> Sample source:
>  >>
>  >> <content>
>  >>         <stories>
>  >>                 <div class="storyTitle">
>  >>                          Les débats
>  >>                 </div>
>  >>                 <div class="storyIntroductionText">
>  >>                         Le premier tour des élections fédérales se
>  >> déroulera le 21
>  >> octobre prochain. D'ici là, La 1ère vous propose plusieurs rendez-
>  >> vous, dont plusieurs grands débats à l'enseigne de Forums.
>  >>                 </div>
>  >>                 <div class="paragraph">
>  >>                         <div class="paragraphTitle"/>
>  >>                         <div class="paragraphText">
>  >>                                 my para textehere
>  >>                                 <br/>
>  >>                                 <br/>
>  >>                                 Vous trouverez sur cette page toutes les
>  >> dates et les heures de
>  >> ces différents rendez-vous ainsi que le nom et les partis des
>  >> débatteurs. De plus, vous pourrez également écouter ou réécouter
>  >> l'ensemble de ces émissions.
>  >>                         </div>
>  >>                 </div>
>  >> ....
>  >> ---------
>  >> When I make a query on Solr, I get something like this in the
>  >> source of the XML result:
>  >>
>  >> <td xmlns="http://www.w3.org/1999/xhtml">
>  >> &lt;
>  >> div
>  >> class
>  >> =
>  >> "paragraph"
>  >> &gt;<div class="expander-content">
>  >> <div class="indent">&lt;
>  >> div
>  >> class
>  >> =
>  >> "paragraphTitle"
>  >> /&gt;</div><table><tr>
>  >> <td class="expander">−<div class="spacer"/>
>  >> </td><td>&lt;
>  >> ...
>  >>
>  >> This is not quite what I want. I want to keep the HTML tags exactly as
>  >> they are, without any added formatting.
>  >>
>  >> The br tags and a tags come through well formed in the XML and JSON
>  >> results, but the div tags are not kept.
>  >> ---------
>  >> In the schema.xml I've got this for the html content
>  >>
>  >> <fieldType name="html" class="solr.TextField" />
>  >>
>  >>   <field name="storyFullText" type="html" indexed="true"
>  >> stored="true" multiValued="true"/>
>  >>
>  >> ---------
>  >>
>  >> Any help would be appreciated.
>  >>
>  >> Thanks in advance.
>  >>
>  >> S. Christin
>  >>
>  >>
>  >>
>  >>
>  >>
>  >>
>  >
>  >
>  > --
>  > Jerome Eteve.
>  > [EMAIL PROTECTED]
>  > http://jerome.eteve.free.fr/
>  >
>  >
>
>  --
>  View this message in context: 
> http://www.nabble.com/Problem-with-html-code-inside-xml-tp12877194p15907551.html
>  Sent from the Solr - User mailing list archive at Nabble.com.
>
>
