Hi,

Yes. I don't have the HTML as documents; the data is saved in a SQL database
in HTML format. I want to index it in Solr, not as a complete string with
the tags included, but as just the actual text, that is, with the tags
stripped off.

regards
Rohan

On Wed, Feb 20, 2013 at 6:40 PM, Gora Mohanty <g...@mimirtech.com> wrote:

> On 20 February 2013 18:31, Rohan Thakur <rohan.i...@gmail.com> wrote:
> > hi all
> >
> > I have data stored in HTML format in a column of a SQL database and want
> > to index the data from that field in Solr. How can I do that? If anyone
> > has an idea, please help. Right now I am treating it as a string, which
> > indexes the complete HTML, tags included, as one string in Solr.
>
> How do you want to process the HTML? If you simply want to
> strip HTML tags, please take a look at the HTMLStripTransformer
> http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer
>
> Your title implies that you want to parse the HTML in some
> fashion. If so, you will need to do that on your own, e.g., by
> using a transformer.
>
> Regards,
> Gora
>
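
For reference, the HTMLStripTransformer suggestion quoted above might look
roughly like the following in a DataImportHandler data-config.xml. This is
only a minimal sketch: the JDBC driver and URL, the credentials, the table
name (articles), the column names (id, body_html), and the Solr field name
(body) are all hypothetical placeholders, not details from this thread. The
parts taken from the DataImportHandler wiki are the
transformer="HTMLStripTransformer" attribute on the entity and
stripHTML="true" on the field that holds the HTML.

```xml
<dataConfig>
  <!-- Hypothetical connection details; replace with your own database. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="db_user"
              password="db_password"/>
  <document>
    <!-- HTMLStripTransformer is enabled per entity. -->
    <entity name="article"
            transformer="HTMLStripTransformer"
            query="SELECT id, body_html FROM articles">
      <field column="id" name="id"/>
      <!-- stripHTML="true" removes the tags from this column before indexing. -->
      <field column="body_html" name="body" stripHTML="true"/>
    </entity>
  </document>
</dataConfig>
```

With a configuration along these lines, the text indexed into the body field
should be the column content without its markup, while the database rows
themselves stay unchanged.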
