Re: Solr is not extracting the CDATA part of xml
This all comes from a database? Here is what you want. The DataImportHandler includes a toolkit for doing full and incremental loading from databases.

Read this first:
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/DIHQuickStart

Then these:
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/DataImportHandlerFaq
http://lucidworks.lucidimagination.com/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

After you try the procedure in QuickStart and read the other two, if you still have questions please ask.

Cheers!

On Fri, Apr 13, 2012 at 12:34 PM, srini wrote:
> But what if my database content changes? Should I re-run my Java program
> to fetch the XML and add it to Solr for re-indexing?

--
Lance Norskog
goks...@gmail.com
Re: Solr is not extracting the CDATA part of xml
Thanks again for the quick reply. I am a little curious about the procedure you suggested. I had thought of using the same procedure: writing a Java program to fetch the XML records from the database, parse the content, and hand it to Solr for indexing.

But what if my database content changes? Should I re-run my Java program to fetch the XML and add it to Solr for re-indexing?

The format of my XML does not match the Solr example XML formats. Any suggestions here?

When I import XML records from Oracle, add them to Solr, and search for a word, Solr displays the whole XML document that contains that word. What is wrong with this procedure? (I do see my search word in the content of the XML; the only bad part is that it displays the whole document instead of just the CDATA part.) Please suggest if there is a better way of doing this task other than SolrJ.

Thanks in advance,
Srini

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908825.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr is not extracting the CDATA part of xml
Hi,

This is not Solr format. You must re-format your XML into Solr XML. You can find examples on the Solr wiki or in the Solr examples directory.

Best Regards
Alexander Aristov

On 13 April 2012 23:13, srini wrote:
> Erick,
>
> Thanks for your reply. When you say Solr does not index arbitrary XML
> documents: below is what my XML document, which is sitting in Oracle,
> looks like. Could you suggest the best way of indexing it? Which method
> should I follow? Should I use XPathEntityProcessor?
>
> http://www.w3.org/2001/XMLSchema-instance"; xmlns="someurl" xmlns:csp="someurl.xsd" xsi:schemaLocation="somelocation jar: id="002" message-type="create"> 100 115
>
> Thanks in advance
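
For illustration, the Solr XML update format referred to in this reply wraps each document in <add><doc> with one <field name="..."> element per field. The following is a minimal sketch, using only the JDK, of producing that format from values that have already been extracted. The class name SolrXmlSketch, the method toSolrAddXml, and the field names "id" and "body" are assumptions made for this example; real field names would have to match the ones defined in your schema.xml.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class SolrXmlSketch {

    /**
     * Serializes one document's fields as a Solr XML update message:
     * <add><doc><field name="...">value</field>...</doc></add>.
     * Field names here are hypothetical and must exist in schema.xml.
     */
    public static String toSolrAddXml(Map<String, String> fields) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("add");
        writer.writeStartElement("doc");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            writer.writeStartElement("field");
            writer.writeAttribute("name", field.getKey());
            // writeCharacters escapes markup characters such as < and &.
            writer.writeCharacters(field.getValue());
            writer.writeEndElement(); // </field>
        }
        writer.writeEndElement(); // </doc>
        writer.writeEndElement(); // </add>
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "002");
        fields.put("body", "text taken from the CDATA section");
        System.out.println(toSolrAddXml(fields));
    }
}
```

The resulting string is what would be posted to Solr's XML update handler, instead of the raw XML stored in the database.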
Re: Solr is not extracting the CDATA part of xml
Right, that will not work at all for direct transmission to Solr. You could write a Java program that parses this and sends it to Solr via SolrJ.

Personally, I haven't connected a database to Solr with XPathEntityProcessor in the mix, but I believe I've seen messages go by with this configuration. You might want to search the mail archive...

Best
Erick

On Fri, Apr 13, 2012 at 3:13 PM, srini wrote:
> Erick,
>
> Thanks for your reply. When you say Solr does not index arbitrary XML
> documents: below is what my XML document, which is sitting in Oracle,
> looks like. Could you suggest the best way of indexing it? Which method
> should I follow? Should I use XPathEntityProcessor?
>
> http://www.w3.org/2001/XMLSchema-instance"; xmlns="someurl" xmlns:csp="someurl.xsd" xsi:schemaLocation="somelocation jar: id="002" message-type="create"> 100 115
>
> Thanks in advance
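
The parsing half of the Java program suggested in this reply could look roughly like the sketch below. It uses only the JDK's DOM parser to pull the text (CDATA sections included) out of one element of a record fetched from the database. The class name CdataExtractSketch, the method extractText, and the sample element names "message" and "body" are assumptions made for this example, since the real document structure is not visible in the thread. Sending the extracted string to Solr, for instance as a field value through SolrJ, is left out here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CdataExtractSketch {

    /**
     * Returns the text content (CDATA sections included, markers removed) of
     * the first element with the given local name, or null if none exists.
     */
    public static String extractText(String xml, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // The thread's document declares a default namespace, so parse namespace-aware.
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // "*" matches the element regardless of which namespace it is in.
        NodeList nodes = doc.getElementsByTagNameNS("*", localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical record; the real XML would come from the Oracle column.
        String xml = "<message xmlns=\"someurl\" id=\"002\">"
                + "<body><![CDATA[Hello <b>Solr</b> & friends]]></body>"
                + "</message>";
        System.out.println(extractText(xml, "body"));
    }
}
```

Because the parser resolves the CDATA section before the text is handed on, only the inner text would be indexed and stored, rather than the whole XML document.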
Re: Solr is not extracting the CDATA part of xml
Erick,

Thanks for your reply. When you say Solr does not index arbitrary XML documents: below is what my XML document, which is sitting in Oracle, looks like. Could you suggest the best way of indexing it? Which method should I follow? Should I use XPathEntityProcessor?

http://www.w3.org/2001/XMLSchema-instance"; xmlns="someurl" xmlns:csp="someurl.xsd" xsi:schemaLocation="somelocation jar: id="002" message-type="create"> 100 115

Thanks in advance

Erick Erickson wrote
> Solr does not index arbitrary XML content. There is an XML
> form of a Solr document that can be sent to Solr, but it is
> a specific form of XML.
>
> An example of the XML you're trying to index and what you mean
> by "not working" would be helpful.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908791.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr is not extracting the CDATA part of xml
Solr does not index arbitrary XML content. There is an XML form of a Solr document that can be sent to Solr, but it is a specific form of XML.

An example of the XML you're trying to index and what you mean by "not working" would be helpful.

Best
Erick

On Fri, Apr 13, 2012 at 11:50 AM, srini wrote:
> I am not sure why the CDATA part did not get interpreted. This is what
> the XML content looks like. I added quotes just to present the exact
> XML content.
>
> ""
Re: Solr is not extracting the CDATA part of xml
I am not sure why the CDATA part did not get interpreted. This is what the XML content looks like. I added quotes just to present the exact XML content.

""

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908341.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solr is not extracting the CDATA part of xml
I am trying to use the method suggested in the Solr forum to remove the CDATA part of the XML, but it is not working. The results show the whole XML content instead of the CDATA part.

schema.xml:

mappings.txt:
"" => ""

My XML content:

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908317.html
Sent from the Solr - User mailing list archive at Nabble.com.