Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread Lance Norskog
This all comes from a database? Here is what you want.

The DataImportHandler includes a toolkit for doing full and
incremental loading from databases.

Read this first:
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/DIHQuickStart

Then these:
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/DataImportHandlerFaq
http://lucidworks.lucidimagination.com/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

After you try the procedure in QuickStart and read the other two, if
you still have questions please ask.

Cheers!

On Fri, Apr 13, 2012 at 12:34 PM, srini  wrote:
> Thanks again for the quick reply. I am a little curious about the procedure
> you suggested. I had thought of using the same approach: writing a Java
> program to fetch each XML record from the database, parse the content, and
> hand it to Solr for indexing.
>
> But what if my database content changes? Should I rerun my Java program
> to fetch the XML and add it to Solr for reindexing?
>
> The format of my XML does not match the Solr example XML formats. Any
> suggestions here?
>
> When I import XML records from Oracle, add them to Solr, and search for a
> word, Solr displays the whole XML document that contains that word. What is
> wrong with this procedure? (I do see my search word in the XML content; the
> only bad part is that it displays the whole document instead of just the
> CDATA part.) Please suggest whether there is a better way of doing this task
> than SolrJ.
>
> Thanks in Advance
> Srini
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908825.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Lance Norskog
goks...@gmail.com


Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread srini
Thanks again for the quick reply. I am a little curious about the procedure
you suggested. I had thought of using the same approach: writing a Java
program to fetch each XML record from the database, parse the content, and
hand it to Solr for indexing.

But what if my database content changes? Should I rerun my Java program
to fetch the XML and add it to Solr for reindexing?

The format of my XML does not match the Solr example XML formats. Any
suggestions here?

When I import XML records from Oracle, add them to Solr, and search for a
word, Solr displays the whole XML document that contains that word. What is
wrong with this procedure? (I do see my search word in the XML content; the
only bad part is that it displays the whole document instead of just the
CDATA part.) Please suggest whether there is a better way of doing this task
than SolrJ.

Thanks in Advance
Srini





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908825.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread Alexander Aristov
Hi

This is not Solr's format. You must reformat your XML into Solr XML. You can
find examples on the Solr wiki or in the Solr example directory.
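
As a rough illustration of the target format, here is a minimal Java sketch
that builds a Solr XML update message (an <add> element wrapping one <doc>
with <field name="..."> children). The class name, the helper names, and the
field names "id" and "content" are assumptions made for this example; the
real field names must match whatever is declared in your schema.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: turns field name/value pairs
// into a Solr XML update message of the form <add><doc>...</doc></add>.
final class SolrXmlBuilder {

    // Escape the characters that are special in XML text and attributes.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build one <add><doc> message; each map entry becomes one <field>.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              .append(escape(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "002");                                // assumed unique key field
        fields.put("content", "text taken from the CDATA section"); // assumed text field
        System.out.println(toAddXml(fields));
    }
}
```

The resulting string is what would be posted to Solr's update handler; the
original database XML itself is never sent.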

Best Regards
Alexander Aristov


On 13 April 2012 23:13, srini  wrote:

> Erick,
>
> Thanks for your reply. Since Solr does not index arbitrary XML documents,
> below is what my XML document, which is stored in Oracle, looks like.
> Could you suggest the best way of indexing it? Which method should I
> follow? Should I use XPathEntityProcessor?
>
> 
> http://www.w3.org/2001/XMLSchema-instance";
> xmlns="someurl" xmlns:csp="someurl.xsd" xsi:schemaLocation="somelocation
> jar: id="002" message-type="create">
> 
> 
>  100
>  115
>  
>
>  
>
> Thanks in Advance
> Erick Erickson wrote
> >
> > Solr does not index arbitrary XML content. There is an XML
> > form of a Solr document that can be sent to Solr, but it is
> > a specific form of XML.
> >
> > An example of the XML you're trying to index and what you mean
> > by "not working" would be helpful.
> >
> > Best
> > Erick
> >
> > On Fri, Apr 13, 2012 at 11:50 AM, srini <softtech88@> wrote:
> >> I am not sure why the CDATA part did not get interpreted. This is what the
> >> XML content looks like. I added quotes just to present the exact XML content.
> >>
> >> ""
> >>
> >> --
> >> View this message in context:
> >> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908341.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908791.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread Erick Erickson
Right, that will not work at all for direct transmission to
Solr.

You could write a Java program that parses this and sends
it to Solr via SolrJ.
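
A minimal sketch of the parsing half of such a program, using only the
JDK's built-in DOM parser. The element names "message" and "body" are
assumptions for illustration (the tags of the real document are not visible
in this thread), and the class and method names are hypothetical. The point
is that a standard XML parser unwraps the CDATA section for you, so only
the inner text is left to index.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: pulls the text (CDATA included) out of one element.
final class CdataExtractor {

    // Returns the trimmed text of the first element named tagName,
    // or null if the document has no such element.
    static String extractText(String xml, String tagName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(true); // fold CDATA sections into plain text nodes
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            if (nodes.getLength() == 0) {
                return null;
            }
            return nodes.item(0).getTextContent().trim();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for one XML record fetched from the database.
        String xml = "<message id=\"002\" message-type=\"create\">"
                   + "<body><![CDATA[Some <b>text</b> to index]]></body>"
                   + "</message>";
        System.out.println(extractText(xml, "body"));
    }
}
```

The extracted string would then be set as a field value on a SolrJ
SolrInputDocument and added through the SolrJ client; that half is left
out here because it needs the SolrJ jar and a running Solr instance.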

Personally I haven't connected a database to Solr with
XPathEntityProcessor in the mix, but I believe I've seen
messages go by with this configuration. You might want
to search the mail archive...

Best
Erick

On Fri, Apr 13, 2012 at 3:13 PM, srini  wrote:
> Erick,
>
> Thanks for your reply. Since Solr does not index arbitrary XML documents,
> below is what my XML document, which is stored in Oracle, looks like.
> Could you suggest the best way of indexing it? Which method should I
> follow? Should I use XPathEntityProcessor?
>
> 
> http://www.w3.org/2001/XMLSchema-instance";
> xmlns="someurl" xmlns:csp="someurl.xsd" xsi:schemaLocation="somelocation
> jar: id="002" message-type="create">
> 
>     
>      100
>      115
>      
>
>  
>
> Thanks in Advance
> Erick Erickson wrote
>>
>> Solr does not index arbitrary XML content. There is an XML
>> form of a Solr document that can be sent to Solr, but it is
>> a specific form of XML.
>>
>> An example of the XML you're trying to index and what you mean
>> by "not working" would be helpful.
>>
>> Best
>> Erick
>>
>> On Fri, Apr 13, 2012 at 11:50 AM, srini <softtech88@> wrote:
>>> I am not sure why the CDATA part did not get interpreted. This is what the
>>> XML content looks like. I added quotes just to present the exact XML content.
>>>
>>> ""
>>>
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908341.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908791.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread srini
Erick,

Thanks for your reply. Since Solr does not index arbitrary XML documents,
below is what my XML document, which is stored in Oracle, looks like.
Could you suggest the best way of indexing it? Which method should I
follow? Should I use XPathEntityProcessor?

 
http://www.w3.org/2001/XMLSchema-instance";
xmlns="someurl" xmlns:csp="someurl.xsd" xsi:schemaLocation="somelocation
jar: id="002" message-type="create">

 
  100  
  115
  
 
 

Thanks in Advance
Erick Erickson wrote
> 
> Solr does not index arbitrary XML content. There is an XML
> form of a Solr document that can be sent to Solr, but it is
> a specific form of XML.
> 
> An example of the XML you're trying to index and what you mean
> by "not working" would be helpful.
> 
> Best
> Erick
> 
> On Fri, Apr 13, 2012 at 11:50 AM, srini <softtech88@> wrote:
>> I am not sure why the CDATA part did not get interpreted. This is what the
>> XML content looks like. I added quotes just to present the exact XML content.
>>
>> ""
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908341.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908791.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread Erick Erickson
Solr does not index arbitrary XML content. There is an XML
form of a Solr document that can be sent to Solr, but it is
a specific form of XML.

An example of the XML you're trying to index and what you mean
by "not working" would be helpful.

Best
Erick

On Fri, Apr 13, 2012 at 11:50 AM, srini  wrote:
> I am not sure why the CDATA part did not get interpreted. This is what the
> XML content looks like. I added quotes just to present the exact XML content.
>
> ""
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908341.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr is not extracting the CDATA part of xml

2012-04-13 Thread srini
I am not sure why the CDATA part did not get interpreted. This is what the
XML content looks like. I added quotes just to present the exact XML content.

""

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908341.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr is not extracting the CDATA part of xml

2012-04-13 Thread srini
I am trying to use a method suggested on the Solr forum to remove the CDATA
part of the XML, but it is not working. The results show the whole XML
content instead of just the CDATA part.

schema.xml

  


  
  


mappings.txt
"" => ""

my xml content


 





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-not-extracting-the-CDATA-part-of-xml-tp3908317p3908317.html
Sent from the Solr - User mailing list archive at Nabble.com.