Attempting dataimport using FileListEntityProcessor

2008-06-23 Thread mike segv

I'm trying to use the FileListEntityProcessor to add some XML documents to a
Solr index.  I'm running a nightly build of Solr 1.3 with SOLR-469 and
SOLR-563.  I've been able to successfully run the Slashdot HttpDataSource
example.  My data-config.xml file loads without errors, but when I attempt the
full-import command I get the exception below.  Thanks for any help.

Mike

WARNING: No lockType configured for /san/tomcat-services/solr-medline/solr/data/index/ assuming 'simple'
Jun 23, 2008 7:59:49 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:97)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:212)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:166)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:149)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:286)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:312)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:140)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:335)
        at org.apache.solr.handler.dataimport.DataImporter.rumCmd(DataImporter.java:386)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
Caused by: java.lang.NullPointerException
        at java.io.Reader.<init>(Reader.java:61)
        at java.io.BufferedReader.<init>(BufferedReader.java:76)
        at com.bea.xml.stream.MXParser.checkForXMLDecl(MXParser.java:775)
        at com.bea.xml.stream.MXParser.setInput(MXParser.java:806)
        at com.bea.xml.stream.MXParserFactory.createXMLStreamReader(MXParserFactory.java:261)
        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:93)
        ... 10 more

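Here is my data-config:

<entity … newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
        dataSource="null" baseDir="/san/tomcat-services/solr-medline">
  <entity … url="${f.fileAbsolutePath}" dataSource="null">
    …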

And a snippet from an xml file:

12236137

1980
01
03



-- 
View this message in context: 
http://www.nabble.com/Attempting-dataimport-using-FileListEntityProcessor-tp18081671p18081671.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Attempting dataimport using FileListEntityProcessor

2008-06-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi,
You have not registered any data sources, and the second entity needs one.
Remove dataSource="null" from the second entity and give it a name (good
practice). The second entity does not need a baseDir attribute.
See the modified XML below.
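--Noble

<entity … newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
        dataSource="null" baseDir="/san/tomcat-services/solr-medline">
  <entity … forEach="/MedlineCitation"
          url="${f.fileAbsolutePath}">
    …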

On Tue, Jun 24, 2008 at 6:39 AM, mike segv <[EMAIL PROTECTED]> wrote:
> Here is my data-config:
>
> <entity … newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
>         dataSource="null" baseDir="/san/tomcat-services/solr-medline">
>   <entity … url="${f.fileAbsolutePath}" dataSource="null">
>     …



-- 
--Noble Paul
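
A note for readers of this thread: the last frames of the trace in the first
message (the java.io.Reader and java.io.BufferedReader constructors) suggest
that the XML parser was handed a null Reader. That fits Noble's diagnosis that
the inner entity had no data source to read from, but the thread does not spell
out the mechanism, so treat it as an inference. The small Java sketch below only
demonstrates the JDK behaviour involved: wrapping a null Reader in a
BufferedReader throws a NullPointerException. The class and method names are
made up for illustration.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;

class NullReaderDemo {
    /** Returns true if wrapping the given reader in a BufferedReader throws a NullPointerException. */
    static boolean wrappingThrowsNpe(Reader reader) {
        try {
            // The Reader base-class constructor rejects a null argument.
            new BufferedReader(reader);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A null reader stands in for "nothing was available to open the file".
        System.out.println(wrappingThrowsNpe(null));                                   // prints true
        System.out.println(wrappingThrowsNpe(new StringReader("<MedlineCitation/>"))); // prints false
    }
}
```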


Re: Attempting dataimport using FileListEntityProcessor

2008-06-23 Thread mike segv

That fixed it.

If I'm inserting millions of documents, how do I control the number of docs
per update?  E.g., if there are 50K docs per file, I'm thinking that I should
probably code up my own DataSource that lets me stipulate docs per update: say,
100 instead of 50K.  Does this make sense?

Mike


Noble Paul നോബിള്‍ नोब्ळ् wrote:
> <entity … newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
>         dataSource="null" baseDir="/san/tomcat-services/solr-medline">
>   <entity … forEach="/MedlineCitation"
>           url="${f.fileAbsolutePath}">
>     …

-- 
View this message in context: 
http://www.nabble.com/Attempting-dataimport-using-FileListEntityProcessor-tp18081671p18083747.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Attempting dataimport using FileListEntityProcessor

2008-06-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Just extend XPathEntityProcessor and override nextRow() to return null after
100 rows. Use it as your processor.




-- 
--Noble Paul


Re: Attempting dataimport using FileListEntityProcessor

2008-06-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Just extend XPathEntityProcessor and override nextRow() so that it returns
null after 100 rows. Use it as your processor.
--Noble




-- 
--Noble Paul
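
As a rough illustration of the approach Noble describes, the sketch below
counts rows and returns null from nextRow() once a limit is reached. To stay
self-contained it does not extend the real XPathEntityProcessor: RowSource,
LimitingRowSource and the demo class are hypothetical stand-ins, and only the
convention that nextRow() returns null to signal the end of the rows is taken
from the message above. In a real processor the same counter would presumably
live in an overridden nextRow() that delegates to the superclass.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the row-producing part of an entity processor. */
interface RowSource {
    /** Returns the next row, or null when there are no more rows. */
    Map<String, Object> nextRow();
}

/** Wraps another source and reports end-of-rows after maxRows rows. */
class LimitingRowSource implements RowSource {
    private final RowSource delegate;
    private final int maxRows;
    private int emitted = 0;

    LimitingRowSource(RowSource delegate, int maxRows) {
        this.delegate = delegate;
        this.maxRows = maxRows;
    }

    @Override
    public Map<String, Object> nextRow() {
        if (emitted >= maxRows) {
            return null; // behave as if the input were exhausted
        }
        Map<String, Object> row = delegate.nextRow();
        if (row != null) {
            emitted++;
        }
        return row;
    }
}

class LimitingRowSourceDemo {
    /** Drains a source and returns how many rows it produced. */
    static int countRows(RowSource source) {
        int n = 0;
        while (source.nextRow() != null) {
            n++;
        }
        return n;
    }

    /** Builds a source that yields the given number of one-field rows. */
    static RowSource fixedSource(int total) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            rows.add(Map.<String, Object>of("id", i));
        }
        Iterator<Map<String, Object>> it = rows.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        System.out.println(countRows(new LimitingRowSource(fixedSource(250), 100))); // prints 100
        System.out.println(countRows(new LimitingRowSource(fixedSource(40), 100)));  // prints 40
    }
}
```

Note that a wrapper like this caps the rows read from a source at 100; it does
not split the remaining rows into further groups of 100.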


Re: Attempting dataimport using FileListEntityProcessor

2008-06-23 Thread Shalin Shekhar Mangar
Hi Mike,

Just curious to know the use-case here. Why do you want to limit updates to
100 instead of importing all documents?



-- 
Regards,
Shalin Shekhar Mangar.


Re: Attempting dataimport using FileListEntityProcessor

2008-06-24 Thread mike segv

I do want to import all documents.  My understanding of how things work
(correct me if I'm wrong) is that a single atomic update can include a certain
number of documents.  Instead of having all 16 million of my documents be part
of a single update (which could more easily fail, being so big), I was thinking
it would be better to stipulate how many docs are part of an update, so that my
16-million-doc import would consist of 16M/100 updates.
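
To make the arithmetic concrete, here is a small Java sketch of the batching I
have in mind. It says nothing about how Solr or the DataImportHandler actually
groups documents, which is the part I'd like corrected if I have it wrong; it
only shows how many fixed-size batches a given document count produces and how
a list could be split. BatchingDemo, batchCount and partition are names
invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class BatchingDemo {
    /** Number of batches needed for totalDocs at batchSize docs per batch (the last may be smaller). */
    static long batchCount(long totalDocs, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (totalDocs + batchSize - 1) / batchSize; // ceiling division
    }

    /** Splits a list into consecutive sublists of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(batchCount(16_000_000L, 100)); // prints 160000
        System.out.println(batchCount(50_000L, 100));     // prints 500
    }
}
```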



Re: Attempting dataimport using FileListEntityProcessor

2008-06-24 Thread Shalin Shekhar Mangar


-- 
Regards,
Shalin Shekhar Mangar.