Re: solr autodetectparser tikaconfig dataimporter error

2013-07-18 Thread Andreas Owen
I have now changed some things and the import runs without error. My 
schema.xml has no field named text; the field is called contentsExact. 
Unfortunately the text (from the file) isn't indexed even though I mapped it 
to the proper field. What am I doing wrong?

data-config.xml:

<dataConfig>
  <dataSource type="BinFileDataSource" name="data"/>
  <dataSource type="BinURLDataSource" name="dataUrl"/>
  <dataSource type="URLDataSource"
              baseUrl="http://127.0.0.1/tkb/internet/" name="main"/>
  <document>
    <entity name="rec" processor="XPathEntityProcessor" url="docImport.xml"
            forEach="/albums/album" dataSource="main">
      <!--transformer="script:GenerateId"-->
      <field column="title" xpath="//title" />
      <field column="id" xpath="//file" />
      <field column="path" xpath="//path" />
      <field column="Author" xpath="//author" />

      <!-- <field column="tstamp">2013-07-05T14:59:46.889Z</field> -->

      <entity name="f" processor="FileListEntityProcessor"
              baseDir="C:\web\development\tkb\internet\public" fileName="${rec.id}"
              dataSource="data" onError="skip">
        <entity name="tika" processor="TikaEntityProcessor"
                url="${f.fileAbsolutePath}">
          <field column="text" name="contentsExact" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

I noticed that when I move the Author field into the tika entity, it isn't 
indexed. Could this be related to why the text from the file isn't indexed? 
Do I have to do something special about the entity levels in document?

PS: How do I import tstamp? It's a static value.





Re: solr autodetectparser tikaconfig dataimporter error

2013-07-14 Thread Andreas Owen
Hi,

Does no one have an idea what this error is, or can anyone give me a pointer 
where to look? If not, is there an alternative way to import documents from an 
XML file that holds the metadata and the name of the file to parse?

Thanks for any help.





Re: solr autodetectparser tikaconfig dataimporter error

2013-07-14 Thread Jack Krupansky

Caused by: java.lang.NoSuchMethodError:

That means you have some out-of-date jars, or some newer jars mixed in with 
the old ones.


-- Jack Krupansky
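
One way to check for the jar mix-up described above is to ask the JVM where it loaded the Tika classes from and whether the method named in the stack trace exists there. The following is a minimal sketch, not something from this thread: the class name JarCheck and its helper methods are made up for illustration, and it assumes it is run with the same classpath (the same lib directories) that Solr uses.

```java
import java.security.CodeSource;

public class JarCheck {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    public static String whereLoaded(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap/unknown)";
        }
        return src.getLocation().toString();
    }

    /** Returns true if the class has a public method with this name and parameter types. */
    public static boolean hasMethod(Class<?> cls, String name, Class<?>... params) {
        try {
            cls.getMethod(name, params);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = JarCheck.class.getClassLoader();
        try {
            // Load without initializing; we only want to inspect the classes.
            Class<?> parser = Class.forName("org.apache.tika.parser.AutoDetectParser", false, loader);
            Class<?> config = Class.forName("org.apache.tika.config.TikaConfig", false, loader);
            System.out.println("AutoDetectParser loaded from: " + whereLoaded(parser));
            System.out.println("TikaConfig loaded from:       " + whereLoaded(config));
            // This is the method the NoSuchMethodError in the trace complains about.
            System.out.println("setConfig(TikaConfig) present: " + hasMethod(parser, "setConfig", config));
        } catch (ClassNotFoundException | LinkageError e) {
            System.out.println("Tika could not be loaded from this classpath: " + e);
        }
    }
}
```

If setConfig(TikaConfig) is reported as absent, the Tika jar that was found is probably not the version the DataImportHandler extras jar was compiled against, which would be consistent with the NoSuchMethodError.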





solr autodetectparser tikaconfig dataimporter error

2013-07-12 Thread Andreas Owen
I am using Solr 3.5, tika-app-1.4 and tagsoup 1.2.1. When I try to import a 
file via XML I get the error below. The file format doesn't matter: txt, cfm 
and pdf all produce the same error:

SEVERE: Exception while processing: rec document :
SolrInputDocument[{id=id(1.0)={myTest.txt},
title=title(1.0)={Beratungsseminar kundenbrief},
contents=contents(1.0)={wie kommuniziert man}, author=author(1.0)={Peter Z.},
path=path(1.0)={download/online}}]:org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.NoSuchMethodError:
org.apache.tika.parser.AutoDetectParser.setConfig(Lorg/apache/tika/config/TikaConfig;)V
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: java.lang.NoSuchMethodError:
org.apache.tika.parser.AutoDetectParser.setConfig(Lorg/apache/tika/config/TikaConfig;)V
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:122)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
    ... 6 more

Jul 11, 2013 5:23:36 PM org.apache.solr.common.SolrException log
SEVERE: Full Import
failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.NoSuchMethodError:
org.apache.tika.parser.AutoDetectParser.setConfig(Lorg/apache/tika/config/TikaConfig;)V
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: java.lang.NoSuchMethodError:
org.apache.tika.parser.AutoDetectParser.setConfig(Lorg/apache/tika/config/TikaConfig;)V
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:122)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
    ... 6 more

Jul 11, 2013 5:23:36 PM org.apache.solr.update.DirectUpdateHandler2 rollback

data-config.xml:
<dataConfig>
  <dataSource type="BinURLDataSource" name="data"/>
  <dataSource type="URLDataSource"
              baseUrl="http://127.0.0.1/tkb/internet/" name="main"/>
  <document>
    <entity name="rec" processor="XPathEntityProcessor" url="docImport.xml"
            forEach="/albums/album" dataSource="main">
      <field column="title" xpath="//title" />
      <field column="id" xpath="//file" />
      <field column="contents" xpath="//description" />
      <field column="path" xpath="//path" />
      <field column="Author" xpath="//author" />

      <entity processor="TikaEntityProcessor"
              url="file:///C:\web\development\tkb\internet\public\download\online\${rec.id}"
              dataSource="data" onerror="skip">
        <field column="contents" name="text" />
      </entity>
    </entity>
  </document>
</dataConfig>

The libs are included and are listed in the logs. I have also tried tika-app 
1.0 and tagsoup 1.2 with the same result. Can someone please help? I don't 
know where to start looking for the error.
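
Since the post says the libs are included and that two Tika versions were tried, it may help to check whether more than one jar in a library directory contains AutoDetectParser. The following is a minimal sketch under stated assumptions: FindClassInJars and jarsContaining are hypothetical names, the directory to scan is passed as the first argument, and only the top level of that directory is scanned (no subdirectories).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipFile;

public class FindClassInJars {

    /** Returns, in sorted order, every *.jar directly inside dir that contains the given entry. */
    public static List<Path> jarsContaining(Path dir, String classEntry) throws IOException {
        List<Path> hits = new ArrayList<Path>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                // A jar is a zip archive, so a plain entry lookup is enough here.
                try (ZipFile zip = new ZipFile(jar.toFile())) {
                    if (zip.getEntry(classEntry) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the lib directory is given on the command line; defaults to the current one.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        String entry = "org/apache/tika/parser/AutoDetectParser.class";
        List<Path> hits = jarsContaining(dir, entry);
        for (Path jar : hits) {
            System.out.println(jar);
        }
        if (hits.size() > 1) {
            System.out.println("More than one jar provides " + entry);
        }
    }
}
```

Running it once per directory that Solr loads libraries from gives a quick list of candidates. If two jars show up, for example a Tika jar shipped with Solr next to a separately added tika-app jar, removing one of them is a reasonable first thing to try.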