I will try using SolrJ.

I have now tried indexing .docx files and I get a different error. The logs are:
SEVERE: null:java.lang.RuntimeException: java.lang.VerifyError: (class: org/apache/poi/extractor/ExtractorFactory, method: createExtractor signature: (Lorg/apache/poi/poifs/filesystem/DirectoryNode;)Lorg/apache/poi/POITextExtractor;) Wrong return type in function
        at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.VerifyError: (class: org/apache/poi/extractor/ExtractorFactory, method: createExtractor signature: (Lorg/apache/poi/poifs/filesystem/DirectoryNode;)Lorg/apache/poi/POITextExtractor;) Wrong return type in function
        at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:59)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
        ... 16 more

But do the jars cause these errors? I read one solution which said that
removing a few jars from the classpath may solve the errors, but those jars
are not present in my classpath. (Link to the solution:
http://stackoverflow.com/questions/14696371/how-to-extract-the-text-of-a-ppt-file-with-tika)
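
Both NoSuchFieldError and VerifyError are JVM linkage errors, which usually mean a class was compiled against a different version of a library than the one loaded at runtime. One way to check this, sketched below under some assumptions, is to print the location each POI class named in the traces was loaded from. The class name JarLocator and the helper locate() are hypothetical names for this illustration, and the result only reflects the classpath the program itself runs with, so it would have to see the same jars as the Solr webapp to be meaningful.

```java
import java.security.CodeSource;

public class JarLocator {

    /** Describes where the named class is loaded from (jar or directory URL). */
    static String locate(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> cls = Class.forName(className, false,
                    JarLocator.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Core JDK classes typically report no code source.
                return "(no code source, probably a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults are the POI classes that appear in the two stack traces.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.poi.extractor.ExtractorFactory",
            "org.apache.poi.POITextExtractor",
            "org.apache.poi.poifs.filesystem.NPOIFSFileSystem"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the classes resolve to jars with different POI versions, or to a jar outside the Solr webapp, that would be consistent with a version conflict rather than a schema problem.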

Thank You.



On Wednesday, October 9, 2013 6:05 AM, Erick Erickson [via Lucene] 
<ml-node+s472066n4094231...@n3.nabble.com> wrote:
 
Hmmm, that is odd, the glob dynamicField should 
pick this up. 

Not quite sure what's going on. You can parse the file 
via Tika yourself and look at what's in there, it's a relatively 
simple SolrJ program, here's a sample: 
http://searchhub.org/2012/02/14/indexing-with-solrj/

Best, 
Erick 

On Tue, Oct 8, 2013 at 4:15 PM, sweety <[hidden email]> wrote: 

> This is my new schema.xml: 
> <schema  name="documents"> 
> <fields> 
> <field name="id" type="string" indexed="true" stored="true" required="true" 
> multiValued="false"/> 
> <field name="author" type="string" indexed="true" stored="true" 
> multiValued="true"/> 
> <field name="comments" type="text" indexed="true" stored="true" 
> multiValued="false"/> 
> <field name="keywords" type="text" indexed="true" stored="true" 
> multiValued="false"/> 
> <field name="contents" type="text" indexed="true" stored="true" 
> multiValued="false"/> 
> <field name="title" type="text" indexed="true" stored="true" 
> multiValued="false"/> 
> <field name="revision_number" type="string" indexed="true" stored="true" 
> multiValued="false"/> 
> <field name="_version_" type="long" indexed="true" stored="true" 
> multiValued="false"/> 
> <dynamicField name="ignored_*" type="string" indexed="false" stored="true" 
> multiValued="true"/> 
> <dynamicField name="*" type="ignored"  multiValued="true" /> 
> <copyfield source="id" dest="text" /> 
> <copyfield source="author" dest="text" /> 
> </fields> 
> <types> 
> <fieldtype name="ignored" stored="false" indexed="false" 
> class="solr.StrField" /> 
> <fieldType name="integer" class="solr.IntField" /> 
> <fieldType name="long" class="solr.LongField" /> 
> <fieldType name="string" class="solr.StrField"  /> 
> <fieldType name="text" class="solr.TextField" /> 
> </types> 
> <uniqueKey>id</uniqueKey> 
> </schema> 
> I still get the same error. 
> 
> ________________________________ 
>  From: Erick Erickson [via Lucene] <[hidden email]> 
> To: sweety <[hidden email]> 
> Sent: Tuesday, October 8, 2013 7:16 AM 
> Subject: Re: no such field error:smaller big block size details while 
> indexing doc files 
> 
> 
> 
> Well, one of the attributes parsed out of, probably the 
> meta-information associated with one of your structured 
> docs is SMALLER_BIG_BLOCK_SIZE_DETAILS and 
> Solr Cell is faithfully sending that to your index. If you 
> want to throw all these in the bit bucket, try defining 
> a true catch-all field that ignores things, like this. 
> <dynamicField name="*" type="ignored" multiValued="true" /> 
> 
> Best, 
> Erick 
> 
> On Mon, Oct 7, 2013 at 8:03 AM, sweety <[hidden email]> wrote: 
> 
>> I'm trying to index .doc, .docx and .pdf files 
>> using this command: 
>> curl "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true" -F "myfile=@complex.doc" 
>> 
>> This is the error I get: 
>> Oct 07, 2013 5:02:18 PM org.apache.solr.common.SolrException log 
>> SEVERE: null:java.lang.RuntimeException: java.lang.NoSuchFieldError: SMALLER_BIG_BLOCK_SIZE_DETAILS 
>>         at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651) 
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364) 
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) 
>>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) 
>>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) 
>>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169) 
>>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168) 
>>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98) 
>>         at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928) 
>>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) 
>>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) 
>>         at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987) 
>>         at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539) 
>>         at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298) 
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
>>         at java.lang.Thread.run(Unknown Source) 
>> Caused by: java.lang.NoSuchFieldError: SMALLER_BIG_BLOCK_SIZE_DETAILS 
>>         at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:93) 
>>         at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:190) 
>>         at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:184) 
>>         at org.apache.tika.parser.microsoft.POIFSContainerDetector.getTopLevelNames(POIFSContainerDetector.java:376) 
>>         at org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:165) 
>>         at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61) 
>>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113) 
>>         at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219) 
>>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) 
>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) 
>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797) 
>>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637) 
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343) 
>>         ... 16 more 
>> 
>> Also, using the same type of URL, .txt, .mp3 and .pdf files are indexed 
>> successfully: 
>> curl "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true" -F "myfile=@abc.txt" 
>> 
>> Schema.xml is: 
>> <schema  name="documents"> 
>> <fields> 
>> <field name="id" type="string" indexed="true" stored="true" required="true" 
>> multiValued="false"/> 
>> <field name="author" type="string" indexed="true" stored="true" 
>> multiValued="true"/> 
>> <field name="comments" type="text" indexed="true" stored="true" 
>> multiValued="false"/> 
>> <field name="keywords" type="text" indexed="true" stored="true" 
>> multiValued="false"/> 
>> <field name="contents" type="text" indexed="true" stored="true" 
>> multiValued="false"/> 
>> <field name="title" type="text" indexed="true" stored="true" 
>> multiValued="false"/> 
>> <field name="revision_number" type="string" indexed="true" stored="true" 
>> multiValued="false"/> 
>> <field name="_version_" type="long" indexed="true" stored="true" 
>> multiValued="false"/> 
>> 
>> <dynamicField name="ignored_*" type="string" indexed="false" stored="true" 
>> multiValued="true"/> 
>> <copyfield source="id" dest="text" /> 
>> <copyfield source="author" dest="text" /> 
>> </fields> 
>> 
>> <types> 
>> <fieldType name="integer" class="solr.IntField" /> 
>> <fieldType name="long" class="solr.LongField" /> 
>> <fieldType name="string" class="solr.StrField"  /> 
>> <fieldType name="text" class="solr.TextField" /> 
>> <fieldtype name="ignored" stored="false" indexed="false" multiValued="true" 
>> class="solr.StrField" /> 
>> </types> 
>> <uniqueKey>id</uniqueKey> 
>> </schema> 
>> 
>> I'm not able to understand what kind of error this is. Please help me. 
>> 
>> 
>> 
>> 
>> 
>> 
>> -- 
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/no-such-field-error-smaller-big-block-size-details-while-indexing-doc-files-tp4093883.html
>> Sent from the Solr - User mailing list archive at Nabble.com. 
> 
> 





