NullPointerException with TermVectorComponent

2009-11-02 Thread Andrew Clegg

Hi,

I've recently added the TermVectorComponent as a separate handler, following
the example in the supplied config file, i.e.:

  <searchComponent name="tvComponent"
      class="org.apache.solr.handler.component.TermVectorComponent"/>

  <requestHandler name="/tvrh"
      class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
      <bool name="tv">true</bool>
    </lst>
    <arr name="last-components">
      <str>tvComponent</str>
    </arr>
  </requestHandler>

It works, but with one quirk. When you use tv.all=true, you get the tf*idf
scores in the output just fine (along with tf and df). But if you use
tv.tf_idf=true on its own, you get an NPE:

http://server:8080/solr/tvrh/?q=1cuk&version=2.2&indent=on&tv.tf_idf=true

HTTP Status 500 - null java.lang.NullPointerException
  at org.apache.solr.handler.component.TermVectorComponent$TVMapper.getDocFreq(TermVectorComponent.java:253)
  at org.apache.solr.handler.component.TermVectorComponent$TVMapper.map(TermVectorComponent.java:245)
  at org.apache.lucene.index.TermVectorsReader.readTermVector(TermVectorsReader.java:522)
  at org.apache.lucene.index.TermVectorsReader.readTermVectors(TermVectorsReader.java:401)
  at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:378)
  at org.apache.lucene.index.SegmentReader.getTermFreqVector(SegmentReader.java:1253)
  at org.apache.lucene.index.DirectoryReader.getTermFreqVector(DirectoryReader.java:474)
  at org.apache.solr.search.SolrIndexReader.getTermFreqVector(SolrIndexReader.java:244)
  at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:125)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  (etc.)
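
To compare the working and failing requests side by side, the two URLs can be
built programmatically. The following is only a minimal sketch: the class and
method names are made up for illustration, the host and query are the ones
from the request above, "tv.all" is assumed to be the flag that returns
everything, and java.net.URLEncoder with a Charset argument needs Java 10 or
later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class TvRequestUrls {

    // Base URL of the /tvrh handler, as used in the request above.
    static final String BASE = "http://server:8080/solr/tvrh/";

    /** Joins the parameters with '&', URL-encoding each key and value. */
    static String build(String base, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&", base + "?", "");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return query.toString();
    }

    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Builds the example request with a single term vector flag set to true. */
    static String request(String tvFlag) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "1cuk");
        params.put("version", "2.2");
        params.put("indent", "on");
        params.put(tvFlag, "true");
        return build(BASE, params);
    }

    public static void main(String[] args) {
        System.out.println(request("tv.all"));    // reported to work
        System.out.println(request("tv.tf_idf")); // reported to fail with the NPE
    }
}
```

Fetching both URLs against the same index should show whether the failure
depends only on which flag is set.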

Is this a bug, or am I doing it wrong?
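
For what it's worth, one common way to get an NPE that depends on which flags
are set is state that is initialised under one flag but read under another.
The sketch below is purely hypothetical and is not the TermVectorComponent
source: the class, the fields, and the use of tf divided by df as a stand-in
score are all invented to illustrate the pattern. Whether the real getDocFreq
fails for this reason is an assumption that would need checking against
TermVectorComponent.java around line 253.

```java
import java.util.HashMap;
import java.util.Map;

public class FlagGuardSketch {

    /** Hypothetical stand-in for a per-term mapper; not the real TVMapper. */
    static class Mapper {
        private final boolean wantDf;
        private final boolean wantTfIdf;
        private Map<String, Integer> docFreqs; // stays null unless wantDf is set

        Mapper(boolean wantDf, boolean wantTfIdf, Map<String, Integer> index) {
            this.wantDf = wantDf;
            this.wantTfIdf = wantTfIdf;
            if (wantDf) {
                // Suspect pattern: guarded by the df flag only, although
                // the tf-idf branch below depends on it as well.
                this.docFreqs = index;
            }
        }

        Map<String, Object> map(String term, int tf) {
            Map<String, Object> out = new HashMap<>();
            if (wantDf) {
                out.put("df", docFreq(term));
            }
            if (wantTfIdf) {
                out.put("tf-idf", (double) tf / docFreq(term));
            }
            return out;
        }

        private int docFreq(String term) {
            // Throws NullPointerException if docFreqs was never initialised.
            return docFreqs.get(term);
        }
    }

    /** Maps one made-up term ("solr", tf=2, df=4) with the given flags. */
    static Map<String, Object> run(boolean wantDf, boolean wantTfIdf) {
        Map<String, Integer> index = new HashMap<>();
        index.put("solr", 4);
        return new Mapper(wantDf, wantTfIdf, index).map("solr", 2);
    }

    public static void main(String[] args) {
        System.out.println(run(true, true)); // both flags set: no exception
        try {
            run(false, true);                // tf-idf only: docFreqs is null
        } catch (NullPointerException e) {
            System.out.println("NPE when only the tf-idf flag is set");
        }
    }
}
```

If the real code follows this pattern, requesting df together with tf_idf
might avoid the exception, but that is a guess rather than a confirmed
workaround.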

Cheers,

Andrew.

-- 
View this message in context: 
http://old.nabble.com/NullPointerException-with-TermVectorComponent-tp26156903p26156903.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: NullPointerException with TermVectorComponent

2009-11-02 Thread david.stu...@progressivealliance.co.uk
I think it might be to do with the library itself.

I downloaded semanticvectors-1.22 and compiled it from source. I then created
a demo corpus by running

  java org.apache.lucene.demo.IndexFiles

against the Lucene src directory, ran

  java pitt.search.semanticvectors.BuildIndex

against the resulting index, and got the following:

Seedlength = 10
Dimension = 200
Minimum frequency = 0
Number non-alphabet characters = 0
Contents fields are: [contents]
Creating semantic term vectors ...
Populating basic sparse doc vector store, number of vectors: 774
Creating store of sparse vectors  ...
Created 774 sparse random vectors.
Creating term vectors ...
There are 36881 terms (and 774 docs)
0 ... 1000 ... 2000 ... 3000 ... 4000 ... Exception in thread "main" java.lang.NullPointerException
    at org.apache.lucene.index.DirectoryReader$MultiTermDocs.freq(DirectoryReader.java:1068)
    at pitt.search.semanticvectors.LuceneUtils.getGlobalTermFreq(LuceneUtils.java:70)
    at pitt.search.semanticvectors.LuceneUtils.termFilter(LuceneUtils.java:187)
    at pitt.search.semanticvectors.TermVectorsFromLucene.init(TermVectorsFromLucene.java:163)
    at pitt.search.semanticvectors.BuildIndex.main(BuildIndex.java:138)

I am still digging, but the source code references Lucene calls dating back
to Lucene 2.4, a lot of which are deprecated, so it might need some
refreshing.

Cheers,

Dave
