I think it might be to do with the library itself.
I downloaded semanticvectors-1.22 and compiled it from source, then created a
demo corpus by running
java org.apache.lucene.demo.IndexFiles
against the Lucene src directory. I then ran
java pitt.search.semanticvectors.BuildIndex
against the resulting index and got the following:
Seedlength = 10
Dimension = 200
Minimum frequency = 0
Number non-alphabet characters = 0
Contents fields are: [contents]
Creating semantic term vectors ...
Populating basic sparse doc vector store, number of vectors: 774
Creating store of sparse vectors ...
Created 774 sparse random vectors.
Creating term vectors ...
There are 36881 terms (and 774 docs)
0 ... 1000 ... 2000 ... 3000 ... 4000 ... Exception in thread "main" java.lang.NullPointerException
    at org.apache.lucene.index.DirectoryReader$MultiTermDocs.freq(DirectoryReader.java:1068)
    at pitt.search.semanticvectors.LuceneUtils.getGlobalTermFreq(LuceneUtils.java:70)
    at pitt.search.semanticvectors.LuceneUtils.termFilter(LuceneUtils.java:187)
    at pitt.search.semanticvectors.TermVectorsFromLucene.<init>(TermVectorsFromLucene.java:163)
    at pitt.search.semanticvectors.BuildIndex.main(BuildIndex.java:138)
I am still digging, but the source code references Lucene calls dating back
to Lucene 2.4, a lot of which are deprecated, so it might need some
refreshing.
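
My guess, not yet confirmed against the semanticvectors source, is that freq()
is being called on a term-docs cursor that has not been advanced with next(),
or that has already run out. Here is a minimal sketch of the guarded pattern I
would expect getGlobalTermFreq to follow. It deliberately does not use the
real Lucene classes: PostingCursor, cursorOver and globalTermFreq are
hypothetical stand-ins I made up so the example runs on its own.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TermFreqSketch {

    /** Hypothetical stand-in for a postings cursor (not a Lucene type). */
    interface PostingCursor {
        boolean next(); // advance to the next document; false when exhausted
        int freq();     // frequency in the current document; only valid after next() returned true
    }

    /** In-memory cursor that throws an NPE if freq() is called with no current document. */
    static PostingCursor cursorOver(List<Integer> freqs) {
        final Iterator<Integer> it = freqs.iterator();
        return new PostingCursor() {
            private Integer current; // null until next() succeeds

            public boolean next() {
                current = it.hasNext() ? it.next() : null;
                return current != null;
            }

            public int freq() {
                return current; // unboxing null throws NullPointerException
            }
        };
    }

    /** Sums per-document frequencies; treats a missing cursor as frequency 0. */
    static int globalTermFreq(PostingCursor cursor) {
        if (cursor == null) {
            return 0;
        }
        int total = 0;
        while (cursor.next()) { // only read freq() after a successful next()
            total += cursor.freq();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(globalTermFreq(cursorOver(Arrays.asList(3, 1, 4)))); // prints 8
        System.out.println(globalTermFreq(null));                               // prints 0
    }
}
```

If the real method reads the frequency outside a loop like this, that would
explain why it only blows up part of the way through the term list.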
Cheers,
Dave
On 02 November 2009 at 14:40 Andrew Clegg wrote:
>
> Hi,
>
> I've recently added the TermVectorComponent as a separate handler, following
> the example in the supplied config file, i.e.:
>
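> <searchComponent name="tvComponent"
>     class="org.apache.solr.handler.component.TermVectorComponent"/>
>
> <requestHandler name="tvrh"
>     class="org.apache.solr.handler.component.SearchHandler">
>   <lst name="defaults">
>     <bool name="tv">true</bool>
>   </lst>
>   <arr name="last-components">
>     <str>tvComponent</str>
>   </arr>
> </requestHandler>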
>
> It works, but with one quirk. When you use tv.all=true, you get the tf*idf
> scores in the output just fine (along with tf and df). But if you use
> tv.tf_idf=true you get an NPE:
>
> http://server:8080/solr/tvrh/?q=1cuk&version=2.2&indent=on&tv.tf_idf=true
>
> HTTP Status 500 - null java.lang.NullPointerException
>     at org.apache.solr.handler.component.TermVectorComponent$TVMapper.getDocFreq(TermVectorComponent.java:253)
>     at org.apache.solr.handler.component.TermVectorComponent$TVMapper.map(TermVectorComponent.java:245)
>     at org.apache.lucene.index.TermVectorsReader.readTermVector(TermVectorsReader.java:522)
>     at org.apache.lucene.index.TermVectorsReader.readTermVectors(TermVectorsReader.java:401)
>     at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:378)
>     at org.apache.lucene.index.SegmentReader.getTermFreqVector(SegmentReader.java:1253)
>     at org.apache.lucene.index.DirectoryReader.getTermFreqVector(DirectoryReader.java:474)
>     at org.apache.solr.search.SolrIndexReader.getTermFreqVector(SolrIndexReader.java:244)
>     at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:125)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>     (etc.)
>
> Is this a bug, or am I doing it wrong?
>
> Cheers,
>
> Andrew.
>
> --
> View this message in context:
> http://old.nabble.com/NullPointerException-with-TermVectorComponent-tp26156903p26156903.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>