I am using Solr in a web app to extract text from .pdf and .docx files. I was 
wondering whether I can access the TermFreq and TermPosition vectors via the 
HTTP interface exposed by Solr Cell. I’m posting/getting documents fine, and 
I’ve enabled the TV, TFV, etc. in the managed schema:

<field name="doc_content" type="text_ws" indexed="true" termOffsets="true" 
stored="true" termPayloads="true" termPositions="true" termVectors="true"/>

And I use a GET request similar to:

   
http://localhost:8983/solr/myCore/tvrh?q=doc_content&tv=true&tv.tf=true&tv.df=true&tv.positions=true&tv.offsets=true&tv.payloads=true&tv.fl=includes
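
For anyone who wants to reproduce this outside the browser, here is a minimal 
Python sketch that builds the same request. The helper names build_tvrh_url 
and fetch are made up for illustration, and the host, core name and parameter 
values are simply copied from the URL above; it assumes the /tvrh handler is 
configured on the core.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_tvrh_url(base, core, query, field):
    """Build a /tvrh request URL (hypothetical helper, not part of Solr)."""
    # Same parameters, in the same order, as the GET request in this post.
    params = {
        "q": query,
        "tv": "true",
        "tv.tf": "true",
        "tv.df": "true",
        "tv.positions": "true",
        "tv.offsets": "true",
        "tv.payloads": "true",
        "tv.fl": field,
    }
    return f"{base}/solr/{core}/tvrh?{urlencode(params)}"


def fetch(url):
    """Return the raw response body as text (needs a running Solr instance)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


# Values assumed from the post: local Solr, core "myCore".
url = build_tvrh_url("http://localhost:8983", "myCore", "doc_content", "includes")
print(url)
```

Calling fetch(url) against a running instance should return the same response 
body I see in the browser, which makes it easier to check whether a term 
vector section comes back at all.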

When I look in the browser network tab, I see that the query went in as 
expected with tv=true, tv.positions=true, etc., but there are no term 
positions/offsets in the results. I’ve done something similar using the Data 
Import Handler with Java, but I’m looking for a web solution. Before I “roll 
my own” term vector, I thought I’d see if it’s available from Solr Cell.
