OK, payloads are a bit of a mystery to me, so this may be way off
base.

But...

The ordering of your analysis chain looks suspicious. The admin/analysis
page is a life-saver for checking this kind of thing.

WordDelimiterFilterFactory is breaking up your input before it gets to
the payload filter, I think. It splits on non-alphanumeric characters,
so the "|" (and the "." and "," in your boxes) is gone by the time
DelimitedPayloadTokenFilterFactory runs. Your payload information is then
completely dissociated from your terms and indexed as individual terms
in its own right. At that point what you get
in your index *probably* has no payloads attached at all!
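
If that's what is going on, one thing to try is moving the payload filter
directly after the tokenizer, so the "|" delimiter is still intact when it
runs. An untested sketch that reuses the names from your schema; I'm
assuming WordDelimiterFilterFactory carries the payload over to the parts
it generates, which you should verify on the admin/analysis page:

-------------------------------
<fieldtype name="wppayloads" stored="false" indexed="true"
           class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Moved up: strip "|box" off each token while the "|" is still there -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="identity"/>
    <!-- Now sees only the word part, e.g. "Fonte:" rather than "Fonte:|307.62,..." -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
-------------------------------

You'd have to re-index after changing the chain, since the analysis happens
at index time.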

Use the admin/schema browser link to actually look at the data (or
just go straight to Luke) and I believe you'll see that your position
(bounding box) information is being treated just like any other token in
the input stream.

There should be nothing about payloads that prevents a normal
text query on the text part, though.

Best
Erick

On Thu, Feb 16, 2012 at 9:18 AM, leonardo2 <leonardo.rigut...@gmail.com> wrote:
> Hello,
> I already posted this question but for some reason it was attached to a
> thread with different topic.
>
>
> Is it possible to perform an 'exact search' on a payload field?
>
> I have to index text with auxiliary info for each word. In particular, each
> word is associated with the bounding box containing it in the original pdf
> page (it is used for highlighting the search terms in the pdf). I used the
> payload to store that information.
>
> In the schema.xml, the fieldType definition is:
>
> -------------------------------
> <fieldtype name="wppayloads" stored="false" indexed="true"
>            class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="identity"/>
>   </analyzer>
> </fieldtype>
> -------------------------------
>
> while the field definition is:
>
> -------------------------------
> <field name="words" type="wppayloads" indexed="true" stored="true"
>        required="true" multiValued="true"/>
> -------------------------------
>
> When indexing, the field 'words' contains a list of word|box as in the
> following example:
>
> -------------------------------
> doc_id=example
> words={Fonte:|307.62,948.16,324.62,954.25 Comune|326.29,948.16,349.07,954.25
> di|350.74,948.16,355.62,954.25 Bologna|358.95,948.16,381.28,954.25}
> -------------------------------
>
> This solution works well except in the case of an exact search. For example,
> assuming the only indexed doc is the 'example' doc shown above, the query
> words:"Comune di Bologna" returns no results.
>
> Does anyone know whether it is possible to perform an 'exact search' on a
> payload field?
>
> Thanks in advance,
> Leonardo
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Payload-and-exact-search-2-tp3750355p3750355.html
> Sent from the Solr - User mailing list archive at Nabble.com.
