Re: Copy field a source of copy field

Erick Erickson Tue, 18 Jul 2017 08:58:50 -0700

The code is very simple. At a quick glance, it just reads the words in
from the text file, and the "accept" method returns true or false based
on whether that word list contains the token.
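
To make that concrete, here is a rough Python sketch of the behaviour described above. It is not the actual Lucene code, and the names (load_words, accept, keep_words) and the sample word lists are made up for illustration. It also shows what chaining two keep word filters amounts to: a token survives only if it is in both lists.

```python
def load_words(lines, ignore_case=True):
    """Build the keep-word set from the lines of a word file (one entry per line)."""
    words = set()
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        words.add(entry.lower() if ignore_case else entry)
    return words


def accept(token, words, ignore_case=True):
    """Return True if the token is in the keep-word set, False otherwise."""
    return (token.lower() if ignore_case else token) in words


def keep_words(tokens, words, ignore_case=True):
    """Drop every token that accept() rejects, preserving order."""
    return [t for t in tokens if accept(t, words, ignore_case)]


# Hypothetical word files, only for this illustration.
species = load_words(["abarema", "abarema_idiopoda"])
genus = load_words(["abarema"])

# Tokens as they might look after shingling with unigrams kept.
tokens = ["abarema", "idiopoda", "abarema_idiopoda"]

after_species = keep_words(tokens, species)      # first keep word filter
after_genus = keep_words(after_species, genus)   # second keep word filter
print(after_species)
print(after_genus)
```

Note that the second filter only ever sees what the first one let through, so a genus token has to be present in the first word list as well to reach the second.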


Are you sure you pushed the changed schema to the right place and
reloaded your core/collection? The admin/analysis page is very helpful
here: your indexing side should show two keep word filters, and you
should be able to see each transformation (uncheck the "verbose"
checkbox for more readability).
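
If you would rather check this from a script than from the admin UI, something like the following sketch builds the two requests involved: a CoreAdmin RELOAD and a call to the field analysis handler. The base URL and the core name "mycore" are assumptions, and the sketch only builds the URLs; it presumes a standalone core with the /analysis/field handler available. A SolrCloud collection would be reloaded through the Collections API instead.

```python
from urllib.parse import urlencode


def reload_core_url(base_url, core):
    """URL for a CoreAdmin RELOAD of the given core."""
    params = urlencode({"action": "RELOAD", "core": core})
    return f"{base_url}/admin/cores?{params}"


def field_analysis_url(base_url, core, field_type, value):
    """URL asking the field analysis handler to run index-time analysis on a value."""
    params = urlencode({
        "analysis.fieldtype": field_type,
        "analysis.fieldvalue": value,
        "wt": "json",
    })
    return f"{base_url}/{core}/analysis/field?{params}"


# Assumed values; adjust to your installation.
BASE = "http://localhost:8983/solr"
CORE = "mycore"

print(reload_core_url(BASE, CORE))
print(field_analysis_url(BASE, CORE, "genus_type", "abarema idiopoda"))
```

Fetching the second URL after the reload should list the output of each stage of the index analyzer, which is the same information the analysis page displays.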

Best,
Erick

On Tue, Jul 18, 2017 at 8:49 AM, tstusr <ulfrhe...@gmail.com> wrote:
> Ok, I know shingling will join with "_".
>
> But that is the behaviour we want. Imagine we have these entries (contained
> in the species file):
>
> abarema idiopoda
> abutilon bakerianum
>
> Those become:
> abarema
> idiopoda
> abutilon
> bakerianum
> abarema_idiopoda
> abutilon_bakerianum
>
> But my genus file may contain only the word abarema, so we end up with
> a field containing only that word.
>
> So the requirements here are to find all species listed in the species
> file (step one) and then build a facet from those that are also in the
> genus file (step two).
>
> It seemed reasonable to just chain the fields; I just forgot that Solr
> doesn't change the field, as Shawn points out (thanks for that).
>
> So what we came up with is two fields, the first one for species.
>
> <fieldType name="species_type" class="solr.TextField" positionIncrementGap="0">
>     <analyzer type="index">
>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
>       <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
>       <filter class="solr.KeepWordFilterFactory" words="species.txt" ignoreCase="true"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.StandardTokenizerFactory"/>
>       <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="false"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>   </fieldType>
>
> And the second one (genus), which holds the genus values used for
> faceting, like this:
>
> <fieldType name="genus_type" class="solr.TextField" positionIncrementGap="0">
>     <analyzer type="index">
>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
>       <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
>       <filter class="solr.KeepWordFilterFactory" words="species.txt" ignoreCase="true"/>
>       <filter class="solr.KeepWordFilterFactory" words="genus.txt" ignoreCase="true"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.StandardTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>   </fieldType>
>
> Nevertheless, the second keep word filter does not seem to be applied as
> I expected. Am I missing something?
