I think the problem here is that the list has three values, but the last one
is itself a comma-separated set of several. Anurag seems to want to split them
into separate values whether they arrive as individual array items or as part
of a joined list. So we have a mix of a multiValued submission and a desire to
split the values further.

The correct solution, I suspect, would be to normalize everything to just
training_skill:["c", "c++", "php", "java", ".net"] before this hits Solr.
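
A minimal sketch of that client-side normalization, in plain Python and
independent of any Solr client library. The helper name normalize_skills is
made up for illustration, and trimming whitespace and dropping duplicates are
my assumptions about what is wanted, not something stated in the thread:

```python
def normalize_skills(values):
    """Split comma-joined entries so that each skill becomes its own value."""
    skills = []
    for value in values:
        for part in value.split(","):
            part = part.strip()  # assumption: surrounding whitespace is noise
            if part and part not in skills:  # assumption: drop empties and repeats
                skills.append(part)
    return skills


# The document from the original question.
doc = {"last_name": "jain", "training_skill": ["c", "c++", "php,java,.net"]}
doc["training_skill"] = normalize_skills(doc["training_skill"])
print(doc["training_skill"])
```

The normalized document can then be posted to Solr as usual, with one array
entry per skill.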

However, since he wants this for facets, and as a training exercise, one
could remember that facet values come from the indexed tokens, not the
stored value. So it might be possible to do this:
    <field name="test" type="comaSplit" indexed="true" stored="true"
           multiValued="true"/>
    <fieldType name="comaSplit" class="solr.TextField"
               positionIncrementGap="100">
        <analyzer>
            <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
        </analyzer>
    </fieldType>

I think the faceting code will probably just aggregate all tokens, even
though they are spread over multiple values.
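
To illustrate the behaviour I am expecting, here is a rough plain-Python
model. It is not Solr code: it only mimics splitting each value on "," and
then counting, per token, how many documents contain it, which is roughly
what a field facet reports. The second document is invented purely for the
example:

```python
import re
from collections import Counter


def tokens(values, pattern=","):
    """Split every value of a multi-valued field on the pattern."""
    out = []
    for value in values:
        out.extend(t for t in re.split(pattern, value) if t)
    return out


docs = [
    {"test": ["c", "c++", "php,java,.net"]},  # the document from the question
    {"test": ["java,c"]},                     # hypothetical second document
]

facets = Counter()
for doc in docs:
    # A facet count is the number of documents containing the token,
    # so each token is counted at most once per document.
    facets.update(set(tokens(doc["test"])))

for term, count in sorted(facets.items()):
    print(term, count)
```

If Solr behaves the same way, it should not matter whether a skill arrived as
its own array item or inside a joined string.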

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Jan 17, 2013 at 2:33 PM, Gora Mohanty <g...@mimirtech.com> wrote:

> On 18 January 2013 00:31, anurag.jain <anurag.k...@gmail.com> wrote:
> >
> >   [ { "last_name" : "jain", "training_skill":["c", "c++", "php,java,.net"]
> > }
> > ]
> >
> > actually i want to tokenize in   c c++ php java .net
>
> What do you mean by "tokenize" in this case? It has
> been a while since I had occasion to use JSON input,
> and also do not remember which Solr version introduced
> this, but with a JSON array mapped to a multi-valued
> Solr field, you should get one value per entry in the array.
> http://wiki.apache.org/solr/UpdateJSON#Update_Commands
> seems to be in agreement.
>
> > so through this i can make them as facet.
> >
> >
> > but problem is in list
> > "training_skill":["c", "c++", *"php,java,.net"*]
>
> Faceting should be straightforward. Are you not
> seeing the behaviour described above? Could
> you describe the issues that you are facing in
> more detail?
>
> Regards,
> Gora
>
