Here it is; the regex is very simple:

    <fieldType name="ex" class="solr.TextField">
        <analyzer type="index">
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z ])" replacement="" replace="all" />
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z ])" replacement="" replace="all" />
        </analyzer>
    </fieldType>

But the problem is not about the field type. The problem is how to retrieve
the final token and put it into the field. Theoretically I can retrieve
the token with AnalysisRequestHandler.
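
For reference, the client-side route suggested earlier in the thread (build the value before the document is sent to Solr) might look roughly like the Python sketch below. It mirrors the "ex" filter chain: lowercase, then drop every character that is not a-z or a space. The names `build_ex_value` and `doc` are hypothetical, collapsing runs of spaces is an extra step that is not in the schema, and actually posting the document to Solr is left out.

```python
import re

# Same character class as the PatternReplaceFilterFactory pattern in the "ex" type.
NON_LETTERS = re.compile(r"[^a-z ]")
# Assumption: runs of spaces left behind by removed characters should be collapsed.
SPACE_RUNS = re.compile(r" {2,}")


def build_ex_value(description: str) -> str:
    """Return the value to store in the "ex" field for a given description."""
    lowered = description.lower()                # LowerCaseFilterFactory step
    letters_only = NON_LETTERS.sub("", lowered)  # PatternReplaceFilterFactory step
    return SPACE_RUNS.sub(" ", letters_only).strip()


# Hypothetical document, built on the client before it is added to the index.
doc = {
    "id": "SP2514N",
    "description": "Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133",
}
doc["ex"] = build_ex_value(doc["description"])
print(doc["ex"])
```

Because a stored field returns exactly the value it was given, computing "ex" this way makes the cleaned string the stored value rather than only the indexed token.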
    
JN> Could you post the fieldType specification for "ex"? What does your regex
JN> look like?





JN> -----Original Message-----
JN> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED] 
JN> Sent: Wednesday, October 22, 2008 11:39 AM
JN> To: Joe Nguyen
JN> Subject: Re[6]: Question about copyField


JN>> It doesn't need to be a copy field, right? Could you create a new field
JN>> "ex", extract the value from description, delete the digits, and set it
JN>> on the "ex" field before adding/indexing to the Solr server?

JN> Yes, I can. I was just wondering whether I can use Solr for this
JN> purpose or not.

JN>> -----Original Message-----
JN>> From: Feak, Todd [mailto:[EMAIL PROTECTED] 
JN>> Sent: Wednesday, October 22, 2008 11:25 AM
JN>> To: solr-user@lucene.apache.org
JN>> Subject: RE: Re[4]: Question about copyField

JN>> My bad. I misunderstood what you wanted. 

JN>> The example I gave was for the searching side of things, not the data
JN>> representation in the document.

JN>> -Todd

JN>> -----Original Message-----
JN>> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED] 
JN>> Sent: Wednesday, October 22, 2008 11:14 AM
JN>> To: Feak, Todd
JN>> Subject: Re[4]: Question about copyField



FT>>> I would suggest doing this in your schema, then starting up Solr and
FT>>> using the analysis admin page to see if it will index and search the
FT>>> way you want. That way you don't have to pay the cost of actually
FT>>> indexing the data to find out.

JN>> Thanks. I did it exactly like you said.

JN>> I created a fieldType "ex" (short for experiment), defined the
JN>> corresponding <copyField> and tried it on the analysis page. Here is
JN>> what I got (I uploaded the page, so you can see it):

JN>> http://tut-i-tam.com.ua/static/analysis.jsp.htm

JN>> I want the final token "samsung spinpoint p spn hard drive gb ata" to
JN>> be the actual "ex" value. So I expect a response like this:

JN>> <result name="response" numFound="1" start="0">
JN>>         <doc>
JN>>              <str name="ex">samsung spinpoint p spn hard drive gb ata</str>
JN>>              <str name="id">SP2514N</str>
JN>>              <str name="description">Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133</str>
JN>>         </doc>
JN>> </result>

JN>> But when I search for this doc, I get this:

JN>> <result name="response" numFound="1" start="0">
JN>>         <doc>
JN>>              <str name="ex">Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133</str>
JN>>              <str name="id">SP2514N</str>
JN>>              <str name="description">Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133</str>
JN>>         </doc>
JN>> </result>

JN>> As you can see, the "description" and "ex" fields are identical.
JN>> The result of the filter chain wasn't actually stored in the "ex" field :(

JN>> Anyway, thank you :)

FT>>> -Todd

FT>>> -----Original Message-----
FT>>> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED] 
FT>>> Sent: Wednesday, October 22, 2008 9:24 AM
FT>>> To: Feak, Todd
FT>>> Subject: Re[2]: Question about copyField


FT>>> Thanks for the reply. I want to pin down your point more exactly,
FT>>> because I'm not sure that I understood you correctly :)

FT>>> As far as I know (please correct me if I'm wrong), the type defines the
FT>>> way in which the field is indexed and queried. But I don't want to index
FT>>> or query the "suggestion" field in a different way; I want the
FT>>> "suggestion" field to store a different value (as in the example I wrote
FT>>> in my first mail).

FT>>> So you are saying that I can tell Solr (using fieldType) how Solr
FT>>> should process a string before saving it? Yes?

FT>>>> The filters and tokenizer that are applied to the copy field are
FT>>>> determined by its type in the schema. Simply create a new field type in
FT>>>> your schema with the filters you would like, and use that type for your
FT>>>> copy field. So, the field description would have its old type, but the
FT>>>> field suggestion would get a new type.

FT>>>> -Todd Feak

FT>>>> -----Original Message-----
FT>>>> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED] 
FT>>>> Sent: Wednesday, October 22, 2008 8:28 AM
FT>>>> To: solr-user@lucene.apache.org
FT>>>> Subject: Question about copyField


FT>>>> Hello.

FT>>>> I have a field "description" in my schema, and I want to make a field
FT>>>> "suggestion" with the same content. So I added the following line to my
FT>>>> schema.xml:

FT>>>>    <copyField source="description" dest="suggestion"/>

FT>>>> But I also want to modify the "description" string before copying it to
FT>>>> the "suggestion" field. I want to remove all commas, dots and slashes.
FT>>>> Here is an example of such a transformation:

FT>>>> "TvPL/st, SAMSUNG, SML200"  => "TvPL st SAMSUNG SML200"

FT>>>> So as a result I want to have a doc like this:

FT>>>> <doc>
FT>>>>      <field name="id">8asydauf9nbcngfaad</field>
FT>>>>      <field name="description">TvPL/st, SAMSUNG, SML200</field>
FT>>>>      <field name="suggestion">TvPL st SAMSUNG SML200</field>
FT>>>> </doc>

FT>>>> I think it would be nice to use solr.PatternReplaceFilterFactory for
FT>>>> this purpose. So the question is: can I use Solr filters for
FT>>>> processing the "description" string before copying it to the
FT>>>> "suggestion" field?

FT>>>> Thank you for your attention.













-- 
Aleksey Gogolev
developer, 
dev.co.ua
Aleksey                         mailto:[EMAIL PROTECTED]
