Here it is, the regex is very simple:

<fieldType name="ex" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z ])" replacement="" replace="all" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z ])" replacement="" replace="all" />
  </analyzer>
</fieldType>

But the problem is not about the field type. The problem is how to
retrieve the final token and put it into the field. Theoretically I can
retrieve the token with AnalysisRequestHandler.

JN> Could you post fieldType specification for "ex"? What does your
JN> regex look like?

JN> -----Original Message-----
JN> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED]
JN> Sent: Wednesday, October 22, 2008 11:39 Joe
JN> To: Joe Nguyen
JN> Subject: Re[6]: Question about copyField

JN>> It doesn't need to be a copy field, right? Could you create a new
JN>> field "ex", extract value from description, delete digits, and set
JN>> to "ex" field before add/index to solr server?

JN> Yes, I can. I was just wondering whether I can use solr for this
JN> purpose or not.

JN>> -----Original Message-----
JN>> From: Feak, Todd [mailto:[EMAIL PROTECTED]
JN>> Sent: Wednesday, October 22, 2008 11:25 Joe
JN>> To: solr-user@lucene.apache.org
JN>> Subject: RE: Re[4]: Question about copyField

JN>> My bad. I misunderstood what you wanted.

JN>> The example I gave was for the searching side of things. Not the
JN>> data representation in the document.

JN>> -Todd

JN>> -----Original Message-----
JN>> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED]
JN>> Sent: Wednesday, October 22, 2008 11:14 AM
JN>> To: Feak, Todd
JN>> Subject: Re[4]: Question about copyField

FT>>> I would suggest doing this in your schema, then starting up Solr
FT>>> and using the analysis admin page to see if it will index and
FT>>> search the way you want. That way you don't have to pay the cost
FT>>> of actually indexing the data to find out.

JN>> Thanks. I did it exactly like you said.

JN>> I created a fieldType "ex" (short for experiment), defined the
JN>> corresponding <copyField> and tried it on the analysis page.
JN>> Here is what I got (I uploaded the page, so you can see it):

JN>> http://tut-i-tam.com.ua/static/analysis.jsp.htm

JN>> I want the final token "samsung spinpoint p spn hard drive gb ata"
JN>> to be the actual "ex" value. So I expect a response like this:

JN>> <result name="response" numFound="1" start="0">
JN>>   <doc>
JN>>     <str name="ex">samsung spinpoint p spn hard drive gb ata</str>
JN>>     <str name="id">SP2514N</str>
JN>>     <str name="description">Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133</str>
JN>>   </doc>
JN>> </result>

JN>> But when I search for this doc, I get this:

JN>> <result name="response" numFound="1" start="0">
JN>>   <doc>
JN>>     <str name="ex">Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133</str>
JN>>     <str name="id">SP2514N</str>
JN>>     <str name="description">Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133</str>
JN>>   </doc>
JN>> </result>

JN>> As you can see, the "description" and "ex" fields are identical.
JN>> The result of the filter chain wasn't actually stored in the "ex"
JN>> field :(

JN>> Anyway, thank you :)

FT>>> -Todd

FT>>> -----Original Message-----
FT>>> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED]
FT>>> Sent: Wednesday, October 22, 2008 9:24 AM
FT>>> To: Feak, Todd
FT>>> Subject: Re[2]: Question about copyField

FT>>> Thanks for the reply. I want to make your point more exact, because
FT>>> I'm not sure that I understood you correctly :)

FT>>> As far as I know (please correct me if I'm wrong) the type defines
FT>>> the way in which the field is indexed and queried. But I don't want
FT>>> to index or query the "suggestion" field in a different way, I want
FT>>> the "suggestion" field to store a different value (like in the
FT>>> example I wrote in my first mail).

FT>>> So you are saying that I can tell solr (using a fieldType) how
FT>>> solr should process the string before saving it? Yes?
FT>>>> The filters and tokenizer that are applied to the copy field are
FT>>>> determined by its type in the schema. Simply create a new field
FT>>>> type in your schema with the filters you would like, and use that
FT>>>> type for your copy field. So, the field description would have its
FT>>>> old type, but the field suggestion would get a new type.

FT>>>> -Todd Feak

FT>>>> -----Original Message-----
FT>>>> From: Aleksey Gogolev [mailto:[EMAIL PROTECTED]
FT>>>> Sent: Wednesday, October 22, 2008 8:28 AM
FT>>>> To: solr-user@lucene.apache.org
FT>>>> Subject: Question about copyField

FT>>>> Hello.

FT>>>> I have a field "description" in my schema. And I want to make a
FT>>>> field "suggestion" with the same content. So I added the following
FT>>>> line to my schema.xml:

FT>>>> <copyField source="description" dest="suggestion"/>

FT>>>> But I also want to modify the "description" string before copying
FT>>>> it to the "suggestion" field. I want to remove all commas, dots and
FT>>>> slashes. Here is an example of such a transformation:

FT>>>> "TvPL/st, SAMSUNG, SML200" => "TvPL st SAMSUNG SML200"

FT>>>> And so as a result I want to have such a doc:

FT>>>> <doc>
FT>>>>   <field name="id">8asydauf9nbcngfaad</field>
FT>>>>   <field name="description">TvPL/st, SAMSUNG, SML200</field>
FT>>>>   <field name="suggestion">TvPL st SAMSUNG SML200</field>
FT>>>> </doc>

FT>>>> I think it would be nice to use solr.PatternReplaceFilterFactory
FT>>>> for this purpose. So the question is: Can I use solr filters for
FT>>>> processing the "description" string before copying it to the
FT>>>> "suggestion" field?

FT>>>> Thank you for your attention.

-- 
Aleksey Gogolev
developer, dev.co.ua
Aleksey mailto:[EMAIL PROTECTED]
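
A rough sketch of the client-side approach suggested in the thread (build the "ex" value yourself before sending the document to Solr), since the stored value of a copyField destination is the raw source string and the analyzer only affects the indexed tokens. This is only an illustration under stated assumptions: the helper name clean_for_ex is hypothetical, the regex is the one from the "ex" fieldType, and the whitespace collapsing is an extra step that is not part of that filter chain. It only builds the document and does not talk to Solr.

```python
import re


def clean_for_ex(text: str) -> str:
    """Hypothetical helper: mimic the "ex" analyzer chain on the client."""
    lowered = text.lower()                      # like LowerCaseFilterFactory
    stripped = re.sub(r"[^a-z ]", "", lowered)  # same pattern as in the schema
    # Assumption, not in the schema: collapse the runs of spaces that are
    # left behind where digits and dashes were removed.
    return " ".join(stripped.split())


description = "Samsung SpinPoint12 P120 SP2514N - hard drive - 250 GB - ATA-133"

# The document as it would be sent for indexing: "ex" already holds the
# cleaned value, so it is stored as-is and no copyField is needed for it.
doc = {
    "id": "SP2514N",
    "description": description,
    "ex": clean_for_ex(description),
}

print(doc["ex"])
```

With this, "ex" can be declared as an ordinary stored field and the cleaned string comes back in search results, at the cost of keeping the regex in client code instead of in schema.xml.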