Re: Store input text after analyzers and token filters
For Solr 1.4 it is basically the same, but IndexSchema (org.apache.solr.schema.IndexSchema) needs to be updated to include the function getFieldTypeByName(String fieldTypeName), which is already in Solr 1.5:

    /**
     * Given the name of a {@link org.apache.solr.schema.FieldType} (not to be confused with {@link #getFieldType(String)} which
     * takes in the name of a field), return the {@link org.apache.solr.schema.FieldType}.
     * @param fieldTypeName The name of the {@link org.apache.solr.schema.FieldType}
     * @return The {@link org.apache.solr.schema.FieldType} or null.
     */
    public FieldType getFieldTypeByName(String fieldTypeName){
      return fieldTypes.get(fieldTypeName);
    }

Then the AnalyzedField is a bit different, but it is basically a copy of TextField as it is in 1.4:
http://old.nabble.com/file/p27902273/AnalyzedField.java AnalyzedField.java

--
View this message in context: http://old.nabble.com/Store-input-text-after-analyzers-and-token-filters-tp27792550p27902273.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Store input text after analyzers and token filters
Otis,

I've been thinking about it and trying to work out the possible solutions:
- Solve it with a bridge between Solr and clustering.
- Solve it before/during indexing.

The second option is of course better for performance, but how to do it? I think a good option may be to create a new type derived from the FieldType class, like SortableIntField, which has the toInternal(String val) function. The problem then is how to include the result of another field type's analysis in the toInternal function.

So there would be a new type, usable on copy fields, that takes the analysis of the source field and injects it. It takes as a parameter the field from which it takes the analysis.

So, how can I get the result of the analysis of a given text by a given field using internal functions?

Otis Gospodnetic wrote:
> Hi Joan,
> You could use the FieldAnalysisRequestHandler: http://www.search-lucene.com/?q=FieldAnalysisRequestHandler
> Otis
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Hadoop ecosystem search :: http://search-hadoop.com/
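The idea in the message above (a field type whose toInternal step stores the output of another type's analysis) can be modeled roughly as follows. This is a toy sketch in plain Java, not Solr code: the class AnalyzedFieldSketch, the FIELD_TYPES map, the stop-word list, and the "text" chain are all hypothetical stand-ins. In particular, FIELD_TYPES only imitates a lookup of a field type by name, and the real analysis would come from the source type's analyzer rather than from a hand-written tokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Toy model only: none of these names are Solr or Lucene classes.
class AnalyzedFieldSketch {

    // Stand-in for the schema's "type name -> field type" map; each type is
    // reduced here to its analysis chain (raw text in, token list out).
    static final Map<String, Function<String, List<String>>> FIELD_TYPES = new HashMap<>();

    // Hypothetical stop-word list for the toy chain.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    static {
        // Hypothetical "text" type: split on non-letters/digits, lowercase, drop stop words.
        FIELD_TYPES.put("text", input -> {
            List<String> tokens = new ArrayList<>();
            for (String raw : input.split("[^\\p{L}\\p{Nd}]+")) {
                String token = raw.toLowerCase(Locale.ROOT);
                if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                    tokens.add(token);
                }
            }
            return tokens;
        });
    }

    // Plays the role of toInternal(String val): look up the source type by
    // name, run its analysis, and return the tokens joined as the stored value.
    static String toInternal(String sourceTypeName, String val) {
        Function<String, List<String>> analysis = FIELD_TYPES.get(sourceTypeName);
        if (analysis == null) {
            throw new IllegalArgumentException("Unknown field type: " + sourceTypeName);
        }
        return String.join(" ", analysis.apply(val));
    }

    public static void main(String[] args) {
        System.out.println(toInternal("text", "The Store of a Raw-Input TEXT!"));
    }
}
```

Joining the tokens with a single space is a design choice of this sketch; a consumer such as clustering might prefer another separator or want token offsets preserved.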
Store input text after analyzers and token filters
In a stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or other interesting transformation of the text, it could be useful to store the text as it is after the tokenizer/filter chain. Is there a way to do this, so that the text of the document can be retrieved after it has been processed?

thanks
Joan
Re: Store input text after analyzers and token filters
> In a stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or other interesting transformation of the text, it could be useful to store the text as it is after the tokenizer/filter chain. Is there a way to do this, so that the text of the document can be retrieved after it has been processed?

You can get term vectors [1] of the analyzed text. You can also see the analyzed text in solr/admin/analysis.jsp if you copy and paste sample text data.

[1] http://wiki.apache.org/solr/TermVectorComponent
Re: Store input text after analyzers and token filters
Thanks,

It can be useful as a workaround, but I get a vector, not a result that I can use wherever I could use the stored text. I'm thinking of clustering.

Ahmet Arslan wrote:
> > In a stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or other interesting transformation of the text, it could be useful to store the text as it is after the tokenizer/filter chain. Is there a way to do this, so that the text of the document can be retrieved after it has been processed?
>
> You can get term vectors [1] of the analyzed text. You can also see the analyzed text in solr/admin/analysis.jsp if you copy and paste sample text data.
>
> [1] http://wiki.apache.org/solr/TermVectorComponent
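On the point that a term vector is not directly usable as text: if the field stores term positions as well (an assumption about the schema, not something stated in the thread), the analyzed token sequence can be rebuilt by sorting terms by position. The sketch below is plain Java with a hypothetical TermVectorToText class. Its input map (term to positions) is a hand-built stand-in and not a parser for the actual TermVectorComponent response.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: rebuilds an analyzed token stream from term positions.
class TermVectorToText {

    // termPositions maps each term to the token positions at which it occurs.
    static String rebuild(Map<String, int[]> termPositions) {
        // Invert the vector: position -> terms found at that position, sorted by position.
        TreeMap<Integer, List<String>> byPosition = new TreeMap<>();
        for (Map.Entry<String, int[]> entry : termPositions.entrySet()) {
            for (int position : entry.getValue()) {
                byPosition.computeIfAbsent(position, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Walk positions in ascending order; gaps (for example removed stop words) are skipped.
        List<String> ordered = new ArrayList<>();
        for (List<String> termsAtPosition : byPosition.values()) {
            ordered.addAll(termsAtPosition);
        }
        return String.join(" ", ordered);
    }

    public static void main(String[] args) {
        Map<String, int[]> vector = new LinkedHashMap<>();
        vector.put("fox", new int[] {1, 3});
        vector.put("quick", new int[] {0});
        vector.put("jumps", new int[] {2});
        System.out.println(rebuild(vector));
    }
}
```

Terms that share a position (synonyms, for instance) are emitted next to each other in map iteration order, which may or may not suit a clustering component.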
Re: Store input text after analyzers and token filters
Hi Joan,

You could use the FieldAnalysisRequestHandler: http://www.search-lucene.com/?q=FieldAnalysisRequestHandler

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/

- Original Message
From: JCodina joan.cod...@barcelonamedia.org
To: solr-user@lucene.apache.org
Sent: Fri, March 5, 2010 6:01:37 AM
Subject: Store input text after analyzers and token filters

> In an stored field, the content stored is the raw input text. But when the analyzers perform some cleaning or interesting transformation of the text, then it could be interesting to store the text after the tokenizer/Filter chain there is a way to do this? To be able to get back the text of the document after being processed??
> thanks
> Joan
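As a rough illustration of the FieldAnalysisRequestHandler suggestion, the sketch below only builds the request URL; it does not send it or parse the response. Several things here are assumptions: the base URL http://localhost:8983/solr, the handler being registered at /analysis/field in solrconfig.xml, and the field name "title". The class FieldAnalysisUrl is hypothetical. The parameters analysis.fieldname and analysis.fieldvalue are the handler's request parameters as I understand them, so check them against the documentation for your Solr version.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds a field-analysis request URL.
class FieldAnalysisUrl {

    // solrBase, the /analysis/field path, and the field name are assumptions
    // about the local setup, not values taken from the thread.
    static String build(String solrBase, String fieldName, String text) {
        try {
            return solrBase + "/analysis/field"
                    + "?analysis.fieldname=" + URLEncoder.encode(fieldName, "UTF-8")
                    + "&analysis.fieldvalue=" + URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "title", "Raw Input & Text"));
    }
}
```

The response lists the tokens produced at each stage of the field's analysis chain, so a client would still need to extract the final stage and join the tokens to obtain the processed text.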