[ https://issues.apache.org/jira/browse/LUCENE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870041#comment-16870041 ]
Tomoko Uchida commented on LUCENE-8873:
---------------------------------------

I'm trying to find handy ways to properly manage and document the properties (for both developers and users). For example, would something like the following pseudocode look good?

{code:java}
/**
 * Factory for {@link NGramTokenizer}.
 *
 * @since 3.1
 * @lucene.spi {@value #NAME}
 */
public class NGramTokenizerFactory extends TokenizerFactory {

  /** SPI name */
  public static final String NAME = "nGram";

  /** Property {@value #PROP_MAX_GRAM_SIZE} - Maximum gram size */
  public static final String PROP_MAX_GRAM_SIZE = "maxGramSize";

  /** Property {@value #PROP_MIN_GRAM_SIZE} - Minimum gram size */
  public static final String PROP_MIN_GRAM_SIZE = "minGramSize";

  @lucene.analysis.property(name="maxGramSize", required=false, default=NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE)
  private final int maxGramSize;

  @lucene.analysis.property(name="minGramSize", required=false, default=NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE)
  private final int minGramSize;

  /** Creates a new NGramTokenizerFactory */
  public NGramTokenizerFactory(Map<String, String> args) {
    super(args);
    /* All properties are derived from annotations (in the superclass's constructor),
       so we don't have to set those manually */
    // minGramSize = getInt(args, "minGramSize", NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE);
    // maxGramSize = getInt(args, "maxGramSize", NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE);
    if (!args.isEmpty()) {
      throw new IllegalArgumentException("Unknown parameters: " + args);
    }
  }
}
{code}

[~thetaphi]: if you have anything in mind about the interface design, please share your thoughts.

> Improve analyzer factories' Javadoc.
> -------------------------------------
>
>                 Key: LUCENE-8873
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8873
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Tomoko Uchida
>            Priority: Minor
>
> Currently, the documentation for analyzer factories (subclasses of
> {{TokenizerFactory}}, {{CharFilterFactory}}, {{TokenFilterFactory}}) still
> includes lots of Solr schema.xml examples and not all properties are
> documented. From my perspective, the latter is more problematic because
> users who want to use the factories have to refer to source code to know what
> properties are defined.
> To improve documentation, XML examples should be removed for cleanup, and
> instead, *all properties which can be passed to factory constructors should
> be properly documented*.
> Documentation is often overlooked so some validation rules and
> standardization effort would be desired (e.g. marking properties by
> annotations).
>
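
For reference, the idea of marking properties with annotations and letting the superclass constructor populate them could be sketched roughly as follows. This is only a minimal, self-contained illustration and not Lucene code: {{FactoryProperty}}, {{AnnotatedFactory}} and {{DemoNGramFactory}} are hypothetical names, the default values "1" and "2" are illustrative, and only {{int}} fields are handled. One design point it surfaces: the annotated fields cannot be {{final}} here, because the compiler requires {{final}} fields to be assigned in the subclass constructor, whereas in this sketch they are set reflectively from the superclass constructor.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class AnnotatedPropertiesSketch {

  /** Hypothetical annotation (not part of Lucene) describing one factory property. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface FactoryProperty {
    String name();
    boolean required() default false;
    String defaultValue() default "";
  }

  /** Hypothetical stand-in for TokenizerFactory: fills annotated int fields from args. */
  abstract static class AnnotatedFactory {
    protected AnnotatedFactory(Map<String, String> args) {
      for (Field field : getClass().getDeclaredFields()) {
        FactoryProperty prop = field.getAnnotation(FactoryProperty.class);
        if (prop == null) {
          continue; // not a declared property
        }
        // Consume the argument so leftovers can be reported as unknown.
        String raw = args.remove(prop.name());
        if (raw == null) {
          if (prop.required()) {
            throw new IllegalArgumentException("Missing required property: " + prop.name());
          }
          raw = prop.defaultValue();
        }
        try {
          field.setAccessible(true);
          field.setInt(this, Integer.parseInt(raw)); // sketch: int properties only
        } catch (IllegalAccessException e) {
          throw new IllegalStateException(e);
        }
      }
      if (!args.isEmpty()) {
        throw new IllegalArgumentException("Unknown parameters: " + args);
      }
    }
  }

  /** Hypothetical factory; fields are non-final and have no initializers on purpose. */
  static class DemoNGramFactory extends AnnotatedFactory {
    @FactoryProperty(name = "minGramSize", defaultValue = "1") // illustrative default
    int minGramSize;

    @FactoryProperty(name = "maxGramSize", defaultValue = "2") // illustrative default
    int maxGramSize;

    DemoNGramFactory(Map<String, String> args) {
      super(args);
    }
  }

  public static void main(String[] unused) {
    Map<String, String> args = new HashMap<>();
    args.put("maxGramSize", "3");
    DemoNGramFactory factory = new DemoNGramFactory(args);
    System.out.println(factory.minGramSize + " " + factory.maxGramSize);
  }
}
```

Because the annotation is retained at runtime, the same metadata could in principle also drive generated documentation or validation, which is the standardization goal described in the issue.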