Hi Sergey,
Here is the table of tags from http://www.nltk.org/book/ch05.html

Tag   Meaning              English Examples
ADJ   adjective            new, good, high, special, big, local
ADP   adposition           on, of, at, with, by, into, under
ADV   adverb               really, already, still, early, now
CONJ  conjunction          and, or, but, if, while, although
DET   determiner, article  the, a, some, most, every, no, which
NOUN  noun                 year, home, costs, time, Africa
NUM   numeral              twenty-four, fourth, 1991, 14:24
PRT   particle             at, on, out, over per, that, up, with
PRON  pronoun              he, their, her, its, my, I, us
VERB  verb                 is, say, told, given, playing, would
.     punctuation marks    . , ; !
X     other                ersatz, esprit, dunno, gr8, univeristy

So DET and CONJ are effectively stop-words in most of the cases Lucene is used for. For example, there is absolutely no need to search for PRON:he, since it will return 100% of documents for a fiction books site.

However, if you still need to index tokens such as “Brand:Microsoft”, “Sentiment:Positive”, “DET:123” and so on, you can do it in Lucene by defining fields: Brand, Sentiment, DET, PRON, VERB, and so on.

I hope I helped a little :)

Thanks,
Fuad Efendi
Search Relevancy Tuning
http://www.tokenizer.ca

On October 31, 2016 at 7:53:28 AM, Sergey Repnikov (repni...@megaputer.ru) wrote:

Hello.

My name is Sergeiy, and I'm working on extending Lucene's functionality. According to the Javadoc for the "org.apache.lucene.analysis" package, it is preferable to ask on this mailing list before extending anything, because some features may already be implemented.

I want to be able to search by part of speech and within a sentence. Is there any way to get this functionality out of the box? If so, how? If not, do I understand correctly that custom attributes are not saved to the index when a TokenStream is written to a Directory? And is the only way to save metadata associated with a term to use a payload, and then ask for it at search time?
From what I have found on Google, a payload is not stored alongside the term itself; it is associated with the term by position. I haven't yet understood how the index stores tokens and their associated metadata, so maybe that detail is crucial in some cases, or maybe it's not. Maybe there is a way to extend the index/IndexWriter to save and then retrieve custom attributes.

So can you tell me, based on your experience, what is the best way to do what I want?

Thank you.
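
As a rough illustration of the field-per-tag suggestion in the reply above, here is a toy model in plain Java that keeps a separate term-to-documents map for each field. It does not use Lucene at all: the class TagFieldIndex, its methods, and the sample documents are hypothetical, and are only meant to show why a query such as VERB:playing can match a different set of documents than NOUN:playing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: this is NOT Lucene's API. It mimics the idea that each
// field (Brand, Sentiment, VERB, ...) has its own term -> documents map.
final class TagFieldIndex {
    // field name -> (lower-cased term -> sorted set of document ids)
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

    // Record that document docId contains `term` in field `field`.
    public void add(int docId, String field, String term) {
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term.toLowerCase(Locale.ROOT), t -> new TreeSet<>())
                .add(docId);
    }

    // Return the ids of documents that contain `term` in `field`
    // (an empty set if the field or the term is unknown).
    public Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field, Collections.emptyMap())
                       .getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        TagFieldIndex index = new TagFieldIndex();

        // Hypothetical, already-tagged documents.
        index.add(0, "Brand", "Microsoft");
        index.add(0, "Sentiment", "Positive");
        index.add(0, "VERB", "playing");

        index.add(1, "Brand", "Microsoft");
        index.add(1, "Sentiment", "Negative");
        index.add(1, "NOUN", "playing");

        System.out.println(index.search("Brand", "Microsoft"));    // [0, 1]
        System.out.println(index.search("VERB", "playing"));       // [0]
        System.out.println(index.search("Sentiment", "Negative")); // [1]
    }
}
```

In Lucene itself the same effect would presumably come from adding each tagged token to a field named after its tag and querying that field, at the cost of one extra field per tag.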