Re: How to create concatenated token
Hi Erick,

In the issue you forwarded to me, they want to make one token out of all the tokens received from the token stream. In my case I want to keep the tokens as they are and create one extra token that is the concatenation of all of them.

> which, I'd guess, is the case here. I mean do you really want to concatenate 50 tokens?

We are applying it to the *title field* of a product, so the maximum length would be about 10 tokens, I guess, and even that only in rare cases.

With Regards
Aman Tandon

On Wed, Jun 17, 2015 at 7:16 PM, Erick Erickson erickerick...@gmail.com wrote:

> If you used the JIRA I linked, vote for it, add any improvements etc. Anyone can attach a patch to a JIRA, you just have to create a login.
>
> That said, this may be too rare a use-case to deal with. I just thought of shingling, which I should have suggested before. It will work for concatenating small numbers of tokens which, I'd guess, is the case here. I mean do you really want to concatenate 50 tokens?
>
> Best,
> Erick
Re: How to create concatenated token
Dear Erick,

e.g. Solr training

*Porter:-*        solr  train
Position          1     2

*Concatenated:-*  solr  train  solrtrain
Position          1     2

I have implemented the filter as per my requirement. Thank you so much for your help and guidance. How could I contribute it to Solr?

With Regards
Aman Tandon

On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon amantandon...@gmail.com wrote:

> Hi Erick,
>
> Thank you so much, it will be helpful for me to learn how to save the state of a token. I had no idea how to save the state of previous tokens, which made it difficult to generate a concatenated token at the end. Is there anything I should read to learn more about it?
Re: How to create concatenated token
If you used the JIRA I linked, vote for it, add any improvements etc. Anyone can attach a patch to a JIRA, you just have to create a login.

That said, this may be too rare a use-case to deal with. I just thought of shingling, which I should have suggested before. It will work for concatenating small numbers of tokens which, I'd guess, is the case here. I mean do you really want to concatenate 50 tokens?

Best,
Erick

On Wed, Jun 17, 2015 at 12:07 AM, Aman Tandon amantandon...@gmail.com wrote:

> Dear Erick,
>
> e.g. Solr training
>
> *Porter:-*        solr  train
> Position          1     2
>
> *Concatenated:-*  solr  train  solrtrain
> Position          1     2
>
> I have implemented the filter as per my requirement. Thank you so much for your help and guidance. How could I contribute it to Solr?
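To make the shingling suggestion concrete, here is a minimal sketch in plain Python. It does not use Solr or Lucene; it only imitates the idea of a shingle filter, and the parameter names (`min_size`, `max_size`, `separator`, `output_unigrams`) are illustrative, loosely modelled on the options of Solr's `ShingleFilterFactory`. It assumes each shingle is stacked on the position of its first token and that an empty separator is used, so adjacent tokens are glued together.

```python
def shingles(terms, min_size=2, max_size=2, separator="", output_unigrams=True):
    """Return (term, position) pairs: the original terms plus shingles of
    min_size..max_size adjacent terms. Positions are 1-based, and each
    shingle is placed at the position of its first term (an assumption)."""
    out = []
    for i, term in enumerate(terms):
        if output_unigrams:
            out.append((term, i + 1))
        for size in range(min_size, max_size + 1):
            if i + size <= len(terms):  # enough terms left for this shingle
                out.append((separator.join(terms[i:i + size]), i + 1))
    return out


# The stemmed title from the thread: "solr train"
two_terms = shingles(["solr", "train"])
```

For a two-token title this yields `solrtrain` next to the original tokens. For longer titles a shingle filter produces every window of adjacent tokens up to the maximum size, not just the single concatenation of all tokens, which is why it only stands in for the custom filter when titles are short.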
Re: How to create concatenated token
Can I ask you why you need to concatenate the tokens? Maybe we can find a better solution than concatenating all the tokens into one single big token. I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :)

It would be great if you could explain a little better what you would like to do!

Cheers

2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com:

> Hi,
>
> I have a requirement to create a concatenated token from all the tokens produced by the last item of my analyzer chain.
>
> *Suppose my analyzer chain is:*
>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory" catenateAll="1" splitOnNumerics="1" preserveOriginal="1"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
>     <filter class="solr.PorterStemmerFilterFactory"/>
>
> I want to create a concatenated-token plugin that adds a concatenated token along with the last token.
>
> e.g. Solr training
>
> *Porter:-*        solr  train
> Position          1     2
>
> *Concatenated:-*  solr  train  solrtrain
> Position          1     2
>
> Please help me out. How do I create a custom filter for this requirement?
>
> With Regards
> Aman Tandon

--
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience - 1794 England
Re: How to create concatenated token
> e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training)

That was a typo. It should read:

e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training)

With Regards
Aman Tandon

On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote:

> We have some business logic that searches for the user query within the user intent, or finds the exactly matching products.
>
> e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training)
>
> As we can see, it is a phrase query, so it will take more time than a query on a single stemmed token. There are also phrase queries of 5-7 words. So we want to reduce the search time by implementing this feature.
Re: How to create concatenated token
We have some business logic that searches for the user query within the user intent, or finds the exactly matching products.

e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training)

As we can see, it is a phrase query, so it will take more time than a query on a single stemmed token. There are also phrase queries of 5-7 words. So we want to reduce the search time by implementing this feature.

With Regards
Aman Tandon

On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:

> Can I ask you why you need to concatenate the tokens? Maybe we can find a better solution than concatenating all the tokens into one single big token. I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :)
>
> It would be great if you could explain a little better what you would like to do!
>
> Cheers
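The reasoning in this message, that a phrase query does more work than a query on one concatenated token, can be illustrated with a toy positional index in plain Python. This is only a sketch of the shape of the work, not how Lucene implements either query. The helper names (`build_index`, `term_match`, `phrase_match`) are hypothetical, the document ids are borrowed from the example above, and the concatenated token is assumed to sit at the last token's position.

```python
from collections import defaultdict


def build_index(docs):
    """docs: {doc_id: [(term, position), ...]} -> {term: {doc_id: [positions]}}"""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for term, position in tokens:
            index[term].setdefault(doc_id, []).append(position)
    return index


def term_match(index, term):
    """Single-term query: one postings lookup, no position checks."""
    return sorted(index.get(term, {}))


def phrase_match(index, terms):
    """Phrase query: intersect the postings of every term, then verify
    that the terms occur at consecutive positions in each candidate."""
    if not terms:
        return []
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    hits = []
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            if all(start + offset in index[term][doc_id]
                   for offset, term in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Two illustrative titles, each indexed with its extra concatenated token.
docs = {
    234: [("solr", 1), ("train", 2), ("solrtrain", 2)],
    456: [("train", 1), ("solr", 2), ("trainsolr", 2)],
}
index = build_index(docs)
```

Both `phrase_match(index, ["solr", "train"])` and `term_match(index, "solrtrain")` find document 234 only, but the second needs a single lookup. The trade-off is that the query side has to produce exactly the same concatenated term, so the query text must go through the same stemming and joining as the indexed title.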
Re: How to create concatenated token
Hi,

Any suggestions on how I could achieve this behaviour?

With Regards
Aman Tandon

On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote:

> > e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training)
>
> That was a typo. It should read:
>
> e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training)
Re: How to create concatenated token
Hi Erick,

Thank you so much, it will be helpful for me to learn how to save the state of a token. I had no idea how to save the state of previous tokens, which made it difficult to generate a concatenated token at the end. Is there anything I should read to learn more about it?

With Regards
Aman Tandon

On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote:

> I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193
>
> Note that this is not committed and I haven't reviewed it, so I don't have anything to say about that. And you'd have to implement it as a custom Filter.
>
> Best,
> Erick
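The idea of remembering earlier tokens so that a concatenated token can be emitted at the end can be sketched without any Lucene code. The following plain-Python generator is an illustration only: `Token` and `concatenating_filter` are hypothetical names, not part of Solr or of the patch in SOLR-7193. In a real Lucene `TokenFilter` the same buffering would happen inside `incrementToken()`, and `captureState()` / `restoreState()` are the usual tools for saving full token state. The sketch assumes the extra token shares the position of the last token (position increment 0), which is one reading of the example in the thread.

```python
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    term: str
    position_increment: int  # 1 = next position, 0 = same position as before


def concatenating_filter(tokens: Iterable[Token]) -> Iterator[Token]:
    """Pass every token through unchanged, then emit one extra token that
    joins all the terms seen so far."""
    buffered_terms = []
    for token in tokens:
        buffered_terms.append(token.term)  # remember the term for later
        yield token                        # keep the original token as it is
    if len(buffered_terms) > 1:
        # Assumption: stack the extra token on the last token's position.
        # A single-token stream gets no extra token, since it would be a duplicate.
        yield Token("".join(buffered_terms), 0)


def with_positions(tokens: Iterable[Token]):
    """Turn position increments into absolute 1-based positions."""
    position = 0
    result = []
    for token in tokens:
        position += token.position_increment
        result.append((token.term, position))
    return result


stemmed = [Token("solr", 1), Token("train", 1)]
output = with_positions(concatenating_filter(stemmed))
```

Here `output` holds `solr` at position 1, and both `train` and `solrtrain` at position 2. Note that the sketch joins every token it sees, so stacked tokens produced by earlier filters in the chain (for example edge n-grams) would all end up in the concatenated term unless they are filtered out first.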
Re: How to create concatenated token
I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193

Note that this is not committed and I haven't reviewed it, so I don't have anything to say about that. And you'd have to implement it as a custom Filter.

Best,
Erick

On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon amantandon...@gmail.com wrote:

> Hi,
>
> Any suggestions on how I could achieve this behaviour?