Hello,
I've been searching the forum and found several more or less relevant
topics, listed below.
http://www.nabble.com/Parsing-text-containing-forward-slash-and-wildcard-td13541503.html#a13541503
http://www.nabble.com/Search-that-supports-all-valid-characters-in-a-Unix-filename-td11495983.html
http://www.nabble.com/Payloads%2C-Tokenizers%2C-and-Filters.--Oh-My!-td13806662.html
However, they did not help me much, which is why I've decided to start a
separate topic.
Basically, I need to be able to search for the '/' (forward slash)
character just like any alphanumeric character, including in wildcard
queries.
E.g. if I have the following records in my index:
1 | /cat1
2 | /cat1/test
3 | /cat1/tea
4 | /cat2/test
I expect searches to behave as follows:
/cat1* - will return records 1, 2, 3
/cat1/test - will return record 2
/cat1/t* - will return records 2, 3
/cat1/t - will return no records
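
To make the expected behaviour concrete, here is a minimal sketch in plain Java. It does not use Lucene or Compass at all; the class name ExpectedSlashSearch and the method search are hypothetical and exist only for illustration. It assumes that each value is treated as one whole, untokenized string, that a trailing '*' means "prefix match", and that anything else must match exactly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain Java, no Lucene or Compass calls.
final class ExpectedSlashSearch {

    // The four example records, keyed by record number.
    static final Map<Integer, String> RECORDS = new LinkedHashMap<>();
    static {
        RECORDS.put(1, "/cat1");
        RECORDS.put(2, "/cat1/test");
        RECORDS.put(3, "/cat1/tea");
        RECORDS.put(4, "/cat2/test");
    }

    // A trailing '*' means prefix match on the whole value; otherwise exact match.
    static List<Integer> search(String query) {
        List<Integer> hits = new ArrayList<>();
        boolean prefix = query.endsWith("*");
        String needle = prefix ? query.substring(0, query.length() - 1) : query;
        for (Map.Entry<Integer, String> e : RECORDS.entrySet()) {
            String value = e.getValue();
            if (prefix ? value.startsWith(needle) : value.equals(needle)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints the record numbers matched by each of the four example searches.
        for (String q : new String[] {"/cat1*", "/cat1/test", "/cat1/t*", "/cat1/t"}) {
            System.out.println(q + " -> " + search(q));
        }
    }
}
```

The point is that '/' gets no special treatment here: it is just another character of the stored value.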
I use Lucene with Compass. The searched property is indexed as
"untokenized".
I have tried using StandardAnalyzer and KeywordAnalyzer, but got the
following results.
Results from testing org.apache.lucene.analysis.standard.StandardAnalyzer
Search string: rootKey:/cat1/te*
Rewritten search string: rootKey:/cat1/te*
Found results : 0
Search string: rootKey:/cat1/test
Rewritten search string: rootKey:"cat1 test"
Found results : 1
Search string: rootKey:/cat1/te
Rewritten search string: rootKey:"cat1 te"
Found results : 0
Results from testing org.apache.lucene.analysis.KeywordAnalyzer
Search string: rootKey:/cat1/te
Rewritten search string: rootKey:/cat1/te
Found results : 0
Search string: rootKey:/cat1/te*
Rewritten search string: rootKey:/cat1/te*
Found results : 0
So it looks like Lucene internally replaces the '/' with whitespace when
searching. However, I did not find any "replace symbol" code in the source
of StandardAnalyzer, nor did I have any luck finding that place while
debugging the Lucene sources.
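
One possible reading of the rewritten strings above, offered as an assumption rather than a confirmed diagnosis: nothing is "replaced", but a tokenizing analyzer treats '/' as a token separator, a term that yields several tokens is printed as a quoted phrase, and a term containing a wildcard is not sent through the analyzer at all. The plain-Java sketch below only mimics that behaviour. It does not call Lucene, and the class SlashTokenizingSketch with its methods tokenize and rewrite is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: mimics the rewritten strings, no Lucene calls.
final class SlashTokenizingSketch {

    // Split on every run of characters that are not letters or digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Assumption: wildcard terms bypass tokenizing; other terms are tokenized,
    // and more than one token is rendered as a quoted phrase.
    static String rewrite(String field, String term) {
        if (term.indexOf('*') >= 0 || term.indexOf('?') >= 0) {
            return field + ":" + term;
        }
        List<String> tokens = tokenize(term);
        if (tokens.size() == 1) {
            return field + ":" + tokens.get(0);
        }
        return field + ":\"" + String.join(" ", tokens) + "\"";
    }

    public static void main(String[] args) {
        for (String term : new String[] {"/cat1/te*", "/cat1/test", "/cat1/te"}) {
            System.out.println(rewrite("rootKey", term));
        }
    }
}
```

Under these assumptions the sketch produces the same three rewritten strings as the StandardAnalyzer run. That would also explain why no explicit replacement of '/' shows up in the analyzer source: the slash is simply dropped between tokens.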
With all that said, I would appreciate any hint on how this task could
be accomplished.
Regards,
Roman
--
View this message in context:
http://www.nabble.com/Search-with-wild-cards-by-words-with-forward-slash-%28%22-%22%29-tp25578017p25578017.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.