Re: [slim] Double Metaphone based library search

2020-07-04 Thread philchillbill


mherger wrote: 
> I could imagine that FTS and "fuzzy" search could result in a lot of
> false positives.

Yes, it very likely would. I was envisaging using this as a fallback
mechanism: first do a 'normal' text search and, if that returns nothing,
do a DM search to get back something that I can post-process with the
Fuse.js library to make sense of the likely false positives. I already
use Fuse.js, but I need to feed it more data :)
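
A rough sketch of that fallback order, in Python for brevity. Everything
here is an assumption made for illustration: the function names, the
made-up titles, the lookup table standing in for a real Double Metaphone
call, and difflib standing in for the ranking that Fuse.js would do.

```python
from difflib import SequenceMatcher

# Codes quoted in this thread. A real setup would compute them with a
# Double Metaphone library; this lookup table is only a placeholder.
DEMO_CODES = {"flour": "FLR", "flower": "FLR", "not": "NT", "knot": "NT"}


def word_codes(text):
    """Set of per-word codes; unknown words fall back to the word itself."""
    return {DEMO_CODES.get(word, word) for word in text.lower().split()}


def search(titles, query):
    """Plain substring search first, phonetic fallback only if it finds nothing."""
    needle = query.lower()
    hits = [t for t in titles if needle in t.lower()]
    if hits:
        return hits
    # Fallback: keep titles whose word codes cover every code in the query.
    wanted = word_codes(query)
    candidates = [t for t in titles if wanted <= word_codes(t)]
    # Rank by textual similarity so the likeliest candidate comes first.
    return sorted(
        candidates,
        key=lambda t: SequenceMatcher(None, needle, t.lower()).ratio(),
        reverse=True,
    )


titles = ["Flower Power", "Wild Flower", "Sunshine"]  # made-up titles
print(search(titles, "sunshine"))  # plain search is enough here
print(search(titles, "flour"))     # falls back to the phonetic codes
```

The phonetic pass only runs when the plain search comes back empty, so
exact matches never pick up extra false positives.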



philchillbill's Profile: http://forums.slimdevices.com/member.php?userid=68920
View this thread: http://forums.slimdevices.com/showthread.php?t=112531

___
discuss mailing list
discuss@lists.slimdevices.com
http://lists.slimdevices.com/mailman/listinfo/discuss


Re: [slim] Double Metaphone based library search

2020-07-04 Thread Michael Herger

> The "Full Text search" plugin (from Logitech) would seem like a good
> place to put this although I suspect mherger might suggest doing a whole
> new plugin for it.


Not sure yet, as I haven't investigated. More plugins with
inter-dependencies could become a bit complicated to manage.


--

Michael


Re: [slim] Double Metaphone based library search

2020-07-04 Thread Michael Herger

> Just curious if anybody ever experimented with this kind of thing
> before? Would the increased DB size and time-to-rescan be an issue at
> all?


"Fuzzy search" has been on my list for years... but I haven't done any 
experimenting yet, tbh.


The very first sentence of the Wikipedia article about Metaphone was
somewhat of a bummer:


"Metaphone is a phonetic algorithm, published by Lawrence Philips in 
1990, for indexing words by their English pronunciation."


But it seems DM improves support for other languages.

As for the integration... I was wondering whether this should be a 
different column/index, or whether this could be a feature you enable, 
and thereafter all search indexes would be using DM. There is a 
CUSTOMSEARCH column in most tables which could be used by a plugin - and 
which is put in the fulltext index, too. That said, I could imagine that 
FTS and "fuzzy" search could result in a lot of false positives. But 
anyway, it's probably something to explore.
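
To make the "different column/index" option a bit more concrete, here is
a minimal sketch using an in-memory SQLite database from Python. The
table, column and index names are invented for illustration and are not
the LMS schema; the codes are the ones quoted in this thread rather than
computed values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one text column plus one column for its phonetic code.
conn.execute("CREATE TABLE demo_tracks (title TEXT, title_dm TEXT)")
conn.execute("CREATE INDEX demo_tracks_dm ON demo_tracks (title_dm)")
conn.executemany(
    "INSERT INTO demo_tracks VALUES (?, ?)",
    [("Flower", "FLR"), ("Knot", "NT"), ("Eight", "AT")],
)


def find_by_code(code):
    """Return titles whose stored phonetic code equals `code`."""
    rows = conn.execute(
        "SELECT title FROM demo_tracks WHERE title_dm = ? ORDER BY title",
        (code,),
    )
    return [title for (title,) in rows]


print(find_by_code("FLR"))  # the code a query for "flour" would map to
```

Keeping the codes in their own indexed column leaves the regular text
search untouched, at the cost of one extra value per row.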

--

Michael


Re: [slim] Double Metaphone based library search

2020-07-04 Thread Paul Webster


I remember using Soundex back in the early 90s. Having just read a
bit about Metaphone/Double Metaphone (and even Metaphone 3), this does
look like a clever upgrade from it.

The "Full Text search" plugin (from Logitech) would seem like a good
place to put this although I suspect mherger might suggest doing a whole
new plugin for it.



Paul Webster
http://dabdig.blogspot.com
author of "now playing" plugins covering radio france (fip etc), kcrw,
supla finland, abc australia, cbc/radio-canada and rte ireland

Paul Webster's Profile: http://forums.slimdevices.com/member.php?userid=105


[slim] Double Metaphone based library search

2020-07-04 Thread philchillbill


One of the issues I face in developing the Alexa skills for LMS is that
I have to search your library -textually- based on the speech-to-text of
what Alexa thinks she -heard-. This can be problematic with
similar-sounding words like pair/pear, hear/here, knot/not, ate/eight,
flour/flower, etc. I can only fuzzy-match or post-process the returned
values if the item being sought is among the search results. Of course a
search for a song with -flower- in the title will not return a song with
-flour-, end of story. Asking Alexa again will not help because she
will hear the same thing every time and return the same search
text.

A potential workaround for this would be to have a plugin that extends
the scanning process to add columns to the database that contain
computed *Double Metaphone* (DM) values for tags. There is a CPAN
library for DM; all it does is map a word to a short code representing
how the word sounds according to predefined rules. Some
examples:

Flour, flower -> FLR
Not, Knot -> NT
Hear, Here -> HR
Ate, Eight -> AT

So the song title 'True Colors' would become TRKL, but so would Tru
Colors, True Colours and even Trew Cullers. This means that I can match
based on the *sound* of the title rather than its spelling and therefore
more easily find stuff in your library. As an added benefit, this
would work in the GUI too and help you find stuff with
misspelled tags.
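
To illustrate the grouping idea, here is a very rough sketch in Python.
Note that crude_key is NOT Double Metaphone: it is a hypothetical,
drastically simplified stand-in that strips vowels and a few silent
letters, so its codes differ from the DM values above (it does not
truncate, for instance). The function names are made up for this
example.

```python
import re
from collections import defaultdict


def crude_key(title):
    """Very rough phonetic key; NOT Double Metaphone, illustration only."""
    parts = []
    for word in re.findall(r"[a-z]+", title.lower()):
        if word.startswith("kn"):      # silent leading k, as in "knot"
            word = word[1:]
        word = word.replace("gh", "")  # treat "gh" as silent, as in "eight"
        word = word.replace("ph", "f").replace("c", "k")
        word = re.sub(r"(.)\1+", r"\1", word)          # collapse doubled letters
        parts.append(re.sub(r"[aeiouwhy]", "", word))  # drop vowels, w, h, y
    return "".join(parts).upper()


def build_index(titles):
    """Group titles by key so that similar-sounding spellings share an entry."""
    index = defaultdict(list)
    for title in titles:
        index[crude_key(title)].append(title)
    return index


index = build_index(["True Colors", "Tru Colors", "True Colours", "Trew Cullers"])
print(dict(index))  # all four spellings end up under a single key
```

A lookup then computes the key of the search text and reads the matching
entry, regardless of how the tag was spelled.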

Just curious if anybody ever experimented with this kind of thing
before? Would the increased DB size and time-to-rescan be an issue at
all?


