How about using espeak to get the phonemes for a tag?

The result of calling espeak on virtualisation (the right way to spell
it) and virtualization (nasty, left-pondian way of spelling it ;-) is
surely the same.

I once did this in a commercial database to provide a kind of fuzzy
search.  Works well.

I did it by adding a column to each table on which name searches were
done, and populating that column with the phonemes returned by espeak,
after removing 'noise' words like 'the' etc.
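
Roughly what I mean, as a minimal Python sketch rather than the code
from that database.  It assumes the espeak command-line tool is
installed and uses its -q (no audio) and -x (print phoneme mnemonics)
options.  The noise-word list and the function names are made up for
illustration.

```python
import subprocess

# Illustrative noise words only; the real list would depend on the data.
NOISE_WORDS = {"the", "a", "an", "of", "and"}


def espeak_phonemes(text):
    """Return espeak's phoneme mnemonics for text (needs the espeak CLI)."""
    result = subprocess.run(
        ["espeak", "-q", "-x", text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def strip_noise(text):
    """Lower-case the text and drop noise words."""
    words = [w for w in text.lower().split() if w not in NOISE_WORDS]
    return " ".join(words)


def phoneme_key(text, phonemize=espeak_phonemes):
    """Value to store in the extra column: phonemes of the de-noised text."""
    return phonemize(strip_noise(text))
```

The phonemize argument is only there so the function can be tried
without espeak installed, by passing in a stand-in.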

So, if there is a one-to-many relationship between shows and tags, IOW
show number XXX has one row in the shows table but as many rows in the
tags table as it has tags, the tags table would have an extra column:
one column for the tag and one for its phonemes.  Then search on the
phonemes, not the literal tag.
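
A small sketch of that layout using Python's sqlite3 module.  The
table and column names are my guesses, not the real HPR schema, and
toy_phonemes is a crude stand-in for espeak (it only treats 's' and
'z' alike) so that the example runs anywhere.  The show numbers are
the ones from the tag list quoted below.

```python
import sqlite3


def toy_phonemes(tag):
    # Stand-in for espeak, NOT real phonemes: just folds 'z' into 's'.
    return tag.lower().replace("z", "s")


def build_db():
    """Create an in-memory database with a hypothetical shows/tags schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE tags ("
        " show_id INTEGER REFERENCES shows(id),"
        " tag TEXT NOT NULL,"
        " phoneme TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX tags_phoneme ON tags (phoneme)")
    return conn


def add_tag(conn, show_id, tag, phonemize=toy_phonemes):
    """Store the literal tag alongside its phoneme key."""
    conn.execute("INSERT OR IGNORE INTO shows (id) VALUES (?)", (show_id,))
    conn.execute(
        "INSERT INTO tags (show_id, tag, phoneme) VALUES (?, ?, ?)",
        (show_id, tag, phonemize(tag)),
    )


def shows_for(conn, query, phonemize=toy_phonemes):
    """Search on the phoneme column, not the literal tag."""
    rows = conn.execute(
        "SELECT DISTINCT show_id FROM tags WHERE phoneme = ? ORDER BY show_id",
        (phonemize(query),),
    )
    return [row[0] for row in rows]


conn = build_db()
for show_id, tag in [(12, "virtualisation"),
                     (16, "virtualization"),
                     (48, "virtualization")]:
    add_tag(conn, show_id, tag)

print(shows_for(conn, "virtualisation"))  # [12, 16, 48]
```

The literal tag stays as written, so nobody's spelling gets changed;
only the search goes through the phoneme column.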




On 18/08/2018 19:02, Brenda J. Butler wrote:
> 
> Things are quiet here, ... too quiet ... :-)
> 
> 
> 
> I'm looking at writing tags-and-summary for show 0031 on
> "Intel Virtualization Technology"
> 
> I see (thanks to the shiny new "show me all the tags" feature)
> there are two existing tags for that:
> 
>  virtualisation: 12
>  virtualization: 16, 48
> 
> 
> Is there a preference for one over the other (risking a flame war, I
> know) and should we collapse the existing two tags into one?
> 
> For now, I guess I will go with "z" as the show I'm working on spells
> it with "z".
> 
> But I really have no preference.
> 
> Are we going to try to consolidate the tags, or just leave them as
> they are?
> 
> consolidating:  pros
> 
>     makes it easier and more certain to search by tag
> 
> consolidating:  cons
> 
>     could weaken the community by causing division
> 
> 
> My own preference:
> 
> I'd really like to use one set of tags for any given concept, and I
> don't have any preference as to which one.
> 
> But if some people strongly prefer one or the other (the "z" group and
> the "s" group), we could really annoy some when we choose one side.
> 
> I suppose we could set the tag to "virtuali.ation" to avoid that
> ... it might work for longer names.  This sort of side-step is less
> effective for shorter words.
> 
> But what about color/colour?  It is short, but also has different
> numbers of letters in the two incarnations.
> 
> 
> I suppose at this stage we just want tags, any tags, for the shows
> that have none.  But this topic bears thinking about as the body of
> published work grows.
> 
> 
> bjb
> 
> _______________________________________________
> Hpr mailing list
> Hpr@hackerpublicradio.org
> http://hackerpublicradio.org/mailman/listinfo/hpr_hackerpublicradio.org
> 


-- 
Michael A. Ray
Analyst/Programmer
Witley, Surrey, South-east UK

"Perfection is achieved, not when there is nothing more to add, but when
there is nothing left to take away." -- A. de Saint-Exupery


https://cromarty.github.io/
http://eyesfreelinux.ninja/
http://www.raspberryvi.org/


