It would be for typed English.
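
A minimal sketch of the dictionary approach suggested below, for typed English. Everything in it is an assumption for illustration: the class name SyllableLookup, the tab-separated "word<TAB>syl-la-bles" entry format (a real lexical database such as CELEX2 has its own layout), the sample entries, and the whitespace split that stands in for a real tokenizer.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper: looks up syllable counts in a pre-syllabified word list.
class SyllableLookup {
    // Maps a lowercased word to its syllables.
    private final Map<String, String[]> entries = new HashMap<>();

    // Assumed entry format: "word<TAB>syl-la-bles" (not the format of any real database).
    public void addEntry(String line) {
        String[] fields = line.split("\t");
        if (fields.length != 2) {
            throw new IllegalArgumentException("Expected word<TAB>syllables: " + line);
        }
        entries.put(fields[0].toLowerCase(Locale.ROOT), fields[1].split("-"));
    }

    // Returns the syllable count, or empty if the word is not in the dictionary.
    public OptionalInt syllableCount(String token) {
        // Lowercase and drop punctuation so "Syllables?" matches "syllables".
        String key = token.toLowerCase(Locale.ROOT).replaceAll("[^a-z']", "");
        String[] syllables = entries.get(key);
        return syllables == null ? OptionalInt.empty() : OptionalInt.of(syllables.length);
    }

    public static void main(String[] args) {
        SyllableLookup lookup = new SyllableLookup();
        // Sample entries, hand-written for this sketch.
        lookup.addEntry("detect\tde-tect");
        lookup.addEntry("syllables\tsyl-la-bles");

        // A whitespace split stands in for a proper tokenizer here.
        for (String token : "Can it detect syllables?".split("\\s+")) {
            OptionalInt count = lookup.syllableCount(token);
            System.out.println(token + " -> "
                    + (count.isPresent() ? String.valueOf(count.getAsInt()) : "not in dictionary"));
        }
    }
}
```

Words missing from the dictionary come back empty rather than guessed, so unknown tokens can be handled separately.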

On Tue, Jul 10, 2012 at 11:25 PM, Lance Norskog <[email protected]> wrote:

> Is this in the general case or for specific speech? For example, it
> should be possible to create an HMM that breaks medical jargon, based
> on work in splitting Simplified Chinese language text. The average
> Simplified Chinese "word" is 1.5 ideograms, and you need a
> well-trained HMM (or similar) to split Simplified Chinese well. The
> language is very context-specific with both prefixes and suffixes that
> alter the meaning of "interior" words.
>
> On Mon, Jul 9, 2012 at 4:39 PM, John Stewart <[email protected]> wrote:
> > That's right; it is better to use a lexical database.  CELEX2, available fairly
> > inexpensively from the Linguistic Data Consortium, has syllable
> > boundaries in its phonological representations.
> >
> > http://www.ldc.upenn.edu/Catalog/readme_files/celex.readme.html#overview
> >
> > jds
> >
> > On Mon, Jul 9, 2012 at 6:37 PM, James Kosin <[email protected]> wrote:
> >> Adam,
> >>
> >> Sorry, OpenNLP doesn't detect syllables.  What you probably need is a
> >> dictionary that includes syllabified pronunciations.
> >> A model could perhaps be trained to do it, but it would be very language
> >> specific and not very useful.  The dictionary approach would be best, though
> >> OpenNLP could help tokenize the text into words for you to look up in the
> >> dictionary.
> >>
> >> James
> >>
> >> On 7/9/2012 5:26 PM, Adam Goodkind wrote:
> >>> Hi all,
> >>>
> >>> Does OpenNLP have the ability to detect syllables? If not, could you
> >>> point me to a Java toolkit that can do this?
> >>>
> >>> Thanks,
> >>> Adam
> >>>
> >>
> >>
>
>
>
> --
> Lance Norskog
> [email protected]
>



-- 
*Adam Goodkind*
*w*  adamgoodkind.com <http://www.adamgoodkind.com>
*t*   @adamgreatkind <https://twitter.com/#%21/adamgreatkind>
