Sounds reasonable, Andriy.

Treating a certain set of hyphenated compounds as correct and others as incorrect would add maintenance work. Besides, whether hyphenation is acceptable is subjective. The evolution of English in this matter goes something like this:

* When two words are often used together in the same order, after some time some people start to hyphenate them.
* This may or may not catch on. If it does, after some time some people start to drop the hyphen.
* This may or may not catch on. If it does, we'll start to see it as a new word in the dictionary.

E.g., grand mother -> grand-mother -> grandmother. There's a general tendency for this evolution.

kb


Andriy Rysin wrote thus at 04:25 AM 19-11-13:

I am actually taking this approach for Ukrainian (in
MorfologikUkrainianSpellerRule.java): if a word is not in the dictionary
and contains a hyphen, I check all of its parts, and if all of them are
correct I consider the compound correct. As with the arguments for English
above, it may miss some words that wrongly use a hyphen, but the benefits
definitely outweigh the drawbacks.
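
A minimal sketch of that check, assuming a plain set of known words in place of the real speller dictionary. The class and method names here are hypothetical and are not taken from MorfologikUkrainianSpellerRule.java:

```java
import java.util.Set;

// Hypothetical illustration of the "check every hyphen-separated part" idea.
class HyphenCompoundCheck {

    // Assumption: a simple word set stands in for the real dictionary lookup.
    private final Set<String> dictionary;

    HyphenCompoundCheck(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    // Returns true if the word is in the dictionary, or if it contains a
    // hyphen and every hyphen-separated part is in the dictionary.
    boolean isAccepted(String word) {
        if (dictionary.contains(word)) {
            return true;
        }
        if (!word.contains("-")) {
            return false;
        }
        // Limit -1 keeps trailing empty parts, so "web-" is rejected.
        for (String part : word.split("-", -1)) {
            if (part.isEmpty() || !dictionary.contains(part)) {
                return false;
            }
        }
        return true;
    }
}
```

With a dictionary containing "web", "based", "powerful" and "computer", this accepts "web-based" and rejects "web-basd", but it also accepts "powerful-computer", which is exactly the kind of missed error discussed in this thread.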

I am all for having such logic in common code instead.

Andriy

2013/11/18 Marcin Miłkowski <list-addr...@wp.pl>:
> On 2013-11-18 15:36, Daniel Naber wrote:
>> On 2013-11-18 15:10, Mike Unwalla wrote:
>>
>>> I agree with Paolo. If you permit all hyphenated compounds, then many
>>> errors
>>> will not be found. How will you find errors such as these?
>>>      I want a powerful-computer.
>>>      The large-book was expensive.
>>>      Make sure that step-three is correct.
>>
>> But does anybody actually make these errors? The way LanguageTool works
>> there will always be errors which it won't find anyway.

> Well, some style guides have clear rules on when to use hyphens, so
> according to them, some hyphens are mistakes. On the other hand, you
> can create a neologism containing a lot of terms that are
> words-containing-hyphens, and they would be correct. (Also, the
> style-guide rules [now using Chicago-manual-of-style hyphenation rule]
> could mark up such errors by themselves.)

> Actually, we could try to mark the words as fine in the disambiguator
> (not necessarily by immunizing them - we could have something like
> "spelling immunization", or maybe we already have this feature? I don't
> remember right now), because they have a clear structure: "web-based"
> has NN+_+JJ structure, for example.

> Regards,
> Marcin


> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
