Hi, Paolo's proposal would be quite complicated to implement. I was thinking of something really simple.
See the attachment. If a sentence has more than 3 matches, I mark all the matches of that sentence, and a remark is then printed in the output. This works for me, although it could be made more sophisticated in various ways. If you consider it useful, it could be made a command-line option, with the threshold passed as a parameter.

Regards,
Jaume Ortolà

2013/1/14 R.J. Baars <[email protected]>
> Interesting suggestion.
>
> Feels good, but it will rely on language detection quite a bit. Falling
> back to a specified language is a good and safe option.
>
> +1!
>
> > What about applying language detection at a paragraph level and handling
> > the loading/unloading of rules accordingly?
> >
> > We could, then, fire up an options setting for treating foreign language
> > paragraphs. Something like:
> >
> > foreign language paragraphs:
> > * ignore (faster)
> > * apply home language correction (fast)
> > * apply auto detected language
> > * apply this language (follows a selectable list of supported languages)
> >
> > So, if I'm writing a doc in Italian with English paragraphs in it, I could
> > set Italian as the primary language and indicate that the undetected
> > paragraphs fall back to English. If I know that I will be using lots of
> > quotations from other languages, I can leave the ignore option on and not
> > check them at all.
> >
> > Ciao
> >
> > Paolo
> >
> > On Jan 14, 2013, at 10:05 AM, Jaume Ortolà i Font wrote:
> >
> >> Hi,
> >>
> >> When analyzing a long text or a corpus with LanguageTool on the command
> >> line, it would be useful to discard sentences in a language other than
> >> the expected one (quotations, dialogs, bibliography...). That way we
> >> could remove a lot of annoying alarms. Some kind of threshold should be
> >> set (e.g. 3 or 4 spelling mistakes, or 5 or 6 total mistakes per
> >> sentence), and the sentences that exceed the threshold should be marked
> >> somehow as discarded and printed separately.
> >>
> >> This could be an option on the command line similar to this one:
> >>
> >> -u, --list-unknown also print a summary of words from the input that
> >> LanguageTool doesn't know.
> >>
> >> What do you think about implementing this option?
> >>
> >> Regards,
> >> Jaume Ortolà
patch_discardOutlyingSentences.diff
Description: Binary data
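
The patch itself is a binary attachment and is not shown here. As a rough illustration of the idea described in the message (not the actual patch, and not LanguageTool's API), the per-sentence threshold could be sketched as follows. The `Match` class, the function names, and the wording of the remark are all hypothetical; the default threshold of 3 is the value mentioned in the message.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical stand-in for one rule match; not a LanguageTool class."""
    sentence_index: int       # which sentence the match belongs to
    message: str              # rule message shown to the user
    discarded: bool = False   # set when the whole sentence is an outlier


def mark_outlying_sentences(matches, threshold=3):
    """Mark every match in a sentence that has more than `threshold` matches.

    Returns the sorted indices of the sentences that were marked.
    """
    by_sentence = defaultdict(list)
    for match in matches:
        by_sentence[match.sentence_index].append(match)

    outlying = []
    for index, group in by_sentence.items():
        if len(group) > threshold:
            for match in group:
                match.discarded = True
            outlying.append(index)
    return sorted(outlying)


def report(matches, threshold=3):
    """Return output lines: kept matches first, then one remark per
    discarded sentence."""
    outlying = mark_outlying_sentences(matches, threshold)
    lines = [f"sentence {m.sentence_index}: {m.message}"
             for m in matches if not m.discarded]
    for index in outlying:
        lines.append(
            f"sentence {index}: discarded (more than {threshold} matches)")
    return lines


if __name__ == "__main__":
    # One ordinary sentence and one that looks like foreign-language text.
    sample = [Match(0, "Possible spelling mistake")]
    sample += [Match(1, "Possible spelling mistake") for _ in range(4)]
    for line in report(sample):
        print(line)
```

The threshold is kept as a plain parameter so that it could be wired to a command-line option, as suggested in the message.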
_______________________________________________ Languagetool-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/languagetool-devel
