Re: Suggestion for English rule - masters thesis > master's thesis
Ouch! Nick, you put it the right way. That is what I wanted to write, but I messed it up. I'm sorry.

Purodha

On 22.06.2016 23:50, Nick Hough wrote:
> It would be the other way around:
>
> If you wrote it: "my masters thesis", because it is the thesis from
> your "masters" degree
> If your master wrote it: "my master’s thesis", because the thesis
> is owned (hence the possessive apostrophe) by the "master"
>
> Both could be valid English. Which version is correct would depend on
> the context and desired meaning.
>
> Cheers,
>
> Nick
>
>> On 22 Jun 2016, at 8:16 PM, Purodha Blissenbach
>> <puro...@blissenbach.org> wrote:
>> If you wrote it: "my master's thesis"
>> If your master wrote it: "my masters thesis"

--
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
Re: Suggestion for English rule - masters thesis > master's thesis
Imho, it depends.

If you wrote it: "my master's thesis"
If your master wrote it: "my masters thesis"

but I am not a native English speaker :-)

Purodha

On 22.06.2016 18:33, Marco A.G.Pinto wrote:
> Hello!
>
> Using MS Word 2016 I typed "blah blah my masters thesis" and Word
> suggested that I replace "masters" with "master's".
>
> Could this rule be added to LanguageTool?
>
> Thanks!
>
> Kind regards,
> Marco A.G.Pinto
Re: Rule to check common mistakes in URL
On 24.10.2015 11:29, Dominique Pellé wrote:
> Purodha Blissenbach <puro...@blissenbach.org> wrote:
>
>> Hi,
>>
>>> http:/www.google.com (there should be 2 slashes after
>>> protocol)
>>
>> This is valid, at least protocol-wise. It refers to a directory
>> /www.google.com on the current server. Good warning, of course, if
>> there is at least a dot in there.
>>
>> Purodha
>
> I don't think that 1 slash only after http: is correct.
> Or do you have a link which explains what you say?
>
> For documents on local hosts, you need to write something
> like http://localhost/foo.html or http://127.0.0.1/foo.html
>
> For the file: protocol, localhost can be omitted, so
> you can have 3 slashes as in file:///foo.txt

It is unusual in texts, of course, but specifying the scheme (e.g. http) and omitting the host name asks the already known host, via a (possibly different) scheme, for the path, which then begins after the colon. One slash instructs the server to look for the path beginning from its document root. If there is no slash after the colon, the path is resolved relative to the current directory. As said: this is all pretty unusual in normal text.

Purodha
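The point about the single slash can be checked with a generic URI parser. The following is a minimal sketch using Python's standard `urllib.parse`; the helper name `describe` is made up for this illustration. It only shows how the string is split into components, and says nothing about how a browser or server would act on such a reference.

```python
from urllib.parse import urlsplit


def describe(reference):
    """Split a URI reference into scheme, host and path (illustrative helper)."""
    parts = urlsplit(reference)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}


# One slash: a scheme, no host, and an absolute path.
one_slash = describe("http:/www.google.com")

# Two slashes: the text after "//" is parsed as the host.
two_slashes = describe("http://www.google.com")

# No slash: a scheme, no host, and a relative path.
no_slash = describe("http:foo.html")

for name, result in [("one slash", one_slash),
                     ("two slashes", two_slashes),
                     ("no slash", no_slash)]:
    print(name, result)
```

With one slash the parser reports an empty host and the path /www.google.com, which is why a checker can only guess that a host name was intended, for instance because the first path segment contains a dot.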
Re: Rule to check common mistakes in URL
Hi,

> http:/www.google.com (there should be 2 slashes after
> protocol)

This is valid, at least protocol-wise. It refers to a directory /www.google.com on the current server. Good warning, of course, if there is at least a dot in there.

Purodha

On 24.10.2015 05:31, Dominique Pellé wrote:
> Hi
>
> I've added a rule in French grammar.xml to check for common mistakes
> in URLs in this checkin:
>
> https://github.com/languagetool-org/languagetool/commit/4bd2109242ad02f2d50e1f597580764a1dd45d97
>
> Some examples of mistakes detected:
>
>   http//www.google.com (missing colon)
>   http:/www.google.com (there should be 2 slashes after protocol)
>   mailto://john.doe.com (no // in mailto: or news: protocols)
>   .google.com (there should probably be 3 w, not 4)
>   https://ww.google.com (there should probably be 3 w, not 2)
>
> The rule could be re-used in other languages.
> The rule does not use suggestions yet. I could not get
> to work somehow with . Maybe someone can help.
> If you can think of other common mistakes in URLs not yet detected,
> let me also know.
>
> Dominique
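The kinds of mistakes listed in the quoted message can be approximated with a few regular expressions. This is a rough sketch in Python, not the patterns from the French grammar.xml commit: the names `CHECKS` and `find_url_mistakes`, the messages, and the test string `http://wwww.google.com` are all assumptions made for illustration.

```python
import re

# Hypothetical checks modelled on the examples in the message above.
CHECKS = [
    (re.compile(r"\bhttps?//"), "missing colon after the protocol"),
    (re.compile(r"\bhttps?:/(?!/)"), "two slashes expected after the protocol"),
    (re.compile(r"\b(?:mailto|news)://"), "no // after mailto: or news:"),
    # "ww." or four or more w's at the start of a host name.
    (re.compile(r"(?<![\w.])(?:ww|w{4,})\.(?=\w)"), "host probably starts with www"),
]


def find_url_mistakes(text):
    """Return the messages of all checks that match somewhere in text."""
    return [message for pattern, message in CHECKS if pattern.search(text)]


for sample in ["http//www.google.com",
               "http:/www.google.com",
               "mailto://john.doe.com",
               "http://wwww.google.com",
               "https://ww.google.com",
               "http://www.google.com"]:
    print(sample, find_url_mistakes(sample))
```

As noted in the reply, the single-slash check flags a form that is syntactically valid, so it is best treated as a warning rather than an error.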
Re: Behavior of non-breaking space U+00A0 in LanguageTool
Hi,

you put a nbsp for instance between a figure and its unit, or in some languages between two parts of an abbreviation, or between the (short) 1st word of a sentence, such as an article, and the 2nd word, etc.

Purodha

On 12.10.2015 14:33, Andre Couture wrote:
> Hi
> I did not follow the entire conversation here, but I was curious as to
> why someone would put a non-breaking space between two words.
> We face that in other areas of our code as well.
>
> If the idea of the nbsp is to keep the two apparent words together,
> would it be good to handle the nbsp as a hyphen? Which means that the
> two words could be treated as two words or a single one?
>
>> On Oct 12, 2015, at 05:52, Daniel Naber wrote:
>>
>>> On 2015-10-11 19:18, Dominique Pellé wrote:
>>>
>>> I think that spaces or non-breaking spaces should behave
>>> the same for LanguageTool.
>>
>> I guess so. I've made a commit that changes this. It broke some
>> tests: I fixed fr/grammar.xml and commented out tests in
>> QuestionWhitespaceRuleTest. Could you check that, i.e. re-activate
>> the tests?
>>
>> Regards
>> Daniel
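To make "behave the same" concrete, here is a small Python sketch in which U+00A0 and a plain space lead to the same tokens. It is not LanguageTool's tokenizer; `tokenize` and `normalize_nbsp` are names invented for this example.

```python
NBSP = "\u00a0"  # non-breaking space, U+00A0


def tokenize(text):
    """Split on any Unicode whitespace; str.split() treats U+00A0 as whitespace."""
    return text.split()


def normalize_nbsp(text):
    """Replace non-breaking spaces with ordinary spaces before checking."""
    return text.replace(NBSP, " ")


with_nbsp = "10" + NBSP + "km"   # figure and unit kept together by a nbsp
with_space = "10 km"

print(tokenize(with_nbsp))                      # same tokens as for with_space
print(normalize_nbsp(with_nbsp) == with_space)
```

Either approach keeps the figure and the unit as two tokens, so a rule written for "10 km" would also apply to the nbsp variant.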
Re: new syntax available
On 10.10.2015 06:16, Dominique Pellé wrote:
> Daniel Naber wrote:
>
>> On 2015-10-09 07:32, Dominique Pellé wrote:
>>
>>> I suppose that I care more than most because I only use LT to check
>>> text files where the situation is frequent.
>>
>> I think normalizing the text makes sense if:
>> 1) single line breaks get removed from plain text files (but not
>> double spaces)
>> 2) this normalization doesn't happen in LT core, but in the
>> command-line client
>>
>> My understanding is that's not enough for your use case as you use
>> spaces for indentation? For me, this sounds like a general input
>> format issue, just like people want to use LT to check LaTeX. We
>> cannot support that in the core, but if we find a way to do it
>> outside that would be okay for me. We just need to avoid becoming a
>> parser for every format out there.
>>
>> We already have the concept of annotated text[1], I think this could
>> be used to check plain text files. "\n" is then markup just like ""
>> is markup in XML. So we don't need normalization in that sense, but
>> we need to parse the input.
>>
>> [1]
>> https://languagetool.org/development/api/org/languagetool/markup/AnnotatedText.html
>
> I'm not sure I understand how it would work for users.
> Would users have to give an option? Command line, or check box
> for the GUI? That seems unfortunate, since it worked well before
> without specifying an option, which users may not be aware of.
>
> I wonder how many users copy and paste text into the web interface
> of LT. Those users will also have a degraded experience.
>
> I seem to be the only one really bothered with the regression.
> I don't mean to be too negative about it. I like the new
> feature, but I don't like the regression because text format is
> ubiquitous and many text files use multiple double spaces as
> well as line breaks in sentences.
>
> I could instead use \s+ in regexp for fr, eo, br that I maintain.
> But it's not nice if only those 3 languages work.
> And yes, it would clutter regexps, but I'd still find it acceptable.
>
> Mike Unwalla wrote:
>
>> I understand why you want to preprocess text. Sometimes, I have a
>> similar problem. Sometimes, I want to ignore multiple spaces, line
>> breaks, and tab characters.
>>
>> However, automatically ignoring such text could cause problems. For
>> example, not all double spaces are errors. For the Netherlands,
>> "there should be a double space between the postcode and the post
>> town"
>> (http://www.royalmail.com/personal/help-and-support/Addressing-your-items-Western-Europe).
>
> That's true. It's a rare case, but it's good to be able to detect
> such errors.
>
> Ironically, the example given in your link does not respect
> the rule it preaches for the Dutch address, since I see only one space
> between the postal code and the post town in "2312 BK LEIDEN".
> The address in Luxembourg is also misspelled (Longway -> Longwy)
> but that's off-topic.
>
> Your link gives me the idea of writing semantic rules to check
> address formatting in various countries. Examples of rules for
> checking addresses in France:
> - house number should be before street name
> - postal code should be before city name
> - postal code should be 5 digits without space (29200 is ok, 29 200
>   is wrong)
> - etc.
>
> Good example:
>   23 Rue de l’église
>   29200 BREST
>   FRANCE
>
> Bad example (postal code after city name):
>   23 Rue de l’église
>   BREST 29200
>   FRANCE
>
> The feature will be great for such rules.
> Something like this may work (not tested):
>
> \b(Rue|Avenue|Av\.|Place|Pl\.|Boulevard|Boul\.)\s.*\n\s+\d{5}\s+\p{Lu}.*\n\s+FRANCE\b
>
>> I did not mean that you should not preprocess text. I meant that you
>> should not mess with the meaning of a regexp.
>>
>> Possibly, we can solve the conflict by having 2 types of :
>>
>
> That would be ideal in my opinion.
> Use of "exact-meaning" would be very rare.
> Maybe a better name:
>
> Regards
> Dominique

How about making preprocessing explicit in the rule set like this:

foo bar ... foo bar

Purodha
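The French address idea from the quoted message can be tried out in a few lines of Python. This is an untested-in-LanguageTool sketch under several assumptions: Python's `re` has no `\p{Lu}`, so `[A-Z]` stands in for it; `\s*` is used instead of `\s+` so that unindented addresses match; and the names `STREET`, `WELL_FORMED`, `CITY_BEFORE_CODE` and `check_address` are invented here.

```python
import re

# Street keywords taken from the regexp in the message above.
STREET = r"(?:Rue|Avenue|Av\.|Place|Pl\.|Boulevard|Boul\.)"

# House number, street line, then "NNNNN CITY", then FRANCE.
# [A-Z] approximates \p{Lu}, which Python's re module does not support.
WELL_FORMED = re.compile(
    r"\b\d+\s+" + STREET + r"\s.*\n\s*\d{5}\s+[A-Z].*\n\s*FRANCE\b"
)

# Same layout, but the 5-digit postal code comes after the city name.
CITY_BEFORE_CODE = re.compile(
    STREET + r"\s.*\n\s*[A-Z][^\n\d]*\s\d{5}\s*\n\s*FRANCE\b"
)


def check_address(text):
    """Classify an address block as 'ok', 'postal code after city' or 'unknown'."""
    if WELL_FORMED.search(text):
        return "ok"
    if CITY_BEFORE_CODE.search(text):
        return "postal code after city"
    return "unknown"


good = "23 Rue de l’église\n29200 BREST\nFRANCE"
bad = "23 Rue de l’église\nBREST 29200\nFRANCE"
print(check_address(good))
print(check_address(bad))
```

Both patterns depend on the line breaks being visible to the regexp, which is the reason the thread argues against silently normalizing newlines before rules run.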
Re: new syntax available
On 08.10.2015 06:59, Dominique Pellé wrote:
> Daniel Naber wrote:
>
>> On 2015-10-07 06:41, Dominique Pellé wrote:
>>
>> Hi Dominique,
>>
>> thanks for your feedback.
>
> One more remark:
>
> If I replace a rule like...
>
>   foo
>   bar
>
> ... into ...
>
>   foo bar
>
> ... then the regexp rule does not detect all the errors
> that the original rule detected. It does not detect errors
> in "foo  bar" (2 spaces or more, or tabs) or when there is a
> new line as in:
>
>   foo
>   bar
>
> How to fix it?
>
> 1) should we write regexes like foo\s+bar
>
> 2) or should we be smart and automatically treat
> all sequences of spaces/tabs/newlines/non-breaking spaces
> as if they were one space?

I suggest version 1, since 2 would alter the usual meaning of regular expressions, which I believe is a bad idea.

Purodha
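The two options discussed above can be compared directly. The sketch below uses Python's `re` rather than LanguageTool's rule engine, and models option 2 as a preprocessing step (`collapse_whitespace`, a name made up here) instead of a change to regexp semantics.

```python
import re

samples = ["foo bar", "foo  bar", "foo\tbar", "foo\nbar", "foo\u00a0bar"]

literal = re.compile(r"foo bar")      # the regexp as written in the rule
flexible = re.compile(r"foo\s+bar")   # option 1: whitespace made explicit


def collapse_whitespace(text):
    """Option 2, as preprocessing: turn every whitespace run into one space."""
    return re.sub(r"\s+", " ", text)


literal_hits = [bool(literal.search(s)) for s in samples]
flexible_hits = [bool(flexible.search(s)) for s in samples]
normalized_hits = [bool(literal.search(collapse_whitespace(s))) for s in samples]

print(literal_hits)
print(flexible_hits)
print(normalized_hits)
```

The literal pattern only matches the first sample, while both alternatives match all five. The difference is that option 1 keeps the regexp's usual meaning, whereas option 2 changes the text being matched, so match offsets would have to be mapped back to the original input.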
Subjects of post to this mailinglist
Dear list admins,

is it possible to have a string like [language tool] or maybe something shorter prefixed to the subjects of mailing list posts? Many lists have that, and at least to me it is really helpful in quickly understanding what is what in my 100+ daily e-mails.

Thank you and greetings
Purodha
Re: Subjects of post to this mailinglist
On 29.08.2015 14:06, Daniel Naber wrote:
> On 2015-08-29 12:02, Purodha Blissenbach wrote:
>
>> is it possible to have a string like [language tool] or maybe
>> something shorter prefixed to the subjects of mailing list posts?
>> Many lists have that, and at least to me it is really helpful in
>> quickly understanding what is what in my 100+ daily e-mails.
>
> I suggest you filter your emails, e.g. by the List-Id header. The value
> for this list is development discussion for LanguageTool
> languagetool-devel.lists.sourceforge.net, all other lists should also
> have such a header.

The multiple ways in which I receive e-mail make filtering not an option. I could route ALL my mail through an extra server allowing a filtering process which might selectively add the wanted prefix, but that is cumbersome and slow. Currently, I am receiving only one mailing list that does not identify itself in the Subject: header. That is why I am asking.

Purodha
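For illustration, the filtering step mentioned above (adding a prefix based on the List-Id header) might look like the following Python sketch using the standard `email` package. The prefix `[LT-devel]`, the function name `tag_subject` and the sample message are assumptions; the angle brackets around the list identifier follow the usual List-Id layout and are not taken from the message above.

```python
from email import message_from_string

LIST_ID = "languagetool-devel.lists.sourceforge.net"
PREFIX = "[LT-devel]"  # hypothetical prefix, not something the list adds itself


def tag_subject(raw_message):
    """Prefix the Subject of messages whose List-Id names the LT devel list."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    if LIST_ID in msg.get("List-Id", "") and PREFIX not in subject:
        del msg["Subject"]            # remove the old header before re-adding it
        msg["Subject"] = f"{PREFIX} {subject}"
    return msg


raw = (
    "List-Id: development discussion for LanguageTool "
    "<languagetool-devel.lists.sourceforge.net>\n"
    "Subject: Re: new syntax available\n"
    "\n"
    "Message body.\n"
)
tagged = tag_subject(raw)
print(tagged["Subject"])
```

Such a step has to run wherever the mail is delivered, which is the drawback described in the reply: with several delivery paths, each one would need its own copy of the filter.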