Re: Suggestion for English rule - masters thesis > master's thesis

2016-06-22 Thread Purodha Blissenbach
Ouch! Nick, you put it the right way.
That is what I wanted to write, but I
messed it up. I'm sorry.
Purodha

On 22.06.2016 23:50, Nick Hough wrote:
> It would be the other way around:
> 
> If you wrote it : "my masters thesis", because it is the thesis from
> your "masters" degree
> If your master wrote it : "my master's thesis", because the thesis
> is owned (hence the possessive apostrophe) by the "master"
> 
> Both could be valid English. Which version is correct would depend on
> the context and desired meaning.
> 
> Cheers,
> 
> Nick
> 
>> On 22 Jun 2016, at 8:16 PM, Purodha Blissenbach
>> <puro...@blissenbach.org> wrote:
>> If you wrote it : "my master's thesis"
>> If your master wrote it : "my masters thesis"
> --
> Attend Shape: An AT Tech Expo July 15-16. Meet us at AT Park in San
> Francisco, CA to explore cutting-edge tech and listen to tech 
> luminaries
> present their vision of the future. This family event has something for
> everyone, including kids. Get more information and register today.
> http://sdm.link/attshape
> ___
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel



Re: Suggestion for English rule - masters thesis > master's thesis

2016-06-22 Thread Purodha Blissenbach
Imho, it depends.
If you wrote it : "my master's thesis"
If your master wrote it : "my masters thesis"
but I am not a native English speaker :-)

Purodha

On 22.06.2016 18:33, Marco A.G.Pinto wrote:
> Hello!
> 
> Using MS Word 2016 I typed "blah blah my masters thesis" and Word
> suggested that I replace "masters" with "master's".
> 
> Could this rule be added to LanguageTool?
> 
> Thanks!
> 
> Kind regards,
> Marco A.G.Pinto



Re: Rule to check common mistakes in URL

2015-10-24 Thread Purodha Blissenbach
On 24.10.2015 11:29, Dominique Pellé wrote:
> Purodha Blissenbach <puro...@blissenbach.org> wrote:
>
>> Hi,
>>
>>> http:/www.google.com (there should be 2 slashes after
>>> the protocol)
>>
>> This is valid, at least protocol-wise. It refers to a directory
>> /www.google.com on the current server. Good warning, of course, if
>> there is at least a dot in there.
>>
>> Purodha
>
>
> I don't think that 1 slash only after http: is correct.
> Or do you have a link which explains what you say?
>
> For documents on local hosts, you need to write something
> like http://localhost/foo.html or http://127.0.0.1/foo.html
>
> For the file: protocol, localhost can be omitted, so
> you can have 3 slashes as in file:///foo.txt

It is unusual in texts, of course, but specifying the scheme (e.g. http)
and omitting the host name asks the already known host, via a (possibly
different) scheme, for the path that begins after the colon. A single
slash instructs the server to look for the path starting from its
document root. If there is no slash after the colon, the path is looked
up relative to the current directory. As said: this is all pretty
unusual in normal text.
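
The resolution described above can be tried out with a lenient URL
resolver. This is a minimal sketch, assuming Python's standard
urllib.parse module and a made-up base document on example.com; a strict
RFC 3986 resolver would instead keep "http:/www.google.com" as an
absolute URI without a host, so the result depends on the resolver used.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical "current document"; any http URL with a host would do.
BASE = "http://example.com/docs/index.html"


def resolve(reference, base=BASE):
    """Resolve a reference against the base with Python's lenient resolver."""
    return urljoin(base, reference)


# The one-slash form parses as a scheme plus an absolute path, with no host.
parts = urlsplit("http:/www.google.com")
print(parts.scheme, repr(parts.netloc), parts.path)

# One slash: the path is taken from the document root of the known host.
print(resolve("http:/www.google.com"))
# No slash: the path is taken relative to the current directory.
print(resolve("http:www.google.com"))
# Two slashes: the text that follows is a host name.
print(resolve("http://www.google.com"))
```

With this base, the one-slash form resolves to
http://example.com/www.google.com and the no-slash form to
http://example.com/docs/www.google.com.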

Purodha




Re: Rule to check common mistakes in URL

2015-10-23 Thread Purodha Blissenbach
Hi,

> http:/www.google.com (there should be 2 slashes after
> the protocol)

This is valid, at least protocol-wise. It refers to a directory
/www.google.com on the current server. Good warning, of course, if there
is at least a dot in there.

Purodha


On 24.10.2015 05:31, Dominique Pellé wrote:
> Hi
>
> I've added a rule in the French grammar.xml to check for common
> mistakes in URLs in this checkin:
>
> 
> https://github.com/languagetool-org/languagetool/commit/4bd2109242ad02f2d50e1f597580764a1dd45d97
>
> Some examples of mistakes detected:
>
>   http//www.google.com    (missing colon)
>   http:/www.google.com    (there should be 2 slashes after the protocol)
>   mailto://john.doe.com   (no // in mailto: or news: protocols)
>   wwww.google.com         (there should probably be 3 w, not 4)
>   https://ww.google.com   (there should probably be 3 w, not 2)
>
> The rule could be re-used in other languages.
> The rule does not use suggestions yet. I could not get <suggestion>
> to work somehow with <regexp>. Maybe someone can help.
> If you can think of other common mistakes in URLs not yet detected,
> let me know as well.
>
> Dominique
>
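
For readers who want to experiment with the kinds of mistakes listed in
the quoted message, here is a minimal sketch in Python. The patterns and
the name find_url_mistakes are assumptions made up for illustration;
they are not the patterns from the French grammar.xml rule in the commit.

```python
import re

# Illustrative patterns only (assumptions, not taken from grammar.xml).
URL_CHECKS = [
    (re.compile(r"\bhttps?//"), "missing colon"),
    (re.compile(r"\bhttps?:/(?!/)"), "there should be 2 slashes after the protocol"),
    (re.compile(r"\b(?:mailto|news)://"), "no // in mailto: or news: protocols"),
    (re.compile(r"\bw{4,}\."), "there should probably be 3 w, not 4 or more"),
    (re.compile(r"(?<![\w.])ww\."), "there should probably be 3 w, not 2"),
]


def find_url_mistakes(text):
    """Return the message of every check whose pattern occurs in the text."""
    return [message for pattern, message in URL_CHECKS if pattern.search(text)]


for sample in ("http//www.google.com", "http:/www.google.com",
               "mailto://john.doe.com", "https://ww.google.com",
               "http://www.google.com"):
    print(sample, find_url_mistakes(sample))
```

A well-formed URL such as http://www.google.com yields an empty list.
Purodha's objection still applies: the one-slash form is only flagged
here because such references are rare in normal text.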
> 




Re: Behavior of non-breaking space U+00A0 in LanguageTool

2015-10-12 Thread Purodha Blissenbach
Hi,

You put an nbsp, for instance, between a figure and its unit, or in some
languages between two parts of an abbreviation, or between the (short)
1st word of a sentence, such as an article, and the 2nd word, etc.
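
As an illustration of the figure-and-unit case, here is a minimal Python
sketch. The unit list and both function names are assumptions made up
for this example; a real checker would need a language-specific
inventory.

```python
import re

NBSP = "\u00a0"

# Hypothetical unit list, for illustration only.
UNITS = ("km", "kg", "cm", "mm", "m", "g", "l")

_FIGURE_UNIT = re.compile(r"(\d) (" + "|".join(UNITS) + r")\b")


def bind_figure_to_unit(text):
    """Replace the plain space between a figure and its unit with U+00A0."""
    return _FIGURE_UNIT.sub(lambda m: m.group(1) + NBSP + m.group(2), text)


def normalize_spaces(text):
    """Treat U+00A0 like an ordinary space, e.g. before running checks."""
    return text.replace(NBSP, " ")


bound = bind_figure_to_unit("The parcel weighs 5 kg.")
print(repr(bound))
print(repr(normalize_spaces(bound)))
```

The second function mirrors the idea discussed in this thread that
spaces and non-breaking spaces should behave the same for a checker.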

Purodha

On 12.10.2015 14:33, Andre Couture wrote:
> Hi
> I did not follow the entire conversation here, but I was curious as to
> why someone would put a non-breaking space between two words.
> We face that in other areas of our code as well.
>
> If the idea of the nbsp is to keep the two apparent words together,
> would it be good to handle the nbsp as a hyphen? Which means that the
> two words could be treated as two words or a single one?
>
>
>
> Sent from my iPhone 6
>
>> On Oct 12, 2015, at 05:52, Daniel Naber wrote:
>>
>>> On 2015-10-11 19:18, Dominique Pellé wrote:
>>>
>>> I think that spaces or non-breaking spaces should behave
>>> the same for LanguageTool.
>>
>> I guess so. I've made a commit that changes this. It broke some tests:
>> I fixed fr/grammar.xml and commented out tests in
>> QuestionWhitespaceRuleTest. Could you check that, i.e. re-activate the
>> tests?
>>
>> Regards
>>  Daniel
>>
>>
>> 




Re: new syntax available

2015-10-09 Thread Purodha Blissenbach
On 10.10.2015 06:16, Dominique Pellé wrote:
> Daniel Naber wrote:
>
>> On 2015-10-09 07:32, Dominique Pellé wrote:
>>
>>> I suppose that I care more than most because I only use LT to check
>>> text files where the situation is frequent.
>>
>> I think normalizing the text makes sense if:
>> 1) single line breaks get removed from plain text files (but not double
>> spaces)
>> 2) this normalization doesn't happen in LT core, but in the command-line
>> client
>>
>> My understanding is that's not enough for your use case as you use
>> spaces for indentation? For me, this sounds like a general input format
>> issue, just like people want to use LT to check LaTeX. We cannot support
>> that in the core, but if we find a way to do it outside that would be
>> okay for me. We just need to avoid becoming a parser for every format
>> out there.
>>
>> We already have the concept of annotated text[1], I think this could be
>> used to check plain text files. "\n" is then markup just like "" is
>> markup in XML. So we don't need normalization in that sense, but we need
>> to parse the input.
>>
>> [1]
>> 
>> https://languagetool.org/development/api/org/languagetool/markup/AnnotatedText.html
>
> I'm not sure I understand how it would work for users.
> Would users have to give an option? Command line, or check box
> for the GUI? That seems unfortunate, since it worked well before
> without specifying an option, which users may not be aware of.
>
> I wonder how many users copy paste text in the web interface
> of LT. Those users will also have degraded experience.
>
> I seem to be the only one really bothered with the regression.
> I don't mean to be too negative about it. I like the new <regexp>
> feature, but I don't like the regression because text format is
> ubiquitous and many text files use multiple double spaces as
> well as line breaks in sentences.
>
> I could instead use \s+ in regexp for fr, eo, br that I maintain.
> But it's not nice if only those 3 languages work.
> And yes, it would clutter regexps, but I'd still find it acceptable.
>
> Mike Unwalla wrote:
>
>> I understand why you want to preprocess text. Sometimes, I have a similar
>> problem. Sometimes, I want to ignore multiple spaces, line breaks, and tab
>> characters.
>>
>> However, automatically ignoring such text could cause problems. For example,
>> not all double spaces are errors. For the Netherlands, "there should be a
>> double space between the postcode and the post town"
>> (http://www.royalmail.com/personal/help-and-support/Addressing-your-items-Western-Europe).
>
> That's true.  It's a rare case, but it's good to be able to detect
> such errors.
>
> Ironically, the example given in your link does not respect
> the rule it preaches for the Dutch address, since I see only one space
> between the postal code and the post town in "2312 BK LEIDEN".
> The address in Luxembourg is also misspelled (Longway -> Longwy)
> but that's off-topic.
>
> Your link gives me the idea of writing semantic rules to check
> address formatting in various countries. Examples of rules for
> checking addresses in France:
> - house number should be before street name
> - postal code should be before city name
> - postal code should be 5 digits without space (29200 is ok, 29 200 is
>   wrong)
> - etc.
>
> Good example:
> 23 Rue de l’église
> 29200 BREST
> FRANCE
>
> Bad example (postal code after city name):
> 23 Rue de l’église
> BREST 29200
> FRANCE
>
> The <regexp> feature will be great for such rules.
> Something like this may work (not tested):
>
> 
> \b(Rue|Avenue|Av\.|Place|Pl\.|Boulevard|Boul\.)\s.*\n\s+\d{5}\s+\p{Lu}.*\n\s+FRANCE\b
>
>
>> I did not mean that you should not preprocess text. I meant that you
>> should not mess with the meaning of a regexp.
>>
>> Possibly, we can solve the conflict by having 2 types of <regexp>:
>> 
>> 
>
> That would be ideal in my opinion.
> Use of "exact-meaning" would be very rare.
> Maybe a better name: 
>
> Regards
> Dominique
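
The French address checks proposed in the quoted message can be tried
outside LanguageTool. This is a minimal Python sketch under stated
assumptions: the function name and the two "bad address" patterns are
made up for illustration, and since Python's re module does not support
\p{Lu}, the ASCII class [A-Z] stands in for it.

```python
import re

# Street line, then "<5-digit postal code> <CITY>", then FRANCE.
# [A-Z] replaces \p{Lu}, which Python's re module does not support.
GOOD_ADDRESS = re.compile(
    r"^[ \t]*\d+[ \t]+(?:Rue|Avenue|Av\.|Place|Pl\.|Boulevard|Boul\.)[ \t].*\n"
    r"[ \t]*\d{5}[ \t]+[A-Z].*\n"
    r"[ \t]*FRANCE\b",
    re.MULTILINE,
)
# City name followed by the postal code on the same line.
SWAPPED = re.compile(r"^[ \t]*[A-Z][A-Z' -]*[ \t]+\d{5}[ \t]*$", re.MULTILINE)
# Postal code written with a space inside, e.g. "29 200".
SPLIT_CODE = re.compile(r"^[ \t]*\d{2}[ \t]\d{3}[ \t]+[A-Z]", re.MULTILINE)


def check_french_address(address):
    """Return a list of problems found in a multi-line French address."""
    problems = []
    if SWAPPED.search(address):
        problems.append("postal code should be before city name")
    if SPLIT_CODE.search(address):
        problems.append("postal code should be 5 digits without space")
    return problems


good = "23 Rue de l’église\n29200 BREST\nFRANCE"
bad = "23 Rue de l’église\nBREST 29200\nFRANCE"
print(bool(GOOD_ADDRESS.search(good)), check_french_address(good))
print(bool(GOOD_ADDRESS.search(bad)), check_french_address(bad))
```

Note that the sketch depends on line breaks being preserved in the
input, which is the point under discussion in this thread.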

How about making preprocessing explicit in the rule set like this:


   foo bar
   ...

foo  bar

Purodha



Re: new syntax available

2015-10-08 Thread Purodha Blissenbach


On 08.10.2015 06:59, Dominique Pellé wrote:
> Daniel Naber wrote:
>
>> On 2015-10-07 06:41, Dominique Pellé wrote:
>>
>> Hi Dominique,
>>
>> thanks for your feedback.
>
> One more remark:
>
> If I replace a rule like...
>
> <pattern>
>   <token>foo</token>
>   <token>bar</token>
> </pattern>
>
> ... into ...
>
> <regexp>foo bar</regexp>
>
> ... then the regexp rule does not detect all the errors
> that the <pattern> rule detected. It does not detect errors
> in "foo  bar"  (2 spaces or more, or tabs) or when there is a
> new line as in:
>
>   foo
>   bar
>
> How to fix it?
>
> 1) should we write regex like foo\s+bar
>
> 2) or should <regexp> be smart and automatically treat
> all sequences of spaces/tabs/newlines/unbreakable spaces
> as if it was one space?

I suggest option 1, since option 2 would alter the usual meaning of
regular expressions, which I believe is a bad idea.
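
The difference between the two spellings can be checked with a plain
regular expression engine. This is a minimal sketch in Python, not
LanguageTool code; the sample strings and the helper name are made up
for illustration.

```python
import re

LITERAL_SPACE = re.compile(r"foo bar")
ANY_WHITESPACE = re.compile(r"foo\s+bar")

SAMPLES = {
    "one space": "foo bar",
    "two spaces": "foo  bar",
    "tab": "foo\tbar",
    "line break": "foo\nbar",
    "non-breaking space": "foo\u00a0bar",
}


def matches(pattern):
    """Return the names of the samples in which the pattern is found."""
    return [name for name, text in SAMPLES.items() if pattern.search(text)]


print(matches(LITERAL_SPACE))
print(matches(ANY_WHITESPACE))
```

The literal pattern only finds the single-space sample, while foo\s+bar
finds all five. One caveat: Python's \s is Unicode-aware for str
patterns and therefore also matches U+00A0, whereas Java's \s covers only
ASCII whitespace unless Pattern.UNICODE_CHARACTER_CLASS is set, so a
Java-based rule might need to handle the non-breaking space separately.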

Purodha



Subjects of post to this mailinglist

2015-08-29 Thread Purodha Blissenbach
Dear list admins,
is it possible to have a string like [language tool] or maybe something 
shorter prefixed to the subjects of mailing list posts? Many lists have 
that, and at least to me it is really helpful in quickly understanding 
what is what in my 100+ daily e-mails.

Thank you and greetings

Purodha




Re: Subjects of post to this mailinglist

2015-08-29 Thread Purodha Blissenbach
On 29.08.2015 14:06, Daniel Naber wrote:
> On 2015-08-29 12:02, Purodha Blissenbach wrote:
>
>> is it possible to have a string like [language tool] or maybe something
>> shorter prefixed to the subjects of mailing list posts? Many lists have
>> that, and at least to me it is really helpful in quickly understanding
>> what is what in my 100+ daily e-mails.
>
> I suggest you filter your emails, e.g. by the List-Id header. The
> value for this list is development discussion for LanguageTool
> <languagetool-devel.lists.sourceforge.net>, all other lists should also
> have such a header.

Given the multiple ways in which I receive e-mail, filtering is not an
option. I could route ALL my mail through an extra server allowing a
filtering process which might selectively add the wanted prefix, but
that is cumbersome and slow. Currently, I am receiving only one mailing
list that does not identify itself in its Subject: header. That is why I
am asking.
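
For readers who can filter, the List-Id approach from the quoted reply,
combined with the selective prefixing mentioned above, could look
roughly like this. It is a minimal sketch using Python's standard email
package; the prefix, the function name and the sample message are
assumptions made up for illustration.

```python
from email import message_from_string

LIST_ID = "languagetool-devel.lists.sourceforge.net"
PREFIX = "[LT-devel]"  # hypothetical tag, not one the list uses


def tag_subject(raw_message):
    """Return the subject, prefixed if the message came through the list."""
    msg = message_from_string(raw_message)
    list_id = msg.get("List-Id", "")
    subject = msg.get("Subject", "")
    if LIST_ID in list_id and not subject.startswith(PREFIX):
        return f"{PREFIX} {subject}"
    return subject


SAMPLE = (
    "List-Id: development discussion for LanguageTool "
    "<languagetool-devel.lists.sourceforge.net>\n"
    "Subject: Re: new syntax available\n"
    "\n"
    "body\n"
)
print(tag_subject(SAMPLE))
```

Messages without a matching List-Id header keep their subject unchanged.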

Purodha

