[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-06-17 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

au audr...@audreyt.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |audr...@audreyt.org

--- Comment #13 from au audr...@audreyt.org 2012-06-17 18:41:10 UTC ---
Hi Vlakoff, thank you for the patch!

As you may already know, MediaWiki is currently revamping its PHP-based parser
into a Parsoid prototype component, to support the rich-text Visual Editor
project:

   https://www.mediawiki.org/wiki/Parsoid
   https://www.mediawiki.org/wiki/Visual_editor

Folks interested in enhancing the parser's capabilities are very much welcome
to join the Parsoid project, and contribute patches as Git branches:

   https://www.mediawiki.org/wiki/Git/Tutorial#How_to_submit_a_patch

Compared to .diff attachments in Bugzilla tickets, Git branches are much easier
for us to review, refine and merge features together.

Each change set has a distinct URL generated by the git review tool, which
can be referenced in Bugzilla by pasting its gerrit.wikimedia.org URL as a
comment.

If you run into any issues with the patch process, please feel free to ask on
irc.freenode.net #wikimedia-dev and the wikitext-l mailing list. Thank you!

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
You are on the CC list for the bug.

___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-05-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Rich Farmbrough rich...@farmbrough.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rich...@farmbrough.co.uk

--- Comment #12 from Rich Farmbrough rich...@farmbrough.co.uk 2012-05-12 
18:10:08 UTC ---
Of the 5 exceptions, I have corrected 3; one was corrected already (but there
was an instance of a tab character), and one was a false positive (Rugby
Football Club, not Request For Comments).


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-03-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Sumana Harihareswara suma...@panix.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|need-review                 |reviewed

--- Comment #11 from Sumana Harihareswara suma...@panix.com 2012-03-10 
19:53:09 UTC ---
(In reply to comment #10)
> The current patch no longer applies on current trunk though. Could you refresh
> and tweak it?

Thus marking reviewed - thanks, Gabriel.


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-03-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

--- Comment #10 from Gabriel Wicke wi...@wikidev.net 2012-03-09 10:00:49 UTC 
---
The trunk version of this regexp already uses Unicode regexp mode, so your
patch should not affect performance significantly. 

The current patch no longer applies on current trunk though. Could you refresh
and tweak it?


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-03-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

--- Comment #9 from Gabriel Wicke wi...@wikidev.net 2012-03-08 18:59:28 UTC 
---
I have not tested the performance impact yet, but have doubts about promoting
manual non-breaking space insertion. 

An alternative to consider would be to insert non-breaking spaces automatically
in these links, which would avoid cluttering the source with ugly HTML
entities. The length of IDs and especially of RFC/ISBN/PMID prefixes is quite
short, so a non-breaking space should not cause major display problems, even on
small displays. ISBN links in narrow table columns (infoboxes maybe) could
still be an issue, and would need to be checked.
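
A minimal sketch of the automatic-insertion alternative, written in Python rather than MediaWiki's PHP. The function name, the pattern and the choice of emitting &#160; are assumptions made for illustration only; this is not the parser's code.

```python
import re

# Hypothetical pattern: a magic-link prefix, then plain spaces or tabs,
# then the first digit of the ID (checked with a lookahead, not consumed).
MAGIC_GAP = re.compile(r"\b(ISBN|RFC|PMID)[ \t]+(?=[0-9])")

def nonbreaking_label(text: str) -> str:
    """Replace the gap after the prefix with a single &#160; entity."""
    return MAGIC_GAP.sub(r"\1&#160;", text)
```

With this approach the wiki source keeps an ordinary space, and only the rendered output carries the non-breaking one.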

Independent of the non-breaking space issue I am in favor of restricting the
number of newlines between link prefix and ID just as you did in your patch.
Could you rework the patch to include the count directly in the regexp, similar
to my regexp above?
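
One way the newline limit might be folded into the pattern itself, sketched in Python. The exact rule used here (spaces or tabs, at most one newline, and no whitespace after that newline) is an assumption for illustration, not the rule from the patch.

```python
import re

# Assumed gap rule: either spaces/tabs optionally followed by ONE newline,
# or a single newline on its own. A second newline, or whitespace after
# the newline, makes the match fail because a digit must follow directly.
GAP = r"(?:[ \t]+\n?|\n)"
MAGIC = re.compile(r"\b(RFC|PMID)" + GAP + r"([0-9]+)\b")

def find_ids(text: str) -> list:
    """Return the IDs of all RFC/PMID magic links accepted by the gap rule."""
    return [m.group(2) for m in MAGIC.finditer(text)]
```

Because the limit lives in the pattern, no separate post-match check on the captured whitespace is needed.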


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-02-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Gabriel Wicke wi...@wikidev.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wi...@wikidev.net

--- Comment #7 from Gabriel Wicke wi...@wikidev.net 2012-02-29 13:55:15 UTC 
---
The latest patch would remove linking for five links in the English Wikipedia
found by the following regexp:

'(?:(?:RFC|PMID)[ \t\n\r\f]*(?:[\n\f\r][ \t\n\r\f]*){2,}([0-9]+)|ISBN[ \t\n\r\f]*(?:[\n\f\r][ \t\n\r\f]*){2,}(\b(?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx]\b))'

--
== Match: [[Ridda wars]] ==
on, Trident Press Ltd., 2001, p. 81-84. ISBN

1-900724-47-2.</ref>

=== Buzakha ===

On receiving i
== Match: [[Hiroyuki Agawa]] ==
 Imperial Navy<br>ISBN 9780870113550<br>ISBN 

9784770025395   
|[[Biography]]; translation by John Bes
== Match: [[USS Fulton (AS-11)]] ==
 Ships'', pg. 377. Amber Books, London. ISBN 

9781905704439 </ref>, launched on 27 December 1940 by   
== Match: [[Leinster League Division Two]] ==
1997/1998 Ashbourne

1998/1999 Coolmine RFC

1999/2000 [[Garda RFC|Garda]]   

2000/2001 Ark
== Match: [[Bluff Cove Air Attacks]] ==
sic primer''. Diane publishing, p. 235. ISBN

1585660914</ref>

A total of 56 British servicemen
---

This is rare enough (and can easily be fixed manually). In the patch, the
checks for the number of newlines and the leading space (<pre>) could also be
folded into the regexp. Apart from that, it would be a good idea to benchmark
the performance impact of switching the regexp to Unicode mode.
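
For readers who want to try the search regexp quoted in this comment, here is a Python transcription. It is only a sketch: the helper name is hypothetical, and Python's re module stands in for the PCRE engine actually used.

```python
import re

# The regexp from this comment, split into named pieces for readability.
WS = r"[ \t\n\r\f]*"
GAP = WS + r"(?:[\n\f\r]" + WS + r"){2,}"   # two or more line breaks
ISBN = r"\b(?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx]\b"
PATTERN = re.compile(
    r"(?:(?:RFC|PMID)" + GAP + r"([0-9]+)|ISBN" + GAP + r"(" + ISBN + r"))"
)

def find_split_magic_links(text: str) -> list:
    """Return IDs separated from their prefix by two or more line breaks."""
    return [m.group(1) or m.group(2) for m in PATTERN.finditer(text)]
```

The pattern has no word boundary before RFC and does not check what follows the digits, which is how a line ending in "Coolmine RFC" followed by "1999/2000" ends up in the results.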


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-02-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

--- Comment #8 from vlak...@gmail.com 2012-02-29 18:49:20 UTC ---
Thank you for this deep research. These articles should be corrected manually.
Also, your 4th match seems to be wrong ;)

About the Unicode regex mode: if the performance impact turns out to be
significant, we could remove that \x{00A0} part. It shouldn't be a big problem,
since using the nbsp char directly is a bad idea: it is not visible, and some
browsers remove it from textareas.


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-01-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

--- Comment #2 from vlak...@gmail.com 2012-01-23 13:28:37 UTC ---
Created attachment 9894
  -- https://bugzilla.wikimedia.org/attachment.cgi?id=9894
patch proposal for Bug 28950 and Bug 29025

First patch proposal for Bug 28950 and Bug 29025. Seems to be working great,
nevertheless any suggestion would be very welcome.

The benefits of this patch are:
- (Bug 28950) non-breaking spaces (both literal char and HTML entities) support
- (Bug 29025) no surprising link creation if several \n's (like
ISBN\n\n1234567890)


The only limitation I am aware of is that \n isn't implemented (yet), so for
example ISBN\n1234567890 doesn't produce a link. But don't forget cases like
ISBN \n123..., ISBN\n 123... (<pre> insertion!), ISBN\n&nbsp;123..., and
so on.

\n support is feasible. I don't know whether it would be that useful, but I'd
like to stay as close as possible to normal wikicode parsing.


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-01-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Sumana Harihareswara suma...@panix.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |need-review, patch
                 CC|                            |suma...@panix.com

--- Comment #3 from Sumana Harihareswara suma...@panix.com 2012-01-23 
19:37:21 UTC ---
I've added the patch and need-review tags so developers know to review
this.  Thanks for the patch!  Next time you can do that yourself to speed
things up.


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-01-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

vlak...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #9894|0                           |1
        is obsolete|                            |

--- Comment #4 from vlak...@gmail.com 2012-01-23 20:56:41 UTC ---
Created attachment 9897
  -- https://bugzilla.wikimedia.org/attachment.cgi?id=9897
patch proposal (v2) for Bug 28950 and Bug 29025

New patch proposal for Bug 28950 and Bug 29025.

Now also perfectly manages \n's :-)

Here's my approach: the \n's are captured by the regex, as in the production
code, but no link is created if the whitespace between ISBN and 123... meets
any of these conditions:
- several \n's (not necessarily consecutive), which necessarily end the current
<p>
- \n followed by a normal space, which triggers <pre> creation
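
The two conditions above can be sketched as a small check on the captured whitespace. This is Python for illustration only; the function name is hypothetical and the patch itself is PHP.

```python
def gap_allows_link(gap: str) -> bool:
    """Return False when the whitespace between prefix and ID blocks a link."""
    if gap.count("\n") > 1:   # several newlines: the paragraph has ended
        return False
    if "\n " in gap:          # newline followed by a space: starts a <pre>
        return False
    return True
```

A single newline, with or without spaces before it, still allows the link under these two rules.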


Also, I removed the following lines I added in my previous patch:
$spaces = preg_replace( '![ \t]+!', ' ', $spaces );
$spaces = preg_replace( '!&\#0?160;|\x{0160}!u', '&nbsp;', $spaces );

I noticed the parser usually doesn't do this cleanup (it only replaces entities
with &#160;), so for consistency and performance I just leave the spaces
unmodified.


Please let me know if I have forgotten anything.

Sumana, you're welcome; I'm keeping it in mind for the next time.


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-01-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

--- Comment #5 from Platonides platoni...@gmail.com 2012-01-23 21:16:05 UTC 
---
The entities look a bit like clutter. Also, why do you need to
replace the spaces?


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2012-01-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

--- Comment #6 from vlak...@gmail.com 2012-01-23 22:18:05 UTC ---
I'm not sure I understand what you mean.

I just support the different ways of writing a non-breaking space in the
wiki source, which are the entities &nbsp;, &#160;, &#0160;, as well as the
literal \u00A0 char. It is important for consistency to recognize all of them
and not just an arbitrary subset.
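
As an illustration of recognizing every spelling consistently, here is a Python sketch that matches the three entity forms (&nbsp;, &#160;, &#0160;) and the literal character. The names are hypothetical and this is not the patch's PHP code.

```python
import re

# Named entity, decimal entity with an optional leading zero, or the
# literal character. U+00A0 is decimal 160 (hex A0).
NBSP_FORMS = re.compile(r"&nbsp;|&#0?160;|\u00a0")

def normalize_nbsp(text: str) -> str:
    """Rewrite every recognized spelling as the literal U+00A0 character."""
    return NBSP_FORMS.sub("\u00a0", text)
```

Other spellings, such as a hexadecimal entity, are deliberately left untouched in this sketch.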

Also, I said I **don't** replace the spaces anymore ;-) I realized after my
first patch that it was useless...


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-07-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Bug 28950 depends on bug 29025, which changed state.

Bug 29025 Summary: Magic links are inconsistent with common parser rules
https://bugzilla.wikimedia.org/show_bug.cgi?id=29025

           What    |Old Value                   |New Value
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-07-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Bug 28950 depends on bug 29025, which changed state.

Bug 29025 Summary: Magic links are inconsistent with common parser rules
https://bugzilla.wikimedia.org/show_bug.cgi?id=29025

           What    |Old Value                   |New Value
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-06-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

p858snake p858sn...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |29473
             Blocks|26207                       |


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-05-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Platonides platoni...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |platoni...@gmail.com
         Depends on|                            |29025

--- Comment #1 from Platonides platoni...@gmail.com 2011-05-25 21:33:01 UTC 
---
Might as well do it when fixing bug 29025
(a literal 0160 char, not when done with entities)


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-05-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Reedy s...@reedyboy.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |26207
         Depends on|26207                       |


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-05-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Mark A. Hershberger m...@everybody.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m...@everybody.org
         Depends on|                            |26207


[Bug 28950] ISBN, RFC and PMID magic links might support non-breaking spaces (&nbsp;) too

2011-05-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28950

Mark A. Hershberger m...@everybody.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Unprioritized               |Normal
