Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-25 Thread Jaska Zedlik
On Tue, Jun 23, 2009 at 04:26, wrote: > > In https://bugzilla.wikimedia.org/show_bug.cgi?id=8445 I had to insert > quote marks into people's searches in order to make them work for > Chinese... (one more point in the quote/apostrophe mess.) > > This is the same idea we would like to use and seems

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-25 Thread Steve Bennett
On Wed, Jun 24, 2009 at 4:38 AM, Brion Vibber wrote: > Unless you cut and paste a term containing a fancy character from > another window, but the page uses the plain character... Yeah, I know. But my point is made and understood: think through the reasons for each equivalence, rather than automat

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-23 Thread Andrew Dunbar
2009/6/23 Brion Vibber : > Steve Bennett wrote: >> So, apostrophe (U+0027) -> curved right single quote (U+2019): yes, probably. >> The other way around...probably not, unless that U+2019 exists on any >> keyboards. >> >> Hyphen-minus (U+002D) -> em dash (U+2014): I would say no. If you >> search

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-23 Thread Brion Vibber
Steve Bennett wrote: > So, apostrophe (U+0027) -> curved right single quote (U+2019): yes, probably. > The other way around...probably not, unless that U+2019 exists on any > keyboards. > > Hyphen-minus (U+002D) -> em dash (U+2014): I would say no. If you > search for "clock-work", you probably d

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-22 Thread Tim Larson
Steve Bennett wrote: > I think you have to be mindful of the original goal here: for each > character a user is likely to enter from their keyboard in the search > box, what possible range of characters would they expect to match? > > So, apostrophe (U+0027) -> curved right single quote (U+2019):
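The one-way equivalence discussed in this message can be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code: the `EXPANSIONS` table and the `query_to_pattern` helper are hypothetical names, and the table holds just the single mapping the thread calls "yes, probably" (a typed U+0027 also matches U+2019). Nothing is mapped in the other direction, and hyphen-minus is deliberately not expanded to an em dash.

```python
import re

# Hypothetical table: character typed in the search box -> every
# character it should be allowed to match in the indexed text.
EXPANSIONS = {
    "\u0027": "\u0027\u2019",  # apostrophe also matches right single quote
}

def query_to_pattern(query):
    """Build a regex that applies the one-way expansions to a query."""
    parts = []
    for ch in query:
        alternatives = EXPANSIONS.get(ch)
        if alternatives:
            # A character class listing each acceptable variant.
            parts.append("[" + re.escape(alternatives) + "]")
        else:
            # Every other character must match literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts))
```

With this table, a query typed with the plain apostrophe finds both spellings, while a query pasted with U+2019 matches only U+2019. Whether the second behaviour is desirable is exactly the cut-and-paste case raised elsewhere in the thread.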

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-22 Thread Steve Bennett
On Sat, Jun 20, 2009 at 9:46 PM, Neil Harris wrote: >> Regarding dashes and hyphens, I've now found my original data set, and >> a quick inspection gives this set of various similar-looking Latin >> hyphens, dashes and minus signs: >> U+002D HYPHEN-MINUS >> U+2010 HYPHEN >> U+2011 NON-BREAKING HYPH

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-22 Thread jidanni
> "JZ" == Jaska Zedlik writes: JZ> The one problem which is left, is how to integrate this into JZ> MediaWiki and enable search compatibility for different apostrophes? In https://bugzilla.wikimedia.org/show_bug.cgi?id=8445 I had to insert quote marks into people's searches in order to make th

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-21 Thread Jaska Zedlik
Thanks to everybody for the replies. The one remaining problem is how to integrate this into MediaWiki and enable search compatibility for different apostrophes. Would overriding the stripForSearch() function in the local language class be a good solution for this? zedlik On Sat, Jun 20

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Neil Harris
Mike.lifeguard wrote: > Speaking of AntiSpoof, there is a freshly-opened bug that could use > attention: https://bugzilla.wikimedia.org/show_bug.cgi?id=19273 > > Thanks, > -Mike > Unless the normalization code has changed radically since I first wrote it, this should not be an issue with the n

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Mike.lifeguard
Speaking of AntiSpoof, there is a freshly-opened bug that could use attention: https://bugzilla.wikimedia.org/show_bug.cgi?id=19273 Thanks, -Mike On Sat, 2009-06-20 at 10:39 +0100, Neil Harris wrote: > Andrew Dunbar wrote: > > 2009/6/20 Jaska Zedlik : > > > >> Hello, > >> On Fri, Jun 19, 2009

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Neil Harris
Neil Harris wrote: > > Regarding dashes and hyphens, I've now found my original data set, and > a quick inspection gives this set of various similar-looking Latin > hyphens, dashes and minus signs: > U+002D HYPHEN-MINUS > U+2010 HYPHEN > U+2011 NON-BREAKING HYPHEN > U+2012 FIGURE DASH > U+2013

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Neil Harris
Andrew Dunbar wrote: > 2009/6/20 Neil Harris : > >> Neil Harris wrote: >> >>> Andrew Dunbar wrote: >>> >>> 2009/6/20 Jaska Zedlik : > Hello, > On Fri, Jun 19, 2009 at 20:31, Rolf Lampa wrote: > > > > >> Jaska Zedlik

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Andrew Dunbar
2009/6/20 Neil Harris : > Neil Harris wrote: >> Andrew Dunbar wrote: >> >>> 2009/6/20 Jaska Zedlik : >>> >>> Hello, On Fri, Jun 19, 2009 at 20:31, Rolf Lampa wrote: > Jaska Zedlik skrev: > <...> > > >> The code of the override function is the following:

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Neil Harris
Neil Harris wrote: > Andrew Dunbar wrote: > >> 2009/6/20 Jaska Zedlik : >> >> >>> Hello, >>> On Fri, Jun 19, 2009 at 20:31, Rolf Lampa wrote: >>> >>> >>> Jaska Zedlik skrev: <...> > The code of the override function is the following: >

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-20 Thread Neil Harris
Andrew Dunbar wrote: > 2009/6/20 Jaska Zedlik : > >> Hello, >> On Fri, Jun 19, 2009 at 20:31, Rolf Lampa wrote: >> >> >>> Jaska Zedlik skrev: >>> <...> >>> The code of the override function is the following: function stripForSearch( $string ) { $s = $string; >>

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-19 Thread Andrew Dunbar
2009/6/20 Jaska Zedlik : > Hello, > On Fri, Jun 19, 2009 at 20:31, Rolf Lampa wrote: > >> Jaska Zedlik skrev: >> <...> >> > The code of the override function is the following: >> > >> > function stripForSearch( $string ) { >> >   $s = $string; >> >   $s = preg_replace( '/\xe2\x80\x99/', '\'', $s )

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-19 Thread Jaska Zedlik
Hello, On Fri, Jun 19, 2009 at 23:28, Brion Vibber wrote: > Jaska Zedlik wrote: > >> Hi! >> >> Several different apostrophe signs exist. Let's consider two of them: >> U+0027 and U+2019. They have the same meaning, and both are >> acceptable as apostrophes in English, for i

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-19 Thread Jaska Zedlik
Hello, On Fri, Jun 19, 2009 at 20:31, Rolf Lampa wrote: > Jaska Zedlik skrev: > <...> > > The code of the override function is the following: > > > > function stripForSearch( $string ) { > > $s = $string; > > $s = preg_replace( '/\xe2\x80\x99/', '\'', $s ); > > return parent::stripForSearch

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-19 Thread Brion Vibber
Jaska Zedlik wrote: > Hi! > > Several different apostrophe signs exist. Let's consider two of them: > U+0027 and U+2019. They have the same meaning, and both are > acceptable as apostrophes in English, for instance. The > problem is that MediaWiki internal search distinguishes

Re: [Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-19 Thread Rolf Lampa
Jaska Zedlik skrev: <...> > The code of the override function is the following: > > function stripForSearch( $string ) { > $s = $string; > $s = preg_replace( '/\xe2\x80\x99/', '\'', $s ); > return parent::stripForSearch( $s ); > } I'm not a PHP programmer, but why use the extra assignment
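For readers who want to try the idea outside MediaWiki, here is a minimal sketch of the same folding step, written in Python rather than PHP. The names `strip_for_search` and `base_strip_for_search` are hypothetical, and the base function is only a stand-in (it just lowercases) for whatever the parent `stripForSearch()` really does. The bytes `\xe2\x80\x99` in the quoted regex are the UTF-8 encoding of U+2019.

```python
RIGHT_SINGLE_QUOTE = "\u2019"  # UTF-8 bytes E2 80 99, as in the quoted regex
APOSTROPHE = "\u0027"

def base_strip_for_search(text):
    """Stand-in for the parent normalization (assumption: lowercase only)."""
    return text.lower()

def strip_for_search(text):
    """Fold U+2019 into U+0027, then apply the base normalization."""
    # No temporary variable is needed: the folded string is passed
    # straight to the base function.
    return base_strip_for_search(text.replace(RIGHT_SINGLE_QUOTE, APOSTROPHE))
```

In this sketch the same function would be applied both when text is indexed and when a query is normalized, so either apostrophe in a query matches either apostrophe in a page.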

[Wikitech-l] Different apostrophe signs and MediaWiki internal search

2009-06-19 Thread Jaska Zedlik
Hi! Several different apostrophe signs exist. Let's consider two of them: U+0027 and U+2019. They have the same meaning, and both are acceptable as apostrophes in English, for instance. The problem is that MediaWiki internal search distinguishes these two apostrophes and the