Re: [Wikitech-l] For title normalization, what characters are converted to uppercase ?

2019-08-05 Thread Nicolas Vervelle
Last question (I believe):
I've implemented something similar to Php72ToUpper in WPCleaner, and it
seems to work fine for removing false positives.
I have only one left on frwiki : ⅷ
<https://fr.wikipedia.org/w/index.php?title=%E2%85%B7=no>.
My code still converts it to uppercase, but on frwiki there is one page for
the lowercase letter and one page for the uppercase letter, so this letter
is not converted to uppercase by the current MediaWiki version.
Is it missing from Php72ToUpper, which should prevent it from being
converted with PHP 7.2?
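
For reference, a minimal PHP sketch of the check involved (WPCleaner's real
code is Java; $overrides stands in for the map in
wmf-config/Php72ToUpper.php, and whether 'ⅷ' should be listed there is
exactly the question):

<?php
// Minimal sketch, not WPCleaner's actual (Java) code: $overrides stands in
// for the map in wmf-config/Php72ToUpper.php, which keeps the legacy (HHVM)
// result for the characters it lists.
$overrides = ['ƀ' => 'ƀ' /* , ... */];

function wikiFirstUpper(string $char, array $overrides): string {
    // Use the override when one exists, otherwise PHP 7.2's own uppercasing.
    return $overrides[$char] ?? mb_strtoupper($char, 'UTF-8');
}

echo wikiFirstUpper('ƀ', $overrides), "\n"; // 'ƀ': kept lowercase by the override
echo wikiFirstUpper('ⅷ', $overrides), "\n"; // 'Ⅷ': converted, unless an entry is added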

Nico

On Mon, Aug 5, 2019 at 8:45 AM Nicolas Vervelle  wrote:

> Thanks Giuseppe !
>
> I've subscribed to T219279 to know when the pages are properly converted,
> and when I can remove the hack in my code.
>
> Nico
>
> On Mon, Aug 5, 2019 at 7:03 AM Giuseppe Lavagetto <
> glavage...@wikimedia.org> wrote:
>
>> On Sun, Aug 4, 2019 at 11:34 AM Nicolas Vervelle 
>> wrote:
>>
>> > Thanks Brian,
>> >
>> > Great for the link to Php72ToUpper.php !
>> > I think I understand with it : for example, the first line says 'ƀ' =>
>> 'ƀ',
>> > which should mean that this letter shouldn't be converted to uppercase
>> by
>> > MW ?
>> > That's one of the letters I found that wasn't converted to uppercase and
>> > that was generating a false positive in my code : so it's because
>> specific
>> > MW code is preventing the conversion :-)
>> >
>>
>> Hi!
>>
>> No, that file is a temporary measure during a transition between two
>> versions of php.
>>
>> In HHVM and PHP 5.x, calling mb_strtoupper("ƀ") would give the erroneous
>> result "ƀ".
>>
>> In PHP 7.x, the result is the correct capitalization.
>>
>> The issue is that the titles of wiki articles get normalized, so under
>> php7
>> we would have
>>
>> ƀar => Ƀar
>>
>> which would prevent you from being able to reach the page.
>>
>> Once we're done with the transition and we go through the process of
>> converting the (several hundred) pages/users that have the wrong title
>> normalization, we will remove that table, and obtain the correct
>> behaviour.
>>
>> You just need to subscribe to https://phabricator.wikimedia.org/T219279 and
>> wait for its resolution I think - most unicode horrors are fixed in recent
>> versions of PHP, including the one you were citing.
>>
>> Cheers,
>>
>> Giuseppe
>> --
>> Giuseppe Lavagetto
>> Principal Site Reliability Engineer, Wikimedia Foundation
>> ___
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] For title normalization, what characters are converted to uppercase ?

2019-08-05 Thread Nicolas Vervelle
Thanks Giuseppe !

I've subscribed to T219279 to know when the pages are properly converted,
and when I can remove the hack in my code.

Nico

On Mon, Aug 5, 2019 at 7:03 AM Giuseppe Lavagetto 
wrote:

> On Sun, Aug 4, 2019 at 11:34 AM Nicolas Vervelle 
> wrote:
>
> > Thanks Brian,
> >
> > Great for the link to Php72ToUpper.php !
> > I think I understand with it : for example, the first line says 'ƀ' =>
> 'ƀ',
> > which should mean that this letter shouldn't be converted to uppercase by
> > MW ?
> > That's one of the letters I found that wasn't converted to uppercase and
> > that was generating a false positive in my code : so it's because
> specific
> > MW code is preventing the conversion :-)
> >
>
> Hi!
>
> No, that file is a temporary measure during a transition between two
> versions of php.
>
> In HHVM and PHP 5.x, calling mb_strtoupper("ƀ") would give the erroneous
> result "ƀ".
>
> In PHP 7.x, the result is the correct capitalization.
>
> The issue is that the titles of wiki articles get normalized, so under php7
> we would have
>
> ƀar => Ƀar
>
> which would prevent you from being able to reach the page.
>
> Once we're done with the transition and we go through the process of
> converting the (several hundred) pages/users that have the wrong title
> normalization, we will remove that table, and obtain the correct behaviour.
>
> You just need to subscribe to https://phabricator.wikimedia.org/T219279 and
> wait for its resolution I think - most unicode horrors are fixed in recent
> versions of PHP, including the one you were citing.
>
> Cheers,
>
> Giuseppe
> --
> Giuseppe Lavagetto
> Principal Site Reliability Engineer, Wikimedia Foundation
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] For title normalization, what characters are converted to uppercase ?

2019-08-04 Thread Nicolas Vervelle
Thanks Brian,

Great, thanks for the link to Php72ToUpper.php!
I think I understand it now: for example, the first line says 'ƀ' => 'ƀ',
which should mean that this letter shouldn't be converted to uppercase by
MW?
That's one of the letters I found that wasn't converted to uppercase and
that was generating a false positive in my code: so it's because specific
MW code is preventing the conversion :-)

Nico

On Sun, Aug 4, 2019 at 1:32 AM bawolff  wrote:

> MediaWiki uses php's mb_strtoupper.
>
> I believe this will use the normal Unicode uppercase algorithm. However, this
> can vary depending on the version of Unicode. We are currently in the process
> of switching to php7, but for the moment we are still using HHVM's
> uppercasing code. There's a list of differences between hhvm and php7.2
> uppercasing at
>
> https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/Php72ToUpper.php
> [All this is probably subject to change]
>
> However, I am at a loss as to why HHVM & PHP < 5.6 [1] wouldn't map that
> character, since the ɽ -> Ɽ mapping has been present since Unicode 5
> (2006). Guess it was using really old Unicode data or something.
>
> See also  bug T219279 [2]
>
> --
> Brian
>
> [1] https://3v4l.org/GHt3b
> [2] https://phabricator.wikimedia.org/T219279
>
> On Sat, Aug 3, 2019 at 7:57 AM Nicolas Vervelle 
> wrote:
>
> > Hello,
> >
> > On most wikis, MediaWiki is configured to convert the first letter of
> a
> > title to uppercase, but apparently it's not converting every Unicode
> > character: for example, on frwiki ɽ
> > <https://fr.wikipedia.org/w/index.php?title=%C9%BD=no> is a
> > different article than Ɽ <https://fr.wikipedia.org/wiki/%E2%B1%A4>, even
> > if
> > the second character is the uppercase version of the first one in
> Unicode.
> >
> > So, what characters are actually converted to uppercase by the title
> > normalization ?
> >
> > I need to know this information to stop reporting some false positives in
> > WPCleaner <https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:WPCleaner>.
> >
> > Thanks, Nico
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] For title normalization, what characters are converted to uppercase ?

2019-08-04 Thread Nicolas Vervelle
Thanks Yuri,

I know about the normalization done through the API, but it doesn't work for
the case I'm working on: it's a dump analysis, and I want it to be able to
work offline...
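
For reference, the kind of call Yuri suggests (quoted below) looks roughly
like this; it needs network access, which is exactly what a dump analysis
cannot rely on:

<?php
// Sketch of API-based title normalization: one request can normalize
// several titles at once. Requires network access, so not usable offline.
$url = 'https://fr.wikipedia.org/w/api.php?' . http_build_query([
    'action'        => 'query',
    'titles'        => 'ɽ|ƀar',
    'format'        => 'json',
    'formatversion' => 2,
]);
$data = json_decode(file_get_contents($url), true);
// 'normalized' lists every title the wiki rewrote (first-letter uppercasing, etc.).
foreach ($data['query']['normalized'] ?? [] as $n) {
    echo "{$n['from']} => {$n['to']}\n";
}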

Nico

On Sun, Aug 4, 2019 at 2:12 AM Yuri Astrakhan 
wrote:

> Hi Nico, if possible, can your tool actually use the MW API to normalize
> titles? It's a very quick API call, you can do multiple titles at once, and
> it will save you a lot of grief over incompatibilities.
> --Yuri
>
> On Sat, Aug 3, 2019 at 10:57 AM Nicolas Vervelle 
> wrote:
>
> > Hello,
> >
> > On most wikis, MediaWiki is configured to convert the first letter of
> a
> > title to uppercase, but apparently it's not converting every Unicode
> > character: for example, on frwiki ɽ
> > <https://fr.wikipedia.org/w/index.php?title=%C9%BD=no> is a
> > different article than Ɽ <https://fr.wikipedia.org/wiki/%E2%B1%A4>, even
> > if
> > the second character is the uppercase version of the first one in
> Unicode.
> >
> > So, what characters are actually converted to uppercase by the title
> > normalization ?
> >
> > I need to know this information to stop reporting some false positives in
> > WPCleaner <https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:WPCleaner>.
> >
> > Thanks, Nico
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] For title normalization, what characters are converted to uppercase ?

2019-08-03 Thread Nicolas Vervelle
Hello,

On most wikis, MediaWiki is configured to convert the first letter of a
title to uppercase, but apparently it's not converting every Unicode
character: for example, on frwiki ɽ
<https://fr.wikipedia.org/w/index.php?title=%C9%BD=no> is a
different article than Ɽ <https://fr.wikipedia.org/wiki/%E2%B1%A4>, even if
the second character is the uppercase version of the first one in Unicode.

So, what characters are actually converted to uppercase by the title
normalization ?

I need to know this information to stop reporting some false positives in
WPCleaner .

Thanks, Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Roadmap for CX?

2018-09-06 Thread Nicolas Vervelle
I'd like to voice my concern about CX too: it has been in
production for several years, with a lot of unnecessary advertising (I
still get a popup inviting me to use it each time I try to look at a
deleted page, even though I have clearly disabled CX in the configuration) and
with many problems in the articles created with it, which require a lot of
work from wiki gnomes to fix... Bug reports are just set aside
(example T192582) with the
reasoning that CX2 is coming, but it's been months/years and nothing seems
to be in sight.

What's the actual roadmap? A real one, not a wishlist...
If CX2 is not going to be available soon, could you at least reduce the
burden imposed on volunteers who fix the problems it creates, by reducing
the ads and focusing on fixing problems...


On Thu, Sep 6, 2018 at 10:13 AM Strainu  wrote:

> Hello,
>
> More than a year has passed since the email below and subjectively,
> editors are complaining just as much about not being able to save
> changes and other nuisances. Now, I know that wikipedians are not shy
> about expressing their discontent, but I also cannot overstate the
> impact CX can have for small and medium-sized communities.
>
> Since the pages mentioned in Amir's email have not seen much action, I
> would like to ask for another update from the engineering team. Is CX
> still developed? Are small bug reports handled yet or are you still
> waiting for some big feature?
>
> Thank you,
>Strainu
>
>
> 2017-05-02 11:42 GMT+03:00 Amir E. Aharoni :
> > 2017-04-27 8:55 GMT+03:00 Strainu :
> >
> >> Following the recent outage, we've had a new series of complaints
> >> about the lack of improvements in CX, especially related to
> >> server-side activities like saving/publishing pages.
> >>
> >> Now, I know the team is involved in a long-term effort to merge the
> >> editor with the VE, but is there an end in sight for that effort? Can
> >> I tell people who ask "look, 6 more months then we'll have a much
> >> better translation tool"?
> >>
> >> Is there a publicly available roadmap for this project and more
> >> generally, for CX?
> >>
> >>
> > Hi,
> >
> > Thanks again for bringing this up.
> >
> > Currently the Language team is indeed working on transitioning the
> editing
> > component to VE. At the moment we are completing the rewrite of the
> > frontend internals using OOjs UI and so using VE's special handling of
> edge
> > cases. This is more than a refactoring—this will also improve the
> stability
> > of several features such as saving and loading, paragraph alignment, and
> > table handling.
> >
> > We hope to complete the transition of the translation editing interface
> to
> > VE in July–September 2017. This will not only change the interface
> itself,
> > but will also bring in some of the most often requested CX features, such
> > as the ability to add new categories, templates, and references using
> VE's
> > existing tools rather than just adapt them, and to edit the translation
> > using wiki syntax.
> >
> > The next part to develop would be another round of improvement of
> template
> > support. The previous iteration was done in the latter half of 2016, and
> > allowed adapting a much wider array of templates, including infoboxes.
> > However, one important kind of template that is not yet supported well
> > enough is ones inside references (a.k.a. citations or footnotes), and
> this
> > will be the focus of the next iteration. We also plan to improve CX’s
> > template editor itself by allowing machine translation of template
> > parameter values, and by fixing several outstanding bugs in it.
> >
> > After finishing these two major projects, in early 2018 we expect to work
> > on fixing various remaining bugs, after which we plan to start declaring
> > Content Translation as non-beta in some languages. We are figuring out
> > which bugs exactly will these be; the current list is at
> > https://phabricator.wikimedia.org/project/view/2030/ , but it will
> likely
> > change somewhat before we get there. (Suggestions about what should go
> > there are welcome at any time.)
> >
> > Finally, two further future directions that we are thinking about
> > longer-term are:
> > 1. Translation List: Shared and personal lists of articles awaiting
> > translation ( https://phabricator.wikimedia.org/T96147 ). We already
> have
> > designs for it, but the implementation will have to wait until we fix the
> > more urgent issues above.
> > 2. Better support on mobile devices. This is complicated, but
> much-needed.
> > Some early thoughts about this can be found at
> >
> https://www.mediawiki.org/wiki/Content_translation/Product_Definition/Mobile_exploration
> > , but there will need to be much more design and development around this.
> >
> > You can see a more formal document about this here, although the content
> is
> > largely the same:
> >
> 

[Wikitech-l] Documentation for API query/revisions needs an update

2018-08-02 Thread Nicolas Vervelle
Hi,

Today, I noticed that I was getting API warnings in WPCleaner for each call
to the query/revisions API:
revisions - Because "rvslots" was not specified, a legacy format has been
used for the output. This format is deprecated, and in the future the new
format will always be used.

This rvslots parameter is not documented on API:Revisions, and I can't find
an email mentioning a compatibility problem on this subject.

I tried the API Sandbox, but the documentation for rvslots is cryptic to
me: "Which revision slots to return data for, when slot-related
properties are included in rvprops. If omitted, data from the main slot
will be returned in a backwards-compatible format."

Is there an understandable explanation of this parameter, what value I
should put, and what difference I should expect in the result?
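
From the warning and the Sandbox text, a minimal sketch of a query that opts
in to the new slot-aware format looks like this (the page title is an
arbitrary example; the exact output shape is my assumption):

<?php
// Minimal sketch of a revisions query that opts in to the slot-aware format.
// In practice 'main' is the only slot that regular pages have for now.
$url = 'https://fr.wikipedia.org/w/api.php?' . http_build_query([
    'action'        => 'query',
    'prop'          => 'revisions',
    'titles'        => 'Wikipédia:Accueil principal', // arbitrary example page
    'rvprop'        => 'ids|content',
    'rvslots'       => 'main',
    'format'        => 'json',
    'formatversion' => 2,
]);
$data = json_decode(file_get_contents($url), true);
$rev  = $data['query']['pages'][0]['revisions'][0];
// With rvslots, the wikitext moves from $rev['content'] (legacy format)
// to $rev['slots']['main']['content'].
echo mb_substr($rev['slots']['main']['content'], 0, 80, 'UTF-8'), "\n";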

Thanks
Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Important news about the November dumps run!

2017-11-07 Thread Nicolas Vervelle
Hi,

Are there problems with some dumps, like frwiki, with the new system?
On the your.org mirror, important files like pages-articles are still missing
from the 20171103 dump directory, when usually it only takes a day...

Nico

On Mon, Nov 6, 2017 at 8:01 PM, Ariel Glenn WMF  wrote:

> Rsync of xml/sql dumps to the web server is now running on a rolling basis
> via a script, so you should see updates regularly rather than "every
> $random hours".  There's more to be done on that front, see
> https://phabricator.wikimedia.org/T179857 for what's next.
>
> Ariel
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-17 Thread Nicolas Vervelle
On Thu, Jul 13, 2017 at 9:18 AM, Nicolas Vervelle <nverve...@gmail.com>
wrote:

>
>
> On Tue, Jul 11, 2017 at 5:05 PM, Subramanya Sastry <ssas...@wikimedia.org>
> wrote:
>
>> On 07/11/2017 05:13 AM, Nicolas Vervelle wrote:
>>
>> - In the page dedicated to a category, there's a column telling if the
>>>
>> problem is due to one template (and which one) or by several
>>> templates, but
>>> I don't get this information in the REST API for Linter. Is it
>>> possible to
>>> have it in the API result or should I deduce it myself where the
>>> offset
>>> given by the API matches a call to a template?
>>>
>>
>> Look for this in the template response.
>>
>> |"templateInfo": { "multiPartTemplateBlock": true }|
>>
>
> Thanks ! I have updated WPCleaner to display the information about the
> template (template name or multiple templates).
>

I've started adding detection in WPCleaner (error #532) for the
missing-end-tag error reported by Linter (I'm starting with the easy ones).

Is it normal that errors inside a gallery tag are reported as being an
error in a "multiPartTemplateBlock", while the content is directly inside
the page wikitext?
Examples on frwiki: Manali
<https://fr.wikipedia.org/w/index.php?title=Manali=edit=4555235>,
Zillis-Reischen
<https://fr.wikipedia.org/w/index.php?title=Zillis-Reischen=edit=485>
...
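
To make the question concrete, a sketch of the check: post a small snippet
with a gallery tag to the REST lint transform and see whether templateInfo
is set on the reported error (assuming the title-less variant of the
endpoint accepts a plain 'wikitext' form field):

<?php
// Sketch: run a small snippet through the REST lint transform and check how
// an unclosed tag inside <gallery> is reported. Assumption: the title-less
// variant of the endpoint accepts a plain 'wikitext' form field.
$wikitext = "<gallery>\nFichier:Exemple.jpg|<small>légende\n</gallery>\n";
$ch = curl_init('https://fr.wikipedia.org/api/rest_v1/transform/wikitext/to/lint');
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => http_build_query(['wikitext' => $wikitext]),
    CURLOPT_RETURNTRANSFER => true,
]);
$lints = json_decode(curl_exec($ch), true);
curl_close($ch);
foreach ((array) $lints as $lint) {
    // 'dsr' gives source offsets; 'templateInfo' is what the question is about.
    echo $lint['type'], ' dsr=', json_encode($lint['dsr'] ?? null),
         isset($lint['templateInfo']) ? ' templateInfo=' . json_encode($lint['templateInfo']) : '',
         "\n";
}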

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-13 Thread Nicolas Vervelle
On Tue, Jul 11, 2017 at 5:05 PM, Subramanya Sastry <ssas...@wikimedia.org>
wrote:

> On 07/11/2017 05:13 AM, Nicolas Vervelle wrote:
>
> - In the page dedicated to a category, there's a column telling if the
>>
> problem is due to one template (and which one) or by several
>> templates, but
>> I don't get this information in the REST API for Linter. Is it
>> possible to
>> have it in the API result or should I deduce it myself where the
>> offset
>> given by the API matches a call to a template?
>>
>
> Look for this in the template response.
>
> |"templateInfo": { "multiPartTemplateBlock": true }|
>

Thanks ! I have updated WPCleaner to display the information about the
template (template name or multiple templates).


I think I've found a discrepancy between Linter reports. On frwiki, the
page "Discussion:Yasser Arafat" is reported in the list for self-closed-tag
[1], but when I run the text of the page through the transform API [2], I
only get errors for obsolete-tag and mixed-content, and nothing for
self-closed-tag.

[1] https://fr.wikipedia.org/wiki/Sp%C3%A9cial:LintErrors/self-closed-tag
[2]
https://fr.wikipedia.org/api/rest_v1/#!/Transforms/post_transform_wikitext_to_lint_title_revision
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-12 Thread Nicolas Vervelle
On Wed, Jul 12, 2017 at 4:43 PM, Subramanya Sastry <ssas...@wikimedia.org>
wrote:

> On 07/12/2017 01:12 AM, Nicolas Vervelle wrote:
>
> Hi Subbu,
>>
>> Using the localized names, I've found that not all Linter categories are
>> listed in the API result. Is it normal ?
>> For example, on frwiki, Linter reports 3 "mixed-content" errors for "Les
>> Trolls (film)" but this category is not in the API siteinfo call.
>>
>
> Yup.
>
> Parsoid currently has detection for more patterns than are exposed via the
> Linter extension. Mixed content is more informational at this point - it
> will become relevant when we are ready to start nudging markup towards
> being more well-formed / well-balanced than it is now.
>
> This was raised earlier on the Linter Extension talk page as well (
> https://www.mediawiki.org/w/index.php?title=Topic:Tszvb85ccd
> 0thbeo_showPostId=tteddfdly7fin8p6#flow-post-tteddfdly7fin8p6 )
>

Ok, I will only report patterns known by the Linter extension then.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-12 Thread Nicolas Vervelle
Hi Subbu,

Using the localized names, I've found that not all Linter categories are
listed in the API result. Is that normal?
For example, on frwiki, Linter reports 3 "mixed-content" errors for "Les
Trolls (film)", but this category is not in the API siteinfo call.

Nico

On Wed, Jul 12, 2017 at 8:02 AM, Nicolas Vervelle <nverve...@gmail.com>
wrote:

>
>
> On Tue, Jul 11, 2017 at 5:05 PM, Subramanya Sastry <ssas...@wikimedia.org>
> wrote:
>
>> On 07/11/2017 05:13 AM, Nicolas Vervelle wrote:
>>
>> But I have a few questions / suggestions regarding Linter for the moment:
>>>
>>> - Is it possible to also retrieve the localized names of the Linter
>>> categories and priorities: for example, on frwiki, you can see on the
>>> Linter page [1] that the high priority is translated into "Priorité
>>> haute"
>>> and that self-closed-tag has a user friendly name "Balises
>>> auto-fermantes".
>>> I don't see the localized names in the information sent by the API
>>> for
>>> siteinfo.
>>>
>>
>> Okay, will file a bug and take a look at this.
>
>
> I used Arlo's answer, and I'm getting the localized names from the messages,
> so I can do without the localized names in the Linter answers.
>
> Nico
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-12 Thread Nicolas Vervelle
On Tue, Jul 11, 2017 at 5:05 PM, Subramanya Sastry <ssas...@wikimedia.org>
wrote:

> On 07/11/2017 05:13 AM, Nicolas Vervelle wrote:
>
> But I have a few questions / suggestions regarding Linter for the moment:
>>
>> - Is it possible to also retrieve the localized names of the Linter
>> categories and priorities: for example, on frwiki, you can see on the
>> Linter page [1] that the high priority is translated into "Priorité
>> haute"
>> and that self-closed-tag has a user friendly name "Balises
>> auto-fermantes".
>> I don't see the localized names in the information sent by the API
>> for
>> siteinfo.
>>
>
> Okay, will file a bug and take a look at this.


I used Arlo's answer, and I'm getting the localized names from the messages,
so I can do without the localized names in the Linter answers.
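
For reference, a sketch of fetching such a localized name from the wiki's
messages; the message key shown is my assumption of how the Linter extension
names them, so adjust it if the real key differs:

<?php
// Sketch: fetch a localized Linter category name from the wiki's messages.
// Assumption: the key follows a 'linter-category-<category>' pattern;
// adjust it if the extension uses a different key.
$url = 'https://fr.wikipedia.org/w/api.php?' . http_build_query([
    'action'        => 'query',
    'meta'          => 'allmessages',
    'ammessages'    => 'linter-category-self-closed-tag',
    'amlang'        => 'fr',
    'format'        => 'json',
    'formatversion' => 2,
]);
$data = json_decode(file_get_contents($url), true);
foreach ($data['query']['allmessages'] as $msg) {
    echo $msg['name'], ' => ', $msg['content'] ?? '(missing)', "\n";
}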

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-11 Thread Nicolas Vervelle
Hi Subbu !

I have barely started using WPCleaner to fix some errors reported by
Linter, and I know I still have work to do on WPCleaner to make it easier
for users.
But I have a few questions / suggestions regarding Linter for the moment:

   - Is it possible to also retrieve the localized names of the Linter
   categories and priorities? For example, on frwiki, you can see on the
   Linter page [1] that the high priority is translated into "Priorité haute"
   and that self-closed-tag has a user-friendly name, "Balises auto-fermantes".
   I don't see the localized names in the information sent by the API for
   siteinfo.
   - Where is it possible to change the description displayed on each page
   dedicated to a category? For example, the page for self-closed-tag [2] is
   very short. It would be nice to be able to add a description of what the
   error is, what problems it can cause, and what the solutions are to fix it
   (or to be able to link to a page explaining all that).
   - On the page dedicated to a category, there's a column telling whether the
   problem is caused by one template (and which one) or by several templates, but
   I don't get this information in the REST API for Linter. Is it possible to
   have it in the API result, or should I deduce it myself from whether the
   offset given by the API matches a call to a template?


[1] https://fr.wikipedia.org/wiki/Sp%C3%A9cial:LintErrors
[2] https://fr.wikipedia.org/wiki/Sp%C3%A9cial:LintErrors/self-closed-tag



On Thu, Jul 6, 2017 at 2:02 PM, Subramanya Sastry 
wrote:

> How to read this post?
> --
> * For those without time to read lengthy technical emails,
>   read the TL;DR section.
> * For those who don't care about all the details but want to
>   help with this project, you can read sections 1 and 2 about Tidy,
>   and then skip to section 7.
> * For those who like all their details, read the post in its entirety,
>   and follow the links.
>
> Please ask follow up questions on wiki *on the FAQ’s talk page* [0]. If you
> find a bug, please report it *on Phabricator or on the page mentioned
> above*.
>
> TL;DR
> -
> The Parsing team wants to replace Tidy with a RemexHTML-based solution on
> the
> Wikimedia cluster by June 2018. This will require editors to fix pages and
> templates to address wikitext patterns that behave differently with
> RemexHTML.  Please see 'What editors will need to do' section on the Tidy
> replacement FAQ [1].
>
> 1. What is Tidy?
> 
> Tidy [2] is a library currently used by MediaWiki to fix some HTML errors
> found in wiki pages.
>
> Badly formed markup is common on wiki pages when editors use HTML tags in
> templates and on the page itself. (Ex: unclosed HTML tags, such as a
> 
> without a , are common). In some cases, MediaWiki can generate
> erroneous HTML by itself. If we didn't fix these before sending it to
> browsers, some would display things in a broken way to readers.
>
> But Tidy also does other "cleanup" on its own that is not required for
> correctness. Ex: it removes empty elements and adds whitespace between HTML
> tags, which can sometimes change rendering.
>
> 2. Why replace it?
> --
> Since Tidy is based on HTML4 semantics and the Web has moved to HTML5, it
> also makes some incorrect changes to HTML to 'fix' things that used to not
> work; for example, Tidy will unexpectedly move a bullet list out of a table
> caption even though that's allowed. HTML4 Tidy is no longer maintained or
> packaged. There have also been a number of bug reports filed against Tidy
> [3]. Since Parsoid is based on HTML5 semantics, there are differences in
> rendering between Parsoid's rendering of a page and current read view that
> is based on Tidy.
>
> 3. Project status
> -
> Given all these considerations, the Parsing team started work to replace
> Tidy
> [4] around mid-2015. Tim Starling started this work and after a survey of
> existing options, decided to write a wrapper over a Java-based HTML5
> parser.
> At the time we started the project, we thought we could probably have Tidy
> replaced by mid-2016. Alas!
>
> 4. What is replacing Tidy?
> --
> Tidy will be replaced by a RemexHTML-based solution that uses the
> RemexHTML[5] library along with some Tidy-compatibility shims to ensure
> better parity with the current rendering. RemexHTML is a PHP library that
> Tim
> wrote with C.Scott’s input that implements the HTML5 parsing spec.
>
> 5. Testing and followup
> ---
> We knew that some pages will be affected and need fixing due to this
> change.
> In order to more precisely identify what that would be, we wanted to do
> some
> thorough testing. So, we built some new tools [6][7] and overhauled and
> upgraded other test infrastructure [8][9] to let us evaluate the impacts of
> replacing Tidy (among other such things in the future) which can be a
> subject
> of a post all on its own.
>
> You 

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-09 Thread Nicolas Vervelle
On Sat, Jul 8, 2017 at 12:54 AM, Subramanya Sastry 
wrote:

>
> - On the Full Analysis window, the second button with a globe and a
>>> broom (Subbu, would you have a recommended icon for Linter related
>>> stuff ?)
>>>
>>
>> I will have to get back to you on this. I'll have to get some help from
>> someone who can design / recommend something appropriate here.
>>
>
> I added a logo to https://www.mediawiki.org/wiki/Extension:Linter
>
>
Thanks, I've included it in WPCleaner :-)

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-06 Thread Nicolas Vervelle
Hi,

On Thu, Jul 6, 2017 at 5:53 PM, Subramanya Sastry 
wrote:

> 3. How feasible would it be to build bots to make 90% of high priority
>> fixes and 90% of all fixes?
>>
>
> Since the start of the Linter project (when we started off with the GSoC
> prototype in summer of 2014, and once again when Kunal picked it up in
> 2016), we have been in conversation with Nico V (frwiki and who maintains
> WPCleaner) and with Marios Magioladitis and Bryan White (Checkwiki) to
> integrate the output with their projects / tools. On Nico's request, we
> have added API endpoints to Linter, Parsoid, and RESTBase so that the tool
> can programmatically fetch linter issues, and let editors / bots fix them
> appropriately.
>

I'm happy to announce that I have just released WPCleaner [1] version 1.43,
which brings better integration with the Linter extension.
It's a first step, but I hope it can already help in fixing errors reported
by Linter.

The features related to the Linter extension are the following:

   - On the main WPCleaner window, there's a "Linter categories" button
   which gives the list of categories that Linter is detecting (see the
   sketch after this list). Clicking on one of the categories returns the
   list of pages detected by Linter for this category. From the list of
   pages, you can go to the Full Analysis window for the pages that you
   want to fix.
   - On the Full Analysis window, the second button with a globe and a
   broom (Subbu, would you have a recommended icon for Linter-related stuff?)
   allows you to retrieve the list of errors still detected by Linter on the
   current text: for each error, there's a magnifying glass that brings you to
   the location of the error in the text. You can then fix errors and check if
   Linter still finds something in your current version.
   - On the Check Wiki window, there's a similar button.
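
A rough sketch of the query behind the "Linter categories" feature above;
the linterrors list module and its 'lnt'-prefixed parameters come from the
Linter extension, but treat the exact parameter names as assumptions to
check in the API sandbox:

<?php
// Rough sketch: list pages that Linter flags for one category.
// The 'linterrors' list module and its 'lnt'-prefixed parameters are provided
// by the Linter extension; the exact names should be checked in the sandbox.
$url = 'https://fr.wikipedia.org/w/api.php?' . http_build_query([
    'action'        => 'query',
    'list'          => 'linterrors',
    'lntcategories' => 'self-closed-tag',
    'lntlimit'      => 50,
    'format'        => 'json',
    'formatversion' => 2,
]);
$data = json_decode(file_get_contents($url), true);
foreach ($data['query']['linterrors'] as $error) {
    echo ($error['title'] ?? '?'), ' : ', ($error['category'] ?? ''), "\n";
}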

Subbu, I have a question about the result returned by the API to transform
wikitext to lint : in the "dsr" fields, what is the meaning of the 4th
value? (the first one is the beginning of the error, the second one is the
end of the error, the third one is the length of the error...)

Nico


[1] https://en.wikipedia.org/wiki/Wikipedia:WPCleaner
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] CX damaging wikis : any plan to fix ?

2015-10-30 Thread Nicolas Vervelle
On Fri, Oct 30, 2015 at 2:58 PM, Runa Bhattacharjee <
rbhattachar...@wikimedia.org> wrote:

> The tool is in active development and despite the issues reported, we still
> think that the overall balance of the content contributed is positive.
> From our observation, users translating content with Content
> Translation are normally editing the articles after creation to improve
> them, and the deletion ratio is much lower compared to that of articles
> created from scratch. The purpose of the tool is to facilitate the creation
> of those first versions to be later evolved. Disabling the tool would also
> impact the workflow of many users who chose to use Content Translation
> because it saves them time.
>

As has become usual with tools developed by the WMF (VE, Flow, ...), I think
you disregard each time the damage these tools do to wikis,
preferring to deploy them in alpha/beta versions to a wide audience before
making them stable. I clearly think it's a big mistake.
"The tool is in active development": I reported most of the problems in
the early days of the CX release, and none of them seems to have been fixed,
so clearly damage is not among your primary concerns; you prefer to deploy it
widely rather than take a more cautious approach of fixing the major bugs
before expanding the audience.

Why do some users now have a workflow built on a really buggy tool?
Simply because of the approach you took...
Given the number of articles I happen to fix myself, I can assure you that
most of the CX users never fix the problems created by CX...
So yes, it saves them time, at the expense of the time of other people who
try to keep the encyclopedia clean.
So, thanks for the extra work...


Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] New (?) problem with named refs

2015-10-29 Thread Nicolas Vervelle
Hi,

It seems there's a new problem with named refs: several articles are now
displaying errors about several references having the same name but
different content.
See my description on https://phabricator.wikimedia.org/T117037

It happens with {{#tag:ref| |...}}, when there's whitespace instead of
nothing in the content part. The whitespace seems to be considered content
of its own, rather than being considered empty.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] CX damaging wikis : any plan to fix ?

2015-10-29 Thread Nicolas Vervelle
Hi,

Going into rant mode...

CX (Content Translation) has been deployed on production wikis for at least
6 months, and after all this time, most of the articles created with this
tool still contain many syntax problems.
Phabricator tasks were created months ago, and almost nothing seems to
have been done to fix them.

So, is there any plan to deactivate CX until the major bugs are fixed (to stop
the creation of damaged articles), or any plan to fix the bugs quickly?
I tried posting this kind of question on the CX talk page months
ago: the answer was that the bugs were almost fixed. Several months after
that: same situation, bugs still here, even with some new ones...

Examples by taking the last 5 CX edits on frwiki :

   - David Borwein
   

   : almost no problems, but because the editor translated only one sentence,
   the article is otherwise completely in English. Even with no edits, basic
   problem of templates called with the {{Modèle:...}} prefix (equivalent to
   {{Template:...}} in English)
   - Grande Riviere
   

   : many problems : template prefix, nowiki tags in bad places, several
   references with the same name and the same content duplicated (the goal of
   the name is to have the content once, not in every reference), whitespace
   included at the end of internal links (reason for some nowiki tags),
   coordinates so badly handled that it results in several lines of span tags
   and complex code
   - Jozef Gregor-Tajovsky
   

   : less than 500 bytes, but with the stub category added directly instead of
   via the templates that should add it
   - Silva Semadeni
   

   : trailing punctuation included in internal links, unnecessary div tags,
   internal links with only nowiki tags as the displayed text (so invisible
   links), unnecessary span tags, preceding whitespace included in internal
   links
   - David Steel
   

   : almost empty, but with internal CX data added

5 articles checked, not one correct.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Upcoming SyntaxHighlight_GeSHi changes

2015-07-15 Thread Nicolas Vervelle
Hi,

Is this related to the fact that some pages are now categorized in Pages
with syntax highlighting errors
https://en.wikipedia.org/wiki/Category:Pages_with_syntax_highlighting_errors
?
How can we find what is wrong in the page ?
I tried several things to fix Wikipedia talk:WPCleaner, but didn't manage
to remove the categorization.
https://en.wikipedia.org/wiki/Wikipedia_talk:WPCleaner

Is it also related to a change of behavior when there are nowiki tags
inside the text ?
I think they were necessary before in some cases, and are simply not
recognized (treated as plain text) now.

Nico

On Tue, Jun 23, 2015 at 2:48 AM, Ori Livneh o...@wikimedia.org wrote:

 Hello,

 Over the course of the next two days, a major update to the
 SyntaxHighlight_GeSHi extension will be rolled out to Wikimedia wikis. The
 change swaps geshi, the unmaintained PHP library which performs the lexical
 analysis and output formatting of code, for another library, called
 Pygments.

 The roll-out will remove support for 31 languages while adding support for
 several hundred languages not previously supported, including Dart, Rust,
 Julia, APL, Mathematica, SNOBOL, Puppet, Dylan, Racket, Swift, and many
 others. See https://people.wikimedia.org/~ori/geshi_changes.txt for a
 full list. The languages that will lose support are mostly obscure, with
 the notable exception of ALGOL68, Oz, and MMIX.

 The change is expected to slightly improve the time it takes to load and
 render all pages on all wikis (not just those that contain code blocks!),
 at the cost of a slight penalty (about a tenth of a second) on the time it
 takes to save edits which introduce or modify a block of highlighted code
 to an article.

 Lastly, the way the extension handles unfamiliar languages will change.
 Previously, if the specified language was not supported by the extension,
 instead of a code block, the extension would print an error message. From
 now on, it will simply output a plain, unhighlighted block of monospaced
 code.

 The wikitext syntax for highlighting code will remain the same.

 -- Ori
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] global cleanup of nowiki

2015-06-30 Thread Nicolas Vervelle
On Tue, Jun 30, 2015 at 10:31 PM, C. Scott Ananian canan...@wikimedia.org
wrote:

 On Mon, Jun 22, 2015 at 11:14 AM, Nicolas Vervelle nverve...@gmail.com
 wrote:

 - Second, I'm not a big fan of VE changing wikitext in parts not
 modified by the user: experience shows that it messes the diffs, and
  makes
 watching what VE is doing a lot more difficult. It has been requested
 several times that VE doesn't start modifying wikitext in places not
 modified by the user.
 

 In case it wasn't clear, this is already the case.  Parsoid/VE uses
 selective serialization to avoid touching unmodified content.  This
 feature has been present since the beginning.


Yes, I'm aware of that, but I was answering this because it was suggested
previously in the discussion to use VE to do the cleanup...

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Upcoming SyntaxHighlight_GeSHi changes

2015-06-23 Thread Nicolas Vervelle
On Tue, Jun 23, 2015 at 2:48 AM, Ori Livneh o...@wikimedia.org wrote:

 Lastly, the way the extension handles unfamiliar languages will change.
 Previously, if the specified language was not supported by the extension,
 instead of a code block, the extension would print an error message. From
 now on, it will simply output a plain, unhighlighted block of monospaced
 code.


That's really nice, but I have a request.
Previously, the error message contained the list of languages supported by
the extension, so it was easy to find which value you should use.
Without the error message, how do we easily get the list of languages that
can really be used ?
By easily, I mean directly accessible from the page we are currently
editing.
Maybe something like a small icon in the code block with either a tooltip
listing languages or a link to a page with the list of languages.

Thanks
Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] global cleanup of nowiki

2015-06-22 Thread Nicolas Vervelle
It would be nice to have a global cleanup at some point, but it won't be
able to handle every situation.
I don't think relying on VE to clean up is a good idea:

   - First, it will take a long time before all articles are edited with VE
   (maybe never).
   - Second, I'm not a big fan of VE changing wikitext in parts not
   modified by the user: experience shows that it messes up the diffs, and makes
   watching what VE is doing a lot more difficult. It has been requested
   several times that VE not start modifying wikitext in places not
   modified by the user.


Things that are probably safe to fix automatically (a rough sketch follows
the list):

   - Whitespace characters between nowiki tags at the beginning of a line:
   remove everything, including the whitespace characters.
   - Whitespace characters between nowiki tags not at the beginning of a
   line: remove the tags, keep the whitespace characters.
   - Some characters (letters, digits, ...) between nowiki tags: remove the
   tags, keep the characters.
   - In a table, cell content with only a dash between nowiki tags: remove the
   tags, add a whitespace character before the dash.
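
A rough sketch of the first two rules above as regex replacements
(hypothetical patterns for illustration, not WPCleaner's actual Java code):

<?php
// Rough sketch of the first two "safe" rules as regex replacements;
// hypothetical patterns for illustration, not WPCleaner's actual (Java) code.
$text = "<nowiki> </nowiki>Some line\nfoo<nowiki>  </nowiki>bar\n";

// Rule 1: whitespace-only nowiki pair at the beginning of a line:
// drop the tags and the whitespace.
$text = preg_replace('/^<nowiki>[ \t]*<\/nowiki>/m', '', $text);

// Rule 2: whitespace-only nowiki pair elsewhere: drop the tags, keep the whitespace.
$text = preg_replace('/<nowiki>([ \t]+)<\/nowiki>/', '$1', $text);

echo $text; // "Some line\nfoo  bar\n"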

Self-closing nowiki tags (<nowiki/>) are more difficult to fix automatically,
I think:

   - Between quotes: allows mixing a real quote with italics formatting
   - After the end of a wikilink: prevents the wikilink from extending to the
   following text (often an error due to a bug in VE, but sometimes it may be normal)
   - ...

Nico


On Sun, Jun 21, 2015 at 8:43 PM, Amir E. Aharoni 
amir.ahar...@mail.huji.ac.il wrote:

 Thanks Arlo. I added a few.

 But I'm not sure that it answers my original question: Will this be done
 every time a page happens to be edited in VE and saved, or will it be done
 globally on all pages in all wikis as some kind of a maintenance job?


 --
 Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
 http://aharoni.wordpress.com
 ‪“We're living in pieces,
 I want to live in peace.” – T. Moore‬

 2015-06-20 19:45 GMT+03:00 Arlo Breault abrea...@wikimedia.org:

  On Friday, June 19, 2015 at 1:38 AM, Amir E. Aharoni wrote:
   There may be more - I'm still looking for these.
 
 
  If you find any, please propose them on the Parsoid’s normalization talk
  page [0].
  I’ve added the ones you’ve mentioned so far.
 
  We’ve documented [1] what’s currently been implemented.
 
  A few months back, Subbu solicited feedback [2] on what style norms
 should
  be enforced. We’ve since added a `scrubWikitext` parameter to Parsoid’s
 API
  that clients (like VE) can benefit from.
 
  Cleaning up our past transgressions is great. Helping to prevent their
  continued
  existence is even better.
 
  I was reading the discussion on gradually enabling VE for new accounts
 [3]
  and
  Kww writes there,
 
  Further, we still have issues with stray nowiki tags being scattered
  across articles.
  Until those are addressed, the notion that VE doesn't cause extra work
 for
  experienced editors is simply a sign that the metrics used to analyze
  effort were
  wrong. Jdforrester, can you explain how a study that was intended to
  measure
  whether VE caused extra work failed to note that even with the current
  limited use,
  it corrupts articles at this kind of volume [4]? Why would we want to
  encourage
  such a thing?”
 
  Makes me sad.
 
 
  [0] https://www.mediawiki.org/wiki/Talk:Parsoid/Normalizations
  [1] https://www.mediawiki.org/wiki/Parsoid/Normalizations
  [2]
  https://lists.wikimedia.org/pipermail/wikitech-l/2015-April/081453.html
  [3]
 
 https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28proposals%29#Gradually_enabling_VisualEditor_for_new_accounts
  [4]
 
 https://en.wikipedia.org/w/index.php?title=Special:AbuseLogoffset=limit=500wpSearchFilter=550
 
 
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Superprotect user right, Comming to a wiki near you

2014-08-10 Thread Nicolas Vervelle
On 10 August 2014 at 15:35, svetlana svetl...@fastmail.com.au wrote:

 On Sun, 10 Aug 2014, at 23:19, K. Peachey wrote:
  Lets all welcome the new overlord Erik.
 
  Add a new protection level called superprotect
  Assigned to nobody by default. Requested by Erik Möller for the purposes
  of protecting pages such that sysop permissions are not sufficient to
  edit them.
  Change-Id: Idfa211257dbacc7623d42393257de1525ff01e9e
  
https://gerrit.wikimedia.org/r/#q,Idfa211257dbacc7623d42393257de1525ff01e9e,n,z

 
  https://gerrit.wikimedia.org/r/#/c/153302/

 This change solves a problem that does not exist.
 We either trust sysops, or we don't.

 Erik Moeller wrote:
  In the long run, we will want to
  apply a code review process to these
  changes as with any other deployed code

 I hope such things will not need to go through the WMF. Or is that what
you'd like?

I hope it's not another step from the WMF to prevent the application of
community decisions when they don't agree with them. I fear that they will use
this to bypass community decisions, for example by forcing VE on
everyone on enwiki again: last year, sysops were able to apply the community
decision against Erik's wishes only because they had access to the site-wide
JS or CSS.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Superprotect user right, Comming to a wiki near you

2014-08-10 Thread Nicolas Vervelle
On 10 August 2014 at 17:06, James HK jamesin.hongkon...@gmail.com wrote:

 Hi,

  It could mean that, but of course it is actually introduced to prevent
  the German community from deactivating the Media Viewer.

>  User JEissfeldt removed `mw.config.set(wgMediaViewerOnClick,
>  false);` from Common.js [0] and is the same person who set
 `protect-level-superprotect`.

 I have no idea what the German community wants or doesn't want but
 using `protect-level-superprotect` to block potential edits is rather
 questionable.

 [0]
https://de.wikipedia.org/w/index.php?title=MediaWiki:Common.jsdiff=prevoldid=132946422


Thanks for the diff... That shows what this superprotect power is really
for: the WMF forcing something against community wishes/discussion. My fear
wasn't unfounded. Clearly a huge step backwards for the wiki philosophy.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Bot flags and human-made edits

2014-05-19 Thread Nicolas Vervelle
On Tue, May 20, 2014 at 3:39 AM, Dan Garry dga...@wikimedia.org wrote:

 On 19 May 2014 19:36, Amir Ladsgroup ladsgr...@gmail.com wrote:

  As a bot operator I think API parameter about flagging bot or not is
  necessary
 

 Sure, but as I'm not a bot operator, can you explain why and what you use
 this for, to help me understand? :-)


I think an example of a bot using this API parameter is Salebot on frwiki:
when it reverts vandalism, the edit is not marked with the bot flag, so
that it appears in the recent changes and makes humans aware that
vandalism has been reverted. For other tasks, its edits may be marked
with the bot flag.
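
A minimal sketch of the difference at the API level (assuming an
authenticated session already exists; the page name, summaries, and token
are made up):

<?php
// Minimal sketch of the difference at the API level. Assumes an authenticated
// session; the page name and summaries are made up for illustration.
$csrfToken = 'REPLACE_ME'; // obtained beforehand via action=query&meta=tokens
$base = [
    'action' => 'edit',
    'title'  => 'Utilisateur:ExempleBot/Journal', // hypothetical page
    'text'   => '...',
    'token'  => $csrfToken,
    'format' => 'json',
];

// Revert of vandalism: leave out 'bot', so the edit stays visible in the
// default recent changes view.
$revert = $base + ['summary' => 'Révocation de vandalisme'];

// Routine task: add 'bot' => '1', so the edit is marked with the bot flag
// (only honoured if the account actually has the bot right).
$maintenance = $base + ['summary' => 'Maintenance', 'bot' => '1'];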

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Visual Editor is trashing every article on French Wiki

2013-11-04 Thread Nicolas Vervelle
Currently, Visual Editor is badly trashing every article that is edited on
the French Wikipedia: every accented character is replaced by strange
characters all over the article.

Could you please stop VE immediately?

How can such a mess make its way into production? Are there no real tests
beforehand?

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Visual Editor is trashing every article on French Wiki

2013-11-04 Thread Nicolas Vervelle
Well, yes, it's more than urgent: several articles are trashed every minute.
How can something like that be unleashed onto big wikis without proper
testing?


On Mon, Nov 4, 2013 at 9:40 PM, Dan Garry dga...@wikimedia.org wrote:

 I'm not involved in the VisualEditor development, but I've tried to pass on
 this to one of the engineers involved in VisualEditor so he's aware of it.
 Obviously this is an urgent problem that needs fixing.

 Thanks,
 Dan


 On 4 November 2013 20:29, Nicolas Vervelle nverve...@gmail.com wrote:

  Currently, Visual Editor is trashing badly every article that is edited
 on
  French Wiki: every accented character is replaced by strange characters
 all
  over the article.
 
  Could you please stop immediately VE ?
 
  How can such a mess make its way into production ? Is there no real tests
  before ?
 
  Nico
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l




 --
 Dan Garry
 Associate Product Manager for Platform
 Wikimedia Foundation
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Visual Editor is trashing every article on French Wiki

2013-11-04 Thread Nicolas Vervelle
Yes,

it's happening everywhere, but English doesn't have accented characters, so
the damage is more limited.



On Mon, Nov 4, 2013 at 9:43 PM, Risker risker...@gmail.com wrote:

 Likely related to this bugzilla:
 https://bugzilla.wikimedia.org/show_bug.cgi?id=50296

 It is also happening on English Wikipedia, according to the VE feedback
 page.

 Risker

 On 4 November 2013 15:40, Dan Garry dga...@wikimedia.org wrote:

  I'm not involved in the VisualEditor development, but I've tried to pass
 on
  this to one of the engineers involved in VisualEditor so he's aware of
 it.
  Obviously this is an urgent problem that needs fixing.
 
  Thanks,
  Dan
 
 
  On 4 November 2013 20:29, Nicolas Vervelle nverve...@gmail.com wrote:
 
   Currently, Visual Editor is trashing badly every article that is edited
  on
   French Wiki: every accented character is replaced by strange characters
  all
   over the article.
  
   Could you please stop immediately VE ?
  
   How can such a mess make its way into production ? Is there no real
 tests
   before ?
  
   Nico
   ___
   Wikitech-l mailing list
   Wikitech-l@lists.wikimedia.org
   https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 
 
 
  --
  Dan Garry
  Associate Product Manager for Platform
  Wikimedia Foundation
   ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] OAuth

2013-08-23 Thread Nicolas Vervelle
On Wed, Aug 21, 2013 at 5:04 PM, Chris Steipp cste...@wikimedia.org wrote:

 On Wed, Aug 21, 2013 at 2:05 AM, Nicolas Vervelle nverve...@gmail.com
 wrote:

  Hi,
 
  I'm completely new to OAuth, so bear with me if my questions are basic
 or I
  missed a point ;-)
  It seems interesting, but seems very oriented for web applications, not
 so
  much for desktop applications.
 

 This is true, for exactly the reason you were asking about-- the secret key
 needs to be kept private, which is impossible when you distribute the
 application to other users. OAuth 2 has a framework for dealing with this,
 but it makes controlling consumers nearly impossible. So we wanted to start
 with OAuth 1 while everyone gets familiar with the concepts, and we see
 which use cases actually get used. We may extend the framework to allow
 situations like this in the future.

 The best workaround now is probably to have each user register their copy
 of your desktop application as its own consumer. It's a little ugly having
 to give your user instructions on cutting and pasting tokens and keys
 around, but it can work (in the early days of Salesforce, several OAuth
 apps were configured this way).


That seems very complex for users, so I won't go that way for WPCleaner.
Is it possible to use only one client, with the secret key included in the
distribution?
(A user with enough determination will be able to extract it.)
This would mean that there's no 100% certainty about the client being the
true one.
But an attacker would only be able to impersonate the application, not
the user.



 
  I'm interested in developing this for WPCleaner [1], which is a desktop
  application.
  Is the callback URL required ? If so, which one should you use for a
  desktop application ?
 

 For bots too, I'd like to have the extension implement something like
 https://developers.google.com/accounts/images/OauthUX_nocallback.pngdirectly
 in the extension, but that wasn't something we were able to finish before
 this release.


Ok, so unless there's a mechanism to work without a callback URL, there's no
way for a desktop application to work.
If something like that is implemented, I will look at OAuth again for
WPCleaner.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] OAuth

2013-08-21 Thread Nicolas Vervelle
Hi,

I'm completely new to OAuth, so bear with me if my questions are basic or I
missed a point ;-)
It seems interesting, but it seems very oriented toward web applications, not
so much toward desktop applications.

I'm interested in implementing this for WPCleaner [1], which is a desktop
application.
Is the callback URL required? If so, which one should you use for a
desktop application?

Has anyone implemented the connection to WMF wikis using OAuth in Java?

For this to work, you request client tokens (including a secret key) for the
client: do these tokens need to be kept private?
I'm wondering because keeping secrets in an open-source desktop
application is not easy.

Nico

[1] http://en.wikipedia.org/wiki/Wikipedia:WPCleaner



On Wed, Aug 21, 2013 at 6:15 AM, Chris Steipp cste...@wikimedia.org wrote:

 As mentioned earlier this week, we deployed an initial version of the OAuth
 extension to the test wikis yesterday. I wanted to follow up with a few
 more details about the extension that we deployed (although if you're just
 curious about OAuth in general, I recommend starting at oauth.net, or
 https://www.mediawiki.org/wiki/Auth_systems/OAuth):

 * Use it: https://www.mediawiki.org/wiki/Extension:OAuth#Using_OAuthshould
 get you started towards using OAuth in your application.

 * Demo: Anomie set up an excellent initial app (I think it counts as our first
 official, approved consumer) here
 https://tools.wmflabs.org/oauth-hello-world/. Feel free to try it out, so
 you can get a feel for the user experience as a user!

 * Timeline: We're hoping to get some use this week, and deploy to the rest
 of the WMF wikis next week if we don't encounter any issues.

 * Bugs: Please open bugzilla tickets for any issues you find, or
 enhancement requests--

 https://bugzilla.wikimedia.org/enter_bug.cgi?product=MediaWiki%20extensionscomponent=OAuth


 And some other details for the curious:

 * Yes, you can use this on your own wiki right now! It's meant to be used
 in a single or shared environment, so the defaults will work on a
 standalone wiki. Input and patches are welcome, if you have any issues
 setting this up on your own wiki.

 * TLS: Since a few of you seem to care about https... The extension
 currently implements OAuth 1.0a, which is designed to be used without https
 (except to deliver the shared secret to the app owner, when the app is
 registered). So calls to the API don't need to use https.

 * Logging: All edits are tagged with the consumer's id (CID), so you can
 see when OAuth was used to contribute an edit.

 Enjoy!
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Add tags when saving an edit ?

2013-08-16 Thread Nicolas Vervelle
Hi,

Abuse Filter extension, Visual Editor, ... are able to create tags when
edits are saved.
Is it possible to do the same kind of things when using the API to edit a
page ?

I'd like to be able to add tags when I save a page using WPCleaner [1] for
several purposes:
* marking the edit as being done by WPCleaner, like what Visual Editor is
doing for its own edits
* when fixing errors for project Check Wiki [2], adding a tag for each kind
of error that has been fixed
* and probably other uses in the future

Having this kind of tag could help track what tools are doing, if they
implemented this.
I know I could use it to see how WPCleaner is used and, if a problem is
reported, to check whether several edits need to be fixed.

Nico

[1] http://en.wikipedia.org/wiki/Wikipedia:WPCleaner
[2] http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Check_Wikipedia
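
(Side note for readers of the archive : this became possible later, when the
edit API gained a tags parameter. A minimal sketch of such a call — the tag
name, page title and token handling are made up, this is not WPCleaner code :)

<?php
// Sketch: tagging an edit through the API (newer MediaWiki versions).
$editToken = '...'; // assumed to be fetched beforehand (action=query&meta=tokens)
$params = array(
    'action'     => 'edit',
    'format'     => 'json',
    'title'      => 'Sandbox',            // made-up page
    'appendtext' => "\nTest edit.",
    'summary'    => 'Testing tagged edits',
    'tags'       => 'wpcleaner',          // the tag must already be defined on the wiki
    'token'      => $editToken,
);
$ch = curl_init( 'https://en.wikipedia.org/w/api.php' );
curl_setopt( $ch, CURLOPT_POST, true );
curl_setopt( $ch, CURLOPT_POSTFIELDS, http_build_query( $params ) );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
$result = curl_exec( $ch );
curl_close( $ch );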
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Add tags when saving an edit ?

2013-08-16 Thread Nicolas Vervelle
Thanks,

I will add a comment there :-)

Nico


On Fri, Aug 16, 2013 at 8:30 PM, Bartosz Dziewoński matma@gmail.comwrote:

 No, but there's a bug about this[1] and there was a now-abandoned patch[2].

 [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=18670
 [2] https://gerrit.wikimedia.org/r/#/c/64650/

 --
 Matma Rex

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Remove 'visualeditor-enable' from $wgHiddenPrefs

2013-07-27 Thread Nicolas Vervelle
On Sat, Jul 27, 2013 at 3:42 AM, Erik Moeller e...@wikimedia.org wrote:

 On Mon, Jul 22, 2013 at 8:44 PM, Tim Starling tstarl...@wikimedia.org
 wrote:

  Newcomers with the VisualEditor were ~43% less likely to save a
  single edit than editors with the wikitext editor (x^2=279.4,
  p<0.001), meaning that Visual Editor presented nearly a 2:1 increase
  in editing difficulty.

 For the record, this datapoint included in the draft (!) analysis was
 due to faulty instrumentation. The correct numbers show only a
 marginally significant difference between VisualEditor and wikitext
 for an edit within 72 hours [1], with the caveats already given in my
 earlier response.

 Erik

 [1]
 https://meta.wikimedia.org/wiki/Research:VisualEditor%27s_effect_on_newly_registered_editors/Results#Editing_ease


Erik,

Unless I'm mistaken, there's something missing in the study. The test
users had VE enabled, whereas the control users had VE disabled.
The study seems to assume that test users always used VE, when in fact
they had the choice to use it or not.
Since it seems that a good proportion of new users with VE enabled are
actually not using VE, the difference between the 2 sets of users may be
greatly decreased by the fact that many users in the test population are
using the wikitext editor.

Can you explain if I misread the study, or how you took that into account ?

Thanks,
Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Remove 'visualeditor-enable' from $wgHiddenPrefs

2013-07-23 Thread Nicolas Vervelle
I was glad to see some WMF members speak their mind against the current
stance from Erik and James.
But when I see Erik's answer, it's clear to me that WMF management
is simply being blind.
Erik, if I read your reply correctly :

   - You still haven't analysed the A/B test period : first evidence
   shows negative feedback, but you think further analysis could
   show otherwise. Yet, with first evidence pointing to negative feedback, you
   keep rolling out VE to as many people as possible without first making
   sure that this won't have more negative results.
   - Performance is the single biggest issue for VE : wow... are you in
   denial ?

Nico


On Tue, Jul 23, 2013 at 7:23 AM, Erik Moeller e...@wikimedia.org wrote:

 On Mon, Jul 22, 2013 at 8:44 PM, Tim Starling tstarl...@wikimedia.org
 wrote:

  and the results from Aaron Halfaker's study [2]

 As noted at the top of the page, the analysis is still in progress.

 Importantly, there were many confounding variables in the test, some
 of which are already documented. This includes users being assigned to
 the test group that received VisualEditor whose browser did not
 properly support it (it would have literally just not worked if the
 user attempted to edit); these issues were fixed later. See

 https://meta.wikimedia.org/wiki/Research:VisualEditor%27s_effect_on_newly_registered_editors/Results#Limitations
 for some of these issues, but like I said, analysis is still in
 progress and we'll need to see what conclusions can actually be drawn
 from the data.

  A proponent of source editing would claim that the steep learning
  curve is justified by the end results. A visual editor is easier for
  new users, but perhaps less convenient for power users. So Aaron
  Halfaker's study took its measurements at the point in the learning
  curve where you would expect the benefit of VE to be most clear: the
  first edit.

 Actually, as noted in the draft, because the test group was assigned
 at the point of account creation, we're not taking into account any
 prior experience using wikitext as an IP editor. 59% of respondents in
 the 2011 editor survey stated that they had edited as IPs prior to
 making an account, so we should assume that this is not an
 insignificant proportion:

 https://meta.wikimedia.org/wiki/Research:Wikipedia_Editors_Survey_November_2011

  Round-trip bugs

 If you have, like I have, spent hours looking at VisualEditor diffs,
 you'll know that these are relatively rare at this point. The bug
 category of round-trip bugs is sometimes used for issues that aren't
 accurately described this way, e.g. users typing wikitext into
 VisualEditor, having difficulty updating a template parameter, or
 accidentally deleting content (sometimes due to bugs in VE).

  Perhaps the main problem is performance. Perhaps new users are
  especially likely to quit on the first edit because they don't want to
  wait 25-30 seconds for the interface to load (the time reported in
  [3]). Performance is a very common complaint for established users also.

 You're quoting a user test from June 10 which was performed on the
 following page, which I've temporarily undeleted:

 https://www.mediawiki.org/wiki/Golden-Crowned_Sparrow

 Editing this page in Firefox on a 6-year-old system only slightly
 faster than the tester's specs today takes about 5 seconds to
 initialize. In Chrome it takes about 3 seconds, in the ballpark of
 reloading the page into the source editor. Note that Gabriel put major
 caching improvements into production around June 7, which may not have
 been in effect for this user / this page yet.

 Still, I think that the hypothesis that any actual negative impact of
 VE on new users is due to performance issues is very supportable.
 Performance is the single biggest issue overall for VE right now, and
 performance on long pages can absolutely be prohibitively poor.
 Improving it is the highest priority for the team.

 Erik
 --
 Erik Möller
 VP of Engineering and Product Development, Wikimedia Foundation

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Remove 'visualeditor-enable' from $wgHiddenPrefs

2013-07-23 Thread Nicolas Vervelle
Thanks Risker,

I think you've summarized the position of many experienced users.
100% agreed.

Nico


On Tue, Jul 23, 2013 at 8:14 AM, Risker risker...@gmail.com wrote:

 The numbers are important.  And perhaps what isn't being reflected well
 here is the genuine disappointment felt by so many in the enwiki community;
 there was more excitement about this project than probably any other that
 WMF has undertaken in the past 5 years.  The sudden leap from
 feature-deficient alpha to deployment as default with untested major
 features eroded a great deal of the goodwill the community had for this
 much-requested feature. There still isn't any good explanation of why it
 didn't go alpha -- opt-in beta with the referencing and templates --
 debug, debug, debug -- default deployment.  It may not be coming through
 very clearly, but the editorial community *does* want this to work, and
 there's a lot of disappointment with what they got.

 This was an error in judgment, but it does not need to be a fatal one.  The
 important thing is to do some learning and apply it.  Hold off on deploying
 this software as default editor on other projects until more of the bugs
 (especially performance related bugs) are resolved, but proceed with opt-in
 beta on more projects.  They'll find bugs that enwiki hasn't found, and
 those bugs will be found by editors who are interested and motivated to
 test all kinds of use cases.  Enable the opt-out button as a preference on
 enwiki, and give thought to making it not-default for IPs and new users.
 English Wikipedia has still paid the price of being the primary launch
 site, but there's no point in compounding it by making VisualEditor the
 default for all projects and all editors.

 The knock-on effects of this problematic deployment will be felt for a long
 time, particularly its impact on other products that need VisualEditor to
 be widely accepted by the community to succeed (such as Flow).   The
 portrayal of editors (and now volunteer and staff developers and engineers)
 as simply not understanding, or having unreasonable expectations, is not
 realistic. This was ready for beta testing on July 1; it wasn't ready for
 deployment to default.  Your own internal memoranda (as can be seen by some
 of the links provided in this thread) indicate serious problems with
 performance.  The publicly available data on Limn[1] is consistently
 showing less than 10% adoption by experienced users, and only 12% of all
 edits being done using VE.

 Please reconsider the course of action.  There is no benefit in putting
 other projects through this when you have more than enough issues to fix.

 Risker



 [1]http://ee-dashboard.wmflabs.org/datasources



 On 23 July 2013 00:01, David Cuenca dacu...@gmail.com wrote:

  I'm glad that Tim is bringing some facts and numbers that back up what
 the
  community is demanding.
  To do otherwise will be to play tug-of-war which will lead to an even
 worse
  outcome.
 
  Besides of enabling the preference, a good approach would be to activate
 or
  deactivate that preference depending on how much an user has been using
 (or
  not) Visual Editor in their last edits and to ask new users if they want
 to
  use VE or the plain text system. New users are not that new, since many
  of them have been editing anonymously before.
 
  When there are more compelling reasons to do the switch (like real-time
  collaboration), users can have a higher incentive to do the switch.
 
  Micru
 
  On Mon, Jul 22, 2013 at 11:44 PM, Tim Starling tstarl...@wikimedia.org
  wrote:
 
   On 23/07/13 11:35, James Forrester wrote:
It would imply that this is a preference that Wikimedia will support.
This would be a lie. We have always intended for VisualEditor to be a
wiki-level preference, and for this user-level preference to
 disappear
   once
the need for an opt-in (i.e., the beta roll-out to production wikis)
 is
over.
  
   The feedback from established users [1] and the results from Aaron
   Halfaker's study [2] suggest that opt-in would be the most appropriate
   policy given VE's current level of maturity. That is, disable it by
   default and re-enable the preference.
  
   A proponent of source editing would claim that the steep learning
   curve is justified by the end results. A visual editor is easier for
   new users, but perhaps less convenient for power users. So Aaron
   Halfaker's study took its measurements at the point in the learning
   curve where you would expect the benefit of VE to be most clear: the
   first edit. Despite the question being as favourable to VE as
   possible, the result strongly favoured the use of source editing:
  
   Newcomers with the VisualEditor were ~43% less likely to save a
   single edit than editors with the wikitext editor (x^2=279.4,
   p<0.001), meaning that Visual Editor presented nearly a 2:1 increase
   in editing difficulty.
  
   On the Wikipedia RFC question Wikimedia should disable this software
   

Re: [Wikitech-l] Suggestion for solving the disambiguation problem

2013-07-16 Thread Nicolas Vervelle
Interesting idea...
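
(A minimal sketch of the record such a scheme might log, based on the proposal
quoted below — field names are assumptions, not an existing EventLogging schema :)

<?php
// Sketch: one navigation event, e.g. a reader reaching "London (poem)"
// through "London (disambiguation)" while coming from "William Blake".
$event = array(
    'referrerPage'       => 'William Blake',            // page the reader came from
    'disambiguationPage' => 'London (disambiguation)',  // where the choice was made
    'targetPage'         => 'London (poem)',            // link the reader picked
    'timestamp'          => gmdate( 'YmdHis' ),
);
// Aggregating (referrerPage, targetPage) pairs over many readers would back
// suggestions like "80% of visitors go from William Blake to London (poem)".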


On Mon, Jul 15, 2013 at 11:41 PM, Jon Robson jdlrob...@gmail.com wrote:

 I understand there is an issue that needs solving where various pages
 link to disambiguation pages. These need fixing to point at the
 appropriate thing.

 I had a thought on how this might be done using a variant of
 EventLogging...

 When a user clicks on a link that is a disambiguation page and then
 clicks on a link on that page we log an event that contains

 * page user was on before
 * page user is on now

 If we were to collect this data it would allow us to statistically
 suggest what the  correct disambiguation page might be.

 To take a more concrete theoretical example:
 * If I am on the Wiki page for William Blake and click on London I am
 taken to https://en.wikipedia.org/wiki/London_(disambiguation)
 * I look through and see London (poem) and click on it
 * An event is fired that links London (poem) to William Blake.

 Obviously this won't always be accurate but I'd expect generally this
 would work (obviously we'd need to filter out bots)

 Then when editing William Blake say that disambiguation links are
 surfaced. If I go to fix one it might prompt me that 80% of visitors
 go from William Blake to London (poem).


 Have we done anything like this in the past? (Collecting data from
 readers and informing editors)

 I can imagine applying this sort of pattern could have various other
 uses...




 --
 Jon Robson
 http://jonrobson.me.uk
 @rakugojon

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Disambiguator extension deployed to all WMF wikis (action required)

2013-07-16 Thread Nicolas Vervelle
Thanks a lot,

I've updated WPCleaner [1] to use the new disambiguation property,
instead of looking for templates or categories.
I've been able to reduce the number of calls to the API when analyzing a
page for links to disambiguation pages with this property.
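
(A minimal sketch of the kind of query this allows — one API call to flag every
disambiguation link of a page. Illustrative PHP, not the actual WPCleaner code ;
the endpoint and page title are just examples :)

<?php
// Sketch: list which links of a page point to disambiguation pages,
// using the page property set by the Disambiguator extension.
$url = 'https://en.wikipedia.org/w/api.php?' . http_build_query( array(
    'action'    => 'query',
    'format'    => 'json',
    'titles'    => 'William Blake',   // example page
    'generator' => 'links',
    'gpllimit'  => 500,
    'prop'      => 'pageprops',
    'ppprop'    => 'disambiguation',
) );
$data = json_decode( file_get_contents( $url ), true );
foreach ( $data['query']['pages'] as $page ) {
    if ( isset( $page['pageprops']['disambiguation'] ) ) {
        echo $page['title'] . " is a disambiguation page\n";
    }
}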

Nico

[1] http://en.wikipedia.org/wiki/Wikipedia:WPCleaner



On Wed, Jul 10, 2013 at 12:10 AM, Ryan Kaldari rkald...@wikimedia.orgwrote:

 The Disambiguator extension (http://www.mediawiki.org/wiki/Extension:Disambiguator)
 is now deployed to all WMF wikis. This will enable us to:
 1. Remove disambiguation code from core, including Special:Disambiguations
 (bug 35981)
 2. Stop requiring wikis to maintain template lists at
 MediaWiki:Disambiguationspage
 3. Add features like warning users when they are linking to disambiguation
 pages (https://gerrit.wikimedia.org/r/#/c/70564)
 4. Remove disambiguation pages from things like Special:Random and
 Special:LonelyPages
 5. Enable the development of more powerful 3rd party tools for dealing
 with disambiguation pages

 There is, however, one action required of each wiki that wants to make use
 of the Disambiguator extension: Every disambiguation page on the wiki needs
 to include the __DISAMBIG__ magic word (or an equivalent alias). Typically,
 this only requires adding the magic word to a single template that is
 included on all the disambiguation pages. For example, on Commons, this was
 accomplished with the following change:
 https://commons.wikimedia.org/w/index.php?title=Template%3ADisambig&diff=99758122&oldid=99728960
 On English Wikipedia, it was a bit more complicated:
 https://en.wikipedia.org/w/index.php?title=Template%3ADmbox&diff=560507118&oldid=540384230

 Once you've made this change, you should start seeing pages appear on
 Special:DisambiguationPages within 3 days. If you have any questions or
 problems, let me know.

 Ryan Kaldari
 Wikimedia Foundation
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Weekly deployment highlights - week of July 15th, 2013

2013-07-14 Thread Nicolas Vervelle
On Fri, Jul 12, 2013 at 10:54 PM, Greg Grossmeier g...@wikimedia.orgwrote:

 * Pending the go/no-go decision later this afternoon (Pacific time),
   VisualEditor will be enabled for all English Wikipedia users (both
   logged in and not).


So, what was the decision ?
I do hope that extending the VE roll-out has been delayed, at least to avoid
damaging even more articles daily than the current hundreds damaged by VE
edits on enwiki ?

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Disambiguator extension deployed to all WMF wikis (action required)

2013-07-10 Thread Nicolas Vervelle
Thanks Eran and MZMcBride,

I'm going to update WPCleaner to take advantage of this new possibility.
It should result in fewer API requests and so faster loading of pages when
fixing disambiguation links. Great :)

Nico


On Wed, Jul 10, 2013 at 8:33 AM, Eran Rosenthal eranro...@gmail.com wrote:

 Nice extension :)

 You may use generator to enjoy this new property. For example to check
 whether there is a disambig link from SOME_TITLE

 en.wikipedia.org/w/api.php?action=query&generator=links&titles=SOME_TITLE&prop=pageprops&ppprop=disambiguation&gpllimit=500

 (without this extension it was possible only with external tools such as
 dablinks: http://toolserver.org/~dispenser/view/Dablinks)



 On Wed, Jul 10, 2013 at 9:00 AM, MZMcBride z...@mzmcbride.com wrote:

  Nicolas Vervelle wrote:
  Has the API been modified so that we can ask it if a page is a
  disambiguation page ?
 
  Looks like it.
 
  Starting point:
  https://en.wikipedia.org/w/api.php
 
  List of available property names:
 
 https://en.wikipedia.org/w/api.php?action=query&list=pagepropnames&ppnlimit=100
 
  Look up properties of a particular title:
 
 https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Madonna
  <pageprops disambiguation="" wikibase_item="q1564372" />
 
  https://en.wikipedia.org/wiki/Special:PagesWithProp can look up pages by
  property name. I'm not sure if there's an equivalent module for the Web
  API yet.
 
  The Web API has disambiguation-related query pages as well (including
  Special:DisambiguationPageLinks, which I'm only now learning exists).
 
  MZMcBride
 
 
 
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Disambiguator extension deployed to all WMF wikis (action required)

2013-07-10 Thread Nicolas Vervelle
Good,

Thanks to Eran's answer, I know how to determine, for all links in a page,
which ones are to a disambiguation page.
For WPCleaner, I also need to be able to retrieve the complete list of
disambiguation pages through the API.
I looked at the allpages request [1], but I didn't see a way to get only
pages with this property set.
Is there another way ?

Nico

[1] http://www.mediawiki.org/wiki/API:Allpages



On Wed, Jul 10, 2013 at 9:30 AM, Nicolas Vervelle nverve...@gmail.comwrote:

 Thanks Eran and MZMcBride,

 I'm going to update WPCleaner to take advantage of this new possibility.
 It should result in less API requests and so a faster loading of pages for
 fixing disambiguation links. Great :)

 Nico


 On Wed, Jul 10, 2013 at 8:33 AM, Eran Rosenthal eranro...@gmail.comwrote:

 Nice extension :)

 You may use generator to enjoy this new property. For example to check
 whether there is a disambig link from SOME_TITLE

 en.wikipedia.org/w/api.php?action=query&generator=links&titles=SOME_TITLE&prop=pageprops&ppprop=disambiguation&gpllimit=500

 (without this extension it was possible only with external tools such as
 dablinks: http://toolserver.org/~dispenser/view/Dablinks)



 On Wed, Jul 10, 2013 at 9:00 AM, MZMcBride z...@mzmcbride.com wrote:

  Nicolas Vervelle wrote:
  Has the API been modified so that we can ask it if a page is a
  disambiguation page ?
 
  Looks like it.
 
  Starting point:
  https://en.wikipedia.org/w/api.php
 
  List of available property names:
 
 https://en.wikipedia.org/w/api.php?action=query&list=pagepropnames&ppnlimit=100
 
  Look up properties of a particular title:
 
 https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Madonna
  <pageprops disambiguation="" wikibase_item="q1564372" />
 
  https://en.wikipedia.org/wiki/Special:PagesWithProp can look up pages
 by
  property name. I'm not sure if there's an equivalent module for the Web
  API yet.
 
  The Web API has disambiguation-related query pages as well (including
  Special:DisambiguationPageLinks, which I'm only now learning exists).
 
  MZMcBride
 
 
 
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Disambiguator extension deployed to all WMF wikis (action required)

2013-07-10 Thread Nicolas Vervelle
Yes, thanks,

I will use pageswithprop to retrieve all disambiguation pages.
I didn't find it at first because it wasn't listed in
http://www.mediawiki.org/wiki/API:Lists
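
(A quick sketch of that retrieval, paginating through the list — illustrative
PHP, not the actual WPCleaner code ; continuation handling follows the modern
'continue' style, older MediaWiki versions used query-continue instead :)

<?php
// Sketch: list every disambiguation page via list=pageswithprop.
$api = 'https://en.wikipedia.org/w/api.php';
$params = array(
    'action'      => 'query',
    'format'      => 'json',
    'list'        => 'pageswithprop',
    'pwppropname' => 'disambiguation',
    'pwplimit'    => 500,
);
do {
    $data = json_decode(
        file_get_contents( $api . '?' . http_build_query( $params ) ), true );
    foreach ( $data['query']['pageswithprop'] as $page ) {
        echo $page['title'] . "\n";
    }
    if ( !isset( $data['continue'] ) ) {
        break;
    }
    $params = array_merge( $params, $data['continue'] ); // follow the continuation
} while ( true );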

Nico


On Thu, Jul 11, 2013 at 12:49 AM, Ryan Kaldari rkald...@wikimedia.orgwrote:

 On 7/10/13 2:15 PM, Brad Jorsch (Anomie) wrote:

 On Wed, Jul 10, 2013 at 1:49 PM, Ryan Kaldari rkald...@wikimedia.org
 wrote:

 Disambiguator implements an API for retrieving all disambiguation pages:
 api.php?action=query&list=querypage&qppage=DisambiguationPages

 ... That really shouldn't be allowed by list=querypage, since we have
 list=pageswithprop that does the same thing.


 Oops, I didn't know about that one. Looks like you can now pull all the
 disambiguation pages using:
 https://en.wikipedia.org/w/api.php?action=query&generator=pageswithprop&gpwppropname=disambiguation

 That should take care of Nico's request.

 Ryan Kaldari


 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Disambiguator extension deployed to all WMF wikis (action required)

2013-07-09 Thread Nicolas Vervelle
Great !!!

Has the API been modified so that we can ask it if a page is a
disambiguation page ?

Nico


On Wed, Jul 10, 2013 at 12:10 AM, Ryan Kaldari rkald...@wikimedia.orgwrote:

 The Disambiguator extension (http://www.mediawiki.org/wiki/Extension:Disambiguator)
 is now deployed to all WMF wikis. This will enable us to:
 1. Remove disambiguation code from core, including Special:Disambiguations
 (bug 35981)
 2. Stop requiring wikis to maintain template lists at
 MediaWiki:Disambiguationspage
 3. Add features like warning users when they are linking to disambiguation
 pages (https://gerrit.wikimedia.org/r/#/c/70564)
 4. Remove disambiguation pages from things like Special:Random and
 Special:LonelyPages
 5. Enable the development of more powerful 3rd party tools for dealing
 with disambiguation pages

 There is, however, one action required of each wiki that wants to make use
 of the Disambiguator extension: Every disambiguation page on the wiki needs
 to include the __DISAMBIG__ magic word (or an equivalent alias). Typically,
 this only requires adding the magic word to a single template that is
 included on all the disambiguation pages. For example, on Commons, this was
 accomplished with the following change:
 https://commons.wikimedia.org/w/index.php?title=Template%3ADisambig&diff=99758122&oldid=99728960
 On English Wikipedia, it was a bit more complicated:
 https://en.wikipedia.org/w/index.php?title=Template%3ADmbox&diff=560507118&oldid=540384230

 Once you've made this change, you should start seeing pages appear on
 Special:DisambiguationPages within 3 days. If you have any questions or
 problems, let me know.

 Ryan Kaldari
 Wikimedia Foundation
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] WMF Deployment Highlights - Week of June 17th

2013-06-24 Thread Nicolas Vervelle
On Mon, Jun 24, 2013 at 10:49 PM, Steven Walling
steven.wall...@gmail.comwrote:

 On Mon, Jun 17, 2013 at 5:42 PM, Nicolas Vervelle nverve...@gmail.com
 wrote:

  I tried VE a few times, and clearly think it's not yet in a situation
 where
  it could be rolled out to unexperienced users :
  * VE is still very limited in what you can do with it (no templates, no
  references, ...). What will be the reaction of a new user when he sees
 that
  he can't edit some parts of the article ?
 

 I would always double check the latest version before we talk about VE's
 limitations -- the feature set is evolving pretty rapidly right now, to the
 credit of the team. For instance, VE now supports adding and editing
 references and templates. (This weekend on English Wikipedia, I was able to
 pretty smoothly add navigation box templates and page protection templates
 to articles.)


My mail is more than a week old ; those features have been activated since
then, but they weren't available when I sent it...
But I still think that VE is far from being stable, tested and enhanced
enough to be set as the default editor for new contributors.
The activity on the various feedback pages is a good indicator that it's
not yet ready for being the default editor.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] WMF Deployment Highlights - Week of June 17th

2013-06-17 Thread Nicolas Vervelle
Hi,

The interest of the VE team is not the only one to take into account I
think...
The impact on the new wikipedia editors should be a more important
parameter in my opinion.

I tried VE a few times, and clearly think it's not yet in a situation where
it could be rolled out to inexperienced users :
* VE is still very limited in what you can do with it (no templates, no
references, ...). What will be the reaction of a new user when he sees that
he can't edit some parts of the article ?
* VE is still quite buggy (adding nowiki tags, deleting references,
modifying templates, ...). While it's not a problem with users that
opted-in for testing, it's quite different for users that don't even know
what VE is.
* Beta testers made a few suggestions for enhancements that would be quite
helpful for editors (like being able to choose between VE and wikitext when
editing a given section and not globally, ...)

Why do you want to rush a forced test on new users when VE is not yet a
stable, fully functional product ?

You mentioned the low number of edits with VE currently.
I think it's low because of the problems mentioned above, not because of a
lack of testers. I saw several users do like me: try it, see that many
edits can't be made or end up with side effects, report the problems,
and go back to wikitext while waiting for the problems to be solved in a next
version. I do believe that once VE is stable and has more features, people
will start to use it more widely.

Has there been any analysis done to foresee the impact on new users that
would have VE enabled by default ?
Like taking a few hours or a day of modifications on enwiki, keeping only
the modifications made by users registered in the last few days, and trying to
redo the exact same modifications with VE :
* What percentage of modifications could be achieved with the current set
of features available in VE ?
* What percentage of modifications would have been done without undesired
side effects ?
That would give an idea on how many new users would run into problems with
VE (for me, they are very low, but I'm not a new user).
With the current version of VE, I believe both those percentages will be
low, implying many new users will have problems.

Nico


On Mon, Jun 17, 2013 at 12:12 PM, Federico Leva (Nemo)
nemow...@gmail.comwrote:

 The fact that there are known issues doesn't mean that finding new,
 unknown issues will slow down the work on the known; it's up to the team to
 decide what sort and what amount of feedback they'll be able/need to
 process (and to adjust if they were wrong).

 Gradually enabling a feature is not an experiment on some poor victims,
 it's a normal development strategy (as opposed to sudden
 revolutions/waterfalls on the wikis). I still don't see any indication of
 why it should raise the end net harm of the VE development on the wikis.

 I don't know how to enable the preference at some point of users'
 lifecycle; probably, in the same way you do it for half new users. A hook I
 assume, it was mentioned in some Echo and enotif bugs.

 Nemo


 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] WMF Deployment Highlights - Week of June 17th

2013-06-15 Thread Nicolas Vervelle
On Sat, Jun 15, 2013 at 6:09 AM, MZMcBride z...@mzmcbride.com wrote:

 Greg Grossmeier wrote:
 * On Tuesday VisualEditor team will enable an A/B test, where half of
   new accounts created on English Wikipedia will get VisualEditor
   enabled by default. This is to test performance and features before
   the larger rollout in July.

 As I commented at https://bugzilla.wikimedia.org/49604, the (apparent)
 lack of an easy means of opting out of this experiment and the increased
 frequency of severe VisualEditor-related bugs being reported recently both
 make it appear that an A/B test of this nature would be premature and
 potentially very damaging. Dirty diffs, inadvertent section removals, etc.
 are still common when using VisualEditor. Are we really expecting our
 newest users to be able to spot and correct these issues?


I'm also quite surprised to see that VisualEditor could be activated by
default for some new accounts.

I tried VisualEditor again on frwiki, and I don't see how it could be
effectively used by new users :
* VE is still very limited : for example, not being able to edit templates
is clearly a big limitation. How will a new user react when he tries to
edit an article for the first time and sees many parts that he can't edit ?
* VE is still doing dodgy things in some situations : how can a new user
deal with modifications made by VE without his knowledge ?

Currently, I think that VE should only be activated voluntarily, so that
people know what they are doing, are prepared to fix incorrect
modifications made by VE, and can easily work outside of VE to overcome its
current limitations.
I think that new users would quickly get a bad first impression if using VE.

For me, VE is currently a good start for a more user friendly editor, but
as it still lacks a few important features and still has a few bugs, it
should stay in opt-in mode and clearly not in opt-out mode.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Support for 3D content

2013-04-22 Thread Nicolas Vervelle
Hi,

I'd like to come to the Amsterdam Hackathon to discuss the Jmol extension,
and get advice on a few things needed to develop it (security, ...).
But I'm not yet sure I will be in Europe that weekend :(

Nico


On Fri, Apr 19, 2013 at 5:05 PM, Dan Andreescu dandree...@wikimedia.orgwrote:

 Sounds like NicoV started work again to try to address those issues.  We
 should take a look at the Amsterdam Hackathon or Wikimania.


 On Fri, Apr 19, 2013 at 10:54 AM, Eugene Zelenko
 eugene.zele...@gmail.comwrote:

  Hi!
 
  Extension and viewer for Chemical Markup Language were created long
  time ago. However it's still not reviewed for security issues to be
  included on WMF projects. See
  https://bugzilla.wikimedia.org/show_bug.cgi?id=16491.
 
  On Fri, Apr 19, 2013 at 6:03 AM, Mathieu Stumpf
  psychosl...@culture-libre.org wrote:
   Hi,
  
   Reading the 2012-13 Plan, I see that multimedia is one the key
  activities
   for Mediawiki. So I was wondering if there was already any plan to
  integrate
   3D model viewers, which would be for example very interesting for
 anatomy
   articles, or simply 3D maths objects.
 
  Eugene.
 
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Support for 3D content

2013-04-19 Thread Nicolas Vervelle
You also have a Jmol extension
http://www.mediawiki.org/wiki/Extension:Jmol

It's working on wiki.jmol.org


On Fri, Apr 19, 2013 at 4:45 PM, Chris McMahon cmcma...@wikimedia.orgwrote:

 On Fri, Apr 19, 2013 at 6:03 AM, Mathieu Stumpf 
 psychosl...@culture-libre.org wrote:

  Hi,
 
  Reading the 2012-13 Plan, I see that multimedia is one the key
  activities for Mediawiki. So I was wondering if there was already any
 plan
  to integrate 3D model viewers, which would be for example very
 interesting
  for anatomy articles, or simply 3D maths objects.


 I know that work on PDBHandler is ongoing:
 http://www.mediawiki.org/wiki/Extension:PDBHandler
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Disambiguation features: Do they belong in core or in an extension?

2013-01-16 Thread Nicolas Vervelle
Hello,

My own preference would be to have this in the core for several reasons.

It seems that it makes some existing core code simpler. There's
already some code dealing with disambiguation in the core
(Special:Disambiguation, ...).

Several external tools, including my own WPCleaner [1], are dealing
with fixing disambiguation links on several wikis. In my opinion, it
will be easier for tool developers to have one standard method for
finding out whether a page is a disambiguation page or not.
Currently, I'm already managing two methods depending on the wiki :
based on Mediawiki:Disambiguationpages, or based on categories (for
enwiki and frwiki), which is faster and requires fewer API requests.
If it's in an extension, I think fewer wikis (outside Wikimedia) will
use this new method than if it's in the core, because an extension
requires the wiki owner to first add it, whereas only contributors are
needed if it's in the core.

If a wiki doesn't need disambiguation pages, there's nothing to set up
specifically for not using it. It's just an unused feature, as it is
currently with Mediawiki:Disambiguationpages ;)

Nico

[1] http://en.wikipedia.org/wiki/Wikipedia:WPCleaner


On 1/16/13, Tyler Romeo tylerro...@gmail.com wrote:
 I agree with extension. For example, my school's IT department uses a wiki
 to collect information about common computer problems, and on a wiki about
 computer problems, none of the issues share the same name.

 *--*
 *Tyler Romeo*
 Stevens Institute of Technology, Class of 2015
 Major in Computer Science
 www.whizkidztech.com | tylerro...@gmail.com


 On Tue, Jan 15, 2013 at 9:38 PM, Chad innocentkil...@gmail.com wrote:

 On Tue, Jan 15, 2013 at 8:58 PM, Ryan Kaldari rkald...@wikimedia.org
 wrote:
  Personally, I don't mind implementing it either way, but would like to
 have
  consensus on where this code should reside. The code is pretty clean
  and
  lightweight, so it wouldn't increase the footprint of core MediaWiki
  (it
  would actually decrease the existing footprint slightly since it
  replaces
  more hacky existing core code). So core bloat isn't really an issue.
  The
  issue is: Where does it most make sense for disambiguation features to
  reside? Should disambiguation pages be supported out of the box or
 require
  an extension to fully support?
 

 I'd say extension. I can think of lots of wikis that don't use
 disambiguation pages. If we really want, we can stash it in
 the default tarball along with the other bundled extensions.

 -Chad

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Canvasmol?

2011-09-12 Thread Nicolas Vervelle
On Tue, Sep 6, 2011 at 7:26 PM, Mark A. Hershberger 
mhershber...@wikimedia.org wrote:

 Tim Starling tstarl...@wikimedia.org writes:

  On 06/09/11 18:09, Magnus Manske wrote:
  Support for 3D-rendered molecules on Wikipedia has been on the
  wishlist since ... forever. This was never done due to security
  concerns, IIRC.
 
  The security issues were just normal XSS, easily fixed with an hour or
  two of work.
 [...]
  I think we should go with Jmol, which has many excellent features (not
  just ball and stick rendering like this canvasmol), a substantial
  community behind it, and a large content base already hosted on
  MediaWiki wikis, most notably Proteopedia.

 Since there are obviously MW users for Jmol, I'd like to think there are
 interested MW developers for Jmol.  Maybe I can find someone on
 Proteopedia.


Hi,

I wrote a good part of the Jmol extension for MW.
I'm still interested in developing it if there's a chance of seeing it used
on MW.
I don't have much free time in the following weeks, but I hope this will
change before November.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] API available for wikiversity project ?

2011-04-24 Thread Nicolas Vervelle
Hi,

Is the API available for the wikiversity project (http://en.wikiversity.org/) ?
If so what is the API URL ?

Someone asked me to include the fr wikiversity in the list of wikipedias
that WPCleaner (http://en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Documentation)
can work on.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Using ReCaptcha in ConfirmEdit extension ?

2011-03-05 Thread Nicolas Vervelle
Hi,

I am trying to install a Captcha extension on my own wiki (
http://wiki.jmol.org/).
ReCaptcha seems nice, but when I try to use the version in trunk I get an
error : Class 'SimpleCaptcha' not found in ReCaptcha.php on line 54.
Indeed, there is no SimpleCaptcha class there. What should I do to make it work ?

I am using MW 1.16.2.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Using ReCaptcha in ConfirmEdit extension ?

2011-03-05 Thread Nicolas Vervelle
Hi again,

You can forget my question.
It's now working, I misread the documentation (which is not the
same depending on which page you look at) at first.
Including ConfirmEdit.php before including ReCaptcha.php did the trick
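
(For anyone hitting the same error, roughly what the working LocalSettings.php
section looks like — paths and key placeholders assume the standard
extensions/ConfirmEdit layout of that era, adjust to your install :)

# Order matters: ConfirmEdit.php defines the SimpleCaptcha base class
# that ReCaptcha.php extends, so it must be included first.
require_once( "$IP/extensions/ConfirmEdit/ConfirmEdit.php" );
require_once( "$IP/extensions/ConfirmEdit/ReCaptcha.php" );
$wgCaptchaClass = 'ReCaptcha';
$wgReCaptchaPublicKey  = 'your-public-key';
$wgReCaptchaPrivateKey = 'your-private-key';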

Nico

On Sat, Mar 5, 2011 at 9:43 PM, Nicolas Vervelle nverve...@gmail.comwrote:

 Hi,

 I am trying to install a Captcha extension on my own wiki (
 http://wiki.jmol.org/).
 ReCaptcha seems nice, but when I try to use the version in trunk I get an
 error : Class 'SimpleCaptcha' not found in ReCaptcha.php on line 54.
 Indeed, there is no SimpleCatcha there. What should I do to make it work ?

 I am using MW 1.16.2.

 Nico


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Wikimedia engineering February report

2011-03-04 Thread Nicolas Vervelle
Hi,

Thanks for the info. In favor of posting only the link.

Nico

On Fri, Mar 4, 2011 at 10:58 AM, Guillaume Paumier
gpaum...@wikimedia.orgwrote:

 Hi,

 The February report of what was accomplished by the Wikimedia
 engineering team is now available:


 http://techblog.wikimedia.org/2011/03/wikimedia-engineering-february-report/

 As far as I know, we haven't advertised these reports on this list in
 the past.

 I'd like to ask the list if you would prefer:
 * to keep the status quo: you're content with the RSS feeds from the
 blog and there's no need to post here;
 * posting a link here is a good practice that you'd like us to continue;
 * posting a link here is good, but you'd also like to get the content of
 the report in the e-mail;
 * something else?

 Thanks,

 --
 Guillaume Paumier


 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Problem with MIME type detection

2010-12-05 Thread Nicolas Vervelle
Hi,

I want to authorize uploading of CML files
(http://en.wikipedia.org/wiki/Chemical_Markup_Language) on my
wiki (http://wiki.jmol.org/index.php/Main_Page).
These files have an XML format with a limited set of possible root elements.
I would like them to be recognized by MediaWiki with a specific MIME type :
chemical/x-cml

I'm doing some tests on a virtual machine Ubuntu 10.10 with default
MediaWiki setup (MW 1.15.5, PHP 5.3.3, MySql 5.1.49).
I have done several modifications in the configuration :

   - $wgFileExtensions[] = 'cml'; in LocalSettings.php to allow the .cml
   extension
   - chemical/x-cml cml; in includes/mime.types to attach the MIME type to
   the extension
   - $wgXMLMimeTypes = array_merge( $wgXMLMimeTypes, array( ... ) ); in
   LocalSettings.php so that MimeMagic correctly detects CML files as
   chemical/x-cml (see the sketch after this list)
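
(For that third point, a sketch of what the array could contain — the CML root
elements and namespace below are assumptions based on typical CML documents,
following the SVG/HTML examples in DefaultSettings.php :)

# Map the XML root element (bare and namespaced forms) to the MIME type.
$wgXMLMimeTypes = array_merge( $wgXMLMimeTypes, array(
    'cml'                                => 'chemical/x-cml',
    'molecule'                           => 'chemical/x-cml',
    'http://www.xml-cml.org/schema:cml'  => 'chemical/x-cml',
) );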

When I upload the file, everything is ok and in the debug log the file seems
correctly detected as chemical/x-cml.
Even in the database, the file is stored with the correct MIME type.

But when I display the File: page for this file, I see MIME type:
unknown/unknown displayed by MediaWiki (same problem in the debug log).
What do I need to do to have the MIME type correctly detected once the file
is uploaded ?

I want to develop a media handler for this kind of file, so I need to
have correct MIME type detection.

Thanks in advance.
Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a Media handler extension for molecular files ?

2010-12-01 Thread Nicolas Vervelle
Hi Brion and others,

On Tue, Nov 23, 2010 at 12:46 AM, Nicolas Vervelle nverve...@gmail.comwrote:

 On Mon, Nov 22, 2010 at 11:57 PM, Brion Vibber br...@pobox.com wrote:

 On Mon, Nov 22, 2010 at 1:03 PM, Nicolas Vervelle nverve...@gmail.com
 wrote:

  Molecular files exist in several formats : pdb, cif, mol, xyz, cml, ...
  Usually they are detected as simple MIME types (either text/plain or
  application/xml) by MediaWiki and not as more precise types (even if
 this
  types exist : chemical/x-pdb, chemical/x-xyz, ...).
  It seems that to register a Media handler, I have to add an entry to
  $wgMediaHandlers[] : $wgMediaHandler['text/plain'] = 'MolecularHandler';
  Will it be a problem to use such a general MIME type to register the
  handler
  ? Especially for files of the same MIME type but that are not molecular
  files ?
 

 You'd want to make sure the type detection correctly identifies your files
 so you can associate the handler types, or it's going to make things
 confusing.

 For XML files, you should usually be able to add to the $wgXMLMimeTypes
 array, which by default recognizes the root elements for HTML, SVG, and
 Dia
 vector drawings -- see the entries in DefaultSettings.php as examples. It
 can recognize XML files by either bare or namespaced root element name,
 and
 associates the files with the given MIME type.


 Oh, good, that's what I need for file types like CML (Chemical Markup
 Language).
 I will start working on the media handler only with this kind of files,
 easier to begin with them.


I have tried your suggestion with $wgXMLMimeTypes, but it works only
partially :

   - It works for uploading : when I upload the CML file, MediaWiki detects
   MIME type chemical/x-cml.
   - It doesn't seem to work for rendering the page
   (example : http://wiki.jmol.org/index.php/File:Nsc202.cml) :
   displayed MIME type is unknown/unknown (and MediaWiki looks for a
   media handler for unknown/unknown).

Have you any idea on what to do next to correctly detect the MIME type when
rendering ?


Here are the log of the upload :
MimeMagic::__construct: loading mime types from
/public_html/mediawiki-1_16_0/includes/mime.types
MimeMagic::__construct: loading mime info from
/public_html/mediawiki-1_16_0/includes/mime.info
MimeMagic::guessMimeType: final mime type of /tmp/phpSGMnk9: chemical/x-cml
MediaHandler::getHandler: no handler found for chemical/x-cml.
File::getPropsFromPath: /tmp/phpSGMnk9 loaded, 10327 bytes, chemical/x-cml.
MacBinary::loadHeader: header bytes 0 and 74 not null
MimeMagic::guessMimeType: final mime type of /tmp/phpSGMnk9: chemical/x-cml


mime: chemical/x-cml extension: cml

UploadBase::verifyExtension: mime type chemical/x-cml matches extension cml,
passing file
UploadBase::detectScript: checking for embedded scripts and HTML stuff
UploadBase::detectScript: no scripts found
UploadBase::detectVirus: virus scanner disabled
UploadBase::verifyFile: all clear; passing.


\performUpload: sum:[[Category:CML file]] c: [[Category:CML file]]
w:FSRepo::publishBatch: wrote tempfile /tmp/phpSGMnk9 to
/public_html/mediawiki-1_16_0/images/8/84/Nsc202.cml
DatabaseBase::query: Writes done: INSERT IGNORE INTO `image`
(img_name,img_size,img_width,img_height,img_bits,img_media_type,img_major_mime,img_minor_mime,img_timestamp,img_description,img_user,img_user_text,img_metadata,img_sha1)
VALUES
('Nsc202.cml','10327','0','0','0','UNKNOWN','chemical','x-cml','20101201215518','[[Category:CML
file]]','2','NicolasVervelle','','nxeilddfhz427gtfakpu37xljp9ipk1')
Class SkinMonobook not found; skipped loading
Article::editUpdates: No vary-revision, using prepared edit...
Saved in parser cache with key wikijmolorg:pcache:idhash:4233-0!1!0!!en!0
and timestamp 20101201215518
DatabaseBase::query: Writes done: DELETE FROM `objectcache` WHERE keyname =
'wikijmolorg:pcache:idhash:4233-0!1!0!!en!0'
BacklinkCache::getLinks: from DB
BacklinkCache::partition: got from database
BacklinkCache::getLinks: from DB
BacklinkCache::getLinks: from DB
OutputPage::sendCacheControl: private caching;  **
Request ended normally
 accepts gzip
Start request

GET /index.php/File:Nsc202.cml
HTTP HEADERS:
AUTHORIZATION:
HOST: wiki.jmol.org
CONNECTION: keep-alive
REFERER: http://wiki.jmol.org/index.php/Special:Upload
CACHE_CONTROL: max-age=0
ACCEPT:
application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
USER_AGENT: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)
AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.44 Safari/534.7
ACCEPT_ENCODING: gzip,deflate,sdch
ACCEPT_LANGUAGE: fr,en-US;q=0.8,en;q=0.6
ACCEPT_CHARSET: ISO-8859-1,utf-8;q=0.7,*;q=0.3
COOKIE: wikijmolorgUserID=2; wikijmolorgUserName=NicolasVervelle;
wikijmolorgToken=ee6361e8e8d8badf3b359b60295d4ae4;
wikijmolorg_session=b4ef6699ab15e1cdc6786845d1a2de22
SUEXEC_UID: 2818
SUEXEC_GID: 2818

CACHES: FakeMemCachedClient[main] SqlBagOStuff[message] SqlBagOStuff[parser]
session_set_cookie_params: 0, /, , , 1
Unstubbing $wgParser on call of $wgParser

[Wikitech-l] Creating a Media handler extension for molecular files ?

2010-11-22 Thread Nicolas Vervelle
Hello,

I am interested in developing an extension for handling molecular files
(files containing information about chemical molecules : atoms, bonds,
...).
If I understand correctly it will enable me to display specific information
in the File:... page, like what MediaWiki does for simple images.
Something like existing extensions (FlvHandler, OggHandler,
PagedTiffHandler, PNGHandler, TimedMediaHandler in SVN trunk for example).


I have read several pages on MediaWiki about writing extensions, but they
are not very detailed for media handler extensions.
I have also written an extension to display and interact with molecules
(http://wiki.jmol.org/index.php/Jmol_MediaWiki_Extension),
but I still have several questions on how I can create a handler for
molecular files in MediaWiki.
Any help or links to some explanations will be appreciated.


Molecular files exist in several formats : pdb, cif, mol, xyz, cml, ...
Usually they are detected as simple MIME types (either text/plain or
application/xml) by MediaWiki and not as more precise types (even if these
types exist : chemical/x-pdb, chemical/x-xyz, ...).
It seems that to register a Media handler, I have to add an entry to
$wgMediaHandlers[] : $wgMediaHandlers['text/plain'] = 'MolecularHandler';
Will it be a problem to use such a general MIME type to register the handler
? Especially for files of the same MIME type but that are not molecular
files ?
Are there some precautions to take into account ? (like letting another
handler deal with the file if it's not a molecular file, ...)
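
(A minimal registration sketch for reference — the class and file names are
made up ; registering against specific MIME types such as chemical/x-cml,
rather than text/plain, avoids the collision the question raises :)

# Sketch of a hypothetical extension setup file.
$wgAutoloadClasses['MolecularHandler'] =
    dirname( __FILE__ ) . '/MolecularHandler.php'; # subclass of MediaHandler
# Register the handler against specific MIME types rather than text/plain,
# so unrelated plain-text files are left to other handlers.
$wgMediaHandlers['chemical/x-cml'] = 'MolecularHandler';
$wgMediaHandlers['chemical/x-pdb'] = 'MolecularHandler';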


I want to use the Jmol http://www.jmol.org/ applet for displaying the
molecule in 3d, and allowing the user to manipulate it.
But the applet is about 1M in size, so it takes time to load the first time,
then to start and load the molecular file.
I would like to start showing a still image (generated on the server) and a
button to let the user decide when to load the applet, if interested.
Several questions for doing this with MediaWiki :

   - What hook / event should I use to be able to add this content in the
   File:... page ?
   - Is there a way to start displaying the File:... page, compute the still
   image in the background,and add it in the File:... page after ?
   - Are there any good practices for doing this kind of things ?


Is it also possible to create thumbnails in articles if they include links
to a molecular file (like [[File:example.pdb]]) ?
What hook should I use ?
Is it possible to compute the thumbnail in the background ?


Any other advice for writing a media handler extension ?
Or other possibilities that could enhance the extension ?
Among the few handler extensions in SVN, which is the best example ?


Thanks for any help
Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a Media handler extension for molecular files ?

2010-11-22 Thread Nicolas Vervelle
On Mon, Nov 22, 2010 at 11:57 PM, Brion Vibber br...@pobox.com wrote:

 On Mon, Nov 22, 2010 at 1:03 PM, Nicolas Vervelle nverve...@gmail.com
 wrote:

  Molecular files exist in several formats : pdb, cif, mol, xyz, cml, ...
  Usually they are detected as simple MIME types (either text/plain or
  application/xml) by MediaWiki and not as more precise types (even if this
  types exist : chemical/x-pdb, chemical/x-xyz, ...).
  It seems that to register a Media handler, I have to add an entry to
  $wgMediaHandlers[] : $wgMediaHandler['text/plain'] = 'MolecularHandler';
  Will it be a problem to use such a general MIME type to register the
  handler
  ? Especially for files of the same MIME type but that are not molecular
  files ?
 

 You'd want to make sure the type detection correctly identifies your files
 so you can associate the handler types, or it's going to make things
 confusing.

 For XML files, you should usually be able to add to the $wgXMLMimeTypes
 array, which by default recognizes the root elements for HTML, SVG, and Dia
 vector drawings -- see the entries in DefaultSettings.php as examples. It
 can recognize XML files by either bare or namespaced root element name, and
 associates the files with the given MIME type.


Oh, good, that's what I need for file types like CML (Chemical Markup
Language).
I will start working on the media handler only with this kind of files,
easier to begin with them.



 For plaintext types that aren't currently recognized I'm not 100% sure how
 best to proceed; might have to override $wgMimeTypesFile or even make some
 changes to MimeMagic.php (the class that encapsulates most of the file type
 detection).


Ok, but detection of mime type for some chemical file formats will be quite
difficult.
Some file formats are just a list of atom coordinates with bonds (each line
is simply several numbers separated by a tab).
I will take a look at $wgMimeTypesFile or MimeMagic.php after I manage to
work with XML files.





  I want to use the Jmol http://www.jmol.org/ applet for displaying the
  molecule in 3d, and allowing the user to manipulate it.
  But the applet is about 1M in size, so it takes time to load the first
  time,
  then to start and load the molecular file.
  I would like to start showing a still image (generated on the server) and
 a
  button to let the user decide when loading the applet if interested in.
  Several questions for doing this with MediaWiki :
 
- What hook / event should I use to be able to add this content in the
File:... page ?
- Is there a way to start displaying the File:... page, compute the
 still
image in the background,and add it in the File:... page after ?
- Are there any good practices for doing this kind of things ?
 

 You might want to look at OggHandler as an example. It too needs to create
 still-image thumbnails and delay loading of the actual video via Java
 applet, direct embedding, or HTML 5 video tag, and hooks various spots in
 order to do so.


Ok, thanks, I will try to understand how OggHandler works.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Configuring upload : authorizing some file extensions ?

2010-11-01 Thread Nicolas Vervelle
Hello,

I am the administrator of Jmol wiki, http://wiki.jmol.org, and I am trying
to authorize the upload of files with extensions .pdb, .mol, ... (chemistry
files).
It was working some time ago, but apparently it's not working any more.
I don't know when it stopped working (maybe when upgrading MediaWiki but not
sure at all).

Error message is "File is corrupt or the extension does not match the file
type".

We are currently using MediaWiki 1.14.0 (but I could upgrade if required,
just needs some work to change the Jmol extension to work with 1.16)

Our current configuration is :

In LocalSettings.php :
$wgEnableUploads   = true;
$wgFileExtensions[] = 'cml';
$wgFileExtensions[] = 'ico';
$wgFileExtensions[] = 'mol';
$wgFileExtensions[] = 'pdb';
$wgFileExtensions[] = 'xyz';
$wgTrustedMediaFormats[] = 'chemical/x-pdb';
$wgTrustedMediaFormats[] = 'chemical/x-xyz';

In includes/mime.types :
chemical/x-pdb pdb
chemical/x-xyz xyz

Can anyone help us to find what's going on ?
Thanks
Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Acceptable use of API

2010-09-24 Thread Nicolas Vervelle
On Fri, Sep 24, 2010 at 1:19 PM, Max Semenik maxsem.w...@gmail.com wrote:

 On 24.09.2010, 14:32 Robin wrote:

  I would like to collect data on interlanguage links for academic research
  purposes. I really do not want to use the dumps, since I would need to
  download dumps of all language Wikipedias, which would be huge.
  I have written a script which goes through the API, but I am wondering
 how
  often it is acceptable for me to query the API. Assuming I do not run
  parallel queries, do I need to wait between each query? If so, how long?

 Crawling all the Wikipedias is not an easy task either. Probably,
 toolserver.org would be more suitable. What data do you need, exactly?


Full dumps are not required for retrieving interlanguage links.
For example, the last fr dump contains a dedicated file for them :
http://download.wikimedia.org/frwiki/20100915/frwiki-20100915-langlinks.sql.gz

It will be a lot faster to download this file (only 75M) than making more
than 1 million calls to the API for the fr wiki.

Nico
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l