Re: [Wikitech-l] Improving CAPTCHA friendliness for humans, and increasing CAPTCHA difficulty for bots

2015-08-18 Thread Brian Wolff
On Tuesday, August 18, 2015, Pine W  wrote:
> what's happening with regard to improving usability for humans and
> increasing the difficulty for bots?

Generally speaking, isn't that an open problem in computer science?

--
Bawolff

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Luis Villa
On Tue, Aug 18, 2015 at 4:50 PM, Tilman Bayer  wrote:

> On Tue, Aug 18, 2015 at 3:59 PM, Luis Villa  wrote:
> > On Tue, Aug 18, 2015 at 2:06 PM, Pine W  wrote:
> >
> >> Researching the possibility of migrating all mailing lists to a newer
> >> system sounds
> >> like a good project for Community Tech
> >>
> >
> > I've been pushing to keep the team focused on things that can show a
> direct
> > impact on contribution/editing; this kind of sysadmin work really isn't
> > that?[1] May be a worthwhile clarification to add to
> > https://www.mediawiki.org/wiki/Community_Tech_team#Scope though.
> >
> > Luis
> >
> > [1] Though I do think that we should think about at least upgrading
> > mailman, and potentially switching to Google Groups or (perhaps for some
> > lists) to discourse.net.
>

To be clear, this was the very broad "we", not CE/CommTech. :/


> https://phabricator.wikimedia.org/T52864 is the task for the Mailman
> upgrade btw ("Upgrading to Version 3 will come, but it won't be soon
> and very very likely won't be this year").


Surprised the bug doesn't mention the multiple Python versions it requires.
Some bizarre choices there.

Luis


-- 
Luis Villa
Sr. Director of Community Engagement
Wikimedia Foundation
*Working towards a world in which every single human being can freely share
in the sum of all knowledge.*

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Tilman Bayer
On Tue, Aug 18, 2015 at 3:59 PM, Luis Villa  wrote:
> On Tue, Aug 18, 2015 at 2:06 PM, Pine W  wrote:
>
>> Researching the possibility of migrating all mailing lists to a newer
>> system sounds
>> like a good project for Community Tech
>>
>
> I've been pushing to keep the team focused on things that can show a direct
> impact on contribution/editing; this kind of sysadmin work really isn't
> that?[1] May be a worthwhile clarification to add to
> https://www.mediawiki.org/wiki/Community_Tech_team#Scope though.
>
> Luis
>
> [1] Though I do think that we should think about at least upgrading
> mailman, and potentially switching to Google Groups or (perhaps for some
> lists) to discourse.net.
>
https://phabricator.wikimedia.org/T52864 is the task for the Mailman
upgrade btw ("Upgrading to Version 3 will come, but it won't be soon
and very very likely won't be this year").


-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Luis Villa
On Tue, Aug 18, 2015 at 2:06 PM, Pine W  wrote:

> Researching the possibility of migrating all mailing lists to a newer
> system sounds
> like a good project for Community Tech
>

I've been pushing to keep the team focused on things that can show a direct
impact on contribution/editing; this kind of sysadmin work really isn't
that?[1] May be a worthwhile clarification to add to
https://www.mediawiki.org/wiki/Community_Tech_team#Scope though.

Luis

[1] Though I do think that we should think about at least upgrading
mailman, and potentially switching to Google Groups or (perhaps for some
lists) to discourse.net.

-- 
Luis Villa
Sr. Director of Community Engagement
Wikimedia Foundation
*Working towards a world in which every single human being can freely share
in the sum of all knowledge.*

Re: [Wikitech-l] oldimage naming convention

2015-08-18 Thread Brion Vibber
I have the impression that was an old bug which got fixed sometime in the
last couple years -- it was accidentally using the current time instead of
the original upload time. But there will of course be thousands of existing
old-version files with the "wrong" prefixes stuck on their filenames...

-- brion

On Tue, Aug 18, 2015 at 3:13 PM, Daren Welsh  wrote:

> In the version history of an image (or any attached file in MediaWiki), the
> page displays "Date/Time" with a link to that version. The timestamp
> displayed is the upload timestamp of that version. If you look closely, you
> can see that the real filename includes a different timestamp. This turns
> out to be the timestamp of when that file was superseded by a subsequent
> version.
>
> I have looked in the database tables and can see that in the oldimage
> table, each row has an "oi_archive_name" with the timestamp of when that
> version was superseded and an "oi_timestamp" of when that version was
> actually uploaded.
>
> Is there a reason to name the old versions of the files with the
> superseding timestamp instead of the upload timestamp? It seems to me that
> the timestamp of when that version was uploaded is more relevant.
>
> Daren
>
>
> --
> __
> http://enterprisemediawiki.org
> http://mixcloud.com/darenwelsh
> http://www.beatportfolio.com

[Wikitech-l] oldimage naming convention

2015-08-18 Thread Daren Welsh
In the version history of an image (or any attached file in MediaWiki), the
page displays "Date/Time" with a link to that version. The timestamp
displayed is the upload timestamp of that version. If you look closely, you
can see that the real filename includes a different timestamp. This turns
out to be the timestamp of when that file was superseded by a subsequent
version.

I have looked in the database tables and can see that in the oldimage
table, each row has an "oi_archive_name" with the timestamp of when that
version was superseded and an "oi_timestamp" of when that version was
actually uploaded.

Is there a reason to name the old versions of the files with the
superseding timestamp instead of the upload timestamp? It seems to me that
the timestamp of when that version was uploaded is more relevant.
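Both timestamps are visible through the API, for anyone who wants to see the
mismatch on their own wiki. A quick sketch (the file title is a placeholder):

    import requests

    # Compare oi_timestamp (the upload time) with the timestamp embedded
    # in oi_archive_name for every old version of a file.
    r = requests.get("https://commons.wikimedia.org/w/api.php", params={
        "action": "query",
        "titles": "File:Example.jpg",   # placeholder title
        "prop": "imageinfo",
        "iiprop": "timestamp|archivename",
        "iilimit": "max",
        "format": "json",
    })
    page = next(iter(r.json()["query"]["pages"].values()))
    for info in page.get("imageinfo", []):
        print(info["timestamp"], info.get("archivename"))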

Daren


-- 
__
http://enterprisemediawiki.org
http://mixcloud.com/darenwelsh
http://www.beatportfolio.com

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Ryan Kaldari
On Tue, Aug 18, 2015 at 2:06 PM, Pine W  wrote:

> 1. I was thinking of a tool that would let users input a variety of ways
> of referring to the retracted articles, such as DOI numbers (Peaceray is an
> expert in these). The tool would accept multiple inputs simultaneously,
> such as all 64 articles that were retracted in a batch. The tool would
> return to the user a list of all articles in which those references are
> used as citations, and highlight the paragraphs of the article where the
> citations are used. This would, I hope, greatly improve the efficiency of
> the workflow for dealing with retracted journal articles.
>

Sounds like a reasonable proposal, although I have to wonder whether the time
spent building and maintaining this tool would be more or less than the time
it would save editors searching for retracted journal articles.


> 2. I'm not clear on where I should list a new idea. The list of ideas in
> Community Tech team/All Our Ideas/Process is based on a survey that has
> already been completed. Is there a
> Phabricator workboard that would be appropriate for listing a new idea such
> as this?
>

Community Tech is currently only accepting new tasks related to the All Our
Ideas survey results (
https://www.mediawiki.org/wiki/Community_Tech_team/All_Our_Ideas). We will
be opening up a new survey next month though (
https://www.mediawiki.org/wiki/Community_Tech_team/Community_Wishlist_Survey/Process).
In the meantime, you can post the idea at
https://meta.wikimedia.org/wiki/Community_Tech_project_ideas to get more
input on it. More details about all of this will hopefully be announced
next week.


> 3. I would prefer to have everyone using the same system, which is
> lists.wikimedia.org. It makes sense to me that everyone might migrate
> eventually to a newer system. I suggest avoiding fragmentation. Researching
> the possibility of migrating all mailing lists to a newer system sounds
> like a good project for Community Tech and I could propose that in
> Phabricator as well if there's a good place to do so.
>

That's a pretty good point. I'll request to have the mailing list moved to
lists.wikimedia.org.

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Pine W
1. I was thinking of a tool that would let users input a variety of ways of
referring to the retracted articles, such as DOI numbers (Peaceray is an
expert in these). The tool would accept multiple inputs simultaneously,
such as all 64 articles that were retracted in a batch. The tool would
return to the user a list of all articles in which those references are
used as citations, and highlight the paragraphs of the article where the
citations are used. This would, I hope, greatly improve the efficiency of
the workflow for dealing with retracted journal articles (see the sketch
below).

2. I'm not clear on where I should list a new idea. The list of ideas in
Community Tech team/All Our Ideas/Process is based on a survey that has
already been completed. Is there a
Phabricator workboard that would be appropriate for listing a new idea such
as this?

3. I would prefer to have everyone using the same system, which is
lists.wikimedia.org. It makes sense to me that everyone might migrate
eventually to a newer system. I suggest avoiding fragmentation. Researching
the possibility of migrating all mailing lists to a newer system sounds
like a good project for Community Tech and I could propose that in
Phabricator as well if there's a good place to do so.
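To make idea 1 concrete, here is a minimal sketch of the lookup step,
assuming only the public search API and CirrusSearch's insource: operator;
the DOIs are made-up placeholders, and a real tool would loop over every
language edition rather than just English Wikipedia:

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    DOIS = ["10.1234/example.one", "10.1234/example.two"]  # placeholders

    for doi in DOIS:
        # Full-text search for the DOI string anywhere in article wikitext.
        r = requests.get(API, params={
            "action": "query",
            "list": "search",
            "srsearch": 'insource:"{}"'.format(doi),
            "srlimit": "max",
            "format": "json",
        })
        titles = [hit["title"] for hit in r.json()["query"]["search"]]
        print(doi, "->", titles)

Highlighting the paragraphs that cite each article would take a second pass
over the wikitext of the returned titles, but the article list alone covers
the batch-input part of the idea.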

Thanks,

Pine


On Tue, Aug 18, 2015 at 1:52 PM, Ryan Kaldari 
wrote:

> On Tue, Aug 18, 2015 at 1:22 PM, Pine W  wrote:
>
>> Thanks for the info, Tilman.
>>
>> I ended up looking at the Community Tech page on MediaWiki, which says
>> that their scope of work includes "Building article curation and monitoring
>> tools for WikiProjects", so the kind of tools that we're discussing here
>> seem to be within their scope.
>>
>
> This project sounds like a good idea, but I don't really understand how it
> would work as a tool. There's no API for retracted journal articles. It
> seems like the best way to handle it would be when you find out about a
> retracted journal article to just search Wikipedia for the title of the
> article. What would a tool for this look like and how would it be more
> efficient than just searching?
>
>
>> Ryan, you seem to be the lead communicator for the group. Can you add
>> these tools to the list of projects that are in the Community Tech backlog?
>>
>
> See
> https://www.mediawiki.org/wiki/Community_Tech_team#Work_input_and_prioritization_process
>
>
>> Also, can you clarify why Community Tech is using Google Groups for its
>> mailing list instead of lists.wikimedia.org?
>>
>
> > That's what WMF Office IT recommended (probably because its interface
> wasn't developed in 1999). Do you think it should be on
> lists.wikimedia.org instead? Personally, it doesn't matter to me.
>
>
>> Thanks,
>>
>> Pine
>>
>>
>> On Tue, Aug 18, 2015 at 12:56 PM, Tilman Bayer 
>> wrote:
>>
>>> Related discussion from 2012:
>>>
>>> https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Medicine/Archive_26#Creating_a_bot_to_search_Wikipedia_for_retracted_papers
>>> (afaics it resulted in the creation of the {{retracted}} template, but
>>> no bot)
>>>
>>> The Community Tech team has its own mailing list now btw
>>> (https://groups.google.com/a/wikimedia.org/forum/#!forum/community-tech
>>> ).
>>>
>>> On Tue, Aug 18, 2015 at 2:42 AM, Pine W  wrote:
> >>> > Is there any easy way to find all citations of specified academic
>>> > articles on Wikipedias in all languages, and the text that is
>>> supported by
>>> > those references, so that the citations of questionable articles can be
>>> > removed and the article texts can be quickly reviewed for possible
>>> changes
>>> > or removal?
>>> >
>>> > See
>>> >
>>> https://www.washingtonpost.com/news/morning-mix/wp/2015/08/18/outbreak-of-fake-peer-reviews-widens-as-major-publisher-retracts-64-scientific-papers/?tid=hp_mm
>>> >
>>> > If we don't have easy ways to deal with this (and I believe that we
>>> don't),
>>> > I'd like to suggest that the Community Tech team work on tools to help
>>> when
>>> > these situations happen.
>>> >
>>> > Thanks,
>>> >
>>> > Pine
>>>
>>>
>>>
>>> --
>>> Tilman Bayer
>>> Senior Analyst
>>> Wikimedia Foundation
>>> IRC (Freenode): HaeB
>>>
>>
>>
>>
>

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Stas Malyshev
Hi!

> This project sounds like a good idea, but I don't really understand how it
> would work as a tool. There's no API for retracted journal articles. It
> seems like the best way to handle it would be when you find out about a
> retracted journal article to just search Wikipedia for the title of the
> article. What would a tool for this look like and how would it be more
> efficient than just searching?

I think maybe DOI
(https://en.wikipedia.org/wiki/Digital_object_identifier) might be
useful there, as many article references are cited by DOI (as an
identifiable template parameter), and it may be a more precise indicator
than just the name. I'm not sure which way is best to search for it, though.
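For what it's worth, CirrusSearch's insource: operator can already express
both variants by hand, so a tool could simply generate queries like these
(the DOI is a made-up placeholder; the regex form is slower but targets the
|doi= template parameter specifically):

    queries = [
        'insource:"10.1234/example.one"',   # exact string anywhere in wikitext
        r'insource:/doi *= *10\.1234/',     # regex on the |doi= template parameter
    ]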

-- 
Stas Malyshev
smalys...@wikimedia.org


Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Ryan Kaldari
On Tue, Aug 18, 2015 at 1:22 PM, Pine W  wrote:

> Thanks for the info, Tilman.
>
> I ended up looking at the Community Tech page on MediaWiki, which says
> that their scope of work includes "Building article curation and monitoring
> tools for WikiProjects", so the kind of tools that we're discussing here
> seem to be within their scope.
>

This project sounds like a good idea, but I don't really understand how it
would work as a tool. There's no API for retracted journal articles. It
seems like the best way to handle it would be when you find out about a
retracted journal article to just search Wikipedia for the title of the
article. What would a tool for this look like and how would it be more
efficient than just searching?


> Ryan, you seem to be the lead communicator for the group. Can you add
> these tools to the list of projects that are in the Community Tech backlog?
>

See
https://www.mediawiki.org/wiki/Community_Tech_team#Work_input_and_prioritization_process


> Also, can you clarify why Community Tech is using Google Groups for its
> mailing list instead of lists.wikimedia.org?
>

That's what WMF Office IT recommended (probably because its interface
wasn't developed in 1999). Do you think it should be on lists.wikimedia.org
instead? Personally, it doesn't matter to me.


> Thanks,
>
> Pine
>
>
> On Tue, Aug 18, 2015 at 12:56 PM, Tilman Bayer 
> wrote:
>
>> Related discussion from 2012:
>>
>> https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Medicine/Archive_26#Creating_a_bot_to_search_Wikipedia_for_retracted_papers
>> (afaics it resulted in the creation of the {{retracted}} template, but
>> no bot)
>>
>> The Community Tech team has its own mailing list now btw
>> (https://groups.google.com/a/wikimedia.org/forum/#!forum/community-tech
>> ).
>>
>> On Tue, Aug 18, 2015 at 2:42 AM, Pine W  wrote:
>> > Is there any easy way to find all citations of specified academic
>> > articles on Wikipedias in all languages, and the text that is supported
>> by
>> > those references, so that the citations of questionable articles can be
>> > removed and the article texts can be quickly reviewed for possible
>> changes
>> > or removal?
>> >
>> > See
>> >
>> https://www.washingtonpost.com/news/morning-mix/wp/2015/08/18/outbreak-of-fake-peer-reviews-widens-as-major-publisher-retracts-64-scientific-papers/?tid=hp_mm
>> >
>> > If we don't have easy ways to deal with this (and I believe that we
>> don't),
>> > I'd like to suggest that the Community Tech team work on tools to help
>> when
>> > these situations happen.
>> >
>> > Thanks,
>> >
>> > Pine
>>
>>
>>
>> --
>> Tilman Bayer
>> Senior Analyst
>> Wikimedia Foundation
>> IRC (Freenode): HaeB
>>
>
>
>

[Wikitech-l] Improving CAPTCHA friendliness for humans, and increasing CAPTCHA difficulty for bots

2015-08-18 Thread Pine W
I see that there's an active workboard in Phabricator at
https://phabricator.wikimedia.org/project/board/225/ for CAPTCHA issues.

Returning to a subject that has been discussed several times before: the
last I heard is that our current CAPTCHAs do block some spambots, but they
also present problems for humans and aren't especially difficult for more
sophisticated spambots to solve. Can someone share a general update on
what's happening with regard to improving usability for humans and
increasing the difficulty for bots? I'm particularly concerned about the
former issue, since CAPTCHAs might be filtering out some good-faith human
editors.

Thanks,

Pine

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Pine W
Thanks for the info, Tilman.

I ended up looking at the Community Tech page on MediaWiki, which says that
their scope of work includes "Building article curation and monitoring
tools for WikiProjects", so the kind of tools that we're discussing here
seem to be within their scope.

Ryan, you seem to be the lead communicator for the group. Can you add these
tools to the list of projects that are in the Community Tech backlog? Also,
can you clarify why Community Tech is using Google Groups for its mailing
list instead of lists.wikimedia.org?

Thanks,

Pine


On Tue, Aug 18, 2015 at 12:56 PM, Tilman Bayer  wrote:

> Related discussion from 2012:
>
> https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Medicine/Archive_26#Creating_a_bot_to_search_Wikipedia_for_retracted_papers
> (afaics it resulted in the creation of the {{retracted}} template, but
> no bot)
>
> The Community Tech team has its own mailing list now btw
> (https://groups.google.com/a/wikimedia.org/forum/#!forum/community-tech
> ).
>
> On Tue, Aug 18, 2015 at 2:42 AM, Pine W  wrote:
> > Is there any easy way to find all citations of specified academic
> > articles on Wikipedias in all languages, and the text that is supported
> by
> > those references, so that the citations of questionable articles can be
> > removed and the article texts can be quickly reviewed for possible
> changes
> > or removal?
> >
> > See
> >
> https://www.washingtonpost.com/news/morning-mix/wp/2015/08/18/outbreak-of-fake-peer-reviews-widens-as-major-publisher-retracts-64-scientific-papers/?tid=hp_mm
> >
> > If we don't have easy ways to deal with this (and I believe that we
> don't),
> > I'd like to suggest that the Community Tech team work on tools to help
> when
> > these situations happen.
> >
> > Thanks,
> >
> > Pine
>
>
>
> --
> Tilman Bayer
> Senior Analyst
> Wikimedia Foundation
> IRC (Freenode): HaeB
>

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Tilman Bayer
Related discussion from 2012:
https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Medicine/Archive_26#Creating_a_bot_to_search_Wikipedia_for_retracted_papers
(afaics it resulted in the creation of the {{retracted}} template, but
no bot)

The Community Tech team has its own mailing list now btw
(https://groups.google.com/a/wikimedia.org/forum/#!forum/community-tech
).

On Tue, Aug 18, 2015 at 2:42 AM, Pine W  wrote:
> Is there any easy way to find all citations of specified academic
> articles on Wikipedias in all languages, and the text that is supported by
> those references, so that the citations of questionable articles can be
> removed and the article texts can be quickly reviewed for possible changes
> or removal?
>
> See
> https://www.washingtonpost.com/news/morning-mix/wp/2015/08/18/outbreak-of-fake-peer-reviews-widens-as-major-publisher-retracts-64-scientific-papers/?tid=hp_mm
>
> If we don't have easy ways to deal with this (and I believe that we don't),
> I'd like to suggest that the Community Tech team work on tools to help when
> these situations happen.
>
> Thanks,
>
> Pine
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Pine W
Interesting about the database situation. I was contemplating something
like a gadget that would be embedded on the page but be hosted on Labs.
Alternatively, the frequently updated info could be posted to Wikidata for
text and Commons for imagery so that information can be efficiently updated
across projects and languages.

Pine
On Aug 18, 2015 5:26 AM, "Erik Zachte"  wrote:

> As for realtime, I recommend caution with burdening Wikipedia with even
> more highly transient information, at least within our current database
> schema.
>
> For years Serbian Wikinews has been inundated with weather info, hourly
> (!), per city (!), and has thus amassed 3 million revisions with a handful of
> editors.
>
> All those edits have made the dump explode in size: 6 GB uncompressed for
> 77k articles with about 2k of content on average. Even if disk size doesn't
> matter, it slows batch processing, and clutters page history.
>
> Erik
>
>  -Original Message-
> From: wikitech-l-boun...@lists.wikimedia.org [mailto:
> wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Oliver Keyes
> Sent: Tuesday, August 18, 2015 13:52
> To: Wikimedia developers
> Subject: Re: [Wikitech-l] Geohack tools
>
> Discovery is currently working on a maps service, which is a first step
> towards "geo-relevant" information, but the plan is to put that on hold for
> Q2 (read: the next 3 months) while we identify a clearer use case for it.
> If you or anyone else are interested in this kind of project (or in the
> functioning of our search service) I recommend subscribing to
> https://lists.wikimedia.org/mailman/listinfo/wikimedia-search
>
> On 18 August 2015 at 06:09, Pine W  wrote:
> > Toby, is there any chance that the Reading team (or maybe Multimedia
> > or
> > Discovery?) will incorporate more interactive features or realtime
> > geo-relevant information into Wikipedia, like weather, air and
> > marine traffic, bus and train service (particularly for landmarks with
> > lots of tourists), star and satellite positions in the sky,
> > socioeconomic data maps, financial statistics, etc?
> >
> > Thanks,
> >
> > Pine
> > On Aug 13, 2015 2:22 PM, "Pine W"  wrote:
> >
> >> It's great to hear that work is progressing on this.
> >>
> >> I'd like to see more interactive features on pages, for example live
> >> air traffic, ground traffic, and marine traffic data; and weather
> conditions.
> >>
> >> Pine
> >>
> >>
> >> On Fri, Aug 7, 2015 at 8:34 PM, Yuri Astrakhan
> >> 
> >> wrote:
> >>
> >>> Pine, you are right. That list is not very useful, and instead
> >>> should look something like this:
> >>>
> >>> https://en.wikivoyage.org/wiki/Salzburg#Get_around
> >>>
> >>> The bad news is that the tile service it uses is hosted on labs,
> >>> which means it cannot scale to the regular wikipedia-usage levels.
> >>> Plus there might be a potential policy problem there - default
> >>> lab-content loading on every page visit without user's consent.
> >>>
> >>> The good news is that we are very close to launching a full-blown
> >>> WMF production-hosted tile service, based on the wonderful data from
> OSM.
> >>>
> >>> See general info and some ideas people have proposed -
> >>> https://www.mediawiki.org/wiki/Maps  (feel free to add more)
> >>>
> >>>
> >>> On Fri, Aug 7, 2015 at 11:23 PM, Gergo Tisza 
> >>> wrote:
> >>>
> >>> > On Thu, Aug 6, 2015 at 11:01 PM, Pine W  wrote:
> >>> >
> >>> > > I just now realized how powerful these tools are when I started
> >>> clicking
> >>> > > around.
> >>> > >
> >>> > >
> >>> > >
> >>> >
> >>> https://tools.wmflabs.org/geohack/geohack.php?pagename=File%3AWhite-cheeked_Starling_perching_on_a_rock.jpg&params=34.610576_N_135.540542_E_globe:Earth_class:object_&language=en
> >>> > >
> >>> > > Is there any chance of integrating some of these tools more
> >>> > > directly
> >>> onto
> >>> > > Wikipedia pages, and into mobile web/mobile apps?
> >>> > >
> >>> >
> >>> > There is an ongoing project to integrate maps into MediaWiki;
> >>> > that's probably the best place to catalog the use cases of Geohack
> >>> > and decide which can and need to be supported by the new tools.
> >>> >
> >>>
> >>
> >>
>
>
>
> --
> Oliver Keyes
> Count Logula
> Wikimedia Foundation
>

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Subramanya Sastry

On 08/18/2015 07:58 AM, MZMcBride wrote:

Subramanya Sastry wrote:

* Unclosed HTML tags (very common)
* Misnested tags (ex: links in links .. [http://foo.bar this is a
[[foobar]] company])
* Fostered content in tables (this-content-will-show-up-outside-the-table)
... this has been one of the biggest sources of complexity inside Parsoid
... in combination with templates, this is nasty.
* Other ways in which the HTML5 content model might be violated. (ex:
\n*a\n*b\n)
* Look at the parser tests file and see all the tests we've added with
annotations that say "php parser relies on tidy"

I don't see why we would want to incur the maintenance cost of continuing
to support any of these bad inputs. I think we should look to deprecate,
not replace, Tidy. This is a case of the cure being worse than the disease.


Are you suggesting that we get rid of wikitext editing? If not, you
cannot assume editors are going to write perfect markup.


What is needed is a way to define DOM scopes in wikitext and enforce
well-formedness within scopes. So, for example, template output can be
considered a DOM scope (either opt-in or opt-out). If we felt bold, we
could define a list to be a DOM scope, or a table, or an image caption,
and so on.


Rather than expect editors to write perfect markup, we should be
thinking about sane semantics for them, like scoping, that delimit the
effects of broken markup. With proper semantics, it is easier to reason
about markup and not rely on the whimsical behavior of whatever tool we
used yesterday, use today, or might use tomorrow.


We are working towards this kind of scoping semantics, and the first
step on the way is to get an HTML5 treebuilder/parser in place.
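As a minimal sketch of what parse/reserialize buys us, here is Python's
html5lib standing in for whichever HTML5 parser ends up being used (an
illustration only, not the actual implementation):

    import html5lib  # reference implementation of the HTML5 parsing algorithm

    broken = "<i>unclosed italics <b>misnested</i> bold</b> trailing text"
    fragment = html5lib.parseFragment(broken, treebuilder="etree")
    fixed = html5lib.serialize(fragment, tree="etree", omit_optional_tags=False)

    # The HTML5 tree builder closes and re-opens the mismatched tags (the
    # "adoption agency" algorithm), so the reserialized output is well formed.
    print(fixed)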


Subbu.


Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Mr. Stradivarius
On Tue, Aug 18, 2015 at 11:48 PM, Derk-Jan Hartman <
d.j.hartman+wmf...@gmail.com> wrote:

> If we want to do away with Tidy, we will have to make all editors perfect
> html authors
>

In my experience, mismatched tags are quite often used on purpose. For
example, Cyberpower678 has two unmatched div tags at the end of his
StandardLayout template, used to put a shaded border round the posts on his
talk page. There are no corresponding closing div tags at the end of the
talk page, as they would be moved by the talk page archive bot, and Tidy
takes care of the invalid HTML anyway.

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Bartosz Dziewoński

On Tue, 18 Aug 2015 05:15:05 +0200, MZMcBride  wrote:

The only cited example of real breakage so far has been mismatched <div>s.
How often are you or anyone else adding <div>s to pages? In my experience,
most users rely on MediaWiki templates for any kind of complex markup.

Echoing my initial reply in this thread, I still don't really understand
what behaviors from Tidy we want to keep. I've been following the
discussion a bit and it also hasn't helped answer this question.


Mismatched any tags. An opening or closing tag without a pair can wreak
havoc on the entire page, including the interface.


I recall reports of unclosed <small> or <b> tags reducing the font size
of or bolding the entire page. I can't find that one, but here's a small
collection of bugs caused by Tidy unintentionally not running in various
contexts: T27888, T29889, T40273, T44016, T60042, T60439.


You could easily engineer this to hide the tabs if you were malicious  
(making it impossible for casual users to edit the page, say, to fix the  
broken markup), and it might even be doable by accident.


We really do need this feature. Not anything else that Tidy does (most of
its behavior is actually damaging), but we do need to match the open and
close tags to prevent the interface from getting jumbled.


--
Bartosz Dziewoński


Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Derk-Jan Hartman
If we want to do away with Tidy, we will have to make all editors perfect
HTML authors, or we risk them damaging pages so much that they potentially
can't access the edit button anymore. As far as I'm concerned, this is what
Tidy does primarily: isolate errors in the content in such a way that they
cannot influence the rest of the interface of the website. And yes, I do
regularly see such problems in MediaWiki instances that do not run Tidy.

Rule one of security: always have multiple layers of defense. Yes, we should
reduce the number of problems and make them more visible, but that doesn't
mean we don't still need a correctional method as a fallback.

DJ

On Tue, Aug 18, 2015 at 3:04 PM, David Gerard  wrote:

> On 18 August 2015 at 04:15, MZMcBride  wrote:
> > Brian Wolff wrote:
>
> >>I don't know about that. Viz editor is targeting ordinary tasks. It's the
> >>complex things that mess stuff up.
>
> > In most contexts, solving the ordinary/common cases is a pretty big win.
>
>
> Or when it turns a complex task into a simple one, e.g. table editing
> (one click to remove a column).
>
>
> - d.
>
>

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread David Gerard
On 18 August 2015 at 04:15, MZMcBride  wrote:
> Brian Wolff wrote:

>>I don't know about that. Viz editor is targeting ordinary tasks. It's the
>>complex things that mess stuff up.

> In most contexts, solving the ordinary/common cases is a pretty big win.


Or when it turns a complex task into a simple one, e.g. table editing
(one click to remove a column).


- d.


Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread MZMcBride
Subramanya Sastry wrote:
>* Unclosed HTML tags (very common)
>* Misnested tags (ex: links in links .. [http://foo.bar this is a
>[[foobar]] company])
>* Fostered content in tables (this-content-will-show-up-outside-the-table)
>... this has been one of the biggest sources of complexity inside Parsoid
>... in combination with templates, this is nasty.
>* Other ways in which the HTML5 content model might be violated. (ex:
>\n*a\n*b\n)
>* Look at the parser tests file and see all the tests we've added with
>annotations that say "php parser relies on tidy"

I don't see why we would want to incur the maintenance cost of continuing
to support any of these bad inputs. I think we should look to deprecate,
not replace, Tidy. This is a case of the cure being worse than the disease.

>So, you cannot just rip out Tidy and not replace it with something in
>its place. Even replacing it with a HTML5 parser (as per the current
>plan) is not entirely straightforward simply because of all the other
>unrelated-to-html5-semantics behavior. Part of the task of replacing
>Tidy is to figure out all the ways those pages might break and the best
>way to handle that breakage.

We shouldn't rip out Tidy immediately; we should implement a means of
disabling Tidy on a per-page or per-user basis and allow the wiki process
to correct bad markup over time. Cunningham's Law applies here.

MZMcBride




Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Erik Zachte
As for realtime, I recommend caution with burdening Wikipedia with even more
highly transient information, at least within our current database schema.

For years Serbian Wikinews has been inundated with weather info, hourly (!),
per city (!), and has thus amassed 3 million revisions with a handful of editors.

All those edits have made the dump explode in size: 6 GB uncompressed for 77k
articles with about 2k of content on average. Even if disk size doesn't matter,
it slows batch processing, and clutters page history.
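The arithmetic is easy to check, since the full-history dump stores the
complete page text for every revision (a back-of-the-envelope sketch using
the rough figures above):

    revisions = 3 * 10**6   # hourly weather edits, accumulated over years
    avg_page_bytes = 2000   # ~2k of content, stored in full per revision

    print("%.0f GB uncompressed" % (revisions * avg_page_bytes / 10**9))  # -> 6 GB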

Erik

 -Original Message-
From: wikitech-l-boun...@lists.wikimedia.org 
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Oliver Keyes
Sent: Tuesday, August 18, 2015 13:52
To: Wikimedia developers
Subject: Re: [Wikitech-l] Geohack tools

Discovery is currently working on a maps service, which is a first step towards 
"geo-relevant" information, but the plan is to put that on hold for Q2 (read: 
the next 3 months) while we identify a clearer use case for it. If you or 
anyone else are interested in this kind of project (or in the functioning of 
our search service) I recommend subscribing to 
https://lists.wikimedia.org/mailman/listinfo/wikimedia-search

On 18 August 2015 at 06:09, Pine W  wrote:
> Toby, is there any chance that the Reading team (or maybe Multimedia 
> or
> Discovery?) will incorporate more interactive features or realtime
> geo-relevant information into Wikipedia, like weather, air and
> marine traffic, bus and train service (particularly for landmarks with
> lots of tourists), star and satellite positions in the sky,
> socioeconomic data maps, financial statistics, etc?
>
> Thanks,
>
> Pine
> On Aug 13, 2015 2:22 PM, "Pine W"  wrote:
>
>> It's great to hear that work is progressing on this.
>>
>> I'd like to see more interactive features on pages, for example live 
>> air traffic, ground traffic, and marine traffic data; and weather conditions.
>>
>> Pine
>>
>>
>> On Fri, Aug 7, 2015 at 8:34 PM, Yuri Astrakhan 
>> 
>> wrote:
>>
>>> Pine, you are right. That list is not very useful, and instead 
>>> should look something like this:
>>>
>>> https://en.wikivoyage.org/wiki/Salzburg#Get_around
>>>
>>> The bad news is that the tile service it uses is hosted on labs, 
>>> which means it cannot scale to the regular wikipedia-usage levels. 
>>> Plus there might be a potential policy problem there - default 
>>> lab-content loading on every page visit without user's consent.
>>>
>>> The good news is that we are very close to launching a full-blown 
>>> WMF production-hosted tile service, based on the wonderful data from OSM.
>>>
>>> See general info and some ideas people have proposed - 
>>> https://www.mediawiki.org/wiki/Maps  (feel free to add more)
>>>
>>>
>>> On Fri, Aug 7, 2015 at 11:23 PM, Gergo Tisza 
>>> wrote:
>>>
>>> > On Thu, Aug 6, 2015 at 11:01 PM, Pine W  wrote:
>>> >
>>> > > I just now realized how powerful these tools are when I started
>>> clicking
>>> > > around.
>>> > >
>>> > >
>>> > >
>>> >
>>> https://tools.wmflabs.org/geohack/geohack.php?pagename=File%3AWhite-cheeked_Starling_perching_on_a_rock.jpg&params=34.610576_N_135.540542_E_globe:Earth_class:object_&language=en
>>> > >
>>> > > Is there any chance of integrating some of these tools more 
>>> > > directly
>>> onto
>>> > > Wikipedia pages, and into mobile web/mobile apps?
>>> > >
>>> >
>>> > There is an ongoing project to integrate maps into MediaWiki; that's
>>> > probably the best place to catalog the use cases of Geohack and decide
>>> > which can and need to be supported by the new tools.
>>> >
>>>
>>
>>



--
Oliver Keyes
Count Logula
Wikimedia Foundation


Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Oliver Keyes
Discovery is currently working on a maps service, which is a first
step towards "geo-relevant" information, but the plan is to put that
on hold for Q2 (read: the next 3 months) while we identify a clearer
use case for it. If you or anyone else are interested in this kind of
project (or in the functioning of our search service) I recommend
subscribing to https://lists.wikimedia.org/mailman/listinfo/wikimedia-search

On 18 August 2015 at 06:09, Pine W  wrote:
> Toby, is there any chance that the Reading team (or maybe Multimedia or
> Discovery?) will incorporate more interactive features or realtime
> geo-relevant information into Wikipedia, like weather, air and marine
> traffic, bus and train service (particularly for landmarks with lots of
> tourists), star and satellite positions in the sky, socioeconomic data
> maps, financial statistics, etc?
>
> Thanks,
>
> Pine
> On Aug 13, 2015 2:22 PM, "Pine W"  wrote:
>
>> It's great to hear that work is progressing on this.
>>
>> I'd like to see more interactive features on pages, for example live air
>> traffic, ground traffic, and marine traffic data; and weather conditions.
>>
>> Pine
>>
>>
>> On Fri, Aug 7, 2015 at 8:34 PM, Yuri Astrakhan 
>> wrote:
>>
>>> Pine, you are right. That list is not very useful, and instead should look
>>> something like this:
>>>
>>> https://en.wikivoyage.org/wiki/Salzburg#Get_around
>>>
>>> The bad news is that the tile service it uses is hosted on labs, which
>>> means it cannot scale to the regular wikipedia-usage levels. Plus there
>>> might be a potential policy problem there - default lab-content loading on
>>> every page visit without user's consent.
>>>
>>> The good news is that we are very close to launching a full-blown WMF
>>> production-hosted tile service, based on the wonderful data from OSM.
>>>
>>> See general info and some ideas people have proposed -
>>> https://www.mediawiki.org/wiki/Maps  (feel free to add more)
>>>
>>>
>>> On Fri, Aug 7, 2015 at 11:23 PM, Gergo Tisza 
>>> wrote:
>>>
>>> > On Thu, Aug 6, 2015 at 11:01 PM, Pine W  wrote:
>>> >
>>> > > I just now realized how powerful these tools are when I started
>>> clicking
>>> > > around.
>>> > >
>>> > >
>>> > >
>>> >
>>> https://tools.wmflabs.org/geohack/geohack.php?pagename=File%3AWhite-cheeked_Starling_perching_on_a_rock.jpg&params=34.610576_N_135.540542_E_globe:Earth_class:object_&language=en
>>> > >
>>> > > Is there any chance of integrating some of these tools more directly
>>> onto
>>> > > Wikipedia pages, and into mobile web/mobile apps?
>>> > >
>>> >
>>> > There is an ongoing project to integrate maps into MediaWiki; that's
>>> > probably the best place to catalog the use cases of Geohack and decide
>>> > which can and need to be supported by the new tools.
>>> >
>>>
>>
>>



-- 
Oliver Keyes
Count Logula
Wikimedia Foundation


Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Pine W
Toby, is there any chance that the Reading team (or maybe Multimedia or
Discovery?) will incorporate more interactive features or realtime
geo-relevant information into Wikipedia, like weather, air and marine
traffic, bus and train service (particularly for landmarks with lots of
tourists), star and satellite positions in the sky, socioeconomic data
maps, financial statistics, etc?

Thanks,

Pine
On Aug 13, 2015 2:22 PM, "Pine W"  wrote:

> It's great to hear that work is progressing on this.
>
> I'd like to see more interactive features on pages, for example live air
> traffic, ground traffic, and marine traffic data; and weather conditions.
>
> Pine
>
>
> On Fri, Aug 7, 2015 at 8:34 PM, Yuri Astrakhan 
> wrote:
>
>> Pine, you are right. That list is not very useful, and instead should look
>> something like this:
>>
>> https://en.wikivoyage.org/wiki/Salzburg#Get_around
>>
>> The bad news is that the tile service it uses is hosted on labs, which
>> means it cannot scale to the regular wikipedia-usage levels. Plus there
>> might be a potential policy problem there - default lab-content loading on
>> every page visit without user's consent.
>>
>> The good news is that we are very close to launching a full-blown WMF
>> production-hosted tile service, based on the wonderful data from OSM.
>>
>> See general info and some ideas people have proposed -
>> https://www.mediawiki.org/wiki/Maps  (feel free to add more)
>>
>>
>> On Fri, Aug 7, 2015 at 11:23 PM, Gergo Tisza 
>> wrote:
>>
>> > On Thu, Aug 6, 2015 at 11:01 PM, Pine W  wrote:
>> >
>> > > I just now realized how powerful these tools are when I started
>> clicking
>> > > around.
>> > >
>> > >
>> > >
>> >
>> https://tools.wmflabs.org/geohack/geohack.php?pagename=File%3AWhite-cheeked_Starling_perching_on_a_rock.jpg&params=34.610576_N_135.540542_E_globe:Earth_class:object_&language=en
>> > >
>> > > Is there any chance of integrating some of these tools more directly
>> onto
>> > > Wikipedia pages, and into mobile web/mobile apps?
>> > >
>> >
>> > There is an ongoing project to integrate maps into MediaWiki; that's
>> > probably the best place to catalog the use cases of Geohack and decide
>> > which can and need to be supported by the new tools.
>> >
>>
>
>

[Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Pine W
Is there any easy way to find all citations of specified academic
articles on Wikipedias in all languages, and the text that is supported by
those references, so that the citations of questionable articles can be
removed and the article texts can be quickly reviewed for possible changes
or removal?

See
https://www.washingtonpost.com/news/morning-mix/wp/2015/08/18/outbreak-of-fake-peer-reviews-widens-as-major-publisher-retracts-64-scientific-papers/?tid=hp_mm

If we don't have easy ways to deal with this (and I believe that we don't),
I'd like to suggest that the Community Tech team work on tools to help when
these situations happen.

Thanks,

Pine

Re: [Wikitech-l] How to serve up subtitles

2015-08-18 Thread Derk-Jan Hartman
Well, we already have a namespace of course, and indeed I was already
considering converting that to a ContentModel. The only thing that is
somewhat problematic here is the multiple-languages problem. Currently each
language gets its own page and title. Do we switch the model to host all
subtitles in the same 'page'? That will require a lot of UI and flags etc.

Current system:
https://commons.wikimedia.org/wiki/File:Folgers.ogv
https://commons.wikimedia.org/wiki/TimedText:Folgers.ogv
https://commons.wikimedia.org/wiki/TimedText:Folgers.ogv.en.srt
https://commons.wikimedia.org/wiki/TimedText:Folgers.ogv.de.srt
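As a rough sketch of the serve/convert idea from the quoted message below:
the core of an SRT-to-VTT conversion is only a header plus a timestamp tweak
(this ignores styling, cue settings, and other edge cases):

    def srt_to_vtt(srt_text):
        # WebVTT is close to SRT: add the magic header and swap the decimal
        # comma in cue timings ("00:00:01,600 --> ...") for a dot.
        out = ["WEBVTT", ""]
        for line in srt_text.splitlines():
            if "-->" in line:
                line = line.replace(",", ".")
            out.append(line)
        return "\n".join(out)

The real endpoint would of course also need caching and content-type
negotiation, whichever of the quoted options ends up hosting it.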

DJ


On Tue, Aug 18, 2015 at 9:04 AM, Ori Livneh  wrote:

> On Mon, Aug 17, 2015 at 5:56 AM, Derk-Jan Hartman <
> d.j.hartman+wmf...@gmail.com> wrote:
>
> > As part of Brion's and my struggle for better A/V support on
> Wikipedia, I
> > have concluded that our current support for subtitles is rather...
> > improvised.
> >
> > Currently all our SRT files are referenced from HTML using action=raw.
> But
> > then not actually used from action=raw, but instead served up as
> > semi-HTML using api.php. Which is ridiculous...
> > If we want to move to more HTML5 compliance, we also will want to switch
> > from the SRT format to the VTT format.
> >
> > Ideally, I want to host multiple subtitle formats, and dynamically
> > serve/convert them as either SRT or VTT. These can be directly referenced
> > from a <track> element so that we are fully compatible.
> >
> > The question is now, how to best do this. The endpoint needs to be
> dynamic,
> > cacheable, allow multiple content types etc.
> >
> > Ideas suggested have been:
> > * Api.php
> > * Restbase
> > * New endpoint
> > * ResourceLoader modules
> >
> > I'm listing the current problems, future requirements and discussing
> > several ideas at:
> >
> >
> https://www.mediawiki.org/wiki/Extension:TimedMediaHandler/TimedTextRework
> > If you have any ideas or remarks, please contribute them !
> >
>
> I propose adding an additional associated namespace (like Talk:), except
> for subtitles. The namespace will be associated with File pages which
> represent videos, and it will be coupled to a ContentHandler class
> representing subtitle content. The ContentHandler class will be a natural
> place for validation logic and the specification of an alternate editing
> interface suitable for editing subtitles.

Re: [Wikitech-l] How to serve up subtitles

2015-08-18 Thread Ori Livneh
On Mon, Aug 17, 2015 at 5:56 AM, Derk-Jan Hartman <
d.j.hartman+wmf...@gmail.com> wrote:

> As part of Brion's and my struggle for better A/V support on Wikipedia, I
> have concluded that our current support for subtitles is rather...
> improvised.
>
> Currently all our SRT files are referenced from HTML using action=raw. But
> then not actually used from action=raw, but instead served up as semi-HTML
> using api.php. Which is ridiculous...
> If we want to move to more HTML5 compliance, we also will want to switch
> from the SRT format to the VTT format.
>
> Ideally, I want to host multiple subtitle formats, and dynamically
> serve/convert them as either SRT or VTT. These can be directly referenced
> from a <track> element so that we are fully compatible.
>
> The question is now, how to best do this. The endpoint needs to be dynamic,
> cacheable, allow multiple content types etc.
>
> Ideas suggested have been:
> * Api.php
> * Restbase
> * New endpoint
> * ResourceLoader modules
>
> I'm listing the current problems, future requirements and discussing
> several ideas at:
>
> https://www.mediawiki.org/wiki/Extension:TimedMediaHandler/TimedTextRework
> If you have any ideas or remarks, please contribute them !
>

I propose adding an additional associated namespace (like Talk:), except
for subtitles. The namespace will be associated with File pages which
represent videos, and it will be coupled to a ContentHandler class
representing subtitle content. The ContentHandler class will be a natural
place for validation logic and the specification of an alternate editing
interface suitable for editing subtitles.