Re: [Wikitech-l] Improving CAPTCHA friendliness for humans, and increasing CAPTCHA difficulty for bots

2015-08-18 Thread Brian Wolff
On Tuesday, August 18, 2015, Pine W wrote: > what's happening with regard to improving usability for humans and > increasing the difficulty for bots? Generally speaking, isn't that an open problem in computer science? -- Bawolff

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Luis Villa
On Tue, Aug 18, 2015 at 4:50 PM, Tilman Bayer wrote: > On Tue, Aug 18, 2015 at 3:59 PM, Luis Villa wrote: > > On Tue, Aug 18, 2015 at 2:06 PM, Pine W wrote: > > > >> Researching the possibility of migrating all mailing lists to a newer > >> system sounds > >> like a good project for Community T

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Tilman Bayer
On Tue, Aug 18, 2015 at 3:59 PM, Luis Villa wrote: > On Tue, Aug 18, 2015 at 2:06 PM, Pine W wrote: > >> Researching the possibility of migrating all mailing lists to a newer >> system sounds >> like a good project for Community Tech >> > > I've been pushing to keep the team focused on things tha

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Luis Villa
On Tue, Aug 18, 2015 at 2:06 PM, Pine W wrote: > Researching the possibility of migrating all mailing lists to a newer > system sounds > like a good project for Community Tech > I've been pushing to keep the team focused on things that can show a direct impact on contribution/editing; this kind

Re: [Wikitech-l] oldimage naming convention

2015-08-18 Thread Brion Vibber
I have the impression that was an old bug which got fixed sometime in the last couple years -- it was accidentally using the current time instead of the original upload time. But there will of course be thousands of existing old-version files with the "wrong" prefixes stuck on their filenames... -

[Wikitech-l] oldimage naming convention

2015-08-18 Thread Daren Welsh
In the version history of an image (or any attached file in MediaWiki), the page displays "Date/Time" with a link to that version. The timestamp displayed is the upload timestamp of that version. If you look closely, you can see that the real filename includes a different timestamp. This turns out
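A minimal sketch of how the timestamp prefix on an old-version filename could be compared with the recorded upload time. It assumes the archived name has the form `YYYYMMDDHHMMSS!Title.ext` and that both timestamps are in UTC; the function names are hypothetical and are not part of MediaWiki.

```python
from datetime import datetime, timezone


def parse_archive_name(archive_name):
    """Split an assumed 'YYYYMMDDHHMMSS!Title.ext' name into (timestamp, title)."""
    prefix, sep, title = archive_name.partition("!")
    if not sep or len(prefix) != 14:
        raise ValueError(f"not an archive name: {archive_name!r}")
    # strptime raises ValueError if the prefix is not a valid timestamp.
    stamp = datetime.strptime(prefix, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return stamp, title


def prefix_matches_upload(archive_name, upload_timestamp):
    """Return True if the filename prefix equals the recorded upload time."""
    stamp, _ = parse_archive_name(archive_name)
    return stamp == upload_timestamp


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    name = "20150818120000!Example.png"
    recorded = datetime(2015, 8, 17, 9, 30, 0, tzinfo=timezone.utc)
    print(parse_archive_name(name))
    print(prefix_matches_upload(name, recorded))
```

A check like this would only flag files whose prefix differs from the upload time; it says nothing about which of the two values is the intended one.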

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Ryan Kaldari
On Tue, Aug 18, 2015 at 2:06 PM, Pine W wrote: > 1. I was thinking of a tool that would let users input a variety of ways > of referring to the retracted articles, such as DOI numbers (Peaceray is an > expert in these). The tool would accept multiple inputs simultaneously, > such as all 64 articl

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Pine W
1. I was thinking of a tool that would let users input a variety of ways of referring to the retracted articles, such as DOI numbers (Peaceray is an expert in these). The tool would accept multiple inputs simultaneously, such as all 64 articles that were retracted in a batch. The tool would return
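The batch-input idea above could be sketched roughly as follows. This only builds MediaWiki search API URLs for a list of DOIs across several language editions; it does not send any requests. The DOIs shown are placeholders, the function names are hypothetical, and the use of the `insource:` search keyword assumes the wiki runs CirrusSearch.

```python
from urllib.parse import urlencode


def build_search_url(doi, lang="en"):
    """Build a search API URL that looks for a DOI string in page source."""
    params = {
        "action": "query",
        "list": "search",
        # insource: is a CirrusSearch keyword (assumed to be available).
        "srsearch": f'insource:"{doi}"',
        "srlimit": 50,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


def build_batch(dois, langs):
    """Map every (doi, lang) pair to its search URL."""
    return {(doi, lang): build_search_url(doi, lang) for doi in dois for lang in langs}


if __name__ == "__main__":
    # Placeholder DOIs, not real retracted articles.
    for key, url in build_batch(["10.1000/example.1", "10.1000/example.2"], ["en", "de"]).items():
        print(key, url)
```

Matching on the DOI string will miss citations that give only a title or a PMID, so a real tool would need the other input types mentioned above as well.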

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Stas Malyshev
Hi! > This project sounds like a good idea, but I don't really understand how it > would work as a tool. There's no API for retracted journal articles. It > seems like the best way to handle it would be when you find out about a > retracted journal article to just search Wikipedia for the title of

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Ryan Kaldari
On Tue, Aug 18, 2015 at 1:22 PM, Pine W wrote: > Thanks for the info, Tilman. > > I ended up looking at the Community Tech page on MediaWiki, which says > that their scope of work includes "Building article curation and monitoring > tools for WikiProjects", so the kind of tools that we're discuss

[Wikitech-l] Improving CAPTCHA friendliness for humans, and increasing CAPTCHA difficulty for bots

2015-08-18 Thread Pine W
I see that there's an active workboard in Phabricator at https://phabricator.wikimedia.org/project/board/225/ for CAPTCHA issues. Returning to a subject that has been discussed several times before: the last I heard is that our current CAPTCHAs do block some spambots, but they also present problem

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Pine W
Thanks for the info, Tilman. I ended up looking at the Community Tech page on MediaWiki, which says that their scope of work includes "Building article curation and monitoring tools for WikiProjects", so the kind of tools that we're discussing here seem to be within their scope. Ryan, you seem to

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Tilman Bayer
Related discussion from 2012: https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Medicine/Archive_26#Creating_a_bot_to_search_Wikipedia_for_retracted_papers (afaics it resulted in the creation of the {{retracted}} template, but no bot) The Community Tech team has its own mailing list now btw

Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Pine W
Interesting about the database situation. I was contemplating something like a gadget that would be embedded on the page but be hosted on Labs. Alternatively, the frequently updated info could be posted to Wikidata for text and Commons for imagery so that information can be efficiently updated acro

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Subramanya Sastry
On 08/18/2015 07:58 AM, MZMcBride wrote: Subramanya Sastry wrote: * Unclosed HTML tags (very common) * Misnested tags * Misnesting of tags (ex: links in links .. [http://foo.bar this is a [[foobar]] company]) * Fostered content in tables (this-content-will-show-up-outside-the-table ) ... thi

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Mr. Stradivarius
On Tue, Aug 18, 2015 at 11:48 PM, Derk-Jan Hartman < d.j.hartman+wmf...@gmail.com> wrote: > If we want to do away with Tidy, we will have to make all editors perfect > html authors > In my experience, mismatched tags are quite often used on purpose. For example, Cyberpower678 has two unmatched di

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Bartosz Dziewoński
On Tue, 18 Aug 2015 05:15:05 +0200, MZMcBride wrote: The only cited example of real breakage so far has been mismatched s. How often are you or anyone else adding s to pages? In my experience, most users rely on MediaWiki templates for any kind of complex markup. Echoing my initial reply i

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread Derk-Jan Hartman
If we want to do away with Tidy, we will have to make all editors perfect HTML authors, or we risk them damaging pages so much that they potentially can't access the edit button anymore. As far as I'm concerned, this is what Tidy does primarily. Isolate errors in the content in such a way that it c

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread David Gerard
On 18 August 2015 at 04:15, MZMcBride wrote: > Brian Wolff wrote: >>I don't know about that. Viz editor is targeting ordinary tasks. It's the >>complex things that mess stuff up. > In most contexts, solving the ordinary/common cases is a pretty big win. Or when it turns a complex task into a sim

Re: [Wikitech-l] RFC: Replace Tidy with HTML 5 parse/reserialize

2015-08-18 Thread MZMcBride
Subramanya Sastry wrote: >* Unclosed HTML tags (very common) >* Misnested tags >* Misnesting of tags (ex: links in links .. [http://foo.bar this is a >[[foobar]] company]) >* Fostered content in tables >(this-content-will-show-up-outside-the-table >) >... this has been one of the biggest source
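To illustrate the first two categories in the list above (unclosed and misnested tags), here is a small sketch using Python's standard `html.parser`. It is not what Tidy or an HTML5 parser does: it only reports problems and makes no attempt to repair them, it works on plain HTML rather than wikitext, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not tracked.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Track open tags on a stack and record unclosed or stray tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if tag not in self.stack:
            self.problems.append(f"stray </{tag}>")
            return
        # Pop until the matching open tag; anything popped on the way
        # was left open when its ancestor closed.
        while self.stack:
            open_tag = self.stack.pop()
            if open_tag == tag:
                break
            self.problems.append(f"<{open_tag}> not closed before </{tag}>")


def check(html):
    """Return a list of human-readable tag-balance problems."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    problems = list(checker.problems)
    problems.extend(f"<{tag}> never closed" for tag in reversed(checker.stack))
    return problems


if __name__ == "__main__":
    print(check("<div><b>bold</div>"))
    print(check("<b><i>text</b></i>"))
```

A report-only checker like this could help estimate how common each category is on existing pages before any change to the cleanup step.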

Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Erik Zachte
As for realtime, I recommend caution with burdening Wikipedia with even more highly transient information, at least within our current database scheme. For years Serbian Wikinews has been inundated with weather info, hourly (!), per city (!), and thus managed 3 million revisions with a handful o

Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Oliver Keyes
Discovery is currently working on a maps service, which is a first step towards "geo-relevant" information, but the plan is to put that on hold for Q2 (read: the next 3 months) while we identify a clearer use case for it. If you or anyone else are interested in this kind of project (or in the funct

Re: [Wikitech-l] Geohack tools

2015-08-18 Thread Pine W
Toby, is there any chance that the Reading team (or maybe Multimedia or Discovery?) will incorporate more interactive features or realtime geo-relevant information into Wikipedia, such as weather, air and marine traffic, bus and train service (particularly for landmarks with lots of tourists), star

[Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-08-18 Thread Pine W
Is there any easy way to find all citations of specified academic articles on Wikipedias in all languages, and the text that is supported by those references, so that the citations of questionable articles can be removed and the article texts can be quickly reviewed for possible changes or remov

Re: [Wikitech-l] How to serve up subtitles

2015-08-18 Thread Derk-Jan Hartman
Well, we already have a namespace of course, and indeed I was already considering converting that to a ContentModel. The only thing that is somewhat problematic here is the multiple languages problem. Currently each language gets its own page and title. Do we switch the model to host all subtitles i

Re: [Wikitech-l] How to serve up subtitles

2015-08-18 Thread Ori Livneh
On Mon, Aug 17, 2015 at 5:56 AM, Derk-Jan Hartman < d.j.hartman+wmf...@gmail.com> wrote: > As part of Brion's and my struggle for better A/V support on Wikipedia, I > have concluded that our current support for subtitles is rather... > improvised. > > Currently all our SRT files are referenced f