Re: [Wikitech-l] Community Tech: October report

2015-11-06 Thread Amir Ladsgroup
As a Farsi speaker, thank you :)

Best


On Sat, Nov 7, 2015 at 3:44 AM Legoktm  wrote:

> Hi,
>
> On 11/03/2015 03:39 PM, Legoktm wrote:
> > On 11/03/2015 03:29 PM, Risker wrote:
> >> Okay, so I'm not going to say they shocked me; in fact, they're pretty
> >> much what I expected.  However, I notice on the stats for English
> >> Wikipedia[1] that multiple gadgets appear twice, once with a higher number
> >> and a second time with a "-" in front of them, and a low number.
> >
> > This is due to a bug[2] back in May 2013, which inserted a bunch of bad
> > preferences into the database table with an extra - in front. We didn't
> > notice until now that those rows were still in the database. I filed [3]
> > to manually delete them out of the database, otherwise they'll go away
> > gradually whenever those users save their preferences.
>
> And thanks to Krenair, those rows have been deleted from the database,
> and the GadgetUsage pages have been regenerated. :)
>
> >> [1] https://en.wikipedia.org/wiki/Special:GadgetUsage
> >
> > [2] https://phabricator.wikimedia.org/T50693
> > [3] https://phabricator.wikimedia.org/T117440
>
> -- Legoktm
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Forking, branching, merging, and drafts on Wikipedia

2015-11-06 Thread David Gerard
On 7 November 2015 at 00:29, Brian Wolff  wrote:

> I feel like different people want different things, and what is really
> needed is a user-centric discussion of use-cases to drive a feature
> wishlist, not any sort of discussion about implementation.


Yes. I see the rationale in that Phabricator ticket, but it reads like
personal ideology without reference to the Wikimedia projects. What is
the use case?


- d.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Forking, branching, merging, and drafts on Wikipedia

2015-11-06 Thread Brian Wolff
On 11/6/15, C. Scott Ananian  wrote:
> There is a proposal for the upcoming Mediawiki Dev Summit to get us
> "unstuck" on support for non-linear revision histories in Wikipedia.  This
> would include support for "saved drafts" of wikipedia edits and offline
> editing support, as well as a more permissive/friendly 'fork first' model
> of article collaboration.
>
> I outlined some proposed summit goals for the topic, but it needs a bit of
> help if it is going to make the cut for inclusion.  I hope interested folks
> will weigh in with some comments on
> https://phabricator.wikimedia.org/T113004 --- perhaps suggesting specific
> "next step" projects, for instance.
>
> Thanks for your help.
>  --scott
>
> --
> (http://cscott.net)
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

I feel like different people want different things, and what is really
needed is a user-centric discussion of use-cases to drive a feature
wishlist, not any sort of discussion about implementation.

--
-bawolff

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Community Tech: October report

2015-11-06 Thread Legoktm
Hi,

On 11/03/2015 03:39 PM, Legoktm wrote:
> On 11/03/2015 03:29 PM, Risker wrote:
>> Okay, so I'm not going to say they shocked me; in fact, they're pretty
>> much what I expected.  However, I notice on the stats for English
>> Wikipedia[1] that multiple gadgets appear twice, once with a higher number
>> and a second time with a "-" in front of them, and a low number.
> 
> This is due to a bug[2] back in May 2013, which inserted a bunch of bad
> preferences into the database table with an extra - in front. We didn't
> notice until now that those rows were still in the database. I filed [3]
> to manually delete them out of the database, otherwise they'll go away
> gradually whenever those users save their preferences.

And thanks to Krenair, those rows have been deleted from the database,
and the GadgetUsage pages have been regenerated. :)

>> [1] https://en.wikipedia.org/wiki/Special:GadgetUsage
> 
> [2] https://phabricator.wikimedia.org/T50693
> [3] https://phabricator.wikimedia.org/T117440

-- Legoktm

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid still doesn't love me

2015-11-06 Thread Gabriel Wicke
We don't currently store the full history of each page in RESTBase, so your
first access will trigger an on-demand parse of older revisions not yet in
storage, which is relatively slow. Repeat accesses will load those
revisions from disk (SSD), which will be a lot faster.

With a majority of clients now supporting HTTP2 / SPDY, use cases that
benefit from manual batching are becoming relatively rare. For a use case
like revision retrieval, HTTP2 with a decent amount of parallelism should
be plenty fast.
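
A minimal sketch of that kind of parallel retrieval, assuming a node
version with a global fetch and the public page/html/{title}/{revision}
end point (the revision IDs in the usage comment are placeholders):

// Fetch many revisions of a page with bounded parallelism; over HTTP2
// the requests share a single connection, so manual batching buys little.
const BASE = 'https://en.wikipedia.org/api/rest_v1/page/html';

async function fetchRevisions(title, revids, concurrency = 10) {
  const results = new Map();
  const queue = revids.slice();
  async function worker() {
    while (queue.length) {
      const rev = queue.shift();
      const res = await fetch(`${BASE}/${encodeURIComponent(title)}/${rev}`);
      results.set(rev, await res.text());
    }
  }
  await Promise.all(Array.from({ length: concurrency }, worker));
  return results;
}

// e.g. fetchRevisions('Zürich', [700000000, 700000001]).then(...)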

Gabriel

On Fri, Nov 6, 2015 at 2:24 PM, C. Scott Ananian 
wrote:

> I think your subject line should have been "RESTBase doesn't love me"?
>  --scott
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>



-- 
Gabriel Wicke
Principal Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid still doesn't love me

2015-11-06 Thread C. Scott Ananian
I think your subject line should have been "RESTBase doesn't love me"?
 --scott
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread C. Scott Ananian
On Fri, Nov 6, 2015 at 4:52 PM, Daniel Friesen 
wrote:

> That all being said, I still think the original rationale for picking
> Lua (more sandboxing controls, including execution limits based on steps
> in Lua rather than varying execution time) is valid.
>

It's not, actually.  It may have been at the time.  But v8 now has both
time and memory limits and fine-grained counters for various events with
callbacks and all sorts of crazy sandbox-y things.  See
https://github.com/phpv8/v8js/blob/master/v8js_timer.cc (although I think
the latest v8 actually has even more direct ways of enforcing these limits.)
 --scott, who'd like to get some more work done on Scribunto/JS at some
point.

-- 
(http://cscott.net)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread C. Scott Ananian
On Fri, Nov 6, 2015 at 4:12 PM, Brion Vibber  wrote:

> On Fri, Nov 6, 2015 at 11:13 AM, C. Scott Ananian 
> wrote:
> > * Hey, we can have JavaScript and PHP running together in the same
> server.
> > Perhaps some persistence-related issues with PHP can be made easier?
> >
>
> We probably wouldn't want to break the PHP execution context concept that
> requests are self-contained (and failing out is reasonably safe). But you
> could for instance route sessions or cache data through the containing node
> server instead of on the filesystem or a separate memcache/etc service...
>

Right, exactly.  I'm currently running OPcache and APCu inside the embedded
PHP, both of which go to some lengths to offer persistent caches.  I'm not an
expert on PHP architecture; I suspect there are other places in MediaWiki
where we are similarly jumping through hoops.  Perhaps these could be
simplified, at least for certain users.


> > * Hey, we can actually write *extensions for mediawiki-core* in
> JavaScript
> > (or CoffeeScript, or...) now.  Or run PHP code inside Parsoid.  How could
> > we use that?  (Could it grow developer communities?)
> >
>
> I'm uncertain about the desirability of general direct JS<->PHP sync call
> bridging, in that relying on it would _require_ this particular node+PHP
> distribution. I'd prefer loose enough coupling that the JS engine can be
> local or remote, and the PHP engine can be either Zend or HHVM, etc.
>

I expect that I can port php-embed to PHP 7 and/or HHVM without too much
trouble, if interest warrants.  And I already support quite a large number
of different node versions, from 2.4.0 to 5.0.0.  And there are some
interesting other alternative implementations that could export the same
interface but use RPC to bridge node and PHP, see for instance
https://github.com/bergie/dnode-php.  Even the sync/async distinction can
be bridged; if you look at the underlying implementation for php-embed all
communication is done via async message passing between the threads.  We
just "stop and wait" for certain replies to emulate sync calls (in
particular for PHP, which prefers it that way).
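
A toy illustration of that shape, using today's worker_threads and
Atomics (both of which postdate this thread; php-embed's actual
mechanism differs, this only shows the stop-and-wait idea):

const { Worker } = require('worker_threads');

// One shared int32 acts as the reply marker between the two threads.
const shared = new SharedArrayBuffer(4);
const flag = new Int32Array(shared);

const worker = new Worker(`
  const { parentPort, workerData } = require('worker_threads');
  const flag = new Int32Array(workerData);
  parentPort.on('message', (msg) => {
    // ... handle the request asynchronously here ...
    Atomics.store(flag, 0, 1); // write the reply marker
    Atomics.notify(flag, 0);   // wake the blocked caller
  });
`, { eval: true, workerData: shared });

worker.postMessage({ call: 'someSyncLookingCall' });
Atomics.wait(flag, 0, 0); // block until the reply arrives: a sync-style call
console.log('got a sync-style reply');
worker.terminate();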


> Of course there are interesting possibilities like using JS as a template
> module extension language in place of / addition to Lua. A general warning:
> as I understand the php-embed bridge, JS-side code would a) have full
> rights to the system within the user the daemon runs as, and b)
> exiting/failing out of node would kill the entire daemon.
>

There is sandboxing within v8, so your warning is not accurate.

And in fact, the "mirror image" project is the PHP extension v8js, which I
believe Tim started and I worked on for a while before attempting
node-php-embed.  It also uses the native v8 sandboxing facilities.


> PHP-inside-Parsoid might be interesting for some kinds of extensions, but
> I'm not sure whether it's better to rig that up versus using looser
> coupling where we make an internal HTTP call over to the PHP MediaWiki
> side.
>

Yup.  That's essentially what we already do: after we started to implement
the template engine in Parsoid, it was scrapped, and the entire templating
engine is now implemented by calling over to PHP to expand
templates.  And whenever we want more information about the expansion, we
implement it in PHP.

But that's essentially the genesis of the "mediawiki as a collection of
services" idea -- once you start doing this, you find all sorts of bits of
crufty complex PHP code which you'd rather not try to reimplement.  First
templates, then image thumbnailing, next who knows, probably the skin.  One
day they might all be spun out as separate services with internal HTTP
calls between them.

I'm just providing a PoC that lets you ask questions about potential
alternatives.  I welcome the discussion.

> * How are parser extensions (like, say, WikiHiero, but there are lots of
> > them) going to be managed in the long term?  There are three separate
> > codebases to hook right now.  An extension like  might
> eventually
> > need to hook the image thumbnail service, too.  Do we have a plan?
> >
>
> This probably deserves its own thread!
>

Yeah.

> Ideally you should only have to write one implementation, and it should be
> self-contained or access the container via a limited API.
>
> I'm not really sure I grasp how Parsoid handles tag hook extensions at the
> moment, actually... can anyone fill in some details?
>

It doesn't, basically.  It just asks PHP to do the expansion for it, and
then wraps the whole thing in an element warning VE not to touch it.

Except for citations.

On our roadmap for this quarter we have a task to write a "proper"
extension interface, and then use it to refactor the citation code and
(hopefully) implement  support.  The end goal being to empower the
community to write Parsoid extensions for all the large number of *other*
tag extensions we don't yet support.

Note that Visual Editor needs to be extended at the same time as Parsoid is
extended, so 

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Daniel Friesen
On 2015-11-06 1:12 PM, Brion Vibber wrote:
> Of course there are interesting possibilities like using JS as a template
> module extension language in place of / addition to Lua. A general warning:
> as I understand the php-embed bridge, JS-side code would a) have full
> rights to the system within the user the daemon runs as, and b)
> exiting/failing out of node would kill the entire daemon.
node has a built-in vm module that is
regularly used to execute sandboxed js that doesn't have access to the
privileged node api. This code doesn't have access to `process.exit()`,
and PHP's concept of fatal errors (which, unlike thrown exceptions,
immediately halt the process and can't be caught) doesn't exist in
JS. Sandboxing against infinite loops could also be done by running the
sandbox in another process (child_process even has a high-level message
passing stream for communicating with a node js child process).
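
A minimal sketch of that vm sandboxing, with a synchronous timeout as a
rough guard (illustrative only, not how Scribunto would wire it up):

const vm = require('vm');

// The sandbox object is the entire global scope the guest code sees:
// no process, no require, only what we hand it.
const sandbox = { result: null, print: (s) => console.log(s) };
vm.createContext(sandbox);

try {
  vm.runInContext(
    'result = 2 + 2; print("no process.exit() in here");',
    sandbox,
    { timeout: 100 } // ms of synchronous execution allowed
  );
} catch (e) {
  // A thrown error or a timeout lands here instead of killing the daemon.
  console.error('sandboxed script failed:', e.message);
}
console.log(sandbox.result); // 4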

That all being said, I still think the original rationale for picking
Lua (more sandboxing controls, including execution limits based on steps
in Lua rather than varying execution time) is valid.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Ryan Lane
On Fri, Nov 6, 2015 at 11:13 AM, C. Scott Ananian 
wrote:

> Let's not let this discussion sidetrack into "shared hosting vs VMs (vs
> docker?)" --- there's another phabricator ticket and summit topic for that
> (
> https://phabricator.wikimedia.org/T87774 and
> https://phabricator.wikimedia.org/T113210.
>
>
I only mentioned this portion of the discussion because I can't think of
any other reason your initial proposal makes sense, since it's essentially
discussing ways to distribute and run a set of microservices. Using docker
requires root, which isn't available on shared hosting. I'm fine ignoring
this topic in this discussion, though.


> I'd prefer to have discussion in *this* particular task/thread concentrate
> on:
>
> * Hey, we can have JavaScript and PHP in the same packaging system.  What
> cool things might that enable?
>
>
* Hey, we can have JavaScript and PHP running together in the same server.
> Perhaps some persistence-related issues with PHP can be made easier?
>
> * Hey, we can actually write *extensions for mediawiki-core* in JavaScript
> (or CoffeeScript, or...) now.  Or run PHP code inside Parsoid.  How could
> we use that?  (Could it grow developer communities?)
>
>
You're not talking about microservices here, so it's at least partially a
different discussion. You're talking about adding multiple languages into a
monolith, and that's a path towards insanity. It's far easier to understand
and maintain large numbers of microservices than a polyglot monolith. REST
with well-defined APIs between services provides all of the same benefits
while also letting people manage their service independently, even with the
possibility of the service not being tied to MediaWiki or Wikimedia at all.

I'd posit that adding additional languages into the monolith will more
likely have the result of shrinking the developer community because it
requires knowledge of at least two languages to properly do development.

* How are parser extensions (like, say, WikiHiero, but there are lots of
> them) going to be managed in the long term?  There are three separate
> codebases to hook right now.  An extension like  might eventually
> need to hook the image thumbnail service, too.  Do we have a plan?
>
>
This seems like a perfect place for another microservice.


> And the pro/anti-npm and pro/anti-docker and pro/anti-VM discussion can go
> into one of those other tasks.  Thanks.
>
>
You're discussing packaging, distribution and running of services. So, I
don't believe they belong in another task. You're saying that alternatives
to your idea are only relevant when considered on their own, but these
alternatives are basically industry standards for this problem set at this
point, and your proposal is something that only MediaWiki (and Wikimedia)
will be doing or maintaining.

- Ryan
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Brion Vibber
On Fri, Nov 6, 2015 at 11:13 AM, C. Scott Ananian 
wrote:
>
> * Hey, we can have JavaScript and PHP in the same packaging system.  What
> cool things might that enable?
>
> * Hey, we can have JavaScript and PHP running together in the same server.
> Perhaps some persistence-related issues with PHP can be made easier?
>

We probably wouldn't want to break the PHP execution context concept that
requests are self-contained (and failing out is reasonably safe). But you
could for instance route sessions or cache data through the containing node
server instead of on the filesystem or a separate memcache/etc service...
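
A minimal sketch of that routing idea, assuming a hypothetical key-value
end point exposed by the containing Express server (this is not
mediawiki-express's actual API):

const express = require('express');

const app = express();
const store = new Map(); // lives for the process lifetime, across PHP requests

app.use(express.text({ type: '*/*' }));

// PHP-side session/cache reads and writes could be routed here instead
// of to the filesystem or a separate memcached.
app.get('/kv/:key', (req, res) => {
  if (!store.has(req.params.key)) return res.sendStatus(404);
  res.send(store.get(req.params.key));
});

app.put('/kv/:key', (req, res) => {
  store.set(req.params.key, req.body);
  res.sendStatus(204);
});

app.listen(8080);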


> * Hey, we can actually write *extensions for mediawiki-core* in JavaScript
> (or CoffeeScript, or...) now.  Or run PHP code inside Parsoid.  How could
> we use that?  (Could it grow developer communities?)
>

I'm uncertain about the desirability of general direct JS<->PHP sync call
bridging, in that relying on it would _require_ this particular node+PHP
distribution. I'd prefer loose enough coupling that the JS engine can be
local or remote, and the PHP engine can be either Zend or HHVM, etc.

Of course there are interesting possibilities like using JS as a template
module extension language in place of / addition to Lua. A general warning:
as I understand the php-embed bridge, JS-side code would a) have full
rights to the system within the user the daemon runs as, and b)
exiting/failing out of node would kill the entire daemon.

PHP-inside-Parsoid might be interesting for some kinds of extensions, but
I'm not sure whether it's better to rig that up versus using looser
coupling where we make an internal HTTP call over to the PHP MediaWiki side.


> * How are parser extensions (like, say, WikiHiero, but there are lots of
> them) going to be managed in the long term?  There are three separate
> codebases to hook right now.  An extension like  might eventually
> need to hook the image thumbnail service, too.  Do we have a plan?
>

This probably deserves its own thread!

Ideally you should only have to write one implementation, and it should be
self-contained or access the container via a limited API.

I'm not really sure I grasp how Parsoid handles tag hook extensions at the
moment, actually... can anyone fill in some details?


Note that conceptually we have a few different types of parser tag hook
extension:

* the standalone renderer (, , etc) -- these still need
storage for output caching, or CSS/JS/image assets that need serving. These
are easy to 'call out' to an external service for (see the sketch after this
list), which would make it easy for parsoid to call MediaWiki, or for both
to call a common separate rendering implementation.

* the standalone wikitext wrapper/modifier (, , ,
) -- ideally these can be implemented mostly in terms of
wikitext transforms :) but may need assets, again, such as highlighting
CSS. Again mostly standalone, easy to transform in one place and return the
data to be included in output.

* the standalone renderer that needs access to the wiki's content and
rendering () -- could be implemented as a Lua module I bet! ;)
These require back-and-forth with the rest of the MediaWiki system... but
could easily be done on the MediaWiki side and output copied into the
parsoid HTML.

* the state-carrying wikitext wrapper/modifier (+) --
these require strict ordering, and build up state over the course of
parsing, *and* render things into wikitext, and, well, it's just ugly.

* weird stuff (labeled section transclusion? translate?) -- not even sure
how some of these work ;)
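
A minimal sketch of the first case, the standalone renderer called out
as a service; the route, port, and placeholder renderer are all
hypothetical:

const express = require('express');
const crypto = require('crypto');

const app = express();
const cache = new Map(); // stand-in for a real output cache

app.use(express.text({ type: '*/*' }));

// Both the PHP parser and Parsoid could POST a tag body here and inline
// the returned HTML.
app.post('/render/hiero', (req, res) => {
  const key = crypto.createHash('sha1').update(req.body).digest('hex');
  if (!cache.has(key)) {
    // A real renderer would produce markup plus asset references; this
    // placeholder just wraps the escaped tag source.
    const escaped = req.body.replace(/[&<>]/g,
      (c) => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;' }[c]));
    cache.set(key, '<div class="mw-hiero">' + escaped + '</div>');
  }
  res.type('html').send(cache.get(key));
});

app.listen(8142);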


-- brion
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid still doesn't love me

2015-11-06 Thread Ricordisamoa

I mean, RESTBase can't access more than one revision at once?

Il 06/11/2015 21:39, Subramanya Sastry ha scritto:


Parsoid is simply a wikitext -> HTML and an HTML -> wikitext conversion
service. Everything else would be tools and libs built on top of it.


Subbu.

On 11/06/2015 02:29 PM, Ricordisamoa wrote:
What if I need to get all revisions (~2000) of a page in Parsoid 
HTML5? The prop=revisions API (in batches of 50) with 
mwparserfromhell is much quicker.
And what about ~400 revisions from a wiki without Parsoid/RESTBase? I 
would use /transform/wikitext/to/html then.

Thanks in advance.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Marcin Cieslak
On 2015-11-05, Ryan Lane  wrote:
> Is this simply to support hosted providers? npm is one of the worst package
> managers around. This really seems like a case where thin docker images and
> docker-compose really shines. It's easy to handle from the packer side,
> it's incredibly simple from the user side, and it doesn't require
> reinventing the world to distribute things.

I got heavily involved in the node world recently and I fully share your
opinion about npm; npm@3 takes the disaster to the next level.

Are we using some native npm modules in our stack? *That* is hard
to support.

> If this is the kind of stuff we're doing to support hosted providers, it
> seems it's really time to stop supporting hosted providers. It's $5/month
> to have a proper VM on digital ocean. There's even cheaper solutions
> around. Hosted providers at this point aren't cheaper. At best they're
> slightly easier to use, but MediaWiki is seriously handicapping itself to
> support this use-case.

I feel very strongly there is a need for a quick setup for people who
have their LAMP stack already working and feel familiar with that environment.
The problem is that a full-stack MediaWiki is no longer a LAMP application.
Those people aren't going away anytime soon, nor are they about to join the
coolest game in town.

I have already written scripts to keep code, vendor and core skins in sync
from git. I am beginning to write even more scripts to quickly deploy/destroy MW
instances. (My platform does not do Docker, btw.).

Maybe the right strategic move would be to implement MediaWiki phase
four in server-side JavaScript. Then the npm way is probably the only way
forward.

Saper


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid still doesn't love me

2015-11-06 Thread Subramanya Sastry


Parsoid is simply a wikitext -> HTML and an HTML -> wikitext conversion
service. Everything else would be tools and libs built on top of it.


Subbu.

On 11/06/2015 02:29 PM, Ricordisamoa wrote:
What if I need to get all revisions (~2000) of a page in Parsoid 
HTML5? The prop=revisions API (in batches of 50) with mwparserfromhell 
is much quicker.
And what about ~400 revisions from a wiki without Parsoid/RESTBase? I 
would use /transform/wikitext/to/html then.

Thanks in advance.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Parsoid still doesn't love me

2015-11-06 Thread Ricordisamoa
What if I need to get all revisions (~2000) of a page in Parsoid HTML5? 
The prop=revisions API (in batches of 50) with mwparserfromhell is much 
quicker.
And what about ~400 revisions from a wiki without Parsoid/RESTBase? I 
would use /transform/wikitext/to/html then.

Thanks in advance.
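
For reference, a sketch of the batched prop=revisions approach in node
rather than Python/mwparserfromhell, assuming the action API's modern
continuation format (newer MediaWiki also wants an rvslots parameter):

const API = 'https://en.wikipedia.org/w/api.php';

async function allRevisions(title) {
  const revs = [];
  let cont = {};
  do {
    const params = new URLSearchParams({
      action: 'query', prop: 'revisions', titles: title,
      rvprop: 'ids|content', rvlimit: '50', format: 'json',
      ...cont, // continuation tokens from the previous batch
    });
    const data = await (await fetch(API + '?' + params)).json();
    const page = Object.values(data.query.pages)[0];
    revs.push(...(page.revisions || []));
    cont = data.continue || {};
  } while (cont.rvcontinue);
  return revs;
}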

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Tyler Romeo
That's a pretty good point. Despite my comments, I'll definitely keep an open 
mind, and am interested in what people might propose.

-- 
Tyler Romeo
https://parent5446.nyc
0x405D34A7C86B42DF

From: C. Scott Ananian 
Reply: C. Scott Ananian 
Date: November 6, 2015 at 15:01:59
To: Tyler Romeo 
CC: Wikimedia developers 
Subject:  Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`  

Tyler: I hear you.  I'm not sure it's a good idea, either -- especially not for 
core extensions used in production.

But it does perhaps allow some expansion of our developer community on the 
fringes, and makes writing extensions possible for a larger set of people?  And 
perhaps there are some cool things written in JavaScript which the extended 
community could more easily hook up to MediaWiki using `php-embed`.

I'm not sure that there are.  I'm just opening up the discussion to see if 
anyone pipes up with, "oh, yeah, I've always wanted to do XYZ!".

Greg: I agree re: premature stifling of discussion.  I'm just saying that 
"high-level" conversation is already happening elsewhere, and it's more 
productive there.  I started *this* particular thread trying to elicit 
discussion more narrowly focused on the thing I've just built.
  --scott

On Fri, Nov 6, 2015 at 2:30 PM, Tyler Romeo  wrote:
I would very, *very* much prefer to not have MediaWiki core extensions written 
in JavaScript. Even beyond my criticisms of JavaScript as a language, I feel 
like that just unnecessarily introduces complexity. The purpose of this wrapper 
is to combine separate micro-services that would otherwise be run in separate 
VMs / servers / etc. so that it can easily be run in a hosting setup.

Otherwise, I'm interested in what implications this will have, especially for 
making MediaWiki easier to install and use, which would be awesome.

-- 
Tyler Romeo
https://parent5446.nyc
0x405D34A7C86B42DF

From: C. Scott Ananian 
Reply: Wikimedia developers 
Date: November 6, 2015 at 14:14:13
To: Wikimedia developers 
Subject:  Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

Let's not let this discussion sidetrack into "shared hosting vs VMs (vs
docker?)" --- there's another phabricator ticket and summit topic for that (
https://phabricator.wikimedia.org/T87774 and
https://phabricator.wikimedia.org/T113210.

I'd prefer to have discussion in *this* particular task/thread concentrate
on:

* Hey, we can have JavaScript and PHP in the same packaging system. What
cool things might that enable?

* Hey, we can have JavaScript and PHP running together in the same server.
Perhaps some persistence-related issues with PHP can be made easier?

* Hey, we can actually write *extensions for mediawiki-core* in JavaScript
(or CoffeeScript, or...) now. Or run PHP code inside Parsoid. How could
we use that? (Could it grow developer communities?)

* How are parser extensions (like, say, WikiHiero, but there are lots of
them) going to be managed in the long term? There are three separate
codebases to hook right now. An extension like  might eventually
need to hook the image thumbnail service, too. Do we have a plan?

And the pro/anti-npm and pro/anti-docker and pro/anti-VM discussion can go
into one of those other tasks. Thanks.

--scott
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l



--
(http://cscott.net)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread C. Scott Ananian
Tyler: I hear you.  I'm not sure it's a good idea, either -- especially not
for core extensions used in production.

But it does perhaps allow some expansion of our developer community on the
fringes, and makes writing extensions possible for a larger set of people?
And perhaps there are some cool things written in JavaScript which the
extended community could more easily hook up to MediaWiki using `php-embed`.

I'm not sure that there are.  I'm just opening up the discussion to see if
anyone pipes up with, "oh, yeah, I've always wanted to do XYZ!".

Greg: I agree re: premature stifling of discussion.  I'm just saying that
"high-level" conversation is already happening elsewhere, and it's more
productive there.  I started *this* particular thread trying to elicit
discussion more narrowly focused on the thing I've just built.
  --scott

On Fri, Nov 6, 2015 at 2:30 PM, Tyler Romeo  wrote:

> I would very, *very* much prefer to not have MediaWiki core extensions
> written in JavaScript. Even beyond my criticisms of JavaScript as a
> language, I feel like that just unnecessarily introduces complexity. The
> purpose of this wrapper is to combine separate micro-services that would
> otherwise be run in separate VMs / servers / etc. so that it can easily be
> run in a hosting setup.
>
> Otherwise, I'm interested in what implications this will have, especially
> for making MediaWiki easier to install and use, which would be awesome.
>
> --
> Tyler Romeo
> https://parent5446.nyc
> 0x405D34A7C86B42DF
>
> From: C. Scott Ananian  
> Reply: Wikimedia developers 
> 
> Date: November 6, 2015 at 14:14:13
> To: Wikimedia developers 
> 
> Subject:  Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`
>
> Let's not let this discussion sidetrack into "shared hosting vs VMs (vs
> docker?)" --- there's another phabricator ticket and summit topic for that
> (
> https://phabricator.wikimedia.org/T87774 and
> https://phabricator.wikimedia.org/T113210.
>
> I'd prefer to have discussion in *this* particular task/thread concentrate
> on:
>
> * Hey, we can have JavaScript and PHP in the same packaging system. What
> cool things might that enable?
>
> * Hey, we can have JavaScript and PHP running together in the same server.
> Perhaps some persistence-related issues with PHP can be made easier?
>
> * Hey, we can actually write *extensions for mediawiki-core* in JavaScript
> (or CoffeeScript, or...) now. Or run PHP code inside Parsoid. How could
> we use that? (Could it grow developer communities?)
>
> * How are parser extensions (like, say, WikiHiero, but there are lots of
> them) going to be managed in the long term? There are three separate
> codebases to hook right now. An extension like  might eventually
> need to hook the image thumbnail service, too. Do we have a plan?
>
> And the pro/anti-npm and pro/anti-docker and pro/anti-VM discussion can go
> into one of those other tasks. Thanks.
>
> --scott
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>


-- 
(http://cscott.net)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Tyler Romeo
I would very, *very* much prefer to not have MediaWiki core extensions written 
in JavaScript. Even beyond my criticisms of JavaScript as a language, I feel 
like that just unnecessarily introduces complexity. The purpose of this wrapper 
is to combine separate micro-services that would otherwise be run in separate 
VMs / servers / etc. so that it can easily be run in a hosting setup.

Otherwise, I'm interested in what implications this will have, especially for 
making MediaWiki easier to install and use, which would be awesome.

-- 
Tyler Romeo
https://parent5446.nyc
0x405D34A7C86B42DF

From: C. Scott Ananian 
Reply: Wikimedia developers 
Date: November 6, 2015 at 14:14:13
To: Wikimedia developers 
Subject:  Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`  

Let's not let this discussion sidetrack into "shared hosting vs VMs (vs
docker?)" --- there's another phabricator ticket and summit topic for that (
https://phabricator.wikimedia.org/T87774 and
https://phabricator.wikimedia.org/T113210.

I'd prefer to have discussion in *this* particular task/thread concentrate
on:

* Hey, we can have JavaScript and PHP in the same packaging system. What
cool things might that enable?

* Hey, we can have JavaScript and PHP running together in the same server.
Perhaps some persistence-related issues with PHP can be made easier?

* Hey, we can actually write *extensions for mediawiki-core* in JavaScript
(or CoffeeScript, or...) now. Or run PHP code inside Parsoid. How could
we use that? (Could it grow developer communities?)

* How are parser extensions (like, say, WikiHiero, but there are lots of
them) going to be managed in the long term? There are three separate
codebases to hook right now. An extension like  might eventually
need to hook the image thumbnail service, too. Do we have a plan?

And the pro/anti-npm and pro/anti-docker and pro/anti-VM discussion can go
into one of those other tasks. Thanks.

--scott
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Greg Grossmeier

> And the pro/anti-npm and pro/anti-docker and pro/anti-VM discussion can go
> into one of those other tasks.  Thanks.

Premature segmentation into a walled-off subset of problems has
similarly bad inherent issues as premature incorporation into larger
problems; it's a trade-off. Do we want to discuss the issue at a high
level? Yes. Do we want to discuss a low-level/specific
implementation? Also yes. But let others discuss the high level if they
want (while others discuss the low level if they want). Separate
threads, maybe :)

Greg

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg           A18D 1138 8E47 FAC8 1C7D |

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread C. Scott Ananian
Let's not let this discussion sidetrack into "shared hosting vs VMs (vs
docker?)" --- there's another phabricator ticket and summit topic for that (
https://phabricator.wikimedia.org/T87774 and
https://phabricator.wikimedia.org/T113210.

I'd prefer to have discussion in *this* particular task/thread concentrate
on:

* Hey, we can have JavaScript and PHP in the same packaging system.  What
cool things might that enable?

* Hey, we can have JavaScript and PHP running together in the same server.
Perhaps some persistence-related issues with PHP can be made easier?

* Hey, we can actually write *extensions for mediawiki-core* in JavaScript
(or CoffeeScript, or...) now.  Or run PHP code inside Parsoid.  How could
we use that?  (Could it grow developer communities?)

* How are parser extensions (like, say, WikiHiero, but there are lots of
them) going to be managed in the long term?  There are three separate
codebases to hook right now.  An extension like  might eventually
need to hook the image thumbnail service, too.  Do we have a plan?

And the pro/anti-npm and pro/anti-docker and pro/anti-VM discussion can go
into one of those other tasks.  Thanks.

 --scott
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [RFC/Summit] `npm install mediawiki-express`

2015-11-06 Thread Ryan Lane
On Thu, Nov 5, 2015 at 5:38 PM, C. Scott Ananian 
wrote:

> I view it as partly an effort to counteract the perceived complexity of
> running a forest full of separate services.  It's fine to say they're all
> preinstalled in this VM image, but that's still a lot of complexity to dig
> through: where are all the servers? What ports are they listening on?
> Did one of them crash?  How do I restart it?
>
>
When you run docker-compose, your containers are linked together. If you
have the following containers:

parsoid
mathoid
mediawiki
mysql (hopefully not)
cassandra
redis

You'd talk to redis from mediawiki via: redis://redis:6379 and you'd talk
to parsoid via: http://parsoid and to mathoid via http://mathoid, etc etc.
It handles the networking for you.
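
From the node side, that name-based linking looks like this minimal
sketch (service names from the list above; the path is hypothetical):

const http = require('http');

// Inside the compose network the 'mathoid' hostname resolves to that
// container; no IPs or port mappings to configure.
http.get('http://mathoid/info', (res) => {
  let body = '';
  res.on('data', (chunk) => { body += chunk; });
  res.on('end', () => console.log('mathoid says:', body));
}).on('error', (err) => console.error('mathoid unreachable:', err.message));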

If one of them crashes, docker-compose will tell you. If any of them fail
to start, it will also tell you.

I'm not even a huge proponent of docker, but the docker-compose solution
for this is far simpler and far more standard than what you're
proposing, and it doesn't require investing a ton of effort into something
that no other project will ever consider using.

Ride the ocean on the big boat, not the life raft.


> For some users, the VM (or an actual server farm) is indeed the right
> solution.  But this was an attempt to see if I could recapture the
> "everything's here in this one process (and one code tree)" simplicity for
> those for whom that's good enough.
>

There's no server farm here. If you're running linux it's just a set of
processes running in containers on a single node (which could be your
laptop). If you're on OSX or Windows it's a VM, but that can be totally
abstracted away using Vagrant.

If you're launching in the cloud, you could launch directly to Joyent or
AWS ECS, or very easily stand something up on Digital Ocean. If you're
really feeling like making things easier for end-users, provide
orchestration code that will automatically provision MW and its dependencies
via docker-compose in a VM in one of these services.

Orchestration + containers is what most people are doing for microservices.
Don't build something that's complex to maintain and completely out of the
ordinary out of fear of complexity. Go with the solutions everyone else is
using and wrap tooling around them to make things easier for people.

- Ryan
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Forking, branching, merging, and drafts on Wikipedia

2015-11-06 Thread C. Scott Ananian
There is a proposal for the upcoming Mediawiki Dev Summit to get us
"unstuck" on support for non-linear revision histories in Wikipedia.  This
would include support for "saved drafts" of wikipedia edits and offline
editing support, as well as a more permissive/friendly 'fork first' model
of article collaboration.

I outlined some proposed summit goals for the topic, but it needs a bit of
help if it is going to make the cut for inclusion.  I hope interested folks
will weigh in with some comments on
https://phabricator.wikimedia.org/T113004 --- perhaps suggesting specific
"next step" projects, for instance.

Thanks for your help.
 --scott

-- 
(http://cscott.net)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread Subramanya Sastry

On 11/06/2015 11:15 AM, James Montalvo wrote:

Thanks for the responses. I do want to convert HTML that cannot be assumed
to be clean, so it sounds like Parsoid will not solve the problem for now.


If you give us a sample of the kind of HTML you are looking at, we can 
see what kind of wikitext comes up and if there are simple tweaks that 
can fix any problems.


You can also try @ http://parsoid-lb.eqiad.wikimedia.org/_html/

Note that this public access point to Parsoid will not be around much 
longer.


Subbu.



--James

On Fri, Nov 6, 2015 at 11:06 AM, Gabriel Wicke  wrote:


To add to what Eric & Subbu have said, here is a link to the API
documentation for this end point:


https://en.wikipedia.org/api/rest_v1/?doc#!/Transforms/post_transform_html_to_wikitext_title_revision

On Fri, Nov 6, 2015 at 8:47 AM, Subramanya Sastry 
wrote:


On 11/06/2015 10:18 AM, James Montalvo wrote:

Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering if
I could take snippets of HTML from non-MediaWiki webpages and convert them
into wikitext.


The right answer is: "It depends" :-)

As Eric responded in his reply, Parsoid does convert some kinds of
arbitrary HTML to clean wikitext. See some additional examples at the end
of this email.

However, if you really threw arbitrary HTML at it (ex: .. or
..) Parsoid wouldn't know that it could potentially use ''
or ''' for those tags. Or, if you gave it input with all kinds of css and
other inlined attributes, you won't necessarily get the best wikitext from
it.

But, if you tried to convert HTML that you got from say Google docs, Open
Office, Word, or other HTML-generation tools, the wikitext you get may not
be very pretty.

We do want to keep improving Parsoid's abilities to get there, but it has
not been a high priority for us; it would be a great GSoC or volunteer
project if someone wants to play with this and improve this feature, given
that we are always playing catch up with all the other things we need to
get done.

But, if you didn't have really arbitrary HTML, you can get some reasonable
looking wikitext out of it even without the markers. But, things like
images, templates, extensions .. obviously require the additional
attributes for Parsoid to generate canonical wikitext for that.

Hope this helps.

Subbu.




---

Some html -> wt examples:

[subbu@earth bin] echo "fooab" | node parse --html2wt
== foo ==
a

b

[subbu@earth bin] echo "Hampi" | node parse --html2wt
[[Hampi]]

[subbu@earth bin] echo "Luna" | node parse --html2wt
[[:it:Luna|Luna]]

[subbu@earth bin] echo "Luna" | node parse --html2wt --prefix itwiki
[[Luna]]

[subbu@earth bin] echo "abc" | node parse --html2wt
* a
* b
* c

[subbu@earth bin] echo foo" | node parse --html2wt
foo


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l




--
Gabriel Wicke
Principal Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread James Montalvo
Thanks for the responses. I do want to convert HTML that cannot be assumed
to be clean, so it sounds like Parsoid will not solve the problem for now.

--James

On Fri, Nov 6, 2015 at 11:06 AM, Gabriel Wicke  wrote:

> To add to what Eric & Subbu have said, here is a link to the API
> documentation for this end point:
>
>
> https://en.wikipedia.org/api/rest_v1/?doc#!/Transforms/post_transform_html_to_wikitext_title_revision
>
> On Fri, Nov 6, 2015 at 8:47 AM, Subramanya Sastry 
> wrote:
>
> > On 11/06/2015 10:18 AM, James Montalvo wrote:
> >
> >> Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
> >> to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering if
> >> I could take snippets of HTML from non-MediaWiki webpages and convert them
> >> into wikitext.
> >>
> >
> > The right answer is: "It depends" :-)
> >
> > As Eric responded in his reply, Parsoid does convert some kinds of
> > arbitrary HTML to clean wikitext. See some additional examples at the end
> > of this email.
> >
> > However, if you really threw arbitrary HTML at it (ex: .. or
> > ..) Parsoid wouldn't know that it could potentially use ''
> > or ''' for those tags. Or, if you gave it input with all kinds of css and
> > other inlined attributes, you won't necessarily get the best wikitext from
> > it.
> >
> > But, if you tried to convert HTML that you got from say Google docs, Open
> > Office, Word, or other HTML-generation tools, the wikitext you get may not
> > be very pretty.
> >
> > We do want to keep improving Parsoid's abilities to get there, but it has
> > not been a high priority for us; it would be a great GSoC or volunteer
> > project if someone wants to play with this and improve this feature, given
> > that we are always playing catch up with all the other things we need to
> > get done.
> >
> > But, if you didn't have really arbitrary HTML, you can get some reasonable
> > looking wikitext out of it even without the markers. But, things like
> > images, templates, extensions .. obviously require the additional
> > attributes for Parsoid to generate canonical wikitext for that.
> >
> > Hope this helps.
> >
> > Subbu.
> >
> > ---
> >
> > Some html -> wt examples:
> >
> > [subbu@earth bin] echo "fooab" | node parse --html2wt
> > == foo ==
> > a
> >
> > b
> >
> > [subbu@earth bin] echo "Hampi" | node parse --html2wt
> > [[Hampi]]
> >
> > [subbu@earth bin] echo "Luna" | node parse --html2wt
> > [[:it:Luna|Luna]]
> >
> > [subbu@earth bin] echo "Luna" | node parse --html2wt --prefix itwiki
> > [[Luna]]
> >
> > [subbu@earth bin] echo "abc" | node parse --html2wt
> > * a
> > * b
> > * c
> >
> > [subbu@earth bin] echo foo" | node parse --html2wt
> > foo
> >
> >
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
>
>
>
> --
> Gabriel Wicke
> Principal Engineer, Wikimedia Foundation
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] MW support for Composer equivalent for JavaScript packages

2015-11-06 Thread Bryan Davis
(such top posting! much discussion!)

It would be great to see some of the salient points of this discussion
captured on the current phab ticket [0] on the topic.

[0]: https://phabricator.wikimedia.org/T107561

Bryan

On Fri, Nov 6, 2015 at 1:03 AM, Daniel Friesen
 wrote:
> On 2015-11-05 1:30 PM, C. Scott Ananian wrote:
>> Two other interesting pieces:
>>
>> 1. http://requirejs.org/ is still the gold standard for async browser-type
>> loading, AFAIK, and there are good packages (`npm install requirejs`) that
>> allow interoperability with the "npm style".
> requirejs is still built for the same single-application model as the
> other non-async loaders. You may not be able to get even requirejs to
> work with MediaWiki's needed pattern of different packages required by
> different extensions, all integrated together, without using node on
> the server to combine them.
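
As an aside, a minimal AMD sketch of the async loading pattern under
discussion (all module names here are hypothetical):

// Each extension could ship a module like this; requirejs resolves and
// loads the dependency graph in the browser at runtime.
define('ext.timeline', ['jquery', 'ext.timeline.renderer'], function ($, renderer) {
  return {
    init: function (node) {
      renderer.draw($(node));
    }
  };
});

// Consumer side: an async require; nothing has to be pre-bundled.
require(['ext.timeline'], function (timeline) {
  timeline.init(document.getElementById('timeline'));
});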
>
>> 2. The recently-completed ES6 spec contains a module format.  You can
>> already use it via compatibility thunks from many existing module systems.
>> It may well see increased use, especially on the web as browsers implement
>> the spec (which is happening quickly).  There is resistance in the node
>> community to adopting ES6 modules, but it may be that we are at an
>> inflection point and ES6 will eventually win out.
> ES6 modules have a different pattern for how exports are treated,
> especially in regards to the 'default' export.
> To handle this case babel inserts the following when you are transpiling
> ES6/ES2015 module syntax:
> Object.defineProperty(exports, '__esModule', {
>   value: true
> });
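
For illustration, the shape of a transpiled default export and what a
CommonJS consumer then sees (simplified; babel's real output differs in
detail):

// ES2015: export default function greet(name) { ... }
Object.defineProperty(exports, '__esModule', { value: true });
exports.default = function greet(name) {
  return 'hello ' + name;
};

// A plain CJS consumer has to reach for .default explicitly:
// const greet = require('./greet').default;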
>
> Unless you explicitly enable loose mode, any library you develop as an ES
> module, and anything that uses it, will instantly break in browsers like IE8.
>
> Use of ES6 import/export is gaining more and more adoption. But this
> might be one reason some people are holding back on it.
>
> ES modules also have different behavior with regard to cycles, which I know
> esperanto handles, but I'm not sure about the other transpilers.
>
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]
>
>>   --scott
>>
>> On Thu, Nov 5, 2015 at 3:24 PM, Daniel Friesen 
>> wrote:
>>
>>> As someone who now works in js/node primarily I might be able to add a
>>> little outside insight.
>>>
>>> IMHO bower was probably killed far before npm@3 came out.
>>>
>>> - npm has always been more advanced
>>> - bower's main isn't as reliable as it is in npm. Even if modules aren't
>>> forgetting to set it as much as they used to, I still see hard-to-deal-with
>>> things like modules that include .sass sources in them.
>>> - bower's distribution model is actually very problematic for upstream
>>> developers. bower clones git repositories directly expecting to find
>>> built/compiled js and has no local build hooks. Few major js libraries
>>> work in classic js/css sources anymore, instead having some required
>>> build step. Bower makes a complete mess of this. Requiring the build
>>> result to be committed with each change, an automated build bot, a
>>> second git repo for build results only, or the upstream dev to just not
>>> care about bower.
>>> - Thanks to the rise of browserify and the like more and more
>>> client-only libraries have moved to npm despite people traditionally
>>> thinking of it as being for nodejs modules. Most client only libraries
>>> now exist in npm. And if you wave the name browserify around you can get
>>> just about any js library to start distributing to npm.
>>>   - This has actually gone so far that once when I added a contribution
>>> to a library I noticed that they were actually forgetting to keep their
>>> bower.json up to date.
>>>
>>> npm@3 is also probably not as important as it's being made out to be here.
>>>
>>> npm@3 still doesn't guarantee a tree will be 100% flat. Most of npm@3's
>>> changes fix small quirks in front-end usage and stability issues with npm.
>>>
>>> The 'major' change of 'flatness' is really that when installing `a` that
>>> depends on `b`, `b` will be hoisted up to the same level as `a`, if
>>> possible, instead of always being installed under `a`. npm@2 did have
>>> some annoying quirks during development/updating that could leave a
>>> duplicate module until you recreated your node_modules tree. And there
>>> was the side case of installing two modules that both depended on
>>> something you did not depend on. But that could be dealt with by either
>>> adding it as a dep yourself or running dedupe.
>>>
>>> 
>>>
>>> The bad news is that while more and more libraries are moving to npm, the
>>> underlying reason for many being on npm is the many users using
>>> tools like browserify and webpack. So the assumption of many libraries
>>> is that when you use npm for client-side libraries you are using it
>>> in a node-style/CommonJS way where require is available, other npm
>>> dependencies can be used through it, and you're not including things in
>>> the traditional way of

Re: [Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread Gabriel Wicke
To add to what Eric & Subbu have said, here is a link to the API
documentation for this end point:

https://en.wikipedia.org/api/rest_v1/?doc#!/Transforms/post_transform_html_to_wikitext_title_revision

On Fri, Nov 6, 2015 at 8:47 AM, Subramanya Sastry 
wrote:

> On 11/06/2015 10:18 AM, James Montalvo wrote:
>
>> Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
>> to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering if
>> I could take snippets of HTML from non-MediaWiki webpages and convert them
>> into wikitext.
>>
>
> The right answer is: "It depends" :-)
>
> As Eric responded in his reply, Parsoid does convert some kinds of
> arbitrary HTML to clean wikitext. See some additional examples at the end
> of this email.
>
> However, if you really threw arbitrary HTML at it (ex: .. or
> ..) Parsoid wouldn't know that it could potentially use ''
> or ''' for those tags. Or, if you gave it input with all kinds of css and
> other inlined attributes, you won't necessarily get the best wikitext from
> it.
>
> But, if you tried to convert HTML that you got from say Google docs, Open
> Office, Word, or other HTML-generation tools, the wikitext you get may not
> be very pretty.
>
> We do want to keep improving Parsoid's abilities to get there, but it has
> not been a high priority for us; it would be a great GSoC or volunteer
> project if someone wants to play with this and improve this feature, given
> that we are always playing catch up with all the other things we need to
> get done.
>
> But, if you didn't have really arbitrary HTML, you can get some reasonable
> looking wikitext out of it even without the markers. But, things like
> images, templates, extensions .. obviously require the additional
> attributes for Parsoid to generate canonical wikitext for that.
>
> Hope this helps.
>
> Subbu.
>
>
> ---
>
> Some html -> wt examples:
>
> [subbu@earth bin] echo "fooab" | node parse --html2wt
> == foo ==
> a
>
> b
>
> [subbu@earth bin] echo " href='http://en.wikipedia.org/wiki/Hampi'>Hampi" | node parse --html2wt
> [[Hampi]]
>
> [subbu@earth bin] echo "Luna" | node parse --html2wt
> [[:it:Luna|Luna]]
>
> [subbu@earth bin] echo "Luna" | node parse --html2wt --prefix itwiki
> [[Luna]]
>
> [subbu@earth bin] echo "abc" | node parse --html2wt
> * a
> * b
> * c
>
> [subbu@earth bin] echo foo" | node parse --html2wt
> foo
>
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>



-- 
Gabriel Wicke
Principal Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread Subramanya Sastry

On 11/06/2015 10:18 AM, James Montalvo wrote:

Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering if
I could take snippets of HTML from non-MediaWiki webpages and convert them
into wikitext.


The right answer is: "It depends" :-)

As Eric responded in his reply, Parsoid does convert some kinds of 
arbitrary HTML to clean wikitext. See some additional examples at the 
end of this email.


However, if you really threw arbitrary HTML at it (ex: .. or 
..) Parsoid wouldn't know that it could potentially use 
'' or ''' for those tags. Or, if you gave it input with all kinds of css 
and other inlined attributes, you won't necessarily get the best 
wikitext from it.


But, if you tried to convert HTML that you got from say Google docs, 
Open Office, Word, or other HTML-generation tools, the wikitext you get 
may not be very pretty.


We do want to keep improving Parsoid's abilities to get there, but it
has not been a high priority for us; it would be a great GSoC or
volunteer project if someone wants to play with this and improve this
feature, given that we are always playing catch up with all the other
things we need to get done.


But, if you didn't have really arbitrary HTML, you can get some 
reasonable looking wikitext out of it even without the markers. But, 
things like images, templates, extensions .. obviously require the 
additional attributes for Parsoid to generate canonical wikitext for that.


Hope this helps.

Subbu.

---

Some html -> wt examples:

[subbu@earth bin] echo "fooab" | node parse --html2wt
== foo ==
a

b

[subbu@earth bin] echo "href='http://en.wikipedia.org/wiki/Hampi'>Hampi" | node parse --html2wt
[[Hampi]]

[subbu@earth bin] echo "href='http://it.wikipedia.org/wiki/Luna'>Luna" | node parse --html2wt
[[:it:Luna|Luna]]

[subbu@earth bin] echo "href='http://it.wikipedia.org/wiki/Luna'>Luna" | node parse --html2wt --prefix itwiki
[[Luna]]

[subbu@earth bin] echo "abc" | node parse --html2wt
* a
* b
* c

[subbu@earth bin] echo foo" | node parse --html2wt
foo

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread James Montalvo
Thanks for the quick response. Is there a simple way to do this without
RESTBase?
On Nov 6, 2015 10:32 AM, "Eric Evans"  wrote:

> On Fri, Nov 6, 2015 at 10:18 AM, James Montalvo 
> wrote:
>
> > Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
> > to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering
> if
> > I could take snippets of HTML from non-MediaWiki webpages and convert
> them
> > into wikitext.
> >
>
> That is possible, yes.  For example (via RESTBase):
>
> curl -X POST --header "Content-Type: application/x-www-form-urlencoded" \
>   --header 'Accept: text/plain; profile="mediawiki.org/specs/wikitext/1.0.0"' \
>   -d 'html=<h1>Heading</h1><p>Hello world</p>' \
>   "https://en.wikipedia.org/api/rest_v1/transform/html/to/wikitext"
>
>
> --
> Eric Evans
> eev...@wikimedia.org
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread Eric Evans
On Fri, Nov 6, 2015 at 10:18 AM, James Montalvo 
wrote:

> Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
> to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering if
> I could take snippets of HTML from non-MediaWiki webpages and convert them
> into wikitext.
>

That is possible, yes.  For example (via RESTBase):

curl -X POST --header "Content-Type: application/x-www-form-urlencoded" \
  --header 'Accept: text/plain; profile="mediawiki.org/specs/wikitext/1.0.0"' \
  -d 'html=<h1>Heading</h1><p>Hello world</p>' \
  "https://en.wikipedia.org/api/rest_v1/transform/html/to/wikitext"
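
The reverse transform works the same way; a sketch using the 
corresponding wikitext-to-HTML endpoint (payload illustrative):

curl -X POST --header "Content-Type: application/x-www-form-urlencoded" \
  -d 'wikitext=== Heading ==' \
  "https://en.wikipedia.org/api/rest_v1/transform/wikitext/to/html"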


-- 
Eric Evans
eev...@wikimedia.org
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Parsoid convert arbitrary HTML?

2015-11-06 Thread James Montalvo
Can Parsoid be used to convert arbitrary HTML to wikitext? It's not clear
to me whether it will only work with Parsoid's HTML+RDFa. I'm wondering if
I could take snippets of HTML from non-MediaWiki webpages and convert them
into wikitext.

Thanks,
James
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Superprotect is gone

2015-11-06 Thread Jérémie Roquet
2015-11-05 18:34 GMT+01:00 Quim Gil :
> today we are removing Superprotect from Wikimedia servers.

This is great news, thanks!

-- 
Jérémie

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] [RfC/Summit] Shadow namespaces

2015-11-06 Thread Legoktm
Hi,

I'm behind on my summit planning, but here's what I'd like to discuss at
the developer summit for the Shadow namespaces RfC[1].

1. At what layer should shadow namespaces be integrated into the code?
Should we continue to subclass Article? How do we efficiently implement
things like batch lookups? (T88644)

2. How will we integrate remote content with search and other discovery
mechanisms? We currently have Commons file results integrated with
normal search results, but its usefulness is questionable (T96535).
Will it be possible to implement this for people using remote content
via the API?

3. How will we keep track of usage and invalidate caches? Should we just
have a central link table like GlobalUsage? What about API users?

4. Should we allow people to chain "repositories" like we currently
allow with file repos? (A sketch of today's file-repo chaining follows
below.)
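
For reference, the existing file-repo chaining looks roughly like this; 
an InstantCommons-style sketch with illustrative values:

cat >> LocalSettings.php <<'PHP'
// Foreign repos are consulted in order, after the local repo.
$wgForeignFileRepos[] = array(
    'class' => 'ForeignAPIRepo',
    'name' => 'commonswiki',
    'apibase' => 'https://commons.wikimedia.org/w/api.php',
    'fetchDescription' => true,
);
PHP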

The outcome I'd like to see is having a plan ready on how to migrate the
foreign file system to a generic arbitrary namespace one, which should
completely obsolete the GlobalUserPage extension.

The Phabricator task for this session at the summit is T115762[2].

[1] https://www.mediawiki.org/wiki/Requests_for_comment/Shadow_namespaces
[2] https://phabricator.wikimedia.org/T115762

-- Legoktm

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] mediawiki-extension cli extension installer/updater released

2015-11-06 Thread Daniel Friesen
What's the installation process for an extension?

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

On 2015-11-05 1:31 PM, C. Scott Ananian wrote:
> Do you think this would work with
> https://github.com/cscott/node-mediawiki-express ?
>  --scott
>
> On Thu, Oct 8, 2015 at 4:12 AM, Brion Vibber  wrote:
>
>> Nice!
>>
>> -- brion
>>
>> On Thursday, October 8, 2015, Daniel Friesen > > wrote:
>>
>>> As part of a side project in my job at Redwerks[1], I've released an
>>> early version of a mediawiki-extension cli tool.
>>>
>>>   https://github.com/redwerks/mediawiki-extension-command#readme
>>>
>>> The tool requires Node.js (in addition to git and php to run clones and
>>> composer).
>>> It can be installed with `sudo npm install -g mediawiki-extension` and
>>> then `mediawiki-extension setup`.
>>>
>>>
>>> The command can download and upgrade any extension we have in Gerrit.
>>> Extensions using composer will automatically be installed via composer
>>> otherwise it'll be cloned from git. If available, release tags (like
>>> "1.0.0" or "v3.0.0") will be used, otherwise master will be used.
>>>
>>> You'll still need to require and configure the extension yourself. But
>>> this is supposed to greatly simplify acquiring and bulk-updating
>>> extensions via the cli.
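>>>
>>> For example, for a non-composer extension the configure step might
>>> look like this (illustrative; wfLoadExtension needs MW 1.25+ and an
>>> extension.json, older extensions use require_once instead):
>>>
>>>   echo "wfLoadExtension( 'ParserFunctions' );" >> LocalSettings.php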
>>>
>>> Some examples of use.
>>>
>>> Clone ParserFunctions from git.
>>>   $ mediawiki-extension download ParserFunctions
>>>
>>> Install SemanticMediaWiki and SemanticForms with composer.
>>>   $ mediawiki-extension download SemanticMediaWiki SemanticForms
>>>
>>> Clone Widgets from git and check out the most recent version tag.
>>>   $ mediawiki-extension download Widgets
>>>
>>> Update all your MediaWiki extensions:
>>>   $ mediawiki-extension update --all
>>>
>>> Switch an extension cloned from git master to the REL branch for the
>>> installed version of MediaWiki.
>>>   $ mediawiki-extension switch ParserFunctions git-rel
>>>
>>>
>>> [1] http://redwerks.org/
>>>
>>> --
>>> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]
>>>
>>>
>>> ___
>>> Wikitech-l mailing list
>>> Wikitech-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>> ___
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>
>


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] MW support for Composer equivalent for JavaScript packages

2015-11-06 Thread Daniel Friesen
On 2015-11-05 1:30 PM, C. Scott Ananian wrote:
> Two other interesting pieces:
>
> 1. http://requirejs.org/ is still the goal-standard for async browser-type
> loading, AFAIK, and there are good packages (`npm install requirejs`) that
> allow interoperability with the "npm style".
requirejs is still built for the same single-application use case as the
other non-async loaders. You may not be able to get even requirejs to
work with MediaWiki's needed pattern (different packages required by
different extensions, all integrated together) without Node on the
server to combine them.

> 2. The recently-completed ES6 spec contains a module format.  You can
> already use it via compatibility thunks from many existing module systems.
> It may well see increased use, especially on the web as browsers implement
> the spec (which is happening quickly).  There is resistance in the node
> community to adopting ES6 modules, but it may be that we are at an
> inflection point and ES6 will eventually win out.
ES6 modules have a different pattern for how exports are treated,
especially with regard to the 'default' export. To handle this case,
babel inserts the following when you are transpiling ES6/ES2015 module
syntax:
Object.defineProperty(exports, '__esModule', {
  value: true
});

Unless you explicitly enable loose mode, any library you develop as an
ES module, and anything that uses it, will instantly break in browsers
like IE8, where Object.defineProperty only works on DOM objects.
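
To see the marker for yourself, transpile a trivial module. A sketch
using the Babel 6 CLI (package names as of late 2015; output elided):

  npm install babel-cli babel-preset-es2015
  echo "export default 42;" > mod.js
  ./node_modules/.bin/babel --presets es2015 mod.js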

Use of ES6 import/export is gaining more and more adoption. But this
might be one reason some people are holding back on it.

ES modules also have different behavior with regard to import cycles,
which I know esperanto handles, but I'm not sure about the other
transpilers.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

>   --scott
>
> On Thu, Nov 5, 2015 at 3:24 PM, Daniel Friesen 
> wrote:
>
>> As someone who now works in js/node primarily I might be able to add a
>> little outside insight.
>>
>> IMHO bower was probably killed far before npm@3 came out.
>>
>> - npm has always been more advanced
>> - bower's "main" field isn't as reliable as npm's. Even if modules
>> aren't forgetting to set it as much as they used to, I still see
>> hard-to-handle things like modules that list .sass sources in it.
>> - bower's distribution model is actually very problematic for upstream
>> developers. bower clones git repositories directly, expecting to find
>> built/compiled js, and has no local build hooks. Few major js libraries
>> work in classic js/css sources anymore; most have some required build
>> step. Bower makes a complete mess of this, requiring the build result
>> to be committed with each change, an automated build bot, a second git
>> repo for build results only, or the upstream dev to just not care about
>> bower.
>> - Thanks to the rise of browserify and the like, more and more
>> client-only libraries have moved to npm, despite people traditionally
>> thinking of it as being for nodejs modules. Most client-only libraries
>> now exist in npm. And if you wave the name browserify around, you can
>> get just about any js library to start distributing to npm.
>>   - This has actually gone so far that once, when I added a
>> contribution to a library, I noticed they were actually forgetting to
>> keep their bower.json up to date.
>>
>> npm@3 is also probably not as important as it's being made out to be here.
>>
>> npm@3 still doesn't guarantee a tree will be 100% flat. Most of npm@3's
>> changes fix small quirks in front-end usage and stability issues with npm.
>>
>> The 'major' change of 'flatness' is really that when installing `a` that
>> depends on `b`, `b` will be hoisted up to the same level as `a`, if
>> possible, instead of always being installed under `a`. npm@2 did have
>> some annoying quirks during development/updating that could leave a
>> duplicate module until you recreated your node_modules tree. And there
>> was the side case of installing two modules that both depended on
>> something you did not depend on. But that could be dealt with by either
>> adding it as a dep yourself or running dedupe.
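>>
>> A sketch of the difference with two hypothetical packages, where a
>> depends on b (npm ls output approximate):
>>
>>   $ npm ls    # npm@2: b nests under a
>>   └─┬ a@1.0.0
>>     └── b@1.0.0
>>
>>   $ npm ls    # npm@3: b hoisted next to a
>>   ├── a@1.0.0
>>   └── b@1.0.0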
>>
>> 
>>
>> The bad news is that while more and more libraries are moving to npm,
>> the underlying reason many are there is the many users consuming them
>> through tools like browserify and webpack. So the assumption many
>> libraries make is that when you use npm for client-side code, you are
>> using it in a node-style/CommonJS way: require is available, other npm
>> dependencies can be used through it, and you're not including things
>> the traditional way, with libraries run one after another on the page
>> in the global scope like in ResourceLoader.
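>>
>> The node-style consumption those libraries assume looks roughly like
>> this (a quick sketch; module names illustrative):
>>
>>   npm install browserify lodash
>>   echo "var _ = require('lodash'); console.log(_.VERSION);" > entry.js
>>   ./node_modules/.bin/browserify entry.js -o bundle.js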
>>
>> It's probably going to get harder and harder to reconcile deps shared
>> between extensions and/or use library code itself without having node
>> installed on servers. For that matter, half the advancements in the
>> css space probably won't be making their way to php either.
>>
>>